Cramming More Weight Data Onto Compute-in-Memory Macros for High Task-Level Energy Efficiency Using Custom ROM With 3984-kb/mm² Density in 65-nm CMOS
Published in: | IEEE Journal of Solid-State Circuits, 2024-06, Vol. 59 (6), p. 1912-1925 |
---|---|
Main authors: | , , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Owing to its mature process and low access energy, static random-access memory (SRAM) has become a promising candidate for compute-in-memory (CiM) acceleration of multiply-accumulate (MAC) operations. However, SRAM-based CiM cells have rather low density and thus very limited total on-chip memory capacity. In data-intensive scenarios, this forces weight data to be reloaded from off-chip dynamic random-access memory (DRAM), which can erode the energy efficiency of CiM at the task level. Exploring higher-density CiM in CMOS is therefore critical to achieving truly high energy efficiency in practice. Aligned with the goal of ultrahigh density, this article presents the first one-transistor (1T) multi-level-cell (MLC) read-only memory (ROM) CiM macro for multi-bit MAC. The highlights of the proposed ROM CiM techniques include: 1) multi-source-driven (MSD) 1T-MLC ROM; 2) charge-domain capacitor sharing (CDCS) for ultrahigh CiM memory density; and 3) ROM-based transfer-learning architectures that provide flexible support for different tasks with minor accuracy degradation. These techniques are demonstrated with a fabricated 2-Mb 1T-MLC ROM CiM macro for 8-b × 8-b MAC computing. This macro features a record-high cell density of 0.096 µm²/bit and a macro weight density of 3984 kb/mm² in a 65-nm pure CMOS technology. It also achieves 3.8× to 55.3× lower energy consumption per image inference than state-of-the-art CiM macros when possible DRAM access is taken into account. |
---|---|
ISSN: | 0018-9200 1558-173X |
DOI: | 10.1109/JSSC.2023.3326955 |