Cramming More Weight Data Onto Compute-in-Memory Macros for High Task-Level Energy Efficiency Using Custom ROM With 3984-kb/mm² Density in 65-nm CMOS

Bibliographic details
Published in: IEEE Journal of Solid-State Circuits, 2024-06, Vol. 59 (6), pp. 1912-1925
Main authors: Yin, Guodong; Chen, Yiming; Zhou, Mufeng; Tang, Wenjun; Lee, Mingyen; Yang, Zekun; Liao, Tianyu; Du, Xirui; Narayanan, Vijaykrishnan; Yang, Huazhong; Jia, Hongyang; Liu, Yongpan; Li, Xueqing
Format: Article
Language: English
Description
Abstract: Owing to the mature process and low access energy, static random-access memory (SRAM) has become a promising candidate for compute-in-memory (CiM) acceleration of multiply-accumulate (MAC) operations. However, SRAM-based CiM cells have rather low density and thus very limited total on-chip memory capacity. This fact, unfortunately, results in undesired weight data reload operations from the off-chip dynamic random-access memory (DRAM) in data-intensive scenarios and may even tarnish the energy efficiency of CiM at the task level. Therefore, exploration toward higher density CiM in CMOS is critical to ensure truly high energy efficiency in practice. Aligned with the goal of ultrahigh density, this article presents the first one-transistor (1T) multi-level-cell (MLC) read-only memory (ROM) CiM macro for multi-bit MAC. The highlights of the proposed ROM CiM techniques include: 1) multi-source-driven (MSD) 1T-MLC ROM; 2) charge-domain capacitor sharing (CDCS) for ultrahigh CiM memory density; and 3) ROM-based transfer-learning architectures to provide flexible support of different tasks with minor accuracy degradation. These techniques are demonstrated with a fabricated 2-Mb 1T-MLC ROM CiM macro for 8 b × 8 b MAC computing. This macro features a record-high cell density of 0.096 μm²/bit and a macro weight density of 3984 kb/mm² in a 65-nm pure CMOS technology. It also achieves 3.8× to 55.3× lower energy consumption per image inference than the state-of-the-art CiM macros when considering the possible DRAM access.
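The two density figures in the abstract can be related with simple arithmetic. The sketch below is a back-of-envelope check, not a result from the article: it assumes binary prefixes (1 kb = 1024 bits, 2 Mb = 2048 kb), and the function names are hypothetical. It estimates the density the cells alone would give, the share of the macro density that this represents, and the macro area implied by 2 Mb at 3984 kb/mm².

```python
# Back-of-envelope arithmetic on the density figures quoted in the abstract.
# Assumptions (not stated in the source): 1 kb = 1024 bits, 1 Mb = 1024 kb.

BITS_PER_KB = 1024         # assumed binary prefix
UM2_PER_MM2 = 1_000_000    # 1 mm^2 = 10^6 um^2


def cell_only_density_kb_per_mm2(cell_area_um2_per_bit: float) -> float:
    """Density if the whole area were filled with cells and nothing else."""
    bits_per_mm2 = UM2_PER_MM2 / cell_area_um2_per_bit
    return bits_per_mm2 / BITS_PER_KB


def macro_area_mm2(capacity_kb: float, macro_density_kb_per_mm2: float) -> float:
    """Macro area implied by a capacity and a macro-level density."""
    return capacity_kb / macro_density_kb_per_mm2


cell_density = cell_only_density_kb_per_mm2(0.096)  # from 0.096 um^2/bit
array_fraction = 3984 / cell_density                # macro density / cell-only density
implied_area = macro_area_mm2(2 * 1024, 3984)       # 2-Mb macro, assumed 2048 kb

print(f"cell-only density: {cell_density:.0f} kb/mm^2")
print(f"macro density as a fraction of cell-only density: {array_fraction:.2f}")
print(f"implied macro area: {implied_area:.3f} mm^2")
```

Under these assumptions, the macro-level density is a bit under 40% of the cell-only density, with the remainder presumably taken by peripheral circuitry; with decimal prefixes the numbers shift by a few percent.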
ISSN: 0018-9200, 1558-173X
DOI: 10.1109/JSSC.2023.3326955