SRAM-Based In-Memory Computing Macro Featuring Voltage-Mode Accumulator and Row-by-Row ADC for Processing Neural Networks

This paper presents a mixed-signal SRAM-based in-memory computing (IMC) macro for processing binarized neural networks. The IMC macro consists of 128\times 128 (16K) SRAM-based bitcells. Each bitcell consists of a standard 6T SRAM bitcell, an XNOR-based binary multiplier, and a pseudo-differential...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on circuits and systems. I, Regular papers Regular papers, 2022-06, Vol.69 (6), p.2412-2422
Hauptverfasser: Mu, Junjie, Kim, Hyunjoon, Kim, Bongjin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper presents a mixed-signal SRAM-based in-memory computing (IMC) macro for processing binarized neural networks. The IMC macro consists of 128\times 128 (16K) SRAM-based bitcells. Each bitcell consists of a standard 6T SRAM bitcell, an XNOR-based binary multiplier, and a pseudo-differential voltage-mode driver (i.e., an accumulator unit). Multiply-and-accumulate (MAC) operations between 64 pairs of inputs and weights (stored in the first 64 SRAM bitcells) are performed in 128 rows of the macro, all in parallel. A weight-stationary architecture, which minimizes off-chip memory accesses, effectively reduces energy-hungry data communications. A row-by-row analog-to-digital converter (ADC) based on 32 replica bitcells and a sense amplifier reduces the ADC area overhead and compensates for nonlinearity and variation. The ADC converts the MAC result from each row to an N-bit digital output taking 2 N -1 cycles per conversion by sweeping the reference level of 32 replica bitcells. The remaining 32 replica bitcells in the row are utilized for offset calibration. In addition, this paper presents a pseudo-differential voltage-mode accumulator to address issues in the current-mode or single-ended voltage-mode accumulator. A test chip including a 16Kbit SRAM IMC bitcell array is fabricated using a 65nm CMOS technology. The measured energy- and area-efficiency is 741-87TOPS/W with 1-5bit ADC at 0.5V supply and 3.97TOPS/mm 2 , respectively.
ISSN:1549-8328
1558-0806
DOI:10.1109/TCSI.2022.3152653