A Programmable Heterogeneous Microprocessor Based on Bit-Scalable In-Memory Computing

Bibliographic Details
Published in:	IEEE journal of solid-state circuits 2020-09, Vol.55 (9), p.2609-2621
Main Authors:	Jia, Hongyang; Valavi, Hossein; Tang, Yinqi; Zhang, Jintao; Verma, Naveen
Format:	Article
Language:	English
Subjects:	Analog to digital conversion; Capacitors; Charge-domain compute; CMOS; Computational modeling; Computer architecture; Computer memory; deep learning; Energy consumption; Engineering; Engineering, Electrical & Electronic; Hardware; hardware accelerators; Image classification; in-memory computing (IMC); Mapping; Mathematical analysis; Matrix algebra; Matrix methods; Microprocessors; neural networks (NNs); RISC; Science & Technology; Signal to noise ratio; Software; Technology; Workload; Workloads

Description
Summary:	In-memory computing (IMC) addresses the cost of accessing data from memory in a manner that introduces a tradeoff between energy/throughput and computation signal-to-noise ratio (SNR). However, low SNR posed a primary restriction to integrating IMC in larger, heterogeneous architectures required for practical workloads due to the challenges with creating robust abstractions necessary for the hardware and software stack. This work exploits recent progress in high-SNR IMC to achieve a programmable heterogeneous microprocessor architecture implemented in 65-nm CMOS and corresponding interfaces to the software that enables mapping of application workloads. The architecture consists of a 590-Kb IMC accelerator, configurable digital near-memory-computing (NMC) accelerator, RISC-V CPU, and other peripherals. To enable programmability, microarchitectural design of the IMC accelerator provides the integration in the standard processor memory space, area- and energy-efficient analog-to-digital conversion for interfacing to NMC, bit-scalable computation (1-8 b), and input-vector sparsity-proportional energy consumption. The IMC accelerator demonstrates excellent matching between computed outputs and idealized software-modeled outputs, at 1b-TOPS/W of 192|400 and 1b-TOPS/mm² of 0.60|0.24 for MAC hardware, at VDD of 1.2|0.85 V, both of which scale directly with the bit precision of the input vector and matrix elements. Software libraries developed for application mapping are used to demonstrate CIFAR-10 image classification with a ten-layer CNN, achieving accuracy, throughput, and energy of 89.3%|92.4%, 176|23 images/s, and 5.31|105.2 μJ/image, for 1|4 b quantization levels.
ISSN:	0018-9200 1558-173X
DOI:	10.1109/JSSC.2020.2987714
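
Illustrative note on the summary's "bit-scalable computation (1-8 b)": the record does not describe how the chip composes multi-bit operations, so the following is only a minimal Python sketch of one common approach, in which a multi-bit dot product of unsigned integers is assembled from 1-b AND-and-accumulate passes weighted by powers of two. The function names (`bit_planes`, `bit_scalable_dot`, `scaled_efficiency`) are hypothetical, and the assumption that efficiency divides by the product of the two bit widths is one reading of "scale directly with the bit precision", not a figure taken from the article.

```python
def bit_planes(values, bits):
    """Split unsigned integers into `bits` binary planes, least significant first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits)]


def bit_scalable_dot(weights, inputs, w_bits, x_bits):
    """Multi-bit dot product built only from 1-b AND-and-accumulate passes.

    Returns (result, number_of_1b_passes). Assumes unsigned operands that
    fit in w_bits and x_bits respectively.
    """
    w_planes = bit_planes(weights, w_bits)
    x_planes = bit_planes(inputs, x_bits)
    total = 0
    passes = 0
    for i, w_plane in enumerate(w_planes):
        for j, x_plane in enumerate(x_planes):
            # One 1-b pass: AND each weight bit with each input bit, then sum.
            partial = sum(w & x for w, x in zip(w_plane, x_plane))
            # Weight the partial sum by the significance of both bit positions.
            total += partial << (i + j)
            passes += 1
    return total, passes


def scaled_efficiency(one_bit_tops_per_w, w_bits, x_bits):
    """Assumed scaling: 1-b efficiency divided by the number of 1-b passes."""
    return one_bit_tops_per_w / (w_bits * x_bits)


weights = [3, 5, 7, 2]
inputs = [1, 4, 6, 15]
result, passes = bit_scalable_dot(weights, inputs, w_bits=4, x_bits=4)
direct = sum(w * x for w, x in zip(weights, inputs))
print(result, direct, passes)  # the bit-plane result matches the direct product
print(scaled_efficiency(400, 4, 4))  # 1b-TOPS/W of 400 under the assumed 4 b x 4 b scaling
```

Under this assumed scaling, a 4-b by 4-b computation needs 16 one-bit passes, so the quoted 1b-TOPS/W figures of 192|400 would correspond to roughly 12|25 TOPS/W for 4-b operands; the article itself should be consulted for the measured multi-bit values.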