An 8.9-71.3 TOPS/W Deep Learning Accelerator for Arbitrarily Quantized Neural Networks

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems II: Express Briefs, 2022-10, Vol. 69 (10), pp. 4148-4152
Main Authors: Moon, Seunghyun; Lee, Kyeong-Jun; Mun, Han-Gyeol; Kim, Byungjun; Sim, Jae-Yoon
Format: Article
Language: English
Abstract: This brief presents the first general-purpose deep learning accelerator for arbitrarily quantized neural networks. A pre-processing step and a corresponding multiply-and-accumulate (MAC) operator are proposed to perform vector products between two arbitrarily quantized vectors. As a result, the accelerator supports precision scalability of quantization up to 4b for one of input activation and weight, and up to 8b for the other. The proposed approach occupies 0.57× the area and consumes 0.78× the power of a conventional 8b linear-precision design required to achieve the same level of task accuracy. The accelerator, implemented in 28 nm CMOS, operates over a frequency range of 100-to-250 MHz at supply voltages of 0.75-to-1.2 V. Depending on configuration, it achieves an energy efficiency of 8.9-to-71.3 TOPS/W.
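The abstract does not detail the pre-processing scheme, so the following is only a rough illustrative sketch of one way a MAC over two arbitrarily quantized vectors can be organized: each vector stores small codebook indices (4b activations, 8b weights), and a pre-processing step precomputes all pairwise products of the two codebooks, reducing the inner product to table lookups and accumulation. All names here (act_levels, wgt_levels, product_lut, quantized_dot) are hypothetical and not taken from the paper.

import numpy as np

# Hypothetical codebooks of arbitrary (non-uniform) quantization levels:
# 2^4 levels for activations, 2^8 levels for weights.
rng = np.random.default_rng(0)
act_levels = np.sort(rng.uniform(-1.0, 1.0, 16))
wgt_levels = np.sort(rng.uniform(-1.0, 1.0, 256))

# Pre-processing: precompute all 16 x 256 pairwise products once,
# so the MAC loop needs only index lookups and additions.
product_lut = np.outer(act_levels, wgt_levels)

def quantized_dot(act_idx, wgt_idx):
    """Dot product of two arbitrarily quantized vectors given as
    codebook indices (4b activation indices, 8b weight indices)."""
    return product_lut[act_idx, wgt_idx].sum()

# Usage: quantize two float vectors to their nearest codebook levels,
# then compute the dot product entirely in the index domain.
x = rng.uniform(-1.0, 1.0, 64)
w = rng.uniform(-1.0, 1.0, 64)
x_idx = np.abs(x[:, None] - act_levels[None, :]).argmin(axis=1)
w_idx = np.abs(w[:, None] - wgt_levels[None, :]).argmin(axis=1)

print(f"quantized dot = {quantized_dot(x_idx, w_idx):.4f}, float dot = {x @ w:.4f}")

A lookup-table formulation like this is one reason asymmetric precision (4b on one operand, 8b on the other) is attractive: the table size grows as the product of the two codebook sizes, so keeping one operand at low precision keeps the pre-processing cost small. This is an inference from the stated precision split, not a claim about the paper's hardware.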
ISSN: 1549-7747
eISSN: 1558-3791
DOI: 10.1109/TCSII.2022.3185184