A Communication-Centric Approach for Designing Flexible DNN Accelerators

Bibliographic Details
Published in: IEEE MICRO 2018-11, Vol. 38 (6), p. 25-35
Authors: Kwon, Hyoukjun; Samajdar, Ananda; Krishna, Tushar
Format: Article
Language: English
Description
Abstract: High computational demands of deep neural networks (DNNs), coupled with their pervasiveness across cloud and IoT platforms, have led to the emergence of DNN accelerators employing hundreds of processing elements (PEs). Most DNN accelerators are optimized for regular mappings of the problem, or dataflows, emanating from dense matrix multiplications in convolutional layers. However, continuous innovations in DNNs, including myriad layer types/shapes, cross-layer fusion, and sparsity, have led to irregular dataflows within accelerators, which introduce severe PE underutilization because of the rigid and tightly coupled connections among PEs and buffers. To address this challenge, this paper proposes a communication-centric approach called MAERI for designing DNN accelerators. MAERI's key novelty is a lightweight configurable interconnect connecting all compute and memory elements, which enables efficient mapping of both regular and irregular dataflows and provides near-100% PE utilization.
ISSN: 0272-1732, 1937-4143
DOI: 10.1109/MM.2018.2877289
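
As a rough illustration of the utilization argument in the abstract, the minimal Python sketch below compares a rigid PE array, where each output is pinned to a fixed-size PE group, with a flexible array that can resize its PE groups to match the work. The array size (128 PEs), the fixed group width (16), and the per-output work (12 multiply-accumulates) are hypothetical illustration values, not parameters taken from the paper or from MAERI's actual microarchitecture.

```python
# Toy utilization model (not code from the paper): shows why rigid PE grouping
# strands PEs on irregular layers while a configurable interconnect can keep
# most PEs busy. All sizes below are hypothetical illustration values.

NUM_PES = 128
RIGID_GROUP_SIZE = 16  # hard-wired PE-group width in the rigid baseline

def rigid_utilization(work_per_output: int) -> float:
    """Rigid array: each output occupies a fixed-size PE group, so any output
    needing fewer PEs than the group width leaves the remaining PEs idle."""
    groups = NUM_PES // RIGID_GROUP_SIZE
    busy = groups * min(work_per_output, RIGID_GROUP_SIZE)
    return busy / NUM_PES

def flexible_utilization(work_per_output: int) -> float:
    """Flexible interconnect: PE groups are sized to the actual work, so only
    the leftover PEs that cannot form one more full group sit idle."""
    groups = NUM_PES // work_per_output
    return (groups * work_per_output) / NUM_PES

if __name__ == "__main__":
    # Irregular layer where each output needs only 12 multiply-accumulates.
    print(f"rigid:    {rigid_utilization(12):.1%}")     # 75.0%
    print(f"flexible: {flexible_utilization(12):.1%}")  # 93.8%
```

Under this toy model the rigid array strands a quarter of its PEs on the irregular layer, while resizable grouping comes close to full utilization, which is the effect the configurable interconnect described in the abstract aims to achieve.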