Flexion: A Quantitative Metric for Flexibility in DNN Accelerators


Bibliographic Details
Published in: IEEE Computer Architecture Letters, January 2021, Vol. 20, No. 1, pp. 1-4
Authors: Kwon, Hyoukjun; Pellauer, Michael; Parashar, Angshuman; Krishna, Tushar
Format: Article
Language: English
Abstract: Dataflow and tile size choices, which we collectively refer to as mappings, dictate the efficiency (i.e., latency and energy) of DNN accelerators. The rapid evolution of DNN models is one of the major challenges for DNN accelerators, since the optimal mapping depends heavily on the layer shape and size. To maintain high efficiency across multiple DNN models, flexible accelerators that can support multiple mappings have emerged. However, we currently lack a metric to evaluate accelerator flexibility and quantitatively compare accelerators' capability to run different mappings. In this letter, we formally define the concept of flexibility in DNN accelerators and propose flexion (flexibility fraction), a quantitative metric of mapping flexibility on DNN accelerators. We codify the formalism we construct and evaluate the flexibility of accelerators based on Eyeriss, NVDLA, and TPUv1. We show that an Eyeriss-like accelerator is 2.2× and 17.0× more flexible (i.e., capable of running more mappings) than NVDLA- and TPUv1-based accelerators on selected ResNet-50 and MobileNetV2 layers. This is the first work to enable such a quantitative comparison of accelerator flexibility.
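To make the idea of a "flexibility fraction" concrete, the following is a minimal illustrative sketch, not the paper's actual formalism: it treats a mapping as one tile-size choice per loop dimension, and an accelerator's flexibility as the fraction of the candidate mapspace it can legally run. The dimension names, tile options, and constraint functions below are hypothetical stand-ins.

```python
from itertools import product

def candidate_mappings(layer_dims, tile_options):
    """Enumerate tile-size assignments: one option per loop dimension.

    Hypothetical mapspace model; the real mapspace in the paper also
    covers dataflow (loop order, parallelization) choices.
    """
    dims = sorted(layer_dims)
    for combo in product(*(tile_options[d] for d in dims)):
        yield dict(zip(dims, combo))

def flexion(layer_dims, tile_options, supports):
    """Fraction of candidate mappings the accelerator can run."""
    total = runnable = 0
    for mapping in candidate_mappings(layer_dims, tile_options):
        total += 1
        runnable += bool(supports(mapping))  # supports() is a stand-in constraint check
    return runnable / total if total else 0.0

# Toy comparison: a rigid accelerator that only runs a fixed tile size of 4
# in every dimension vs. a flexible one that accepts any tile size.
dims = {"C", "K"}
options = {"C": [1, 2, 4], "K": [1, 2, 4]}
rigid = lambda m: all(t == 4 for t in m.values())
flexible = lambda m: True
print(flexion(dims, options, rigid))     # 1 of the 9 candidate mappings
print(flexion(dims, options, flexible))  # all 9 candidate mappings
```

Under this toy model, the flexible design's flexion is 9× the rigid design's, which is the kind of ratio (e.g., 2.2× and 17.0× in the abstract) the metric is meant to expose.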
ISSN: 1556-6056, 1556-6064
DOI: 10.1109/LCA.2020.3044607