A Highly Parallel and Energy Efficient Three-Dimensional Multilayer CMOS-RRAM Accelerator for Tensorized Neural Network

It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerator. This paper introduces a three-dimensional (3-D) multilayer CMOSRRAM accelerator for atensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on nanotechnology 2018-07, Vol.17 (4), p.645-656
Hauptverfasser: Huang, Hantao, Ni, Leibin, Wang, Kanwen, Wang, Yuangang, Yu, Hao
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerator. This paper introduces a three-dimensional (3-D) multilayer CMOSRRAM accelerator for atensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization can significantly compress the weight matrix of a neural network using much fewer parameters. Simulation results using the benchmark MNIST show that the proposed accelerator has 1.283× speed-up, 4.276× energy-saving, and 9.339× area-saving compared to the 3-D CMOS-ASIC implementation; and 6.37× speed-up and 2612× energy-saving compared to 2-D CPU implementation. In addition, 14.85× model compression can be achieved by tensorization with acceptable accuracy loss.
ISSN:1536-125X
1941-0085
DOI:10.1109/TNANO.2017.2732698