Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices

Bibliographic details
Published in:	Frontiers in computational neuroscience 2021-07, Vol.15, p.675741-675741
Main authors:	Spoon, Katie; Tsai, Hsinyu; Chen, An; Rasch, Malte J.; Ambrogio, Stefano; Mackin, Charles; Fasoli, Andrea; Friz, Alexander M.; Narayanan, Pritish; Stanisavljevic, Milos; Burr, Geoffrey W.
Format:	Article
Language:	English
Keywords:	Accuracy; analog accelerators; Arrays; BERT; Deep learning; DNN; in-memory computing; Machine translation; Natural language processing; Neural networks; Neuroscience; Noise; PCM; Probability; RRAM; Simulation
Online access:	Full text

Description
Abstract:	Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent accurate inference of natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers), by combining noise-aware training to combat inherent PCM drift and noise sources, together with reduced-precision digital attention-block computation down to INT6.
ISSN:	1662-5188
DOI:	10.3389/fncom.2021.675741