AccuReD: High Accuracy Training of CNNs on ReRAM/GPU Heterogeneous 3-D Architecture
Saved in:
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2021-05, Vol. 40 (5), p. 971-984
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: The growing popularity of convolutional neural networks (CNNs), along with their complexity, has led to the search for efficient computational platforms suited to them. Resistive random-access memory (ReRAM)-based architectures offer a promising alternative to commonly used GPU-based platforms for training CNNs. However, due to their low-precision storage capability, these architectures cannot support all types of CNN layers and suffer from accuracy loss in the learned model. In addition, ReRAM behavior varies with temperature: high temperature reduces the noise margin and introduces additional noise. This makes CNN training challenging, as outputs can be misinterpreted at higher operating temperatures, leading to accuracy loss. In this work, we propose an M3D-enabled heterogeneous architecture, AccuReD, that combines ReRAM arrays with GPU cores to address these challenges and achieve high-accuracy CNN training. AccuReD supports all types of CNN layers and achieves near-GPU accuracy even with the low precision and nonideal behavior of ReRAMs. In addition, to reduce temperature, we present a performance- and thermal-aware mapping policy that maps CNN layers to the computing elements of AccuReD. Experimental evaluation indicates that AccuReD does not lose accuracy while accelerating CNN training by 12× on average compared to conventional GPU-only platforms.
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2020.3013194