Energy-Efficient eDRAM-Based On-Chip Storage Architecture for GPGPUs

Bibliographic Details
Published in:	IEEE Transactions on Computers 2016-01, Vol.65 (1), p.122-135
Main authors:	Jing, Naifeng; Jiang, Li; Zhang, Tao; Li, Chao; Fan, Fengfeng; Liang, Xiaoyao
Format:	Article
Language:	English
Subjects:	Cache; Compiler; Embedded DRAM (eDRAM); Energy Efficiency; GPGPU; Instruction sets; Radiation detectors; Radio frequency; Random access memory; Refresh; Register File (RF); Registers; Shared Memory; System-on-chip; Transistors

Description
Abstract:	In a typical GPGPU, on-chip storage is critical to massive parallelism and is desired to be large. However, the fast-growing size of on-chip storage built from traditional SRAM cells, such as the register file (RF), shared memory, and first-level data (L1D) cache, makes the area cost and energy consumption unsustainable for future GPGPUs. In this paper, we first propose using embedded DRAM (eDRAM) as an alternative for on-chip storage. Compared to conventional SRAM, eDRAM offers higher density and lower leakage power but suffers from limited data retention time. Periodic refresh is a viable way to maintain data integrity, but it degrades performance and increases energy consumption as eDRAM cells scale into deep sub-micron technology nodes. To recover the performance loss, we exploit features of the GPGPU architecture and propose several novel refresh schemes that mitigate the refresh penalty. To improve energy efficiency, we apply lightweight compiler techniques and runtime monitoring for selective refreshing that eliminates unnecessary refreshes. The evaluation of our proposed refresh schemes demonstrates that, compared to conventional SRAM-based designs, our eDRAM-based on-chip storage delivers comparable performance with lower energy consumption and smaller silicon area, enabling sustainable on-chip storage scaling for even higher parallelism in future GPGPUs.
ISSN:	0018-9340 1557-9956
DOI:	10.1109/TC.2015.2417545