MARS: Memory Aware Reordered Source
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Memory bandwidth is critical in today's high-performance computing systems.
It is particularly important for GPU workloads such as 3D gaming, imaging and
perceptual computing, and GPGPU because of their data-intensive nature. As
the number of threads and data streams in GPUs increases with each
generation, memory efficiency, along with high available memory bandwidth, is
also crucial to achieving the desired performance. In the presence of multiple
concurrent data streams, the inherent locality of a single data stream is often
lost as the streams are interleaved while moving through multiple levels of
the memory system. In DRAM-based main memory, the resulting poor request
locality reduces row-buffer reuse, leaving memory bandwidth underutilized.
In this paper we propose the Memory-Aware Reordered Source (MARS)
architecture to address the memory inefficiency arising from highly interleaved
data streams. The key idea of MARS is that, with a sufficiently large
lookahead before the main memory, data streams can be reordered based on their
row-buffer address to regain the lost locality and improve memory efficiency.
We show that MARS improves achieved memory bandwidth by 11% for a set
of synthetic microbenchmarks. Moreover, MARS does so without any specific
knowledge of the memory configuration. |
---|---|
DOI: | 10.48550/arxiv.1808.03518 |
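
The key idea in the summary, reordering requests by row-buffer address inside a lookahead window, can be illustrated with a small sketch. This is not the MARS architecture itself: the function names, the fixed (non-sliding) window, the single-bank open-row model, and the `ROW_BYTES` value are all assumptions made for illustration. In particular, the summary states that MARS needs no specific knowledge of the memory configuration, whereas this toy model assumes a row size to keep the example short.

```python
from collections import OrderedDict

# Assumed row-buffer granularity in bytes (hypothetical, for illustration only).
ROW_BYTES = 2048


def row_of(addr, row_bytes=ROW_BYTES):
    """Map a byte address to the row it falls in under the assumed row size."""
    return addr // row_bytes


def reorder_by_row(requests, lookahead, row_bytes=ROW_BYTES):
    """Group same-row requests together within fixed windows of `lookahead` requests.

    Rows are emitted in order of first appearance inside each window, and
    requests to the same row keep their original relative order.
    """
    out = []
    for start in range(0, len(requests), lookahead):
        window = requests[start:start + lookahead]
        groups = OrderedDict()
        for addr in window:
            groups.setdefault(row_of(addr, row_bytes), []).append(addr)
        for addrs in groups.values():
            out.extend(addrs)
    return out


def row_hits(requests, row_bytes=ROW_BYTES):
    """Count requests that find their row already open (single bank, open-row model)."""
    hits = 0
    open_row = None
    for addr in requests:
        row = row_of(addr, row_bytes)
        if row == open_row:
            hits += 1
        open_row = row
    return hits
```

For two streams of four 64-byte requests each, interleaved one by one and mapped to different rows, this model counts 0 row hits in the interleaved order and 6 after `reorder_by_row` with a lookahead of 8. A lookahead of 4 recovers only 4 hits, which mirrors the summary's point that the lookahead must be sufficiently large.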