SACAT: Streaming-Aware Conflict-Avoiding Thrashing-Resistant GPGPU Cache Management Scheme
Published in: | IEEE Transactions on Parallel and Distributed Systems, 2017-06, Vol. 28 (6), p. 1740-1753 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Modern graphics processing units (GPUs) are equipped with general-purpose L1 and L2 caches to reduce memory bandwidth demand and improve the performance of some irregular general-purpose GPU (GPGPU) applications. However, due to massive multithreading, GPGPU caches suffer from severe resource contention and low data sharing, which may degrade performance instead. This paper proposes a low-cost streaming-aware, conflict-avoiding, thrashing-resistant (SACAT) GPGPU cache management scheme that uses the GPGPU cache resources efficiently and addresses all the problems associated with GPGPU caches. The proposed scheme employs three orthogonal techniques. First, it dynamically detects and bypasses streaming applications at fine granularity. Second, dynamic warp throttling via cores sampling (DWT-CS) is proposed to alleviate cache thrashing; DWT-CS runs an exhaustive search over cores to find the number of warps that achieves the highest performance. Third, it employs a pseudo-random interleaving cache (PRIC), an improved cache indexing function based on polynomial modulus mapping, to mitigate associativity stalls and eliminate conflict misses. Experimental results demonstrate that the proposed scheme achieves 1.87× and 1.5× performance improvements over the cache-conscious wavefront scheduler (CCWS) and the memory request prioritization buffer (MRPB), respectively. |
---|---|
ISSN: | 1045-9219, 1558-2183 |
DOI: | 10.1109/TPDS.2016.2627560 |
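
The abstract names polynomial modulus mapping as the basis of the PRIC indexing function but gives no details. The following is a minimal sketch of the general technique only, not the paper's design: the block address is read as a polynomial over GF(2) and the set index is its remainder modulo an irreducible polynomial. The function name `poly_mod_index`, the 32-set cache, and the polynomial x^5 + x^2 + 1 are assumptions chosen for illustration.

```python
def poly_mod_index(block_addr: int, poly: int, index_bits: int) -> int:
    """Return block_addr mod poly, with both read as GF(2) polynomials.

    Hypothetical helper: poly must have degree index_bits, so the
    remainder always fits in index_bits bits (a valid set index).
    """
    if poly.bit_length() != index_bits + 1:
        raise ValueError("poly must have degree equal to index_bits")
    remainder = block_addr
    # Long division over GF(2): cancel the top bit with a shifted poly.
    while remainder.bit_length() > index_bits:
        shift = remainder.bit_length() - poly.bit_length()
        remainder ^= poly << shift
    return remainder


# Assumed parameters: 32 sets, irreducible polynomial x^5 + x^2 + 1.
INDEX_BITS = 5
POLY = 0b100101

# A stride of 32 blocks is the worst case for plain modulo indexing.
strided = [k * 32 for k in range(32)]
conventional_sets = {addr % 32 for addr in strided}
polynomial_sets = {poly_mod_index(addr, POLY, INDEX_BITS) for addr in strided}
```

With plain modulo indexing, all 32 strided addresses land in a single set, whereas the polynomial remainder spreads them over all 32 sets. This is the kind of conflict-miss reduction the abstract attributes to PRIC. The polynomial and cache geometry actually used in the paper are not stated in this record.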