Heterogeneous-race-free memory models

Commodity heterogeneous systems (e.g., integrated CPUs and GPUs), now support a unified, shared memory address space for all components. Because the latency of global communication in a heterogeneous system can be prohibi-tively high, heterogeneous systems (unlike homogeneous CPU systems) provide sy...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:SIGPLAN notices 2014-04, Vol.49 (4), p.427-440
Hauptverfasser: Hower, Derek R., Hechtman, Blake A., Beckmann, Bradford M., Gaster, Benedict R., Hill, Mark D., Reinhardt, Steven K., Wood, David A.
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Commodity heterogeneous systems (e.g., integrated CPUs and GPUs), now support a unified, shared memory address space for all components. Because the latency of global communication in a heterogeneous system can be prohibi-tively high, heterogeneous systems (unlike homogeneous CPU systems) provide synchronization mechanisms that only guarantee ordering among a subset of threads, which we call a scope. Unfortunately, the consequences and se-mantics of these scoped operations are not yet well under-stood. Without a formal and approachable model to reason about the behavior of these operations, we risk an array of portability and performance issues. In this paper, we embrace scoped synchronization with a new class of memory consistency models that add scoped synchronization to data-race-free models like those of C++ and Java. Called sequential consistency for heterogeneous-race-free (SC for HRF), the new models guarantee SC for programs with "sufficient" synchronization (no data races) of "sufficient" scope. We discuss two such models. The first, HRF-direct, works well for programs with highly regular parallelism. The second, HRF-indirect, builds on HRF-direct by allowing synchronization using different scopes in some cases involving transitive communication. We quanti-tatively show that HRF-indirect encourages forward-looking programs with irregular parallelism by showing up to a 10% performance increase in a task runtime for GPUs.
ISSN:0362-1340
1558-1160
DOI:10.1145/2644865.2541981