TURBULENCE: Complexity-Effective Out-of-Order Execution on GPU With Distance-Based ISA

Bibliographic details
Published in:	IEEE Computer Architecture Letters 2024-07, Vol.23 (2), p.175-178
Main authors:	Matsuo, Reoma; Koizumi, Toru; Irie, Hidetsugu; Sakai, Shuichi; Shioya, Ryota
Format:	Article
Language:	English
Subjects:	Decoding; Dynamic scheduling; Energy efficiency; GPU; Graphics processing units; instruction-level parallelism; Microarchitecture; Out of order; out-of-order execution; Registers; Relays

Description
Abstract:	A graphics processing unit (GPU) is a processor that achieves high throughput by exploiting data parallelism. We found that many GPU workloads also contain instruction-level parallelism that can be extracted through out-of-order execution to provide additional performance improvement opportunities. We propose the TURBULENCE architecture for very low-cost out-of-order execution on GPUs. TURBULENCE consists of a novel ISA that references operands by inter-instruction distance instead of by register number, and a novel microarchitecture that executes this ISA. Distance-based operands have the property of not causing false dependencies. By exploiting this property, we achieve cost-effective out-of-order execution on GPUs without introducing expensive hardware such as rename logic and a load-store queue. Simulation results show that TURBULENCE improves performance by 17.6% over an existing GPU without increasing energy consumption.
ISSN:	1556-6056, 1556-6064
DOI:	10.1109/LCA.2023.3289317
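
The abstract's claim that distance-based operands cause no false dependencies can be illustrated with a toy model. The sketch below is not the TURBULENCE ISA or its encoding; the instruction classes, mnemonics, and the two four-instruction programs are hypothetical and exist only to contrast the two operand styles. It assumes that a distance of 1 means "the result of the previous instruction" and that every instruction produces exactly one result.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DistInst:
    """Hypothetical distance-based instruction (illustration only)."""
    op: str
    srcs: Tuple[int, ...]  # distances back to producers; 1 = previous instruction


@dataclass(frozen=True)
class RegInst:
    """Hypothetical register-based instruction (illustration only)."""
    op: str
    dst: str
    srcs: Tuple[str, ...]


def true_dependencies(program: List[DistInst]) -> List[Tuple[int, int]]:
    """Return sorted (producer, consumer) index pairs.

    Each result is named only by its position in the instruction stream,
    so no two instructions ever write the same name and only
    read-after-write dependencies can exist.
    """
    deps = set()
    for i, inst in enumerate(program):
        for d in inst.srcs:
            if d < 1 or i - d < 0:
                raise ValueError(f"instruction {i}: invalid distance {d}")
            deps.add((i - d, i))
    return sorted(deps)


def false_hazards(program: List[RegInst]) -> List[Tuple[str, int, int]]:
    """Conservatively list WAW/WAR hazards caused by reusing register names."""
    hazards = []
    for j, later in enumerate(program):
        for i in range(j):
            earlier = program[i]
            if earlier.dst == later.dst:
                hazards.append(("WAW", i, j))  # both write the same register
            if later.dst in earlier.srcs:
                hazards.append(("WAR", i, j))  # later write clobbers an earlier read
    return hazards


# The same computation in both styles: (a + a) and (b * b).
REG_PROGRAM = [
    RegInst("load_a", "r1", ()),
    RegInst("add", "r2", ("r1", "r1")),
    RegInst("load_b", "r1", ()),        # reuses r1: creates false dependencies
    RegInst("mul", "r3", ("r1", "r1")),
]

DIST_PROGRAM = [
    DistInst("load_a", ()),
    DistInst("add", (1, 1)),
    DistInst("load_b", ()),
    DistInst("mul", (1, 1)),
]

if __name__ == "__main__":
    print(false_hazards(REG_PROGRAM))
    print(true_dependencies(DIST_PROGRAM))
```

In the register-based version, reusing `r1` ties instruction 2 to instructions 0 and 1 even though no data flows between them, which is the kind of hazard that rename logic normally removes. In the distance-based version the two load/compute pairs are linked only by their true dependencies, so a scheduler could issue them in either order.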