Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations
Stencil computations arise in a number of computational domains. They exhibit significant data parallelism and are thus well suited for execution on graphical processing units (GPUs), but can be memory-bandwidth limited unless temporal locality is utilized via tiling. This paper describes how effect...
Gespeichert in:
Veröffentlicht in: | Proceedings of the IEEE 2018-11, Vol.106 (11), p.1902-1920 |
---|---|
Hauptverfasser: | , , , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Stencil computations arise in a number of computational domains. They exhibit significant data parallelism and are thus well suited for execution on graphical processing units (GPUs), but can be memory-bandwidth limited unless temporal locality is utilized via tiling. This paper describes how effective tiled code can be generated for GPUs from a domain-specific language (DSL) for stencils. Experimental results demonstrate the benefits of such a domain-specific optimization approach over state-of-the-art general-purpose compiler optimizations. |
---|---|
ISSN: | 0018-9219 1558-2256 |
DOI: | 10.1109/JPROC.2018.2862896 |