A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels
Proceedings of the 2015 International Conference on Parallel and Distributed Processing Techniques and Applications, pages 589-599, WORLDCOMP'15, July 27-30, 2015, Las Vegas, Nevada.
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Summary: | Future computing systems, from handhelds to supercomputers, will undoubtedly
be more parallel and heterogeneous than today's systems to provide more
performance and energy efficiency. Thus, GPUs are increasingly being used to
accelerate general-purpose applications, including applications with
data-dependent, irregular control flow and memory access patterns. However, the
growing complexity, exposed memory hierarchy, incoherence, heterogeneity, and
parallelism will make accelerator-based systems progressively more difficult to
program. In the foreseeable future, the vast majority of programmers will no
longer be able to extract additional performance or energy savings from
next-generation systems because the programming will be too difficult. Automatic
performance analysis and optimization recommendation tools have the potential
to avert this situation. They embody expert knowledge and make it available to
software developers when needed. In this paper, we describe and evaluate such a
tool. |
---|---|
DOI: | 10.48550/arxiv.1910.07776 |