Autotuning of configuration for program execution in GPUs


Detailed description

Bibliographic details
Published in: Concurrency and Computation 2020-05, Vol. 32 (9), p. n/a
Main authors: Balaiah, Thanasekhar; Parthasarathi, Ranjani
Format: Article
Language: English
Online access: Full text
Description
Abstract: Summary. Graphics Processing Units (GPUs) are used as accelerators to improve the performance of highly data-parallel applications. GPUs are characterized by a number of Streaming Multiprocessors (SMs) and a large number of cores within each SM. In addition, GPUs contain a hierarchy of memories with different latencies and sizes. Program execution on GPUs thus depends on a number of parameter values, both at compile time and at runtime. Obtaining optimal performance from these GPU resources requires exploring a large parameter space, which leads to many unproductive program executions. To alleviate this difficulty, machine learning-based autotuning systems have been proposed to predict the right configuration from a limited set of compile-time parameters. In this paper, we propose a two-stage machine learning-based autotuning framework using an expanded set of attributes. Important parameters such as block size, occupancy, eligible warps, and execution time are predicted. The mean relative error in predicting the different parameters ranges from 6.5% to 16%. Dimensionality reduction on the feature set reduces the number of features by up to 50% while further increasing prediction accuracy.
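The abstract reports prediction quality as mean relative error. As a small illustration of that metric (not the authors' code; the sample values below are hypothetical), a pure-Python sketch:

```python
def mean_relative_error(predicted, actual):
    """Mean relative error: average of |pred - actual| / |actual|
    over all prediction/ground-truth pairs."""
    assert len(predicted) == len(actual) and len(actual) > 0
    return sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical predicted vs. measured occupancy values (illustrative only)
pred = [0.72, 0.55, 0.90, 0.48]
meas = [0.75, 0.50, 0.85, 0.50]
print(f"{mean_relative_error(pred, meas):.3f}")  # → 0.060, i.e. 6% MRE
```

A figure such as the abstract's "6.5% to 16%" would be this quantity computed per predicted parameter (block size, occupancy, eligible warps, execution time) over a test set.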
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.5635