Efficient and Portable Workgroup Size Tuning


Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2020-02, Vol. 31 (2), p. 455-469
Main authors: Yu, Chia-Lin; Tsao, Shiao-Li
Format: Article
Language: English
Description
Summary: The performance of an OpenCL program is strongly influenced by both hardware and software attributes. To achieve superior performance, developers may leverage automatic performance tuning techniques to determine the optimal parameters on the target device. Although existing approaches have shown promising tuning results in their target scenarios, other requirements such as efficiency, portability, and usability should also be considered because of the rapid growth of heterogeneous computing applications and platforms. In this paper, we re-examine the workgroup size tuning problem and propose a novel approach to meet the aforementioned requirements. We abstract the architectural details into a set of hardware parameters so that the proposed approach can be applied without the presence of target devices, which makes it more accessible to developers. The proposed approach is evaluated on 20 OpenCL kernels and six devices, including both CPUs and GPUs. Experimental results demonstrate that, with negligible overhead, our approach filters out 88.6 percent of the possible workgroup sizes on average. Among all the workgroup size candidates, the best- and worst-performing candidates achieve, on average, 95.5 and 92.1 percent of the performance of the optimal workgroup size, respectively.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2019.2937295