Automatic block dimensioning on GPU-accelerated programs through particle swarm optimization

Bibliographic details
Published in: Information and Software Technology, 2020-07, Vol. 123, p. 106299, Article 106299
Main authors: Pereira, Claudio M.N.A.; Pinheiro, Andre L.S.; Schirru, Roberto
Format: Article
Language: English
Description
Abstract: The use of GPUs to improve the performance of computationally expensive systems is now widely explored. In GPU-accelerated programs, performance depends on how the problem is partitioned into blocks of threads, so that the parallel tasks to be executed fit the GPU architecture as well as possible. Although some general guidelines exist to help define block dimensions, finding the optimum partition is still a complex and problem-dependent task. This work investigates the use of particle swarm optimization (PSO) to optimize block dimensions with the aim of minimizing program execution time. The approach was evaluated on a GPU-accelerated wind field calculation program in which block dimensioning had been based on literature guidelines and empirical adjustments. Before PSO optimization, the program was about 25 times faster than the sequential program. After applying PSO, the speedup increased to about 60 times. Unexpected optimized configurations were observed, confirming that finding the optimum dimensioning is a complex task. The use of a robust optimization tool such as PSO therefore proved very profitable, allowing automatic optimization of block dimensions without requiring a priori knowledge of the problem, the program's peculiarities, or the GPU architecture.

The objective is to improve the speedup of GPU-accelerated programs by automatically defining optimized block dimensions using PSO. The study focuses on a GPU-accelerated wind field calculation problem. A PSO was interfaced to the program in order to find the block dimensions that lead to a minimum execution time. Results were compared with results from the literature. The speedup obtained with the proposed approach is more than 2 times the original speedup. PSO proved very profitable, allowing automatic optimization of block dimensions without requiring a priori knowledge of the problem or program peculiarities or of the GPU architecture.
ISSN: 0950-5849, 1873-6025
DOI: 10.1016/j.infsof.2020.106299
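
The abstract describes interfacing a PSO to the program so that the swarm searches for the block dimensions giving the lowest execution time. This record does not give the authors' PSO variant, parameters, or interface, so the following Python sketch is only an illustration of the general idea under stated assumptions: a basic global-best PSO over two integer block dimensions, with a synthetic cost function (`synthetic_time`, hypothetical) standing in for a measured kernel run time. All names, bounds, and coefficients are assumptions made for illustration.

```python
import random


def synthetic_time(bx, by):
    """Hypothetical stand-in for a measured execution time.

    In a real setup this would launch the GPU program with a (bx, by)
    thread block and return the measured wall-clock time. The bowl shape
    and its minimum at (16, 4) are arbitrary, chosen only for the demo.
    """
    return (bx - 16) ** 2 + (by - 4) ** 2 + 1.0


def pso_block_dims(cost, bounds, n_particles=12, n_iters=40,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(*dims) over integer dims with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)

    def to_dims(x):
        # Round continuous positions to integers and clamp them to the bounds.
        return tuple(min(max(int(round(v)), lo), hi)
                     for v, (lo, hi) in zip(x, bounds))

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the global best.
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(*to_dims(p)) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(*to_dims(pos[i]))
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c

    return to_dims(gbest), gbest_cost


# Bounds of 1..32 per dimension keep bx * by at or below 1024 threads,
# a per-block limit assumed here for the target GPU.
best_dims, best_cost = pso_block_dims(synthetic_time, [(1, 32), (1, 32)])
print("best block dims:", best_dims, "cost:", best_cost)
```

In practice each cost evaluation would be one timed run of the GPU program, so the number of particles and iterations directly sets the tuning budget. For the figures quoted in the abstract, a speedup going from about 25 times to about 60 times corresponds to a ratio of roughly 60/25 = 2.4, consistent with the "more than 2 times" statement.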