Scheduling in Heterogeneous Computing Environments for Proximity Queries

We present a novel, linear programming (LP)-based scheduling algorithm that exploits heterogeneous multicore architectures such as CPUs and GPUs to accelerate a wide variety of proximity queries. To represent complicated performance relationships between heterogeneous architectures and different com...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on visualization and computer graphics 2013-09, Vol.19 (9), p.1513-1525
Hauptverfasser:	Duksu Kim, Jinkyu Lee, Junghwan Lee, Insik Shin, Kim, J., Sung-Eui Yoon
Format:	Artikel
Sprache:	eng
Schlagworte:	Acceleration Algorithms Central processing units collision detection Computation Computational modeling Heterogeneous system Methods motion planning Multicore processing Optimization Performance enhancement Proximity proximity query Queries ray tracing Robustness Scheduling Scheduling algorithms Studies
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We present a novel, linear programming (LP)-based scheduling algorithm that exploits heterogeneous multicore architectures such as CPUs and GPUs to accelerate a wide variety of proximity queries. To represent complicated performance relationships between heterogeneous architectures and different computations of proximity queries, we propose a simple, yet accurate model that measures the expected running time of these computations. Based on this model, we formulate an optimization problem that minimizes the largest time spent on computing resources, and propose a novel, iterative LP-based scheduling algorithm. Since our method is general, we are able to apply our method into various proximity queries used in five different applications that have different characteristics. Our method achieves an order of magnitude performance improvement by using four different GPUs and two hexa-core CPUs over using a hexa-core CPU only. Unlike prior scheduling methods, our method continually improves the performance, as we add more computing resources. Also, our method achieves much higher performance improvement compared with prior methods as heterogeneity of computing resources is increased. Moreover, for one of tested applications, our method achieves even higher performance than a prior parallel method optimized manually for the application. We also show that our method provides results that are close (e.g., 75 percent) to the performance provided by a conservative upper bound of the ideal throughput. These results demonstrate the efficiency and robustness of our algorithm that have not been achieved by prior methods. In addition, we integrate one of our contributions with a work stealing method. Our version of the work stealing method achieves 18 percent performance improvement on average over the original work stealing method. This result shows wide applicability of our approach.
ISSN:	1077-2626 1941-0506
DOI:	10.1109/TVCG.2013.71