Automatic dataflow application tuning for heterogeneous systems

Bibliographic details
Main authors: Hartley, T D R; Saule, E; Catalyurek, U V
Format: Conference proceeding
Language: English
Description
Summary: Due to the increasing prevalence of multicore microprocessors and accelerator technologies in modern supercomputer design, new techniques for designing scientific applications are needed to efficiently leverage all of the power inherent in these systems. The dataflow programming paradigm is better suited to application design for distributed and heterogeneous systems than other techniques. Traditionally in dataflow middleware, application data domains are statically partitioned and distributed among the processors using a demand-driven algorithm. Unfortunately, this task scheduling technique can cause severe load imbalances in heterogeneous environments. Furthermore, in the presence of different types of processors, the optimum data size can differ for each processor type. To solve the load imbalance problem and to take advantage of this variation in optimum data size within a dataflow framework, we present an algorithm which automatically partitions the application workspace. By putting this partitioning into the purview of the dataflow runtime system, we can adaptively change the size of data buffers and correctly balance the load. Experiments with four applications show that our technique allows developers to skip the tedious and error-prone step of manually tuning the data granularity. Our technique is always competitive with the best-known data partitioning for these experiments, and can beat it under certain constraints.
ISSN: 1094-7256, 2640-0316
DOI: 10.1109/HIPC.2010.5713173