Coarse Grained Task Parallelization by Dynamic Profiling for Heterogeneous SoC Based Embedded System

In this study, we introduce a methodology for automatically transforming user applications written in C/C++ to a parallel representation consisting of coarse-grained tasks based on dynamic profiling. Such a parallel representation is suitable for mapping applications onto heterogeneous SoCs. We pres...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:ACM transactions on embedded computing systems 2024-11
Hauptverfasser: Chang, Liangliang, Gener, Serhan, Mack, Joshua, Suluhan, Hasan Umut, Akoglu, Ali, Chakrabarti, Chaitali
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In this study, we introduce a methodology for automatically transforming user applications written in C/C++ to a parallel representation consisting of coarse-grained tasks based on dynamic profiling. Such a parallel representation is suitable for mapping applications onto heterogeneous SoCs. We present our approach for instrumenting the user application binary during the compilation process with parallel primitives that enable the runtime system to schedule and execute independent computation-intensive coarse-grained tasks concurrently. We use the proposed compilation and code transformation methodology to retarget each application for execution on a heterogeneous SoC composed of processor cores and accelerators. We demonstrate the capabilities of our integrated compile time and runtime flow through task-level parallelization and functionally correct execution of real-world applications in the communication systems and radar processing domains. We demonstrate the functionality of our integrated system by executing six distinct applications with different degrees of parallelism on four different platforms: an eight-core general-purpose processor, a heterogeneous SoC simulator, and two heterogeneous SoCs utilizing the Xilinx Zynq UltraScale+ FPGA and the Nvidia Jetson AGX board. Our integrated approach offers a path forward for application developers to take full advantage of the target SoC without requiring users to become hardware or parallel programming experts.
ISSN:1539-9087
1558-3465
DOI:10.1145/3704635