Hybridhadoop: CPU-GPU hybrid scheduling in hadoop

As a GPU has become an essential component in high performance computing, it has been attempted by many works to leverage GPU computing in Hadoop. However, few works considered to fully utilize the GPU in Hadoop and only a few works studied utilizing both CPU and GPU at the same time. In this paper,...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Cluster computing 2024-06, Vol.27 (3), p.3875-3892
Hauptverfasser: Oh, Chanyoung, Yi, Saehanseul, Seok, Jongkyu, Jung, Hyeonjin, Yoon, Illo, Yi, Youngmin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:As a GPU has become an essential component in high performance computing, it has been attempted by many works to leverage GPU computing in Hadoop. However, few works considered to fully utilize the GPU in Hadoop and only a few works studied utilizing both CPU and GPU at the same time. In this paper, we propose a CPU-GPU hybrid scheduling in Hadoop, where both CPUs and GPUs in a node are exploited as much as possible in an adaptive manner. The technical barrier stands in that the optimal number of GPU tasks is not known in advance, and the total number of Containers in a node cannot be changed once a Hadoop job starts. In the proposed approach, we first determine the initial number of Containers as well as the hybrid execution mode, then the proposed dynamic scheduler adjusts the number of Containers for a GPU and a CPU with the help of a GPU monitor during the job execution. It also employs a load-balancing algorithm for the tail. The experiments with various benchmarks show that the proposed CPU-GPU hybrid scheduling achieves 3.87 × of speedup on average against the 12-core CPU-only Hadoop.
ISSN:1386-7857
1573-7543
DOI:10.1007/s10586-023-04178-5