Probabilistic scheduling of dynamic I/O requests via application clustering for burst‐buffers equipped high‐performance computing


Bibliographic details
Published in: Concurrency and computation 2024-08, Vol.36 (19), p.n/a
Main authors: Zha, Benbo; Shen, Hong
Format: Article
Language: English
Description
Summary: Burst‐buffering is a promising storage solution that introduces an intermediate high‐throughput buffer layer to mitigate the I/O bottleneck from which current high‐performance computing (HPC) platforms suffer. Existing Markov‐chain‐based probabilistic I/O scheduling uses the load state of the burst‐buffers and the periodic characteristics of applications to reduce the I/O congestion caused by the limited capacity of burst‐buffers. However, this probabilistic approach requires applications to have consistent I/O characteristics, including similar I/O durations and long application lengths, in order to obtain an accurate I/O load estimate. These consistency conditions often do not hold in realistic situations. In this paper, we propose a generic framework of dynamic probabilistic I/O scheduling based on application clustering (DPSAC) so that applications meet the consistency requirements. Our scheme first applies a one‐dimensional K‐means algorithm to group the applications into clusters according to the I/O phase length of each application. Next, it calculates the expected workload of each cluster through the probabilistic model of applications and partitions the burst‐buffers among the clusters in proportion to that workload. Then, to handle dynamic changes (applications joining and exiting), it updates the clusters using a heuristic strategy. Finally, it applies probabilistic I/O scheduling, based on the distribution of application workload and the state of the burst‐buffers, to schedule I/O for all concurrent applications and mitigate I/O congestion. Simulation results on synthetic data show that DPSAC is effective and efficient.
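
The first two steps named in the summary (one‐dimensional K‐means on I/O phase lengths, then proportional partitioning of the burst‐buffers) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `App` record, the function names, the centroid initialisation, and above all the workload model are assumptions made for illustration: each application is taken to be periodic, so the probability that it is doing I/O at a random instant is assumed to be io_len / (io_len + compute_len). The record does not state the paper's actual probabilistic model, its cluster‐update heuristic, or its scheduler, and the last two are not sketched here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class App:
    """Hypothetical periodic application: a compute phase followed by an I/O phase."""
    io_len: float       # I/O phase length, in seconds
    compute_len: float  # compute phase length, in seconds
    bandwidth: float    # bandwidth demanded while doing I/O, in GB/s


def kmeans_1d(values: Sequence[float], k: int, max_iters: int = 100) -> List[int]:
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    ordered = sorted(values)
    n = len(ordered)
    # Assumed deterministic initialisation: centroids spread over the sorted values.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels: List[int] = []
    for _ in range(max_iters):
        # Assignment step: nearest centroid by absolute distance.
        new_labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return labels


def expected_cluster_loads(apps: Sequence[App], labels: Sequence[int], k: int) -> List[float]:
    """Expected bandwidth demand per cluster under the assumed periodic model."""
    loads = [0.0] * k
    for app, lab in zip(apps, labels):
        p_io = app.io_len / (app.io_len + app.compute_len)  # assumed I/O probability
        loads[lab] += app.bandwidth * p_io
    return loads


def partition_buffer(capacity: float, loads: Sequence[float]) -> List[float]:
    """Split the buffer capacity among clusters in proportion to expected load."""
    total = sum(loads)
    if total == 0:
        return [capacity / len(loads)] * len(loads)  # nothing to weigh by: split evenly
    return [capacity * load / total for load in loads]
```

With three applications that have short I/O phases and three that have long ones, `kmeans_1d` separates the two groups, and `partition_buffer` gives the larger share to the cluster with the higher expected load.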
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.8142