Job Management and Task Bundling

High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Berkowitz, Evan, Jansen, Gustav R., McElvain, Kenneth, Walker-Loud, André
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size workloads. By bundling tasks together we can create large jobs suitable for gigantic partitions. We discuss METAQ and mpi_jm, software developed to dynamically group computational tasks together, that can intelligently backfill to consume idle time without substantial changes to users’ current workflows or executables.
ISSN:2100-014X
2101-6275
2100-014X
DOI:10.1051/epjconf/201817509007