Dataset Artifact for Prodigy: Towards Unsupervised Anomaly Detection in Production HPC Systems
Main authors: | , , , , , , , , |
---|---|
Format: | Dataset |
Language: | eng |
Subjects: | |
Summary: | The dataset contains a small set of application runs from the Eclipse supercomputer. The applications were run with and without synthetic HPC performance anomalies. More detailed information about the synthetic anomalies can be found at: https://github.com/peaclab/HPAS. We chose four applications, namely LAMMPS, sw4, sw4Lite, and ExaMiniMD, to cover both real and proxy applications. We executed each application five times on four compute nodes without introducing any anomalies, and five more times with an injected anomaly. For this experiment we selected the "memleak" anomaly because it is one of the most commonly occurring types. The collected dataset consists of 160 samples in total, 80 labeled as anomalous and 80 labeled as healthy, which is consistent with one sample per compute node per run (4 applications × 5 runs × 4 nodes = 80 per condition). For details of the applications, please refer to the paper. The applications were run on Eclipse, which is located at Sandia National Laboratories. Eclipse comprises 1488 compute nodes, each equipped with 128 GB of memory and two sockets. Each socket contains an E5-2695 v4 CPU with 18 cores and 2-way hyperthreading. |
---|---|
DOI: | 10.5281/zenodo.8079387 |
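
The sample counts in the summary can be checked with a short sketch. It enumerates every combination of application, condition, run, and node described above. It assumes one sample per compute node per run, which the record does not state explicitly but which matches the reported totals. The field names (`app`, `label`, `anomaly`, `run`, `node`) are hypothetical and do not reflect the actual file layout of the dataset.

```python
from collections import Counter
from itertools import product

# Values taken from the dataset summary.
APPLICATIONS = ["LAMMPS", "sw4", "sw4Lite", "ExaMiniMD"]
RUNS_PER_CONDITION = 5
NODES_PER_RUN = 4
# Label -> injected anomaly (None means no anomaly was injected).
CONDITIONS = {"healthy": None, "anomalous": "memleak"}


def enumerate_samples():
    """List one record per (application, condition, run, node).

    Assumption: each compute node of each run contributes exactly one
    sample. The record layout here is illustrative only.
    """
    samples = []
    for app, (label, anomaly), run, node in product(
        APPLICATIONS,
        CONDITIONS.items(),
        range(RUNS_PER_CONDITION),
        range(NODES_PER_RUN),
    ):
        samples.append(
            {"app": app, "label": label, "anomaly": anomaly, "run": run, "node": node}
        )
    return samples


samples = enumerate_samples()
label_counts = Counter(sample["label"] for sample in samples)

print(len(samples))        # total number of samples
print(dict(label_counts))  # samples per label
```

Under that assumption the enumeration yields 160 records, split evenly between the healthy and anomalous labels, in agreement with the summary.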