Understanding the Software Needs of High End Computer Users with XALT
Main authors: | , |
---|---|
Format: | Dataset |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The dataset is produced by the XALT software, installed on the High Performance Computing (HPC) resource Stampede at the Texas Advanced Computing Center (TACC). XALT tracks and collects job-level information about software libraries and executables on open-science HPC systems, also known as supercomputers. Open-science HPC resources are shared via powerful networks by researchers across the country and are maintained by a handful of supercomputer centers. To use the computational resources, researchers submit jobs, which consist of computational workflows designed to conduct analysis and calculations. The XALT data is used to determine which software libraries are most often used on a given system, a fundamental administrative function for shared HPC resources. Since nodes and memory are finite resources, software libraries must be selected for continued use and maintenance to ensure optimal performance for users. In addition to running on Stampede, XALT has been tested or installed at the National Institute for Computational Sciences, Oak Ridge Leadership Computing Facility, the National Center for Supercomputing Applications, Baden-Württemberg, the National Energy Research Scientific Computing Center, the Swiss National Supercomputing Centre, the National Oceanic and Atmospheric Administration, and KAUST Supercomputing Centre. Other current uses of the XALT data include debugging software libraries, indirect measurements of performance, and cost analysis based on the time and number of nodes in use. Sociologists, digital anthropologists, and scientific software producers have identified possible additional uses for this data, such as inferring collaborations, types of relationships, and practices of domain scientists working on computational projects. XALT may also be used to gather provenance metadata during computational jobs. 
Provenance information for the XALT dataset entails the software, associated libraries, and usage metrics that show the initial stage of computational analysis for scientific work. The XALT dataset, in JSON format, contains information on the number of nodes and the libraries and executables used by each user running a given computational job on Stampede. It also includes the science domain that users identify with for their projects. Personal identification information is sanitized prior to publication, but all jobs can be related to a particular user through an anonymous user ID. This |
---|---|
DOI: | 10.15781/t2pp4p |
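
The summary describes job records in JSON that carry a node count, the libraries and executables used, a science domain, and an anonymous user ID, and it notes that the data is used to find the most frequently used libraries. The following is a minimal Python sketch of that kind of tally. The record does not give the actual schema, so every field name here (`user_id`, `num_nodes`, `field_of_science`, `executable`, `libraries`) and all sample values are hypothetical placeholders, not the dataset's real keys.

```python
import json
from collections import Counter

# Hypothetical sample records. Field names and values are assumptions made
# for illustration; the published dataset may use a different schema.
SAMPLE = """
[
  {"user_id": "u001", "num_nodes": 16, "field_of_science": "Physics",
   "executable": "sim.x", "libraries": ["libmkl.so", "libmpi.so"]},
  {"user_id": "u002", "num_nodes": 4, "field_of_science": "Chemistry",
   "executable": "md.x", "libraries": ["libmpi.so", "libfftw3.so"]},
  {"user_id": "u001", "num_nodes": 8, "field_of_science": "Physics",
   "executable": "post.x", "libraries": ["libmpi.so", "libmpi.so", "libhdf5.so"]}
]
"""


def library_usage(records):
    """Count the number of jobs that used each library.

    A library listed more than once in a single job is counted once.
    """
    counts = Counter()
    for rec in records:
        counts.update(set(rec.get("libraries", [])))
    return counts


def node_totals_by_user(records):
    """Sum the node counts of all jobs for each anonymous user ID."""
    totals = Counter()
    for rec in records:
        totals[rec["user_id"]] += rec.get("num_nodes", 0)
    return totals


records = json.loads(SAMPLE)
usage = library_usage(records)
totals = node_totals_by_user(records)

# Show the single most frequently used library and the per-user node totals.
print(usage.most_common(1))
print(dict(totals))
```

Counting each library once per job keeps a job that lists the same library twice from inflating its ranking; whether that matches how the centers compute their own statistics is not stated in the record.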