Jobs Runtime Forecast for JSCC RAS Supercomputers Using Machine Learning Methods

Bibliographic details
Published in: Lobachevskii journal of mathematics, 2020-12, Vol. 41 (12), p. 2593-2602
Main authors: Savin, G. I., Shabanov, B. M., Nikolaev, D. S., Baranov, A. V., Telegin, P. N.
Format: Article
Language: English
Description
Summary: The paper addresses machine learning methods and algorithms for predicting the execution time of supercomputer jobs. Supercomputer statistics show that the actual runtime of most jobs diverges substantially from the time requested by the user. This reduces scheduling efficiency, since an inaccurate estimate of job execution time leads to a suboptimal job schedule. The paper considers a job classification based on the difference between a job's actual and requested execution time. The forecast is made from the statistics of the supercomputer's multiuser job management system by assigning each submitted job to one of the classes. Statistics from the MVS-100K and MVS-10P supercomputers at the Joint Supercomputer Center of the Russian Academy of Sciences (JSCC RAS) were used. Job flow features were ranked by importance based on the results of statistical analysis, and the cross-correlation of the most important features was determined. Estimates of the probability of correct prediction were obtained for selected well-known machine learning algorithms: logistic regression, decision trees, k-nearest neighbors, linear discriminant analysis, support vector machine, random forest, gradient boosting, and feedforward neural network. The best values were obtained with the random forest method.
ISSN: 1995-0802, 1818-9962
DOI: 10.1134/S1995080220120343
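
Illustration: The summary describes assigning each job to a class according to the difference between its actual and requested execution time, but it does not state the class boundaries. The following is only a minimal sketch of such a labeling step. The thresholds in `UNUSED_EDGES`, the use of the relative (rather than absolute) difference, and the function names are assumptions made for illustration, not the authors' definitions.

```python
from bisect import bisect_right

# Hypothetical class boundaries on the unused fraction of the requested time,
# (requested - actual) / requested. The paper's real boundaries are not given
# in the summary; these values are placeholders.
UNUSED_EDGES = [0.1, 0.5, 0.9]


def runtime_class(actual_s: float, requested_s: float) -> int:
    """Return a class index in 0..len(UNUSED_EDGES) for one job.

    Class 0 means the job used almost all of its requested time;
    the highest class means it used only a small part of it.
    """
    if requested_s <= 0:
        raise ValueError("requested runtime must be positive")
    unused = (requested_s - actual_s) / requested_s
    # bisect_right counts how many boundaries are <= unused.
    return bisect_right(UNUSED_EDGES, unused)


def label_jobs(jobs):
    """Label a sequence of (actual_s, requested_s) pairs."""
    return [runtime_class(actual, requested) for actual, requested in jobs]
```

Labels produced this way could serve as the target variable for any of the classifiers listed in the summary; the feature set and training procedure are described in the full text and are not reproduced here.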