Supercomputer computing resource fault prediction method

Bibliographic details
Main authors: LIU XIN, SONG CHANGMING, QIAN YU, DIAO XIAONA, ZHANG HONGYU, GONG DAOYONG, LI WEIDONG
Format: Patent
Language: chi; eng
Description
Abstract: The invention discloses a method for predicting computing resource faults in a supercomputer, comprising the following steps. S1: collect feature information of a computing node every s seconds and record it as x1s; n consecutive intervals of s seconds form one time window T. S2: accumulate data over m time windows T to obtain m total features X as input samples. S3: divide the m total features X processed in S2, together with their corresponding states Y, into groups according to the batch size. S4: starting from the (m + 1)th time window T, process the latest collected total feature Xtest and its corresponding state Ytest as in S2. S5: set a threshold and compare the prediction result Y' with the corresponding state Ytest; when the deviation exceeds the threshold, adjust the parameters of the training model, retrain it, and repeat S4. The problem that it is difficult to effectively predict the computing resource faults of
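The data flow of steps S1 to S5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not name the training model, so `MeanThresholdModel` is a hypothetical stand-in, and every function name, the synthetic feature values, and the use of an absolute difference as the "deviation" are assumptions made only for illustration.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def build_total_feature(samples: Sequence[Sequence[float]]) -> List[float]:
    """S1/S2 (assumed layout): concatenate the n samples of one window T into one vector X."""
    return [value for sample in samples for value in sample]


def make_batches(xs: list, ys: list, batch_size: int) -> List[Tuple[list, list]]:
    """S3: split the (X, Y) pairs into consecutive groups of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [(xs[i:i + batch_size], ys[i:i + batch_size])
            for i in range(0, len(xs), batch_size)]


class MeanThresholdModel:
    """Hypothetical stand-in model: predicts 1.0 (fault) when the mean feature
    value of a window exceeds a cut-off learned from labelled windows."""

    def __init__(self) -> None:
        self.cutoff = 0.0

    def fit(self, xs: list, ys: list) -> None:
        healthy = [mean(x) for x, y in zip(xs, ys) if y == 0]
        faulty = [mean(x) for x, y in zip(xs, ys) if y == 1]
        if healthy and faulty:
            # Place the cut-off midway between the two class averages.
            self.cutoff = (mean(healthy) + mean(faulty)) / 2

    def predict(self, x: Sequence[float]) -> float:
        return 1.0 if mean(x) > self.cutoff else 0.0


def needs_retraining(y_pred: float, y_test: float, threshold: float) -> bool:
    """S5: flag retraining when the deviation (assumed: absolute difference)
    between prediction Y' and observed state Ytest exceeds the threshold."""
    return abs(y_pred - y_test) > threshold


# Synthetic data: m = 4 windows, each with n = 3 samples of 2 features.
n, m, batch_size = 3, 4, 2
windows = [[[0.1, 0.2]] * n, [[0.9, 0.8]] * n, [[0.2, 0.1]] * n, [[0.8, 0.9]] * n]
xs = [build_total_feature(w) for w in windows]
ys = [0, 1, 0, 1]

batches = make_batches(xs, ys, batch_size)
model = MeanThresholdModel()
for batch_x, batch_y in batches:
    model.fit(batch_x, batch_y)

# S4/S5: evaluate window m + 1 and retrain if the deviation is too large.
x_test, y_test = build_total_feature([[0.9, 0.9]] * n), 1
if needs_retraining(model.predict(x_test), y_test, threshold=0.5):
    model.fit(xs + [x_test], ys + [y_test])
```

In a real deployment the stand-in model would be replaced by whatever learner is actually trained on the batches, and the retraining step would adjust its parameters as the record describes.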