Identifying likely faulty components in a distributed system

In general, techniques are described for automatically identifying likely faulty components in massively distributed complex systems. In some examples, snapshots of component parameters are automatically repeatedly fed to a pre-trained classifier and the classifier indicates whether each received sn...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: MARQUES PEDRO, RANJAN ASHISH, NAKIL HARSHAD BHASKAR, SINGLA ANKUR, AJAY HAMPAPUR, GHOSE TIRTHANKAR, RAMESH ND, REDDY RAJASHEKAR
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In general, techniques are described for automatically identifying likely faulty components in massively distributed complex systems. In some examples, snapshots of component parameters are automatically repeatedly fed to a pre-trained classifier and the classifier indicates whether each received snapshot is likely to belong to a fault and failure class or to a non-fault/failure class. Components whose snapshots indicate a high likelihood of fault or failure are investigated, restarted or taken off line as a pre-emptive measure. The techniques may be applied in a massively distributed complex system such as a data center.