Anomaly detection and tuning recommendation system


Bibliographic details
Main authors: Kumar, Mukund; Estepp, Craig Allan; Rawtani, Nishant; Bhatnagar, Prateek; Lange, Klaus-Dieter; Sai Rajesh, Nalamati
Format: Patent
Language: English
Description
Summary: Systems and methods are provided for detecting anomalies on multiple layers of a computer system, such as a compute server. For example, the system can detect anomalies from the lower firmware layer up to the upper application layer of the compute server. The system collects train data from the computer system under test. The train data includes features that affect performance metrics, as defined by a selected benchmark, and is used to train machine learning (ML) models. The ML models create a train snapshot corresponding to the selected benchmark. Additionally, with every new release, a test snapshot can be created corresponding to the selected benchmark or workload. The system can detect an anomaly based on the train snapshot and the test snapshot. The system can also recommend tunings for a best set of features based on data collected over generations of compute servers.
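
The comparison of a train snapshot against a test snapshot can be illustrated with a minimal sketch. The abstract does not say which ML models are used or what a snapshot contains, so everything here is an assumption made for illustration: a snapshot is taken to be a per-metric mean and standard deviation over benchmark runs, a plain z-score check stands in for the trained models, and the names make_snapshot, detect_anomalies, the metric names, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev

def make_snapshot(runs):
    """Summarize benchmark runs as {metric: (mean, stdev)}.

    Assumption: each run is a dict of metric name -> measured value,
    and at least two runs are supplied (stdev needs two data points).
    """
    snapshot = {}
    for metric in runs[0]:
        values = [run[metric] for run in runs]
        snapshot[metric] = (mean(values), stdev(values))
    return snapshot

def detect_anomalies(train_snapshot, test_snapshot, threshold=3.0):
    """Return {metric: z_score} for metrics whose test mean lies more than
    `threshold` train standard deviations away from the train mean.

    This z-score rule is a simple stand-in, not the method of the patent.
    """
    anomalies = {}
    for metric, (train_mean, train_std) in train_snapshot.items():
        test_mean, _ = test_snapshot[metric]
        if train_std == 0:
            # No variation seen in the train runs; skipped in this sketch.
            continue
        z_score = (test_mean - train_mean) / train_std
        if abs(z_score) > threshold:
            anomalies[metric] = z_score
    return anomalies

# Hypothetical benchmark results from earlier releases (train data).
train_runs = [
    {"throughput": 100.0, "latency_ms": 10.0},
    {"throughput": 102.0, "latency_ms": 10.2},
    {"throughput": 98.0, "latency_ms": 9.8},
    {"throughput": 101.0, "latency_ms": 10.1},
    {"throughput": 99.0, "latency_ms": 9.9},
]
# Hypothetical results from a new release (test data).
test_runs = [
    {"throughput": 80.0, "latency_ms": 10.0},
    {"throughput": 82.0, "latency_ms": 10.1},
]

train_snapshot = make_snapshot(train_runs)
test_snapshot = make_snapshot(test_runs)
anomalies = detect_anomalies(train_snapshot, test_snapshot)
print(sorted(anomalies))
```

In this made-up data the throughput of the new release drops far below the range seen in the train runs, so it is flagged, while latency stays within normal variation. A real system along the lines of the abstract would use trained models and features from several layers of the server rather than a fixed threshold on two metrics.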