Prevent: An Unsupervised Approach to Predict Software Failures in Production
Published in: | IEEE Transactions on Software Engineering, 2023-12, Vol. 49 (12), pp. 1-15 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | This paper presents Prevent, a fully unsupervised approach to predict and localize failures in distributed enterprise applications.
Software failures in production are unavoidable. Predicting failures and locating failing components online are the first steps to proactively manage faults in production. Many techniques predict failures from anomalous combinations of system metrics with supervised, weakly supervised, and semi-supervised learning models. Supervised approaches require large sets of labelled data not commonly available in large enterprise applications, and address failure types that can be either captured with predefined rules or observed while training supervised models.
Prevent integrates the core ingredients of unsupervised approaches into a novel fully unsupervised approach to predict failures and localize failing resources. The results of experimenting with Prevent on a commercially-compliant distributed cloud system indicate that Prevent provides more stable, reliable and timely predictions than supervised learning approaches, without requiring the often impractical training with labeled data. |
---|---|
ISSN: | 0098-5589 1939-3520 |
DOI: | 10.1109/TSE.2023.3327583 |