Dependency-aware fault diagnosis with metric-correlation models in enterprise software systems
The normal operation of enterprise software systems can be modeled by stable correlations between various system metrics; errors are detected when some of these correlations fail to hold. The typical approach to diagnosis (i.e., pinpoint the faulty component) based on the correlation models is to us...
Gespeichert in:
Hauptverfasser: | , , , |
---|---|
Format: | Tagungsbericht |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The normal operation of enterprise software systems can be modeled by stable correlations between various system metrics; errors are detected when some of these correlations fail to hold. The typical approach to diagnosis (i.e., pinpoint the faulty component) based on the correlation models is to use the Jaccard coefficient or some variant thereof, without reference to system structure, dependency data, or prior fault data. In this paper we demonstrate the intrinsic limitations of this approach, and propose a solution that mitigates these limitations. We assume knowledge of dependencies between components in the system, and take this information into account when analyzing the correlation models. We also propose the use of the Tanimoto coefficient instead of the Jaccard coefficient to assign anomaly scores to components. We evaluate our new algorithm with a Trade6-based test-bed. We show that we can find the faulty components within top-3 components with the highest anomaly score in four out of nine cases, while the prior method can only find one. |
---|---|
ISSN: | 2165-9605 |
DOI: | 10.1109/CNSM.2010.5691319 |