A graph-based approach to detect unexplained sequences in a log


Bibliographic details
Published in: Expert systems with applications 2021-06, Vol. 171, p. 114556, Article 114556
Main authors: Cinque, Marcello; Della Corte, Raffaele; Moscato, Vincenzo; Sperlí, Giancarlo
Format: Article
Language: English
Description
Summary:
• A graph mining approach has been designed for recognizing anomalous sequences.
• It supports both real-time and batch processing for large-scale data analysis.
• A probabilistic penalty graph has been used for modeling log temporal sequences.
• The approach’s effectiveness has been evaluated for different system configurations.

In this paper we address the problem of detecting anomalous events in computer system log files through a novel graph mining approach. The basic idea is to model log temporal sequences as a specific kind of graph and event detection as a path-finding problem on that graph. Anomalous sequences then correspond to log parts that cannot be “explained” by any path in the graph. We propose a novel Iterative Partitioning Log Mining technique to parse any kind of log and to model its temporal sequence as a probabilistic penalty graph. The approach has been implemented in a framework, built on top of the Apache Spark analytics engine for large-scale data processing, that supports both real-time and batch processing. Experimental results show the advantages of the proposed framework in terms of effectiveness for different system configurations.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2020.114556
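
The summary's core idea, that a log sequence is anomalous when no path in a graph of known behaviour explains it, can be illustrated with a small sketch. This is not the paper's Iterative Partitioning Log Mining technique or its probabilistic penalty graph: a plain first-order transition graph is swapped in for illustration, and every name here (`build_transition_graph`, `unexplained_transitions`, `is_explained`, the event labels, the `min_prob` threshold) is a hypothetical choice, not something taken from the article.

```python
from collections import defaultdict


def build_transition_graph(sequences):
    """Build a directed graph whose edges carry observed transition probabilities.

    Assumption: each sequence is a list of already-parsed event labels.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        # Count every consecutive pair of events as one edge observation.
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1

    graph = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        graph[src] = {dst: n / total for dst, n in dsts.items()}
    return graph


def unexplained_transitions(graph, sequence, min_prob=0.0):
    """Return the consecutive event pairs that no sufficiently likely edge covers."""
    missing = []
    for src, dst in zip(sequence, sequence[1:]):
        if graph.get(src, {}).get(dst, 0.0) <= min_prob:
            missing.append((src, dst))
    return missing


def is_explained(graph, sequence, min_prob=0.0):
    """A sequence is explained when it follows a path of known edges."""
    return not unexplained_transitions(graph, sequence, min_prob)


# Hypothetical event sequences standing in for parsed log lines.
normal_runs = [
    ["login", "read", "write", "logout"],
    ["login", "read", "logout"],
    ["login", "write", "logout"],
]
graph = build_transition_graph(normal_runs)

known = ["login", "read", "logout"]
suspicious = ["login", "logout", "write"]
print(is_explained(graph, known))
print(unexplained_transitions(graph, suspicious))
```

Raising `min_prob` above zero would also flag transitions that were seen but only rarely, which is one simple way to trade recall for precision in a sketch like this.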