FSMFLog: Discovering Anomalous Logs Combining Full Semantic Information and Multi-feature Fusion

Bibliographic details
Published in: IEEE Internet of Things Journal 2023-07, p.1-1
Main authors: Niu, Weina; Li, Zimu; He, Zhaoxu; Wang, Aduo; Li, Beibei; Zhang, Xiaosong
Format: Article
Language: English
Description
Abstract: Industrial Internet of Things devices usually use log information to record their run-time status, so log-based anomaly detection can help discover device failures in time. The first step of log-based anomaly detection is log parsing. However, existing methods mainly extract log templates for analysis, ignoring some words that carry key semantics. Such omissions may cause semantic misunderstandings and further degrade anomaly detection performance. In addition, existing deep learning-based log anomaly detection approaches consider only the sequential relations among log messages and ignore log time and type information. In this paper, we propose an anomaly detection method called FSMFLog, based on full semantic information and multi-feature fusion. FSMFLog uses log word lists instead of log templates to represent semantic information. Specifically, the variable part is first removed through preprocessing, the log sentences are then initially clustered using two heuristic strategies, and finally the words in the log content are clustered through a prefix tree structure. By integrating semantic features, time features, and type features, FSMFLog also trains a bidirectional GRU model based on an attention mechanism. Evaluation on 16 real-world log datasets from LogHub shows that FSMFLog achieves higher log parsing accuracy than five other state-of-the-art log parsing methods. We also evaluated FSMFLog on the two most widely used public datasets (HDFS and BGL), and the results demonstrate its effectiveness: it outperforms the compared deep learning approaches with an average increase of more than 10% in F1-score.
ISSN: 2327-4662
DOI: 10.1109/JIOT.2023.3300690
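
The abstract describes the parsing stage only at a high level: variables are removed, log sentences are coarsely clustered with two heuristics, and the remaining words are clustered through a prefix tree. The Python sketch below illustrates one way such a pipeline could look. It is not the authors' implementation. The regular expressions in VARIABLE_PATTERNS, the use of word count and first word as the two heuristics (the abstract does not name them), and all function and class names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical masking rules; the abstract does not list the actual ones.
VARIABLE_PATTERNS = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"),  # IPv4 address, optional port
    re.compile(r"\b0x[0-9a-fA-F]+\b"),                    # hexadecimal literal
    re.compile(r"\b\d+\b"),                               # standalone integer
]


def preprocess(line):
    """Remove tokens that look like variables and return the remaining word list."""
    for pattern in VARIABLE_PATTERNS:
        line = pattern.sub(" ", line)
    return line.split()


class TrieNode:
    """Prefix tree node; line_ids holds the logs whose word list ends here."""

    def __init__(self):
        self.children = {}
        self.line_ids = []


def insert(root, words, line_id):
    """Walk or extend the tree along `words` and record the line at the final node."""
    node = root
    for word in words:
        node = node.children.setdefault(word, TrieNode())
    node.line_ids.append(line_id)


def collect(node, prefix, groups):
    """Depth-first traversal that maps each stored word list to its line ids."""
    if node.line_ids:
        groups[tuple(prefix)] = list(node.line_ids)
    for word, child in node.children.items():
        collect(child, prefix + [word], groups)


def cluster_logs(lines):
    """Group raw log lines by word list.

    Assumed heuristics: lines are first bucketed by (word count, first word),
    then each bucket gets its own prefix tree.
    """
    buckets = defaultdict(TrieNode)
    for line_id, line in enumerate(lines):
        words = preprocess(line)
        key = (len(words), words[0] if words else "")
        insert(buckets[key], words, line_id)

    groups = {}
    for root in buckets.values():
        collect(root, [], groups)
    return groups


if __name__ == "__main__":
    sample = [
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
        "Disk 3 usage at 91 percent",
    ]
    for words, ids in cluster_logs(sample).items():
        print(ids, " ".join(words))
```

In this sketch the first two sample lines differ only in their IP address, so they share one word list and fall into the same group, while the third line forms its own group. Keeping the full word list as the group key, rather than a template with wildcards, loosely mirrors the abstract's point that FSMFLog represents semantics with log word lists instead of log templates.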