Reservoir-based network traffic stream summarization for anomaly detection
Published in: | Pattern analysis and applications : PAA 2018-05, Vol.21 (2), p.579-599 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Summarization is an important intermediate step for expediting knowledge discovery tasks such as anomaly detection. In the context of anomaly detection from data streams, the summary needs to represent both anomalous and normal data. However, streaming data has distinct characteristics, such as the one-pass constraint, that make data mining operations difficult. Existing stream summarization techniques are unable to create summaries that represent both normal and anomalous instances. To address this problem, this paper designs and develops a number of hybrid summarization techniques that use the concept of a reservoir for anomaly detection from network traffic. Experimental results on thirteen benchmark data streams show that the summaries produced using the pairwise distance (PSSR) and template matching (TMSSR) techniques retain more anomalies than existing stream summarization techniques, and that an anomaly detection technique can identify the anomalies with a high true positive rate and a low false positive rate. |
---|---|
ISSN: | 1433-7541 1433-755X |
DOI: | 10.1007/s10044-017-0659-y |
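
The abstract relies on the concept of a reservoir without defining it. As a rough illustration only, the sketch below shows classic uniform reservoir sampling (often called Algorithm R), which keeps a fixed-size sample from a stream in a single pass. It is not the paper's PSSR or TMSSR technique, which this record does not describe; the function name `reservoir_sample` and its parameters are assumptions made for this example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform random sample of at most k items from a one-pass stream.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Summarize a stream of 10,000 record ids into 20 representatives.
    print(reservoir_sample(range(10_000), 20, random.Random(42)))
```

Uniform sampling of this kind treats every record alike, so rare anomalies are unlikely to survive in the summary. That limitation is the gap the abstract says the hybrid techniques aim to close.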