Minimal infrequent pattern based approach for mining outliers in data streams

Bibliographic Details
Published in: Expert systems with applications 2015-03, Vol.42 (4), p.1998-2012
Main authors: Sweetlin Hemalatha, C., Vaidehi, V., Lakshmi, R.
Format: Article
Language: English
Description
Summary:
• Minimal Infrequent Pattern based Outlier Detection.
• An algorithm for mining minimal infrequent patterns in data streams.
• Three simple factors for deciding outliers.
• An algorithm for detecting outliers based on mined minimal infrequent patterns.
• Experimental results with real-time sensor data and a publicly available UCI data set.

Outlier detection is an important data mining task that aims to detect unusual patterns in a dataset. Although several techniques have proved useful for some outlier detection problems, certain issues remain unresolved. Most existing methods detect outliers by computing distances between points in the full-dimensional space. In high-dimensional space, however, the concept of proximity may not be qualitatively meaningful because of the curse of dimensionality, and computing it incurs a high computational cost. Moreover, existing methods focus on discovering outliers but do not offer an interpretation of the subspaces that cause the abnormality. Frequent pattern mining based approaches address these issues. Recently, infrequent pattern mining, which aims to discover rare associations, has attracted the attention of the data mining research community, and research in this area motivated the proposal of a new method for detecting outliers in data streams. Infrequent patterns are more interesting than frequent patterns in some domains, such as fraudulent credit transactions and anomaly detection. In such applications, mining infrequent patterns facilitates detecting outliers. Minimal infrequent patterns are the generators of the family of infrequent patterns. In this paper, a novel method is presented that detects outliers by mining minimal infrequent patterns from data streams. Three measures are defined: Transaction Weighting Factor (TWF), Minimal Infrequent Pattern Deviation Factor (MIPDF) and Minimal Infrequent Pattern based Outlier Factor (MIFPOF). An algorithm called Minimal Infrequent Pattern based Outlier Detection (MIFPOD) is proposed for detecting outliers in data streams based on the mined minimal infrequent patterns. The effectiveness of the proposed method is demonstrated on a synthetic dataset derived from vital data collected from body sensors and on a publicly available real dataset. The experimental results show that the proposed method outperforms existing methods in detecting outliers.
ISSN: 0957-4174, 1873-6793
DOI:10.1016/j.eswa.2014.09.053
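
The summary refers to minimal infrequent patterns without showing one. As a rough illustration only, the sketch below uses the standard itemset definition: a pattern is infrequent when its support falls below a threshold, and minimal when none of its proper subsets is infrequent. It enumerates candidates by brute force over a small static list of transactions, which is an assumption made for brevity and not the streaming algorithm of the paper. The names `support`, `minimal_infrequent_patterns`, `count_hits`, `min_support` and `max_size`, and the toy data, are all hypothetical. The per-transaction count at the end is a simple stand-in for an outlier score; it is not TWF, MIPDF or MIFPOF, whose definitions are not given in this record.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def minimal_infrequent_patterns(transactions, min_support, max_size=3):
    """Brute-force search for minimal infrequent itemsets (illustrative only).

    Candidates are visited in order of increasing size, so any candidate that
    contains an already-found minimal infrequent pattern cannot be minimal
    and is skipped.
    """
    items = sorted(set().union(*transactions))
    minimal = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if any(m <= candidate for m in minimal):
                continue  # has an infrequent proper subset, so not minimal
            if support(candidate, transactions) < min_support:
                minimal.append(candidate)
    return minimal


def count_hits(transaction, patterns):
    """Number of minimal infrequent patterns contained in a transaction.

    A naive stand-in for an outlier score, NOT one of the paper's factors.
    """
    return sum(p <= transaction for p in patterns)


# Hypothetical toy data: six transactions over the items a, b, c, d.
transactions = [
    frozenset("ab"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("ac"),
    frozenset("abc"),
    frozenset("ad"),
]

patterns = minimal_infrequent_patterns(transactions, min_support=0.3)
for t in transactions:
    print(sorted(t), count_hits(t, patterns))
```

With this toy data the item `d` and the pair `{b, c}` each occur in only one of six transactions, so they are the minimal infrequent patterns, and only the last two transactions contain one. The supersets of `{b, c}` and `{d}` are also infrequent but are not reported, which is the sense in which minimal patterns generate the larger family.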