Isolation-Based Anomaly Detection
Published in: | ACM transactions on knowledge discovery from data, 2012-03, Vol. 6 (1), p. 1-39 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, which makes it fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample. |
---|---|
ISSN: | 1556-4681 1556-472X |
DOI: | 10.1145/2133360.2133363 |
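
The summary describes isolation only in words, so a small sketch may help make the idea concrete. The Python below is an illustration and not the article's reference implementation: it builds random axis-parallel trees on subsamples and scores a point by how quickly it ends up alone. Everything the record does not state is an assumption made for this example, including the function names, the path-length normaliser `avg_path_length`, the depth limit, and the choice of 100 trees with subsamples of 256 points.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def avg_path_length(n):
    """Assumed normaliser c(n): average path length of an unsuccessful
    binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-parallel splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # a constant attribute cannot be split
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Edges traversed until x reaches a leaf, plus an adjustment for
    the points left unsplit in that leaf."""
    if "size" in node:
        return depth + avg_path_length(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def build_forest(data, n_trees, sample_size, rng):
    """Grow each tree on its own random subsample (needs sample_size >= 2)."""
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    return [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]


def anomaly_score(x, forest, sample_size):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    mean_depth = sum(path_length(x, tree) for tree in forest) / len(forest)
    return 2.0 ** (-mean_depth / avg_path_length(sample_size))


# Toy data: a Gaussian cluster plus one far-away point (illustrative only).
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
data.append((8.0, 8.0))

forest = build_forest(data, n_trees=100, sample_size=256, rng=rng)
outlier_score = anomaly_score((8.0, 8.0), forest, 256)
inlier_score = anomaly_score((0.0, 0.0), forest, 256)
print("outlier scores higher:", outlier_score > inlier_score)
```

No distance or density is computed anywhere in the sketch; the only signal is the number of random splits needed to separate a point. Because each tree sees a fixed-size subsample, the cost of building the forest in this sketch depends on the number of trees and the subsample size, not on the size of the full data set.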