MERLIN++: parameter-free discovery of time series anomalies

The burgeoning age of IoT has reinforced the need for robust time series anomaly detection. While there are hundreds of anomaly detection methods in the literature, one definition, time series discords , has emerged as a competitive and popular choice for practitioners. Time series discords are subs...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Data mining and knowledge discovery 2023-03, Vol.37 (2), p.670-709
Hauptverfasser: Nakamura, Takaaki, Mercer, Ryan, Imamura, Makoto, Keogh, Eamonn
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The burgeoning age of IoT has reinforced the need for robust time series anomaly detection. While there are hundreds of anomaly detection methods in the literature, one definition, time series discords , has emerged as a competitive and popular choice for practitioners. Time series discords are subsequences of a time series that are maximally far away from their nearest neighbors. Perhaps the most attractive feature of discords is their simplicity. Unlike many of the parameter-laden methods proposed, discords require only a single parameter to be set by the user: the subsequence length. We believe that the utility of discords is reduced by sensitivity to even this single user choice. The obvious solution to this problem, computing discords of all lengths then selecting the best anomalies (under some measure), appears at first glance to be computationally untenable. However, in this work we discuss MERLIN, a recently introduced algorithm that can efficiently and exactly find discords of all lengths in massive time series archives. By exploiting computational redundancies, MERLIN is two orders of magnitude faster than comparable algorithms. Moreover, we show that by exploiting a little-known indexing technique called Orchard’s algorithm, we can create a new algorithm called MERLIN++, which is an order of magnitude faster than MERLIN, yet produces identical results. We demonstrate the utility of our ideas on a large and diverse set of experiments and show that MERLIN++ can discover subtle anomalies that defy existing algorithms or even careful human inspection. We further compare to five state-of-the-art rival methods, on the largest benchmark dataset for this task, and show that MERLIN++ is superior in terms of accuracy and speed.
ISSN:1384-5810
1573-756X
DOI:10.1007/s10618-022-00876-7