TIRPClo: efficient and complete mining of time intervals-related patterns

Mining frequent Time Intervals-Related Patterns ( TIRPs ) from series of symbolic time intervals offers a comprehensive framework for heterogeneous, multivariate temporal data analysis in various application domains. While gaining a growing interest in recent decades, the efficient mining of frequen...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Data mining and knowledge discovery 2023-09, Vol.37 (5), p.1806-1857
Hauptverfasser: Harel, Omer, Moskovitch, Robert
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Mining frequent Time Intervals-Related Patterns ( TIRPs ) from series of symbolic time intervals offers a comprehensive framework for heterogeneous, multivariate temporal data analysis in various application domains. While gaining a growing interest in recent decades, the efficient mining of frequent TIRPs is still a high computational challenge which has also not yet been investigated in its full complexity. The majority of previous methods discover only the first instances of the TIRPs within each series of symbolic time intervals, whereas their re-occurring instances are ignored. This eventually results in an incomplete discovery of frequent TIRPs, a problem that lies also in the challenge of mining only the frequent closed TIRPs , which was only recently investigated for the first time. In this paper, we introduce TIRPClo—an efficient algorithm for the complete mining of either the entire set of frequent TIRPs, or only the frequent closed TIRPs. The algorithm proposes a non-ambiguous sequential representation of symbolic time intervals series through the intervals’ end-points, as well as a memory-efficient index and a novel method for data projection, due to which it is the first algorithm to guarantee a complete discovery of frequent closed TIRPs. The experimental evaluation conducted on eleven real-world and four synthetic datasets demonstrates that TIRPClo is up to 10 times faster when mining the entire set of frequent TIRPs, and up to more than 100 times faster when mining only the frequent closed TIRPs compared to four state-of-the-art methods, while also reporting lower memory measurements.
ISSN:1384-5810
1573-756X
DOI:10.1007/s10618-023-00944-6