Mining high utility patterns in interval-based event sequences

Bibliographic details
Published in: Data & knowledge engineering, 2021-09, Vol. 135, p. 101924, Article 101924
Main authors: Mirbagheri, S. Mohammad; Hamilton, Howard J.
Format: Article
Language: English
Description
Abstract: Sequential pattern mining is an interesting research area with a broad range of applications. Most prior research on sequential pattern mining has considered point-based data where events occur instantaneously. However, in many application domains, events persist over intervals of time of varying lengths. Furthermore, traditional frameworks for sequential pattern mining assume all events have the same weight or utility. This simplifying assumption neglects the opportunity to find informative patterns in terms of utilities, such as profits. To address these issues, we incorporate the concept of utility into interval-based sequences and define a framework to mine high utility patterns in interval-based sequences, i.e., patterns whose utility meets or exceeds a minimum threshold. In the proposed framework, the utility of events is considered while assuming multiple events can occur coincidentally and persist over varying periods of time. An algorithm named High Utility Interval-based Pattern Miner (HUIPMiner) is proposed and applied to real datasets. To achieve an efficient solution, HUIPMiner is augmented with two effective pruning strategies. Experimental results show that HUIPMiner is an effective solution to the problem of mining high utility interval-based sequences. Moreover, it is shown that the execution time of the algorithm is reduced when the proposed pruning strategies are applied.
ISSN: 0169-023X, 1872-6933
DOI: 10.1016/j.datak.2021.101924