Mining discriminative itemsets in data streams using the tilted-time window model

A discriminative itemset is a frequent itemset in the target data stream with much higher frequency than that of the same itemset in the rest of the data streams in the dataset. The discriminative itemsets describe the distinguishing features between data streams. Mining discriminative itemsets in d...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Knowledge and information systems 2021-05, Vol.63 (5), p.1241-1270
Hauptverfasser: Seyfi, Majid, Nayak, Richi, Xu, Yue, Geva, Shlomo
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A discriminative itemset is a frequent itemset in the target data stream with much higher frequency than that of the same itemset in the rest of the data streams in the dataset. The discriminative itemsets describe the distinguishing features between data streams. Mining discriminative itemsets in data streams is very important, where continuously arriving transactions can be inserted in fast speed and large volume. Compared with frequent itemset mining in single data stream, there are additional challenges in the discriminative itemset mining process as the Apriori property of subset is not applicable. We propose an efficient and high accurate method for mining discriminative itemsets in data streams using a tilted-time window model. The proposed single-pass H-DISSparse algorithm is designed particularly based on several well-defined characteristics aiming to improve the approximate frequencies of the itemsets in the tilted-time window model. The data structures are dynamically adjusted in offline time intervals to reflect the discriminative itemset frequencies in different time periods in unsynchronized data streams. Empirical analysis shows the efficient time and space complexity of the proposed method in the fast-growing big data streams.
ISSN:0219-1377
0219-3116
DOI:10.1007/s10115-021-01550-y