Mining discriminative itemsets in data streams using the tilted-time window model
Published in: | Knowledge and information systems 2021-05, Vol.63 (5), p.1241-1270 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Abstract: | A discriminative itemset is a frequent itemset in the target data stream whose frequency is much higher than that of the same itemset in the rest of the data streams in the dataset. Discriminative itemsets describe the distinguishing features between data streams. Mining discriminative itemsets in data streams is important where continuously arriving transactions are inserted at high speed and in large volume. Compared with frequent itemset mining in a single data stream, discriminative itemset mining poses additional challenges because the Apriori subset property is not applicable. We propose an efficient and highly accurate method for mining discriminative itemsets in data streams using a tilted-time window model. The proposed single-pass H-DISSparse algorithm is designed around several well-defined characteristics that aim to improve the approximate frequencies of the itemsets in the tilted-time window model. The data structures are dynamically adjusted in offline time intervals to reflect the discriminative itemset frequencies in different time periods in unsynchronized data streams. Empirical analysis shows the efficient time and space complexity of the proposed method in fast-growing big data streams. |
---|---|
ISSN: | 0219-1377 0219-3116 |
DOI: | 10.1007/s10115-021-01550-y |
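
To make the definition in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the H-DISSparse algorithm and uses no tilted-time window; it only checks the stated definition on two tiny static batches. The function names, the `min_support` and `min_ratio` thresholds, and the toy transactions are all assumptions introduced for illustration and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def itemset_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items (brute force, toy data only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def discriminative_itemsets(target, rest, min_support=0.3, min_ratio=2.0, max_size=2):
    """Return itemsets that are frequent in `target` and at least `min_ratio`
    times more frequent there than in `rest`.

    min_support and min_ratio are hypothetical thresholds chosen for this sketch.
    """
    target_counts = itemset_counts(target, max_size)
    rest_counts = itemset_counts(rest, max_size)
    found = {}
    for itemset, count in target_counts.items():
        target_freq = count / len(target)
        if target_freq < min_support:
            continue  # not frequent in the target stream
        rest_freq = rest_counts.get(itemset, 0) / len(rest)
        ratio = float("inf") if rest_freq == 0 else target_freq / rest_freq
        if ratio >= min_ratio:
            found[itemset] = (target_freq, rest_freq)
    return found


# Toy batches standing in for a target stream and the remaining streams.
target_stream = [["a", "b"], ["a", "b"], ["a", "c"], ["b", "c"]]
other_streams = [["a", "c"], ["a", "d"], ["b", "c"], ["b", "d"]]

found = discriminative_itemsets(target_stream, other_streams)
print(found)
```

On this toy data the pair `("a", "b")` qualifies while neither `("a",)` nor `("b",)` does, which illustrates why subset-based (Apriori-style) pruning cannot be applied directly: a discriminative itemset can have non-discriminative subsets.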