H-DAC: discriminative associative classification in data streams
Published in: | Soft computing (Berlin, Germany), 2023, Vol. 27 (2), p. 953–971 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | In this paper, we propose an efficient and highly accurate method for data stream classification, called discriminative associative classification. We define class discriminative association rules (CDARs) as the class association rules (CARs) in one data stream that have higher support than the same rules in the rest of the data streams. Compared with associative classification mining in a single data stream, discriminative associative classification mining over multiple data streams poses additional challenges, since the Apriori subset property no longer applies. The proposed single-pass H-DAC algorithm exploits the distinguishing features of the rules to improve classification accuracy and efficiency. Transactions arrive continuously at high speed and in large volume, and CDARs are discovered in the tilted-time window model. The data structures are dynamically adjusted during offline time intervals to reflect each rule's support in different periods. Empirical analysis shows the effectiveness of the proposed method on large, high-speed data streams. Good efficiency is achieved for batch processing of small and large datasets, plus a 0–2% improvement in classification accuracy under the tilted-time window model (i.e., with almost zero overhead). These improvements are seen only for the first 32 incoming batches at the scale of our experiments, and we expect better results as the data streams grow. |
ISSN: | 1432-7643 1433-7479 |
DOI: | 10.1007/s00500-022-07517-7 |
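The CDAR definition in the abstract (a CAR whose support in one stream exceeds its support in every other stream) can be sketched as follows. This is only an illustrative sketch of the definition, not the paper's H-DAC algorithm; all function and variable names here are assumptions, and the single-pass processing and tilted-time windows are omitted.

```python
from typing import FrozenSet, List, Tuple

# A transaction pairs an itemset with a class label.
Transaction = Tuple[FrozenSet[str], str]

def support(items: FrozenSet[str], label: str,
            stream: List[Transaction]) -> float:
    """Fraction of a stream's transactions that contain `items`
    and carry class `label` (support of the CAR items -> label)."""
    if not stream:
        return 0.0
    hits = sum(1 for t_items, t_label in stream
               if items <= t_items and t_label == label)
    return hits / len(stream)

def is_cdar(items: FrozenSet[str], label: str,
            target: List[Transaction],
            others: List[List[Transaction]],
            margin: float = 0.0) -> bool:
    """A CAR of `target` is a CDAR when its support in `target`
    exceeds its support in every other stream by at least `margin`."""
    s = support(items, label, target)
    return all(s > support(items, label, o) + margin for o in others)

# Tiny example: the rule {a, b} -> c1 is well supported in stream_1 only.
stream_1 = [(frozenset({"a", "b"}), "c1"),
            (frozenset({"a", "b"}), "c1"),
            (frozenset({"a"}), "c2")]
stream_2 = [(frozenset({"a"}), "c1"),
            (frozenset({"a", "b"}), "c2")]
print(is_cdar(frozenset({"a", "b"}), "c1", stream_1, [stream_2]))  # True
```

Note that, as the abstract points out, this discriminative property does not inherit the Apriori subset property: a subset of a CDAR's itemset may have high support in *all* streams and thus fail the comparison, which is why a batch-style pruning strategy cannot be reused directly.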