H-DAC: discriminative associative classification in data streams
Published in: | Soft computing (Berlin, Germany), 2023, Vol. 27 (2), p. 953–971 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | In this paper, we propose an efficient and highly accurate method for data stream classification, called discriminative associative classification. We define class discriminative association rules (CDARs) as the class association rules (CARs) in one data stream that have higher support than the same rules in the rest of the data streams. Compared with associative classification mining in a single data stream, discriminative associative classification mining over multiple data streams poses additional challenges, since the Apriori subset property no longer applies. The proposed single-pass H-DAC algorithm exploits the distinguishing features of the rules to improve classification accuracy and efficiency. Transactions arrive continuously at high speed and in large volume, and CDARs are discovered in the tilted-time window model. The data structures are dynamically adjusted during offline time intervals to reflect each rule's support in different periods. Empirical analysis shows the effectiveness of the proposed method on large, high-speed data streams. Good efficiency is achieved for batch processing of small and large datasets, plus a 0–2% improvement in classification accuracy under the tilted-time window model (i.e., with almost zero overhead). These improvements are seen only for the first 32 incoming batches at the scale of our experiments, and we expect better results as the data streams grow. |
ISSN: | 1432-7643 1433-7479 |
DOI: | 10.1007/s00500-022-07517-7 |
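The CDAR definition in the abstract (a CAR whose support in one stream exceeds its support in every other stream) can be sketched as follows. This is only an illustrative sketch of the definition, not the paper's H-DAC algorithm; all function and variable names here are assumptions, and the single-pass processing and tilted-time windows are omitted.

```python
from typing import FrozenSet, List, Tuple

# A transaction pairs an itemset with a class label.
Transaction = Tuple[FrozenSet[str], str]

def support(items: FrozenSet[str], label: str,
            stream: List[Transaction]) -> float:
    """Fraction of a stream's transactions that contain `items`
    and carry class `label` (support of the CAR items -> label)."""
    if not stream:
        return 0.0
    hits = sum(1 for t_items, t_label in stream
               if items <= t_items and t_label == label)
    return hits / len(stream)

def is_cdar(items: FrozenSet[str], label: str,
            target: List[Transaction],
            others: List[List[Transaction]],
            margin: float = 0.0) -> bool:
    """A CAR of `target` is a CDAR when its support in `target`
    exceeds its support in every other stream by at least `margin`."""
    s = support(items, label, target)
    return all(s > support(items, label, o) + margin for o in others)

# Tiny example: the rule {a, b} -> c1 is well supported in stream_1 only.
stream_1 = [(frozenset({"a", "b"}), "c1"),
            (frozenset({"a", "b"}), "c1"),
            (frozenset({"a"}), "c2")]
stream_2 = [(frozenset({"a"}), "c1"),
            (frozenset({"a", "b"}), "c2")]
print(is_cdar(frozenset({"a", "b"}), "c1", stream_1, [stream_2]))  # True
```

Note that, as the abstract points out, this discriminative property does not inherit the Apriori subset property: a subset of a CDAR's itemset may have high support in *all* streams and thus fail the comparison, which is why a batch-style pruning strategy cannot be reused directly.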