CDA-PDDWE: Concept Drift-Aware Performance-Based Diversified Dynamic Weighted Ensemble for Non-stationary Environments
Published in: | Arabian journal for science and engineering (2011) 2024, Vol.49 (9), p.12989-13004 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Over the past decades, technological advances have produced a huge number of data streams. Data streams comprise large amounts of partially sequenced, infinite data. Changes in the statistical properties of the input data distribution, such as mean, variance and standard deviation, or changes in their relationships with the target label, are referred to as concept drift. Concept drift and class imbalance pose challenges in maintaining accurate classification, adapting to evolving data patterns and effectively classifying minority classes. Addressing this problem requires techniques that handle both class imbalance and concept drift. If left unaddressed, these problems hinder the learning model's performance. The model proposed in the paper uses adaptive synthetic sampling (ADASYN) to deal with class imbalance. The ADASYN method generates synthetic data using a weighted distribution based on the severity of the minority class and on hard-to-learn minority class samples. To adapt to concept drift, the performance-based diversified dynamic weighted ensemble is then used. In addition, the cumulative sum statistical test is used to detect drift. One of the ensemble's base learners, the self-organizing neural network model, automatically creates a new layer when drift occurs and provides a solution to catastrophic forgetting; performance-based pruning and ensemble evolution are applied if more drift occurs. The proposed model's efficacy is assessed against a variety of state-of-the-art ensemble methods on seven datasets in a prequential test-then-train approach with single-pass learning. The results of the experiments show that the proposed model outperforms state-of-the-art ensemble methods. |
---|---|
ISSN: | 2193-567X 1319-8025 2191-4281 |
DOI: | 10.1007/s13369-024-08929-3 |
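
The summary mentions two mechanisms that are easy to illustrate in isolation: cumulative sum (CUSUM) drift detection and prequential test-then-train evaluation with single-pass learning. The following is a minimal sketch of how the two can fit together, not the paper's implementation. The class names `CusumDetector` and `MajorityLearner`, the function `prequential_run`, the parameters `delta`, `threshold` and `min_samples`, and the toy stream are all assumptions made for illustration. The sketch reacts to drift by simply restarting the learner, whereas the paper describes ensemble pruning and evolution.

```python
from collections import Counter


class CusumDetector:
    """One-sided CUSUM over a stream of 0/1 prediction errors (hypothetical helper)."""

    def __init__(self, delta=0.05, threshold=5.0, min_samples=30):
        self.delta = delta              # slack tolerated above the running mean
        self.threshold = threshold      # alarm level for the cumulative sum
        self.min_samples = min_samples  # do not signal before this many inputs
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cusum = 0.0

    def update(self, error):
        """Feed one error value; return True when drift is signalled."""
        self.n += 1
        self.mean += (error - self.mean) / self.n  # incremental running mean
        self.cusum = max(0.0, self.cusum + error - self.mean - self.delta)
        if self.n >= self.min_samples and self.cusum > self.threshold:
            self.reset()
            return True
        return False


class MajorityLearner:
    """Toy stand-in for a base learner: predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else 0

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_run(stream, learner_factory, detector):
    """Test-then-train: each example is used once to test, then once to train."""
    learner = learner_factory()
    alarms, errors = [], 0
    for t, (x, y) in enumerate(stream):
        error = int(learner.predict(x) != y)  # test first
        errors += error
        if detector.update(error):
            alarms.append(t)
            learner = learner_factory()       # assumed reaction: restart learner
        learner.learn(x, y)                   # then train on the same example
    return alarms, errors


# Toy stream with an abrupt concept change at position 200 (features unused).
stream = [(None, 0)] * 200 + [(None, 1)] * 200
alarms, errors = prequential_run(stream, MajorityLearner, CusumDetector())
print("drift signalled at:", alarms, "total errors:", errors)
```

On this toy stream the detector should raise a single alarm a few examples after the change point, after which the restarted learner stops making errors. Larger values of `threshold` or `delta` make the detector slower but less prone to false alarms.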