A clustering and ensemble based classifier for data stream classification

Bibliographic details
Published in:	Applied soft computing 2021-04, Vol.102, p.107076, Article 107076
Main authors:	Wankhade, Kapil K.; Jondhale, Kalpana C.; Dongre, Snehlata S.
Format:	Article
Language:	English
Subjects:	Classification; Clustering; Concept drift; Data mining; Ensemble method; Grid and density based clustering

Description
Abstract:	In the era of data mining, data stream mining has attracted great attention from the research community because it affects a wide range of applications, such as networking, telecommunication, education, banking, weather forecasting, and the stock market. Handling concept-drifting data streams is one of the major challenges in the data stream mining field. In the presence of concept drift, the performance of a learning algorithm degrades. This paper proposes a hybrid method that combines an ensemble with grid- and density-based clustering. The proposed method is tested on both synthetic and real data. It works well in the presence of concept drift, and its performance is measured in terms of time, accuracy, and memory. Compared with state-of-the-art algorithms, the proposed method performed well and gave better accuracy: 88.29%, 71.34%, and 75.39% on the synthetic datasets Hyperplane, RBF, and LED, respectively, and 86.17%, 86.28%, 95.15%, and 99.83% on the real datasets Adult, Census-Income, KDDCup99-10%, and Covertype, respectively.
• This paper presents a hybrid method that uses both supervised and unsupervised learning.
• An ensemble method is used to handle the huge volume of data in the stream.
• A grid- and density-based clustering method is used as the base learner.
• A divide-and-merge method is used to improve accuracy.
• The proposed method handles both gradual and abrupt concept drift; it focuses mainly on accuracy while requiring comparable time and memory.
ISSN:	1568-4946 1872-9681
DOI:	10.1016/j.asoc.2020.107076
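
The abstract describes an ensemble whose base learner is a grid- and density-based clusterer, applied to a stream with concept drift. The record gives no algorithmic detail, so the following is only a minimal illustrative sketch of that general idea, not the authors' method. Every name and parameter in it (`GridDensityLearner`, `ChunkEnsemble`, `cell_width`, `min_density`, `max_members`) is hypothetical, the accuracy-based re-weighting is a generic chunk-based ensemble scheme assumed for illustration, and the divide-and-merge step mentioned in the highlights is not modeled.

```python
from collections import Counter, defaultdict


class GridDensityLearner:
    """Toy base learner (assumption): label dense grid cells by majority class."""

    def __init__(self, cell_width=0.25, min_density=2):
        self.cell_width = cell_width    # hypothetical grid resolution
        self.min_density = min_density  # hypothetical density threshold
        self.cells = {}                 # grid cell -> majority label
        self.default = None             # fallback label for sparse or unseen cells

    def _cell(self, x):
        # Map a feature vector to the integer coordinates of its grid cell.
        return tuple(int(v // self.cell_width) for v in x)

    def fit(self, X, y):
        counts = defaultdict(Counter)
        for x, label in zip(X, y):
            counts[self._cell(x)][label] += 1
        # Keep only cells that hold at least min_density points.
        self.cells = {
            cell: c.most_common(1)[0][0]
            for cell, c in counts.items()
            if sum(c.values()) >= self.min_density
        }
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.cells.get(self._cell(x), self.default)


def accuracy(learner, X, y):
    """Fraction of points in (X, y) that the learner labels correctly."""
    hits = sum(learner.predict(x) == label for x, label in zip(X, y))
    return hits / len(y)


class ChunkEnsemble:
    """Generic chunk-based weighted ensemble (assumption, not the paper's scheme)."""

    def __init__(self, max_members=3):
        self.max_members = max_members
        self.members = []  # list of [learner, weight] pairs

    def predict(self, x):
        votes = Counter()
        for learner, weight in self.members:
            votes[learner.predict(x)] += weight
        return votes.most_common(1)[0][0] if votes else None

    def update(self, X, y):
        # Re-weight existing members by their accuracy on the newest chunk,
        # so members trained on an outdated concept lose influence.
        for member in self.members:
            member[1] = accuracy(member[0], X, y)
        # Train a new member on the newest chunk with full weight (assumption).
        self.members.append([GridDensityLearner().fit(X, y), 1.0])
        # Keep only the highest-weighted members.
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_members:]
```

In this sketch, calling `update` once per labeled chunk lets the ensemble follow an abrupt drift: a member trained before the drift scores poorly on the new chunk, so its vote is down-weighted relative to the newly trained member.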