A clustering and ensemble based classifier for data stream classification
Published in: | Applied Soft Computing, 2021-04, Vol. 102, p. 107076, Article 107076 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Keywords: | |
Online access: | Full text |
Summary: | In the era of data mining, data stream mining has attracted considerable research attention because it affects a wide range of applications, such as networking, telecommunication, education, banking, weather forecasting, and the stock market. Handling concept-drifting data streams is one of the major challenges in the field: in the presence of concept drift, the performance of a learning algorithm degrades. This paper proposes a hybrid method that combines an ensemble with grid- and density-based clustering. The method is tested on both synthetic and real data, and its performance is measured in terms of time, accuracy, and memory. Compared with state-of-the-art algorithms, the proposed method performs well in the presence of concept drift and achieves better accuracy: 88.29%, 71.34%, and 75.39% on the synthetic Hyperplane, RBF, and LED datasets, respectively, and 86.17%, 86.28%, 95.15%, and 99.83% on the real Adult, Census-Income, KDDCup99-10%, and Covertype datasets, respectively.
• This paper presents a hybrid method that uses both supervised and unsupervised learning. • An ensemble method is used to handle the large volume of data in the stream. • A grid- and density-based clustering method is used as the base learner. • A divide-and-merge method is used to improve accuracy. • The proposed method handles both gradual and abrupt concept drift; it focuses mainly on accuracy while requiring comparable time and memory. |
---|---|
ISSN: | 1568-4946 1872-9681 |
DOI: | 10.1016/j.asoc.2020.107076 |
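
The summary describes an ensemble whose base learners are grid- and density-based clusterers, updated as the stream drifts. The following is a minimal sketch of that general idea only, not the paper's algorithm: the class names `GridDensityLearner` and `ChunkEnsemble`, the parameters `cell_size`, `min_density`, and `max_members`, and the replace-the-weakest-member policy are all assumptions made for illustration, and the divide-and-merge step is not modelled.

```python
from collections import Counter, defaultdict


class GridDensityLearner:
    """Hypothetical base learner: labels dense grid cells by majority class."""

    def __init__(self, cell_size=1.0, min_density=2):
        self.cell_size = cell_size      # assumed edge length of a grid cell
        self.min_density = min_density  # assumed minimum points for a dense cell
        self.cell_labels = {}
        self.default_label = None

    def _cell(self, x):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(int(v // self.cell_size) for v in x)

    def fit(self, X, y):
        votes = defaultdict(Counter)
        for x, label in zip(X, y):
            votes[self._cell(x)][label] += 1
        # Keep only dense cells and label each with its majority class.
        self.cell_labels = {
            cell: counts.most_common(1)[0][0]
            for cell, counts in votes.items()
            if sum(counts.values()) >= self.min_density
        }
        # Fall back to the chunk's majority class for sparse or unseen cells.
        self.default_label = Counter(y).most_common(1)[0][0]
        return self

    def predict_one(self, x):
        return self.cell_labels.get(self._cell(x), self.default_label)


class ChunkEnsemble:
    """Hypothetical chunk-based ensemble with majority voting."""

    def __init__(self, max_members=3, **learner_kwargs):
        self.max_members = max_members
        self.learner_kwargs = learner_kwargs
        self.members = []

    def predict_one(self, x):
        if not self.members:
            return None  # nothing learned yet
        votes = Counter(m.predict_one(x) for m in self.members)
        return votes.most_common(1)[0][0]

    def update(self, X, y):
        """Train a member on the latest labeled chunk; evict the weakest if full."""
        new_member = GridDensityLearner(**self.learner_kwargs).fit(X, y)
        if len(self.members) >= self.max_members:
            def accuracy(member):
                hits = sum(member.predict_one(x) == t for x, t in zip(X, y))
                return hits / len(y)
            # Members that disagree with the newest chunk reflect an old concept.
            self.members.remove(min(self.members, key=accuracy))
        self.members.append(new_member)
```

In this sketch, calling `update` once per labeled chunk lets members trained on an outdated concept be evicted after a drift, which is one simple way an ensemble can adapt; the paper may use a different update rule.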