CODES: Efficient Incremental Semi-Supervised Classification Over Drifting and Evolving Social Streams

Bibliographic details
Published in: IEEE Access 2020, Vol. 8, p. 14024-14035
Main authors: Bi, Xin; Zhang, Chao; Zhao, Xiangguo; Li, Donghang; Sun, Yongjiao; Ma, Yuliang
Format: Article
Language: English
Description
Summary: Classification over data streams is a crucial task in mining and computing over explosively growing social streams. Efficient learning techniques support high-quality services such as content distribution and event browsing. Because of concept drift and concept evolution in data streams, classification performance degrades drastically over time. Many existing methods rely on supervised or unsupervised learning strategies. However, supervised strategies require labeled emerging records to update the classifiers, which is infeasible in practical social stream applications. Unsupervised strategies are widely used to detect concept evolution, but online clustering incurs a heavy run-time computation cost. To address these challenges, this paper proposes an efficient incremental semi-supervised classification method named CODES (Classification Over Drifting and Evolving Stream). CODES consists of an efficient incremental semi-supervised learning module and a dynamic novelty threshold update module. Over drifting and evolving social streams, CODES therefore provides: 1) semi-supervised learning that eliminates the dependency on labels of emerging records; 2) fast incremental learning with real-time updates to handle concept drift; 3) efficient novel class detection to handle concept evolution. Extensive experiments are conducted on several real-world datasets. The results indicate higher performance than several state-of-the-art methods. CODES achieves efficient learning over drifting and evolving social streams, which makes it practical for real-world social stream applications.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2965766