Classification Under Streaming Emerging New Classes: A Solution Using Completely-Random Trees

This paper investigates an important problem in stream mining, i.e., classification under streaming emerging new classes or SENC. The SENC problem can be decomposed into three subproblems: detecting emerging new classes, classifying known classes, and updating models to integrate each new class as p...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on knowledge and data engineering 2017-08, Vol.29 (8), p.1605-1618
Hauptverfasser: Mu, Xin, Ting, Kai Ming, Zhou, Zhi-Hua
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper investigates an important problem in stream mining, i.e., classification under streaming emerging new classes or SENC. The SENC problem can be decomposed into three subproblems: detecting emerging new classes, classifying known classes, and updating models to integrate each new class as part of known classes. The common approach is to treat it as a classification problem and solve it using either a supervised learner or a semi-supervised learner. We propose an alternative approach by using unsupervised learning as the basis to solve this problem. The proposed method employs completely-random trees which have been shown to work well in unsupervised learning and supervised learning independently in the literature. The completely-random trees are used as a single common core to solve all three subproblems: unsupervised learning, supervised learning, and model update on data streams. We show that the proposed unsupervised-learning-focused method often achieves significantly better outcomes than existing classification-focused methods.
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2017.2691702