Unsupervised ensemble minority clustering
Published in: Machine Learning, 2015-01, Vol. 98 (1-2), p. 217-268
Main Authors: ,
Format: Article
Language: eng
Online Access: Full text
Summary: Cluster analysis lies at the core of most unsupervised learning tasks. However, the majority of clustering algorithms depend on the all-in assumption, in which all objects belong to some cluster, and perform poorly on minority clustering tasks, in which a small fraction of signal data stands against a majority of noise. The approaches proposed so far for minority clustering are supervised: they require the number and distribution of the foreground and background clusters. In supervised learning and all-in clustering, combination methods have been successfully applied to obtain distribution-free learners, even from the output of weak individual algorithms. In this work, we propose a novel ensemble minority clustering algorithm, EWOCS, suitable for weak clustering combination. Its properties have been theoretically proved under a loose set of constraints. We also propose a number of weak clustering algorithms, and an unsupervised procedure to determine the scaling parameters for Gaussian kernels used within the task. We have implemented a number of approaches built from the proposed components, and evaluated them on a collection of datasets. The results show how approaches based on EWOCS are competitive with, and even outperform, other minority clustering approaches in the state of the art.

ISSN: 0885-6125, 1573-0565
DOI: 10.1007/s10994-013-5394-z
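To make the setting described in the summary concrete, the toy sketch below illustrates the general idea of minority clustering via an ensemble of weak clusterings: many cheap clusterings are run, each point is scored across the ensemble, and high-scoring points are kept as the small foreground while the rest is treated as noise. This is not the EWOCS algorithm from the paper; the synthetic dataset, the choice of k-means with random k as the weak clusterer, the inverse-distance scoring rule, and the 90th-percentile cutoff are all assumptions made purely for illustration.

```python
# Illustrative sketch only: a toy ensemble "minority clustering" scorer in the
# spirit of the abstract (combining many weak clusterings to separate a small
# foreground from background noise). It is NOT the EWOCS algorithm itself; the
# weak clusterer, scoring rule, and cutoff heuristic are assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy data: 50 "signal" points in a tight blob against 450 uniform noise points.
signal = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
noise = rng.uniform(low=0.0, high=10.0, size=(450, 2))
X = np.vstack([signal, noise])

# Ensemble of weak clusterings: k-means runs with a random k and a single init.
n_runs = 50
scores = np.zeros(len(X))
for _ in range(n_runs):
    k = int(rng.integers(5, 20))
    km = KMeans(n_clusters=k, n_init=1,
                random_state=int(rng.integers(1 << 31))).fit(X)
    # Score each point by the inverse distance to its assigned centroid:
    # points in dense regions tend to sit close to a centroid in most runs.
    d = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    scores += 1.0 / (1.0 + d)
scores /= n_runs

# Crude cutoff: flag the top-scoring 10% of points as the minority foreground.
cutoff = np.quantile(scores, 0.90)
foreground = scores >= cutoff
print(f"{foreground.sum()} points flagged as foreground; "
      f"{foreground[:50].sum()} of the 50 true signal points are among them")
```

In this sketch the ensemble average smooths out the arbitrariness of any single weak k-means run; the actual combination scheme and the way the foreground/background cutoff is chosen in EWOCS are described in the article itself.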