Unified Simultaneous Clustering and Feature Selection for Unlabeled and Labeled Data

Bibliographic details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2018-12, Vol. 29 (12), p. 6083-6098
Main authors: Han, Dongyoon; Kim, Junmo
Format: Article
Language: English
Description
Abstract: This paper proposes a novel feature selection method, unified simultaneous clustering feature selection (USCFS). A regularized regression with a new type of target matrix is formulated to select the most discriminative of the original features from labeled or unlabeled data. Regression with l_{2,1}-norm regularization allows the projection matrix to represent an effective selection of discriminative features. For unsupervised feature selection, the target matrix discovers label-like information not from the original data points but from the projected data points, which have reduced dimensionality. Without the aid of an affinity graph-based local structure learning method, USCFS allows the target matrix to capture latent cluster centers via orthogonal basis clustering and, simultaneously, to select discriminative features guided by those latent cluster centers. When class labels are available, the target matrix can also find latent class labels by treating the ground-truth class labels as an approximate guide. Supervised feature selection is therefore realized using these latent class labels, which may differ from the ground-truth class labels. Experimental results show that the proposed method outperforms state-of-the-art methods on diverse real-world data sets for both supervised and unsupervised feature selection.
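The role of the l_{2,1}-norm mentioned in the abstract can be illustrated with a minimal sketch. This is not the USCFS algorithm: the abstract does not give the objective or its solver, so the code only shows the generic idea that a row-sparse projection matrix induces a feature ranking. It assumes a projection matrix `W` whose rows index the original features; the function names `l21_norm` and `rank_features` and the example values are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """l_{2,1}-norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W, k):
    """Return the indices of the k rows of W with the largest Euclidean norm.

    Assumption: each row of W corresponds to one original feature, so a row
    that the regularizer has driven to (near) zero marks a discarded feature.
    """
    row_norms = np.linalg.norm(W, axis=1)
    order = np.argsort(-row_norms, kind="stable")  # descending by row norm
    return order[:k]


# Hypothetical 3-feature, 2-dimensional projection matrix.
W = np.array([
    [3.0, 4.0],   # strongly used feature
    [0.0, 0.0],   # feature suppressed by the row-sparsity penalty
    [0.6, 0.8],   # weakly used feature
])

selected = rank_features(W, k=2)
```

Penalizing the sum of row norms, rather than the sum of squared entries, pushes entire rows toward zero together, which is why the row norms of the learned matrix can serve as feature scores.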
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2018.2818444