Unsupervised feature selection via latent representation learning and manifold regularization

With the rapid development of multimedia technology, massive unlabelled data with high dimensionality need to be processed. As a means of dimensionality reduction, unsupervised feature selection has been widely recognized as an important and challenging pre-step for many machine learning and data mi...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Neural networks 2019-09, Vol.117, p.163-178
Hauptverfasser:	Tang, Chang, Bian, Meiru, Liu, Xinwang, Li, Miaomiao, Zhou, Hua, Wang, Pichao, Yin, Hailin
Format:	Artikel
Sprache:	eng
Schlagworte:	Latent representation learning Local structure preservation Manifold regularization Non-negative matrix factorization Unsupervised feature selection
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	With the rapid development of multimedia technology, massive unlabelled data with high dimensionality need to be processed. As a means of dimensionality reduction, unsupervised feature selection has been widely recognized as an important and challenging pre-step for many machine learning and data mining tasks. Traditional unsupervised feature selection algorithms usually assume that the data instances are identically distributed and there is no dependency between them. However, the data instances are not only associated with high dimensional features but also inherently interconnected with each other. Furthermore, the inevitable noises mixed in data could degenerate the performances of previous methods which perform feature selection in original data space. Without label information, the connection information between data instances can be exploited and could help select relevant features. In this work, we propose a robust unsupervised feature selection method which embeds the latent representation learning into feature selection. Instead of measuring the feature importances in original data space, the feature selection is carried out in the learned latent representation space which is more robust to noises. The latent representation is modelled by non-negative matrix factorization of the affinity matrix which explicitly reflects the relationships of data instances. Meanwhile, the local manifold structure of original data space is preserved by a graph based manifold regularization term in the transformed feature space. An efficient alternating algorithm is developed to optimize the proposed model. Experimental results on eight benchmark datasets demonstrate the effectiveness of the proposed method.
ISSN:	0893-6080 1879-2782
DOI:	10.1016/j.neunet.2019.04.015