Unsupervised feature selection based on kernel fisher discriminant analysis and regression learning

In this paper, we propose a new feature selection method called kernel fisher discriminant analysis and regression learning based algorithm for unsupervised feature selection. The existing feature selection methods are based on either manifold learning or discriminative techniques, each of which has...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Machine learning 2019-04, Vol.108 (4), p.659-686
Hauptverfasser: Shang, Ronghua, Meng, Yang, Liu, Chiyang, Jiao, Licheng, Esfahani, Amir M. Ghalamzan, Stolkin, Rustam
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In this paper, we propose a new feature selection method called kernel fisher discriminant analysis and regression learning based algorithm for unsupervised feature selection. The existing feature selection methods are based on either manifold learning or discriminative techniques, each of which has some shortcomings. Although some studies show the advantages of two-steps method benefiting from both manifold learning and discriminative techniques, a joint formulation has been shown to be more efficient. To do so, we construct a global discriminant objective term of a clustering framework based on the kernel method. We add another term of regression learning into the objective function, which can impose the optimization to select a low-dimensional representation of the original dataset. We use L 2,1 - norm of the features to impose a sparse structure upon features, which can result in more discriminative features. We propose an algorithm to solve the optimization problem introduced in this paper. We further discuss convergence, parameter sensitivity, computational complexity, as well as the clustering and classification accuracy of the proposed algorithm. In order to demonstrate the effectiveness of the proposed algorithm, we perform a set of experiments with different available datasets. The results obtained by the proposed algorithm are compared against the state-of-the-art algorithms. These results show that our method outperforms the existing state-of-the-art methods in many cases on different datasets, but the improved performance comes with the cost of increased time complexity.
ISSN:0885-6125
1573-0565
DOI:10.1007/s10994-018-5765-6