Learning Sparse PCA with Stabilized ADMM Method on Stiefel Manifold
Published in: | IEEE Transactions on Knowledge and Data Engineering, 2021-03, Vol. 33 (3), pp. 1078-1088 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Abstract: | Sparse principal component analysis (SPCA) produces principal components with sparse loadings, which is important for handling data with many irrelevant features and critical for interpreting the results. To deal with orthogonality constraints, most previous approaches address SPCA with several components using techniques such as deflation and convex relaxation. However, deflation usually yields suboptimal solutions due to poor approximations, while convex relaxations are often computationally expensive. To address these issues, we propose to solve SPCA directly over the Stiefel manifold and develop a stabilized Alternating Direction Method of Multipliers (SADMM) to handle the nonconvex orthogonality constraints. Compared to traditional ADMM, the proposed SADMM converges well over a wide range of parameters and obtains a better solution. We also study the convergence of SADMM theoretically. Furthermore, most existing methods ignore an inherent drawback of SPCA: the importance of different components is not considered during feature selection, which often makes the selected features suboptimal. To address this, we further propose a two-stage method that takes the importance of different components into account to select the most important features. Empirical studies on both synthetic and real-world datasets show that the proposed algorithms outperform existing state-of-the-art methods. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2019.2935449 |
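
The abstract refers to sparse loadings and to orthogonality constraints on the Stiefel manifold. As a rough illustration only, and not the SADMM algorithm of the paper, the sketch below shows two standard building blocks that ADMM-style schemes for such problems commonly rely on: entrywise soft-thresholding (the proximal operator of the l1 norm, which induces sparsity) and projection onto the set of matrices with orthonormal columns via the thin SVD. The function names `soft_threshold` and `project_to_stiefel` and the sample data are assumptions made for this example.

```python
import numpy as np


def soft_threshold(V, tau):
    """Proximal operator of tau * ||.||_1: shrink every entry toward zero by tau."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)


def project_to_stiefel(V):
    """Nearest matrix (in Frobenius norm) with orthonormal columns, via the thin SVD.

    Assumes V has full column rank, so that the projection is unique.
    """
    U, _, Vt = np.linalg.svd(V, full_matrices=False)
    return U @ Vt


# Hypothetical data: a random 6 x 2 matrix standing in for a loading matrix.
rng = np.random.default_rng(0)
V = rng.standard_normal((6, 2))

Q = project_to_stiefel(V)  # columns of Q are orthonormal
S = soft_threshold(np.array([[-1.5, 0.2], [0.7, -0.1]]), 0.5)  # small entries become zero
```

In a splitting scheme of this kind, one copy of the variable is typically kept orthonormal while another is kept sparse, and the multiplier update drives the two copies to agree. How the paper stabilizes that iteration is not described in this record.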