Simple Poisson PCA: an algorithm for (sparse) feature extraction with simultaneous dimension determination

Dimension reduction tools offer a popular approach to analysis of high-dimensional big data. In this paper, we propose an algorithm for sparse Principal Component Analysis for non-Gaussian data. Since our interest for the algorithm stems from applications in text data analysis we focus on the Poisso...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Computational statistics 2020-06, Vol.35 (2), p.559-577
Hauptverfasser: Smallman, Luke, Underwood, William, Artemiou, Andreas
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Dimension reduction tools offer a popular approach to analysis of high-dimensional big data. In this paper, we propose an algorithm for sparse Principal Component Analysis for non-Gaussian data. Since our interest for the algorithm stems from applications in text data analysis we focus on the Poisson distribution which has been used extensively in analysing text data. In addition to sparsity our algorithm is able to effectively determine the desired number of principal components in the model (order determination). The good performance of our proposal is demonstrated with both synthetic and real data examples.
ISSN:0943-4062
1613-9658
DOI:10.1007/s00180-019-00903-0