Stay on path: PCA along graph paths
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | We introduce a variant of (sparse) PCA in which the set of feasible support
sets is determined by a graph. In particular, we consider the following
setting: given a directed acyclic graph $G$ on $p$ vertices corresponding to
variables, the non-zero entries of the extracted principal component must
coincide with vertices lying along a path in $G$.
From a statistical perspective, information on the underlying network may
reduce the number of observations required to recover the
population principal component. We consider the canonical estimator which
optimally exploits the prior knowledge by solving a non-convex quadratic
maximization on the empirical covariance. We introduce a simple network and
analyze the estimator under the spiked covariance model. We show that side
information potentially improves the statistical complexity.
We propose two algorithms to approximate the solution of the constrained
quadratic maximization, and recover a component with the desired properties. We
empirically evaluate our schemes on synthetic and real datasets. |
---|---|
DOI: | 10.48550/arxiv.1506.02344 |
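
The constrained problem described in the summary can be read as maximizing $x^\top \hat{\Sigma} x$ over unit-norm vectors $x$ whose support lies along a path in $G$, where $\hat{\Sigma}$ is the empirical covariance. The following is a minimal sketch of that formulation, not one of the two approximation algorithms the paper proposes: it enumerates every source-to-sink path of a small DAG and takes the leading eigenvector of the covariance restricted to each path, which is only practical for tiny graphs. The function names `source_to_sink_paths` and `path_constrained_pc`, the adjacency-dictionary graph encoding, and the restriction to source-to-sink paths are assumptions made for illustration.

```python
import numpy as np


def source_to_sink_paths(adj):
    """Enumerate all source-to-sink paths of a DAG given as {vertex: [successors]}."""
    vertices = set(adj) | {v for succ in adj.values() for v in succ}
    has_pred = {v for succ in adj.values() for v in succ}
    sources = sorted(vertices - has_pred)
    paths = []

    def extend(path):
        successors = adj.get(path[-1], [])
        if not successors:  # reached a sink: record the completed path
            paths.append(list(path))
            return
        for nxt in successors:
            extend(path + [nxt])

    for s in sources:
        extend([s])
    return paths


def path_constrained_pc(sigma_hat, adj):
    """Exhaustive baseline (assumption, not the paper's method):
    maximize x^T sigma_hat x over unit vectors supported on one path."""
    best_val, best_x, best_path = -np.inf, None, None
    for path in source_to_sink_paths(adj):
        idx = np.array(path)
        sub = sigma_hat[np.ix_(idx, idx)]       # covariance restricted to the path
        eigvals, eigvecs = np.linalg.eigh(sub)  # eigenvalues in ascending order
        if eigvals[-1] > best_val:
            x = np.zeros(sigma_hat.shape[0])
            x[idx] = eigvecs[:, -1]             # embed the leading eigenvector
            best_val, best_x, best_path = eigvals[-1], x, path
    return best_x, best_path, best_val


if __name__ == "__main__":
    # Hypothetical 6-vertex DAG with two source-to-sink paths.
    adj = {0: [1, 2], 1: [3], 2: [4], 3: [5], 4: [5]}
    # Spiked covariance with the spike planted on the path 0 -> 1 -> 3 -> 5.
    x_true = np.zeros(6)
    x_true[[0, 1, 3, 5]] = 0.5
    sigma = np.eye(6) + 2.0 * np.outer(x_true, x_true)
    x_hat, path, val = path_constrained_pc(sigma, adj)
    print(path)  # the planted path [0, 1, 3, 5]
```

Using the population covariance in the demo keeps the example deterministic; with a sample covariance the recovered path depends on the number of observations, which is the regime the summary's statistical analysis concerns.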