A Theoretical Analysis of Noisy Sparse Subspace Clustering on Dimensionality-Reduced Data

Bibliographic details
Published in: IEEE Transactions on Information Theory, 2019-02, Vol. 65 (2), p. 685-706
Main authors: Wang, Yining; Wang, Yu-Xiang; Singh, Aarti
Format: Article
Language: English
Description
Summary: Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the dimensionality of data points to be clustered is compressed due to the constraints of measurement, computation, or privacy. In this paper, we study the theoretical properties of a popular subspace clustering algorithm named sparse subspace clustering (SSC) and establish formal success conditions of SSC on dimensionality-reduced data. Our analysis applies to the most general fully deterministic model, where both the underlying subspaces and the data points within each subspace are deterministically positioned, and to a wide range of dimensionality reduction techniques (e.g., Gaussian random projection, uniform subsampling, and sketching) that fall into a subspace embedding framework. Finally, we apply our analysis to a differentially private SSC algorithm and establish both privacy and utility guarantees of the proposed method.
ISSN: 0018-9448, 1557-9654
DOI: 10.1109/TIT.2018.2879912