A Theoretical Analysis of Noisy Sparse Subspace Clustering on Dimensionality-Reduced Data
Published in: | IEEE Transactions on Information Theory 2019-02, Vol.65 (2), p.685-706 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the dimensionality of data points to be clustered is compressed due to the constraints of measurement, computation, or privacy. In this paper, we study the theoretical properties of a popular subspace clustering algorithm named sparse subspace clustering (SSC) and establish formal success conditions of SSC on dimensionality-reduced data. Our analysis applies to the most general fully deterministic model, where both the underlying subspaces and the data points within each subspace are deterministically positioned, and to a wide range of dimensionality reduction techniques (e.g., Gaussian random projection, uniform subsampling, and sketching) that fall into a subspace embedding framework. Finally, we apply our analysis to a differentially private SSC algorithm and establish both privacy and utility guarantees of the proposed method. |
---|---|
ISSN: | 0018-9448, 1557-9654 |
DOI: | 10.1109/TIT.2018.2879912 |