Multi-scale geometric methods for data sets II: Geometric Multi-Resolution Analysis

Data sets are often modeled as samples from a probability distribution in RD, for D large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a d-dimensional manifold M, with d much smaller than D. When M is simply a linear subspace, one may exploit...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Applied and computational harmonic analysis 2012-05, Vol.32 (3), p.435-462
Hauptverfasser: Allard, William K., Chen, Guangliang, Maggioni, Mauro
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Data sets are often modeled as samples from a probability distribution in RD, for D large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a d-dimensional manifold M, with d much smaller than D. When M is simply a linear subspace, one may exploit this assumption for encoding efficiently the data by projecting onto a dictionary of d vectors in RD (for example found by SVD), at a cost (n+D)d for n data points. When M is nonlinear, there are no “explicit” and algorithmically efficient constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries, or dictionaries obtained by black-box global optimization. In this paper we construct data-dependent multi-scale dictionaries that aim at efficiently encoding and manipulating the data. Their construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa, in contrast with L1-type sparsity-seeking algorithms, but like adaptive nonlinear approximation in classical multi-scale analysis. In addition, data points are guaranteed to have a compressible representation in terms of the dictionary, depending on the assumptions on the geometry of the underlying probability distribution.
ISSN:1063-5203
1096-603X
DOI:10.1016/j.acha.2011.08.001