Dictionary Learning for Stereo Image Representation

Bibliographic details
Published in:	IEEE Transactions on Image Processing, 2011-04, Vol. 20 (4), p. 921-934
Main authors:	Tošić, Ivana; Frossard, Pascal
Format:	Article
Language:	English
Subjects:	Algorithms; Applied sciences; Approximation; Approximation methods; Artificial Intelligence; Cameras; Data Interpretation, Statistical; Dictionaries; Dictionary learning; Exact sciences and technology; Geometry; Image coding; Image Enhancement - methods; Image Interpretation, Computer-Assisted - methods; Image processing; Information, signal and communications theory; Learning; Likelihood Functions; Mathematical models; multi-view imaging; omni directional cameras; Pattern Recognition, Automated - methods; Photogrammetry - methods; Pixel; Representations; Reproducibility of Results; Sensitivity and Specificity; Signal and communications theory; Signal processing; Signal representation. Spectral analysis; Signal, noise; sparse approximations; Telecommunications and information theory; Three dimensional displays; Transforms; Visual

Description
Abstract:	One of the major challenges in multi-view imaging is the definition of a representation that reveals the intrinsic geometry of the visual information. Sparse image representations with overcomplete geometric dictionaries offer a way to efficiently approximate these images, such that the multi-view geometric structure becomes explicit in the representation. However, the choice of a good dictionary in this case is far from obvious. We propose a new method for learning overcomplete dictionaries that are adapted to the joint representation of stereo images. We first formulate a sparse stereo image model where the multi-view correlation is described by local geometric transforms of dictionary elements (atoms) in two stereo views. A maximum-likelihood (ML) method for learning stereo dictionaries is then proposed, where a multi-view geometry constraint is included in the probabilistic model. The ML objective function is optimized using the expectation-maximization algorithm. We apply the learning algorithm to the case of omnidirectional images, where we learn scales of atoms in a parametric dictionary. The resulting dictionaries provide better performance in the joint representation of stereo omnidirectional images as well as improved multi-view feature matching. We finally discuss and demonstrate the benefits of dictionary learning for distributed scene representation and camera pose estimation.
ISSN:	1057-7149, 1941-0042
DOI:	10.1109/TIP.2010.2081679
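
Illustration (not part of the catalog record): the abstract refers to sparse approximation with an overcomplete dictionary of atoms. The sketch below shows that general idea for a single signal using plain matching pursuit, which is a generic greedy technique and not necessarily the algorithm used in the article. It does not model the stereo transforms between views, the maximum-likelihood objective, or the expectation-maximization learning step. The function name `matching_pursuit`, the random dictionary `D`, and all sizes are assumptions chosen only for illustration.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation of `signal` with unit-norm dictionary columns.

    Hypothetical helper for illustration; not taken from the article.
    """
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Correlate every atom with what is still unexplained.
        correlations = dictionary.T @ residual
        # Pick the atom that matches the residual best.
        k = int(np.argmax(np.abs(correlations)))
        # Accumulate its coefficient and subtract its contribution.
        coeffs[k] += correlations[k]
        residual -= correlations[k] * dictionary[:, k]
    return coeffs, residual

# Overcomplete dictionary: 64 atoms for 16-dimensional signals (assumed sizes).
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64))
D /= np.linalg.norm(D, axis=0)  # normalize each atom to unit norm

# Synthetic signal built from two atoms (made-up test data).
true_coeffs = np.zeros(64)
true_coeffs[[3, 40]] = [2.0, -1.5]
y = D @ true_coeffs

coeffs, residual = matching_pursuit(y, D, n_atoms=10)
print("atoms used:", np.count_nonzero(coeffs))
print("relative residual:", np.linalg.norm(residual) / np.linalg.norm(y))
```

By construction the approximation `D @ coeffs` plus the residual reproduces the input signal, and each iteration can only shrink the residual norm. A stereo model of the kind the abstract describes would additionally relate the atoms used in one view to geometrically transformed atoms in the other view.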