Efficient multi-output scene coordinate prediction for fast and accurate camera relocalization from a single RGB image

Bibliographic Details
Published in: Computer Vision and Image Understanding, 2020-01, Vol. 190, p. 102850, Article 102850
Authors: Duong, Nam-Duong; Soladié, Catherine; Kacete, Amine; Richard, Pierre-Yves; Royan, Jérôme
Format: Article
Language: English
Online Access: Full text
Abstract: Camera relocalization refers to the problem of estimating the camera pose, i.e., the 3D translation and 3D rotation expressed in the world coordinate system, without any temporal constraint. Camera relocalization is necessary in localization systems. However, achieving a method that is both real-time and accurate remains challenging. In this paper, we introduce our data-oriented hybrid method, which merges machine learning and geometric approaches for fast and accurate camera relocalization from a single RGB image. We propose an efficient multi-output deep-forest regression based on sparse feature detection, which uses the whole learned feature vector at each split function to improve the accuracy of 2D–3D point correspondences. In particular, the multiple-coordinate regression of our deep forest handles ambiguous repetitive structures. The learned feature extraction can be pre-trained and reused across different scenes. The use of sparse feature detection reduces processing time and increases the accuracy of the predictions. Finally, we show favorable results in terms of accuracy and computational time compared to state-of-the-art methods.

Highlights:
•Real-time and accurate camera relocalization method using only RGB images.
•Reduces the amount of data handled while increasing its relevance.
•New split function that improves the accuracy of 2D–3D correspondences.
•Selection of patches from each image based on sparse feature detection.
•Feature extraction learned by a dedicated convolutional neural network.
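The abstract outlines a pipeline of sparse keypoint detection, per-patch feature extraction, multi-output regression of 3D scene coordinates, and geometric pose estimation from the resulting 2D–3D correspondences. The following is a minimal Python sketch of that pipeline, not the authors' implementation: ORB keypoints and raw patch intensities stand in for the paper's learned CNN features, scikit-learn's RandomForestRegressor stands in for the multi-output deep forest, and OpenCV's solvePnPRansac performs the final pose estimation. The camera matrix K, the training images, and the ground-truth scene-coordinate maps are assumed inputs.

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestRegressor

PATCH = 16  # assumed half-size of the intensity patch around each keypoint

detector = cv2.ORB_create(nfeatures=500)         # sparse feature detection (stand-in choice)
forest = RandomForestRegressor(n_estimators=50)  # multi-output: predicts (X, Y, Z) per patch

def extract_patch_features(gray, keypoints, half=PATCH):
    """Flatten a small intensity patch around each keypoint (stand-in for learned features)."""
    h, w = gray.shape
    feats, pts2d = [], []
    for kp in keypoints:
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        if half <= x < w - half and half <= y < h - half:
            patch = gray[y - half:y + half, x - half:x + half]
            feats.append(patch.astype(np.float32).ravel() / 255.0)
            pts2d.append([kp.pt[0], kp.pt[1]])
    return np.asarray(feats), np.asarray(pts2d, dtype=np.float32)

def train(train_images, scene_coord_maps):
    """Fit the forest on (patch feature -> ground-truth 3D scene coordinate) pairs."""
    X, Y = [], []
    for img, coords in zip(train_images, scene_coord_maps):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        feats, pts2d = extract_patch_features(gray, detector.detect(gray, None))
        for f, (u, v) in zip(feats, pts2d):
            X.append(f)
            Y.append(coords[int(v), int(u)])  # per-pixel 3D coordinate from the training data
    forest.fit(np.asarray(X), np.asarray(Y))

def relocalize(img, K):
    """Estimate the camera pose of a single RGB image from predicted 2D-3D correspondences."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    feats, pts2d = extract_patch_features(gray, detector.detect(gray, None))
    pts3d = forest.predict(feats).astype(np.float32)  # predicted scene coordinates
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, None)
    return (rvec, tvec) if ok else None
```

Note that the split functions described in the abstract operate on the whole learned feature vector rather than on single feature dimensions; the plain random forest above does not reproduce that detail and serves only to illustrate the overall data flow.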
ISSN: 1077-3142, 1090-235X
DOI: 10.1016/j.cviu.2019.102850