Unsupervised Learning-Based Depth Estimation-Aided Visual SLAM Approach

Bibliographic details
Published in:	Circuits, Systems, and Signal Processing, 2020-02, Vol.39 (2), p.543-570
Main authors:	Geng, Mingyang; Shang, Suning; Ding, Bo; Wang, Huaimin; Zhang, Pengfei
Format:	Article
Language:	English
Subjects:	Circuits and Systems; Electrical Engineering; Electronics and Microelectronics; Engineering; Image reconstruction; Instrumentation; Methods; Motion simulation; Signal, Image and Speech Processing; Simultaneous localization and mapping; Training; Unsupervised learning
Online access:	Full text

Description
Summary:	Simultaneous localization and map construction (SLAM) tasks have been proven to benefit greatly from depth information about the environment. In this paper, we first present an unsupervised end-to-end learning framework for monocular depth and camera motion estimation from video sequences. Unlike existing unsupervised methods, we not only use image reconstruction for supervision but also exploit the pose estimation method used in traditional SLAM approaches to strengthen the supervisory signal and add extra training constraints for monocular depth and camera motion estimation. Furthermore, we successfully use our unsupervised learning framework to assist the traditional ORB-SLAM system when the ORB-SLAM initialization module cannot match enough features. Qualitative and quantitative experiments show that our unsupervised learning framework outperforms the supervised methods on the depth estimation task and improves on the previous state-of-the-art unsupervised approach by 13.5% on the KITTI dataset. For the pose estimation task, our method performs comparably to the supervised methods that use ground-truth pose data for training. In addition, our unsupervised learning framework significantly accelerates the initialization process of the traditional ORB-SLAM system and effectively improves the accuracy of environmental mapping in scenes with strong lighting and weak texture.
ISSN:	0278-081X 1531-5878
DOI:	10.1007/s00034-019-01173-3