Real-Time Dense Construction With Deep Multiview Stereo Using Camera and IMU Sensors

Bibliographic details
Published in:	IEEE Sensors Journal, 2023-09, Vol. 23 (17), p. 19648-19659
Main authors:	Liu, Yanjie; Wu, Heng; Wang, Chao; Wei, Yanlong; Ren, Meixuan; Feng, Tong
Format:	Article
Language:	English
Subjects:	Cameras; Computer vision; Degrees of freedom; Image reconstruction; Optimization; Performance enhancement; Pose estimation; Real time; Robotics; Three dimensional models; Vibration
Online access:	Full text

Description
Summary:	Real-time dense 3-D reconstruction is one of the major challenges in computer vision and robotics. In this article, we propose a real-time, metric-scale 3-D reconstruction model that combines a direct visual-inertial odometry using stereo cameras with a deep multiview stereo network. To address the scale uncertainty of dense maps built from a monocular camera, we designed a direct stereo visual-inertial odometry (DSVIO). The odometry combines static stereo optimization with direct visual-inertial odometry and uses left-right image pairs to initialize the depth of feature points, which significantly improves the accuracy of the 6-degree-of-freedom (6-DoF) pose and the metric scale in the active window. For depth estimation, our proposed minimizing photometric re-projection loss (MPRP) can integrate the common viewpoints for depth estimation across different views to improve the performance of the deep multiview stereo network (CVA-MVSNet). Finally, the predicted depth map is fused into a truncated signed distance function (TSDF) voxel volume. Experiments show that the pose estimation of our visual odometry achieves state-of-the-art (SOTA) performance when the trajectory is smooth with low jitter. Under fast jitter, our method still outperforms the monocular visual-inertial odometry of ORB-SLAM3 (Oriented FAST and Rotated BRIEF simultaneous localization and mapping 3), but is slightly inferior to the stereo visual-inertial odometry of ORB-SLAM3. In the depth estimation experiments, MPRP effectively improves the performance of CVA-MVSNet, with all evaluation metrics better than those of the original method. Moreover, our method performs well in real-time 3-D reconstruction with metric scale.
ISSN:	1530-437X 1558-1748
DOI:	10.1109/JSEN.2023.3295000
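
The summary's final pipeline step, fusing a predicted depth map into a TSDF voxel volume, can be illustrated with a minimal sketch. It shows the standard weighted running-average TSDF update for the voxels along a single camera ray. The function name `tsdf_update`, the truncation distance, the per-measurement weight of 1, and the weight cap are all assumptions chosen for illustration; the record does not state which fusion parameters or implementation the article uses.

```python
def tsdf_update(tsdf, weight, voxel_depths, measured_depth,
                trunc=0.1, max_weight=64.0):
    """Fuse one depth measurement into the voxels along a single camera ray.

    Hypothetical helper for illustration only (not the article's code).
    tsdf, weight: lists with one entry per voxel, updated in place.
    voxel_depths: depth of each voxel centre along the ray, in metres.
    measured_depth: metric depth predicted for this ray, in metres.
    """
    for i, z in enumerate(voxel_depths):
        sdf = measured_depth - z       # positive in front of the surface
        if sdf < -trunc:
            continue                   # far behind the surface: unobserved, skip
        d = min(1.0, sdf / trunc)      # truncate and normalise to [-1, 1]
        w_new = 1.0                    # assumed constant per-measurement weight
        # Weighted running average of the old value and the new observation.
        tsdf[i] = (weight[i] * tsdf[i] + w_new * d) / (weight[i] + w_new)
        weight[i] = min(weight[i] + w_new, max_weight)
    return tsdf, weight
```

Calling the function once per incoming depth measurement averages out per-frame depth noise, and the surface lies where the stored value crosses zero. Because the voxel depths are in metres, the fused volume is only meaningful if the depth maps are metric, which is the role the summary assigns to the stereo-initialized odometry.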