FROMFusion: Fast and Robust On-Manifold Dense Reconstruction for Low-Cost Wheeled Robots


Bibliographic details
Published in:	IEEE Transactions on Instrumentation and Measurement, 2024, Vol. 73, pp. 1-18
Main authors:	Bao, Minjie; Fan, Junguo; Fan, Zhendong; Xu, Runze; Wang, Ke; Mu, Chaonan; Li, Ruifeng; Zhou, Hewen; Kang, Peng
Format:	Article
Language:	English
Subjects:	Algorithms; Cameras; Degeneration; Dense reconstruction; field robots; Image reconstruction; Image segmentation; Location awareness; Low cost; Manifolds (mathematics); Mobile robots; Multisensor fusion; Occlusion; Optimization; Perception; Point cloud compression; real-time; Real-time systems; Robot vision systems; Robots; Robustness; Spherical coordinates

Description
Abstract:	Existing voxel-hashing (VH)-based dense reconstruction methods have shown impressive results on datasets collected by hand-held cameras. Large-scale scenes are maintained with a truncated signed distance function (TSDF) volumetric representation. However, practically deploying such methods on low-cost embedded mobile robots remains challenging due to heavy computational burdens and various camera perception degeneration cases. In this work, we propose FROMFusion, a fast and robust on-manifold dense reconstruction framework based on multisensor fusion, which systematically addresses the alignment of the point cloud with the hashed TSDF volume (HTV). Its purely geometric nature ensures robustness to image motion blur and poor lighting conditions. To reduce memory overhead, we propose a spherical-coordinate-based HTV segmentation algorithm. To cope with missing geometric features, camera occlusion, and over-range measurements, a loosely coupled LiDAR-wheel-inertial odometry (LWIO) supplies trustworthy initial guesses for camera pose optimization. At its core is a two-stage depth-to-HTV matching algorithm, which includes a coarse voxel-level pose ergodic search and a fine subvoxel-level Gauss-Newton (GN) solver with an Anderson acceleration (AA) strategy for faster convergence. We evenly distribute heavy computational workloads across heterogeneous computing systems. Extensive field experiments on a low-cost wheeled robot cleaner demonstrate that our method models continuous surfaces of large-scale scenes with high quality in both geometry and texture, outperforming current state-of-the-art methods in robustness to camera perception degeneration cases by a significant margin. The frame rate of the online embedded implementation reaches up to 47.21 Hz.
ISSN:	0018-9456 1557-9662
DOI:	10.1109/TIM.2024.3481537
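
Illustrative note: the abstract refers to a hashed TSDF volume without describing its layout. The sketch below is not taken from the article; it shows only the generic idea of voxel hashing with the standard weighted-running-average TSDF update. The class name HashedTSDF, the constants VOXEL_SIZE and TRUNCATION, and their values are assumptions made for illustration, and a Python dict stands in for the hash table.

```python
import math

VOXEL_SIZE = 0.05   # metres per voxel edge (assumed value, not from the article)
TRUNCATION = 0.15   # truncation distance in metres (assumed value)


class HashedTSDF:
    """Sparse TSDF: only voxels that received a measurement are stored."""

    def __init__(self):
        # Maps integer voxel coordinates (i, j, k) -> (tsdf, weight).
        self.voxels = {}

    @staticmethod
    def key(point):
        """Integer index of the voxel containing a 3-D point."""
        return tuple(math.floor(c / VOXEL_SIZE) for c in point)

    def integrate(self, point, signed_distance, weight=1.0):
        """Fuse one signed-distance sample (weight must be positive)."""
        # Normalise by the truncation distance, then clamp to [-1, 1].
        d = max(-1.0, min(1.0, signed_distance / TRUNCATION))
        k = self.key(point)
        old_d, old_w = self.voxels.get(k, (0.0, 0.0))
        new_w = old_w + weight
        # Weighted running average of all samples seen by this voxel.
        self.voxels[k] = ((old_d * old_w + d * weight) / new_w, new_w)

    def query(self, point):
        """Return (tsdf, weight) for the voxel at `point`, or None if unobserved."""
        return self.voxels.get(self.key(point))


volume = HashedTSDF()
volume.integrate((0.12, 0.0, 0.0), 0.03)
volume.integrate((0.12, 0.0, 0.0), 0.06)
```

Because unobserved voxels are never allocated, memory grows with the observed surface rather than with the bounding volume of the scene, which is the usual motivation for hashing a TSDF.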