Fusing Structure from Motion and Simulation-Augmented Pose Regression from Optical Flow for Challenging Indoor Environments
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Journal of Visual Communication and Image Representation, Volume
103, August 2024. The localization of objects is a crucial task in various applications such as
robotics, virtual and augmented reality, and the transportation of goods in
warehouses. Recent advances in deep learning have enabled localization
using monocular visual cameras. While structure from motion (SfM) predicts the
absolute pose from a point cloud, absolute pose regression (APR) methods learn
a semantic understanding of the environment through neural networks. However,
both fields face challenges caused by the environment such as motion blur,
lighting changes, repetitive patterns, and feature-less structures. This study
aims to address these challenges by incorporating additional information and
regularizing the absolute pose using relative pose regression (RPR) methods.
RPR methods suffer from different challenges, i.e., motion blur. The optical
flow between consecutive images is computed using the Lucas-Kanade algorithm,
and the relative pose is predicted using an auxiliary small recurrent
convolutional network. The fusion of absolute and relative poses is a complex
task due to the mismatch between the global and local coordinate systems.
State-of-the-art methods fusing absolute and relative poses use pose graph
optimization (PGO) to regularize the absolute pose predictions using relative
poses. In this work, we propose recurrent fusion networks to optimally align
absolute and relative pose predictions to improve the absolute pose prediction.
We evaluate eight different recurrent units and construct a simulation
environment to pre-train the APR and RPR networks for better generalization.
Additionally, we record a large database of different scenarios in a
challenging large-scale indoor environment that mimics a warehouse with
transportation robots. We conduct hyperparameter searches and experiments to
show the effectiveness of our recurrent fusion method compared to PGO. |
---|---|
DOI: | 10.48550/arxiv.2304.07250 |
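
The mismatch between the global and local coordinate systems mentioned in the summary can be illustrated with a minimal sketch. It assumes a planar pose (x, y, heading) instead of a full 6-DoF pose, and the function name `compose_pose` is hypothetical. It is not the recurrent fusion network or the PGO baseline from the article, only the frame conversion that any fusion of absolute and relative poses has to account for.

```python
import math


def compose_pose(absolute, relative):
    """Apply a relative motion (local camera frame) to an absolute pose (global frame).

    Assumption for illustration: poses are planar tuples (x, y, theta),
    with theta in radians. This is a simplification of a 6-DoF pose.
    """
    x, y, theta = absolute
    dx, dy, dtheta = relative
    # Rotate the local displacement into the global frame before adding it.
    gx = x + math.cos(theta) * dx - math.sin(theta) * dy
    gy = y + math.sin(theta) * dx + math.cos(theta) * dy
    # Wrap the new heading into [-pi, pi].
    gtheta = math.atan2(math.sin(theta + dtheta), math.cos(theta + dtheta))
    return gx, gy, gtheta


# Camera at (2, 1) with a heading of 90 degrees, moving 1 unit "forward" locally.
pose = compose_pose((2.0, 1.0, math.pi / 2), (1.0, 0.0, 0.0))
```

In this sketch, the same local "forward" motion changes the global y coordinate rather than x, because the relative pose is only meaningful once it is rotated by the current absolute heading.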