See What the Robot Can't See: Learning Cooperative Perception for Visual Navigation
Format: Article
Language: English
Abstract: We consider the problem of navigating a mobile robot towards a target in an
unknown environment that is endowed with visual sensors, where neither the
robot nor the sensors have access to global positioning information and only
use first-person-view images. In order to overcome the need for positioning, we
train the sensors to encode and communicate relevant viewpoint information to
the mobile robot, whose objective it is to use this information to navigate to
the target along the shortest path. We overcome the challenge of enabling all
the sensors (even those that cannot directly see the target) to predict the
direction along the shortest path to the target by implementing a
neighborhood-based feature aggregation module using a Graph Neural Network
(GNN) architecture. In our experiments, we first demonstrate generalizability
to previously unseen environments with various sensor layouts. Our results show
that by using communication between the sensors and the robot, we achieve up to
2.0x improvement in SPL (Success weighted by Path Length) when compared to a
communication-free baseline. This is done without requiring a global map,
positioning data, or pre-calibration of the sensor network. Second, we perform
a zero-shot transfer of our model from simulation to the real world. Laboratory
experiments demonstrate the feasibility of our approach in various cluttered
environments. Finally, we showcase examples of successful navigation to the
target while both the sensor network layout as well as obstacles are
dynamically reconfigured as the robot navigates. We provide a video demo, the
dataset, trained models, and source code.
https://www.youtube.com/watch?v=kcmr6RUgucw
https://github.com/proroklab/sensor-guided-visual-nav
DOI: 10.48550/arxiv.2208.00759
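
The abstract's key mechanism is a neighborhood-based feature aggregation over the sensor graph, implemented with a Graph Neural Network, so that even sensors that cannot see the target can predict the direction of the shortest path. Below is a minimal PyTorch sketch of what such an aggregation-and-prediction step could look like. It is not the authors' implementation (see the repository linked above); all class names, feature dimensions, the mean-pooling rule, and the 2-D direction head are illustrative assumptions.

```python
# Minimal sketch (not the authors' released code) of neighborhood-based
# feature aggregation over a sensor graph, followed by a per-sensor
# prediction of a direction toward the target. Dimensions and the
# mean-aggregation rule are illustrative assumptions.
import torch
import torch.nn as nn


class SensorGNNLayer(nn.Module):
    """One round of message passing: mix each node's feature with the mean of its neighbors'."""

    def __init__(self, dim: int):
        super().__init__()
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_sensors, dim) encoded first-person-view features
        # adj: (num_sensors, num_sensors) 0/1 communication adjacency
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        neighbor_mean = adj @ x / deg  # aggregate features received from neighbors
        return self.update(torch.cat([x, neighbor_mean], dim=-1))


class DirectionPredictor(nn.Module):
    """Stacked aggregation layers followed by a per-sensor heading prediction."""

    def __init__(self, dim: int = 64, num_layers: int = 2):
        super().__init__()
        self.layers = nn.ModuleList(SensorGNNLayer(dim) for _ in range(num_layers))
        self.head = nn.Linear(dim, 2)  # 2-D unit direction toward the target

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x, adj)
        d = self.head(x)
        return d / d.norm(dim=-1, keepdim=True).clamp(min=1e-8)


if __name__ == "__main__":
    num_sensors, dim = 5, 64
    features = torch.randn(num_sensors, dim)  # stand-in for per-sensor image encodings
    adjacency = (torch.rand(num_sensors, num_sensors) > 0.5).float()
    adjacency = ((adjacency + adjacency.T) > 0).float().fill_diagonal_(0)
    directions = DirectionPredictor(dim)(features, adjacency)
    print(directions.shape)  # (5, 2): one predicted direction per sensor
```

In this sketch, each message-passing round lets viewpoint information propagate one hop farther through the sensor network, which is how a sensor without direct sight of the target could still form a useful direction estimate.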