Energy-Based Models for Cross-Modal Localization using Convolutional Transformers
Format: Article
Language: English
Summary: We present a novel framework using Energy-Based Models (EBMs) for
localizing a ground vehicle equipped with a range sensor against satellite
imagery in the absence of GPS. Lidar sensors have become ubiquitous on
autonomous vehicles for describing their surrounding environment. Map priors
are typically built using the same sensor modality for localization purposes.
However, such map-building efforts with range sensors are often expensive and
time-consuming. Instead, we leverage satellite images as map priors, which are
widely available, easily accessible, and provide comprehensive coverage. We
propose a method using convolutional transformers that performs accurate
metric-level localization in a cross-modal manner, which is challenging due to
the drastic difference in appearance between sparse range sensor readings and
rich satellite imagery. We train our model end-to-end and demonstrate that our
approach achieves higher accuracy than the state of the art on KITTI,
Pandaset, and a custom dataset.
DOI: 10.48550/arxiv.2306.04021
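
The record only gives the high-level idea: an energy-based model, realized with convolutional transformers, assigns a compatibility score to a lidar sweep and a satellite map crop, and localization amounts to selecting the candidate pose whose crop yields the lowest energy. The following PyTorch sketch illustrates that pattern only; the paper's actual architecture, input encoding, candidate sampling, and training loss are not described here, so all module names, channel counts, and the exhaustive candidate search below are assumptions.

```python
# Illustrative sketch only: a conv-stem + transformer energy head that scores
# (lidar BEV raster, satellite crop) pairs; not the paper's actual model.
import torch
import torch.nn as nn

class CrossModalEnergy(nn.Module):
    """Scalar energy for a (lidar BEV raster, satellite crop) pair; lower
    energy indicates better alignment under this toy formulation."""
    def __init__(self, d_model=128):
        super().__init__()
        def stem(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, d_model, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.lidar_stem = stem(1)   # assumption: 1-channel lidar occupancy BEV
        self.sat_stem = stem(3)     # assumption: RGB satellite crop
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)

    def forward(self, lidar_bev, sat_crop):
        # (B, C, H, W) feature maps flattened into (B, H*W, d_model) tokens
        l = self.lidar_stem(lidar_bev).flatten(2).transpose(1, 2)
        s = self.sat_stem(sat_crop).flatten(2).transpose(1, 2)
        tokens = self.encoder(torch.cat([l, s], dim=1))
        return self.head(tokens.mean(dim=1)).squeeze(-1)   # (B,) energies

# Localization as energy minimization: score one satellite crop per candidate
# pose and keep the lowest-energy candidate (toy tensors, random weights).
model = CrossModalEnergy()
lidar_bev = torch.rand(1, 1, 64, 64)
candidate_crops = torch.rand(16, 3, 64, 64)          # 16 hypothesized poses
energies = model(lidar_bev.expand(16, -1, -1, -1), candidate_crops)
best_pose_idx = energies.argmin().item()
print(f"lowest-energy candidate: {best_pose_idx}")
```

In an EBM formulation like this, training would push energies of correctly aligned pairs below those of misaligned ones (for example with a contrastive or InfoNCE-style objective), but the specific loss used by the authors is not stated in this record.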