Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Understanding road geometry is a critical component of the autonomous vehicle
(AV) stack. While high-definition (HD) maps can readily provide such
information, they suffer from high labeling and maintenance costs. Accordingly,
many recent works have proposed methods for estimating HD maps online from
sensor data. The vast majority of recent approaches encode multi-camera
observations into an intermediate representation, e.g., a bird's eye view (BEV)
grid, and produce vector map elements via a decoder. While this architecture is
performant, it decimates much of the information encoded in the intermediate
representation, preventing downstream tasks (e.g., behavior prediction) from
leveraging it. In this work, we propose exposing the rich internal features
of online map estimation methods and show how they enable more tightly
integrating online mapping with trajectory forecasting. In doing so, we find
that directly accessing internal BEV features yields up to 73% faster inference
speeds and up to 29% more accurate predictions on the real-world nuScenes
dataset. |
---|---|
DOI: | 10.48550/arxiv.2407.06683 |