BOTT: Box Only Transformer Tracker for 3D Object Tracking
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Tracking 3D objects is an important task in autonomous driving. Classical
Kalman-filter-based methods are still the most popular solutions. However,
these methods require handcrafted motion-model designs and cannot
benefit from the growing amount of data. In this paper, Box Only Transformer
Tracker (BOTT) is proposed to learn to link 3D boxes of the same object across
different frames by taking all the 3D boxes in a time window as input.
Specifically, transformer self-attention is applied to exchange information
between all the boxes and learn globally informative box embeddings. The
similarity between these learned embeddings can be used to link the boxes of
the same object. BOTT works seamlessly in both online and offline tracking modes.
Its simplicity significantly reduces the engineering
effort required by traditional Kalman-filter-based methods. Experiments
show that BOTT achieves competitive performance on the two largest 3D MOT benchmarks:
69.9 and 66.7 AMOTA on the nuScenes validation and test splits, respectively, and 56.45
and 59.57 MOTA L2 on the Waymo Open Dataset validation and test splits,
respectively. This work suggests that tracking 3D objects by learning features
directly from 3D boxes using transformers is a simple yet effective approach. |
---|---|
DOI: | 10.48550/arxiv.2308.08753 |
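
The pipeline described in the abstract (self-attention over all boxes in a time window, then embedding similarity for linking) can be illustrated with a small sketch. This is not the authors' implementation. The 8-value box encoding (frame index, centre, size, yaw), the single attention head, the embedding size of 16, the cosine similarity, and the greedy one-to-one matching between consecutive frames are all assumptions made for illustration. The projection weights are random rather than learned, so the resulting links demonstrate only the data flow, not tracking quality.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Untrained stand-in for a learned projection matrix (assumption)."""
    return [[rng.gauss(0.0, 1.0 / math.sqrt(rows)) for _ in range(cols)]
            for _ in range(rows)]


def self_attention(feats, wq, wk, wv):
    """Single-head scaled dot-product self-attention across all boxes."""
    q, k, v = matmul(feats, wq), matmul(feats, wk), matmul(feats, wv)
    scale = math.sqrt(len(wq[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    attn = [softmax([s / scale for s in row]) for row in scores]
    return matmul(attn, v)


def cosine_similarity(emb):
    """Pairwise cosine similarity between all box embeddings."""
    norms = [math.sqrt(sum(x * x for x in e)) or 1.0 for e in emb]
    return [[sum(a * b for a, b in zip(ei, ej)) / (ni * nj)
             for ej, nj in zip(emb, norms)]
            for ei, ni in zip(emb, norms)]


def link_boxes(frames, sim, threshold):
    """Greedy one-to-one linking of boxes in consecutive frames (hypothetical helper)."""
    candidates = sorted(
        ((sim[i][j], i, j)
         for i in range(len(frames))
         for j in range(len(frames))
         if frames[j] == frames[i] + 1 and sim[i][j] >= threshold),
        reverse=True,
    )
    used_src, used_dst, links = set(), set(), []
    for _, i, j in candidates:
        if i not in used_src and j not in used_dst:
            links.append((i, j))
            used_src.add(i)
            used_dst.add(j)
    return links


# Toy time window: two objects observed in two frames.
# Columns (assumed encoding): frame, x, y, z, length, width, height, yaw.
boxes = [
    [0, 10.0, 2.0, 0.5, 4.5, 1.9, 1.6, 0.00],
    [0, -5.0, 8.0, 0.5, 4.2, 1.8, 1.5, 1.57],
    [1, 10.8, 2.1, 0.5, 4.5, 1.9, 1.6, 0.02],
    [1, -5.0, 8.6, 0.5, 4.2, 1.8, 1.5, 1.57],
]

rng = random.Random(0)
dim = 16
wq, wk, wv = (random_matrix(8, dim, rng) for _ in range(3))

embeddings = self_attention(boxes, wq, wk, wv)
similarity = cosine_similarity(embeddings)
frames = [int(b[0]) for b in boxes]
links = link_boxes(frames, similarity, threshold=0.0)
print(links)
```

Each entry in `links` is a pair of box indices, one from a frame and one from the next frame. In a trained model the projection weights would be learned so that boxes of the same object receive similar embeddings.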