ShaSTA-Fuse: Camera-LiDAR Sensor Fusion to Model Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking
Main authors:
Format: Article
Language: eng
Subjects:
Online access: Order full text
Abstract: 3D multi-object tracking (MOT) is essential for an autonomous mobile agent to safely navigate a scene. In order to maximize the perception capabilities of the autonomous agent, we aim to develop a 3D MOT framework that fuses camera and LiDAR sensor information. Building on our prior LiDAR-only work, ShaSTA, which models shape and spatio-temporal affinities for 3D MOT, we propose a novel camera-LiDAR fusion approach for learning affinities. At its core, this work proposes a fusion technique that generates a rich sensory signal incorporating information about depth and distant objects to enhance affinity estimation for improved data association, track lifecycle management, false-positive elimination, false-negative propagation, and track confidence score refinement. Our main contributions include a novel fusion approach for combining camera and LiDAR sensory signals to learn affinities, and a first-of-its-kind multimodal sequential track confidence refinement technique that fuses 2D and 3D detections. Additionally, we perform an ablative analysis on each fusion step to demonstrate the added benefits of incorporating the camera sensor, particularly for small, distant objects that tend to suffer from the depth-sensing limits and sparsity of LiDAR sensors. In sum, our technique achieves state-of-the-art performance on the nuScenes benchmark amongst multimodal 3D MOT algorithms using CenterPoint detections.
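To make the abstract's central step concrete, the sketch below shows what affinity-based data association looks like in general: an affinity matrix between existing tracks and current-frame detections is resolved into matches with the Hungarian algorithm. The function name, the SciPy-based matcher, the threshold value, and the hard-coded matrix are illustrative assumptions, not the paper's actual implementation; in a ShaSTA-style pipeline the affinities would come from the learned camera-LiDAR fusion network.

```python
# Minimal sketch of affinity-based data association for 3D MOT.
# All names and values here are illustrative assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(affinity: np.ndarray, threshold: float = 0.3):
    """Match tracks (rows) to detections (columns) by maximizing total
    affinity; pairs scoring below `threshold` are left unmatched."""
    # The Hungarian algorithm minimizes cost, so negate the affinities.
    rows, cols = linear_sum_assignment(-affinity)
    matches = []
    unmatched_tracks = set(range(affinity.shape[0]))
    unmatched_dets = set(range(affinity.shape[1]))
    for r, c in zip(rows, cols):
        if affinity[r, c] >= threshold:
            matches.append((int(r), int(c)))
            unmatched_tracks.discard(r)
            unmatched_dets.discard(c)
    return matches, sorted(unmatched_tracks), sorted(unmatched_dets)

# Example: 2 live tracks, 3 new detections in the current frame.
aff = np.array([[0.9, 0.1, 0.0],
                [0.2, 0.05, 0.7]])
print(associate(aff))  # -> ([(0, 0), (1, 2)], [], [1])
```

Unmatched detections would then seed new tracks and unmatched tracks would enter lifecycle management, which is where the paper's false-positive elimination, false-negative propagation, and confidence refinement come into play.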
DOI: 10.48550/arxiv.2310.02532