SUIT: Learning Significance-guided Information for 3D Temporal Detection
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | 3D object detection from LiDAR point clouds is of critical importance for
autonomous driving and robotics. While sequential point clouds have the potential
to enhance 3D perception through temporal information, utilizing these temporal
features effectively and efficiently remains a challenging problem. Based on
the observation that the foreground information is sparsely distributed in
LiDAR scenes, we believe sufficient knowledge can be provided by a sparse format
rather than dense maps. To this end, we propose to learn Significance-gUided
Information for 3D Temporal detection (SUIT), which simplifies temporal
information as sparse features for information fusion across frames.
Specifically, we first introduce a significant sampling mechanism that extracts
information-rich yet sparse features based on predicted object centroids. On
top of that, we present an explicit geometric transformation learning
technique, which learns the object-centric transformations among sparse
features across frames. We evaluate our method on the large-scale nuScenes and
Waymo datasets, where SUIT not only significantly reduces the memory and
computation cost of temporal fusion, but also performs well relative to
state-of-the-art baselines. |
---|---|
DOI: | 10.48550/arxiv.2307.01807 |
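
The summary describes sampling sparse features around predicted object centroids. As a rough illustration of that idea only, the toy sketch below keeps the k points nearest to any predicted centroid. The function name `sample_significant`, the nearest-centroid distance score, and the 2D sample data are all assumptions made for this example; they are not taken from the SUIT paper, whose actual sampling rule is not given in this record.

```python
from math import dist


def sample_significant(points, centroids, k):
    """Keep the k points closest to any predicted centroid.

    Hypothetical selection rule for illustration; assumes `centroids`
    is non-empty and that points and centroids share a dimensionality.
    """
    # Score each point by its distance to the nearest predicted centroid.
    scored = [(min(dist(p, c) for c in centroids), i)
              for i, p in enumerate(points)]
    # Smaller distance means more significant; ties break on original index.
    scored.sort()
    # Restore the original ordering of the points that were kept.
    keep = sorted(i for _, i in scored[:k])
    return [points[i] for i in keep]


# Toy 2D scene: two predicted object centroids and five candidate points.
points = [(0.0, 0.0), (5.0, 5.0), (0.5, 0.2), (9.0, 1.0), (5.1, 4.8)]
centroids = [(0.0, 0.0), (5.0, 5.0)]
sparse = sample_significant(points, centroids, k=3)
print(sparse)
```

Only the retained points would then be carried forward for fusion across frames, which is where the memory and computation savings mentioned in the summary would come from in a scheme of this kind.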