LinkOcc: 3D Semantic Occupancy Prediction with Temporal Association
Published in: | IEEE Transactions on Circuits and Systems for Video Technology, 2024-10, p. 1-1 |
---|---|
Format: | Article |
Language: | English |
Abstract: | 3D semantic occupancy has attracted considerable attention because it captures rich structural information about the entire autonomous driving scene. However, existing 3D occupancy prediction methods are typically designed for single-frame inputs, which leads to unsatisfactory performance and temporal inconsistencies in real-world continuous scenarios. In this paper, we introduce LinkOcc, a sparse-query approach that incorporates an efficient temporal association mechanism for 3D semantic occupancy prediction. LinkOcc is conceptually built on the prevalent DETR-like framework for 2D segmentation, and the temporal association mechanism is constructed on top of it. Specifically, we propose a near-online training strategy that jointly trains on two adjacent frames, combining the benefits of online and offline methods. Moreover, we introduce a temporal association strategy that uses contrastive learning to learn discriminative features for cross-frame semantic-level association. Comprehensive experiments demonstrate that LinkOcc not only surpasses state-of-the-art methods in 3D occupancy prediction but also achieves promising performance on foreground classes. |
---|---|
ISSN: | 1051-8215, 1558-2205 |
DOI: | 10.1109/TCSVT.2024.3486019 |
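
The abstract mentions contrastive learning for cross-frame, semantic-level association but this record gives no detail on the loss or matching rule. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below uses an InfoNCE-style loss with multiple positives: query embeddings from two adjacent frames are pulled together when they share a semantic label and pushed apart otherwise. The function names, the `temperature` value, the use of cosine similarity, and the greedy matching step are all assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity_matrix(prev_embeds, curr_embeds):
    """Cosine similarity between every previous-frame and current-frame query."""
    prev_n = [l2_normalize(v) for v in prev_embeds]
    curr_n = [l2_normalize(v) for v in curr_embeds]
    return [[sum(a * b for a, b in zip(p, c)) for c in curr_n] for p in prev_n]


def cross_frame_contrastive_loss(prev_embeds, curr_embeds,
                                 prev_labels, curr_labels, temperature=0.1):
    """Hypothetical InfoNCE-style loss with multiple positives.

    For each previous-frame query, the current-frame queries with the same
    semantic label are positives; all other current-frame queries are negatives.
    Queries without any positive in the current frame are skipped.
    """
    sim = cosine_similarity_matrix(prev_embeds, curr_embeds)
    losses = []
    for i, row in enumerate(sim):
        positives = [j for j, lab in enumerate(curr_labels) if lab == prev_labels[i]]
        if not positives:
            continue
        logits = [s / temperature for s in row]
        # Numerically stable log-sum-exp over all current-frame queries.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Mean negative log-probability assigned to the positives.
        losses.append(-sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


def associate(prev_embeds, curr_embeds):
    """Greedy matching (an assumption): each previous-frame query is linked to
    its most similar current-frame query. Assumes curr_embeds is non-empty."""
    sim = cosine_similarity_matrix(prev_embeds, curr_embeds)
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


# Toy example: two queries per frame with 2-D embeddings and semantic labels.
prev_embeds = [[1.0, 0.0], [0.0, 1.0]]
curr_embeds = [[0.9, 0.1], [0.1, 0.9]]
aligned = cross_frame_contrastive_loss(prev_embeds, curr_embeds, [0, 1], [0, 1])
swapped = cross_frame_contrastive_loss(prev_embeds, curr_embeds, [0, 1], [1, 0])
matches = associate(prev_embeds, curr_embeds)
```

In this toy setup, `aligned` is much smaller than `swapped` because embeddings that share a label already point in nearly the same direction. A real system would compute the loss on learned query features inside an autodiff framework; plain Python is used here only to keep the sketch self-contained.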