Weak-to-Strong 3D Object Detection with X-Ray Distillation
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Keywords: | |
Summary: | This paper addresses the critical challenges of sparsity and occlusion in
LiDAR-based 3D object detection. Current methods often rely on supplementary
modules or specific architectural designs, potentially limiting their
applicability to new and evolving architectures. To our knowledge, we are the
first to propose a versatile technique that seamlessly integrates into any
existing framework for 3D Object Detection, marking the first instance of
Weak-to-Strong generalization in 3D computer vision. We introduce a novel
framework, X-Ray Distillation with Object-Complete Frames, suitable for both
supervised and semi-supervised settings, that leverages the temporal aspect of
point cloud sequences. This method extracts crucial information from both
previous and subsequent LiDAR frames, creating Object-Complete frames that
represent objects from multiple viewpoints, thus addressing occlusion and
sparsity. Given the limitation of not being able to generate Object-Complete
frames during online inference, we utilize Knowledge Distillation within a
Teacher-Student framework. This technique encourages the strong Student model
to emulate the behavior of the weaker Teacher, which processes simple and
informative Object-Complete frames, effectively offering a comprehensive view
of objects as if seen through X-ray vision. Our proposed methods surpass
the state of the art in semi-supervised learning by 1-1.5 mAP and enhance the
performance of five established supervised models by 1-2 mAP on standard
autonomous driving datasets, even with default hyperparameters. Code for
Object-Complete frames is available here:
https://github.com/sakharok13/X-Ray-Teacher-Patching-Tools. |
---|---|
DOI: | 10.48550/arxiv.2404.00679 |
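
The summary describes two steps: merging an object's LiDAR points from previous and subsequent frames into an Object-Complete frame, and training a Student to mimic a Teacher that sees those frames. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each object is tracked across frames with a known box `(cx, cy, cz, yaw)`, and it uses a plain mean-squared-error feature loss as a stand-in, since the summary does not specify the distillation objective. All function names are hypothetical.

```python
import math

def to_object_frame(points, box):
    """Map world-frame points into the box's local frame (center at origin, yaw removed)."""
    cx, cy, cz, yaw = box
    c, s = math.cos(-yaw), math.sin(-yaw)
    local = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        local.append((c * dx - s * dy, s * dx + c * dy, z - cz))
    return local

def to_world_frame(points, box):
    """Map box-local points back into the world frame of the given box."""
    cx, cy, cz, yaw = box
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + cx, s * x + c * y + cy, z + cz) for x, y, z in points]

def object_complete_points(observations, target_index):
    """Merge one object's points from all frames and place them in the target frame.

    observations: list of (points, box) pairs, one per frame, for a single
    tracked object (assumed input format).
    """
    merged = []
    for points, box in observations:
        merged.extend(to_object_frame(points, box))
    target_box = observations[target_index][1]
    return to_world_frame(merged, target_box)

def distillation_loss(student_features, teacher_features):
    """Mean squared error between Student and Teacher features (assumed loss)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_features, teacher_features)]
    return sum(diffs) / len(diffs)

# Usage: the object is seen from the front in frame 0 and from the rear in frame 1.
observations = [
    ([(1.0, 0.0, 0.0)], (0.0, 0.0, 0.0, 0.0)),
    ([(10.0, -1.0, 0.0)], (10.0, 0.0, 0.0, math.pi / 2)),
]
complete = object_complete_points(observations, target_index=0)
loss = distillation_loss([1.0, 2.0], [1.0, 4.0])
```

In this sketch the Teacher would be fed `complete` during training, while the Student sees only the single-frame points, so no future frames are needed at inference time.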