PE-MCAT: Leveraging Image Sensor Fusion and Adaptive Thresholds for Semi-Supervised 3D Object Detection
Existing 3D object detection frameworks in sensor-based applications heavily rely on large-scale annotated data to achieve optimal performance. However, obtaining such annotations from sensor data-like LiDAR or image sensors-is both time-consuming and costly. Semi-supervised learning offers an effic...
Gespeichert in:
Veröffentlicht in: | Sensors (Basel, Switzerland) Switzerland), 2024-10, Vol.24 (21), p.6940 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Existing 3D object detection frameworks in sensor-based applications heavily rely on large-scale annotated data to achieve optimal performance. However, obtaining such annotations from sensor data-like LiDAR or image sensors-is both time-consuming and costly. Semi-supervised learning offers an efficient solution to this challenge and holds significant potential for sensor-driven artificial intelligence (AI) applications. While it reduces the need for labeled data, semi-supervised learning still depends on a small amount of labeled samples for training. In the initial stages, relying on such limited samples can adversely affect the effective training of student-teacher networks. In this paper, we propose PE-MCAT, a semi-supervised 3D object detection method that generates high-precision pseudo-labels. First, to address the challenges of insufficient local feature capture and poor robustness in point cloud data, we introduce a point enrichment module. This module incorporates information from image sensors and combines multiple feature fusion methods of local and self-features to directly enhance the quality of point clouds and pseudo-labels, compensating for the limitations posed by using only a few labeled samples. Second, we explore the relationship between the teacher network and the pseudo-labels it generates. We propose a multi-class adaptive threshold strategy to initially filter and create a high-quality pseudo-label set. Furthermore, a joint variable threshold strategy is introduced to refine this set further, enhancing the selection of superior pseudo-labels.Extensive experiments demonstrate that PE-MCAT consistently outperforms recent state-of-the-art methods across different datasets. Specifically, on the KITTI dataset and using only 2% of labeled samples, our method improved the mean Average Precision (mAP) by 0.7% for cars, 3.7% for pedestrians, and 3.0% for cyclists. |
---|---|
ISSN: | 1424-8220 1424-8220 |
DOI: | 10.3390/s24216940 |