SGF3D: Similarity-guided fusion network for 3D object detection

Bibliographic details
Published in: Image and Vision Computing, 2024-02, Vol. 142, p. 104895, Article 104895
Authors: Li, Chunzheng; Wang, Gaihua; Long, Qian; Zhou, Zhengshu
Format: Article
Language: English
Description

Abstract: The pseudo point cloud representation can significantly improve the precision of 3D object detection. However, existing pseudo point cloud-based methods typically fuse the processed features through coarse concatenation, which ignores the consistency between point cloud and pseudo point cloud features. This inconsistency between features from different modalities can lead to detection bias. In this paper, we propose a novel pseudo point cloud-based network called SGF3D, which uses a cross-modal attention module, cross-modal attention fusion (CMAF), to fuse point cloud and pseudo point cloud features. It can better learn the cross-modal similarity of the output features, enabling the detection box to fit the target more closely. We also design a region of interest (RoI) head, the similarity attention head (SAH), which exploits this otherwise overlooked similarity to optimize training without increasing the complexity of the network. Using CMAF and SAH, the proposed method obtains more accurate bounding boxes. Extensive experiments on the KITTI dataset demonstrate that the proposed method achieves competitive results. Training code and trained weights are available at https://github.com/ChunZheng2022/SGF3D.

Highlights:
• Better bounding boxes and higher accuracy.
• Enhanced communication between features from different modalities.
• Optimized training with an auxiliary head without introducing more parameters.
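The abstract does not specify how CMAF is implemented; as a rough illustration of the general idea of cross-modal attention fusion, the sketch below lets LiDAR point features attend to pseudo point cloud features via scaled dot-product attention and fuses the result with a residual addition. All names, shapes, and the fusion rule are assumptions for illustration, not the authors' actual CMAF module:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(lidar_feats, pseudo_feats):
    """Hypothetical sketch: each LiDAR feature attends to all pseudo
    point cloud features; the attended result is fused back with a
    residual addition (an assumption, not the paper's CMAF design)."""
    d = lidar_feats.shape[-1]
    # Cross-modal similarity scores, scaled as in standard attention
    scores = lidar_feats @ pseudo_feats.T / np.sqrt(d)   # (N, M)
    weights = softmax(scores, axis=-1)                   # rows sum to 1
    attended = weights @ pseudo_feats                    # (N, d)
    return lidar_feats + attended                        # residual fusion

# Toy example: 5 LiDAR points and 8 pseudo points with 16-dim features
rng = np.random.default_rng(0)
fused = cross_modal_attention(rng.standard_normal((5, 16)),
                              rng.standard_normal((8, 16)))
```

In a real detector the two feature sets would come from separate LiDAR and pseudo point cloud backbones, and the attention would use learned query/key/value projections rather than raw features.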
ISSN: 0262-8856, 1872-8138
DOI: 10.1016/j.imavis.2023.104895