ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling

The preprocessing of point cloud data has always been an important problem in 3D object detection. Due to the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Machine vision and applications 2024-05, Vol.35 (3), p.56, Article 56
Hauptverfasser: Liu, Yiyi, Yang, Zhengyi, Tong, JianLin, Yang, Jiajia, Peng, Jiongcheng, Zhang, Lihang, Cheng, Wangxin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The preprocessing of point cloud data has always been an important problem in 3D object detection. Due to the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points from voxels, which often fails to represent local spatial features well due to noise. To preserve local features, this paper proposes an optimized voxel downsampling(OVD) method based on evidence theory. This method uses fuzzy sets to model basic probability assignments (BPAs) for each candidate point, incorporating point location information. It then employs evidence theory to fuse the BPAs and determine the selected sampling points. In the PointPillars 3D object detection algorithm, the point cloud is partitioned into pillars and encoded using each pillar’s points. Convolutional neural networks are used for feature extraction and detection. Another contribution is the proposed improved PointPillars based on evidence theory (ET-PointPillars) by introducing an OVD-based feature point sampling module in the PointPillars’ pillar feature network, which can select feature points in pillars using the optimized method, computes offsets to these points, and adds them as features to facilitate learning more object characteristics, improving traditional PointPillars. Experiments on the KITTI datasets validate the method’s ability to preserve local spatial features. Results showed improved detection precision, with a 2.73 % average increase for pedestrians and cyclists on KITTI.
ISSN:0932-8092
1432-1769
DOI:10.1007/s00138-024-01538-y