ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling
The preprocessing of point cloud data has always been an important problem in 3D object detection. Due to the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points...
Saved in:

| Published in: | Machine vision and applications 2024-05, Vol.35 (3), p.56, Article 56 |
|---|---|
| Main authors: | Liu, Yiyi; Yang, Zhengyi; Tong, JianLin; Yang, Jiajia; Peng, Jiongcheng; Zhang, Lihang; Cheng, Wangxin |
| Format: | Article |
| Language: | English |
| Subjects: | |
| Online access: | Full text |
container_end_page | |
container_issue | 3 |
container_start_page | 56 |
container_title | Machine vision and applications |
container_volume | 35 |
creator | Liu, Yiyi; Yang, Zhengyi; Tong, JianLin; Yang, Jiajia; Peng, Jiongcheng; Zhang, Lihang; Cheng, Wangxin |
description | The preprocessing of point cloud data has always been an important problem in 3D object detection. Because of the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points from voxels, which often fails to represent local spatial features well due to noise. To preserve local features, this paper proposes an optimized voxel downsampling (OVD) method based on evidence theory. The method uses fuzzy sets to model basic probability assignments (BPAs) for each candidate point, incorporating point location information, and then employs evidence theory to fuse the BPAs and determine the selected sampling points. In the PointPillars 3D object detection algorithm, the point cloud is partitioned into pillars and encoded using each pillar's points; convolutional neural networks are then used for feature extraction and detection. A second contribution is ET-PointPillars, an improved PointPillars based on evidence theory that introduces an OVD-based feature-point sampling module into the PointPillars pillar feature network. The module selects feature points in pillars using the optimized method, computes offsets to these points, and adds them as features, helping the network learn more object characteristics than traditional PointPillars. Experiments on the KITTI dataset validate the method's ability to preserve local spatial features; results show improved detection precision, with a 2.73% average increase for pedestrians and cyclists. |
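The evidence-theoretic selection described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the fuzzy membership function, the mass values, and the prior are all assumptions chosen for demonstration, and the function names (`fuzzy_bpa`, `dempster`, `select_voxel_point`) are illustrative.

```python
import numpy as np

def fuzzy_bpa(point, centroid, radius):
    """Illustrative BPA from a fuzzy membership based on distance to the voxel
    centroid, over the frame {representative, not representative} plus ignorance."""
    d = np.linalg.norm(point - centroid)
    mu = max(0.0, 1.0 - d / radius)  # fuzzy membership: closer -> more representative
    return {"rep": 0.8 * mu, "not": 0.8 * (1.0 - mu), "unk": 0.2}

def dempster(m1, m2):
    """Dempster's rule of combination over {rep, not} with ignorance mass 'unk'."""
    k = m1["rep"] * m2["not"] + m1["not"] * m2["rep"]  # conflicting mass
    if k >= 1.0:
        raise ValueError("total conflict")
    rep = (m1["rep"] * m2["rep"] + m1["rep"] * m2["unk"] + m1["unk"] * m2["rep"]) / (1 - k)
    not_ = (m1["not"] * m2["not"] + m1["not"] * m2["unk"] + m1["unk"] * m2["not"]) / (1 - k)
    unk = m1["unk"] * m2["unk"] / (1 - k)
    return {"rep": rep, "not": not_, "unk": unk}

def select_voxel_point(points):
    """Pick a voxel's sampling point: fuse each candidate's BPA with a fixed
    prior (an assumed stand-in for other evidence sources) and keep the
    candidate with the highest fused belief in 'representative'."""
    centroid = points.mean(axis=0)
    radius = np.max(np.linalg.norm(points - centroid, axis=1)) + 1e-9
    prior = {"rep": 0.5, "not": 0.3, "unk": 0.2}
    best_i, best_rep = 0, -1.0
    for i, p in enumerate(points):
        fused = dempster(fuzzy_bpa(p, centroid, radius), prior)
        if fused["rep"] > best_rep:
            best_rep, best_i = fused["rep"], i
    return points[best_i]
```

Under these assumptions the fused belief is monotone in the fuzzy membership, so the selected point is the one nearest the voxel centroid rather than a random pick, which is the behavior the abstract contrasts against common voxelization.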
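The pillar-decoration step can likewise be sketched. The abstract says ET-PointPillars computes offsets to OVD-selected feature points and appends them as extra input channels; the channel layout and helper name below are assumptions for illustration, not the paper's exact encoding.

```python
import numpy as np

def decorate_pillar(points, feature_point):
    """Augment each pillar point (x, y, z, reflectance) with offsets to the
    pillar mean (as in standard PointPillars) and, additionally, offsets to an
    OVD-selected feature point (the extra channels this paper proposes)."""
    mean = points[:, :3].mean(axis=0)
    offset_mean = points[:, :3] - mean            # standard decoration channels
    offset_feat = points[:, :3] - feature_point   # assumed extra ET-PointPillars channels
    return np.concatenate([points, offset_mean, offset_feat], axis=1)
```

Each 4-channel input point becomes a 10-channel feature vector, which the pillar feature network then encodes as usual.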
doi_str_mv | 10.1007/s00138-024-01538-y |
format | Article |
fullrecord | (raw ProQuest/CrossRef machine record omitted; its fields duplicate the metadata listed here) |
publisher | Berlin/Heidelberg: Springer Berlin Heidelberg |
rights | The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024 |
fulltext | fulltext |
identifier | ISSN: 0932-8092 |
ispartof | Machine vision and applications, 2024-05, Vol.35 (3), p.56, Article 56 |
issn | 0932-8092 1432-1769 |
language | eng |
recordid | cdi_proquest_journals_3042701456 |
source | SpringerLink Journals (MCLS) |
subjects | Algorithms; Artificial neural networks; Communications Engineering; Computer Science; Feature extraction; Fuzzy sets; Image Processing and Computer Vision; Networks; Object recognition; Pattern Recognition; Pedestrians; Sampling; Three dimensional models |
title | ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T05%3A07%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ET-PointPillars:%20improved%20PointPillars%20for%203D%20object%20detection%20based%20on%20optimized%20voxel%20downsampling&rft.jtitle=Machine%20vision%20and%20applications&rft.au=Liu,%20Yiyi&rft.date=2024-05-01&rft.volume=35&rft.issue=3&rft.spage=56&rft.pages=56-&rft.artnum=56&rft.issn=0932-8092&rft.eissn=1432-1769&rft_id=info:doi/10.1007/s00138-024-01538-y&rft_dat=%3Cproquest_cross%3E3042701456%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3042701456&rft_id=info:pmid/&rfr_iscdi=true |