P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection From Point Clouds

The most recent 3D object detectors for point clouds rely on the coarse voxel-based representation rather than the accurate point-based representation due to a higher box recall in the voxel-based Region Proposal Network (RPN). However, the detection accuracy is severely restricted by the information loss of pose details in the voxels. Different from considering the point cloud as voxel or point representation only, we propose a point-to-voxel feature learning approach to voxelize the point cloud with both the point-wise semantic and local spatial features, which maintains the voxel-wise features to build the high-recall voxel-based RPN and also provides the accurate point-wise features for refining the detection results. Another difficulty in object detection for point cloud is that the visible part varies a lot against the full view of object because of the perspective issues in data acquisition. To address this, we propose an attentive corner aggregation module to attentively aggregate the features of local point cloud surrounding a 3D proposal from the perspectives of eight corners in the proposal 3D bounding box. The experimental results on the competitive KITTI 3D object detection benchmark show that the proposed method achieves state-of-the-art performance.
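As a rough geometric illustration of the corner-based aggregation the abstract describes (this is not the authors' implementation), the sketch below computes the eight corners of a yaw-rotated 3D bounding box and gathers the points near each corner. The helper names `box_corners_3d` and `points_near_corners`, the KITTI-style `(center, size, yaw)` parameterization, and the radius-based grouping are all illustrative assumptions.

```python
import numpy as np

def box_corners_3d(center, size, yaw):
    """Return the 8 corners (shape (8, 3)) of a 3D bounding box.

    center: (x, y, z) box center; size: (l, w, h) extents;
    yaw: rotation around the vertical (z) axis in radians.
    """
    l, w, h = size
    # Corner offsets in the box's local frame: bottom face first,
    # then top face, each traversed in the same corner order.
    x = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2.0
    y = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2.0
    z = np.array([-h, -h, -h, -h,  h,  h,  h,  h]) / 2.0
    corners = np.stack([x, y, z], axis=1)              # (8, 3)
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])                  # yaw about z
    return corners @ rot.T + np.asarray(center, dtype=float)

def points_near_corners(points, corners, radius):
    """For each corner, collect indices of points within `radius`,
    a crude stand-in for per-corner local feature grouping."""
    # Pairwise distances: (num_points, 8)
    d = np.linalg.norm(points[:, None, :] - corners[None, :, :], axis=-1)
    return [np.nonzero(d[:, k] <= radius)[0] for k in range(len(corners))]
```

In the paper's module, the per-corner neighborhoods would feed learned point-wise features into an attention-weighted aggregation; here the grouping step alone is shown to make the eight-perspective idea concrete.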

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE Access, 2021, Vol. 9, p. 98249-98260
Main Authors: Li, Jiale; Sun, Yu; Luo, Shujie; Zhu, Ziqi; Dai, Hang; Krylov, Andrey S.; Ding, Yong; Shao, Ling
Format: Artikel
Language: English
Subjects:
Online Access: Full text
DOI: 10.1109/ACCESS.2021.3094562
ISSN: 2169-3536
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek (freely accessible e-journals)
Subjects: 3D object detection
attention mechanism
autonomous driving
Cameras
Feature extraction
Learning
Object detection
Object recognition
point clouds
Proposals
Recall
Representations
Semantics
Three dimensional models
Three-dimensional displays