SimLOG: Simultaneous Local-Global Feature Learning for 3D Object Detection in Indoor Point Clouds

The acquisition of both local and global features from irregular point clouds is crucial for 3D object detection (3DOD). Current mainstream 3D detectors neglect significant local features during pooling operations or disregard many global features of the overall scene context. This paper proposes ne...

Bibliographic details
Published in:	IEEE transactions on intelligent transportation systems 2024-12, Vol.25 (12), p.19482-19495
Main authors:	Wei, Mingqiang; Chen, Baian; Nan, Liangliang; Xie, Haoran; Gu, Lipeng; Lu, Dening; Lee Wang, Fu; Li, Qing
Format:	Article
Language:	eng
Subjects:	3D object detection; Aggregates; dynamic points interaction; Feature extraction; global context aggregation; Object detection; Point cloud compression; Representation learning; SimLOG; Three-dimensional displays; Transformers

container_end_page	19495
container_issue	12
container_start_page	19482
container_title	IEEE transactions on intelligent transportation systems
container_volume	25
creator	Wei, Mingqiang; Chen, Baian; Nan, Liangliang; Xie, Haoran; Gu, Lipeng; Lu, Dening; Lee Wang, Fu; Li, Qing
description	The acquisition of both local and global features from irregular point clouds is crucial for 3D object detection (3DOD). Current mainstream 3D detectors neglect significant local features during pooling operations or disregard many global features of the overall scene context. This paper proposes new techniques for simultaneously learning local-global features of scene point clouds to enhance 3DOD. Specifically, we propose an efficient 3DOD network for indoor point clouds, named SimLOG, which utilizes simultaneous local-global feature learning. SimLOG has two main contributions: a Dynamic Points Interaction (DPI) module to recover local features lost during pooling, and a Global Context Aggregation (GCA) module to aggregate multi-scale features from various layers of the encoder to improve scene context awareness. Unlike traditional local-global feature learning methods, our DPI and GCA modules are integrated into a single feature learning module, making it easily detachable and able to be incorporated into existing 3DOD networks to enhance their performance. SimLOG demonstrates superior performance over twenty competitors in terms of detection accuracy and robustness on both the SUN RGB-D and ScanNet V2 datasets. Specifically, SimLOG boosts the baseline VoteNet by 8.1% in mAP@0.25 on ScanNet V2 and by 3.9% in mAP@0.25 on SUN RGB-D. Code is publicly available at https://github.com/chenbaian-cs/SimLOG.
doi_str_mv	10.1109/TITS.2024.3449319
format	Article
fulltext	fulltext_linktorsrc
identifier	ISSN: 1524-9050
ispartof	IEEE transactions on intelligent transportation systems, 2024-12, Vol.25 (12), p.19482-19495
issn	1524-9050 1558-0016
language	eng
recordid	cdi_ieee_primary_10666919
source	IEEE Electronic Library (IEL)
subjects	3D object detection; Aggregates; dynamic points interaction; Feature extraction; global context aggregation; Object detection; Point cloud compression; Representation learning; SimLOG; Three-dimensional displays; Transformers
title	SimLOG: Simultaneous Local-Global Feature Learning for 3D Object Detection in Indoor Point Clouds
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T06%3A54%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SimLOG:%20Simultaneous%20Local-Global%20Feature%20Learning%20for%203D%20Object%20Detection%20in%20Indoor%20Point%20Clouds&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Wei,%20Mingqiang&rft.date=2024-12&rft.volume=25&rft.issue=12&rft.spage=19482&rft.epage=19495&rft.pages=19482-19495&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2024.3449319&rft_dat=%3Ccrossref_RIE%3E10_1109_TITS_2024_3449319%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10666919&rfr_iscdi=true
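
The description field above reports gains in mAP@0.25, that is, mean average precision where a predicted box counts as a correct detection when its 3D intersection-over-union (IoU) with a ground-truth box is at least 0.25. The sketch below illustrates only that matching criterion. It is not taken from the SimLOG code base; the function names are hypothetical, and it assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax), which is a simplification (benchmark evaluation may use oriented boxes).

```python
from typing import Tuple

# Hypothetical box layout for this sketch: (xmin, ymin, zmin, xmax, ymax, zmax).
Box = Tuple[float, float, float, float, float, float]


def volume(box: Box) -> float:
    """Volume of an axis-aligned box; degenerate boxes have volume 0."""
    return (
        max(0.0, box[3] - box[0])
        * max(0.0, box[4] - box[1])
        * max(0.0, box[5] - box[2])
    )


def iou_3d(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for axis in range(3):
        low = max(a[axis], b[axis])
        high = min(a[axis + 3], b[axis + 3])
        if high <= low:
            return 0.0  # no overlap along this axis, so no overlap at all
        intersection *= high - low
    union = volume(a) + volume(b) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.25) -> bool:
    """True when the prediction overlaps the ground truth enough to count as a hit."""
    return iou_3d(pred, gt) >= threshold


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    shifted_by_1 = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
    shifted_by_1_5 = (1.5, 0.0, 0.0, 3.5, 2.0, 2.0)
    print(iou_3d(ground_truth, shifted_by_1), is_match(shifted_by_1, ground_truth))
    print(iou_3d(ground_truth, shifted_by_1_5), is_match(shifted_by_1_5, ground_truth))
```

In this example, a 2 m cube shifted by 1 m along one axis overlaps the ground truth with IoU 1/3 and counts as a match, while a 1.5 m shift gives IoU 1/7 and does not. A full mAP computation would additionally rank predictions by confidence and average precision over classes, which this sketch leaves out.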