3D target detection using dual domain attention and SIFT operator in indoor scenes

Bibliographic details
Published in: The Visual computer 2022-11, Vol.38 (11), p.3765-3774
Main authors: Zhao, Hanshuo; Yang, Dedong; Yu, Jiankang
Format: Article
Language: eng
Online access: Full text
container_end_page 3774
container_issue 11
container_start_page 3765
container_title The Visual computer
container_volume 38
creator Zhao, Hanshuo
Yang, Dedong
Yu, Jiankang
description In many real-world scenes and practical applications, 3D object detection plays an increasingly important role. To complete the 3D object detection task, we need to estimate the position and orientation of 3D objects in the real scene. In this paper, we propose a new network architecture based on VoteNet to detect targets in 3D point clouds. On the one hand, we use a channel and spatial dual-domain attention module to enhance the features of the object to be detected while suppressing irrelevant features. On the other hand, the SIFT operator is scale invariant and resistant to occlusion and background interference. The PointSIFT module we use can capture information from different directions of the point cloud in space and is robust to shapes of different proportions, so it can better detect partially occluded objects. Our method is evaluated on the SUN-RGBD and ScanNet datasets of indoor scenes. The experimental results show that our method performs better than VoteNet.
doi_str_mv 10.1007/s00371-021-02217-z
format Article
publisher Berlin/Heidelberg: Springer Berlin Heidelberg
rights The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021
orcidid https://orcid.org/0000-0001-7950-6810
toplevel peer_reviewed; online_resources
fulltext fulltext
identifier ISSN: 0178-2789
ispartof The Visual computer, 2022-11, Vol.38 (11), p.3765-3774
issn 0178-2789
1432-2315
language eng
recordid cdi_proquest_journals_2918029450
source Springer Nature - Complete Springer Journals; ProQuest Central UK/Ireland; ProQuest Central
subjects Artificial Intelligence
Computer Graphics
Computer Science
Datasets
Deep learning
Image Processing and Computer Vision
Modules
Neural networks
Object recognition
Occlusion
Original Article
Scale invariance
Target detection
Three dimensional models
title 3D target detection using dual domain attention and SIFT operator in indoor scenes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T18%3A17%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=3D%20target%20detection%20using%20dual%20domain%20attention%20and%20SIFT%20operator%20in%20indoor%20scenes&rft.jtitle=The%20Visual%20computer&rft.au=Zhao,%20Hanshuo&rft.date=2022-11-01&rft.volume=38&rft.issue=11&rft.spage=3765&rft.epage=3774&rft.pages=3765-3774&rft.issn=0178-2789&rft.eissn=1432-2315&rft_id=info:doi/10.1007/s00371-021-02217-z&rft_dat=%3Cproquest_cross%3E2918029450%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2918029450&rft_id=info:pmid/&rfr_iscdi=true