M-GCN: Multi-Branch Graph Convolution Network for 2D Image-based on 3D Model Retrieval
2D image-based 3D model retrieval is a challenging research topic in the field of 3D model retrieval. The huge gap between the two modalities, 2D images and 3D models, severely constrains retrieval performance. To handle this problem, we propose a novel multi-branch graph convolution network (M-GCN) to address the 2D image-based 3D model retrieval problem. First, we compute the similarity between 2D images and 3D models based on visual information to construct a cross-modality graph, which provides the original relationships between images and 3D models. However, these relationships are not accurate because of the difference between the modalities. Thus, a multi-head attention mechanism is employed to generate a set of fully connected edge-weighted graphs, which can predict the hidden relationships between 2D images and 3D models and further strengthen the correlations used to generate the node embeddings. Finally, we apply a max-pooling operation to fuse the information from the multiple graphs and generate the fused node embeddings for retrieval. To validate the performance of our method, we evaluated M-GCN on the MI3DOR dataset and on the SHREC 2018 and SHREC 2014 tracks. The experimental results demonstrate the superiority of our proposed method over state-of-the-art methods.
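The three stages the abstract describes (a visual-similarity cross-modality graph, per-head edge-weighted graphs from multi-head attention, and max-pooling fusion of node embeddings) can be sketched as a minimal numpy toy. All shapes, the random projection matrices, and the single propagation step are illustrative assumptions, not the authors' implementation, which uses learned GCN layers.

```python
# Minimal sketch of the M-GCN pipeline stages named in the abstract:
# (1) cosine-similarity cross-modality graph, (2) one softmax-normalized
# edge-weight graph per attention head, (3) max-pooling fusion across heads.
import numpy as np

def cosine_similarity_graph(img_feats, model_feats):
    """Adjacency from pairwise cosine similarity between 2D-image and
    3D-model feature vectors, stacked into one node set."""
    nodes = np.vstack([img_feats, model_feats])          # (N, d)
    norms = np.linalg.norm(nodes, axis=1, keepdims=True)
    unit = nodes / np.clip(norms, 1e-12, None)
    return unit @ unit.T                                  # (N, N)

def multi_head_edge_weights(nodes, num_heads, rng):
    """One fully connected, row-stochastic edge-weight graph per head.
    Random query/key projections stand in for learned ones."""
    n, d = nodes.shape
    graphs = []
    for _ in range(num_heads):
        wq = rng.standard_normal((d, d)) / np.sqrt(d)
        wk = rng.standard_normal((d, d)) / np.sqrt(d)
        scores = (nodes @ wq) @ (nodes @ wk).T / np.sqrt(d)
        scores -= scores.max(axis=1, keepdims=True)       # stable softmax
        e = np.exp(scores)
        graphs.append(e / e.sum(axis=1, keepdims=True))
    return graphs                                         # num_heads x (N, N)

def fuse_embeddings(nodes, graphs):
    """Propagate features over each graph, then max-pool across heads."""
    per_head = np.stack([g @ nodes for g in graphs])      # (H, N, d)
    return per_head.max(axis=0)                           # (N, d)

rng = np.random.default_rng(0)
imgs = rng.standard_normal((4, 16))    # 4 image nodes, 16-dim features
models = rng.standard_normal((6, 16))  # 6 model nodes
adj = cosine_similarity_graph(imgs, models)
nodes = np.vstack([imgs, models])
fused = fuse_embeddings(nodes, multi_head_edge_weights(nodes, 4, rng))
```

The fused embeddings would then be compared (e.g. by cosine distance) to rank 3D models against an image query; the ranking metric itself is outside this sketch.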
Saved in:
Published in: | IEEE Transactions on Multimedia, 2021, Vol. 23, pp. 1962-1976 |
---|---|
Main Authors: | Nie, Wei-Zhi; Ren, Min-Jie; Liu, An-An; Mao, Zhendong; Nie, Jie |
Format: | Article |
Language: | English |
Subjects: | 3D model retrieval; cross-domain retrieval; multi-head attention; multiple graphs |
Online Access: | Order full text |
container_end_page | 1976 |
---|---|
container_issue | |
container_start_page | 1962 |
container_title | IEEE transactions on multimedia |
container_volume | 23 |
creator | Nie, Wei-Zhi; Ren, Min-Jie; Liu, An-An; Mao, Zhendong; Nie, Jie |
description | 2D image-based 3D model retrieval is a challenging research topic in the field of 3D model retrieval. The huge gap between the two modalities, 2D images and 3D models, severely constrains retrieval performance. To handle this problem, we propose a novel multi-branch graph convolution network (M-GCN) to address the 2D image-based 3D model retrieval problem. First, we compute the similarity between 2D images and 3D models based on visual information to construct a cross-modality graph, which provides the original relationships between images and 3D models. However, these relationships are not accurate because of the difference between the modalities. Thus, a multi-head attention mechanism is employed to generate a set of fully connected edge-weighted graphs, which can predict the hidden relationships between 2D images and 3D models and further strengthen the correlations used to generate the node embeddings. Finally, we apply a max-pooling operation to fuse the information from the multiple graphs and generate the fused node embeddings for retrieval. To validate the performance of our method, we evaluated M-GCN on the MI3DOR dataset and on the SHREC 2018 and SHREC 2014 tracks. The experimental results demonstrate the superiority of our proposed method over state-of-the-art methods. |
doi_str_mv | 10.1109/TMM.2020.3006371 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-9210 |
ispartof | IEEE transactions on multimedia, 2021, Vol.23, p.1962-1976 |
issn | 1520-9210; 1941-0077 |
language | eng |
recordid | cdi_proquest_journals_2544953000 |
source | IEEE |
subjects | 3D model retrieval; Computational modeling; Convolution; Cross-domain retrieval; Feature extraction; Graphs; multi-head attention; multiple graphs; Nodes; Predictive models; Retrieval; Solid modeling; Three dimensional models; Three-dimensional displays; Two dimensional displays; Two dimensional models; Visualization |
title | M-GCN: Multi-Branch Graph Convolution Network for 2D Image-based on 3D Model Retrieval |