3D model retrieval based on multi-view attentional convolutional neural network

We propose a discriminative Multi-View Attentional Convolutional Neural Network, dubbed MVA-CNN, which takes multiple views of a shape as input and outputs the object category. Unlike previous view-based approaches that simply "compile" the view features into a compact 3D descriptor, our met...


Bibliographic details
Published in:	Multimedia tools and applications 2020-02, Vol.79 (7-8), p.4699-4711
Main authors:	Liu, An-An; Zhou, He-Yu; Li, Meng-Jie; Nie, Wei-Zhi
Format:	Article
Language:	eng
Subjects:	Artificial neural networks; Back propagation; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Multimedia Information Systems; Neural networks; Spatial data; Special Purpose and Application-Based Systems; Three dimensional models; Virtual cameras
Online access:	Full text

container_end_page	4711
container_issue	7-8
container_start_page	4699
container_title	Multimedia tools and applications
container_volume	79
creator	Liu, An-An; Zhou, He-Yu; Li, Meng-Jie; Nie, Wei-Zhi
description	We propose a discriminative Multi-View Attentional Convolutional Neural Network, dubbed MVA-CNN, which takes multiple views of a shape as input and outputs the object category. Unlike previous view-based approaches that simply "compile" the view features into a compact 3D descriptor, our method can discover the context among multiple views in both the visual and spatial domains. First, we render multiple images of a 3D object with virtual cameras and use a Convolutional Neural Network (CNN) to abstract the information in the views. Second, we aggregate the views in two steps: (1) an element-wise maximum operation across the view features discovers discriminative features; (2) a soft attention mechanism dynamically adjusts the shape descriptor to better represent the spatial information. The entire network can be trained end-to-end with standard backpropagation. We verify the effectiveness of MVA-CNN on two widely used datasets, ModelNet10 and ModelNet40, by comparing our method with state-of-the-art methods.
doi_str_mv	10.1007/s11042-019-7521-8
format	Article
addtitle	Multimed Tools Appl
date	2020-02-01
eissn	1573-7721
publisher	New York: Springer US
rights	Springer Science+Business Media, LLC, part of Springer Nature 2019. Multimedia Tools and Applications is a copyright of Springer, (2019). All Rights Reserved.
toplevel	peer_reviewed online_resources
tpages	13
fulltext	fulltext
identifier	ISSN: 1380-7501
ispartof	Multimedia tools and applications, 2020-02, Vol.79 (7-8), p.4699-4711
issn	1380-7501 1573-7721
language	eng
recordid	cdi_proquest_journals_2199610781
source	SpringerLink Journals - AutoHoldings
subjects	Artificial neural networks; Back propagation; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Multimedia Information Systems; Neural networks; Spatial data; Special Purpose and Application-Based Systems; Three dimensional models; Virtual cameras
title	3D model retrieval based on multi-view attentional convolutional neural network
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T04%3A06%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=3D%20model%20retrieval%20based%20on%20multi-view%20attentional%20convolutional%20neural%20network&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Liu,%20An-An&rft.date=2020-02-01&rft.volume=79&rft.issue=7-8&rft.spage=4699&rft.epage=4711&rft.pages=4699-4711&rft.issn=1380-7501&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-019-7521-8&rft_dat=%3Cproquest_cross%3E2199610781%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2199610781&rft_id=info:pmid/&rfr_iscdi=true
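
The two aggregation steps named in the description (an element-wise maximum across view features, then a soft attention mechanism) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the attention formulation, so the dot-product scoring against the max-pooled descriptor, the function names, and the toy feature values are all assumptions made for illustration only.

```python
import math
from typing import List

Vector = List[float]


def max_pool_views(view_features: List[Vector]) -> Vector:
    """Element-wise maximum across per-view feature vectors."""
    if not view_features:
        raise ValueError("need at least one view")
    return [max(column) for column in zip(*view_features)]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(view_features: List[Vector], query: Vector) -> Vector:
    """Assumed scoring: dot product of each view with the query vector."""
    scores = [sum(f * q for f, q in zip(view, query)) for view in view_features]
    return softmax(scores)


def attention_pool_views(view_features: List[Vector], query: Vector) -> Vector:
    """Attention-weighted sum of the view features."""
    weights = attention_weights(view_features, query)
    dim = len(view_features[0])
    return [
        sum(w * view[i] for w, view in zip(weights, view_features))
        for i in range(dim)
    ]


# Toy example: three views, two feature dimensions (hypothetical values).
views = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
pooled = max_pool_views(views)                  # step 1: element-wise max
attended = attention_pool_views(views, pooled)  # step 2: soft attention (assumed form)
```

In this sketch the max-pooled descriptor doubles as the attention query, so views that agree with the strongest features receive larger weights. A trained network would learn its scoring function instead of using a fixed dot product.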