Deep Non-Rigid Structure From Motion With Missing Data

Non-rigid structure from motion (NRSfM) refers to the problem of reconstructing cameras and the 3D point cloud of a non-rigid object from an ensemble of images with 2D correspondences. Current NRSfM algorithms are limited from two perspectives: (i) the number of images, and (ii) the type of shape variability they can handle. These difficulties stem from the inherent conflict between the conditioning of the system and the degrees of freedom that need to be modeled, which has hampered NRSfM's practical utility for many applications within vision. In this paper we propose a novel hierarchical sparse coding model for NRSfM which can overcome (i) and (ii) to such an extent that NRSfM can be applied to problems in vision previously thought too ill-posed. Our approach is realized in practice as the training of an unsupervised deep neural network (DNN) auto-encoder with a unique architecture that is able to disentangle pose from 3D structure. Using modern deep learning computational platforms allows us to solve NRSfM problems at an unprecedented scale and shape complexity. Our approach has no 3D supervision, relying solely on 2D point correspondences. Further, our approach is able to handle missing/occluded 2D points without the need for matrix completion. Extensive experiments demonstrate the impressive performance of our approach, which exhibits superior precision and robustness against all available state-of-the-art works, in some instances by an order of magnitude. We further propose a new quality measure (based on the network weights) which circumvents the need for 3D ground truth to ascertain the confidence we have in the reconstructability. We believe our work to be a significant advance over the state of the art in NRSfM.
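The classical background the abstract builds on can be made concrete with a small sketch. Below is a toy version of the standard low-rank NRSfM measurement model (in the spirit of Bregler-style shape-basis factorization), not the paper's hierarchical sparse-coding network: each frame's 3D shape is a combination of a few basis shapes viewed by an orthographic camera, which bounds the rank of the stacked 2D measurement matrix. All names and dimensions here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy low-rank NRSfM setup (classical shape-basis model, NOT the paper's
# hierarchical sparse-coding auto-encoder): F frames, P tracked points,
# K basis shapes.
F, P, K = 12, 20, 3
basis = rng.standard_normal((K, 3, P))   # K basis shapes, each 3 x P

W_rows = []
for f in range(F):
    coeffs = rng.standard_normal(K)          # per-frame shape coefficients
    S_f = np.tensordot(coeffs, basis, 1)     # 3 x P non-rigid shape
    # Random orthographic camera: first two rows of a rotation matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    W_rows.append(Q[:2] @ S_f)               # 2 x P projected points
W = np.concatenate(W_rows, 0)                # 2F x P measurement matrix

# The low-rank model implies rank(W) <= 3K -- the constraint that
# classical factorization methods exploit.
print(np.linalg.matrix_rank(W))
```

Factorization approaches recover cameras and shapes from this rank constraint; the number of images and the expressiveness of the single linear basis are exactly the limitations (i) and (ii) the abstract refers to, which the paper's hierarchical model is designed to relax.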

Bibliographic Details

Published in: IEEE transactions on pattern analysis and machine intelligence, 2021-12, Vol. 43 (12), p. 4365-4377
Main authors: Kong, Chen; Lucey, Simon
Format: Article
Language: English
DOI: 10.1109/TPAMI.2020.2997026
Publisher: New York: IEEE
PMID: 32750772
ISSN: 0162-8828
EISSN: 1939-3539, 2160-9292
Source: IEEE Electronic Library (IEL)
Subjects:
Algorithms
Artificial neural networks
Coders
deep neural network
Encoding
hierarchical sparse coding
Image reconstruction
Machine learning
Missing data
Neural networks
Nonrigid structure from motion
reconstructability
Rigid structures
Shape
Structure from motion
Three dimensional models
Three-dimensional displays
Two dimensional displays
Vision