Deep Non-Rigid Structure From Motion With Missing Data
Non-rigid structure from motion (NRSfM) refers to the problem of reconstructing cameras and the 3D point cloud of a non-rigid object from an ensemble of images with 2D correspondences. Current NRSfM algorithms are limited from two perspectives: (i) the number of images, and (ii) the type of shape va...
Saved in:
Published in: | IEEE transactions on pattern analysis and machine intelligence 2021-12, Vol.43 (12), p.4365-4377 |
---|---|
Main Authors: | Kong, Chen ; Lucey, Simon |
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
container_end_page | 4377 |
---|---|
container_issue | 12 |
container_start_page | 4365 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 43 |
creator | Kong, Chen Lucey, Simon |
description | Non-rigid structure from motion (NRSfM) refers to the problem of reconstructing cameras and the 3D point cloud of a non-rigid object from an ensemble of images with 2D correspondences. Current NRSfM algorithms are limited from two perspectives: (i) the number of images, and (ii) the type of shape variability they can handle. These difficulties stem from the inherent conflict between the condition of the system and the degrees of freedom needing to be modeled - which has hampered its practical utility for many applications within vision. In this paper we propose a novel hierarchical sparse coding model for NRSFM which can overcome (i) and (ii) to such an extent, that NRSFM can be applied to problems in vision previously thought too ill posed. Our approach is realized in practice as the training of an unsupervised deep neural network (DNN) auto-encoder with a unique architecture that is able to disentangle pose from 3D structure. Using modern deep learning computational platforms allows us to solve NRSfM problems at an unprecedented scale and shape complexity. Our approach has no 3D supervision, relying solely on 2D point correspondences. Further, our approach is also able to handle missing/occluded 2D points without the need for matrix completion. Extensive experiments demonstrate the impressive performance of our approach where we exhibit superior precision and robustness against all available state-of-the-art works in some instances by an order of magnitude. We further propose a new quality measure (based on the network weights) which circumvents the need for 3D ground-truth to ascertain the confidence we have in the reconstructability. We believe our work to be a significant advance over state-of-the-art in NRSFM. |
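The description above centers on the classic NRSfM factorization: per-frame 2D observations are orthographic projections of a non-rigid 3D shape expressed over a small set of basis shapes. As a toy illustration of that forward model and its inverse (not the paper's hierarchical sparse-coding DNN; the basis-shape setup, sizes, and all variable names here are illustrative assumptions), the sketch below synthesizes 2D tracks from known bases and cameras, then recovers the per-frame shape coefficients by linear least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem sizes (assumed, not from the paper):
# F frames, P tracked points, K basis shapes.
F, P, K = 12, 20, 3
bases = rng.standard_normal((K, 3, P))   # K basis shapes, each 3 x P
coeffs = rng.standard_normal((F, K))     # per-frame shape coefficients


def random_rotation(rng):
    # QR of a Gaussian matrix, with a sign fix, gives a rotation matrix.
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q


# Forward model: S_f = sum_k coeffs[f, k] * bases[k]; W_f = (R_f S_f)[:2]
# (orthographic projection keeps the first two camera rows).
W = np.empty((F, 2, P))
Rs = []
for f in range(F):
    S_f = np.tensordot(coeffs[f], bases, axes=1)  # 3 x P shape at frame f
    R_f = random_rotation(rng)
    Rs.append(R_f)
    W[f] = (R_f @ S_f)[:2]

# Inverse direction: with cameras and bases fixed, each frame's
# coefficients solve an overdetermined linear system (2P eqs, K unknowns).
recovered = np.empty_like(coeffs)
for f in range(F):
    A = np.stack([(Rs[f] @ bases[k])[:2].ravel() for k in range(K)], axis=1)
    recovered[f] = np.linalg.lstsq(A, W[f].ravel(), rcond=None)[0]

err = np.abs(recovered - coeffs).max()
print("max coefficient error:", err)
```

The paper's contribution can be read against this sketch: instead of a single fixed linear basis, it learns a hierarchy of sparse dictionaries inside an auto-encoder, which disentangles the rotation from the shape code and tolerates missing 2D entries without matrix completion.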
doi_str_mv | 10.1109/TPAMI.2020.2997026 |
format | Article |
fullrecord | (raw machine-readable source record; fields not listed elsewhere are retained below) |
publisher | New York: IEEE |
pmid | 32750772 |
coden | ITPIDJ |
ieee_id | 9099404 |
pqid | 2592630429 |
orcidid | https://orcid.org/0000-0002-5095-7930 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2021-12, Vol.43 (12), p.4365-4377 |
issn | 0162-8828 1939-3539 2160-9292 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TPAMI_2020_2997026 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms Artificial neural networks Coders deep neural network Encoding hierarchical sparse coding Image reconstruction Machine learning Missing data Neural networks Nonrigid structure from motion reconstructability Rigid structures Shape Structure from motion Three dimensional models Three-dimensional displays Two dimensional displays Vision |
title | Deep Non-Rigid Structure From Motion With Missing Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T13%3A26%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Non-Rigid%20Structure%20From%20Motion%20With%20Missing%20Data&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Kong,%20Chen&rft.date=2021-12-01&rft.volume=43&rft.issue=12&rft.spage=4365&rft.epage=4377&rft.pages=4365-4377&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2020.2997026&rft_dat=%3Cproquest_RIE%3E2592630429%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2592630429&rft_id=info:pmid/32750772&rft_ieee_id=9099404&rfr_iscdi=true |