3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data

Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view...

Detailed description

Bibliographic details
Main authors:	Häni, Nicolai, Chao, Jun-Jee, Isler, Volkan
Format:	Article
Language:	eng
Subjects:	Computer Science - Computer Vision and Pattern Recognition
Online access:	Order full text

creator	Häni, Nicolai ; Chao, Jun-Jee ; Isler, Volkan
description	Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view observations. To achieve state-of-the-art results, these methods assume that the objects are specified with respect to a fixed canonical coordinate frame, where instances of the same category are perfectly aligned. In this work, we present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image. We show that one can leverage shape priors learned on purely synthetic 3D data together with a point cloud pose canonicalization method to achieve high-quality 3D reconstruction in the wild. Given a single depth image at test time, we first transform this partial point cloud into a learned canonical frame. Then, we use a neural deformation field to reconstruct the 3D surface of the object. Finally, we jointly optimize object pose and 3D shape to fit the partial depth observation. Our approach achieves state-of-the-art reconstruction performance across several real-world datasets, even when trained only on synthetic data. We further show that our method generalizes to different input modalities, from dense depth images to sparse and noisy LIDAR scans.
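The description's last step, jointly optimizing object pose and 3D shape to fit a partial depth observation, can be illustrated with a deliberately tiny sketch. This is not the authors' method: the neural deformation field is replaced here by a single shape parameter (a sphere radius), the pose is reduced to a translation, the learned canonicalization is approximated by starting from the point centroid, and the helper names `make_partial_sphere` and `fit_pose_and_shape` are hypothetical. It only shows how pose and shape parameters might be updated together by gradient descent on a one-sided point cloud.

```python
import math


def make_partial_sphere(center, radius, n_theta=8, n_phi=16):
    """Sample only the upper cap of a sphere, mimicking a one-sided depth view."""
    points = []
    for i in range(1, n_theta + 1):
        theta = 0.5 * math.pi * i / n_theta  # polar angle in (0, pi/2]
        for j in range(n_phi):
            phi = 2.0 * math.pi * j / n_phi
            points.append((
                center[0] + radius * math.sin(theta) * math.cos(phi),
                center[1] + radius * math.sin(theta) * math.sin(phi),
                center[2] + radius * math.cos(theta),
            ))
    return points


def fit_pose_and_shape(points, steps=1500, lr=0.3):
    """Jointly fit a translation (toy pose) and a radius (toy shape).

    Minimizes the mean squared residual (||p - c|| - r)^2 over all points
    with plain gradient descent on c and r at the same time.
    """
    n = len(points)
    # Assumption: the centroid serves as a crude stand-in for canonicalization.
    c = [sum(p[k] for p in points) / n for k in range(3)]
    r = sum(math.dist(p, c) for p in points) / n
    for _ in range(steps):
        grad_c = [0.0, 0.0, 0.0]
        grad_r = 0.0
        for p in points:
            d = math.dist(p, c)
            residual = d - r
            for k in range(3):
                # d(residual^2)/dc_k = -2 * residual * (p_k - c_k) / d
                grad_c[k] += -2.0 * residual * (p[k] - c[k]) / d
            # d(residual^2)/dr = -2 * residual
            grad_r += -2.0 * residual
        c = [c[k] - lr * grad_c[k] / n for k in range(3)]
        r -= lr * grad_r / n
    return c, r


if __name__ == "__main__":
    observed = make_partial_sphere(center=(0.2, -0.1, 2.0), radius=0.5)
    est_center, est_radius = fit_pose_and_shape(observed)
    print("estimated center:", est_center)
    print("estimated radius:", est_radius)
```

Because only one side of the object is observed, a shift of the center along the viewing axis and a change of radius partly compensate for each other. That coupling is one reason for updating pose and shape in the same loop rather than one after the other.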
doi_str_mv	10.48550/arxiv.2302.12883
format	Article
creationdate	2023-02-24
rights	http://creativecommons.org/licenses/by/4.0
oa	free_for_read
collection	arXiv Computer Science ; arXiv.org
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2302.12883
language	eng
recordid	cdi_arxiv_primary_2302_12883
source	arXiv.org
subjects	Computer Science - Computer Vision and Pattern Recognition
title	3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T15%3A25%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=3D%20Surface%20Reconstruction%20in%20the%20Wild%20by%20Deforming%20Shape%20Priors%20from%20Synthetic%20Data&rft.au=H%C3%A4ni,%20Nicolai&rft.date=2023-02-24&rft_id=info:doi/10.48550/arxiv.2302.12883&rft_dat=%3Carxiv_GOX%3E2302_12883%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true