3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data

Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view...

Bibliographic details
Main authors: Häni, Nicolai; Chao, Jun-Jee; Isler, Volkan
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Häni, Nicolai
Chao, Jun-Jee
Isler, Volkan
description Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view observations. To achieve state-of-the-art results, these methods assume that the objects are specified with respect to a fixed canonical coordinate frame, where instances of the same category are perfectly aligned. In this work, we present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image. We show that one can leverage shape priors learned on purely synthetic 3D data together with a point cloud pose canonicalization method to achieve high-quality 3D reconstruction in the wild. Given a single depth image at test time, we first transform this partial point cloud into a learned canonical frame. Then, we use a neural deformation field to reconstruct the 3D surface of the object. Finally, we jointly optimize object pose and 3D shape to fit the partial depth observation. Our approach achieves state-of-the-art reconstruction performance across several real-world datasets, even when trained only on synthetic data. We further show that our method generalizes to different input modalities, from dense depth images to sparse and noisy LIDAR scans.
doi_str_mv 10.48550/arxiv.2302.12883
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2302_12883</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2302_12883</sourcerecordid><originalsourceid>FETCH-LOGICAL-a673-da012895ba51c395093320fc668ca51f35cff6b8b730e633050564a8423329d33</originalsourceid><addsrcrecordid>eNotj81KxDAUhbNxIaMP4Mr7Aq1pbpNJlzL1DwYUO-Cy3KaJE5imQ5oR-_bW0dWBw8fhfIzdFDwvtZT8juK3_8oFcpEXQmu8ZA3W0JyiI2Ph3ZoxTCmeTPJjAB8g7S18-EMP3Qy1dWMcfPiEZk9HC2_Rj3ECF8cBmjksaPIGakp0xS4cHSZ7_Z8rtnt82G2es-3r08vmfpuRWmPWE18-VLIjWRisJK8QBXdGKW2WyqE0zqlOd2vkViFyyaUqSZdi4aoeccVu_2bPVu0x-oHi3P7atWc7_AF8LUjC</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data</title><source>arXiv.org</source><creator>Häni, Nicolai ; Chao, Jun-Jee ; Isler, Volkan</creator><creatorcontrib>Häni, Nicolai ; Chao, Jun-Jee ; Isler, Volkan</creatorcontrib><description>Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view observations. To achieve state-of-the-art results, these methods assume that the objects are specified with respect to a fixed canonical coordinate frame, where instances of the same category are perfectly aligned. In this work, we present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image. We show that one can leverage shape priors learned on purely synthetic 3D data together with a point cloud pose canonicalization method to achieve high-quality 3D reconstruction in the wild. 
Given a single depth image at test time, we first transform this partial point cloud into a learned canonical frame. Then, we use a neural deformation field to reconstruct the 3D surface of the object. Finally, we jointly optimize object pose and 3D shape to fit the partial depth observation. Our approach achieves state-of-the-art reconstruction performance across several real-world datasets, even when trained only on synthetic data. We further show that our method generalizes to different input modalities, from dense depth images to sparse and noisy LIDAR scans.</description><identifier>DOI: 10.48550/arxiv.2302.12883</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2023-02</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2302.12883$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2302.12883$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Häni, Nicolai</creatorcontrib><creatorcontrib>Chao, Jun-Jee</creatorcontrib><creatorcontrib>Isler, Volkan</creatorcontrib><title>3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data</title><description>Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view observations. 
To achieve state-of-the-art results, these methods assume that the objects are specified with respect to a fixed canonical coordinate frame, where instances of the same category are perfectly aligned. In this work, we present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image. We show that one can leverage shape priors learned on purely synthetic 3D data together with a point cloud pose canonicalization method to achieve high-quality 3D reconstruction in the wild. Given a single depth image at test time, we first transform this partial point cloud into a learned canonical frame. Then, we use a neural deformation field to reconstruct the 3D surface of the object. Finally, we jointly optimize object pose and 3D shape to fit the partial depth observation. Our approach achieves state-of-the-art reconstruction performance across several real-world datasets, even when trained only on synthetic data. We further show that our method generalizes to different input modalities, from dense depth images to sparse and noisy LIDAR scans.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj81KxDAUhbNxIaMP4Mr7Aq1pbpNJlzL1DwYUO-Cy3KaJE5imQ5oR-_bW0dWBw8fhfIzdFDwvtZT8juK3_8oFcpEXQmu8ZA3W0JyiI2Ph3ZoxTCmeTPJjAB8g7S18-EMP3Qy1dWMcfPiEZk9HC2_Rj3ECF8cBmjksaPIGakp0xS4cHSZ7_Z8rtnt82G2es-3r08vmfpuRWmPWE18-VLIjWRisJK8QBXdGKW2WyqE0zqlOd2vkViFyyaUqSZdi4aoeccVu_2bPVu0x-oHi3P7atWc7_AF8LUjC</recordid><startdate>20230224</startdate><enddate>20230224</enddate><creator>Häni, Nicolai</creator><creator>Chao, Jun-Jee</creator><creator>Isler, Volkan</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230224</creationdate><title>3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data</title><author>Häni, 
Nicolai ; Chao, Jun-Jee ; Isler, Volkan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a673-da012895ba51c395093320fc668ca51f35cff6b8b730e633050564a8423329d33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Häni, Nicolai</creatorcontrib><creatorcontrib>Chao, Jun-Jee</creatorcontrib><creatorcontrib>Isler, Volkan</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Häni, Nicolai</au><au>Chao, Jun-Jee</au><au>Isler, Volkan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data</atitle><date>2023-02-24</date><risdate>2023</risdate><abstract>Reconstructing the underlying 3D surface of an object from a single image is a challenging problem that has received extensive attention from the computer vision community. Many learning-based approaches tackle this problem by learning a 3D shape prior from either ground truth 3D data or multi-view observations. To achieve state-of-the-art results, these methods assume that the objects are specified with respect to a fixed canonical coordinate frame, where instances of the same category are perfectly aligned. In this work, we present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image. We show that one can leverage shape priors learned on purely synthetic 3D data together with a point cloud pose canonicalization method to achieve high-quality 3D reconstruction in the wild. 
Given a single depth image at test time, we first transform this partial point cloud into a learned canonical frame. Then, we use a neural deformation field to reconstruct the 3D surface of the object. Finally, we jointly optimize object pose and 3D shape to fit the partial depth observation. Our approach achieves state-of-the-art reconstruction performance across several real-world datasets, even when trained only on synthetic data. We further show that our method generalizes to different input modalities, from dense depth images to sparse and noisy LIDAR scans.</abstract><doi>10.48550/arxiv.2302.12883</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2302.12883
ispartof
issn
language eng
recordid cdi_arxiv_primary_2302_12883
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T15%3A25%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=3D%20Surface%20Reconstruction%20in%20the%20Wild%20by%20Deforming%20Shape%20Priors%20from%20Synthetic%20Data&rft.au=H%C3%A4ni,%20Nicolai&rft.date=2023-02-24&rft_id=info:doi/10.48550/arxiv.2302.12883&rft_dat=%3Carxiv_GOX%3E2302_12883%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
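
The description field of this record outlines a three-step test-time pipeline: lift a depth image to a partial point cloud, move it into a canonical frame, then fit pose and shape to the observation. The sketch below illustrates only the geometric scaffolding around those steps and is not the authors' implementation. The learned canonicalization and the neural deformation field are not reproduced; a fixed rigid transform and a fixed list of surface samples stand in for them. The pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), all function names, and the choice of a one-sided Chamfer distance as the fitting objective are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(depth: Sequence[Sequence[float]],
                      fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift valid pixels of a depth image to camera-frame 3D points (pinhole model)."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:  # assumption: non-positive depth marks a missing measurement
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def apply_rigid(points: Sequence[Point],
                rotation: Sequence[Sequence[float]],
                translation: Sequence[float]) -> List[Point]:
    """Apply x' = R x + t to every point (stand-in for a predicted canonical pose)."""
    transformed: List[Point] = []
    for p in points:
        transformed.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return transformed


def one_sided_chamfer(observed: Sequence[Point], surface: Sequence[Point]) -> float:
    """Mean squared distance from each observed point to its nearest surface sample."""
    total = 0.0
    for p in observed:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in surface)
    return total / len(observed)


# Hypothetical 2x2 depth image with two valid pixels at depth 2.0.
depth_image = [[0.0, 2.0],
               [2.0, 0.0]]
partial_cloud = backproject_depth(depth_image, fx=1.0, fy=1.0, cx=0.0, cy=0.0)

# Hypothetical canonical pose: identity rotation, shift by -2 along z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
canonical_cloud = apply_rigid(partial_cloud, identity, [0.0, 0.0, -2.0])

# Hypothetical surface samples; a real system would sample the deformed shape prior.
surface_samples = [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
fit_error = one_sided_chamfer(canonical_cloud, surface_samples)
```

The one-sided distance is used here because a single depth view covers only part of the object, so penalizing surface samples that have no nearby observation would be misleading. In a joint optimization of the kind the description mentions, a loss of this type would be minimized over both the pose parameters and the shape parameters.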