VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes

Mobile robots operating indoors must be prepared to navigate challenging scenes that contain transparent surfaces. This paper proposes a novel method for the fusion of acoustic and visual sensing modalities through implicit neural representations to enable dense reconstruction of transparent surfaces in indoor scenes. We propose a novel model that leverages generative latent optimization to learn an implicit representation of indoor scenes consisting of transparent surfaces. We demonstrate that we can query the implicit representation to enable volumetric rendering in image space or 3D geometry reconstruction (point clouds or mesh) with transparent surface prediction. We evaluate our method's effectiveness qualitatively and quantitatively on a new dataset collected using a custom, low-cost sensing platform featuring RGB-D cameras and ultrasonic sensors. Our method exhibits significant improvement over state-of-the-art for transparent surface reconstruction.
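To make the abstract's pipeline more concrete, below is a minimal, hypothetical sketch of an implicit scene representation conditioned on a per-scene latent code in the spirit of generative latent optimization (GLO): the latent codes are optimized jointly with the shared network weights, and the resulting field can be queried at arbitrary 3D points (e.g., on a dense grid to extract a point cloud or mesh, or along camera rays for volumetric rendering). This is not the authors' released code; PyTorch, the network sizes, the occupancy/transparency outputs, and the placeholder supervision are all illustrative assumptions.

import torch
import torch.nn as nn

class ImplicitScene(nn.Module):
    # Maps a 3D query point plus a per-scene latent code to two logits:
    # occupancy and a transparent-surface score. Querying this field on a
    # dense grid (and thresholding occupancy) would yield a point cloud;
    # sampling it along camera rays would support volumetric rendering.
    def __init__(self, latent_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # [occupancy logit, transparency logit]
        )

    def forward(self, xyz, z):
        # xyz: (N, 3) query points; z: (latent_dim,) scene code
        z_rep = z.expand(xyz.shape[0], -1)
        return self.mlp(torch.cat([xyz, z_rep], dim=-1))

# GLO-style optimization: the per-scene latent codes are free parameters
# trained jointly with the network weights (no encoder network).
num_scenes, latent_dim = 4, 64
model = ImplicitScene(latent_dim=latent_dim)
latents = nn.Parameter(0.01 * torch.randn(num_scenes, latent_dim))
opt = torch.optim.Adam(list(model.parameters()) + [latents], lr=1e-3)

for step in range(100):
    scene_id = step % num_scenes
    # Placeholder supervision: random points with dummy occupancy and
    # transparency targets standing in for fused RGB-D + ultrasonic labels.
    pts = torch.rand(256, 3)
    target = torch.rand(256, 2)
    pred = torch.sigmoid(model(pts, latents[scene_id]))
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()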

Bibliographic Details
Main authors: Sethuraman, Advaith V; Bagoren, Onur; Seetharaman, Harikrishnan; Richardson, Dalton; Taylor, Joseph; Skinner, Katherine A
Format: Article
Language: English
Subjects: Computer Science - Computer Vision and Pattern Recognition
Online access: Order full text
DOI: 10.48550/arxiv.2411.04963
Date: 2024-11-07
Rights: http://creativecommons.org/licenses/by/4.0 (open access)
Source: arXiv.org
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T10%3A57%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=VAIR:%20Visuo-Acoustic%20Implicit%20Representations%20for%20Low-Cost,%20Multi-Modal%20Transparent%20Surface%20Reconstruction%20in%20Indoor%20Scenes&rft.au=Sethuraman,%20Advaith%20V&rft.date=2024-11-07&rft_id=info:doi/10.48550/arxiv.2411.04963&rft_dat=%3Carxiv_GOX%3E2411_04963%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true