Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation
Large-scale training data with high-quality annotations is critical for training semantic and instance segmentation models. Unfortunately, pixel-wise annotation is labor-intensive and costly, raising the demand for more efficient labeling strategies. In this work, we present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and instance labels from easy-to-obtain coarse 3D bounding primitives. Our method utilizes NeRF as a differentiable tool to unify coarse 3D annotations and 2D semantic cues transferred from existing datasets. We demonstrate that this combination allows for improved geometry guided by semantic information, enabling rendering of accurate semantic maps across multiple views. Furthermore, this fusion process resolves label ambiguity of the coarse 3D annotations and filters noise in the 2D predictions. By inferring in 3D space and rendering to 2D labels, our 2D semantic and instance labels are multi-view consistent by design. Experimental results show that Panoptic NeRF outperforms existing label transfer methods in terms of accuracy and multi-view consistency on challenging urban scenes of the KITTI-360 dataset.
| Field | Value |
|---|---|
| Published in | arXiv.org, 2022-09 |
| Main authors | Fu, Xiao; Zhang, Shangzhan; Chen, Tianrun; Lu, Yichong; Zhu, Lanyun; Zhou, Xiaowei; Geiger, Andreas; Liao, Yiyi |
| Format | Article |
| Language | English |
| EISSN | 2331-8422 |
| Subjects | Datasets; Image annotation; Image segmentation; Labels; Noise prediction; Pixels; Rendering; Semantics; Training |
| Publisher | Cornell University Library, arXiv.org (Ithaca) |
| Online access | Full text |
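The abstract describes obtaining 2D labels by inferring in 3D space and rendering to 2D, using NeRF as a differentiable tool. As a rough illustration of that rendering step only, below is a minimal NeRF-style semantic volume-rendering sketch in PyTorch. It is not the authors' implementation; the function name, shapes, and variables are hypothetical, and the technique shown is generic semantic volume rendering (alpha-compositing per-sample class distributions along camera rays).

```python
import torch

def render_semantic_labels(sigma, logits, deltas):
    """Alpha-composite per-sample class distributions along camera rays.

    Generic NeRF-style volume rendering applied to semantics, not the
    Panoptic NeRF codebase. Illustrative shapes:
      sigma:  (R, S)    volume density at S samples on each of R rays
      logits: (R, S, C) per-sample semantic logits over C classes
      deltas: (R, S)    distance between adjacent samples
    Returns (R, C) per-ray class probabilities; argmax over C gives the
    rendered 2D label. Because the label field lives in 3D, labels
    rendered into different views agree by construction.
    """
    # Per-sample opacity and transmittance up to each sample (standard NeRF).
    alpha = 1.0 - torch.exp(-sigma * deltas)                        # (R, S)
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)              # (R, S)
    trans = torch.cat([torch.ones_like(trans[:, :1]),
                       trans[:, :-1]], dim=-1)                      # shift: T_i = prod_{j<i}
    weights = alpha * trans                                         # (R, S)
    # Composite per-sample class distributions into one per-pixel distribution.
    probs = torch.softmax(logits, dim=-1)                           # (R, S, C)
    return (weights.unsqueeze(-1) * probs).sum(dim=1)               # (R, C)


# Toy usage: 4 rays, 8 samples per ray, 5 classes.
sigma = torch.rand(4, 8)
logits = torch.randn(4, 8, 5)
deltas = torch.full((4, 8), 0.1)
pixel_probs = render_semantic_labels(sigma, logits, deltas)
labels = pixel_probs.argmax(dim=-1)  # (4,) rendered semantic labels
```

Because the compositing weights are the same ones used for color rendering and the whole pipeline is differentiable, 2D semantic supervision can flow back into the 3D density field; this is consistent with the abstract's claim that semantic information improves the recovered geometry.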