MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets
Saved in:
Main Authors: | Li, Zeyu; Gan, Ruitong; Luo, Chuanchen; Wang, Yuxi; Liu, Jiaheng; Zhang, Ziwei; Zhu, Man; Li, Qing; Yin, Xucheng; Zhang, Zhaoxiang; Peng, Junran |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
Online Access: | Order full text |
creator | Li, Zeyu; Gan, Ruitong; Luo, Chuanchen; Wang, Yuxi; Liu, Jiaheng; Zhang, Ziwei; Zhu, Man; Li, Qing; Yin, Xucheng; Zhang, Zhaoxiang; Peng, Junran |
description | Driven by powerful image diffusion models, recent research has achieved the
automatic creation of 3D objects from textual or visual guidance. By performing
score distillation sampling (SDS) iteratively across different views, these
methods succeed in lifting the 2D generative prior into 3D space. However, such a
2D generative image prior bakes the effect of illumination and shadow into the
texture. As a result, material maps optimized by SDS inevitably involve
spuriously correlated components. The absence of a precise material definition
makes it infeasible to relight the generated assets reasonably in novel scenes,
which limits their application in downstream scenarios. In contrast, humans can
effortlessly circumvent this ambiguity by deducing the material of the object
from its appearance and semantics. Motivated by this insight, we propose
MaterialSeg3D, a 3D asset material generation framework to infer underlying
material from the 2D semantic prior. Based on such a prior model, we devise a
mechanism to parse material in 3D space. We maintain a UV stack, each map of
which is unprojected from a specific viewpoint. After traversing all
viewpoints, we fuse the stack through a weighted voting scheme and then employ
region unification to ensure the coherence of the object parts. To fuel the
learning of the semantic prior, we collect a material dataset named Materialized
Individual Objects (MIO), which features abundant images, diverse categories,
and accurate annotations. Extensive quantitative and qualitative experiments
demonstrate the effectiveness of our method. |
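The fusion mechanism described in the abstract (per-view material predictions unprojected into a stack of UV maps, fused by weighted voting, then unified per region) can be pictured with a short sketch. The following is a minimal NumPy illustration of the weighted-voting and region-unification steps only; the function names, the per-texel confidence weights, and the use of UV islands as regions are assumptions made for this example, not the authors' released implementation.

```python
import numpy as np

def fuse_uv_stack(label_stack, weight_stack, num_classes):
    """Weighted voting over per-view material label maps in UV space.

    label_stack:  (V, H, W) int array; the material class predicted from view v,
                  unprojected onto the UV map, with -1 where that view sees no texel.
    weight_stack: (V, H, W) float array; per-texel confidence for each view
                  (e.g. derived from viewing angle -- an assumption in this sketch).
    Returns an (H, W) int array of fused material labels.
    """
    V, H, W = label_stack.shape
    votes = np.zeros((num_classes, H, W), dtype=np.float64)
    for v in range(V):
        labels, weights = label_stack[v], weight_stack[v]
        valid = labels >= 0
        rows, cols = np.nonzero(valid)
        # Each visible texel casts a weighted vote for its predicted class.
        np.add.at(votes, (labels[valid], rows, cols), weights[valid])
    # Texels unseen by every view default to class 0 in this sketch.
    return votes.argmax(axis=0)

def unify_regions(fused_labels, region_ids):
    """Force one material per region (e.g. per UV island or mesh part) by
    majority vote, so that object parts stay coherent."""
    unified = fused_labels.copy()
    for rid in np.unique(region_ids):
        mask = region_ids == rid
        unified[mask] = np.bincount(fused_labels[mask]).argmax()
    return unified

# Tiny usage example with random stand-in data (2 views, 4 material classes).
labels = np.random.randint(-1, 4, size=(2, 64, 64))
weights = np.random.rand(2, 64, 64)
regions = np.arange(64 * 64).reshape(64, 64) // 1024  # 4 fake regions
material_map = unify_regions(fuse_uv_stack(labels, weights, 4), regions)
```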
doi_str_mv | 10.48550/arxiv.2404.13923 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2404.13923 |
language | eng |
recordid | cdi_arxiv_primary_2404_13923 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets |