CLIP3Dstyler: Language Guided 3D Arbitrary Neural Style Transfer

In this paper, we propose a novel language-guided 3D arbitrary neural style transfer method (CLIP3Dstyler). We aim to stylize any 3D scene with an arbitrary style from a text description and to synthesize the novel stylized view, which is more flexible than image-conditioned style transfer. Compared with the previous 2D method CLIPStyler, we are able to stylize a 3D scene and generalize to novel scenes without retraining our model. A straightforward solution is to combine previous image-conditioned 3D style transfer and text-conditioned 2D style transfer methods. However, such a solution cannot achieve our goal due to two main challenges. First, there is no multi-modal model matching point clouds and language at different feature scales (low-level, high-level). Second, we observe a style mixing issue when we stylize the content with different style conditions from text prompts. To address the first issue, we propose a 3D stylization framework to match the point cloud features with text features in local and global views. For the second issue, we propose an improved directional divergence loss to make arbitrary text styles more distinguishable as a complement to our framework. We conduct extensive experiments to show the effectiveness of our model on text-guided 3D scene style transfer.
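
The improved directional divergence loss mentioned in the abstract builds on the CLIP directional objective popularized by CLIPStyler and StyleGAN-NADA, which aligns the shift between content and stylized image embeddings with the shift between source and style text embeddings. The paper's improved variant is not reproduced here; the sketch below shows only the standard directional term it extends, assuming precomputed CLIP embeddings (the tensor names and the `clip_directional_loss` helper are illustrative, not from the paper).

```python
import torch
import torch.nn.functional as F

def clip_directional_loss(img_src: torch.Tensor,
                          img_sty: torch.Tensor,
                          txt_src: torch.Tensor,
                          txt_sty: torch.Tensor) -> torch.Tensor:
    """Standard CLIP directional loss (generic sketch, not the paper's
    improved directional divergence loss): align the image-embedding
    shift (content -> stylized render) with the text-embedding shift
    (source prompt -> style prompt). Inputs are CLIP embeddings of
    shape (batch, dim)."""
    delta_img = F.normalize(img_sty - img_src, dim=-1)  # image-space direction
    delta_txt = F.normalize(txt_sty - txt_src, dim=-1)  # text-space target direction
    # 1 - cosine similarity; zero when the two shifts are perfectly aligned
    return (1.0 - (delta_img * delta_txt).sum(dim=-1)).mean()

# Toy usage with random tensors standing in for CLIP encoder outputs.
if __name__ == "__main__":
    d = 512
    img_src, img_sty = torch.randn(4, d), torch.randn(4, d)
    txt_src, txt_sty = torch.randn(1, d), torch.randn(1, d)
    print(clip_directional_loss(img_src, img_sty, txt_src, txt_sty))
```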

Bibliographic Details

Published in: arXiv.org, 2023-05
Main authors: Gao, Ming; Xu, YanWu; Zhao, Yang; Hou, Tingbo; Zhao, Chenkai; Gong, Mingming
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Model matching; Three dimensional models
Publisher: Cornell University Library, arXiv.org (Ithaca)
Online access: Full text