CLIP3Dstyler: Language Guided 3D Arbitrary Neural Style Transfer

In this paper, we propose a novel language-guided 3D arbitrary neural style transfer method (CLIP3Dstyler). We aim to stylize any 3D scene with an arbitrary style from a text description and to synthesize the novel stylized view, which is more flexible than image-conditioned style transfer. Compared with the previous 2D method CLIPStyler, we are able to stylize a 3D scene and generalize to novel scenes without retraining our model. A straightforward solution is to combine previous image-conditioned 3D style transfer and text-conditioned 2D style transfer methods. However, such a solution cannot achieve our goal due to two main challenges. First, there is no multi-modal model matching point clouds and language at different feature scales (low-level, high-level). Second, we observe a style mixing issue when we stylize the content with different style conditions from text prompts. To address the first issue, we propose a 3D stylization framework to match the point cloud features with text features in local and global views. For the second issue, we propose an improved directional divergence loss to make arbitrary text styles more distinguishable as a complement to our framework. We conduct extensive experiments to show the effectiveness of our model on text-guided 3D scene style transfer.
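
The improved directional divergence loss mentioned in the abstract builds on the CLIP directional objective popularized by CLIPStyler and StyleGAN-NADA, which aligns the shift between content and stylized image embeddings with the shift between source and style text embeddings. The paper's improved variant is not reproduced here; the sketch below shows only the standard directional term it extends, assuming precomputed CLIP embeddings (the tensor names and the `clip_directional_loss` helper are illustrative, not from the paper).

```python
import torch
import torch.nn.functional as F

def clip_directional_loss(img_src: torch.Tensor,
                          img_sty: torch.Tensor,
                          txt_src: torch.Tensor,
                          txt_sty: torch.Tensor) -> torch.Tensor:
    """Standard CLIP directional loss (generic sketch, not the paper's
    improved directional divergence loss): align the image-embedding
    shift (content -> stylized render) with the text-embedding shift
    (source prompt -> style prompt). Inputs are CLIP embeddings of
    shape (batch, dim)."""
    delta_img = F.normalize(img_sty - img_src, dim=-1)  # image-space direction
    delta_txt = F.normalize(txt_sty - txt_src, dim=-1)  # text-space target direction
    # 1 - cosine similarity; zero when the two shifts are perfectly aligned
    return (1.0 - (delta_img * delta_txt).sum(dim=-1)).mean()

# Toy usage with random tensors standing in for CLIP encoder outputs.
if __name__ == "__main__":
    d = 512
    img_src, img_sty = torch.randn(4, d), torch.randn(4, d)
    txt_src, txt_sty = torch.randn(1, d), torch.randn(1, d)
    print(clip_directional_loss(img_src, img_sty, txt_src, txt_sty))
```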

Bibliographic Details

Published in: arXiv.org, 2023-05
Main authors: Gao, Ming; Xu, YanWu; Zhao, Yang; Hou, Tingbo; Zhao, Chenkai; Gong, Mingming
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Model matching; Three dimensional models
Publisher: Cornell University Library, arXiv.org (Ithaca)
Online access: Full text