CLIP3Dstyler: Language Guided 3D Arbitrary Neural Style Transfer
Format: Article
Language: English
Abstract: In this paper, we propose a novel language-guided 3D arbitrary neural style transfer method (CLIP3Dstyler). We aim to stylize any 3D scene with an arbitrary style from a text description and to synthesize the novel stylized view, which is more flexible than image-conditioned style transfer. Compared with the previous 2D method CLIPStyler, we are able to stylize a 3D scene and generalize to novel scenes without re-training our model. A straightforward solution is to combine previous image-conditioned 3D style transfer and text-conditioned 2D style transfer methods. However, such a solution cannot achieve our goal due to two main challenges. First, there is no multi-modal model matching point clouds and language at different feature scales (low-level, high-level). Second, we observe a style mixing issue when we stylize the content with different style conditions from text prompts. To address the first issue, we propose a 3D stylization framework to match the point cloud features with text features in local and global views. For the second issue, we propose an improved directional divergence loss to make arbitrary text styles more distinguishable as a complement to our framework. We conduct extensive experiments to show the effectiveness of our model on text-guided 3D scene style transfer.
DOI: 10.48550/arxiv.2305.15732
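For context, the directional objective that CLIPStyler introduced, and that this paper's improved directional divergence loss builds on, compares edit directions rather than raw similarities: the change between the stylized and original renders in CLIP image space should point the same way as the change between the style prompt and a neutral source prompt in CLIP text space. Below is a minimal PyTorch sketch of that base loss, assuming OpenAI's `clip` package and CLIP-preprocessed image tensors; it is not the authors' released code, and the paper's improved variant for keeping different text styles from mixing is not reproduced here.

```python
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def encode_text(prompt: str) -> torch.Tensor:
    """Return an L2-normalized CLIP text embedding for a single prompt."""
    tokens = clip.tokenize([prompt]).to(device)
    return F.normalize(model.encode_text(tokens).float(), dim=-1)

def directional_clip_loss(stylized: torch.Tensor,
                          content: torch.Tensor,
                          style_prompt: str,
                          source_prompt: str = "a photo") -> torch.Tensor:
    """CLIPStyler-style directional loss: the image-space edit direction
    (stylized - content) should align with the text-space edit direction
    (style prompt - source prompt).

    `stylized` and `content` are batches of CLIP-preprocessed renders
    of shape (B, 3, 224, 224).
    """
    img_dir = F.normalize(
        model.encode_image(stylized).float()
        - model.encode_image(content).float(),
        dim=-1)
    txt_dir = F.normalize(
        encode_text(style_prompt) - encode_text(source_prompt),
        dim=-1)
    # Cosine-distance form: 0 when the two directions are perfectly aligned.
    return (1.0 - (img_dir * txt_dir).sum(dim=-1)).mean()
```

In a 3D setting such as this paper's, one would presumably render views of the stylized point cloud and apply a loss of this form per view, alongside the paper's local/global point-cloud-to-text matching.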