MagicMix: Semantic Mixing with Diffusion Models
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Have you ever imagined what a corgi-alike coffee machine or a tiger-alike
rabbit would look like? In this work, we attempt to answer these questions by
exploring a new task called semantic mixing, which aims to blend two different
semantics to create a new concept (e.g., corgi + coffee machine ->
corgi-alike coffee machine). Unlike style transfer, where an image is stylized
according to a reference style without changing the image content, semantic
mixing combines two different concepts in a semantic manner to synthesize a
novel concept while preserving the spatial layout and geometry. To this end, we
present MagicMix, a simple yet effective solution based on pre-trained
text-conditioned diffusion models. Our method is motivated by the progressive
generation property of diffusion models: layout and shape emerge at early
denoising steps, while semantically meaningful details appear at later steps.
It first obtains a coarse layout (either by corrupting an image or by
denoising from pure Gaussian noise given a text prompt) and then injects a
conditional prompt for semantic mixing. Our
method does not require any spatial mask or re-training, yet is able to
synthesize novel objects with high fidelity. To improve the mixing quality, we
further devise two simple strategies to provide better control and flexibility
over the synthesized content. We present results across
diverse downstream applications, including semantic style transfer, novel
object synthesis, breed mixing, and concept removal, demonstrating the
flexibility of our method. More results can be found on the project page:
https://magicmix.github.io |
---|---|
DOI: | 10.48550/arxiv.2210.16056 |
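
The two-phase idea in the summary (get a coarse layout by corrupting an image, then denoise under a conditional prompt) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the names `linear_alpha_bar`, `corrupt`, and `mix`, the `DenoiseStep` callback signature, the linear noise schedule, and the `start_fraction` parameter are all hypothetical, and a real system would plug a pre-trained text-conditioned diffusion model into `denoise_step`. The sketch also leaves out the two additional strategies the summary mentions, since the record does not describe them.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical interface for one reverse-diffusion step of a text-conditioned
# model: (noisy sample, timestep, prompt) -> slightly less noisy sample.
DenoiseStep = Callable[[List[float], int, str], List[float]]


def linear_alpha_bar(num_steps: int, beta_start: float = 1e-4,
                     beta_end: float = 0.02) -> List[float]:
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alpha_bar, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def corrupt(x0: Sequence[float], t: int, alpha_bar: Sequence[float],
            rng: random.Random) -> List[float]:
    """Forward diffusion: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    a = alpha_bar[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for v in x0]


def mix(layout: Sequence[float], prompt: str, denoise_step: DenoiseStep,
        num_steps: int = 1000, start_fraction: float = 0.6,
        seed: int = 0) -> List[float]:
    """Noise `layout` to an intermediate step, then denoise under `prompt`.

    Stopping the corruption at an intermediate step keeps the coarse layout,
    while the remaining denoising steps fill in content guided by the prompt.
    """
    rng = random.Random(seed)
    alpha_bar = linear_alpha_bar(num_steps)
    t_start = int(start_fraction * (num_steps - 1))
    x = corrupt(layout, t_start, alpha_bar, rng)
    for t in range(t_start, -1, -1):
        x = denoise_step(x, t, prompt)
    return x
```

In this sketch a larger `start_fraction` adds more noise before denoising, so less of the original layout would survive; the flattened list of floats stands in for an image or latent tensor.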