StyleFusion: A Generative Model for Disentangling Spatial Segments
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | We present StyleFusion, a new mapping architecture for StyleGAN, which takes
as input a number of latent codes and fuses them into a single style code.
Inserting the resulting style code into a pre-trained StyleGAN generator
results in a single harmonized image in which each semantic region is
controlled by one of the input latent codes. Effectively, StyleFusion yields a
disentangled representation of the image, providing fine-grained control over
each region of the generated image. Moreover, to help facilitate global control
over the generated image, a special input latent code is incorporated into the
fused representation. StyleFusion operates in a hierarchical manner, where each
level is tasked with learning to disentangle a pair of image regions (e.g., the
car body and wheels). The resulting learned disentanglement allows one to
modify both local, fine-grained semantics (e.g., facial features) and
more global features (e.g., pose and background), providing improved
flexibility in the synthesis process. As a natural extension, StyleFusion
enables one to perform semantically aware cross-image mixing of regions that
are not necessarily aligned. Finally, we demonstrate how StyleFusion can be
paired with existing editing techniques to more faithfully constrain the edit
to the user's region of interest. |
---|---|
DOI: | 10.48550/arxiv.2107.07437 |
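
The abstract describes the fusion only at a high level: several latent codes go in, one style code comes out, and each level of a hierarchy separates a pair of regions. It does not say how the fusion itself is computed. The toy sketch below illustrates that data flow under a simplifying assumption that is not taken from the record: each level blends two codes channel by channel using a fixed mask. The names `fuse_pair` and `fuse_hierarchy`, the 4-channel codes, and the mask values are all hypothetical; in the paper the fusion is learned and operates on StyleGAN codes.

```python
from typing import List, Sequence

Latent = List[float]


def fuse_pair(a: Sequence[float], b: Sequence[float], mask: Sequence[float]) -> Latent:
    """Blend two codes channel-wise: mask 1.0 keeps `a`, mask 0.0 keeps `b`.

    Assumption for illustration only; the record does not describe the
    actual fusion operation.
    """
    if not (len(a) == len(b) == len(mask)):
        raise ValueError("codes and mask must have the same length")
    return [m * x + (1.0 - m) * y for x, y, m in zip(a, b, mask)]


def fuse_hierarchy(codes: Sequence[Sequence[float]], masks: Sequence[Sequence[float]]) -> Latent:
    """Fold the codes left to right, one pairwise fusion per level."""
    if len(codes) == 0 or len(masks) != len(codes) - 1:
        raise ValueError("need exactly one mask per pair of adjacent levels")
    fused = list(codes[0])
    for code, mask in zip(codes[1:], masks):
        fused = fuse_pair(fused, code, mask)
    return fused


# Hypothetical codes, one per semantic region.
body = [1.0, 1.0, 1.0, 1.0]
wheels = [0.0, 0.0, 0.0, 0.0]
background = [5.0, 5.0, 5.0, 5.0]

# Level 1 separates body from wheels; level 2 separates the car from the background.
masks = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
]

style = fuse_hierarchy([body, wheels, background], masks)
```

In this sketch each channel of `style` is controlled by exactly one input code, which mirrors the per-region control the abstract describes. A real implementation would pass the fused code to a pre-trained generator rather than inspect it directly.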