Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
Format: | Article |
Language: | English |
Online access: | Order full text |
Abstract: | Point-based interactive editing serves as an essential tool to complement the
controllability of existing generative models. A concurrent work,
DragDiffusion, updates the diffusion latent map in response to user inputs,
causing global latent map alterations. This results in imprecise preservation
of the original content and unsuccessful editing due to gradient vanishing. In
contrast, we present DragNoise, offering robust and accelerated editing without
retracing the latent map. The core rationale of DragNoise lies in utilizing the
U-Net's predicted noise output at each denoising step as a semantic editor. This approach is
grounded in two critical observations: firstly, the bottleneck features of
U-Net inherently possess semantically rich features ideal for interactive
editing; secondly, high-level semantics, established early in the denoising
process, show minimal variation in subsequent stages. Leveraging these
insights, DragNoise edits diffusion semantics in a single denoising step and
efficiently propagates these changes, ensuring stability and efficiency in
diffusion editing. Comparative experiments reveal that DragNoise achieves
superior control and semantic retention, reducing the optimization time by over
50% compared to DragDiffusion. Our codes are available at
https://github.com/haofengl/DragNoise. |
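The core idea described in the abstract — edit the semantically rich U-Net bottleneck once, early in denoising, then propagate that edit through later steps instead of re-optimizing the full latent map — can be illustrated with a minimal toy sketch. This is a hypothetical simplification for intuition only, not the authors' implementation: the "U-Net", weights, step size, and edit offset are all stand-ins.

```python
import numpy as np

# Toy sketch (hypothetical, not the DragNoise code): a linear "U-Net" with an
# explicit bottleneck. We edit the bottleneck at one early denoising step and
# reuse the edited bottleneck in all subsequent steps ("propagation").

rng = np.random.default_rng(0)
W_enc = rng.standard_normal((4, 8))  # stand-in encoder weights
W_dec = rng.standard_normal((8, 4))  # stand-in decoder weights

def unet(latent, bottleneck_override=None):
    """Toy 'U-Net': encode to a bottleneck, optionally override it, decode to noise."""
    b = latent @ W_enc if bottleneck_override is None else bottleneck_override
    return b @ W_dec, b

def denoise(latent, steps, edit_step=None, edit_delta=None):
    """Run a toy denoising loop; optionally edit the bottleneck once and propagate it."""
    edited_bottleneck = None
    for t in range(steps):
        if t == edit_step:
            # one-shot semantic edit at the bottleneck (a single denoising step)
            _, b = unet(latent)
            edited_bottleneck = b + edit_delta
        noise, _ = unet(latent, bottleneck_override=edited_bottleneck)
        latent = latent - 0.1 * noise  # toy update rule; real samplers differ
    return latent

x0 = rng.standard_normal(4)
plain = denoise(x0.copy(), steps=5)
edited = denoise(x0.copy(), steps=5, edit_step=1, edit_delta=np.ones(8))
print(np.allclose(plain, edited))  # → False (the propagated edit changes the result)
```

The point of the sketch is that the edit is applied once, at a single early step, and then simply reused — no gradient retraces through the latent map at every step, which is the mechanism the abstract credits for the stability and >50% optimization-time reduction over DragDiffusion.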
DOI: | 10.48550/arxiv.2404.01050 |