MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
Main Authors: | , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online Access: | Order full text |
Summary: | Despite impressive advancements in diffusion-based video editing models in
altering video attributes, there has been limited exploration into modifying
motion information while preserving the original protagonist's appearance and
background. In this paper, we propose MotionFollower, a lightweight
score-guided diffusion model for video motion editing. To introduce conditional
controls to the denoising process, MotionFollower leverages two of our proposed
lightweight signal controllers, one for poses and the other for appearances,
both of which consist of convolution blocks without involving heavy attention
calculations. Further, we design a score guidance principle based on a
two-branch architecture, including the reconstruction and editing branches,
which significantly enhances the modeling capability of texture details and
complicated backgrounds. Concretely, we enforce several consistency
regularizers and losses during the score estimation. The resulting gradients
thus inject appropriate guidance to the intermediate latents, forcing the model
to preserve the original background details and protagonists' appearances
without interfering with the motion modification. Experiments demonstrate the
competitive motion editing ability of MotionFollower qualitatively and
quantitatively. Compared with MotionEditor, the most advanced motion editing
model, MotionFollower achieves an approximately 80% reduction in GPU memory
while delivering superior motion editing performance and exclusively supporting
large camera movements and actions. |
---|---|
DOI: | 10.48550/arxiv.2405.20325 |