Towards Context-Stable and Visual-Consistent Image Inpainting

Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks. However, this enhanced generation often introduces context-instability, leading to arbitrary object generation within masked regions. This pap...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2024-03
Hauptverfasser: Wang, Yikai, Cao, Chenjie, Ke Fan Xiangyang Xue Yanwei Fu
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks. However, this enhanced generation often introduces context-instability, leading to arbitrary object generation within masked regions. This paper proposes a balanced solution, emphasizing the importance of unmasked regions in guiding inpainting while preserving generation capacity. Our approach, Aligned Stable Inpainting with UnKnown Areas Prior (ASUKA), employs a Masked Auto-Encoder (MAE) to produce reconstruction-based prior. Aligned with the powerful Stable Diffusion inpainting model (SD), ASUKA significantly improves context stability. ASUKA further adopts an inpainting-specialized decoder, highly reducing the color inconsistency issue of SD and thus ensuring more visual-consistent inpainting. We validate effectiveness of inpainting algorithms on benchmark dataset Places 2 and a collection of several existing datasets, dubbed MISATO, across diverse domains and masking scenarios. Results on these benchmark datasets confirm ASUKA's efficacy in both context-stability and visual-consistency compared to SD and other inpainting algorithms.
ISSN:2331-8422