Inst-Inpaint: Instructing to Remove Objects with Diffusion Models
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | The image inpainting task refers to erasing unwanted pixels from images and
filling them in a semantically consistent and realistic way. Traditionally, the
pixels to be erased are defined with binary masks. From an application point of
view, a user needs to generate masks for the objects they would like to remove,
which can be time-consuming and error-prone. In this work, we are interested in
an image inpainting algorithm that estimates which object to remove based on
natural language input and simultaneously removes it. For this purpose, we
first construct a dataset named GQA-Inpaint for this task. Second, we present a
novel inpainting framework, Inst-Inpaint, that can remove objects from images
based on instructions given as text prompts. We set various GAN-based and
diffusion-based baselines and run experiments on synthetic and real image
datasets. We compare methods with different evaluation metrics that measure the
quality and accuracy of the models and show significant quantitative and
qualitative improvements. |
---|---|
DOI: | 10.48550/arxiv.2304.03246 |