LogoSticker: Inserting Logos into Diffusion Models for Customized Generation
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Recent advances in text-to-image model customization have underscored the
importance of integrating new concepts from a few examples. Yet this
progress is largely confined to widely recognized subjects, which can be
learned with relative ease thanks to the models' adequate shared prior knowledge. In
contrast, logos, characterized by unique patterns and textual elements, are
difficult for diffusion models to establish shared knowledge about, and thus present a
unique challenge. To bridge this gap, we introduce the task of logo insertion.
Our goal is to insert logo identities into diffusion models and enable their
seamless synthesis in varied contexts. We present a novel two-phase pipeline,
LogoSticker, to tackle this task. First, we propose the actor-critic relation
pre-training algorithm, which addresses the nontrivial gaps in models'
understanding of the potential spatial positioning of logos and their interactions
with other objects. Second, we propose a decoupled identity learning algorithm,
which enables precise localization and identity extraction of logos.
LogoSticker can generate logos accurately and harmoniously in diverse contexts.
We comprehensively validate the effectiveness of LogoSticker against customization
methods and large models such as DALL·E 3.
Project page: https://mingkangz.github.io/logosticker |
---|---|
DOI: | 10.48550/arxiv.2407.13752 |