Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding

Panoptic narrative grounding (PNG), whose core target is fine-grained image-text alignment, requires a panoptic segmentation of referred objects given a narrative caption. Previous discriminative methods achieve only weak or coarse-grained alignment by panoptic segmentation pretraining or CLIP model...

Bibliographic details
Main authors: Li, Hongyu; Hui, Tianrui; Ding, Zihan; Zhang, Jing; Ma, Bin; Wei, Xiaoming; Han, Jizhong; Liu, Si
Format: Article
Language: English