A Text-Guided Generation and Refinement Model for Image Captioning
A high-quality image description requires not only the logic and fluency of language but also the richness and accuracy ofcontent. However, due to the semantic gap between vision and language, most existing image captioning approaches thatdirectly learn the cross-modal mapping from vision to languag...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on multimedia 2023, Vol.25, p.2966-2977 |
---|---|
Hauptverfasser: | , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Schreiben Sie den ersten Kommentar!