Aligning where to see and what to tell: image caption with region-based attention and scene factorization

Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image caption system that exploits the parallel structures between images and sente...

Bibliographic details
Published in:	arXiv.org 2015-06
Main authors:	Jin, Junqi; Fu, Kun; Cui, Runpeng; Sha, Fei; Zhang, Changshui
Format:	Article
Language:	eng
Subjects:	Alignment; Coding; Modelling; Sentences; Visual perception
Online access:	Full text