Answer-checking in Context: A Multi-modal Fully Attention Network for Visual Question Answering

Visual Question Answering (VQA) is challenging due to the complex cross-modal relations. It has received extensive attention from the research community. From the human perspective, to answer a visual question, one needs to read the question and then refer to the image to generate an answer. This an...

Bibliographic details
Published in:	arXiv.org 2020-10
Main authors:	Huang, Hantao; Han, Tao; Han, Wei; Yap, Deep; Cheng-Ming, Chiang
Format:	Article
Language:	English
Subjects:	Context; Model accuracy; Modules; Questions
Online access:	Full text