Method for solving video question and answer tasks needing common knowledge by using question-knowledge guided progressive space-time attention network


Bibliographic details
Main authors:	ZHAO ZHOU, ZHANG PINHAN, JIN WEIKE, CHEN MOSHA
Format:	Patent
Language:	chi ; eng
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; HANDLING RECORD CARRIERS; PHYSICS; PRESENTATION OF DATA; RECOGNITION OF DATA; RECORD CARRIERS

Description
Abstract:	The invention discloses a method for solving video question and answer tasks requiring common knowledge by using a question-knowledge guided progressive space-time attention network. The method comprises the following steps: for a video, obtaining a video object set using Faster-RCNN; retrieving the annotation text corresponding to the video object set from an external knowledge base to obtain external knowledge; extracting semantic features of the external knowledge using Doc2Vec to obtain a knowledge feature set for the video; for the question, converting the input words into word embedding vectors using an embedding layer; and inputting the word embedding vectors into the progressive space-time attention network to generate an answer. By using this additional information, more specific questions, such as some questions requiring common knowledge, can be answered. External knowledge and the question are combined to guide progressive video attention in the space and time dimensions, and fine-grained joint video representa
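
The guided attention step described in the abstract can be illustrated with a minimal sketch. This is not the patented network: the record does not give its equations, so the dot-product scoring, the element-wise sum used to combine question and knowledge features, and all function names below (`attend`, `progressive_attention`) are assumptions made purely for illustration. The sketch first pools object regions within each frame (space), then pools the resulting frame vectors (time), with both steps guided by the same combined question-knowledge vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(guide, items):
    """Score each item by its dot product with the guide vector (an
    assumed scoring function) and return (weights, weighted sum)."""
    weights = softmax([dot(guide, item) for item in items])
    dim = len(items[0])
    pooled = [sum(w * item[d] for w, item in zip(weights, items))
              for d in range(dim)]
    return weights, pooled


def progressive_attention(question_vec, knowledge_vec, video):
    """video: list of frames; each frame is a list of region feature vectors."""
    # Assumption: question and knowledge features are combined by summation.
    guide = [q + k for q, k in zip(question_vec, knowledge_vec)]
    # Spatial step: attend over the regions inside each frame.
    frame_vecs = [attend(guide, regions)[1] for regions in video]
    # Temporal step: attend over the per-frame vectors.
    temporal_weights, video_vec = attend(guide, frame_vecs)
    return temporal_weights, video_vec


# Toy input: 2 frames, 2 regions per frame, 2-dimensional features.
question = [1.0, 0.0]
knowledge = [0.0, 1.0]
video = [
    [[1.0, 1.0], [0.0, 0.0]],
    [[-1.0, -1.0], [0.0, 0.0]],
]
weights, video_vec = progressive_attention(question, knowledge, video)
```

In this toy input the first frame contains a region aligned with the guide vector, so it receives the larger temporal weight. In a real system the region features would come from an object detector such as Faster-RCNN and the knowledge vector from a document embedding such as Doc2Vec, as the abstract states.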