An image subtitle generation method based on MLL and ASCA-FR


Bibliographic details
Main authors: CAI HONGXIA, QU LINZI, HE LIHUO, ZHONG YANZHE, LU WEN, ZHANG YI, WU TIANYAN, LI QIQI, GAO XINBO
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses an image subtitle (caption) generation method that combines multi-scale learning (MLL) with ASCA-FR, a feature reconstruction scheme built on a joint attention mechanism over adjacent time nodes. It mainly addresses the inaccurate descriptions and disfluent wording of generated captions in the prior art. These problems arise because the output of the attention model at a given time step considers only the feature set of the image and the word vector from the previous time step, and because the network is trained with a cross-entropy loss function alone. The method comprises the following steps: (1) generating a natural image training set and test set; (2) extracting feature vectors; (3) constructing the ASCA-FR network; (4) training the ASCA-FR network; (5) obtaining the natural image subtitles. Because the constructed ASCA-FR network is trained with the MLL loss function, the generated subtitles are accurately described and fluently expressed.
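
The attention setting that the summary attributes to the prior art can be illustrated with a minimal sketch. The record does not give the ASCA-FR architecture or the MLL loss, so nothing below should be read as the patented method. The function names, the dot-product scoring, and the way the previous context vector is mixed into the query are all assumptions chosen for illustration; only the baseline inputs (image features plus the previous word vector) and the cross-entropy loss come from the summary.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(features, query):
    # Soft attention: score each image-region feature against the query
    # (dot-product scoring is an assumption), then average the features
    # using the resulting weights.
    weights = softmax([dot(f, query) for f in features])
    dim = len(features[0])
    context = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return weights, context

def baseline_query(prev_word):
    # Prior-art setting from the summary: the query depends only on the
    # word vector from the previous time step.
    return list(prev_word)

def adjacent_step_query(prev_word, prev_context):
    # Hypothetical variant: also mix in the attention context of the
    # adjacent (previous) time step. Not taken from the patent.
    return [w + c for w, c in zip(prev_word, prev_context)]

def cross_entropy(probs, target_index):
    # Per-word cross-entropy loss, the only training signal in the prior art.
    return -math.log(probs[target_index])

# Three image-region features and one previous-word vector (toy values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prev_word = [0.5, -0.5]

w0, c0 = attend(features, baseline_query(prev_word))
w1, c1 = attend(features, adjacent_step_query(prev_word, c0))
```

In this toy setup the second call reuses the context vector `c0` produced by the first, which is one simple way an attention step could take an adjacent time node into account. How the patent actually combines adjacent steps, reconstructs features, or defines the MLL loss is not stated in this record.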