An image subtitle generation method based on MLL and ASCA-FR
Main authors:
Format: Patent
Language: Chinese; English
Keywords:
Online access: Order full text
Summary: The invention discloses an image subtitle generation method based on multi-scale learning (MLL) and adjacent-time-node joint attention mechanism feature reconstruction (ASCA-FR). The invention mainly addresses two shortcomings of the prior art: the output of the attention model at a given moment considers only the feature set of the image and the word vector from the previous moment, and the network is trained with only a cross-entropy loss function, which leads to inaccurate caption descriptions and unsmooth expression. The method comprises the following specific steps: (1) generating a natural image test set and training set; (2) extracting feature vectors; (3) constructing the ASCA-FR network; (4) training the ASCA-FR network; (5) obtaining natural image subtitles. According to the method, the MLL loss function is used to train the constructed ASCA-FR network, so that the generated subtitles are described accurately and expressed smoothly.
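The abstract does not disclose the exact form of the MLL objective or the adjacent-time-node reconstruction. As a purely illustrative aid, the following minimal sketch shows one plausible reading: a caption-training loss that combines per-word cross entropy with a reconstruction penalty tying the attention context at step t to the context at step t-1. All class, tensor, and parameter names (ReconstructionCaptionLoss, recon_weight, contexts, etc.) are assumptions for illustration, not the patent's actual ASCA-FR or MLL definitions.

```python
# Illustrative sketch only: combined word cross-entropy plus an L2
# reconstruction term between attention contexts at adjacent time steps.
# Names and architecture are assumptions, not the patented method.
import torch
import torch.nn as nn


class ReconstructionCaptionLoss(nn.Module):
    """Cross entropy on predicted words plus an L2 term asking the
    attention context at step t to be reconstructable from step t-1."""

    def __init__(self, recon_weight: float = 0.1, context_dim: int = 512):
        super().__init__()
        self.xent = nn.CrossEntropyLoss(ignore_index=0)  # 0 = padding token
        # Hypothetical reconstruction head: predicts context_t from context_{t-1}.
        self.reconstruct = nn.Linear(context_dim, context_dim)
        self.recon_weight = recon_weight

    def forward(self, logits, targets, contexts):
        # logits:   (batch, steps, vocab)  word scores at every time step
        # targets:  (batch, steps)         ground-truth word indices
        # contexts: (batch, steps, dim)    attention context vectors per step
        word_loss = self.xent(logits.flatten(0, 1), targets.flatten())
        predicted_next = self.reconstruct(contexts[:, :-1, :])  # from step t-1
        recon_loss = torch.mean((predicted_next - contexts[:, 1:, :]) ** 2)
        return word_loss + self.recon_weight * recon_loss


if __name__ == "__main__":
    batch, steps, vocab, dim = 4, 12, 1000, 512
    loss_fn = ReconstructionCaptionLoss()
    logits = torch.randn(batch, steps, vocab)
    targets = torch.randint(0, vocab, (batch, steps))
    contexts = torch.randn(batch, steps, dim)
    print(loss_fn(logits, targets, contexts).item())
```

The design intent mirrored here is only the abstract's general claim: supplementing cross-entropy training with an additional term that couples adjacent time steps, so the decoder is not conditioned solely on the image features and the previous word.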