Cross-modal image-text retrieval method and system based on image-text semantic similarity optimization

Bibliographic details
Main authors:	PAN LILI, MA XUEQIANG, LUO YUANJIE, WANG TIAN'E
Format:	Patent
Language:	Chinese; English
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Abstract:	The invention discloses a cross-modal image-text retrieval method based on image-text semantic similarity optimization. The method comprises the following steps: obtaining image and text data to build a training data set; constructing an initial cross-modal image-text retrieval model based on image-text semantic similarity optimization; training the initial model on the training data set to obtain a cross-modal image-text retrieval model; and performing image-text retrieval with the trained cross-modal image-text retrieval model. The invention further discloses a system implementing the method. In the method, a semantic similarity learning module uses a multi-head attention mechanism to extract image features, which improves the accuracy of local similarity. The invention further provides a loss function based on noise label opti
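
The retrieval step named in the abstract can be illustrated with a minimal sketch. It assumes that images and texts have already been mapped into a shared embedding space and that candidates are ranked by cosine similarity; this is a generic approach, not the patented model. The record does not describe the architecture, so the multi-head attention module and the loss function are not shown. The function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def retrieve(query_vec, candidate_vecs, top_k=3):
    """Return (index, score) pairs for the top_k most similar candidates."""
    scored = [(cosine(query_vec, c), i) for i, c in enumerate(candidate_vecs)]
    # Sort by descending score; break ties by candidate index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:top_k]]


# Toy embeddings (assumed values): one text query, three candidate images.
text_query = [1.0, 0.0, 1.0]
image_embeddings = [
    [0.9, 0.1, 0.8],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
]

ranking = retrieve(text_query, image_embeddings, top_k=2)
for index, score in ranking:
    print(f"image {index}: similarity {score:.4f}")
```

The same function serves image-to-text retrieval by passing an image embedding as the query and text embeddings as candidates.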