Cross-modal image-text retrieval method and system based on image-text semantic similarity optimization


Bibliographic details
Main authors: PAN LILI, MA XUEQIANG, LUO YUANJIE, WANG TIAN'E
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses a cross-modal image-text retrieval method based on image-text semantic similarity optimization. The method comprises the following steps: obtaining image and text data to build a training data set; constructing an initial cross-modal image-text retrieval model based on image-text semantic similarity optimization; training the initial model on the training data set to obtain a cross-modal image-text retrieval model; and performing actual image-text retrieval with the trained cross-modal image-text retrieval model. The invention further discloses a system that implements this method. In the method, a semantic similarity learning module applies a multi-head attention mechanism to image feature extraction, which improves the accuracy of local similarity. The invention further provides a loss function based on noise label opti
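
The final retrieval step named in the summary can be illustrated with a minimal sketch. It assumes that trained encoders already map images and texts into a shared embedding space and that candidates are ranked by cosine similarity; the record does not state the patent's actual similarity measure, so that choice, the function names `cosine` and `rank_by_similarity`, and the toy vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Hypothetical embeddings standing in for the outputs of trained
# text and image encoders (not taken from the patent).
text_query = [1.0, 0.0, 0.5]
image_embeddings = [
    [0.0, 1.0, 0.0],  # image 0
    [0.9, 0.1, 0.4],  # image 1
    [0.5, 0.5, 0.5],  # image 2
]

# Text-to-image retrieval: rank all images against the text query.
ranking = rank_by_similarity(text_query, image_embeddings)
print(ranking)
```

Swapping the roles of the query and the candidates gives image-to-text retrieval with the same ranking function.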