Cross-modal image-text retrieval method and system based on image-text semantic similarity optimization
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Abstract: | The invention discloses a cross-modal image-text retrieval method based on image-text semantic similarity optimization. The method comprises the following steps: obtaining image and text data to build a training data set; constructing an initial cross-modal image-text retrieval model based on image-text semantic similarity optimization; training the initial model on the training data set to obtain a cross-modal image-text retrieval model; and performing actual image-text retrieval with the trained cross-modal image-text retrieval model. The invention further discloses a system that implements this method. In the method, a semantic similarity learning module uses a multi-head attention mechanism for image feature extraction, which improves the accuracy of local similarity. The invention further provides a loss function based on noise label opti |
---|
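
The final retrieval step named in the abstract can be illustrated with a minimal sketch. It is not the patented method: the learned, attention-based semantic similarity is replaced here by plain cosine similarity, and the embeddings are assumed to have been produced already by some trained model. The names `l2_normalize`, `cosine_similarity`, `retrieve`, and the toy vectors are hypothetical and do not come from the patent.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(query_embedding, candidate_embeddings, top_k=1):
    """Rank candidates of the other modality by similarity to the query.

    Returns a list of (candidate_index, score) pairs, best match first.
    Assumption: query and candidates live in one shared embedding space.
    """
    scores = [
        (idx, cosine_similarity(query_embedding, cand))
        for idx, cand in enumerate(candidate_embeddings)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy data (hypothetical): one text query and a gallery of three image embeddings.
text_query = [1.0, 0.0, 0.5]
image_gallery = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [2.0, 0.0, 1.0],   # same direction as the query
    [-1.0, 0.0, 0.0],  # points away from the query
]

ranked = retrieve(text_query, image_gallery, top_k=2)
for index, score in ranked:
    print(f"image {index}: similarity {score:.3f}")
```

The same function covers both retrieval directions (text to image, or image to text) by swapping which modality supplies the query and which supplies the candidates.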