Token compression and bidirectional asymmetric matching multi-modal query image retrieval method
Main authors: | , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Abstract: | The invention provides a multi-modal query image retrieval method based on token compression and bidirectional asymmetric matching. The method comprises the following steps. S1: the input image is partitioned into blocks and each block is encoded, and the input text is converted into a token sequence through word embedding; token compression and encoding are then performed on the serialized data. S2: an additional fusion token is added to the resulting context-fused image-modality and text-modality token sequences, and token compression and encoding are performed again. S3: forward accurate matching and reverse fuzzy matching are performed on the single-modality and fused-modality feature representations obtained in step S2, and the matching results are used to guide the learning process of the neural network. S4: the neural network is trained, and the best model weights are retained to compute the feature representations of the test-set data, thereby realizing combined-query image retrieval. According to the method, the multi-modal query semantics can be fully fused, an |
---|
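The abstract does not say how tokens are compressed or how the two matching directions differ, so the following is only an illustrative sketch under stated assumptions, not the patented method. It assumes that token compression keeps the highest-scoring tokens, that "forward accurate matching" is a query-to-target contrastive loss with one-hot labels, and that "reverse fuzzy matching" is a target-to-query loss with smoothed labels. All function names, the `temperature` and `smoothing` parameters, and the scoring scheme are hypothetical.

```python
import math

def compress_tokens(tokens, scores, keep):
    """Assumed compression: keep the `keep` highest-scoring tokens, in original order."""
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    return [tokens[i] for i in sorted(top)]

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def bidirectional_loss(queries, targets, temperature=0.1, smoothing=0.2):
    """Return (forward, reverse) losses for a batch of at least two pairs.

    queries[i] is assumed to be the fused query feature whose correct
    target image feature is targets[i].
    """
    n = len(queries)
    sim = [[cosine(q, t) / temperature for t in targets] for q in queries]

    # Forward direction (assumed "accurate"): one-hot label on the true target.
    forward = -sum(math.log(softmax(sim[i])[i]) for i in range(n)) / n

    # Reverse direction (assumed "fuzzy"): smoothed labels over all queries.
    reverse = 0.0
    for j in range(n):
        probs = softmax([sim[i][j] for i in range(n)])
        for i in range(n):
            label = 1.0 - smoothing if i == j else smoothing / (n - 1)
            reverse -= label * math.log(probs[i])
    reverse /= n
    return forward, reverse

# Toy usage: two perfectly aligned query/target pairs.
kept = compress_tokens(["a", "b", "c", "d"], [0.1, 0.9, 0.5, 0.7], keep=2)
fwd, rev = bidirectional_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In this sketch the two directions are asymmetric only in their labels: the forward loss penalizes any probability mass off the true target, while the reverse loss tolerates some mass on other queries. In training, the two terms would presumably be combined into a single objective; how they are weighted is not stated in the abstract.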