AUTOMATIC IMAGE SELECTION WITH CROSS MODAL MATCHING
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | eng ; fre ; ger |
Summary: | The present technology pertains to a multi-modal transformer model that is designed and trained to perform cross-modal tasks such as image-text matching, and that is then further refined with data for the model's particular downstream use case. More specifically, the present technology can refine the underlying model with labeled examples derived from a dataset of text-image pairs that achieved a desired interaction in the proper context. For example, in the use case of advertising applications in an App store, the underlying model can be refined with examples of images used to advertise applications in the App store where the respective invitational content was clicked or converted. |
---|
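The summary describes deriving labeled examples from text-image pairs that achieved a desired interaction. The following is a minimal sketch of that labeling step only, not the patented method. The `Impression` schema, the `label_pairs` helper, and the rule that a pair which was neither clicked nor converted counts as a negative are all assumptions made for illustration; the record does not say how negatives are chosen or how the refinement itself is performed.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Impression:
    """One logged display of invitational content (hypothetical schema)."""
    app_text: str    # text describing the advertised application
    image_id: str    # identifier of the image that was shown
    clicked: bool    # the content was clicked
    converted: bool  # the click led to a conversion


def label_pairs(impressions: Iterable[Impression]) -> List[Tuple[str, str, int]]:
    """Turn logged impressions into (text, image_id, label) examples.

    Assumption: a pair is labeled 1 if it was clicked or converted,
    and 0 otherwise.
    """
    return [
        (imp.app_text, imp.image_id, int(imp.clicked or imp.converted))
        for imp in impressions
    ]


# Made-up log entries, for illustration only.
log = [
    Impression("Puzzle game with daily challenges", "img_001", True, False),
    Impression("Puzzle game with daily challenges", "img_002", False, False),
    Impression("Budget tracker for families", "img_101", True, True),
]

examples = label_pairs(log)
for text, image_id, label in examples:
    print(label, image_id, text)
```

Examples in this form could then serve as refinement data for an image-text matching model pretrained on generic cross-modal data, which is the two-stage arrangement the summary describes.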