AUTOMATIC IMAGE SELECTION WITH CROSS MODAL MATCHING

The present technology pertains to a multi-modal transformer model that is designed and trained to perform cross-modal tasks such as image-text matching, wherein the model is further refined with data for the particular downstream use case of the model. More specifically, the present technology can...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: GU, Xiaoyuan, THIYAGARAJAN, Kailash, KHURD, Parmeshwar, KIM, Alex J, HUANG, Jia, MONARCH, Robert J, KWAC, Jungsuk
Format: Patent
Sprache:eng ; fre ; ger
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The present technology pertains to a multi-modal transformer model that is designed and trained to perform cross-modal tasks such as image-text matching, wherein the model is further refined with data for the particular downstream use case of the model. More specifically, the present technology can refine the underlying model with labeled examples derived from a dataset of text-image pairs that ultimately achieved a desired interaction in the proper context. For example, in the use case of advertising applications in an App store, the present technology can refine the underlying model with examples of images used to advertise applications in the App store where the respective invitational content was clicked or converted.