METHOD AND SYSTEM FOR VISIO-LINGUISTIC UNDERSTANDING USING CONTEXTUAL LANGUAGE MODEL REASONERS

Bibliographic details
Main authors:	Vadakkeeveetil Sreelatha, Silpa; Karande, Shirish Subhash; KALRA, Kanika; Patwardhan, Manasi; Kurma, Sai Sree Bhargav
Format:	Patent
Language:	eng; fre; ger
Subjects:	CALCULATING; COMPUTING; COUNTING; PHYSICS

Description
Summary:	This disclosure relates generally to visio-linguistic understanding. Conventional methods use a contextual visio-linguistic reasoner for visio-linguistic understanding, which requires more compute power and a large amount of pre-training data. Embodiments of the present disclosure provide a method for visio-linguistic understanding using a contextual language model reasoner. The method converts the visual information of an input image into a format that the contextual language model reasoner understands and accepts for a downstream task. The method uses the image captions and the confidence scores associated with those captions, along with a knowledge graph, to obtain a combined input in a format compatible with the contextual language model reasoner. Contextual embeddings corresponding to the downstream task are obtained using the combined input. The disclosed method can be used to solve several downstream tasks, such as scene understanding, visual question answering, and visual common-sense reasoning.
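The summary does not specify how the combined input is laid out, so the following is only a minimal sketch of the idea under stated assumptions: captions are filtered and ranked by their confidence scores, related facts are looked up in a toy knowledge graph, and everything is joined into one text sequence that a text-only language model could accept. The names `Caption`, `select_captions`, `lookup_facts`, `build_combined_input`, the `[SEP]` separator, the 0.5 threshold, and the word-level graph lookup are all hypothetical and are not taken from the patent. The step that feeds the combined input to the reasoner to obtain contextual embeddings is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Caption:
    text: str          # caption produced by some image-captioning model (assumed)
    confidence: float  # confidence score attached to the caption, assumed in [0, 1]


# Hypothetical toy knowledge graph: concept -> list of (relation, object) edges.
KnowledgeGraph = Dict[str, List[Tuple[str, str]]]


def select_captions(captions: List[Caption],
                    min_confidence: float = 0.5,
                    top_k: int = 3) -> List[Caption]:
    """Keep captions at or above the threshold, highest confidence first."""
    kept = [c for c in captions if c.confidence >= min_confidence]
    kept.sort(key=lambda c: c.confidence, reverse=True)
    return kept[:top_k]


def lookup_facts(captions: List[Caption], graph: KnowledgeGraph) -> List[str]:
    """Collect graph facts for every caption word that is a graph node."""
    facts: List[str] = []
    for caption in captions:
        for word in caption.text.lower().split():
            for relation, obj in graph.get(word, []):
                fact = f"{word} {relation} {obj}"
                if fact not in facts:  # avoid repeating the same fact
                    facts.append(fact)
    return facts


def build_combined_input(captions: List[Caption],
                         graph: KnowledgeGraph,
                         task_text: str,
                         sep: str = " [SEP] ") -> str:
    """Join task text, selected captions, and graph facts into one sequence."""
    selected = select_captions(captions)
    facts = lookup_facts(selected, graph)
    segments = [task_text, " . ".join(c.text for c in selected)]
    if facts:
        segments.append(" . ".join(facts))
    return sep.join(segments)


# Illustrative data only; none of it comes from the patent.
captions = [
    Caption("a man riding a horse", 0.92),
    Caption("a dog", 0.12),
    Caption("a person on a beach", 0.71),
]
graph = {"horse": [("is an", "animal")], "beach": [("is near", "sea")]}
combined = build_combined_input(captions, graph, "what is the man riding?")
print(combined)
```

For a visual question answering task, the question plays the role of `task_text`; a different downstream task would supply its own text segment, and a real system would replace the separator with whatever special tokens its chosen language model expects.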