GROUNDED MULTIMODAL AGENT INTERACTIONS

Bibliographic Details
Main authors: DESGARENNES, Gabriel A.; RAO, Sudha; VOLUM, Ryan; BROCKETT, Christopher John; DOLAN, William B.
Format: Patent
Language: English; French; German
Description
Summary: Aspects of the present disclosure relate to grounded multimodal agent interactions, in which a user input is processed using a multimodal machine learning model to generate model output. The model output may then be processed to affect the behavior of an application, for example to enable a user to control the application and/or to facilitate user interactions with a conversational agent, among other examples. In some instances, at least a part of the model output may be executed or parsed, for example to call an application programming interface (API) or function of the application. Use of a multimodal machine learning model according to aspects described herein may thus enable user-provided natural language input to affect the behavior of an application.
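
The parse-and-call step mentioned in the summary could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the patent: it assumes the model emits either plain text or a JSON object with "function" and "arguments" keys, and the names move_player, say, REGISTRY, and dispatch are all hypothetical.

```python
import json
from typing import Callable, Dict


# Hypothetical application functions; the names are illustrative only.
def move_player(direction: str, steps: int = 1) -> str:
    return f"moved {direction} by {steps}"


def say(text: str) -> str:
    return f"agent says: {text}"


# Allow-list of application functions that model output may invoke.
REGISTRY: Dict[str, Callable[..., str]] = {
    "move_player": move_player,
    "say": say,
}


def dispatch(model_output: str) -> str:
    """Parse model output (assumed to be JSON) and call the named function.

    Output that is not a JSON object is treated as conversational text.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return say(model_output)
    if not isinstance(call, dict):
        return say(model_output)

    name = call.get("function")
    if name not in REGISTRY:
        # Refuse anything outside the allow-list instead of executing it.
        raise ValueError(f"unknown function: {name!r}")
    return REGISTRY[name](**call.get("arguments", {}))
```

Restricting calls to an explicit registry, rather than executing model output directly, is one conservative design choice; the abstract itself leaves open whether output is executed or parsed.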