Interpretability of deep reinforcement learning models in assistant systems

Bibliographic details
Main authors:	Shah, Pararth Paresh; Liu, Honglei; Li, Wenxuan; Yang, Wenhai; Kumar, Anuj
Format:	Patent
Language:	English
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR

Description
Abstract:	In one embodiment, a method includes training a target machine-learning model iteratively by: accessing training data of content objects; training an intermediate machine-learning model that outputs contextual evaluation measurements based on the training data; generating state-indications associated with the training data, wherein the state-indications comprise user-intents, system actions, and user actions; training the target machine-learning model based on the contextual evaluation measurements, the state-indications, and an action set comprising possible system actions; extracting rules based on the target machine-learning model by a sequential pattern-mining model; generating synthetic training data based on the rules; updating the training data by adding the synthetic training data to the training data; determining whether a completion condition is reached for the training; and, if the completion condition is reached, returning the target machine-learning model, or otherwise repeating the iterative training of the target machine-learning model.
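
The control flow of the iterative training described in the abstract can be sketched as follows. This is only an illustrative skeleton, not the patented implementation: the function name `train_target_iteratively`, every callable parameter, and the `max_rounds` safety cap are hypothetical names introduced here, and the abstract does not specify how any individual step is realized.

```python
from typing import Any, Callable, List, Sequence


def train_target_iteratively(
    training_data: Sequence[Any],
    action_set: Sequence[str],
    train_intermediate: Callable[[List[Any]], Callable[[Any], Any]],
    generate_state_indications: Callable[[List[Any]], List[Any]],
    train_target: Callable[[List[Any], List[Any], Sequence[str]], Any],
    extract_rules: Callable[[Any], List[Any]],
    synthesize: Callable[[List[Any]], List[Any]],
    is_complete: Callable[[Any, List[Any]], bool],
    max_rounds: int = 10,  # assumption: a cap so the sketch always terminates
) -> Any:
    """Run the loop from the abstract; all step functions are caller-supplied stubs."""
    data = list(training_data)
    target = None
    for _ in range(max_rounds):
        # Intermediate model that outputs contextual evaluation measurements.
        evaluator = train_intermediate(data)
        measurements = [evaluator(example) for example in data]

        # State-indications (user-intents, system actions, user actions).
        states = generate_state_indications(data)

        # Target model trained on measurements, state-indications, and action set.
        target = train_target(measurements, states, action_set)

        # Rules extracted from the target model (sequential pattern mining in the
        # abstract; here delegated to a hypothetical callable).
        rules = extract_rules(target)

        # Synthetic data generated from the rules is added to the training data.
        data.extend(synthesize(rules))

        # Completion check: return the model if reached, otherwise repeat.
        if is_complete(target, data):
            return target
    return target
```

Passing each step in as a callable keeps the sketch independent of any particular model library; a caller would supply its own training, rule-mining, and data-synthesis functions along with a completion test such as a validation threshold.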