DATASET GENERATION USING LARGE LANGUAGE MODELS

Disclosed are systems and techniques that may generate datasets for training task-oriented dialogue systems. The techniques include generating natural language queries by selecting a template query, sampling one or more tokens from a data store of domain-specific tokens, modifying the selected templ...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Nagaraju, Divija, Parisien, Christopher
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Disclosed are systems and techniques that may generate datasets for training task-oriented dialogue systems. The techniques include generating natural language queries by selecting a template query, sampling one or more tokens from a data store of domain-specific tokens, modifying the selected template query using the one or more sampled tokens to generate a query prompt, and using a natural language generative machine-learning model to generate, based on the query prompt, a respective natural language query of the subset of the plurality of natural language queries, and causing the generated plurality of natural language queries to be provided to a machine-learning model training engine configured to train, using the generated plurality of natural language queries, a conversational machine-learning model to perform a domain-specific conversational task.