DATASET GENERATION USING LARGE LANGUAGE MODELS
Format: | Patent |
---|---|
Language: | English |
Summary: | Disclosed are systems and techniques for generating datasets to train task-oriented dialogue systems. Natural language queries are generated by (1) selecting a template query, (2) sampling one or more tokens from a data store of domain-specific tokens, (3) modifying the selected template query with the sampled tokens to produce a query prompt, and (4) using a natural language generative machine-learning model to generate a natural language query from that query prompt. The generated queries are then provided to a machine-learning model training engine, which uses them to train a conversational machine-learning model to perform a domain-specific conversational task. |
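
The generation steps named in the summary (select a template, sample domain tokens, build a query prompt, call a generative model) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the template wording, the slot names `cuisine` and `city`, the token values, and the function names. The generative model is replaced by a stand-in callable so the example runs without any external service.

```python
import random
import string
from typing import Callable, Dict, List

# Hypothetical template queries; the {slot} names are illustrative only.
TEMPLATES: List[str] = [
    "Write one customer request to book a {cuisine} restaurant in {city}.",
    "Write one customer question about {cuisine} places open late in {city}.",
]

# Hypothetical data store of domain-specific tokens, keyed by slot name.
TOKEN_STORE: Dict[str, List[str]] = {
    "cuisine": ["Thai", "Italian", "Ethiopian"],
    "city": ["Lisbon", "Austin", "Osaka"],
}


def slot_names(template: str) -> List[str]:
    """Return the placeholder names that appear in a template string."""
    return [field for _, field, _, _ in string.Formatter().parse(template) if field]


def build_query_prompt(rng: random.Random) -> str:
    """Select a template, sample one token per slot, and fill the template."""
    template = rng.choice(TEMPLATES)
    sampled = {slot: rng.choice(TOKEN_STORE[slot]) for slot in slot_names(template)}
    return template.format(**sampled)


def generate_queries(generate: Callable[[str], str], n: int, seed: int = 0) -> List[str]:
    """Build n query prompts and pass each one to the generative model."""
    rng = random.Random(seed)
    return [generate(build_query_prompt(rng)) for _ in range(n)]


def echo_model(prompt: str) -> str:
    """Stand-in for a generative model; a real system would call an LLM here."""
    return f"[generated from] {prompt}"


# The resulting list is what would be handed to a training engine.
queries = generate_queries(echo_model, n=4)
```

Passing the model as a callable keeps prompt construction separate from generation, so the stand-in can be swapped for a real model client without changing the sampling logic. The fixed seed only makes the sketch reproducible.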