A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Building user simulators (USs) for reinforcement learning (RL) of
task-oriented dialog systems (DSs) has attracted growing attention, but still
faces several fundamental challenges. First, it is unclear whether pretrained
language models can be leveraged to design USs, for example GPT-2 based USs,
that can catch up and interact with the recently advanced GPT-2 based DSs.
Second, an important requirement for a US is that the user goal be effectively
incorporated and tracked; how to flexibly integrate goal state tracking and
develop an end-to-end trainable US for multiple domains remains a challenge.
In this work, we propose a generative user simulator (GUS) with a GPT-2 based
architecture and goal state tracking, aimed at addressing these two
challenges. Extensive experiments are conducted on MultiWOZ2.1. Different DSs
are trained via RL with GUS, the classic agenda-based user simulator (ABUS),
and other ablation simulators, and are compared in cross-model evaluation,
corpus-based evaluation, and human evaluation. GUS achieves superior results
in all three evaluations. |
---|---|
DOI: | 10.48550/arxiv.2210.08692 |