Modeling hierarchical attention interaction between contexts and triple-channel encoding networks for document-grounded dialog generation

Bibliographic details
Published in: Frontiers in Physics 2022-10, Vol.10
Main authors: Cai, Yuanyuan; Zuo, Min; Xiong, Haitao
Format: Article
Language: English
Online access: Full text
Description
Summary: Dialog systems have attracted attention because they are promising for many intelligent applications, and generating fluent, informative responses is critical to them. Some recent studies introduce documents as extra knowledge to improve dialog generation. However, it is hard to understand an unstructured document and to extract the information relevant to the dialog history and the current utterance, which leads to uninformative and inflexible responses in existing studies. To address this issue, we propose a generative neural network model with an attention mechanism for document-grounded multi-turn dialog. The model encodes the utterance context, which comprises the given document, the dialog history, and the last utterance, into distributed representations via a triple-channel encoding network. It then introduces a hierarchical attention interaction between the dialog contexts and previously generated utterances into the decoder to generate an appropriate response. We compare our model with various baselines on the CMU_DoG dataset in terms of the evaluation criteria. The experimental results demonstrate the state-of-the-art performance of our model compared to previous studies. Furthermore, the results of ablation experiments show the effectiveness of both the hierarchical attention interaction and the triple-channel encoding. We also conduct a human evaluation to judge the informativeness of responses and their consistency with the dialog history.
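The idea of attending over three encoded channels in two levels can be illustrated with a minimal sketch. This is not the authors' model: the abstract gives no equations, so the dot-product scoring, the two-level scheme (tokens within a channel, then the channel summaries), the toy 2-dimensional vectors, and all function names below are assumptions chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, vectors):
    """Dot-product attention: returns (weights, weighted sum of vectors)."""
    weights = softmax([dot(query, v) for v in vectors])
    dim = len(vectors[0])
    context = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, context

def hierarchical_attention(query, channels):
    """Two-level attention (an assumed scheme, not the paper's exact one)."""
    names = list(channels)
    # Level 1: summarize each channel by attending over its token vectors.
    channel_contexts = [attend(query, channels[n])[1] for n in names]
    # Level 2: attend over the per-channel summaries to fuse them.
    channel_weights, fused = attend(query, channel_contexts)
    return dict(zip(names, channel_weights)), fused

# Toy encoder outputs for the three channels (hypothetical values).
channels = {
    "document": [[1.0, 0.0], [0.0, 1.0]],
    "history": [[0.5, 0.5]],
    "last_utterance": [[1.0, 1.0], [0.0, 0.0]],
}
query = [1.0, 0.0]  # stands in for a decoder state
weights, fused = hierarchical_attention(query, channels)
print(weights, fused)
```

In a trained model the query would come from the decoder state at each generation step, and the channel vectors from learned encoders; here they are fixed toy values so the weighting logic can be inspected directly.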
ISSN:2296-424X
DOI:10.3389/fphy.2022.1019969