Latent Question Interpretation Through Variational Adaptation

Bibliographic details
Published in:	IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019-11, Vol. 27 (11), p. 1713-1724
Main authors:	Parshakova, Tetiana; Rameau, Francois; Serdega, Andriy; Kweon, In So; Kim, Dae-Shik
Format:	Article
Language:	English
Subjects:	Adaptation models; Artificial neural networks; Datasets; discrete latent variable; Feature extraction; Indexes; information retrieval; Knowledge discovery; Neural networks; neural variational inference; policy gradient; Question answering; Questions; semi-supervised learning; Speech processing; Training

Description
Abstract:	Most artificial neural network models for question answering rely on complex attention mechanisms. These techniques demonstrate high performance on existing datasets; however, they are limited in their ability to capture natural language variability and to generate diverse, relevant answers. To address this limitation, we propose a model that learns multiple interpretations of a given question. This diversity is ensured by our interpretation policy module, which automatically adapts the parameters of a question-answering model with respect to a discrete latent variable. This variable follows the distribution of interpretations learned by the interpretation policy through a semi-supervised variational inference framework. To boost the performance further, the resulting policy is fine-tuned with a policy gradient, using rewards derived from answer accuracy. We demonstrate the relevance and efficiency of our model through a large panel of experiments. Qualitative results, in particular, underline the ability of the proposed architecture to discover multiple interpretations of a question. When tested on the Stanford Question Answering Dataset (SQuAD) 1.1, our model outperforms the baseline methods in finding multiple and diverse answers. To assess our strategy from a human standpoint, we also conduct a large-scale user study. This study highlights the ability of our network to produce diverse and coherent answers compared to existing approaches. Our PyTorch implementation is available as open source at github.com/parshakova/APIP.
ISSN:	2329-9290, 2329-9304
DOI:	10.1109/TASLP.2019.2929647
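
The abstract mentions fine-tuning a policy over a discrete latent variable with a policy gradient, using answer accuracy as the reward. The following is a minimal, standard-library sketch of that general idea (a REINFORCE update on a categorical policy), not the authors' implementation. It assumes a policy that does not condition on the question and does not adapt any question-answering model; the names `reinforce_step`, `train_policy`, and the `toy_rewards` values are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Convert unnormalized scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_interpretation(probs, rng):
    """Draw one discrete latent value z from the categorical policy."""
    r = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return k
    return len(probs) - 1


def reinforce_step(logits, reward_fn, rng, lr=0.5, baseline=0.0):
    """One REINFORCE update: logits += lr * (reward - baseline) * grad log pi(z)."""
    probs = softmax(logits)
    z = sample_interpretation(probs, rng)
    advantage = reward_fn(z) - baseline
    # For a softmax policy, d log pi(z) / d logit_k = 1[k == z] - probs[k].
    new_logits = [
        logit + lr * advantage * ((1.0 if k == z else 0.0) - probs[k])
        for k, logit in enumerate(logits)
    ]
    return new_logits, z


def train_policy(steps=500, seed=0):
    """Train on made-up rewards; returns the final interpretation probabilities."""
    rng = random.Random(seed)
    # Hypothetical per-interpretation answer accuracies (assumption, not from the paper).
    toy_rewards = [0.2, 0.9, 0.4]
    baseline = sum(toy_rewards) / len(toy_rewards)  # simple variance-reduction baseline
    logits = [0.0] * len(toy_rewards)
    for _ in range(steps):
        logits, _ = reinforce_step(
            logits, lambda z: toy_rewards[z], rng, baseline=baseline
        )
    return softmax(logits)


if __name__ == "__main__":
    print(train_policy())
```

In this toy setting the policy shifts probability mass toward the interpretation with the highest reward. The model described in the abstract would instead condition the policy on the question and use the sampled value to adapt the parameters of the question-answering model.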