Inflected Forms Are Redundant in Question Generation Models
Format: | Article |
---|---|
Language: | English |
Abstract: | Neural models with an encoder-decoder framework provide a feasible solution
to Question Generation (QG). However, after analyzing the model vocabulary, we
find that in current models (both RNN-based and pre-training-based) more than
23% of the vocabulary consists of inflected forms. As a result, the encoder
generates separate embeddings for the inflected forms, wasting training data and
parameters. Even worse, during decoding these models are vulnerable to irrelevant
noise and suffer from high computational costs. In this paper, we propose an
approach that enhances QG performance by fusing word transformation.
First, we identify the inflected forms of words in the encoder input and
replace them with their root words, letting the encoder focus on the recurring
root words. Second, we recast QG in the encoder-decoder framework as a
combination of the following actions: generating a question word, copying a
word from the source sequence, or generating a word-transformation type. This
extension greatly reduces the size of the decoder's prediction vocabulary as
well as the noise it faces. We apply our approach to a typical RNN-based model
and to UniLM to obtain improved versions, and conduct extensive experiments on
the SQuAD and MS MARCO datasets. The results show that the improved versions
significantly outperform the corresponding baselines in terms of BLEU, ROUGE-L,
and METEOR, as well as time cost. |
---|---|
DOI: | 10.48550/arxiv.2301.00397 |
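
The first step described in the abstract, replacing inflected forms in the encoder input with their root words, amounts to lemmatizing the source text while remembering how each word was inflected. Below is a minimal sketch of that preprocessing, assuming spaCy as the lemmatizer and using fine-grained POS tags as stand-in transformation types; the paper does not prescribe a particular tool or tag inventory, so both choices are illustrative.

```python
# Sketch: replace inflected forms with root words before encoding, keeping a
# per-token tag that records how the root would need to be re-inflected.
# spaCy is only an example lemmatizer here, not the paper's stated tool.
import spacy

nlp = spacy.load("en_core_web_sm")

def to_root_forms(text: str):
    """Return (root_tokens, transformation_tags) for a source sentence.

    A tag of "NONE" means the token was already in its root form; otherwise
    the fine-grained POS tag (e.g. "NNS", "VBD") stands in for a
    word-transformation type.
    """
    doc = nlp(text)
    roots, tags = [], []
    for tok in doc:
        roots.append(tok.lemma_.lower())
        tags.append(tok.tag_ if tok.lemma_.lower() != tok.text.lower() else "NONE")
    return roots, tags

roots, tags = to_root_forms("The players scored two goals in the final minutes.")
# e.g. roots -> ['the', 'player', 'score', 'two', 'goal', 'in', 'the', 'final', 'minute', '.']
#      tags  -> ['NONE', 'NNS', 'VBD', 'NONE', 'NNS', 'NONE', 'NONE', 'NONE', 'NNS', 'NONE']
# (exact output depends on the spaCy model used)
```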
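The second step restricts the decoder to a small action space: generate a question word, copy a root word from the (lemmatized) source, or emit a word-transformation type that re-inflects the previously produced word. The sketch below illustrates how such an action sequence could be realized into a surface question; the action names (GEN, COPY, TRANSFORM) and the toy re-inflection rules are hypothetical stand-ins, not the paper's exact formulation.

```python
# Toy realization of a reduced decoder action space: GEN a question word,
# COPY a root word from the lemmatized source, or TRANSFORM (re-inflect)
# the word produced at the previous step.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str      # "GEN" | "COPY" | "TRANSFORM"
    value: object  # question word (str), source index (int), or transformation type (str)

def reinflect(root: str, kind: str) -> str:
    """Tiny hand-written re-inflection rules; a real system would predict or look these up."""
    if kind == "PLURAL":
        return root + "es" if root.endswith(("s", "ch", "sh", "x")) else root + "s"
    if kind == "PAST":
        return root + "d" if root.endswith("e") else root + "ed"
    return root

def realize(actions: list[Action], source_roots: list[str]) -> str:
    """Turn a sequence of decoder actions into a surface question string."""
    out: list[str] = []
    for act in actions:
        if act.kind == "GEN":
            out.append(act.value)
        elif act.kind == "COPY":
            out.append(source_roots[act.value])
        elif act.kind == "TRANSFORM" and out:
            out[-1] = reinflect(out[-1], act.value)
    return " ".join(out)

# Lemmatized source: "the player score two goal in the final minute"
source_roots = ["the", "player", "score", "two", "goal", "in", "the", "final", "minute"]
actions = [
    Action("GEN", "how"), Action("GEN", "many"),
    Action("COPY", 4), Action("TRANSFORM", "PLURAL"),                     # goal  -> goals
    Action("GEN", "did"),
    Action("COPY", 0), Action("COPY", 1), Action("TRANSFORM", "PLURAL"),  # player -> players
    Action("COPY", 2), Action("GEN", "?"),
]
print(realize(actions, source_roots))  # how many goals did the players score ?
```

Because the decoder only chooses among question words, source positions, and a handful of transformation types, its output space is far smaller than a full word-level vocabulary, which is presumably where the reported reduction in time cost comes from.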