Multi-task Stack Propagation for Neural Quality Estimation
Quality estimation is an important task in machine translation that has attracted increased interest in recent years. A key problem in translation-quality estimation is the lack of a sufficient amount of the quality annotated training data. To address this shortcoming, the Predictor-Estimator was pr...
Gespeichert in:
Veröffentlicht in: | ACM transactions on Asian and low-resource language information processing 2019-08, Vol.18 (4), p.1-18 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Quality estimation is an important task in machine translation that has attracted increased interest in recent years. A key problem in translation-quality estimation is the lack of a sufficient amount of the quality annotated training data. To address this shortcoming, the
Predictor-Estimator
was proposed recently by introducing “word prediction” as an additional pre-subtask that predicts a current target word with consideration of surrounding source and target contexts, resulting in a two-stage neural model composed of a
predictor
and an
estimator
. However, the original Predictor-Estimator is not trained on a continuous stacking model but instead in a cascaded manner that separately trains the predictor from the estimator. In addition, the Predictor-Estimator is trained based on single-task learning only, which uses target-specific quality-estimation data without using other training data that are available from other-level quality-estimation tasks. In this article, we thus propose a
multi-task stack propagation
, which extensively applies
stack propagation
to fully train the Predictor-Estimator on a continuous stacking architecture and
multi-task learning
to enhance the training data from related other-level quality-estimation tasks. Experimental results on WMT17 quality-estimation datasets show that the Predictor-Estimator trained with multi-task stack propagation provides statistically significant improvements over the baseline models. In particular, under an ensemble setting, the proposed multi-task stack propagation leads to state-of-the-art performance at all the sentence/word/phrase levels for WMT17 quality estimation tasks. |
---|---|
ISSN: | 2375-4699 2375-4702 |
DOI: | 10.1145/3321127 |