ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | We propose to train a non-autoregressive machine translation model to
minimize the energy defined by a pretrained autoregressive model. In
particular, we view our non-autoregressive translation system as an inference
network (Tu and Gimpel, 2018) trained to minimize the autoregressive teacher
energy. This contrasts with the popular approach of training a
non-autoregressive model on a distilled corpus consisting of the beam-searched
outputs of such a teacher model. Our approach, which we call ENGINE
(ENerGy-based Inference NEtworks), achieves state-of-the-art non-autoregressive
results on the IWSLT 2014 DE-EN and WMT 2016 RO-EN datasets, approaching the
performance of autoregressive models. |
---|---|
DOI: | 10.48550/arxiv.2005.00850 |
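
The idea in the summary, scoring the parallel output of a non-autoregressive model with the energy of an autoregressive teacher, can be illustrated with a minimal sketch. It assumes the energy is the teacher's negative log-likelihood evaluated on relaxed (softmax) token distributions; the record does not spell out the exact form. The bigram-style teacher, `WEIGHTS`, `BOS`, and all function names are hypothetical stand-ins for illustration, not the authors' implementation.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a plain list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def softmax(logits):
    return [math.exp(v) for v in log_softmax(logits)]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def teacher_log_probs(prev_dist, weights):
    # Hypothetical toy teacher: next-token log-probabilities conditioned on a
    # (possibly soft) distribution over the previous token only.
    vocab = len(weights)
    logits = [sum(prev_dist[i] * weights[i][j] for i in range(vocab))
              for j in range(vocab)]
    return log_softmax(logits)

def relaxed_energy(dists, weights, bos):
    # Assumed energy: E = -sum_t q_t . log p_teacher(. | q_{t-1}),
    # where q_t is the student's distribution at position t.
    energy = 0.0
    prev = bos
    for q in dists:
        log_p = teacher_log_probs(prev, weights)
        energy -= sum(qi * lp for qi, lp in zip(q, log_p))
        prev = q
    return energy

# Toy teacher over a 3-token vocabulary; token 0 doubles as beginning-of-sequence.
WEIGHTS = [[0.0, 2.0, 0.0],
           [0.0, 0.0, 2.0],
           [2.0, 0.0, 0.0]]
BOS = one_hot(0, 3)

# Stand-in for a non-autoregressive model: logits for all positions at once.
student_logits = [[0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
student_dists = [softmax(row) for row in student_logits]
energy_soft = relaxed_energy(student_dists, WEIGHTS, BOS)

# With one-hot inputs the energy reduces to the teacher's negative
# log-likelihood of the discrete sequence [1, 2].
energy_hard = relaxed_energy([one_hot(1, 3), one_hot(2, 3)], WEIGHTS, BOS)
print(energy_soft, energy_hard)
```

In an actual training setup this energy would be minimized with respect to the parameters of the non-autoregressive model by gradient descent, which needs an automatic differentiation library; that step is omitted here to keep the sketch dependency-free.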