Incorporating history and future into non-autoregressive machine translation

Bibliographic Details
Published in: Computer Speech & Language, 2023-01, Vol. 77, Article 101439
Authors: Wang, Shuheng; Huang, Heyan; Shi, Shumin
Format: Article
Language: English
Description

Abstract: In non-autoregressive machine translation (NAT), the decoder generates all target tokens in one shot. Although this decoding process significantly reduces decoding latency, NAT still suffers a loss of translation accuracy. We argue that the reason for this decrease is the lack of target-side dependencies, that is, history and future information, between target tokens. In this work, we propose a novel method to address this problem. We posit that the decoder's hidden representation of a target token should consist of three parts: history, present, and future information, and we dynamically aggregate this parts-to-whole information with a capsule network to improve the performance of non-autoregressive machine translation. In addition, to ensure the capsules learn the information we expect, we introduce an autoregressive decoder. Experiments on benchmark tasks demonstrate that explicitly modeling history and future information significantly improves the performance of the NAT model, and extensive analyses show that our model learns history and future information as expected.

Highlights:
• To address the NAT model's inability to model target dependencies, we adapt a capsule network to help the model learn history and future information (a minimal sketch follows below).
• To ensure the capsule network learns as we expect, we design an autoregressive decoder to extract this information.
• A series of experiments on benchmark tasks shows that our model improves translation performance.
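The abstract describes a parts-to-whole aggregation in which history, present, and future representations of each target token are combined into one decoder state by a capsule network. The sketch below shows how such an aggregation could look using dynamic routing in the style of Sabour et al. (2017); the class name, tensor shapes, number of routing iterations, and the normalization of routing weights over the three parts are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): route three per-token "part" capsules
# (history, present, future) into one "whole" capsule per target position.
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    """Capsule squashing non-linearity: preserves direction, bounds the norm in [0, 1)."""
    sq_norm = (s * s).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

class PartsToWholeCapsule(nn.Module):
    """Aggregates n_parts part-capsules per target position into a single
    whole-capsule that can serve as the decoder hidden state (assumption)."""
    def __init__(self, d_model, n_parts=3, n_iters=3):
        super().__init__()
        self.n_iters = n_iters
        # One transformation matrix per part capsule, shared across positions (assumption).
        self.W = nn.Parameter(torch.randn(n_parts, d_model, d_model) * d_model ** -0.5)

    def forward(self, parts):
        # parts: (batch, seq_len, n_parts, d_model)
        # Prediction vectors u_hat[i] = W_i @ part_i for the single output capsule.
        u_hat = torch.einsum('btnd,nde->btne', parts, self.W)
        b = torch.zeros(parts.shape[:3], device=parts.device)  # routing logits
        for _ in range(self.n_iters):
            # Normalizing over the three parts is an attention-style simplification:
            # canonical routing normalizes over output capsules, which is trivial
            # when there is only one whole capsule.
            c = F.softmax(b, dim=-1)                           # coupling coefficients
            s = (c.unsqueeze(-1) * u_hat).sum(dim=2)           # weighted sum of predictions
            v = squash(s)                                      # whole capsule: (batch, seq, d_model)
            b = b + torch.einsum('btnd,btd->btn', u_hat, v)    # agreement update
        return v

# Usage example with made-up sizes: 2 sentences, 5 target positions, d_model = 8.
hist, pres, fut = (torch.randn(2, 5, 8) for _ in range(3))
caps = PartsToWholeCapsule(d_model=8)
hidden = caps(torch.stack([hist, pres, fut], dim=2))  # -> (2, 5, 8)
```

In this reading, the routing iterations decide, per token, how much each of the three parts contributes to the final state; the paper's auxiliary autoregressive decoder would then supervise the history and future parts so that they carry the intended information.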
ISSN: 0885-2308
eISSN: 1095-8363
DOI: 10.1016/j.csl.2022.101439