Multilingual Neural Machine Translation for Zero-Resource Languages
Main authors:
Format: Article
Language: English
Online access: Order full text
Abstract: In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, and has thus quickly become the state of the art in machine translation (MT). However, NMT systems are limited in translating low-resourced languages, due to the significant amount of parallel data required to learn useful mappings between languages. In this work, we show how so-called multilingual NMT can help to tackle the challenges associated with low-resourced language translation. The underlying principle of multilingual NMT is to force the creation of hidden representations of words in a shared semantic space across multiple languages, thus enabling positive parameter transfer across languages. Along this direction, we present multilingual translation experiments with three languages (English, Italian, Romanian) covering six translation directions, using both recurrent neural networks and transformer (or self-attentive) neural networks.
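The abstract leaves the sharing mechanism implicit. A common way to realize it, sketched below under that assumption, is to prepend an artificial target-language token to every source sentence so that a single encoder-decoder is trained on all directions jointly; the token format `<2xx>` and the helper `tag_for_multilingual` are illustrative names, not taken from the paper.

```python
def tag_for_multilingual(pairs):
    """Prepend an artificial target-language token to each source
    sentence, so that one model can be trained on all translation
    directions at once and share a single semantic space."""
    return [(f"<2{tgt_lang}> {src}", tgt) for src, tgt, tgt_lang in pairs]

# Toy corpus covering three of the six English/Italian/Romanian directions.
corpus = [
    ("Hello world", "Ciao mondo", "it"),   # English -> Italian
    ("Ciao mondo", "Hello world", "en"),   # Italian -> English
    ("Hello world", "Salut lume", "ro"),   # English -> Romanian
]

for src, tgt in tag_for_multilingual(corpus):
    print(src, "->", tgt)
```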
We then focus on the zero-shot translation problem, that is, how to leverage multilingual data in order to learn translation directions that are not covered by the available training material. To this end, we introduce our recently proposed iterative self-training method, which incrementally improves a multilingual NMT model on a zero-shot direction by relying only on monolingual data. Our results on TED talks data show that multilingual NMT outperforms conventional bilingual NMT, that transformer NMT outperforms recurrent NMT, and that zero-shot NMT outperforms conventional pivoting methods and even matches the performance of a fully-trained bilingual system.
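A minimal sketch of the iterative self-training loop as described above: in each round, the current model translates monolingual text in both zero-shot directions, and the resulting synthetic sentence pairs are used to fine-tune the model. The `model.translate` and `model.fine_tune` calls are hypothetical interfaces standing in for a concrete NMT toolkit, not the paper's actual implementation.

```python
def iterative_self_training(model, mono_src, mono_tgt, rounds=3):
    """Incrementally improve a zero-shot pair (src <-> tgt) using only
    monolingual data: the current model produces synthetic translations,
    which are then used to fine-tune the model for the next round."""
    for _ in range(rounds):
        # Translate monolingual source-language text into the target
        # language with the current model (initially pure zero-shot).
        forward = [(s, model.translate(s, to="tgt")) for s in mono_src]
        # Do the same in the opposite direction.
        backward = [(model.translate(t, to="src"), t) for t in mono_tgt]
        # Fine-tune on the mixed synthetic parallel corpus; the improved
        # model generates better synthetic data in the next round.
        model = model.fine_tune(forward + backward)
    return model
```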
DOI: 10.48550/arxiv.1909.07342