Agreement on Target-Bidirectional Recurrent Neural Networks for Sequence-to-Sequence Learning

Detailed description

Bibliographic details
Published in:	The Journal of artificial intelligence research 2020-03, Vol.67, p.581-606
Main authors:	Liu, Lemao; Finch, Andrew; Utiyama, Masao; Sumita, Eiichiro
Format:	Article
Language:	English
Subjects:	Approximation; Artificial intelligence; Cognitive tasks; Computer Science; Computer Science, Artificial Intelligence; Learning; Machine translation; Neural networks; Recurrent neural networks; Science & Technology; Technology
Online access:	Full text

Description
Abstract:	Recurrent neural networks are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional RNNs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of either sequence level or non-sequence level metrics. Extensive experiments were performed on three standard sequence-to-sequence transduction tasks: machine transliteration, grapheme-to-phoneme transformation and machine translation. The results show that the proposed approach achieves consistent and substantial improvements, compared to many state-of-the-art systems.
ISSN:	1076-9757, 1943-5037
DOI:	10.1613/jair.1.12008
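
Illustrative sketch (not part of the catalog record): the abstract describes agreement between a pair of target-directional RNNs and approximate search for it. The Python below is a minimal sketch of one way such agreement could be approximated. It assumes that agreement is scored as the sum of the two directional log-probabilities and that the search is limited to the union of the two models' k-best lists; neither assumption is stated in the record. The function `agree_rerank`, the scorer callables, and the toy probability tables are hypothetical and stand in for real left-to-right and right-to-left models.

```python
import math
from typing import Callable, Iterable, Tuple

Target = Tuple[str, ...]
Scorer = Callable[[Target], float]  # returns a log-probability for a full target


def agree_rerank(
    l2r_kbest: Iterable[Target],
    r2l_kbest: Iterable[Target],
    l2r_logprob: Scorer,
    r2l_logprob: Scorer,
) -> Target:
    """Return the candidate with the highest joint score.

    Assumption: the joint score is the sum of both directional log-probabilities,
    and only candidates from the union of the two k-best lists are considered.
    """
    candidates = set(l2r_kbest) | set(r2l_kbest)
    if not candidates:
        raise ValueError("both k-best lists are empty")
    # Sort first so that ties are broken deterministically.
    return max(sorted(candidates), key=lambda y: l2r_logprob(y) + r2l_logprob(y))


# Toy (hypothetical) models: each table maps a full target to a probability.
# The left-to-right model favors a target with a good prefix and a bad suffix;
# the right-to-left model favors one with a good suffix and a bad prefix.
l2r_table = {("a", "b", "x"): 0.5, ("a", "b", "c"): 0.4, ("z", "b", "c"): 0.1}
r2l_table = {("z", "b", "c"): 0.5, ("a", "b", "c"): 0.4, ("a", "b", "x"): 0.1}


def make_scorer(table: dict) -> Scorer:
    # Unknown targets get log-probability -inf.
    return lambda y: math.log(table[y]) if y in table else float("-inf")


l2r_kbest = [("a", "b", "x"), ("a", "b", "c")]  # top 2 under the l2r table
r2l_kbest = [("z", "b", "c"), ("a", "b", "c")]  # top 2 under the r2l table

best = agree_rerank(l2r_kbest, r2l_kbest, make_scorer(l2r_table), make_scorer(r2l_table))
print(best)
```

In this toy setup each model's own top choice is unbalanced, while the candidate that both models rate reasonably well, `("a", "b", "c")`, receives the highest joint score.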