Incorporating target language semantic roles into a string-to-tree translation model

Bibliographic Details
Published in:	Frontiers of information technology & electronic engineering 2017-10, Vol.18 (10), p.1534-1542
Main authors:	Su, Chao; Guo, Yu-hang; Huang, He-yan; Shi, Shu-min; Feng, Chong
Format:	Article
Language:	English
Subjects:	Communications Engineering; Computer Hardware; Computer Science; Computer Systems Organization and Communication Networks; Electrical Engineering; Electronics and Microelectronics; Instrumentation; Labeling; Language; Machine translation; Networks; Semantics; Strings; Syntax
Online access:	Full text

Description
Abstract:	The string-to-tree model is one of the most successful syntax-based statistical machine translation (SMT) models. It models the grammaticality of the output via target-side syntax. However, it does not use any semantic information and tends to produce translations containing semantic role confusions and error chunk sequences. In this paper, we propose two methods that use semantic roles to improve the performance of the string-to-tree translation model: (1) adding role labels to the syntax tree; (2) constructing a semantic role tree and then incorporating the syntax information into it. We then perform string-to-tree machine translation using the newly generated trees. Our methods enable the system to train and choose better translation rules using semantic information. Our experiments showed significant improvements over the state-of-the-art string-to-tree translation system on both spoken and news corpora, and the two proposed methods surpass the phrase-based system on large-scale training data.
ISSN:	2095-9184, 2095-9230
DOI:	10.1631/FITEE.1601349
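
Illustration:	The first method named in the abstract, adding role labels to the syntax tree, can be pictured with a minimal sketch. This is not the authors' implementation: the `Node` class, the `add_role_labels` function, the PropBank-style labels `A0`/`A1`, and the rule of labeling only the topmost constituent that covers a role span are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union


@dataclass
class Node:
    """A constituency tree node; children are sub-nodes or word strings."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


def leaf_count(node: Node) -> int:
    """Number of words dominated by this node."""
    return sum(1 if isinstance(c, str) else leaf_count(c) for c in node.children)


def add_role_labels(
    node: Node,
    roles: Dict[Tuple[int, int], str],
    start: int = 0,
    parent_span: Optional[Tuple[int, int]] = None,
) -> Node:
    """Return a copy of the tree with role labels appended to node labels.

    `roles` maps a word span (start, end), end exclusive, to a role label.
    Assumption: only the topmost node covering a span receives the label.
    """
    span = (start, start + leaf_count(node))
    role = roles.get(span) if span != parent_span else None
    label = f"{node.label}-{role}" if role else node.label

    children: List[Union[Node, str]] = []
    pos = start
    for child in node.children:
        if isinstance(child, str):
            children.append(child)
            pos += 1
        else:
            children.append(add_role_labels(child, roles, pos, span))
            pos += leaf_count(child)
    return Node(label, children)


def to_bracketed(node: Union[Node, str]) -> str:
    """Render the tree in bracketed (Penn Treebank style) notation."""
    if isinstance(node, str):
        return node
    return "(" + node.label + " " + " ".join(to_bracketed(c) for c in node.children) + ")"


# Hypothetical target-side parse of "he bought books".
tree = Node("S", [
    Node("NP", [Node("PRP", ["he"])]),
    Node("VP", [
        Node("VBD", ["bought"]),
        Node("NP", [Node("NNS", ["books"])]),
    ]),
])
# Hypothetical semantic role spans for the predicate "bought".
roles = {(0, 1): "A0", (2, 3): "A1"}

annotated = add_role_labels(tree, roles)
print(to_bracketed(annotated))
# (S (NP-A0 (PRP he)) (VP (VBD bought) (NP-A1 (NNS books))))
```

With enriched labels such as `NP-A0` and `NP-A1`, translation rules extracted from these trees would distinguish an agent noun phrase from a patient noun phrase, which is one plausible reading of how role labels could reduce the semantic role confusions mentioned in the abstract.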