An EM algorithm for SCFG in formal syntax-based translation

In this paper, we investigate the use of bilingual parsing on parallel corpora to better estimate the rule parameters in a formal syntax-based machine translation system, which are normally estimated from the inaccurate heuristics. We use an expectation-maximization (EM) algorithm to re-estimate the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Songfang Huang, Bowen Zhou
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In this paper, we investigate the use of bilingual parsing on parallel corpora to better estimate the rule parameters in a formal syntax-based machine translation system, which are normally estimated from the inaccurate heuristics. We use an expectation-maximization (EM) algorithm to re-estimate the parameters of synchronous context-free grammar (SCFG) rules according to the derivation knowledge from parallel corpora based on maximum likelihood principle, rather than using only the heuristic information. The proposed algorithm produces significantly better BLEU scores than a state-of-the-art formal syntax-based machine translation system on the IWSLT 2006 Chinese to English task.
ISSN:1520-6149
2379-190X
DOI:10.1109/ICASSP.2009.4960708