Automatic translation from parallel speech: Simultaneous interpretation as MT training data

Bibliographic Details
Main Authors:	Paulik, M., Waibel, A.
Format:	Conference Proceeding
Language:	English
Subjects:	Automatic speech recognition; Dictionaries; Interactive systems; Laboratories; Large-scale systems; Natural languages; Training data

Description
Summary:	State-of-the-art statistical machine translation depends heavily on the availability of domain-specific bilingual parallel text. However, acquiring large amounts of bilingual parallel text is costly and, depending on the language pair, sometimes impossible. We propose an alternative to parallel text as machine translation (MT) training data: audio recordings of parallel speech (pSp) as it occurs in any scenario where interpreters are involved. Although interpretation (pSp) differs significantly from translation (parallel text), we achieve surprisingly strong translation results with our pSp-trained MT and speech translation systems. We argue that the presented approach is of special interest for developing speech translation in the context of resource-deficient languages where even monolingual resources are scarce.
DOI:	10.1109/ASRU.2009.5372880