Speech Translation and the End-to-End Promise: Taking Stock of Where We Are
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Over its three-decade history, speech translation has experienced several
shifts in its primary research themes, moving from loosely coupled cascades of
speech recognition and machine translation, to exploring questions of tight
coupling, and finally to end-to-end models that have recently attracted much
attention. This paper provides a brief survey of these developments, along with
a discussion of the main challenges of traditional approaches, which stem from
committing to intermediate representations from the speech recognizer, and from
training cascaded models separately towards different objectives.
Recent end-to-end modeling techniques promise a principled way of overcoming
these issues by allowing joint training of all model components and removing
the need for explicit intermediate representations. However, a closer look
reveals that many end-to-end models fall short of solving these issues, due to
compromises made to address data scarcity. This paper provides a unifying
categorization and nomenclature that covers both traditional and recent
approaches and that may help researchers by highlighting both trade-offs and
open research questions. |
DOI: | 10.48550/arxiv.2004.06358 |