Clean Text and Full-Body Transformer: Microsoft's Submission to the WMT22 Shared Task on Sign Language Translation
This paper describes Microsoft's submission to the first shared task on sign language translation at WMT 2022, a public competition tackling sign language to spoken language translation for Swiss German sign language. The task is very challenging due to data scarcity and an unprecedented vocabu...
Gespeichert in:
Hauptverfasser: | , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper describes Microsoft's submission to the first shared task on sign
language translation at WMT 2022, a public competition tackling sign language
to spoken language translation for Swiss German sign language. The task is very
challenging due to data scarcity and an unprecedented vocabulary size of more
than 20k words on the target side. Moreover, the data is taken from real
broadcast news, includes native signing and covers scenarios of long videos.
Motivated by recent advances in action recognition, we incorporate full body
information by extracting features from a pre-trained I3D model and applying a
standard transformer network. The accuracy of the system is further improved by
applying careful data cleaning on the target text. We obtain BLEU scores of 0.6
and 0.78 on the test and dev set respectively, which is the best score among
the participants of the shared task. Also in the human evaluation the
submission reaches the first place. The BLEU score is further improved to 1.08
on the dev set by applying features extracted from a lip reading model. |
---|---|
DOI: | 10.48550/arxiv.2210.13326 |