Changing the Representation: Examining Language Representation for Neural Sign Language Production
Neural Sign Language Production (SLP) aims to automatically translate from spoken language sentences to sign language videos. Historically the SLP task has been broken into two steps; Firstly, translating from a spoken language sentence to a gloss sequence and secondly, producing a sign language vid...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Neural Sign Language Production (SLP) aims to automatically translate from
spoken language sentences to sign language videos. Historically the SLP task
has been broken into two steps; Firstly, translating from a spoken language
sentence to a gloss sequence and secondly, producing a sign language video
given a sequence of glosses. In this paper we apply Natural Language Processing
techniques to the first step of the SLP pipeline. We use language models such
as BERT and Word2Vec to create better sentence level embeddings, and apply
several tokenization techniques, demonstrating how these improve performance on
the low resource translation task of Text to Gloss. We introduce Text to
HamNoSys (T2H) translation, and show the advantages of using a phonetic
representation for sign language translation rather than a sign level gloss
representation. Furthermore, we use HamNoSys to extract the hand shape of a
sign and use this as additional supervision during training, further increasing
the performance on T2H. Assembling best practise, we achieve a BLEU-4 score of
26.99 on the MineDGS dataset and 25.09 on PHOENIX14T, two new state-of-the-art
baselines. |
---|---|
DOI: | 10.48550/arxiv.2210.06312 |