Using an LLM to Turn Sign Spottings into Spoken Language Sentences
Saved in:
Main Authors: , ,
Format: Article
Language: English
Subjects:
Online Access: Order full text
Abstract: Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos. In this paper, we introduce a hybrid SLT approach, Spotter+GPT, that utilizes a sign spotter and a powerful Large Language Model (LLM) to improve SLT performance. Spotter+GPT breaks down the SLT task into two stages. The videos are first processed by the Spotter, which is trained on a linguistic sign language dataset, to identify individual signs. These spotted signs are then passed to an LLM, which transforms them into coherent and contextually appropriate spoken language sentences. The source code of the Spotter is available at https://gitlab.surrey.ac.uk/cogvispublic/sign-spotter.
DOI: 10.48550/arxiv.2403.10434
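
The abstract describes a two-stage pipeline: a sign spotter first produces sign (gloss) spottings from the video, and an LLM then rewrites those glosses as a spoken-language sentence. The following Python sketch only illustrates that flow under stated assumptions; `spot_signs`, `llm_client`, and its `complete` method are hypothetical placeholders, not the authors' actual interface (the real spotter code is in the repository linked above).

```python
# Hypothetical sketch of a two-stage Spotter+GPT-style pipeline.
# spot_signs and llm_client.complete are illustrative assumptions,
# not the paper's actual API.
from typing import List


def spot_signs(video_path: str) -> List[str]:
    """Stage 1 (assumed interface): run a sign spotter over the video and
    return the sequence of spotted sign glosses."""
    raise NotImplementedError("Use the spotter from the repository linked above.")


def glosses_to_sentence(glosses: List[str], llm_client) -> str:
    """Stage 2: prompt an LLM to turn spotted glosses into one coherent,
    contextually appropriate spoken-language sentence."""
    prompt = (
        "Rewrite the following sign language glosses as one fluent "
        "spoken-language sentence:\n" + " ".join(glosses)
    )
    # llm_client stands in for any chat/completion-style client; the exact
    # call is an assumption, not part of the paper.
    return llm_client.complete(prompt).strip()


def translate(video_path: str, llm_client) -> str:
    """Full pipeline: spot signs, then let the LLM compose the sentence."""
    return glosses_to_sentence(spot_signs(video_path), llm_client)
```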