ASL Trigger Recognition in Mixed Activity/Signing Sequences for RF Sensor-Based User Interfaces

Bibliographic details
Published in: IEEE Transactions on Human-Machine Systems, 2022-08, Vol. 52 (4), pp. 699-712
Main authors: Kurtoglu, Emre; Gurbuz, Ali C.; Malaia, Evie A.; Griffin, Darrin; Crawford, Chris; Gurbuz, Sevgi Z.
Format: Article
Language: English
Description
Summary: The past decade has seen great advancements in speech recognition for control of interactive devices, personal assistants, and computer interfaces. However, deaf and hard-of-hearing (HoH) individuals, whose primary mode of communication is sign language, cannot use voice-controlled interfaces. Although there has been significant work in video-based sign language recognition, video is not effective in the dark and has raised privacy concerns in the deaf community when used in the context of human ambient intelligence. RF sensors have been recently proposed as a new modality that can be effective under the circumstances where video is not. This article considers the problem of recognizing a trigger sign (wake word) in the context of daily living, where gross motor activities are interwoven with signing sequences. The proposed approach exploits multiple RF data domain representations (time-frequency, range-Doppler, and range-angle) for sequential classification of mixed motion data streams. The recognition accuracy of signs with varying kinematic properties is compared and used to make recommendations on appropriate trigger sign selection for RF-sensor-based user interfaces. The proposed approach achieves a trigger sign detection rate of 98.9% and a classification accuracy of 92% for 15 ASL words and three gross motor activities.
ISSN: 2168-2291, 2168-2305
DOI: 10.1109/THMS.2021.3131675