Techniques for Detecting the Start and End Points of Sign Language Utterances to Enhance Recognition Performance in Mobile Environments

Bibliographic details
Published in: Applied Sciences, 2024-10, Vol. 14 (20), p. 9199
Main authors: Kim, Taewan; Kim, Bongjae
Format: Article
Language: English
Description
Abstract: Recent AI-based technologies in mobile environments have enabled sign language recognition, allowing deaf individuals to communicate effectively with hearing individuals. However, varying computational performance across different mobile devices can result in differences in the number of image frames extracted in real time during sign language utterances. The number of extracted frames is a critical factor influencing the accuracy of sign language recognition models. If the number of extracted frames is too small, the performance of the sign language recognition model may decline. Additionally, detecting the start and end points of sign language utterances is crucial for improving recognition accuracy, as the period before the start point and after the end point often involves no action being performed. These parts do not capture the unique characteristics of each sign language. Therefore, this paper proposes a technique to dynamically adjust the sampling rate based on the number of frames extracted in real time during sign language utterances in mobile environments, with the aim of accurately detecting the start and end points of the sign language. Experiments were conducted to compare the proposed technique with the fixed sampling rate method and with the no-sampling method as a baseline. Our findings show that the proposed dynamic sampling rate adjustment method improves performance by up to 83.64% in top-5 accuracy and by up to 66.54% in top-1 accuracy compared to the fixed sampling rate method. The performance evaluation results underscore the effectiveness of our dynamic sampling rate adjustment approach in enhancing the accuracy and robustness of sign language recognition systems across different operational conditions.
ISSN: 2076-3417
DOI: 10.3390/app14209199
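
Illustrative sketch: the abstract names two ideas, trimming the idle frames before the start point and after the end point of an utterance, and adapting the sampling rate to however many frames a device actually extracted. The abstract does not give the authors' algorithm, so the Python below is only one plausible reading under stated assumptions, not the method from the paper. The function names `trim_idle_frames` and `dynamic_sample_indices`, the per-frame `motion` score, the `threshold`, and `target_frames` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def trim_idle_frames(motion: Sequence[float], threshold: float) -> Tuple[int, int]:
    """Return (start, end) with end exclusive, spanning the first through the
    last frame whose motion score is >= threshold.

    Assumption: `motion` holds one non-negative score per frame (for example,
    keypoint displacement between consecutive frames). Returns (0, 0) when no
    frame reaches the threshold.
    """
    active = [i for i, m in enumerate(motion) if m >= threshold]
    if not active:
        return 0, 0
    return active[0], active[-1] + 1


def dynamic_sample_indices(num_frames: int, target_frames: int) -> List[int]:
    """Pick evenly spaced frame indices so the effective stride adapts to the
    number of frames that were actually extracted.

    Assumption: the recognizer prefers about `target_frames` inputs. If fewer
    frames are available, all of them are kept rather than dropping any.
    """
    if num_frames <= 0 or target_frames <= 0:
        return []
    if num_frames <= target_frames:
        return list(range(num_frames))
    # Integer arithmetic avoids floating-point rounding surprises.
    return [(i * num_frames) // target_frames for i in range(target_frames)]


def select_frames(motion: Sequence[float], threshold: float,
                  target_frames: int) -> List[int]:
    """Trim idle frames first, then sample within the detected utterance."""
    start, end = trim_idle_frames(motion, threshold)
    return [start + k for k in dynamic_sample_indices(end - start, target_frames)]
```

With this sketch, a slow device that extracted only a handful of frames keeps every frame inside the detected utterance, while a fast device that extracted many frames is thinned to roughly `target_frames` evenly spaced ones. A fixed sampling rate, by contrast, would apply the same stride in both cases.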