Development of Visual-Only Speech Recognition System for Mute People


Bibliographic details
Published in: Circuits, Systems, and Signal Processing, 2022-04, Vol. 41 (4), p. 2152-2172
Main authors: Kumar, G. Aswanth; William, Jino Hans
Format: Article
Language: English
Description
Summary: Sensory difficulties such as muteness, and diseases such as laryngeal cancer, are major causes of the loss of human voice production. People affected in this way often rely on sign language to communicate, and others need a special skill to decode it. This paper presents an efficient methodology to recognize the uttered word together with the facial expression of the speaker using a deep learning framework. The proposed model generates an artificial acoustic signal, along with an appropriate expression, for the words uttered by mute people. It employs two deep learning architectures to recognize the uttered word and the facial expression, respectively. The recognized facial expression is then appended to the artificial acoustic signal. In this work, the uttered word is recognized by a combination of a HOG + SVM classifier and a fine-tuned VGG-16 ConvNet with an LSTM network using transfer learning. Furthermore, the facial expression of the speaker is recognized using a combination of a Haar-Cascade classifier and a fine-tuned MobileNet with an LSTM network. A detailed evaluation of the proposed model shows that its accuracy has improved by 40% and 10% for uttered word recognition and facial expression recognition, respectively.
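The summary names HOG features as one ingredient of the word-recognition stage, but this record does not describe the authors' implementation. As a rough illustration of what a HOG descriptor computes, the sketch below builds an unsigned gradient-orientation histogram for a single grayscale cell. It is deliberately simplified: block normalization and bin interpolation are omitted, and the function name `hog_cell_histogram` and its parameters are hypothetical, not taken from the paper.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Simplified HOG histogram for one grayscale cell (a list of rows).

    Assumption: unsigned gradients over 0-180 degrees, central differences,
    hard bin assignment, no block normalization.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    # Border pixels are skipped because central differences need both neighbors.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# A cell whose intensity rises left to right has purely horizontal gradients.
ramp = [[10 * x for x in range(8)] for _ in range(8)]
histogram = hog_cell_histogram(ramp)
```

In a full detector, histograms like this one would be computed over many cells, normalized in overlapping blocks, concatenated, and passed to a classifier such as an SVM.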
ISSN: 0278-081X, 1531-5878
DOI: 10.1007/s00034-021-01880-w