Novel three-axis accelerometer-based silent speech interface using deep neural network

Detailed description

Bibliographic details
Published in:	Engineering applications of artificial intelligence 2023-04, Vol.120, p.105909, Article 105909
Main authors:	Kwon, Jinuk, Nam, Hyerin, Chae, Younsoo, Lee, Seungjae, Kim, In Young, Im, Chang-Hwan
Format:	Article
Language:	eng
Subjects:	1D CNN-bLSTM; Deep neural network; Human-computer interface; Silent speech interface (SSI); Three-axis accelerometer
Online access:	Full text

Description
Summary:	Silent speech interfaces (SSIs) have been developed as new non-acoustic communication channels for people with speech impairment. Various modalities have been employed to implement SSIs, including ultrasound imaging, electromagnetic articulography, and surface electromyography (sEMG). In this study, for the first time, we examined the feasibility of implementing an SSI using accelerometers, which have been widely used to acquire motion-related information in human activity recognition. Five accelerometers were attached to the facial surface of participants to measure speech-induced facial movements. A deep neural network architecture combining a one-dimensional (1D) convolutional neural network and bidirectional long short-term memory (1D CNN-bLSTM) was implemented to decode speech-related information contained in the accelerometer signals. In total, 20 healthy individuals participated in the SSI experiments, wherein they were asked to articulate 40 words, consisting of 30 Korean words and 10 English numbers, without vocalization. Leave-one-session-out cross-validation was employed to evaluate the classification accuracy of the proposed accelerometer-based SSI. An average classification accuracy of 95.58 ± 1.83% was achieved with only four accelerometers, which is significantly higher than that of the conventional sEMG-based SSI (89.68 ± 5.27%, p < 0.0005, Wilcoxon signed-rank test). In addition, the proposed SSI achieved an average classification accuracy of 94.65 ± 2.54% in classifying 40 English words spoken silently. These results demonstrate that accelerometers can be a promising modality for implementing SSIs. Considering that accelerometers have multiple advantages over conventional modalities, including non-invasiveness, cost-effectiveness, low power consumption, and portability, it is expected that accelerometer-based SSIs would provide a novel means of communication to those who cannot generate speech signals.
ISSN:	0952-1976 1873-6769
DOI:	10.1016/j.engappai.2023.105909
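
Note on the evaluation scheme: the summary names leave-one-session-out cross-validation but does not describe how the folds were built. The following is a minimal sketch of the general idea only, not the authors' code. The function name `leave_one_session_out`, the `session_ids` list, and the toy layout of three sessions with two trials each are all hypothetical and are not taken from the article.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_session_out(
    session_ids: Sequence[int],
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_session, train_indices, test_indices) for each fold.

    session_ids[i] is the recording session that trial i belongs to.
    Each fold holds out every trial of one session for testing and
    keeps the trials of all other sessions for training.
    """
    for held_out in sorted(set(session_ids)):
        train = [i for i, s in enumerate(session_ids) if s != held_out]
        test = [i for i, s in enumerate(session_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: 3 sessions with 2 trials each (not the study's protocol).
sessions = [0, 0, 1, 1, 2, 2]
folds = list(leave_one_session_out(sessions))

for held_out, train, test in folds:
    print(f"held-out session {held_out}: train={train} test={test}")
```

In such a scheme a classifier would be trained and scored once per fold, and the per-fold accuracies averaged. Holding out whole sessions keeps trials recorded together from appearing in both the training and the test set.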