SVIT‐SSR: A sEMG‐based vision transformer approach for silent speech recognition

Bibliographic details
Published in: Electronics letters 2024-11, Vol.60 (21), p.n/a
Main authors: Li, Zhao; Ma, Bin; Mao, Weifan; Zhang, Jianxing; Yu, Zhuting; Lu, Yizhou
Format: Article
Language: English
Online access: Full text
Description
Summary: Silent speech recognition (SSR) based on surface electromyography (sEMG) is a voice interaction technology proposed for scenarios requiring silent operation. This article abstracts the sEMG‐based SSR task into a short‐term image sequence classification task. Time‐frequency domain feature extraction and data reconstruction are performed on the muscle activity segment data. Additionally, the temporal and spatial dimensions are analysed to capture the intrinsic correlations of muscle activity. The SVIT‐SSR model is proposed based on the vision transformer (VIT) framework. Finally, experiments are designed to identify 33 types of typical silent speech commands in the SSR dataset. The results demonstrate that the proposed model achieves an accuracy of 96.67 ± 1.15%, outperforming similar algorithms.
ISSN: 0013-5194, 1350-911X
DOI: 10.1049/ell2.13285