Human Gesture Recognition Based on CT-A Hybrid Deep Learning Model in Wi-Fi Environment


Bibliographic Details
Published in: IEEE Sensors Journal, 2023-11, Vol. 23 (22), pp. 28021-28034
Authors: Yao, Yancheng; Zhao, Chuanxin; Pan, Yahui; Sha, Chao; Rao, Yuan; Wang, Taochun
Format: Article
Language: English
Abstract: Human gesture recognition has become an important aspect of human-computer interaction due to the rapid development of human behavior sensing technology in Wi-Fi environments. Although Wi-Fi-based gesture recognition systems achieve good accuracy within specific domains, their cross-domain capability remains limited. This article therefore explores methods that achieve high recognition accuracy within specific scenes while also retaining cross-scene capability. To address this challenge, we propose a hybrid deep learning model that combines a convolutional neural network (CNN) with the encoder module of the Transformer. The model accounts for both the spatial localization characteristics and the long-distance dependence of gestures, which improves its ability to model the spatiotemporal features of the body-coordinate velocity profile (BVP) series. In addition, we strengthen this spatiotemporal modeling by extracting low-dimensional vectors that carry a significant amount of classification information; these vectors are fed into an AdaBoost module for ensemble learning, and the resulting strong classifier predicts the gesture class. To evaluate the proposed model, we conduct experiments on a commonly used dataset. The results show that the model achieves average accuracies of 96.78% and 88.27% in the in-domain and cross-domain cases, respectively, demonstrating the superiority and effectiveness of the proposed approach.
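The abstract only outlines the CT-A pipeline, so the following is a minimal illustrative sketch in PyTorch and scikit-learn of the architecture it describes: a per-frame CNN for spatial features, a Transformer encoder for temporal dependence, a low-dimensional feature head, and an AdaBoost ensemble as the final classifier. All layer sizes, the 20x20 BVP frame shape, the 30-frame sequence length, and the six gesture classes are assumptions made for this example, not values taken from the paper.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import AdaBoostClassifier

class CNNTransformerExtractor(nn.Module):
    """Hypothetical CNN + Transformer-encoder feature extractor for BVP series.

    Input shape: (batch, time, H, W). All dimensions below are illustrative
    assumptions, not the paper's actual configuration.
    """

    def __init__(self, d_model=64, n_heads=4, n_layers=2, feat_dim=32):
        super().__init__()
        # Per-frame CNN: captures the spatial localization of the gesture
        # within each body-coordinate velocity profile (BVP) frame.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch*time, 32, 1, 1)
        )
        self.proj = nn.Linear(32, d_model)
        # Transformer encoder: models long-distance temporal dependence
        # across the sequence of per-frame embeddings.
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        # Low-dimensional vector handed off to the AdaBoost ensemble.
        self.head = nn.Linear(d_model, feat_dim)

    def forward(self, x):                 # x: (batch, time, H, W)
        b, t, h, w = x.shape
        z = self.cnn(x.reshape(b * t, 1, h, w)).reshape(b, t, 32)
        z = self.encoder(self.proj(z))    # (batch, time, d_model)
        return self.head(z.mean(dim=1))   # temporal average -> (batch, feat_dim)

# Placeholder data standing in for BVP sequences and gesture labels.
X = torch.randn(100, 30, 20, 20)          # 100 samples, 30 frames of 20x20 BVP
y = np.random.randint(0, 6, size=100)     # 6 gesture classes (assumed)

extractor = CNNTransformerExtractor().eval()
with torch.no_grad():
    feats = extractor(X).numpy()          # low-dimensional feature vectors

# Ensemble learning: AdaBoost combines weak learners into the final
# strong classifier that outputs the gesture class.
clf = AdaBoostClassifier(n_estimators=50).fit(feats, y)
print(clf.predict(feats[:5]))
```

The split mirrors the abstract's motivation: the CNN handles per-frame spatial locality, the Transformer encoder handles long-range dependence across frames, and only the compact feature vectors, rather than raw BVP frames, are passed to AdaBoost for the final ensemble decision.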
ISSN: 1530-437X, 1558-1748
DOI: 10.1109/JSEN.2023.3323761