Sensor-Based Human Activity Recognition Based on Multi-Stream Time-Varying Features With ECA-Net Dimensionality Reduction



Bibliographic Details
Published in: IEEE Access, 2024, Vol. 12, pp. 151649-151668
Authors: Miah, Abu Saleh Musa; Hwang, Yong Seok; Shin, Jungpil
Format: Article
Language: English
Online access: Full text
Abstract: Sensor-based datasets are extensively utilized in human-computer interaction (HCI) and medical applications due to their portability and strong privacy features. Many researchers have developed sensor-based human activity recognition (HAR) systems to increase recognition performance. However, existing systems still face challenges in achieving satisfactory performance due to insufficient time-varying features and gradient explosion issues. To address these challenges, we propose a multi-stream temporal convolutional network (TCN)-based approach for time-varying feature extraction and feature selection to recognize human activity (HA) from sensor datasets. The proposed model effectively extracts and emphasizes the spatial-temporal features of various human activities using a four-stream architecture. Each stream uses a TCN to extract time-varying features and enhances them with an appropriate integration module. The first stream extracts fine-grained temporal features with a TCN. The second and third streams integrate TCN features with an LSTM, applying pre-integration and post-integration, respectively. The fourth stream uses a CNN for spatial features and a TCN to enhance temporal features. Concatenating the four streams' features captures complex dependencies, improving the model's understanding of prolonged activities. In addition, we propose a modified efficient channel attention network (ECA-Net) that assigns higher weights to lower-dimensional features, enabling the proposed model to learn and recognize human activities effectively despite their complex patterns. Evaluations on the WISDM, PAMAP2, USC-HAD, Opportunity UCI, and UCI-HAR datasets showed accuracy improvements of 1.12%, 1.99%, 1.30%, 5.72%, and 0.38%, respectively, over state-of-the-art systems. The high recognition accuracy of the proposed model demonstrates its superiority, with implications for improving prosthetic limb functionality and advancing robotic human-machine interfaces.
Our data preprocessing approach, deep learning model code, and dataset information are available at https://github.com/musaru/HAR_Sensor.
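The abstract describes the modified ECA-Net only at a high level. As a rough illustration, the sketch below implements standard ECA-style channel attention in NumPy: global average pooling over time produces one descriptor per channel, a 1-D convolution across neighboring channels (with no dimensionality reduction, unlike SE blocks) scores each channel, and a sigmoid gate re-weights the feature map. The function name, the (channels, time) layout, and the uniform averaging kernel standing in for learned convolution weights are all assumptions for illustration, not the authors' code.

```python
import numpy as np

def eca_attention(x, k=3):
    """ECA-style channel attention sketch on a (channels, time) feature map.

    x: array of shape (C, T); k: odd kernel size of the 1-D conv over channels.
    Returns the input re-weighted per channel.
    """
    # Squeeze: global average pooling over the temporal axis -> (C,)
    desc = x.mean(axis=1)
    # Excite: 1-D convolution across neighboring channels, 'same' padding.
    pad = k // 2
    padded = np.pad(desc, pad, mode="edge")
    w = np.ones(k) / k  # placeholder for the learned conv weights
    scores = np.array([np.dot(w, padded[i:i + k]) for i in range(len(desc))])
    gates = 1.0 / (1.0 + np.exp(-scores))  # sigmoid gate per channel
    # Re-weight each channel of the input feature map
    return x * gates[:, None]
```

In a full model, such a block would sit after the concatenation of the four streams' features, letting the network emphasize the most informative channels before classification.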
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3473828