Gesture Recognition using FastDTW and Deep Learning Methods in the MSRC-12 and the NTU RGB+D Databases

Bibliographic Details
Published in: Revista IEEE América Latina 2022-09, Vol.20 (9), p.2189-2195
Main authors: Peixoto, Julia Schubert, Cukla, Anselmo Rafael, De Souza Leite Cuadros, Marco Antonio, Welfer, Daniel, Tello Gamarra, Daniel Fernando
Format: Article
Language: eng
Description
Summary: This work explores three deep learning methods for gesture recognition: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), used together with Fast Dynamic Time Warping (FastDTW). The gestures were captured by Kinect sensors, and two skeleton-based databases are used: Microsoft Research Cambridge-12 (MSRC-12) and NTU RGB+D. FastDTW was employed to standardize the input size of the data. On the MSRC-12 database, the test-set accuracy was 82.36% with the CNN, 87.30% with the LSTM, and 89.34% with the GRU. With the NTU RGB+D database, two evaluation methods were used: Cross-View and Cross-Subject. Under Cross-View evaluation, the test-set accuracy was 63.53%, 55.14%, and 61.00% with CNN, LSTM, and GRU, respectively; under Cross-Subject evaluation, it was 66.19%, 64.43%, and 60.17% with CNN, LSTM, and GRU, respectively.
ISSN:1548-0992
DOI:10.1109/TLA.2022.9878175