Real-time spatial normalization for dynamic gesture classification

In this paper, we provide a new spatial data generalization method which we applied in hand gesture recognition tasks. Data gathering can be a tedious task when it comes to gesture recognition, especially dynamic gestures. Nowadays, the standard solutions when lacking data still consist of either th...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:The Visual computer 2022-04, Vol.38 (4), p.1345-1357
Hauptverfasser: Zeghoud, Sofiane, Ali, Saba Ghazanfar, Ertugrul, Egemen, Kamel, Aouaidjia, Sheng, Bin, Li, Ping, Chi, Xiaoyu, Kim, Jinman, Mao, Lijuan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In this paper, we provide a new spatial data generalization method which we applied in hand gesture recognition tasks. Data gathering can be a tedious task when it comes to gesture recognition, especially dynamic gestures. Nowadays, the standard solutions when lacking data still consist of either the expensive gathering of new data or the impractical employment of hand-crafted data augmentation algorithms. While these solutions may show improvement, they come with disadvantages. We believe that a better extrapolation of the limited data’s common pattern, through an improved generalization, should first be considered. We, therefore, propose a dynamic generalization method that allows to capture and normalize in real-time the spatial evolution of the input. The latter procedure can be fully converted into a neural network processing layer which we call Evolution Normalization Layer . Experimental results on the SHREC2017 dataset showed that the addition of the proposed layer improved the prediction accuracy of a standard sequence-processing model while requiring 6 times fewer weights on average for a similar score. Furthermore, when trained on only 10% of the original training data, the standard model was able to reach a maximum accuracy of only 36.5% alone and 56.8% when applying a state-of-the-art processing method to the data, whereas the addition of our layer alone permitted to achieve a prediction accuracy of 81.5%.
ISSN:0178-2789
1432-2315
DOI:10.1007/s00371-021-02229-9