Compact Spatial Pyramid Pooling Deep Convolutional Neural Network Based Hand Gestures Decoder

Bibliographic details
Published in:	Applied sciences 2020-11, Vol.10 (21), p.7898, Article 7898
Main authors:	Ashiquzzaman, Akm; Lee, Hyunmin; Kim, Kwangki; Kim, Hye-Young; Park, Jaehyung; Kim, Jinsul
Format:	Article
Language:	eng
Subjects:	Accuracy; Algorithms; Chemistry; Chemistry, Multidisciplinary; Classification; Communication; Computation; Computer applications; Computers; convolutional neural network; Deep learning; Engineering; Engineering, Multidisciplinary; Feature selection; Hand; hand gesture recognition; Human-computer interface; Machine learning; Materials Science; Materials Science, Multidisciplinary; Motion capture; neural network pruning; Neural networks; optimization; Optimization techniques; Performance evaluation; Physical Sciences; Physics; Physics, Applied; Science & Technology; Sign language; Technology
Online access:	Full text

Description
Abstract:	Current hand gesture detectors based on deep convolutional neural networks (DCNNs) achieve high precision but demand high-performance computing power. Although DCNN-based detectors classify accurately, the computation they require makes them very difficult to run on low-power hardware in remote environments. Moreover, classical DCNN architectures have a fixed number of input dimensions, which forces preprocessing and makes them impractical for real-world applications. In this research, a practical DCNN with an architecture optimized through filter/node pruning is proposed, and spatial pyramid pooling (SPP) is introduced to make the model invariant to input dimensions. This compact SPP-DCNN module uses 65% fewer parameters than traditional classifiers and operates almost 3x faster than classical models. The proposed algorithm, which decodes gestures or sign-language finger-spelling from videos, also achieved the highest benchmark accuracy with the fastest processing speed. The proposed method paves the way for various practical and applied human-computer interaction (HCI) applications based on hand gesture input.
ISSN:	2076-3417
DOI:	10.3390/app10217898
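
The abstract states that spatial pyramid pooling (SPP) makes the model independent of input dimensions. The record does not describe the authors' layers, so the following is only a generic sketch of the idea, not their implementation: a feature map of any size is max-pooled over 1x1, 2x2 and 4x4 grids, which always yields 1 + 4 + 16 = 21 values. The function name `spp_max_pool`, the pyramid levels, and the floor/ceil rule for bin boundaries are all assumptions chosen for illustration.

```python
import math


def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2-D feature map (a list of rows) into a fixed-length list.

    Hypothetical helper: the levels and the bin-boundary rule are
    illustrative assumptions, not taken from the article.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:  # one n x n grid of bins per pyramid level
        for i in range(n):
            # Row range of bin i: floor for the start, ceil for the end,
            # so every bin covers at least one row.
            r0 = (i * height) // n
            r1 = math.ceil((i + 1) * height / n)
            for j in range(n):
                c0 = (j * width) // n
                c1 = math.ceil((j + 1) * width / n)
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1)
                    )
                )
    return pooled


# Two feature maps of different sizes give vectors of the same length.
small = [[r * 5 + c for c in range(5)] for r in range(4)]      # 4 x 5
large = [[r * 13 + c for c in range(13)] for r in range(9)]   # 9 x 13
print(len(spp_max_pool(small)), len(spp_max_pool(large)))
```

Because the output length depends only on the pyramid levels, a fully connected classifier placed after such a layer can accept frames of varying resolution without resizing them first. In a real network the same pooling would be applied to every channel of the last convolutional layer and the results concatenated.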