Normalized edge convolutional networks for skeleton-based hand gesture recognition


Bibliographic details
Published in: Pattern Recognition, 2021-10, Vol. 118, p. 108044, Article 108044
Main authors: Guo, Fangtai; He, Zaixing; Zhang, Shuyou; Zhao, Xinyue; Fang, Jinhui; Tan, Jianrong
Format: Article
Language: English
Description
Summary:
• We propose a novel edge-varying graph by dividing each neighborhood of the central node into three groups: physical neighbors, temporal neighbors and varying neighbors. A design trick called a “black hole” is presented to enhance the performance of the graph.
• We conduct edge normalization within the central node's two-hop neighborhood, resulting in a novel normalized edge convolution operation.
• A novel sampling strategy called zig-zag sampling is proposed. The strategy is designed to balance intergroup and intragroup sampling priorities.
• Normalized edge convolutional networks are constructed for hand gesture recognition, and systematic experiments on publicly available datasets validate the robustness and superiority of our method.

Dynamic hand skeletons consisting of discrete spatial-temporal finger joint clouds effectively convey the intentions of communicators. Previous graph convolutional networks (GCNs) relying on human hand-crafted inductive biases have been rapidly adopted for skeleton-based hand gesture recognition (SHGR). However, most existing graph constructions for GCN-based solutions are set manually and consider only the physical topology of the hand skeleton, so the fixed dependencies among hand joints may lead to suboptimal models. To enrich the local dependencies, we emphasize that hand skeletons can be seen from two views: explicit joint clouds and implicit skeleton topology. Starting from those two views of hand gestures, we attempt to introduce dynamics and diversity into the local neighborhood of the graph by dividing it into sets of physical neighbors, temporal neighbors and varying neighbors. Next, we systematically proceed with three innovations, namely the novel edge-varying graph, the normalized edge convolution operation, and the zig-zag sampling strategy, to alleviate the challenges resulting from engineering practices. Finally, spatial-based GCNs called normalized edge convolutional networks are constructed for hand gesture recognition. Experiments on publicly available hand datasets show that our method performs gesture recognition stably at a state-of-the-art level, and ablation experiments are also provided to validate each contribution.
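
The three-way split of a node's neighborhood described in the summary can be illustrated with a small sketch. This is not the authors' implementation: the function name `group_neighbors`, the toy bone list, and in particular the rule for picking varying neighbors (the k nearest non-adjacent joints in the same frame, by Euclidean distance) are assumptions made here for illustration only. The record does not say how varying neighbors are selected, and the “black hole” trick, edge normalization and zig-zag sampling are not modeled.

```python
import math


def group_neighbors(frames, bones, t, j, k=2):
    """Split the neighborhood of node (frame t, joint j) into three groups.

    frames: list of frames, each a list of (x, y, z) joint positions.
    bones:  list of (a, b) joint-index pairs (hypothetical skeleton topology).
    k:      number of varying neighbors to keep (hypothetical parameter).
    """
    num_frames = len(frames)
    num_joints = len(frames[t])

    # Physical neighbors: joints linked to j by a bone, in the same frame.
    linked = {b for a, b in bones if a == j} | {a for a, b in bones if b == j}
    physical = sorted(linked)

    # Temporal neighbors: the same joint in the adjacent frames, if they exist.
    temporal = [s for s in (t - 1, t + 1) if 0 <= s < num_frames]

    # Varying neighbors (ASSUMPTION): the k closest remaining joints in the
    # same frame by Euclidean distance, so the set changes with the pose.
    center = frames[t][j]
    candidates = [i for i in range(num_joints) if i != j and i not in linked]
    candidates.sort(key=lambda i: math.dist(center, frames[t][i]))
    varying = candidates[:k]

    return {
        "physical": [(t, i) for i in physical],
        "temporal": [(s, j) for s in temporal],
        "varying": [(t, i) for i in varying],
    }


# Toy example: a 4-joint chain observed over 2 frames (made-up coordinates).
toy_bones = [(0, 1), (1, 2), (2, 3)]
toy_frames = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (0.5, 0.5, 0)],  # joint 3 bent toward joint 0
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)],      # chain stretched out
]
groups = group_neighbors(toy_frames, toy_bones, t=0, j=0, k=1)
```

In this toy data, joint 0's physical and temporal neighbors follow from the fixed topology and frame order, while its varying neighbor depends on where the other joints happen to be in each frame. That is the sense in which such a group could add pose-dependent edges on top of a manually defined skeleton graph.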
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2021.108044