Static graph convolution with learned temporal and channel-wise graph topology generation for skeleton-based action recognition

Bibliographic details
Published in: Computer vision and image understanding, 2024-07, Vol. 244, p. 104012, Article 104012
Main authors: Li, Chuankun; Li, Shuai; Gao, Yanbo; Zhou, Lijuan; Li, Wanqing
Format: Article
Language: English
Description
Abstract: Graph convolutional networks (GCNs) are widely used in skeleton-based action recognition. The graph topology is known to be a vital part of GCNs, and different kinds of graph topologies have been proposed for skeleton-based action recognition, mostly based on a predefined topology and a dynamically learned one. The predefined topology is based on human intuition about the skeleton (the connectivity of joints), and whether it is optimal has not been investigated. In this paper, we focus on investigating this static graph topology and propose to generate a learned static graph topology for the skeleton. Specifically, temporal frame-wise and channel-wise topology-based GCNs (TC-GCNs) are developed, where, instead of using a human-predefined topology, a topology is learned for skeleton-based action recognition. The TC-GCNs generate a temporal frame-wise topology and a channel-wise topology to formulate the relationship of skeleton joints in the temporal dimension and the channel dimension, respectively. The proposed method can be integrated with the conventional dynamic topology by replacing the predefined graph topology with our generated one. Experimental results show that our method with a learned static graph achieves better performance than the predefined graph and the dynamic graph on three widely used benchmarks, namely NTU-RGB+D, NTU-RGB+D 120 and UAV-Human.
• Temporal frame-wise and channel-wise topology-based GCNs (TC-GCNs) are developed instead of using a predefined topology.
• The proposed TC-GCNs can be integrated with the conventional dynamic graph to improve performance.
• Extensive experiments have been performed on three datasets: NTU-RGB+D, NTU-RGB+D 120 and UAV-Human, and the TC-GCNs achieve better performance than the predefined-graph and dynamic-graph based methods.
ISSN: 1077-3142, 1090-235X
DOI: 10.1016/j.cviu.2024.104012
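
Illustrative sketch (not taken from the article): the abstract describes replacing the predefined skeleton graph with a learned static topology that varies per temporal frame and per channel. The plain-Python sketch below shows one way such topologies could be applied when aggregating joint features. The function name `aggregate_joints`, the tensor layout, and in particular the choice to add the frame-wise and channel-wise topologies together are assumptions made for illustration; the abstract does not specify how the paper combines them, and in a real model both topologies would be trainable parameters rather than fixed lists.

```python
def aggregate_joints(x, frame_topology, channel_topology):
    """Aggregate joint features with two static joint-to-joint topologies.

    x[c][t][u]                feature of joint u at frame t in channel c
    frame_topology[t][v][u]   weight from joint u to joint v, one matrix per frame
    channel_topology[c][v][u] weight from joint u to joint v, one matrix per channel

    Assumption: the two topologies are combined by simple addition.
    """
    num_channels = len(x)
    num_frames = len(x[0])
    num_joints = len(x[0][0])
    out = [[[0.0] * num_joints for _ in range(num_frames)]
           for _ in range(num_channels)]
    for c in range(num_channels):
        for t in range(num_frames):
            for v in range(num_joints):
                total = 0.0
                for u in range(num_joints):
                    # Combined edge weight for this (channel, frame) pair.
                    weight = frame_topology[t][v][u] + channel_topology[c][v][u]
                    total += weight * x[c][t][u]
                out[c][t][v] = total
    return out


# Toy input: 1 channel, 1 frame, 2 joints.
x = [[[1.0, 2.0]]]
frame_topology = [[[1, 0], [0, 1]]]    # each joint keeps its own feature
channel_topology = [[[0, 1], [1, 0]]]  # each joint also receives the other joint's feature
result = aggregate_joints(x, frame_topology, channel_topology)  # -> [[[3.0, 3.0]]]
```

In the toy input, every joint ends up with the sum of both joint features, because the frame-wise matrix contributes the self-connection and the channel-wise matrix contributes the cross-connection.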