Harnessing Time: A Skeleton Temporal Fusion Graph Convolutional Network for Elderly Action Recognition
Published in: | IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 2024, pp.2024MAP0005 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng ; jpn |
Subjects: | |
Online access: | Full text |
Summary: | Elderly action recognition is challenging because elderly individuals perform actions with small amplitude and long duration. Two pivotal issues warrant further exploration: developing enhanced temporal feature representations, and expanding the capacity of convolutional models to capture long-range temporal features. In this paper, we propose a novel Skeleton Temporal Fusion Graph Convolutional Network (STF-GCN) for skeleton-based elderly action recognition, which effectively models advanced temporal feature representations. More specifically, the STF-GCN employs three encoding strategies to integrate two types of temporal feature representations. These strategies are designed to capture the intricacies of motion dynamics and the subtleties of action variations, enabling more accurate and robust recognition of elderly actions. Furthermore, we propose a Skeleton Temporal Fusion (STF) module to highlight the temporal feature representations, employing a structure that alternates between large-kernel and small-kernel convolutions to achieve various effective receptive fields. The large-kernel convolutions allow the model to perceive an expanded temporal context, enhancing its ability to understand action dynamics. Our evaluation demonstrates that the STF-GCN achieves state-of-the-art performance on the largest elderly dataset, ETRI-Activity3D. Additional experiments show that the STF-GCN is also comparable to other methods on general action recognition tasks. |
---|---|
ISSN: | 0916-8508 ; 1745-1337 |
DOI: | 10.1587/transfun.2024MAP0005 |
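
The summary states that the STF module alternates between large and small kernel convolutions to obtain various effective receptive fields. As a rough illustration of that idea only, the sketch below computes the theoretical temporal receptive field (in frames) after each layer of a stack of 1D convolutions. The kernel sizes (9 and 3), the number of layers, and the function name `temporal_receptive_field` are assumptions chosen for illustration; the record does not give the paper's actual configuration, and this is not the authors' implementation.

```python
def temporal_receptive_field(kernel_sizes, strides=None):
    """Return the receptive field (in frames) after each stacked 1D convolution.

    Assumes undilated convolutions; stride defaults to 1 for every layer.
    """
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1  # jump = spacing between adjacent outputs, in input frames
    fields = []
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= s
        fields.append(rf)
    return fields


# Hypothetical kernel sizes: the summary does not state the real ones.
alternating = [9, 3, 9, 3]  # large, small, large, small
small_only = [3, 3, 3, 3]

print(temporal_receptive_field(alternating))  # [9, 11, 19, 21]
print(temporal_receptive_field(small_only))   # [3, 5, 7, 9]
```

Under these assumed sizes, the alternating stack spans 21 frames after four layers, against 9 frames for the small-kernel stack, and its intermediate layers see windows of several different lengths.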