A strong benchmark for yoga action recognition based on lightweight pose estimation model: A strong benchmark for yoga action recognition based on lightweight
Yoga action recognition is crucial for enabling precise motion analysis and providing effective training guidance, which in turn facilitates the optimization of physical health and skill enhancement. However, current methods struggle to maintain high accuracy and real-time performance when dealing w...
Gespeichert in:
Veröffentlicht in: | Multimedia systems 2025, Vol.31 (1) |
---|---|
Hauptverfasser: | , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Yoga action recognition is crucial for enabling precise motion analysis and providing effective training guidance, which in turn facilitates the optimization of physical health and skill enhancement. However, current methods struggle to maintain high accuracy and real-time performance when dealing with the complex poses and occlusions. Additionally, these methods neglect the dynamic characteristics and temporal sequence information inherent in yoga actions. Therefore, this paper proposes a two-stage action recognition method tailored for yoga scenarios. The method initially employs pose estimation technology based on knowledge distillation to optimize the accuracy and efficiency of lightweight models in detecting complex poses and occlusions. Subsequently, a lightweight 3D convolutional neural network (3D-CNN) is utilized for action recognition, achieving seamless integration of the two stages through heat maps, thereby enhancing recognition accuracy and precisely capturing spatiotemporal features in video sequences. Experimental results indicate that on the COCO dataset, the DistillPose-m model achieves a 2.5% improvement in Average Precision (AP) compared to RTMPose-m. In the yoga action recognition task, our model exhibites approximately a 2% improvement over traditional Graph Convolutional Network (GCN) methods on both the Deepyoga and 3Dyoga90 datasets. This study enhances the performance and accuracy of pose estimation in yoga scenarios, addressing the challenges of bodily occlusions and complex postures. By fully leveraging the spatiotemporal information inherent in yoga movements, it improves the accuracy of yoga action recognition. This research provides critical insights and support for motion training and analysis systems in other dynamic activities, such as martial arts and dance. |
---|---|
ISSN: | 0942-4962 1432-1882 |
DOI: | 10.1007/s00530-024-01646-9 |