GAN-BodyPose: Real-time 3D human body pose data key point detection and quality assessment assisted by generative adversarial network


Bibliographic details
Published in: Image and vision computing, 2024-09, Vol. 149, p. 105144, Article 105144
Main authors: Zhu, Xicheng; Ye, Xinchen
Format: Article
Language: English
Description
Abstract: With the rapid advancement of deep learning and computer vision, these technologies are becoming increasingly vital in areas like virtual reality, medical diagnosis, and sports training. Existing methods for real-time 3D human body pose keypoint detection and quality assessment face significant challenges such as insufficient detection accuracy, low computational efficiency, and high data quality requirements. To address these challenges, we propose GAN-BodyPose, an approach that integrates 3D convolutional neural networks (3D-CNN), self-attention mechanisms, and generative adversarial networks (GANs) to deliver efficient and accurate detection and assessment in real time. The framework combines 3D-CNN and self-attention for feature extraction and keypoint detection, and uses GANs to improve data quality and accuracy. Our evaluations on a large-scale 3D human body pose dataset show that GAN-BodyPose outperforms traditional methods, with 15% faster processing, a Mean Per Joint Position Error (MPJPE) reduced by approximately 2.2%, and an Area Under the Curve (AUC) score increased by approximately 9.5% compared to HR-Net and other datasets. It also requires about 9.3% fewer floating-point operations (FLOPs), indicating more efficient computation. These advancements underline the potential of our approach to significantly enhance user experiences in virtual reality, motion capture, and other real-time applications. The successful application of GAN-BodyPose promises greater efficiency and precision in fields ranging from game development to medical diagnostics, as well as robust support for human-computer interaction and gesture recognition. This research contributes substantially to deep learning applications in robot control and decision-making, and broadens the research foundation in these domains.
• Introduced 3D-CNN for efficient feature extraction, improving keypoint detection in real-time scenarios.
• Added self-attention mechanism to enhance understanding of relationships within 3D human body pose data.
• Integrated GANs to enhance data quality and accuracy, generating realistic 3D human body pose data.
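The abstract reports accuracy as Mean Per Joint Position Error. As a rough illustration of how that metric is commonly defined (the average Euclidean distance between predicted and ground-truth joint positions), a minimal sketch follows. It is not taken from the article: the function name `mpjpe`, the three-joint pose, and the millimetre units are assumptions made for the example, and the authors' evaluation protocol (for instance any alignment step) may differ.

```python
import math


def mpjpe(predicted, target):
    """Average Euclidean distance between corresponding 3D joints.

    `predicted` and `target` are equal-length sequences of (x, y, z) tuples.
    This follows the usual textbook definition of MPJPE; it is an
    illustrative helper, not the evaluation code used in the article.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("pose lists must be non-empty and the same length")
    total = sum(math.dist(p, t) for p, t in zip(predicted, target))
    return total / len(predicted)


# Hypothetical three-joint pose, coordinates assumed to be in millimetres.
ground_truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 0.0)]
prediction = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 10.0)]

# Per-joint errors are 5, 0 and 10 mm, so the mean is 5 mm.
error_mm = mpjpe(prediction, ground_truth)
print(f"MPJPE: {error_mm:.1f} mm")  # prints "MPJPE: 5.0 mm"
```

Under this definition, lower values mean predicted joints sit closer to the ground truth, which is why the abstract describes a reduction in the error as an improvement.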
ISSN: 0262-8856
DOI: 10.1016/j.imavis.2024.105144