HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8
Published in: | Signal, Image and Video Processing, 2024-09, Vol.18 (8-9), p.5823-5839 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | 2D human pose estimation aims to accurately regress the keypoints of the human body from images or videos. However, it remains challenging because of occlusion and overlap among multiple individuals and the difficulty of handling different body scales. To better tackle these issues, we propose a human pose estimation framework named HDA-Pose. By improving the real-time framework of YOLOv8, we achieve simultaneous regression of the keypoint locations of all individuals in the image. Specifically, we propose the High-Grade Dual Attention (HDA) module to further enhance the focus of YOLOv8 on important features of individuals in the image. Additionally, we improve the original data augmentation strategy in YOLOv8 to better simulate cases where keypoints of individuals are occluded in the image. Lastly, we introduce a novel regression loss metric, Vertex Intersection over Union, to further enhance the effectiveness of the model in multi-person pose estimation. Our approach attains competitive results on multiple metrics of two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline model YOLOv8x-pose, HDA-Pose improves the average precision by 2.9% and 3.3% on the two datasets, respectively. |
---|---|
ISSN: | 1863-1703 1863-1711 |
DOI: | 10.1007/s11760-024-03274-2 |
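
The summary names a Vertex Intersection over Union regression loss but does not define it, so it is not reproduced here. As background only, the sketch below computes the plain IoU of two axis-aligned boxes and the common `1 - IoU` loss form. The function names `box_iou` and `iou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for this illustration and are not taken from the paper.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Background illustration only: this is NOT the paper's Vertex IoU,
    whose definition is not given in this record.
    """
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Common IoU-style regression loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - box_iou(pred, target)
```

A loss of this form depends only on the overlap ratio of the two boxes, so it behaves the same for large and small subjects, which is one reason IoU-style terms are common in box regression.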