HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8

2D human pose estimation aims to accurately regress the keypoints of human body from images or videos. However, it remains challenging due to the occlusion and intersection among multiple individuals and the difficulty of dealing with different body scales. In order to better tackle these issues, we...

Full description

Bibliographic details
Published in:	Signal, image and video processing, 2024-09, Vol.18 (8-9), p.5823-5839
Main authors:	Dong, Chengang; Tang, Yuhao; Zhang, Liyan
Format:	Article
Language:	eng
Subjects:	Computer Imaging; Computer Science; Data augmentation; Datasets; Image enhancement; Image Processing and Computer Vision; Multimedia Information Systems; Occlusion; Original Paper; Pattern Recognition and Graphics; Pose estimation; Real time; Regression models; Signal,Image and Speech Processing; Two dimensional bodies; Vision
Online access:	Full text

container_end_page	5839
container_issue	8-9
container_start_page	5823
container_title	Signal, image and video processing
container_volume	18
creator	Dong, Chengang; Tang, Yuhao; Zhang, Liyan
description	2D human pose estimation aims to accurately regress the keypoints of human body from images or videos. However, it remains challenging due to the occlusion and intersection among multiple individuals and the difficulty of dealing with different body scales. In order to better tackle these issues, we propose a human pose estimation framework named HDA-Pose. By improving the real-time framework of YOLOv8, we achieve simultaneous regression of all individuals' keypoint locations in the image. Specifically, we propose the High-Grade Dual Attention (HDA) module to further enhance the focus of YOLOv8 on important features of individuals in the image. Additionally, we improve the original data augmentation strategy in YOLOv8 to better simulate cases where key points of individuals are occluded in the image. Lastly, we introduce a novel regression loss metric, Vertex Intersection over Union, to further enhance the effectiveness of the model in multi-person pose estimation. Our approach attains competitive results on multiple metrics of two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline model YOLOv8x-pose, HDA-Pose improves the average precision by 2.9% and 3.3% on the two datasets, respectively.
doi_str_mv	10.1007/s11760-024-03274-2
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3086029658</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3086029658</sourcerecordid><originalsourceid>FETCH-LOGICAL-c200t-88de191467af8185ab01745f3c75074711611bb1c3f3695dade012a19deb697f3</originalsourceid><addsrcrecordid>eNp9UD1PwzAQtRBIVKV_gMkSs-HOTmyHrWqBIlV0gYHJcmKHpmqTYqdI_HtcgmDjlvt67z4eIZcI1wigbiKiksCAZwwEVxnjJ2SEWgqGCvH0NwZxTiYxbiBZwmmpR-RpMZ-yfRf9LbU0eLtlfbPzlM_p-rCzLT22qI-paPuma-nO9-vO0dJG7-gx71xTNyl-XS1XH_qCnNV2G_3kx4_Jy_3d82zBlquHx9l0ySoO0DOtnccCM6lsrVHntgRUWV6LSuWgsnS1RCxLrEQtZJE76zwgt1g4X8pC1WJMroa5-9C9H9J9ZtMdQptWGgFaAi9krhOKD6gqdDEGX5t9SI-ET4NgjtKZQTqTpDPf0hmeSGIgxQRu33z4G_0P6wsVgW4f</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3086029658</pqid></control><display><type>article</type><title>HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8</title><source>SpringerLink Journals - AutoHoldings</source><creator>Dong, Chengang ; Tang, Yuhao ; Zhang, Liyan</creator><creatorcontrib>Dong, Chengang ; Tang, Yuhao ; Zhang, Liyan</creatorcontrib><description>2D human pose estimation aims to accurately regress the keypoints of human body from images or videos. However, it remains challenging due to the occlusion and intersection among multiple individuals and the difficulty of dealing with different body scales. In order to better tackle these issues, we propose a human pose estimation framework named HDA-Pose. By improving the real-time framework of YOLOv8, we achieve simultaneous regression of all individuals' keypoint locations in the image. Specifically, we propose the High-Grade Dual Attention (HDA) module to further enhance the focus of YOLOv8 on important features of individuals in the image. 
Additionally, we improve the original data augmentation strategy in YOLOv8 to better simulate cases where key points of individuals are occluded in the image. Lastly, we introduce a novel regression loss metric, Vertex Intersection over Union, to further enhance the effectiveness of the model in multi-person pose estimation. Our approach attains competitive results on multiple metrics of two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline model YOLOv8x-pose, HDA-Pose improves the average precision by 2.9% and 3.3% on the two datasets, respectively.</description><identifier>ISSN: 1863-1703</identifier><identifier>EISSN: 1863-1711</identifier><identifier>DOI: 10.1007/s11760-024-03274-2</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Computer Imaging ; Computer Science ; Data augmentation ; Datasets ; Image enhancement ; Image Processing and Computer Vision ; Multimedia Information Systems ; Occlusion ; Original Paper ; Pattern Recognition and Graphics ; Pose estimation ; Real time ; Regression models ; Signal,Image and Speech Processing ; Two dimensional bodies ; Vision</subject><ispartof>Signal, image and video processing, 2024-09, Vol.18 (8-9), p.5823-5839</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c200t-88de191467af8185ab01745f3c75074711611bb1c3f3695dade012a19deb697f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11760-024-03274-2$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11760-024-03274-2$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Dong, Chengang</creatorcontrib><creatorcontrib>Tang, Yuhao</creatorcontrib><creatorcontrib>Zhang, Liyan</creatorcontrib><title>HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8</title><title>Signal, image and video processing</title><addtitle>SIViP</addtitle><description>2D human pose estimation aims to accurately regress the keypoints of human body from images or videos. However, it remains challenging due to the occlusion and intersection among multiple individuals and the difficulty of dealing with different body scales. In order to better tackle these issues, we propose a human pose estimation framework named HDA-Pose. By improving the real-time framework of YOLOv8, we achieve simultaneous regression of all individuals' keypoint locations in the image. Specifically, we propose the High-Grade Dual Attention (HDA) module to further enhance the focus of YOLOv8 on important features of individuals in the image. 
Additionally, we improve the original data augmentation strategy in YOLOv8 to better simulate cases where key points of individuals are occluded in the image. Lastly, we introduce a novel regression loss metric, Vertex Intersection over Union, to further enhance the effectiveness of the model in multi-person pose estimation. Our approach attains competitive results on multiple metrics of two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline model YOLOv8x-pose, HDA-Pose improves the average precision by 2.9% and 3.3% on the two datasets, respectively.</description><subject>Computer Imaging</subject><subject>Computer Science</subject><subject>Data augmentation</subject><subject>Datasets</subject><subject>Image enhancement</subject><subject>Image Processing and Computer Vision</subject><subject>Multimedia Information Systems</subject><subject>Occlusion</subject><subject>Original Paper</subject><subject>Pattern Recognition and Graphics</subject><subject>Pose estimation</subject><subject>Real time</subject><subject>Regression models</subject><subject>Signal,Image and Speech Processing</subject><subject>Two dimensional bodies</subject><subject>Vision</subject><issn>1863-1703</issn><issn>1863-1711</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9UD1PwzAQtRBIVKV_gMkSs-HOTmyHrWqBIlV0gYHJcmKHpmqTYqdI_HtcgmDjlvt67z4eIZcI1wigbiKiksCAZwwEVxnjJ2SEWgqGCvH0NwZxTiYxbiBZwmmpR-RpMZ-yfRf9LbU0eLtlfbPzlM_p-rCzLT22qI-paPuma-nO9-vO0dJG7-gx71xTNyl-XS1XH_qCnNV2G_3kx4_Jy_3d82zBlquHx9l0ySoO0DOtnccCM6lsrVHntgRUWV6LSuWgsnS1RCxLrEQtZJE76zwgt1g4X8pC1WJMroa5-9C9H9J9ZtMdQptWGgFaAi9krhOKD6gqdDEGX5t9SI-ET4NgjtKZQTqTpDPf0hmeSGIgxQRu33z4G_0P6wsVgW4f</recordid><startdate>20240901</startdate><enddate>20240901</enddate><creator>Dong, Chengang</creator><creator>Tang, Yuhao</creator><creator>Zhang, Liyan</creator><general>Springer London</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20240901</creationdate><title>HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8</title><author>Dong, Chengang ; Tang, Yuhao ; Zhang, Liyan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c200t-88de191467af8185ab01745f3c75074711611bb1c3f3695dade012a19deb697f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Imaging</topic><topic>Computer Science</topic><topic>Data augmentation</topic><topic>Datasets</topic><topic>Image enhancement</topic><topic>Image Processing and Computer Vision</topic><topic>Multimedia Information Systems</topic><topic>Occlusion</topic><topic>Original Paper</topic><topic>Pattern Recognition and Graphics</topic><topic>Pose estimation</topic><topic>Real time</topic><topic>Regression models</topic><topic>Signal,Image and Speech Processing</topic><topic>Two dimensional bodies</topic><topic>Vision</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Dong, Chengang</creatorcontrib><creatorcontrib>Tang, Yuhao</creatorcontrib><creatorcontrib>Zhang, Liyan</creatorcontrib><collection>CrossRef</collection><jtitle>Signal, image and video processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Dong, Chengang</au><au>Tang, Yuhao</au><au>Zhang, Liyan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8</atitle><jtitle>Signal, image and video processing</jtitle><stitle>SIViP</stitle><date>2024-09-01</date><risdate>2024</risdate><volume>18</volume><issue>8-9</issue><spage>5823</spage><epage>5839</epage><pages>5823-5839</pages><issn>1863-1703</issn><eissn>1863-1711</eissn><abstract>2D human pose 
estimation aims to accurately regress the keypoints of human body from images or videos. However, it remains challenging due to the occlusion and intersection among multiple individuals and the difficulty of dealing with different body scales. In order to better tackle these issues, we propose a human pose estimation framework named HDA-Pose. By improving the real-time framework of YOLOv8, we achieve simultaneous regression of all individuals' keypoint locations in the image. Specifically, we propose the High-Grade Dual Attention (HDA) module to further enhance the focus of YOLOv8 on important features of individuals in the image. Additionally, we improve the original data augmentation strategy in YOLOv8 to better simulate cases where key points of individuals are occluded in the image. Lastly, we introduce a novel regression loss metric, Vertex Intersection over Union, to further enhance the effectiveness of the model in multi-person pose estimation. Our approach attains competitive results on multiple metrics of two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline model YOLOv8x-pose, HDA-Pose improves the average precision by 2.9% and 3.3% on the two datasets, respectively.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s11760-024-03274-2</doi><tpages>17</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1863-1703
ispartof	Signal, image and video processing, 2024-09, Vol.18 (8-9), p.5823-5839
issn	1863-1703; 1863-1711
language	eng
recordid	cdi_proquest_journals_3086029658
source	SpringerLink Journals - AutoHoldings
subjects	Computer Imaging; Computer Science; Data augmentation; Datasets; Image enhancement; Image Processing and Computer Vision; Multimedia Information Systems; Occlusion; Original Paper; Pattern Recognition and Graphics; Pose estimation; Real time; Regression models; Signal,Image and Speech Processing; Two dimensional bodies; Vision
title	HDA-pose: a real-time 2D human pose estimation method based on modified YOLOv8
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T19%3A45%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HDA-pose:%20a%20real-time%202D%20human%20pose%20estimation%20method%20based%20on%20modified%20YOLOv8&rft.jtitle=Signal,%20image%20and%20video%20processing&rft.au=Dong,%20Chengang&rft.date=2024-09-01&rft.volume=18&rft.issue=8-9&rft.spage=5823&rft.epage=5839&rft.pages=5823-5839&rft.issn=1863-1703&rft.eissn=1863-1711&rft_id=info:doi/10.1007/s11760-024-03274-2&rft_dat=%3Cproquest_cross%3E3086029658%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3086029658&rft_id=info:pmid/&rfr_iscdi=true