LSDNet: lightweight stochastic depth network for human pose estimation
Human pose estimation plays a critical role in human-centred vision applications. Its influence extends to various aspects of daily life, from healthcare diagnostics and sports training to augmented reality experiences and gesture-controlled interfaces. While current approaches have achieved impress...
Published in: | The Visual computer 2025-01, Vol.41 (1), p.257-270 |
---|---|
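Main authors: | Zhang, Hengrui; Qi, Yongfeng; Chen, Huili; Cao, Panpan; Liang, Anye; Wen, Shengcong |
Format: | Article |
Language: | eng |
Subjects: | Accuracy; Artificial Intelligence; Augmented reality; Computer Graphics; Computer Science; Effectiveness; Image Processing and Computer Vision; Lightweight; Methods; Neural networks; Parameter robustness; Pose estimation; Semantics; Weight reduction |
Online access: | Full text |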
container_end_page | 270 |
---|---|
container_issue | 1 |
container_start_page | 257 |
container_title | The Visual computer |
container_volume | 41 |
creator | Zhang, Hengrui; Qi, Yongfeng; Chen, Huili; Cao, Panpan; Liang, Anye; Wen, Shengcong |
description | Human pose estimation plays a critical role in human-centred vision applications. Its influence extends to various aspects of daily life, from healthcare diagnostics and sports training to augmented reality experiences and gesture-controlled interfaces. While current approaches have achieved impressive accuracy, their high model complexity and slow detection speeds significantly limit their deployment on edge devices with limited computing power, such as mobile phones and IoT devices. In this paper, we introduce a novel lightweight network for 2D human pose estimation, called lightweight stochastic depth network (LSDNet). Our approach is based on the observation that the majority of HRNet’s parameters are located in the middle and later stages in the network. We reduce some unnecessary branches to significantly reduce these parameters. This is achieved by leveraging the Bernoulli distribution to randomly remove these redundant branches, which improves the network’s efficiency while also increasing its robustness. To further reduce the network’s parameter count, we introduce two lightweight blocks with simple yet effective architectures. These blocks achieve significant parameter reduction while maintaining good accuracy. Furthermore, we leverage coordinate attention to effectively fuse features from different branches and scales. This mechanism captures both inter-channel dependencies and spatial context, enabling the network to accurately localize keypoints across the human body. We evaluated the effectiveness of our method on the MPII and COCO datasets, demonstrating superior results on human pose estimation compared to popular lightweight networks. Our code is available at: https://github.com/illusory2333/LSDNet. |
doi_str_mv | 10.1007/s00371-024-03323-4 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3159547550</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3159547550</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-c5d0512d55255cb9531c83142f87b20f88c5576265264f840ef55e834eae39c93</originalsourceid><addsrcrecordid>eNp9kE9LAzEQxYMoWKtfwFPAc3TyZ5qsN6lWhaIH9Ry2abbb2m5qklL89kZX8OZlhoH33sz8CDnncMkB9FUCkJozEIqBlEIydUAGXEnBhOR4SAbAtWFCm-qYnKS0gjJrVQ3IZPpy--TzNV0vF23e--9KUw6urVNeOjr329zSzud9iO-0CZG2u03d0W1Invoi2dR5GbpTctTU6-TPfvuQvE3uXscPbPp8_zi-mTInNGTmcA7IxRxRILpZhZI7I7kSjdEzAY0xDlGPxAjFSDVGgW8QvZHK115WrpJDctHnbmP42JX9dhV2sSsrbfmzQqURoahEr3IxpBR9Y7exHBo_LQf7zcv2vGzhZX94WVVMsjelIu4WPv5F_-P6Aj4wbEY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3159547550</pqid></control><display><type>article</type><title>LSDNet: lightweight stochastic depth network for human pose estimation: LSDNet: lightweight stochastic depth network for human pose estimation</title><source>Springer Nature - Complete Springer Journals</source><creator>Zhang, Hengrui ; Qi, Yongfeng ; Chen, Huili ; Cao, Panpan ; Liang, Anye ; Wen, Shengcong</creator><creatorcontrib>Zhang, Hengrui ; Qi, Yongfeng ; Chen, Huili ; Cao, Panpan ; Liang, Anye ; Wen, Shengcong</creatorcontrib><description>Human pose estimation plays a critical role in human-centred vision applications. Its influence extends to various aspects of daily life, from healthcare diagnostics and sports training to augmented reality experiences and gesture-controlled interfaces. While current approaches have achieved impressive accuracy, their high model complexity and slow detection speeds significantly limit their deployment on edge devices with limited computing power, such as mobile phones and IoT devices. 
In this paper, we introduce a novel lightweight network for 2D human pose estimation, called lightweight stochastic depth network (LSDNet). Our approach is based on the observation that the majority of HRNet’s parameters are located in the middle and later stages in the network. We reduce some unnecessary branches to significantly reduce these parameters. This is achieved by leveraging the Bernoulli distribution to randomly remove these redundant branches, which improves the network’s efficiency while also increasing its robustness. To further reduce the network’s parameter count, we introduce two lightweight blocks with simple yet effective architectures. These blocks achieve significant parameter reduction while maintaining good accuracy. Furthermore, we leverage coordinate attention to effectively fuse features from different branches and scales. This mechanism captures both inter-channel dependencies and spatial context, enabling the network to accurately localize keypoints across the human body. We evaluated the effectiveness of our method on the MPII and COCO datasets, demonstrating superior results on human pose estimation compared to popular lightweight networks. Our code is available at:
https://github.com/illusory2333/LSDNet
.</description><identifier>ISSN: 0178-2789</identifier><identifier>EISSN: 1432-2315</identifier><identifier>DOI: 10.1007/s00371-024-03323-4</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Accuracy ; Artificial Intelligence ; Augmented reality ; Computer Graphics ; Computer Science ; Effectiveness ; Image Processing and Computer Vision ; Lightweight ; Methods ; Neural networks ; Parameter robustness ; Pose estimation ; Semantics ; Weight reduction</subject><ispartof>The Visual computer, 2025-01, Vol.41 (1), p.257-270</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024 Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><rights>Copyright Springer Nature B.V. 
Jan 2025</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-c5d0512d55255cb9531c83142f87b20f88c5576265264f840ef55e834eae39c93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s00371-024-03323-4$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s00371-024-03323-4$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Zhang, Hengrui</creatorcontrib><creatorcontrib>Qi, Yongfeng</creatorcontrib><creatorcontrib>Chen, Huili</creatorcontrib><creatorcontrib>Cao, Panpan</creatorcontrib><creatorcontrib>Liang, Anye</creatorcontrib><creatorcontrib>Wen, Shengcong</creatorcontrib><title>LSDNet: lightweight stochastic depth network for human pose estimation: LSDNet: lightweight stochastic depth network for human pose estimation</title><title>The Visual computer</title><addtitle>Vis Comput</addtitle><description>Human pose estimation plays a critical role in human-centred vision applications. Its influence extends to various aspects of daily life, from healthcare diagnostics and sports training to augmented reality experiences and gesture-controlled interfaces. While current approaches have achieved impressive accuracy, their high model complexity and slow detection speeds significantly limit their deployment on edge devices with limited computing power, such as mobile phones and IoT devices. In this paper, we introduce a novel lightweight network for 2D human pose estimation, called lightweight stochastic depth network (LSDNet). Our approach is based on the observation that the majority of HRNet’s parameters are located in the middle and later stages in the network. 
We reduce some unnecessary branches to significantly reduce these parameters. This is achieved by leveraging the Bernoulli distribution to randomly remove these redundant branches, which improves the network’s efficiency while also increasing its robustness. To further reduce the network’s parameter count, we introduce two lightweight blocks with simple yet effective architectures. These blocks achieve significant parameter reduction while maintaining good accuracy. Furthermore, we leverage coordinate attention to effectively fuse features from different branches and scales. This mechanism captures both inter-channel dependencies and spatial context, enabling the network to accurately localize keypoints across the human body. We evaluated the effectiveness of our method on the MPII and COCO datasets, demonstrating superior results on human pose estimation compared to popular lightweight networks. Our code is available at:
https://github.com/illusory2333/LSDNet
.</description><subject>Accuracy</subject><subject>Artificial Intelligence</subject><subject>Augmented reality</subject><subject>Computer Graphics</subject><subject>Computer Science</subject><subject>Effectiveness</subject><subject>Image Processing and Computer Vision</subject><subject>Lightweight</subject><subject>Methods</subject><subject>Neural networks</subject><subject>Parameter robustness</subject><subject>Pose estimation</subject><subject>Semantics</subject><subject>Weight reduction</subject><issn>0178-2789</issn><issn>1432-2315</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNp9kE9LAzEQxYMoWKtfwFPAc3TyZ5qsN6lWhaIH9Ry2abbb2m5qklL89kZX8OZlhoH33sz8CDnncMkB9FUCkJozEIqBlEIydUAGXEnBhOR4SAbAtWFCm-qYnKS0gjJrVQ3IZPpy--TzNV0vF23e--9KUw6urVNeOjr329zSzud9iO-0CZG2u03d0W1Invoi2dR5GbpTctTU6-TPfvuQvE3uXscPbPp8_zi-mTInNGTmcA7IxRxRILpZhZI7I7kSjdEzAY0xDlGPxAjFSDVGgW8QvZHK115WrpJDctHnbmP42JX9dhV2sSsrbfmzQqURoahEr3IxpBR9Y7exHBo_LQf7zcv2vGzhZX94WVVMsjelIu4WPv5F_-P6Aj4wbEY</recordid><startdate>20250101</startdate><enddate>20250101</enddate><creator>Zhang, Hengrui</creator><creator>Qi, Yongfeng</creator><creator>Chen, Huili</creator><creator>Cao, Panpan</creator><creator>Liang, Anye</creator><creator>Wen, Shengcong</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope></search><sort><creationdate>20250101</creationdate><title>LSDNet: lightweight stochastic depth network for human pose estimation</title><author>Zhang, Hengrui ; Qi, Yongfeng ; Chen, Huili ; Cao, Panpan ; Liang, Anye ; Wen, 
Shengcong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-c5d0512d55255cb9531c83142f87b20f88c5576265264f840ef55e834eae39c93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Accuracy</topic><topic>Artificial Intelligence</topic><topic>Augmented reality</topic><topic>Computer Graphics</topic><topic>Computer Science</topic><topic>Effectiveness</topic><topic>Image Processing and Computer Vision</topic><topic>Lightweight</topic><topic>Methods</topic><topic>Neural networks</topic><topic>Parameter robustness</topic><topic>Pose estimation</topic><topic>Semantics</topic><topic>Weight reduction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Hengrui</creatorcontrib><creatorcontrib>Qi, Yongfeng</creatorcontrib><creatorcontrib>Chen, Huili</creatorcontrib><creatorcontrib>Cao, Panpan</creatorcontrib><creatorcontrib>Liang, Anye</creatorcontrib><creatorcontrib>Wen, Shengcong</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>The Visual computer</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Hengrui</au><au>Qi, Yongfeng</au><au>Chen, Huili</au><au>Cao, Panpan</au><au>Liang, Anye</au><au>Wen, Shengcong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>LSDNet: lightweight stochastic depth network for human pose estimation: LSDNet: lightweight stochastic depth network for human pose estimation</atitle><jtitle>The Visual computer</jtitle><stitle>Vis Comput</stitle><date>2025-01-01</date><risdate>2025</risdate><volume>41</volume><issue>1</issue><spage>257</spage><epage>270</epage><pages>257-270</pages><issn>0178-2789</issn><eissn>1432-2315</eissn><abstract>Human pose estimation plays a critical role in human-centred vision 
applications. Its influence extends to various aspects of daily life, from healthcare diagnostics and sports training to augmented reality experiences and gesture-controlled interfaces. While current approaches have achieved impressive accuracy, their high model complexity and slow detection speeds significantly limit their deployment on edge devices with limited computing power, such as mobile phones and IoT devices. In this paper, we introduce a novel lightweight network for 2D human pose estimation, called lightweight stochastic depth network (LSDNet). Our approach is based on the observation that the majority of HRNet’s parameters are located in the middle and later stages in the network. We reduce some unnecessary branches to significantly reduce these parameters. This is achieved by leveraging the Bernoulli distribution to randomly remove these redundant branches, which improves the network’s efficiency while also increasing its robustness. To further reduce the network’s parameter count, we introduce two lightweight blocks with simple yet effective architectures. These blocks achieve significant parameter reduction while maintaining good accuracy. Furthermore, we leverage coordinate attention to effectively fuse features from different branches and scales. This mechanism captures both inter-channel dependencies and spatial context, enabling the network to accurately localize keypoints across the human body. We evaluated the effectiveness of our method on the MPII and COCO datasets, demonstrating superior results on human pose estimation compared to popular lightweight networks. Our code is available at:
https://github.com/illusory2333/LSDNet
.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s00371-024-03323-4</doi><tpages>14</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0178-2789 |
ispartof | The Visual computer, 2025-01, Vol.41 (1), p.257-270 |
issn | 0178-2789; 1432-2315 |
language | eng |
recordid | cdi_proquest_journals_3159547550 |
source | Springer Nature - Complete Springer Journals |
subjects | Accuracy; Artificial Intelligence; Augmented reality; Computer Graphics; Computer Science; Effectiveness; Image Processing and Computer Vision; Lightweight; Methods; Neural networks; Parameter robustness; Pose estimation; Semantics; Weight reduction |
title | LSDNet: lightweight stochastic depth network for human pose estimation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T19%3A58%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=LSDNet:%20lightweight%20stochastic%20depth%20network%20for%20human%20pose%20estimation:%20LSDNet:%20lightweight%20stochastic%20depth%20network%20for%20human%20pose%20estimation&rft.jtitle=The%20Visual%20computer&rft.au=Zhang,%20Hengrui&rft.date=2025-01-01&rft.volume=41&rft.issue=1&rft.spage=257&rft.epage=270&rft.pages=257-270&rft.issn=0178-2789&rft.eissn=1432-2315&rft_id=info:doi/10.1007/s00371-024-03323-4&rft_dat=%3Cproquest_cross%3E3159547550%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3159547550&rft_id=info:pmid/&rfr_iscdi=true |
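
The abstract in this record says that redundant branches are removed at random according to a Bernoulli distribution. The record gives no further detail, so the following is only a minimal sketch of the generic stochastic depth idea, not the authors' implementation: a branch is kept with some survival probability during training and is scaled by that probability at inference. The names `stochastic_branch`, `survival_prob`, and the toy `branch` callable are hypothetical and chosen purely for illustration.

```python
import random
from typing import Callable, List, Optional

Vector = List[float]


def stochastic_branch(
    x: Vector,
    branch: Callable[[Vector], Vector],
    survival_prob: float,
    training: bool,
    rng: Optional[random.Random] = None,
) -> Vector:
    """Return x plus an optional branch output (hypothetical helper).

    Training: the branch is kept with probability `survival_prob`
    (one Bernoulli draw per call); otherwise only the identity path is used.
    Inference: the branch is always evaluated and scaled by `survival_prob`,
    which matches its expected contribution during training.
    """
    if not 0.0 <= survival_prob <= 1.0:
        raise ValueError("survival_prob must lie in [0, 1]")

    if training:
        source = rng if rng is not None else random
        # random() is uniform on [0, 1), so this is a Bernoulli(survival_prob) draw.
        keep = source.random() < survival_prob
        if not keep:
            return list(x)  # branch dropped: identity path only
        return [xi + bi for xi, bi in zip(x, branch(x))]

    # Inference: deterministic, branch output weighted by its survival probability.
    return [xi + survival_prob * bi for xi, bi in zip(x, branch(x))]


if __name__ == "__main__":
    # Toy branch that doubles its input; stands in for a real network branch.
    double = lambda v: [2.0 * t for t in v]
    features = [1.0, -2.0]
    print(stochastic_branch(features, double, 0.8, training=True, rng=random.Random(0)))
    print(stochastic_branch(features, double, 0.8, training=False))
```

Under these assumptions, a dropped branch costs no computation in that training step, which is the usual motivation for this kind of random branch removal; how LSDNet selects branches and sets the probabilities is described in the paper and its repository, not in this record.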