Efficient Human Pose Estimation from Single Depth Images

We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image, without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training...

Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence 2012-10
Main authors:	Shotton, Jamie ; Girshick, Ross ; Fitzgibbon, Andrew ; Sharp, Toby ; Cook, Mat ; Finocchio, Mark ; Moore, Richard ; Kohli, Pushmeet ; Criminisi, Antonio ; Kipman, Alex ; Blake, Andrew
Format:	Article
Language:	eng
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	IEEE transactions on pattern analysis and machine intelligence
container_volume
creator	Shotton, Jamie ; Girshick, Ross ; Fitzgibbon, Andrew ; Sharp, Toby ; Cook, Mat ; Finocchio, Mark ; Moore, Richard ; Kohli, Pushmeet ; Criminisi, Antonio ; Kipman, Alex ; Blake, Andrew
description	We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image, without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. By using simple depth pixel comparison features, and parallelizable decision forests, both approaches can run super-realtime on consumer hardware. Our evaluation investigates many aspects of our methods, and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.
format	Article
fullrecord	<record><control><sourceid>pubmed</sourceid><recordid>TN_cdi_pubmed_primary_23109523</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>23109523</sourcerecordid><originalsourceid>FETCH-pubmed_primary_231095233</originalsourceid><addsrcrecordid>eNpjYuA0tDS21DU2NbbkYOAqLs4yMDA0MTUwZmfgMDI2NLA0NTLmZLBwTUvLTM5MzStR8CjNTcxTCMgvTlVwLS7JzE0syczPU0grys9VCM7MS89JVXBJLSjJUPDMTUxPLeZhYE1LzClO5YXS3Axybq4hzh66BaVJuakp8QVFQBOKKuNhVhkTVAAAohcy1Q</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Efficient Human Pose Estimation from Single Depth Images</title><source>IEEE Electronic Library (IEL)</source><creator>Shotton, Jamie ; Girshick, Ross ; Fitzgibbon, Andrew ; Sharp, Toby ; Cook, Mat ; Finocchio, Mark ; Moore, Richard ; Kohli, Pushmeet ; Criminisi, Antonio ; Kipman, Alex ; Blake, Andrew</creator><creatorcontrib>Shotton, Jamie ; Girshick, Ross ; Fitzgibbon, Andrew ; Sharp, Toby ; Cook, Mat ; Finocchio, Mark ; Moore, Richard ; Kohli, Pushmeet ; Criminisi, Antonio ; Kipman, Alex ; Blake, Andrew</creatorcontrib><description>We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image, without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. 
By using simple depth pixel comparison features, and parallelizable decision forests, both approaches can run super-realtime on consumer hardware. Our evaluation investigates many aspects of our methods, and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.</description><identifier>EISSN: 1939-3539</identifier><identifier>PMID: 23109523</identifier><language>eng</language><publisher>United States</publisher><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2012-10</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23109523$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Shotton, Jamie</creatorcontrib><creatorcontrib>Girshick, Ross</creatorcontrib><creatorcontrib>Fitzgibbon, Andrew</creatorcontrib><creatorcontrib>Sharp, Toby</creatorcontrib><creatorcontrib>Cook, Mat</creatorcontrib><creatorcontrib>Finocchio, Mark</creatorcontrib><creatorcontrib>Moore, Richard</creatorcontrib><creatorcontrib>Kohli, Pushmeet</creatorcontrib><creatorcontrib>Criminisi, Antonio</creatorcontrib><creatorcontrib>Kipman, Alex</creatorcontrib><creatorcontrib>Blake, Andrew</creatorcontrib><title>Efficient Human Pose Estimation from Single Depth Images</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image, without using any temporal information. 
The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. By using simple depth pixel comparison features, and parallelizable decision forests, both approaches can run super-realtime on consumer hardware. Our evaluation investigates many aspects of our methods, and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.</description><issn>1939-3539</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><recordid>eNpjYuA0tDS21DU2NbbkYOAqLs4yMDA0MTUwZmfgMDI2NLA0NTLmZLBwTUvLTM5MzStR8CjNTcxTCMgvTlVwLS7JzE0syczPU0grys9VCM7MS89JVXBJLSjJUPDMTUxPLeZhYE1LzClO5YXS3Axybq4hzh66BaVJuakp8QVFQBOKKuNhVhkTVAAAohcy1Q</recordid><startdate>20121026</startdate><enddate>20121026</enddate><creator>Shotton, Jamie</creator><creator>Girshick, Ross</creator><creator>Fitzgibbon, Andrew</creator><creator>Sharp, Toby</creator><creator>Cook, Mat</creator><creator>Finocchio, Mark</creator><creator>Moore, Richard</creator><creator>Kohli, Pushmeet</creator><creator>Criminisi, Antonio</creator><creator>Kipman, Alex</creator><creator>Blake, Andrew</creator><scope>NPM</scope></search><sort><creationdate>20121026</creationdate><title>Efficient Human Pose Estimation from Single Depth Images</title><author>Shotton, Jamie ; Girshick, Ross ; Fitzgibbon, Andrew ; Sharp, Toby ; Cook, Mat ; Finocchio, Mark ; Moore, Richard ; Kohli, Pushmeet ; Criminisi, Antonio ; Kipman, Alex ; Blake, 
Andrew</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-pubmed_primary_231095233</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Shotton, Jamie</creatorcontrib><creatorcontrib>Girshick, Ross</creatorcontrib><creatorcontrib>Fitzgibbon, Andrew</creatorcontrib><creatorcontrib>Sharp, Toby</creatorcontrib><creatorcontrib>Cook, Mat</creatorcontrib><creatorcontrib>Finocchio, Mark</creatorcontrib><creatorcontrib>Moore, Richard</creatorcontrib><creatorcontrib>Kohli, Pushmeet</creatorcontrib><creatorcontrib>Criminisi, Antonio</creatorcontrib><creatorcontrib>Kipman, Alex</creatorcontrib><creatorcontrib>Blake, Andrew</creatorcontrib><collection>PubMed</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Shotton, Jamie</au><au>Girshick, Ross</au><au>Fitzgibbon, Andrew</au><au>Sharp, Toby</au><au>Cook, Mat</au><au>Finocchio, Mark</au><au>Moore, Richard</au><au>Kohli, Pushmeet</au><au>Criminisi, Antonio</au><au>Kipman, Alex</au><au>Blake, Andrew</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient Human Pose Estimation from Single Depth Images</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2012-10-26</date><risdate>2012</risdate><eissn>1939-3539</eissn><abstract>We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image, without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. 
This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. By using simple depth pixel comparison features, and parallelizable decision forests, both approaches can run super-realtime on consumer hardware. Our evaluation investigates many aspects of our methods, and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.</abstract><cop>United States</cop><pmid>23109523</pmid></addata></record>
fulltext	fulltext
identifier	EISSN: 1939-3539
ispartof	IEEE transactions on pattern analysis and machine intelligence, 2012-10
issn	1939-3539
language	eng
recordid	cdi_pubmed_primary_23109523
source	IEEE Electronic Library (IEL)
title	Efficient Human Pose Estimation from Single Depth Images
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-16T09%3A12%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pubmed&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20Human%20Pose%20Estimation%20from%20Single%20Depth%20Images&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Shotton,%20Jamie&rft.date=2012-10-26&rft.eissn=1939-3539&rft_id=info:doi/&rft_dat=%3Cpubmed%3E23109523%3C/pubmed%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/23109523&rfr_iscdi=true