Model-Based 3D Hand Pose Estimation from Monocular Video

A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture, and the illuminant are dynamically estimated through minimization of an objective function. Derived from an inverse problem formulation, the objective function enables explicit use...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence, 2011-09, Vol.33 (9), p.1793-1805
Main authors: de La Gorce, M., Fleet, D. J., Paragios, N.
Format: Article
Language: eng
container_end_page 1805
container_issue 9
container_start_page 1793
container_title IEEE transactions on pattern analysis and machine intelligence
container_volume 33
creator de La Gorce, M.
Fleet, D. J.
Paragios, N.
description A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture, and the illuminant are dynamically estimated through minimization of an objective function. Derived from an inverse problem formulation, the objective function enables explicit use of temporal texture continuity and shading information while handling important self-occlusions and time-varying illumination. The minimization is done efficiently using a quasi-Newton method, for which we provide a rigorous derivation of the objective function gradient. Particular attention is given to terms related to the change of visibility near self-occlusion boundaries that are neglected in existing formulations. To this end, we introduce new occlusion forces and show that using all gradient terms greatly improves the performance of the method. Qualitative and quantitative experimental results demonstrate the potential of the approach.
doi_str_mv 10.1109/TPAMI.2011.33
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pubmed_primary_21339527</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5719617</ieee_id><sourcerecordid>1711536912</sourcerecordid><originalsourceid>FETCH-LOGICAL-c476t-d8eb22cbc7d787eb9cc97aeef95a9f6710ee3d08dada1625be2fce9ec4f177473</originalsourceid><addsrcrecordid>eNqF0cFr2zAUBnBRVtqs7bGnwTCDwXpwpqdnW9Yx67qlkNAeul6FLD0zF8dqpXiw_35Kk2bQS0-Cpx8f7_Exdg58CsDV17vb2fJ6KjjAFPGATUChyrFE9Y5NOFQir2tRH7P3MT5wDkXJ8YgdC0BUpZATVi-9oz7_ZiK5DL9nczO47NZHyq7iuluZdeeHrA1-lS394O3Ym5Ddd478KTtsTR_pbPeesF8_ru4u5_ni5uf15WyR20JW69zV1AhhGyudrCU1ylolDVGrSqPaSgInQsdrZ5xJ25YNidaSIlu0IGUh8YRdbHN_m14_hrRS-Ku96fR8ttCbGed1WSHgH0j2y9Y-Bv80UlzrVRct9b0ZyI9RgwQosVIg3qZYqEIAPNNPr-iDH8OQjtYqMZHieEL5FtngYwzU7ncFrjdF6eei9KYojZj8x13o2KzI7fVLMwl83gETrenbYAbbxf-ukFxKsbn5w9Z1RLT_LiWoCiT-Az2zoGc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>913429120</pqid></control><display><type>article</type><title>Model-Based 3D Hand Pose Estimation from Monocular Video</title><source>IEEE Electronic Library (IEL)</source><creator>de La Gorce, M. ; Fleet, D. J. ; Paragios, N.</creator><creatorcontrib>de La Gorce, M. ; Fleet, D. J. ; Paragios, N.</creatorcontrib><description>A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture, and the illuminant are dynamically estimated through minimization of an objective function. Derived from an inverse problem formulation, the objective function enables explicit use of temporal texture continuity and shading information while handling important self-occlusions and time-varying illumination. The minimization is done efficiently using a quasi-Newton method, for which we provide a rigorous derivation of the objective function gradient. 
Particular attention is given to terms related to the change of visibility near self-occlusion boundaries that are neglected in existing formulations. To this end, we introduce new occlusion forces and show that using all gradient terms greatly improves the performance of the method. Qualitative and quantitative experimental results demonstrate the potential of the approach.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2011.33</identifier><identifier>PMID: 21339527</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>Los Alamitos, CA: IEEE</publisher><subject>Algorithms ; Applied sciences ; Artificial intelligence ; Computer Science ; Computer science; control theory; systems ; Computer Vision and Pattern Recognition ; Exact sciences and technology ; generative modeling ; gradient descent ; Hand - physiology ; Hand tracking ; Humans ; Image edge detection ; Imaging, Three-Dimensional - methods ; Materials handling ; Mathematical models ; Minimization ; model based shape from shading ; Models, Theoretical ; Optimization ; Pattern recognition. Digital image processing. Computational geometry ; pose estimation ; Solid modeling ; Studies ; Surface layer ; Surface texture ; Texture ; Three dimensional ; Three dimensional displays ; Three dimensional models ; Tracking ; variational formulation ; Video Recording - methods</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2011-09, Vol.33 (9), p.1793-1805</ispartof><rights>2015 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Sep 2011</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c476t-d8eb22cbc7d787eb9cc97aeef95a9f6710ee3d08dada1625be2fce9ec4f177473</citedby><cites>FETCH-LOGICAL-c476t-d8eb22cbc7d787eb9cc97aeef95a9f6710ee3d08dada1625be2fce9ec4f177473</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5719617$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,314,780,784,796,885,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5719617$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=24707721$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/21339527$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.science/hal-00856313$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>de La Gorce, M.</creatorcontrib><creatorcontrib>Fleet, D. J.</creatorcontrib><creatorcontrib>Paragios, N.</creatorcontrib><title>Model-Based 3D Hand Pose Estimation from Monocular Video</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture, and the illuminant are dynamically estimated through minimization of an objective function. 
Derived from an inverse problem formulation, the objective function enables explicit use of temporal texture continuity and shading information while handling important self-occlusions and time-varying illumination. The minimization is done efficiently using a quasi-Newton method, for which we provide a rigorous derivation of the objective function gradient. Particular attention is given to terms related to the change of visibility near self-occlusion boundaries that are neglected in existing formulations. To this end, we introduce new occlusion forces and show that using all gradient terms greatly improves the performance of the method. Qualitative and quantitative experimental results demonstrate the potential of the approach.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer Science</subject><subject>Computer science; control theory; systems</subject><subject>Computer Vision and Pattern Recognition</subject><subject>Exact sciences and technology</subject><subject>generative modeling</subject><subject>gradient descent</subject><subject>Hand - physiology</subject><subject>Hand tracking</subject><subject>Humans</subject><subject>Image edge detection</subject><subject>Imaging, Three-Dimensional - methods</subject><subject>Materials handling</subject><subject>Mathematical models</subject><subject>Minimization</subject><subject>model based shape from shading</subject><subject>Models, Theoretical</subject><subject>Optimization</subject><subject>Pattern recognition. Digital image processing. 
Computational geometry</subject><subject>pose estimation</subject><subject>Solid modeling</subject><subject>Studies</subject><subject>Surface layer</subject><subject>Surface texture</subject><subject>Texture</subject><subject>Three dimensional</subject><subject>Three dimensional displays</subject><subject>Three dimensional models</subject><subject>Tracking</subject><subject>variational formulation</subject><subject>Video Recording - methods</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>EIF</sourceid><recordid>eNqF0cFr2zAUBnBRVtqs7bGnwTCDwXpwpqdnW9Yx67qlkNAeul6FLD0zF8dqpXiw_35Kk2bQS0-Cpx8f7_Exdg58CsDV17vb2fJ6KjjAFPGATUChyrFE9Y5NOFQir2tRH7P3MT5wDkXJ8YgdC0BUpZATVi-9oz7_ZiK5DL9nczO47NZHyq7iuluZdeeHrA1-lS394O3Ym5Ddd478KTtsTR_pbPeesF8_ru4u5_ni5uf15WyR20JW69zV1AhhGyudrCU1ylolDVGrSqPaSgInQsdrZ5xJ25YNidaSIlu0IGUh8YRdbHN_m14_hrRS-Ku96fR8ttCbGed1WSHgH0j2y9Y-Bv80UlzrVRct9b0ZyI9RgwQosVIg3qZYqEIAPNNPr-iDH8OQjtYqMZHieEL5FtngYwzU7ncFrjdF6eei9KYojZj8x13o2KzI7fVLMwl83gETrenbYAbbxf-ukFxKsbn5w9Z1RLT_LiWoCiT-Az2zoGc</recordid><startdate>20110901</startdate><enddate>20110901</enddate><creator>de La Gorce, M.</creator><creator>Fleet, D. J.</creator><creator>Paragios, N.</creator><general>IEEE</general><general>IEEE Computer Society</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><general>Institute of Electrical and Electronics Engineers</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope><scope>7X8</scope><scope>1XC</scope></search><sort><creationdate>20110901</creationdate><title>Model-Based 3D Hand Pose Estimation from Monocular Video</title><author>de La Gorce, M. ; Fleet, D. J. ; Paragios, N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c476t-d8eb22cbc7d787eb9cc97aeef95a9f6710ee3d08dada1625be2fce9ec4f177473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer Science</topic><topic>Computer science; control theory; systems</topic><topic>Computer Vision and Pattern Recognition</topic><topic>Exact sciences and technology</topic><topic>generative modeling</topic><topic>gradient descent</topic><topic>Hand - physiology</topic><topic>Hand tracking</topic><topic>Humans</topic><topic>Image edge detection</topic><topic>Imaging, Three-Dimensional - methods</topic><topic>Materials handling</topic><topic>Mathematical models</topic><topic>Minimization</topic><topic>model based shape from shading</topic><topic>Models, Theoretical</topic><topic>Optimization</topic><topic>Pattern recognition. Digital image processing. 
Computational geometry</topic><topic>pose estimation</topic><topic>Solid modeling</topic><topic>Studies</topic><topic>Surface layer</topic><topic>Surface texture</topic><topic>Texture</topic><topic>Three dimensional</topic><topic>Three dimensional displays</topic><topic>Three dimensional models</topic><topic>Tracking</topic><topic>variational formulation</topic><topic>Video Recording - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>de La Gorce, M.</creatorcontrib><creatorcontrib>Fleet, D. J.</creatorcontrib><creatorcontrib>Paragios, N.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>de La Gorce, M.</au><au>Fleet, D. J.</au><au>Paragios, N.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Model-Based 3D Hand Pose Estimation from Monocular Video</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2011-09-01</date><risdate>2011</risdate><volume>33</volume><issue>9</issue><spage>1793</spage><epage>1805</epage><pages>1793-1805</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture, and the illuminant are dynamically estimated through minimization of an objective function. Derived from an inverse problem formulation, the objective function enables explicit use of temporal texture continuity and shading information while handling important self-occlusions and time-varying illumination. The minimization is done efficiently using a quasi-Newton method, for which we provide a rigorous derivation of the objective function gradient. Particular attention is given to terms related to the change of visibility near self-occlusion boundaries that are neglected in existing formulations. To this end, we introduce new occlusion forces and show that using all gradient terms greatly improves the performance of the method. Qualitative and quantitative experimental results demonstrate the potential of the approach.</abstract><cop>Los Alamitos, CA</cop><pub>IEEE</pub><pmid>21339527</pmid><doi>10.1109/TPAMI.2011.33</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0162-8828
ispartof IEEE transactions on pattern analysis and machine intelligence, 2011-09, Vol.33 (9), p.1793-1805
issn 0162-8828
1939-3539
2160-9292
language eng
recordid cdi_pubmed_primary_21339527
source IEEE Electronic Library (IEL)
subjects Algorithms
Applied sciences
Artificial intelligence
Computer Science
Computer science; control theory; systems
Computer Vision and Pattern Recognition
Exact sciences and technology
generative modeling
gradient descent
Hand - physiology
Hand tracking
Humans
Image edge detection
Imaging, Three-Dimensional - methods
Materials handling
Mathematical models
Minimization
model based shape from shading
Models, Theoretical
Optimization
Pattern recognition. Digital image processing. Computational geometry
pose estimation
Solid modeling
Studies
Surface layer
Surface texture
Texture
Three dimensional
Three dimensional displays
Three dimensional models
Tracking
variational formulation
Video Recording - methods
title Model-Based 3D Hand Pose Estimation from Monocular Video
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T15%3A12%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Model-Based%203D%20Hand%20Pose%20Estimation%20from%20Monocular%20Video&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=de%20La%20Gorce,%20M.&rft.date=2011-09-01&rft.volume=33&rft.issue=9&rft.spage=1793&rft.epage=1805&rft.pages=1793-1805&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2011.33&rft_dat=%3Cproquest_RIE%3E1711536912%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=913429120&rft_id=info:pmid/21339527&rft_ieee_id=5719617&rfr_iscdi=true