Web-Shaped Model for Head Pose Estimation: An Approach for Best Exemplar Selection
Head pose estimation is a sensitive topic in video surveillance and smart-ambient scenarios, since head rotations can hide or distort discriminative features of the face. Face recognition must often cope with video frames in which subjects appear in poses that make recognition nearly impossible. In this respect, selecting the frames with the best face orientation allows recognition to be triggered only on these frames, reducing the possibility of errors.
Saved in:
Published in: | IEEE Transactions on Image Processing, 2020, Vol. 29, p. 5457-5468 |
---|---|
Main authors: | Barra, Paola; Barra, Silvio; Bisogni, Carmen; De Marsico, Maria; Nappi, Michele |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 5468 |
---|---|
container_issue | |
container_start_page | 5457 |
container_title | IEEE transactions on image processing |
container_volume | 29 |
creator | Barra, Paola; Barra, Silvio; Bisogni, Carmen; De Marsico, Maria; Nappi, Michele |
description | Head pose estimation is a sensitive topic in video surveillance and smart-ambient scenarios, since head rotations can hide or distort discriminative features of the face. Face recognition must often cope with video frames in which subjects appear in poses that make recognition nearly impossible. In this respect, selecting the frames with the best face orientation allows recognition to be triggered only on these frames, reducing the possibility of errors. This paper proposes a novel approach to head pose estimation for smart cities and video surveillance scenarios aimed at this goal. The method relies on a cascade of two models: the first predicts the positions of 68 well-known face landmarks; the second applies a web-shaped model over the detected landmarks to associate each of them with a specific face sector. The method works on faces detected at a reasonable distance and at a resolution supported by many current devices. Experiments on classical pose estimation benchmarks, namely the Pointing'04, Biwi, and AFLW datasets, show good performance in terms of both pose estimation and computing time. Further results cover the noisy images typical of the addressed settings. Finally, examples demonstrate the selection of the best frames from videos captured under video surveillance conditions. |
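The abstract describes a web-shaped model laid over 68 detected face landmarks, associating each landmark with a face sector. As a rough illustration of that idea (not the paper's actual implementation: the sector and ring counts here, and the centroid-based web placement, are assumptions made for this sketch), one can overlay a spider-web of angular sectors and radial rings on the landmark cloud and count landmarks per cell:

```python
import math

def web_features(landmarks, num_sectors=8, num_rings=3):
    """Assign each (x, y) landmark to a cell of a web (spider-web) overlay
    centred on the landmarks' centroid, and return per-cell landmark counts.

    Illustrative sketch only: sector/ring counts and web placement are
    assumptions, not the parameters used in the paper."""
    cx = sum(x for x, _ in landmarks) / len(landmarks)
    cy = sum(y for _, y in landmarks) / len(landmarks)
    # Web radius: distance from centre to the farthest landmark.
    radius = max(math.hypot(x - cx, y - cy) for x, y in landmarks) or 1.0
    counts = [0] * (num_sectors * num_rings)
    for x, y in landmarks:
        # Angular sector of this landmark (0 .. num_sectors-1).
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi / num_sectors)), num_sectors - 1)
        # Radial ring of this landmark (0 .. num_rings-1).
        r = math.hypot(x - cx, y - cy)
        ring = min(int(r / radius * num_rings), num_rings - 1)
        counts[sector * num_rings + ring] += 1
    return counts
```

A pose change shifts landmarks between sectors (e.g. a yaw rotation compresses one side of the face), so the resulting count vector varies with head orientation; the paper's cascade would feed such a sector assignment into the pose-estimation step.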
doi_str_mv | 10.1109/TIP.2020.2984373 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2020, Vol.29, p.5457-5468 |
issn | 1057-7149; 1941-0042 |
language | eng |
recordid | cdi_proquest_journals_2389362610 |
source | IEEE Electronic Library (IEL) |
subjects | Computing time; Face; Face recognition; Frames (data processing); Head movement; Head pose estimation; head pose exemplar selection; Landmarks; Pose estimation; Proposals; smart cities applications; Surveillance; Training; web-shaped model; Webs |
title | Web-Shaped Model for Head Pose Estimation: An Approach for Best Exemplar Selection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T13%3A22%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Web-Shaped%20Model%20for%20Head%20Pose%20Estimation:%20An%20Approach%20for%20Best%20Exemplar%20Selection&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Barra,%20Paola&rft.date=2020&rft.volume=29&rft.spage=5457&rft.epage=5468&rft.pages=5457-5468&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2020.2984373&rft_dat=%3Cproquest_RIE%3E2389362610%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2389362610&rft_id=info:pmid/&rft_ieee_id=9057523&rfr_iscdi=true |