Web-Shaped Model for Head Pose Estimation: An Approach for Best Exemplar Selection

Head pose estimation is a sensitive topic in video surveillance and smart-environment scenarios, since head rotations can hide or distort the discriminative features of the face. Face recognition must often deal with video frames in which subjects appear in poses that make recognition practically impossible. In this respect, selecting the frames with the best face orientation allows recognition to be triggered only on these, thereby decreasing the possibility of errors. This paper proposes a novel approach to head pose estimation for smart cities and video surveillance scenarios, aiming at this goal. The method relies on a cascade of two models: the first predicts the positions of the 68 well-known face landmarks; the second applies a web-shaped model over the detected landmarks to associate each of them with a specific face sector. The method works on faces detected at a reasonable distance and at a resolution supported by many current devices. Results of experiments on classical pose estimation benchmarks, namely the Pointing'04, Biwi, and AFLW datasets, show good performance in terms of both pose estimation accuracy and computing time. Further results refer to noisy images that are typical of the addressed settings. Finally, examples demonstrate the selection of the best frames from videos captured under video surveillance conditions.
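As a rough illustration of the two-stage idea described in the abstract, the following Python snippet is a minimal sketch, not the authors' implementation: it uses dlib's standard 68-point shape predictor for the first stage and, for the second stage, a simple radial "web" of sectors centred on the landmark centroid. The helper names, the 16-sector count, and the use of per-sector landmark counts as a pose descriptor are assumptions made purely for illustration.

```python
# Hypothetical sketch of the pipeline outlined in the abstract:
# (1) predict 68 face landmarks, (2) overlay a web of angular sectors
# centred on the face and map each landmark to a sector.
# NOT the authors' code; the descriptor below is an illustrative assumption.
import math

import dlib  # assumes dlib and its 68-point model file are available
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")


def landmarks_68(gray_image: np.ndarray) -> np.ndarray:
    """Return the 68 (x, y) landmark positions of the first detected face."""
    faces = detector(gray_image, 1)
    if not faces:
        raise ValueError("no face detected")
    shape = predictor(gray_image, faces[0])
    return np.array([(p.x, p.y) for p in shape.parts()], dtype=float)


def web_sector_histogram(points: np.ndarray, n_sectors: int = 16) -> np.ndarray:
    """Assign each landmark to one of n_sectors angular sectors of a web
    centred on the landmark centroid, and return the per-sector counts."""
    centre = points.mean(axis=0)
    angles = np.arctan2(points[:, 1] - centre[1], points[:, 0] - centre[0])
    sectors = ((angles + math.pi) / (2 * math.pi) * n_sectors).astype(int) % n_sectors
    return np.bincount(sectors, minlength=n_sectors)
```

A descriptor of this kind could then be compared across video frames to prefer the most frontal pose, which is the "best exemplar selection" goal the paper addresses; how the actual web-shaped model maps sectors to yaw/pitch/roll is described in the paper itself.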

Bibliographic Details
Published in: IEEE Transactions on Image Processing, 2020, Vol. 29, pp. 5457-5468
Authors: Barra, Paola; Barra, Silvio; Bisogni, Carmen; De Marsico, Maria; Nappi, Michele
Format: Article
Language: English
DOI: 10.1109/TIP.2020.2984373
ISSN: 1057-7149; EISSN: 1941-0042
Publisher: IEEE (New York)
Source: IEEE Electronic Library (IEL)
Subject terms: Computing time; Face; Face recognition; Frames (data processing); Head movement; Head pose estimation; head pose exemplar selection; Landmarks; Pose estimation; Proposals; smart cities applications; Surveillance; Training; web-shaped model; Webs
Online access: Order full text