Soft thresholding squeeze-and-excitation network for pose-invariant facial expression recognition

Pose-invariant facial expression recognition is a popular research direction in computer vision, but pose variation usually changes the facial appearance significantly, making recognition results unstable across viewpoints. In this paper, a novel deep learning method...

Detailed description

Bibliographic details
Published in: The Visual computer 2023-07, Vol.39 (7), p.2637-2652
Main authors: Liu, Chaoji, Liu, Xingqiao, Chen, Chong, Wang, Qiankun
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2652
container_issue 7
container_start_page 2637
container_title The Visual computer
container_volume 39
creator Liu, Chaoji
Liu, Xingqiao
Chen, Chong
Wang, Qiankun
description Pose-invariant facial expression recognition is a popular research direction in computer vision, but pose variation usually changes the facial appearance significantly, making recognition results unstable across viewpoints. In this paper, a novel deep learning method, namely the soft thresholding squeeze-and-excitation (ST-SE) block, was proposed to extract salient features across channels for pose-invariant FER. To better adapt to facial images under different poses, a global average pooling (GAP) operation was adopted to compute the average value of each channel of the feature map. To enhance the representational power of the network, a Squeeze-and-Excitation (SE) block was embedded into the nonlinear transformation layer to filter out redundant feature information. To further shrink the salient features, the absolute value of the GAP output was multiplied by the SE output to calculate a threshold suited to the current view. The developed ST-SE block was then inserted into ResNet50 to evaluate recognition performance. Extensive experiments were carried out on four multi-pose datasets, i.e., BU-3DFE, Multi-PIE, Pose-RAF-DB and Pose-AffectNet, and the influences of different environments, poses and intensities on expression recognition were analyzed in detail. The experimental results demonstrate the feasibility and effectiveness of our method.
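The abstract describes the ST-SE block's mechanism: a per-channel threshold is formed by multiplying the absolute GAP output with the SE attention weights, and the feature map is then shrunk by soft thresholding, y = sign(x) · max(|x| − τ, 0). The following is a minimal PyTorch sketch of that idea, assuming a standard SE reduction ratio of 16; the class and parameter names are illustrative, not the authors' published implementation.

```python
# Hypothetical sketch of an ST-SE block, inferred from the abstract only.
# Layer sizes and the reduction ratio are assumptions, not taken from the paper.
import torch
import torch.nn as nn

class STSEBlock(nn.Module):
    """Soft-thresholding squeeze-and-excitation block (illustrative sketch)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # SE-style excitation: two FC layers yielding per-channel scales in (0, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # GAP over the absolute feature map: per-channel average magnitude
        gap = x.abs().mean(dim=(2, 3))            # (b, c)
        scale = self.fc(gap)                      # (b, c), SE attention weights
        # Threshold for the current view: |GAP| multiplied by the SE output
        tau = (gap * scale).view(b, c, 1, 1)      # broadcast over spatial dims
        # Soft thresholding: sign(x) * max(|x| - tau, 0)
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

if __name__ == "__main__":
    x = torch.randn(2, 64, 56, 56)
    out = STSEBlock(64)(x)   # output has the same shape as the input
    print(out.shape)
```

Per the abstract, such a block would be inserted into ResNet50; one plausible placement is after the convolutions of each residual unit, in the position an SE block normally occupies, though the paper's exact insertion point is not specified here.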
doi_str_mv 10.1007/s00371-022-02483-5
format Article
publisher Berlin/Heidelberg: Springer Berlin Heidelberg
rights The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022
orcidid https://orcid.org/0000-0002-9381-5158
fulltext fulltext
identifier ISSN: 0178-2789
ispartof The Visual computer, 2023-07, Vol.39 (7), p.2637-2652
issn 0178-2789
1432-2315
language eng
recordid cdi_proquest_journals_2918043634
source Springer Nature - Complete Springer Journals; ProQuest Central UK/Ireland; ProQuest Central
subjects Accuracy
Artificial Intelligence
Computer Graphics
Computer Science
Computer vision
Deep learning
Excitation
Face recognition
Feature maps
Image Processing and Computer Vision
Invariants
Neural networks
Original Article
Teaching methods
title Soft thresholding squeeze-and-excitation network for pose-invariant facial expression recognition
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T04%3A41%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Soft%20thresholding%20squeeze-and-excitation%20network%20for%20pose-invariant%20facial%20expression%20recognition&rft.jtitle=The%20Visual%20computer&rft.au=Liu,%20Chaoji&rft.date=2023-07-01&rft.volume=39&rft.issue=7&rft.spage=2637&rft.epage=2652&rft.pages=2637-2652&rft.issn=0178-2789&rft.eissn=1432-2315&rft_id=info:doi/10.1007/s00371-022-02483-5&rft_dat=%3Cproquest_cross%3E2918043634%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2918043634&rft_id=info:pmid/&rfr_iscdi=true