Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization

Facial expression recognition (FER) is a promising but challenging area of Computer Vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine resolution images but have difficulty recognizing in-the-...

Bibliographic details
Published in: IEEE Access 2021, Vol. 9, p. 64487-64498
Main authors: Zang, Huanyu; Foo, Simon Y.; Bernadin, Shonda; Meyer-Baese, Anke
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 64498
container_issue
container_start_page 64487
container_title IEEE access
container_volume 9
creator Zang, Huanyu
Foo, Simon Y.
Bernadin, Shonda
Meyer-Baese, Anke
description Facial expression recognition (FER) is a promising but challenging area of Computer Vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine-resolution images but have difficulty recognizing in-the-wild human emotional states. To address this issue, we introduced three novel designs and implemented them in neural networks. More specifically, we utilized an asymmetric pyramidal network (APNet) and employed multi-scale kernels instead of identical-size kernels. In addition, square kernels were replaced by a sequence of square, horizontal, and vertical convolutions. This structure can increase the descriptive ability of convolutional neural networks (CNNs) and transfer multi-scale features between different layers. Additionally, when training the CNN, we adopted stochastic gradient descent with gradient centralization (SGDGC), which centralizes gradients to have zero mean and makes the training process more efficient and stable. To verify the effectiveness of APNet with SGDGC, we used three of the most popular in-the-wild emotion datasets, FER-2013, CK+, and JAFFE, for our experiments. Our experimental results and comparisons with state-of-the-art designs from others demonstrate that our method outperforms all the single-model methods and has comparable performance with model fusion methods.
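The abstract says that gradient centralization "centralizes gradients to have zero mean". The sketch below is one possible illustration of that idea in plain Python, not the authors' implementation. It assumes the gradient of a layer is stored as a list of rows, one row per output unit (for a convolutional layer, one flattened filter per row), and that the mean is taken over each row. The names centralize_gradient and sgd_gc_step, the learning rate, and the example numbers are hypothetical.

```python
def centralize_gradient(grad):
    """Subtract each row's mean so that every row of the gradient has zero mean.

    Assumption: `grad` is a list of rows, and each row holds the gradient for
    one output unit's weight vector (a flattened filter for a conv layer).
    """
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


def sgd_gc_step(weights, grad, lr=0.1):
    """One plain SGD update that uses the centralized gradient.

    Hypothetical helper: no momentum or weight decay, only w <- w - lr * g_c.
    """
    centered = centralize_gradient(grad)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, centered)
    ]


# Illustrative values only (not taken from the paper).
weights = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
grad = [[0.25, 0.5, 0.75], [1.0, 1.0, 1.0]]

new_weights = sgd_gc_step(weights, grad, lr=0.1)
for row in new_weights:
    print(row)
```

Under these assumptions, a row whose gradient entries are all equal receives no update, and the sum of each weight row is unchanged by the step, because the centralized gradient of every row sums to zero.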
doi_str_mv 10.1109/ACCESS.2021.3075389
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2519966974</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9411858</ieee_id><doaj_id>oai_doaj_org_article_cf55273f942d47ee810baef8b035e098</doaj_id><sourcerecordid>2519966974</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-2366c382810a8a2d8701f199c1f7af3848c846ac7323cdc15aec6407d675009a3</originalsourceid><addsrcrecordid>eNpNUUtLAzEQXkRBUX9BLwueW_PYvI5laWuhqFjFg4cwzWZranejyYrUX2_aFXEOM8PwPQa-LBtgNMIYqetxWU6WyxFBBI8oEoxKdZSdEczVkDLKj__tp9lljBuUSqYTE2fZyxSMg20-aXznfJs_WOPXrTvsT9G163wcd01ju-BMfr8L0LgqwW9t9-XDW8yfXfeazwJUzrZdXqYWYOu-YS9wkZ3UsI328neeZ0_TyWN5M1zczebleDE0BZLdkFDODZVEYgQSSCUFwjVWyuBaQE1lIY0sOBhBCTWVwQys4QUSFRcMIQX0PJv3upWHjX4ProGw0x6cPhx8WGsInTNbq03NGBG0VgWpCmFt8lyBreUKUWaRkknrqtd6D_7j08ZOb_xnaNP7mrD0FOdKFAlFe5QJPsZg6z9XjPQ-FN2Hoveh6N9QEmvQs5y19o-hCowlk_QHKMeH2w</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2519966974</pqid></control><display><type>article</type><title>Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Zang, Huanyu ; Foo, Simon Y. ; Bernadin, Shonda ; Meyer-Baese, Anke</creator><creatorcontrib>Zang, Huanyu ; Foo, Simon Y. ; Bernadin, Shonda ; Meyer-Baese, Anke</creatorcontrib><description>Facial expression recognition (FER) is a promising but challenging area of Computer Vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine resolution images but have difficulty recognizing in-the-wild human emotional states. 
In order to solve the aforementioned issue, we introduced three novel designs and implemented them in neural networks. More specifically, we utilized an asymmetric pyramidal network (APNet) and employed multi-scale kernels instead of identical size kernels. In addition, square kernels were replaced by a sequence of square, horizontal, and vertical convolutions. This structure can increase the description ability of convolutional neural networks (CNN) and transfer multi-scale features between different layers. Additionally, when training CNN, we adopted stochastic gradient descent with gradient centralization (SGDGC) where it centralizes gradients to have zero mean and makes the training process more efficient and stable. To verify the effectiveness of APNet with SGDGC, we used three of the most popular in-the-wild emotion datasets, FER-2013, CK+, and JAFFE, for our experiments. The results of our experiment and comparisons with state-of-the-art designs from others demonstrate that our method outperforms all the single model methods and has comparable performance with model fusion methods.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2021.3075389</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Artificial neural networks ; asymmetric block ; Asymmetry ; Computational efficiency ; Computer vision ; convolution neutral networks ; Emotion recognition ; Emotional factors ; Emotions ; Facial expression recognition ; Feature extraction ; Kernel ; Kernels ; Neural networks ; Object recognition ; Optimization ; optimizer ; Technological innovation ; Training ; Videos</subject><ispartof>IEEE access, 2021, Vol.9, p.64487-64498</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-2366c382810a8a2d8701f199c1f7af3848c846ac7323cdc15aec6407d675009a3</citedby><cites>FETCH-LOGICAL-c408t-2366c382810a8a2d8701f199c1f7af3848c846ac7323cdc15aec6407d675009a3</cites><orcidid>0000-0002-7334-5718</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9411858$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,860,2096,4010,27610,27900,27901,27902,54908</link.rule.ids></links><search><creatorcontrib>Zang, Huanyu</creatorcontrib><creatorcontrib>Foo, Simon Y.</creatorcontrib><creatorcontrib>Bernadin, Shonda</creatorcontrib><creatorcontrib>Meyer-Baese, Anke</creatorcontrib><title>Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization</title><title>IEEE access</title><addtitle>Access</addtitle><description>Facial expression recognition (FER) is a promising but challenging area of Computer Vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine resolution images but have difficulty recognizing in-the-wild human emotional states. In order to solve the aforementioned issue, we introduced three novel designs and implemented them in neural networks. More specifically, we utilized an asymmetric pyramidal network (APNet) and employed multi-scale kernels instead of identical size kernels. In addition, square kernels were replaced by a sequence of square, horizontal, and vertical convolutions. This structure can increase the description ability of convolutional neural networks (CNN) and transfer multi-scale features between different layers. 
Additionally, when training CNN, we adopted stochastic gradient descent with gradient centralization (SGDGC) where it centralizes gradients to have zero mean and makes the training process more efficient and stable. To verify the effectiveness of APNet with SGDGC, we used three of the most popular in-the-wild emotion datasets, FER-2013, CK+, and JAFFE, for our experiments. The results of our experiment and comparisons with state-of-the-art designs from others demonstrate that our method outperforms all the single model methods and has comparable performance with model fusion methods.</description><subject>Artificial neural networks</subject><subject>asymmetric block</subject><subject>Asymmetry</subject><subject>Computational efficiency</subject><subject>Computer vision</subject><subject>convolution neutral networks</subject><subject>Emotion recognition</subject><subject>Emotional factors</subject><subject>Emotions</subject><subject>Facial expression recognition</subject><subject>Feature extraction</subject><subject>Kernel</subject><subject>Kernels</subject><subject>Neural networks</subject><subject>Object recognition</subject><subject>Optimization</subject><subject>optimizer</subject><subject>Technological 
innovation</subject><subject>Training</subject><subject>Videos</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUUtLAzEQXkRBUX9BLwueW_PYvI5laWuhqFjFg4cwzWZranejyYrUX2_aFXEOM8PwPQa-LBtgNMIYqetxWU6WyxFBBI8oEoxKdZSdEczVkDLKj__tp9lljBuUSqYTE2fZyxSMg20-aXznfJs_WOPXrTvsT9G163wcd01ju-BMfr8L0LgqwW9t9-XDW8yfXfeazwJUzrZdXqYWYOu-YS9wkZ3UsI328neeZ0_TyWN5M1zczebleDE0BZLdkFDODZVEYgQSSCUFwjVWyuBaQE1lIY0sOBhBCTWVwQys4QUSFRcMIQX0PJv3upWHjX4ProGw0x6cPhx8WGsInTNbq03NGBG0VgWpCmFt8lyBreUKUWaRkknrqtd6D_7j08ZOb_xnaNP7mrD0FOdKFAlFe5QJPsZg6z9XjPQ-FN2Hoveh6N9QEmvQs5y19o-hCowlk_QHKMeH2w</recordid><startdate>2021</startdate><enddate>2021</enddate><creator>Zang, Huanyu</creator><creator>Foo, Simon Y.</creator><creator>Bernadin, Shonda</creator><creator>Meyer-Baese, Anke</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-7334-5718</orcidid></search><sort><creationdate>2021</creationdate><title>Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization</title><author>Zang, Huanyu ; Foo, Simon Y. 
; Bernadin, Shonda ; Meyer-Baese, Anke</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-2366c382810a8a2d8701f199c1f7af3848c846ac7323cdc15aec6407d675009a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Artificial neural networks</topic><topic>asymmetric block</topic><topic>Asymmetry</topic><topic>Computational efficiency</topic><topic>Computer vision</topic><topic>convolution neutral networks</topic><topic>Emotion recognition</topic><topic>Emotional factors</topic><topic>Emotions</topic><topic>Facial expression recognition</topic><topic>Feature extraction</topic><topic>Kernel</topic><topic>Kernels</topic><topic>Neural networks</topic><topic>Object recognition</topic><topic>Optimization</topic><topic>optimizer</topic><topic>Technological innovation</topic><topic>Training</topic><topic>Videos</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zang, Huanyu</creatorcontrib><creatorcontrib>Foo, Simon Y.</creatorcontrib><creatorcontrib>Bernadin, Shonda</creatorcontrib><creatorcontrib>Meyer-Baese, Anke</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems 
Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zang, Huanyu</au><au>Foo, Simon Y.</au><au>Bernadin, Shonda</au><au>Meyer-Baese, Anke</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2021</date><risdate>2021</risdate><volume>9</volume><spage>64487</spage><epage>64498</epage><pages>64487-64498</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>Facial expression recognition (FER) is a promising but challenging area of Computer Vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine resolution images but have difficulty recognizing in-the-wild human emotional states. In order to solve the aforementioned issue, we introduced three novel designs and implemented them in neural networks. More specifically, we utilized an asymmetric pyramidal network (APNet) and employed multi-scale kernels instead of identical size kernels. In addition, square kernels were replaced by a sequence of square, horizontal, and vertical convolutions. This structure can increase the description ability of convolutional neural networks (CNN) and transfer multi-scale features between different layers. Additionally, when training CNN, we adopted stochastic gradient descent with gradient centralization (SGDGC) where it centralizes gradients to have zero mean and makes the training process more efficient and stable. 
To verify the effectiveness of APNet with SGDGC, we used three of the most popular in-the-wild emotion datasets, FER-2013, CK+, and JAFFE, for our experiments. The results of our experiment and comparisons with state-of-the-art designs from others demonstrate that our method outperforms all the single model methods and has comparable performance with model fusion methods.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2021.3075389</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0002-7334-5718</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2021, Vol.9, p.64487-64498
issn 2169-3536
2169-3536
language eng
recordid cdi_proquest_journals_2519966974
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Artificial neural networks
asymmetric block
Asymmetry
Computational efficiency
Computer vision
convolution neural networks
Emotion recognition
Emotional factors
Emotions
Facial expression recognition
Feature extraction
Kernel
Kernels
Neural networks
Object recognition
Optimization
optimizer
Technological innovation
Training
Videos
title Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T21%3A18%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Facial%20Emotion%20Recognition%20Using%20Asymmetric%20Pyramidal%20Networks%20With%20Gradient%20Centralization&rft.jtitle=IEEE%20access&rft.au=Zang,%20Huanyu&rft.date=2021&rft.volume=9&rft.spage=64487&rft.epage=64498&rft.pages=64487-64498&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2021.3075389&rft_dat=%3Cproquest_cross%3E2519966974%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2519966974&rft_id=info:pmid/&rft_ieee_id=9411858&rft_doaj_id=oai_doaj_org_article_cf55273f942d47ee810baef8b035e098&rfr_iscdi=true