Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization
Facial expression recognition (FER) is a promising but challenging area of Computer Vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine resolution images but have difficulty recognizing in-the-...
Published in: | IEEE access 2021, Vol.9, p.64487-64498 |
---|---|
Main authors: | Zang, Huanyu; Foo, Simon Y.; Bernadin, Shonda; Meyer-Baese, Anke |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 64498 |
---|---|
container_issue | |
container_start_page | 64487 |
container_title | IEEE access |
container_volume | 9 |
creator | Zang, Huanyu; Foo, Simon Y.; Bernadin, Shonda; Meyer-Baese, Anke |
description | Facial expression recognition (FER) is a promising but challenging area of computer vision (CV). Many researchers have devoted significant resources to exploring FER in recent years, but an impediment remains: classifiers perform well on fine-resolution images but have difficulty recognizing in-the-wild human emotional states. To address this issue, we introduced three novel designs and implemented them in neural networks. More specifically, we used an asymmetric pyramidal network (APNet) and employed multi-scale kernels instead of identically sized kernels. In addition, square kernels were replaced by a sequence of square, horizontal, and vertical convolutions. This structure can increase the descriptive ability of convolutional neural networks (CNNs) and transfer multi-scale features between different layers. Additionally, when training the CNN, we adopted stochastic gradient descent with gradient centralization (SGDGC), which centralizes gradients to have zero mean and makes the training process more efficient and stable. To verify the effectiveness of APNet with SGDGC, we ran our experiments on three of the most popular in-the-wild emotion datasets: FER-2013, CK+, and JAFFE. The experimental results and comparisons with state-of-the-art designs from others demonstrate that our method outperforms all the single-model methods and performs comparably to model-fusion methods. |
doi_str_mv | 10.1109/ACCESS.2021.3075389 |
format | Article |
fullrecord | Source: proquest_cross (source format: XML); record type: article; IEEE document ID: 9411858; DOAJ ID: oai_doaj_org_article_cf55273f942d47ee810baef8b035e098; ProQuest ID: 2519966974; publisher: Piscataway: IEEE; ISSN: 2169-3536; EISSN: 2169-3536; CODEN: IAECCG; total pages: 12; rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021; peer reviewed; open access (free for read); author ORCID: https://orcid.org/0000-0002-7334-5718; HTML link: https://ieeexplore.ieee.org/document/9411858 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2021, Vol.9, p.64487-64498 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2519966974 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Artificial neural networks; asymmetric block; Asymmetry; Computational efficiency; Computer vision; convolution neural networks; Emotion recognition; Emotional factors; Emotions; Facial expression recognition; Feature extraction; Kernel; Kernels; Neural networks; Object recognition; Optimization; optimizer; Technological innovation; Training; Videos |
title | Facial Emotion Recognition Using Asymmetric Pyramidal Networks With Gradient Centralization |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T21%3A18%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Facial%20Emotion%20Recognition%20Using%20Asymmetric%20Pyramidal%20Networks%20With%20Gradient%20Centralization&rft.jtitle=IEEE%20access&rft.au=Zang,%20Huanyu&rft.date=2021&rft.volume=9&rft.spage=64487&rft.epage=64498&rft.pages=64487-64498&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2021.3075389&rft_dat=%3Cproquest_cross%3E2519966974%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2519966974&rft_id=info:pmid/&rft_ieee_id=9411858&rft_doaj_id=oai_doaj_org_article_cf55273f942d47ee810baef8b035e098&rfr_iscdi=true |
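
The description field above says that SGDGC "centralizes gradients to have zero mean". The record gives no further detail, so what follows is only a minimal NumPy sketch of that idea, assuming the commonly described form of gradient centralization in which the mean of each output-unit slice of a weight gradient is subtracted before a plain SGD update. The names `centralize_gradient` and `sgd_gc_step`, the learning rate, and the tensor shapes are hypothetical and are not taken from the paper.

```python
import numpy as np


def centralize_gradient(grad):
    """Return a copy of grad whose output-unit slices each have zero mean.

    Assumption: axis 0 indexes output units (filters or neurons), and
    1-D gradients (e.g. biases) are left unchanged.
    """
    if grad.ndim <= 1:
        return grad
    # Average over every axis except the first, keeping dims for broadcasting.
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)


def sgd_gc_step(weights, grad, lr=0.01):
    """One plain SGD update using the centralized gradient (illustrative only)."""
    return weights - lr * centralize_gradient(grad)


# Hypothetical gradient for a conv layer: 4 filters, 3 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
conv_grad = rng.normal(size=(4, 3, 3, 3))
conv_weights = rng.normal(size=(4, 3, 3, 3))

centered = centralize_gradient(conv_grad)
per_filter_means = centered.mean(axis=(1, 2, 3))
updated_weights = sgd_gc_step(conv_weights, conv_grad)
```

In this sketch each filter's gradient slice sums to zero after centralization, so the update leaves the per-filter sum of the weights unchanged. Momentum, weight decay, and any other optimizer details used in the paper are deliberately left out.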