A Multi-term and Multi-task Analyzing Framework for Affective Analysis in-the-wild

Human affective recognition is an important factor in human-computer interaction. However, methods developed on in-the-wild data are not yet accurate enough for practical use. In this paper, we introduce an affective recognition method focusing on valence-arousal (VA) and expression (EXP) that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2020 Contest. Because affective behaviors exhibit many observable features, each with its own time frame, we introduced multiple optimized time windows (short-term, middle-term, and long-term) into our analyzing framework for extracting feature parameters from video data. Moreover, multimodal data are used, including action units, head pose, gaze, posture, and ResNet-50 or EfficientNet features, which are optimized during feature extraction. We then generated an affective recognition model for each time window and ensembled these models. We also fused the valence, arousal, and expression models to enable multi-task learning, since the basic psychological states behind facial expressions are closely related to one another. On the validation set, our model achieved a valence-arousal score of 0.498 and a facial expression score of 0.471. These results show that the proposed framework effectively improves estimation accuracy and robustness.
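The abstract outlines a concrete architecture: window-specific encoders over multimodal per-frame features, ensembled across short-, middle-, and long-term windows and fused into joint VA and EXP heads. Below is a minimal illustrative sketch of that idea in PyTorch; the record contains no code, so the branch design (one GRU per window), the window lengths (8/32/128 frames), the 512-dimensional feature vector, and the 7 expression classes are all assumptions, not the authors' implementation.

```python
# Minimal sketch of the multi-term, multi-task idea from the abstract.
# ASSUMPTIONS: GRU branches, window lengths, feature/hidden sizes, and
# 7 expression classes are illustrative guesses; the paper may differ.
import torch
import torch.nn as nn


class TermBranch(nn.Module):
    """Summarizes per-frame features over one time window."""

    def __init__(self, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window_len, feat_dim); the final hidden state
        # acts as a fixed-size summary of the whole window.
        _, h_n = self.gru(x)
        return h_n.squeeze(0)  # (batch, hidden)


class MultiTermMultiTask(nn.Module):
    """Short/middle/long-term branches fused into VA and EXP heads."""

    def __init__(self, feat_dim: int, n_expr: int = 7, hidden: int = 128):
        super().__init__()
        self.branches = nn.ModuleList(
            [TermBranch(feat_dim, hidden) for _ in range(3)]
        )
        fused = 3 * hidden
        self.va_head = nn.Linear(fused, 2)        # valence, arousal
        self.exp_head = nn.Linear(fused, n_expr)  # expression logits

    def forward(self, short_x, mid_x, long_x):
        z = torch.cat(
            [b(x) for b, x in zip(self.branches, (short_x, mid_x, long_x))],
            dim=-1,
        )
        # tanh keeps valence/arousal in [-1, 1], matching the VA task range.
        return torch.tanh(self.va_head(z)), self.exp_head(z)


# Per-frame multimodal features (AUs, head pose, gaze, posture, CNN
# embeddings) concatenated into one 512-dim vector; three slices of the
# same video stand in for the short-, middle-, and long-term windows.
model = MultiTermMultiTask(feat_dim=512)
va, exp_logits = model(
    torch.randn(4, 8, 512),    # short-term window
    torch.randn(4, 32, 512),   # middle-term window
    torch.randn(4, 128, 512),  # long-term window
)
```

Note that the paper describes ensembling separately generated per-window models, whereas this sketch fuses the three windows inside a single network; either composition expresses the same multi-term idea, and training the two heads jointly is what makes the fusion multi-task.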

Bibliographic Details
Main Authors: Youoku, Sachihiro; Toyoda, Yuushi; Yamamoto, Takahisa; Saito, Junya; Kawamura, Ryosuke; Mi, Xiaoyu; Murase, Kentaro
Format: Article
Language: English
Subjects: Computer Science - Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2009.13885
DOI: 10.48550/arxiv.2009.13885
Published: 2020-09-29
Source: arXiv.org