A Multi-term and Multi-task Analyzing Framework for Affective Analysis in-the-wild

Human affective recognition is an important factor in human-computer interaction. However, methods developed on in-the-wild data are not yet accurate enough for practical use. In this paper, we introduce an affective recognition method focusing on valence-arousal (VA) and expression (EXP) that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2020 Contest. Because affective behaviors exhibit many observable features, each with its own time frame, we introduced multiple optimized time windows (short-term, middle-term, and long-term) into our analyzing framework for extracting feature parameters from video data. Moreover, multimodal data are used, including action units, head pose, gaze, posture, and ResNet-50 or EfficientNet features, which are optimized during feature extraction. We then generated an affective recognition model for each time window and ensembled these models. We also fused the valence, arousal, and expression models to enable multi-task learning, since the basic psychological states behind facial expressions are closely related to one another. On the validation set, our model achieved a valence-arousal score of 0.498 and a facial expression score of 0.471. These results show that the proposed framework effectively improves estimation accuracy and robustness.
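The abstract outlines a concrete architecture: window-specific encoders over multimodal per-frame features, ensembled across short-, middle-, and long-term windows and fused into joint VA and EXP heads. Below is a minimal illustrative sketch of that idea in PyTorch; the record contains no code, so the branch design (one GRU per window), the window lengths (8/32/128 frames), the 512-dimensional feature vector, and the 7 expression classes are all assumptions, not the authors' implementation.

```python
# Minimal sketch of the multi-term, multi-task idea from the abstract.
# ASSUMPTIONS: GRU branches, window lengths, feature/hidden sizes, and
# 7 expression classes are illustrative guesses; the paper may differ.
import torch
import torch.nn as nn


class TermBranch(nn.Module):
    """Summarizes per-frame features over one time window."""

    def __init__(self, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window_len, feat_dim); the final hidden state
        # acts as a fixed-size summary of the whole window.
        _, h_n = self.gru(x)
        return h_n.squeeze(0)  # (batch, hidden)


class MultiTermMultiTask(nn.Module):
    """Short/middle/long-term branches fused into VA and EXP heads."""

    def __init__(self, feat_dim: int, n_expr: int = 7, hidden: int = 128):
        super().__init__()
        self.branches = nn.ModuleList(
            [TermBranch(feat_dim, hidden) for _ in range(3)]
        )
        fused = 3 * hidden
        self.va_head = nn.Linear(fused, 2)        # valence, arousal
        self.exp_head = nn.Linear(fused, n_expr)  # expression logits

    def forward(self, short_x, mid_x, long_x):
        z = torch.cat(
            [b(x) for b, x in zip(self.branches, (short_x, mid_x, long_x))],
            dim=-1,
        )
        # tanh keeps valence/arousal in [-1, 1], matching the VA task range.
        return torch.tanh(self.va_head(z)), self.exp_head(z)


# Per-frame multimodal features (AUs, head pose, gaze, posture, CNN
# embeddings) concatenated into one 512-dim vector; three slices of the
# same video stand in for the short-, middle-, and long-term windows.
model = MultiTermMultiTask(feat_dim=512)
va, exp_logits = model(
    torch.randn(4, 8, 512),    # short-term window
    torch.randn(4, 32, 512),   # middle-term window
    torch.randn(4, 128, 512),  # long-term window
)
```

Note that the paper describes ensembling separately generated per-window models, whereas this sketch fuses the three windows inside a single network; either composition expresses the same multi-term idea, and training the two heads jointly is what makes the fusion multi-task.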

Bibliographic Details
Main Authors: Youoku, Sachihiro; Toyoda, Yuushi; Yamamoto, Takahisa; Saito, Junya; Kawamura, Ryosuke; Mi, Xiaoyu; Murase, Kentaro
Format: Article
Language: English
Subjects: Computer Science - Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2009.13885
DOI: 10.48550/arxiv.2009.13885
Published: 2020-09-29
Source: arXiv.org