Unveiling hidden factors: explainable AI for feature boosting in speech emotion recognition



Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2024-06, Vol.54 (11-12), p.7046-7069
Authors: Nfissi, Alaa; Bouachir, Wassim; Bouguila, Nizar; Mishara, Brian
Format: Article
Language: English
Online access: Full text
Description: Speech emotion recognition (SER) has gained significant attention due to its many application fields, such as mental health, education, and human-computer interaction. However, the accuracy of SER systems is hindered by high-dimensional feature sets that may contain irrelevant and redundant information. To overcome this challenge, this study proposes an iterative feature boosting approach for SER that emphasizes feature relevance and explainability to enhance machine learning model performance. Our approach involves meticulous feature selection and analysis to build efficient SER systems. In addressing our main problem through model explainability, we employ a feature evaluation loop with Shapley values to iteratively refine feature sets. This process strikes a balance between model performance and transparency, which enables a comprehensive understanding of the model's predictions. The proposed approach offers several advantages, including the identification and removal of irrelevant and redundant features, leading to a more effective model. Additionally, it promotes explainability, facilitating comprehension of the model's predictions and the identification of crucial features for emotion determination. The effectiveness of the proposed method is validated on the SER benchmarks Toronto Emotional Speech Set (TESS), Berlin Database of Emotional Speech (EMO-DB), Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), and Surrey Audio-Visual Expressed Emotion (SAVEE), outperforming state-of-the-art methods. These results highlight the potential of the proposed technique in developing accurate and explainable SER systems. To the best of our knowledge, this is the first work to incorporate model explainability into an SER framework. The source code of this paper is publicly available at https://github.com/alaaNfissi/Unveiling-Hidden-Factors-Explainable-AI-for-Feature-Boosting-in-Speech-Emotion-Recognition.
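The feature-evaluation loop described in the abstract (train a model, score each feature's contribution with Shapley values, drop low-contribution features, repeat) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Monte-Carlo Shapley estimator, the `subset_score` value function, and all parameter values (`n_perms`, `threshold`, the random-forest model) are hypothetical choices for demonstration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def subset_score(X, y, features):
    # Value function: cross-validated accuracy of a model trained on the feature subset.
    if not features:
        # Empty subset: fall back to majority-class accuracy.
        return np.mean(y == np.bincount(y).argmax())
    clf = RandomForestClassifier(n_estimators=20, random_state=0)
    return cross_val_score(clf, X[:, list(features)], y, cv=3).mean()

def shapley_importance(X, y, n_perms=8):
    # Monte-Carlo Shapley estimate: average marginal accuracy gain of each
    # feature over random feature orderings.
    d = X.shape[1]
    phi = np.zeros(d)
    for _ in range(n_perms):
        order = rng.permutation(d)
        current, prev = [], subset_score(X, y, [])
        for f in order:
            current.append(f)
            score = subset_score(X, y, current)
            phi[f] += score - prev
            prev = score
    return phi / n_perms

def feature_boost(X, y, threshold=0.0, max_iter=3):
    # Feature-evaluation loop: iteratively drop features whose estimated
    # Shapley contribution to accuracy is not above the threshold.
    keep = list(range(X.shape[1]))
    for _ in range(max_iter):
        phi = shapley_importance(X[:, keep], y)
        good = [keep[i] for i in np.flatnonzero(phi > threshold)]
        if not good or len(good) == len(keep):
            break
        keep = good
    return keep
```

On synthetic data with a few informative and a few noise columns, the loop tends to retain the informative ones; the threshold trades pruning aggressiveness against the variance of the Monte-Carlo estimate.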
DOI: 10.1007/s10489-024-05536-5
Publisher: Springer US (New York)
Rights: The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024
ORCID: 0000-0001-7224-7940; 0000-0003-3896-7674; 0000-0003-2828-8010; 0000-0003-2449-6269
ISSN: 0924-669X
EISSN: 1573-7497
Source: Springer Nature - Complete Springer Journals
Subjects: Artificial Intelligence
Computer Science
Emotion recognition
Emotions
Explainable artificial intelligence
Feature recognition
Machine learning
Machines
Manufacturing
Mechanical Engineering
Processes
Source code
Speech
Speech recognition