Emotion recognition based on the energy distribution of plosive syllables

We usually encounter two problems during speech emotion recognition (SER): expression and perception problems, which vary considerably between speakers, languages, and sentence pronunciation. Finding an optimal system that characterizes emotions while overcoming all these differences is therefore a promising prospect. In this perspective, we considered two emotional databases: the Moroccan Arabic dialect emotional database (MADED) and the Ryerson audio-visual database of emotional speech and song (RAVDESS), which present notable differences in type (natural/acted) and language (Arabic/English). We proposed a detection process based on 27 acoustic features extracted from consonant-vowel (CV) syllabic units /ba/, /du/, /ki/, /ta/ common to both databases. We tested two classification strategies: multiclass (all emotions combined: joy, sadness, neutral, anger) and binary (neutral vs. others; positive emotions (joy) vs. negative emotions (sadness, anger); sadness vs. anger). These strategies were tested three times: i) on MADED, ii) on RAVDESS, and iii) on MADED and RAVDESS combined. The proposed method gave better recognition accuracy in the case of binary classification: the rates reach an average of 78% for the multiclass classification, 100% for neutral vs. others, 100% for the negative emotions (i.e., anger vs. sadness), and 96% for positive vs. negative emotions.
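The abstract compares one multiclass strategy against several binary strategies over 27 acoustic features per CV syllable. The short Python sketch below illustrates how such a comparison could be set up; the SVM classifier, the cross-validation scheme, and the randomly generated placeholder feature matrix are illustrative assumptions and are not taken from the article, whose abstract does not name its classifier or exact feature extraction.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical setup: 27 energy-based features per plosive CV syllable
# (/ba/, /du/, /ki/, /ta/). Real use would extract these from the
# MADED / RAVDESS recordings; here the matrix is random placeholder data.
rng = np.random.default_rng(0)
N_FEATURES = 27
EMOTIONS = ["joy", "sadness", "neutral", "anger"]

X = rng.normal(size=(400, N_FEATURES))      # one row per syllable segment
y = rng.choice(EMOTIONS, size=400)          # placeholder emotion labels

# Assumed classifier: RBF-kernel SVM on standardized features.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Strategy 1: multiclass (all four emotions at once).
multi_acc = cross_val_score(clf, X, y, cv=5).mean()

# Strategy 2: one of the binary splits, e.g. neutral vs. all others
# (the article also reports joy vs. negative and sadness vs. anger).
y_neutral = np.where(y == "neutral", "neutral", "other")
binary_acc = cross_val_score(clf, X, y_neutral, cv=5).mean()

print(f"multiclass CV accuracy:       {multi_acc:.2f}")
print(f"neutral-vs-other CV accuracy: {binary_acc:.2f}")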

Bibliographic Details
Published in: International journal of electrical and computer engineering (Malacca, Malacca), 2022-12, Vol. 12 (6), p. 6159
Main authors: Agrima, Abdellah; Mounir, Ilham; Farchi, Abdelmajid; Elmazouzi, Laila; Mounir, Badia
Format: Article
Language: English
Subjects: Classification; Consonants (speech); Emotion recognition; Emotions; Energy distribution; Feature extraction; Speech recognition
Online access: Full text
DOI: 10.11591/ijece.v12i6.pp6159-6171
Publisher: IAES Institute of Advanced Engineering and Science, Yogyakarta
ISSN: 2088-8708
EISSN: 2722-2578
Source: EZB-FREE-00999 freely available EZB journals