Emotion recognition based on the energy distribution of plosive syllables
We usually encounter two problems during speech emotion recognition (SER): expression and perception problems, which vary considerably between speakers, languages, and sentence pronunciation. In fact, finding an optimal system that characterizes the emotions while overcoming all these differences is a promising prospect.
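The record does not include the authors' code, so the sketch below is only a hedged illustration of the kind of pipeline the abstract describes: a per-syllable band-energy distribution as the acoustic feature, followed by a binary classifier (e.g. positive vs. negative emotions). The names and parameters here (`band_energies`, the nine-band split, the RBF support-vector classifier, the 30% test split) are assumptions for illustration, not the paper's 27-feature method or its evaluation protocol.

```python
# Illustrative sketch only -- not the implementation from the paper.
# Assumes the consonant-vowel syllable segments (e.g. /ba/, /du/, /ki/, /ta/)
# have already been cut from the recordings as 1-D float arrays.

import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def band_energies(signal: np.ndarray, n_bands: int = 9) -> np.ndarray:
    """Relative energy in n_bands equal-width frequency bands of one syllable."""
    power = np.abs(np.fft.rfft(signal)) ** 2        # power spectrum
    bands = np.array_split(power, n_bands)          # equal-width frequency bands
    energies = np.array([b.sum() for b in bands])
    return energies / (energies.sum() + 1e-12)      # normalise to a distribution


def utterance_features(syllables: list[np.ndarray]) -> np.ndarray:
    """Stack the band-energy distributions of all syllables into one vector."""
    return np.concatenate([band_energies(s) for s in syllables])


def binary_accuracy(X: np.ndarray, y: np.ndarray, seed: int = 0) -> float:
    """Train and score a simple binary classifier (e.g. joy vs. sadness+anger)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))
```

A multiclass variant would keep the same features and simply pass the four emotion labels (joy, sadness, neutral, anger) to the classifier; the abstract reports that the binary pairings reach markedly higher accuracy than the four-way case.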
Saved in:
Published in: | International journal of electrical and computer engineering (Malacca, Malacca), 2022-12, Vol.12 (6), p.6159 |
---|---|
Main authors: | Agrima, Abdellah ; Mounir, Ilham ; Farchi, Abdelmajid ; Elmazouzi, Laila ; Mounir, Badia |
Format: | Article |
Language: | eng |
Subjects: | Classification ; Consonants (speech) ; Emotion recognition ; Emotions ; Energy distribution ; Feature extraction ; Speech recognition |
Online access: | Full text |
container_end_page | 6171 |
---|---|
container_issue | 6 |
container_start_page | 6159 |
container_title | International journal of electrical and computer engineering (Malacca, Malacca) |
container_volume | 12 |
creator | Agrima, Abdellah ; Mounir, Ilham ; Farchi, Abdelmajid ; Elmazouzi, Laila ; Mounir, Badia |
description | We usually encounter two problems during speech emotion recognition (SER): expression and perception problems, which vary considerably between speakers, languages, and sentence pronunciation. In fact, finding an optimal system that characterizes the emotions while overcoming all these differences is a promising prospect. In this perspective, we considered two emotional databases: the Moroccan Arabic dialect emotional database (MADED) and the Ryerson audio-visual database of emotional speech and song (RAVDESS), which present notable differences in terms of type (natural/acted) and language (Arabic/English). We proposed a detection process based on 27 acoustic features extracted from consonant-vowel (CV) syllabic units /ba/, /du/, /ki/, /ta/ common to both databases. We tested two classification strategies: multiclass (all emotions combined: joy, sadness, neutral, anger) and binary (neutral vs. others, positive emotions (joy) vs. negative emotions (sadness, anger), sadness vs. anger). These strategies were tested three times: i) on MADED, ii) on RAVDESS, iii) on MADED and RAVDESS combined. The proposed method gave better recognition accuracy in the case of binary classification. The rates reach an average of 78% for the multiclass classification, 100% for neutral vs. other cases, 100% for the negative emotions (i.e. anger vs. sadness), and 96% for the positive vs. negative emotions. |
doi_str_mv | 10.11591/ijece.v12i6.pp6159-6171 |
format | Article |
publisher | Yogyakarta: IAES Institute of Advanced Engineering and Science |
rights | Copyright IAES Institute of Advanced Engineering and Science 2022 |
orcid | 0000-0002-2430-5752 ; 0000-0003-2898-2571 ; 0000-0002-1503-7821 ; 0000-0002-4470-8153 |
fulltext | fulltext |
identifier | ISSN: 2088-8708 |
ispartof | International journal of electrical and computer engineering (Malacca, Malacca), 2022-12, Vol.12 (6), p.6159 |
issn | 2088-8708 ; 2722-2578 |
language | eng |
recordid | cdi_proquest_journals_2766672702 |
source | EZB-FREE-00999 freely available EZB journals |
subjects | Classification ; Consonants (speech) ; Emotion recognition ; Emotions ; Energy distribution ; Feature extraction ; Speech recognition |
title | Emotion recognition based on the energy distribution of plosive syllables |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T14%3A03%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Emotion%20recognition%20based%20on%20the%20energy%20distribution%20of%20plosive%20syllables&rft.jtitle=International%20journal%20of%20electrical%20and%20computer%20engineering%20(Malacca,%20Malacca)&rft.au=Agrima,%20Abdellah&rft.date=2022-12-01&rft.volume=12&rft.issue=6&rft.spage=6159&rft.pages=6159-&rft.issn=2088-8708&rft.eissn=2722-2578&rft_id=info:doi/10.11591/ijece.v12i6.pp6159-6171&rft_dat=%3Cproquest_cross%3E2766672702%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2766672702&rft_id=info:pmid/&rfr_iscdi=true |