Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions

Emotional conditions cause changes in the speech production system, which produce differences in acoustic characteristics compared to neutral conditions. The presence of emotion degrades the performance of a speaker verification system. In this paper, we propose a speaker model that...

Bibliographic details
Published in:	Electronics (Basel) 2020-09, Vol.9 (9), p.1420
Main authors:	Prasetio, Barlian Henryranu; Tamura, Hiroki; Tanno, Koichi
Format:	Article
Language:	eng
Subjects:	Ablation; Algorithms; Analysis; Applied research; Compensation; Computer simulation; Discriminant analysis; Emotions; Factor analysis; Speech recognition; Stress (Psychology); Variability; Verification
Online access:	Full text

container_end_page
container_issue	9
container_start_page	1420
container_title	Electronics (Basel)
container_volume	9
creator	Prasetio, Barlian Henryranu; Tamura, Hiroki; Tanno, Koichi
description	Emotional conditions cause changes in the speech production system, which produce differences in acoustic characteristics compared to neutral conditions. The presence of emotion degrades the performance of a speaker verification system. In this paper, we propose a speaker model that accommodates the presence of emotion in speech segments by extracting a compact speaker representation. The speaker model is estimated by a procedure similar to the i-vector technique, but it treats the emotional effect as the channel variability component. We call this method emotional variability analysis (EVA). EVA represents the emotion subspace separately from the speaker subspace, as in the joint factor analysis (JFA) model. The effectiveness of the proposed system is evaluated by comparing it with the standard i-vector system on the speaker verification task of the Speech Under Simulated and Actual Stress (SUSAS) dataset with three different scoring methods. The evaluation focuses on the equal error rate (EER). In addition, we conducted an ablation study for a more comprehensive analysis of the EVA-based i-vector. In the experiments, the proposed system outperformed the standard i-vector system and achieved state-of-the-art results in the verification task for speakers under stress.
doi_str_mv	10.3390/electronics9091420
format	Article
fullrecord	<record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_journals_2440408515</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A643043542</galeid><sourcerecordid>A643043542</sourcerecordid><originalsourceid>FETCH-LOGICAL-c347t-f6e95d36a5b5e214c59270fca77217302763f280e1bc89791b2821ef7b338f4a3</originalsourceid><addsrcrecordid>eNplkN9LAzEMx4soOHT_gE8Fn0_769br4xxTBwMf5vZaer1UOm_X2dwe9t97Y4KCgZCQfJKQLyF3nD1IadgjtOD7nLro0TDDlWAXZCSYNoURRlz-ya_JGHHLBjNcVpKNiJvvUh9T51q6cTm6OraxP9LpUDhiRPrkEBq6KDbDiZRpGHy1B_cJmW4gxxC9O43T2NF110AuVn0GRDpLXRNPHbwlV8G1COOfeEPWz_P32WuxfHtZzKbLwkul-yJMwJSNnLiyLkFw5UsjNAveaS24lkzoiQyiYsBrXxlteC0qwSHoWsoqKCdvyP157z6nrwNgb7fpkIc_0AqlmGJVyctf6sO1YGMXUp-d30X0djpRkilZKjFQ4kz5nBAzBLvPcefy0XJmT5rb_5rLb7xidmU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2440408515</pqid></control><display><type>article</type><title>Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions</title><source>MDPI - Multidisciplinary Digital Publishing Institute</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Prasetio, Barlian Henryranu ; Tamura, Hiroki ; Tanno, Koichi</creator><creatorcontrib>Prasetio, Barlian Henryranu ; Tamura, Hiroki ; Tanno, Koichi</creatorcontrib><description>Emotional conditions cause changes in the speech production system. It produces the differences in the acoustical characteristics compared to neutral conditions. The presence of emotion makes the performance of a speaker verification system degrade. In this paper, we propose a speaker modeling that accommodates the presence of emotions on the speech segments by extracting a speaker representation compactly. The speaker model is estimated by following a similar procedure to the i-vector technique, but it considerate the emotional effect as the channel variability component. 
We named this method as the emotional variability analysis (EVA). EVA represents the emotion subspace separately to the speaker subspace, like the joint factor analysis (JFA) model. The effectiveness of the proposed system is evaluated by comparing it with the standard i-vector system in the speaker verification task of the Speech Under Simulated and Actual Stress (SUSAS) dataset with three different scoring methods. The evaluation focus in terms of the equal error rate (EER). In addition, we also conducted an ablation study for a more comprehensive analysis of the EVA-based i-vector. Based on experiment results, the proposed system outperformed the standard i-vector system and achieved state-of-the-art results in the verification task for the under-stressed speakers.</description><identifier>ISSN: 2079-9292</identifier><identifier>EISSN: 2079-9292</identifier><identifier>DOI: 10.3390/electronics9091420</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Ablation ; Algorithms ; Analysis ; Applied research ; Compensation ; Computer simulation ; Discriminant analysis ; Emotions ; Factor analysis ; Speech recognition ; Stress (Psychology) ; Variability ; Verification</subject><ispartof>Electronics (Basel), 2020-09, Vol.9 (9), p.1420</ispartof><rights>COPYRIGHT 2020 MDPI AG</rights><rights>2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c347t-f6e95d36a5b5e214c59270fca77217302763f280e1bc89791b2821ef7b338f4a3</citedby><cites>FETCH-LOGICAL-c347t-f6e95d36a5b5e214c59270fca77217302763f280e1bc89791b2821ef7b338f4a3</cites><orcidid>0000-0002-1064-3443 ; 0000-0001-8031-4744</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Prasetio, Barlian Henryranu</creatorcontrib><creatorcontrib>Tamura, Hiroki</creatorcontrib><creatorcontrib>Tanno, Koichi</creatorcontrib><title>Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions</title><title>Electronics (Basel)</title><description>Emotional conditions cause changes in the speech production system. It produces the differences in the acoustical characteristics compared to neutral conditions. The presence of emotion makes the performance of a speaker verification system degrade. In this paper, we propose a speaker modeling that accommodates the presence of emotions on the speech segments by extracting a speaker representation compactly. The speaker model is estimated by following a similar procedure to the i-vector technique, but it considerate the emotional effect as the channel variability component. We named this method as the emotional variability analysis (EVA). EVA represents the emotion subspace separately to the speaker subspace, like the joint factor analysis (JFA) model. 
The effectiveness of the proposed system is evaluated by comparing it with the standard i-vector system in the speaker verification task of the Speech Under Simulated and Actual Stress (SUSAS) dataset with three different scoring methods. The evaluation focus in terms of the equal error rate (EER). In addition, we also conducted an ablation study for a more comprehensive analysis of the EVA-based i-vector. Based on experiment results, the proposed system outperformed the standard i-vector system and achieved state-of-the-art results in the verification task for the under-stressed speakers.</description><subject>Ablation</subject><subject>Algorithms</subject><subject>Analysis</subject><subject>Applied research</subject><subject>Compensation</subject><subject>Computer simulation</subject><subject>Discriminant analysis</subject><subject>Emotions</subject><subject>Factor analysis</subject><subject>Speech recognition</subject><subject>Stress (Psychology)</subject><subject>Variability</subject><subject>Verification</subject><issn>2079-9292</issn><issn>2079-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNplkN9LAzEMx4soOHT_gE8Fn0_769br4xxTBwMf5vZaer1UOm_X2dwe9t97Y4KCgZCQfJKQLyF3nD1IadgjtOD7nLro0TDDlWAXZCSYNoURRlz-ya_JGHHLBjNcVpKNiJvvUh9T51q6cTm6OraxP9LpUDhiRPrkEBq6KDbDiZRpGHy1B_cJmW4gxxC9O43T2NF110AuVn0GRDpLXRNPHbwlV8G1COOfeEPWz_P32WuxfHtZzKbLwkul-yJMwJSNnLiyLkFw5UsjNAveaS24lkzoiQyiYsBrXxlteC0qwSHoWsoqKCdvyP157z6nrwNgb7fpkIc_0AqlmGJVyctf6sO1YGMXUp-d30X0djpRkilZKjFQ4kz5nBAzBLvPcefy0XJmT5rb_5rLb7xidmU</recordid><startdate>20200901</startdate><enddate>20200901</enddate><creator>Prasetio, Barlian Henryranu</creator><creator>Tamura, Hiroki</creator><creator>Tanno, Koichi</creator><general>MDPI 
AG</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L7M</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><orcidid>https://orcid.org/0000-0002-1064-3443</orcidid><orcidid>https://orcid.org/0000-0001-8031-4744</orcidid></search><sort><creationdate>20200901</creationdate><title>Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions</title><author>Prasetio, Barlian Henryranu ; Tamura, Hiroki ; Tanno, Koichi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c347t-f6e95d36a5b5e214c59270fca77217302763f280e1bc89791b2821ef7b338f4a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Ablation</topic><topic>Algorithms</topic><topic>Analysis</topic><topic>Applied research</topic><topic>Compensation</topic><topic>Computer simulation</topic><topic>Discriminant analysis</topic><topic>Emotions</topic><topic>Factor analysis</topic><topic>Speech recognition</topic><topic>Stress (Psychology)</topic><topic>Variability</topic><topic>Verification</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Prasetio, Barlian Henryranu</creatorcontrib><creatorcontrib>Tamura, Hiroki</creatorcontrib><creatorcontrib>Tanno, Koichi</creatorcontrib><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest 
Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>Electronics (Basel)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Prasetio, Barlian Henryranu</au><au>Tamura, Hiroki</au><au>Tanno, Koichi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions</atitle><jtitle>Electronics (Basel)</jtitle><date>2020-09-01</date><risdate>2020</risdate><volume>9</volume><issue>9</issue><spage>1420</spage><pages>1420-</pages><issn>2079-9292</issn><eissn>2079-9292</eissn><abstract>Emotional conditions cause changes in the speech production system. It produces the differences in the acoustical characteristics compared to neutral conditions. The presence of emotion makes the performance of a speaker verification system degrade. In this paper, we propose a speaker modeling that accommodates the presence of emotions on the speech segments by extracting a speaker representation compactly. 
The speaker model is estimated by following a similar procedure to the i-vector technique, but it considerate the emotional effect as the channel variability component. We named this method as the emotional variability analysis (EVA). EVA represents the emotion subspace separately to the speaker subspace, like the joint factor analysis (JFA) model. The effectiveness of the proposed system is evaluated by comparing it with the standard i-vector system in the speaker verification task of the Speech Under Simulated and Actual Stress (SUSAS) dataset with three different scoring methods. The evaluation focus in terms of the equal error rate (EER). In addition, we also conducted an ablation study for a more comprehensive analysis of the EVA-based i-vector. Based on experiment results, the proposed system outperformed the standard i-vector system and achieved state-of-the-art results in the verification task for the under-stressed speakers.</abstract><cop>Basel</cop><pub>MDPI AG</pub><doi>10.3390/electronics9091420</doi><orcidid>https://orcid.org/0000-0002-1064-3443</orcidid><orcidid>https://orcid.org/0000-0001-8031-4744</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2079-9292
ispartof	Electronics (Basel), 2020-09, Vol.9 (9), p.1420
issn	2079-9292 2079-9292
language	eng
recordid	cdi_proquest_journals_2440408515
source	MDPI - Multidisciplinary Digital Publishing Institute; EZB-FREE-00999 freely available EZB journals
subjects	Ablation; Algorithms; Analysis; Applied research; Compensation; Computer simulation; Discriminant analysis; Emotions; Factor analysis; Speech recognition; Stress (Psychology); Variability; Verification
title	Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T10%3A35%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Emotional%20Variability%20Analysis%20Based%20I-Vector%20for%20Speaker%20Verification%20in%20Under-Stress%20Conditions&rft.jtitle=Electronics%20(Basel)&rft.au=Prasetio,%20Barlian%20Henryranu&rft.date=2020-09-01&rft.volume=9&rft.issue=9&rft.spage=1420&rft.pages=1420-&rft.issn=2079-9292&rft.eissn=2079-9292&rft_id=info:doi/10.3390/electronics9091420&rft_dat=%3Cgale_proqu%3EA643043542%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2440408515&rft_id=info:pmid/&rft_galeid=A643043542&rfr_iscdi=true
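
Note on the evaluation metric: the description above reports results as the equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER is commonly estimated from verification scores. It is not taken from the article: the function name `equal_error_rate`, the score lists, and the convention that a trial is accepted when its score is at or above the threshold are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER from two lists of verification scores.

    target_scores: scores of genuine (same-speaker) trials.
    nontarget_scores: scores of impostor (different-speaker) trials.
    Assumption: a trial is accepted when its score >= threshold.
    Returns (eer, threshold) at the point where the false acceptance
    and false rejection rates are closest to each other.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap, eer, threshold)
    # Every observed score is a candidate threshold.
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance rate: impostor trials accepted.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejection rate: genuine trials rejected.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR as the EER estimate.
            best = (gap, (far + frr) / 2, thr)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.6, 0.5, 0.3, 0.2]
    eer, thr = equal_error_rate(genuine, impostor)
    print(f"EER = {eer:.2f} at threshold {thr}")
```

With only a handful of trials the estimate is coarse, because the two error rates change in discrete steps; evaluation toolkits typically interpolate between neighbouring thresholds instead of taking the nearest point.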