Modelling microprosodic effects can lead to an audible improvement in articulatory synthesis

When pitch is explicitly modelled for parametric speech synthesis, microprosodic variations of the fundamental frequency f0 are usually disregarded by current intonation models. While there are numerous studies dealing with the nature and the origin of microprosody, little research has been done on...

Bibliographic details
Published in:	The Journal of the Acoustical Society of America 2021-08, Vol.150 (2), p.1209-1217
Main authors:	Krug, Paul Konstantin; Gerazov, Branislav; van Niekerk, Daniel R.; Xu, Anqi; Xu, Yi; Birkholz, Peter
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	1217
container_issue	2
container_start_page	1209
container_title	The Journal of the Acoustical Society of America
container_volume	150
creator	Krug, Paul Konstantin; Gerazov, Branislav; van Niekerk, Daniel R.; Xu, Anqi; Xu, Yi; Birkholz, Peter
description	When pitch is explicitly modelled for parametric speech synthesis, microprosodic variations of the fundamental frequency f0 are usually disregarded by current intonation models. While there are numerous studies dealing with the nature and the origin of microprosody, little research has been done on its audibility and its effect on the naturalness of synthetic speech. In this work, the influence of obstruent-related microprosodic variations on the perceived naturalness of articulatory speech synthesis was studied. A small corpus of 20 German words and sentences was re-synthesized using the state-of-the-art articulatory synthesizer VocalTractLab. The pitch contours of the real utterances were extracted and fitted with the Target-Approximation-Model. After the real microprosodic variations were removed from the obtained pitch contours, synthetic variations were applied based on a microprosody model. Subsequently, multiple stimuli with different microprosody amplitudes were synthesized and evaluated in a listening experiment. The results indicate that microprosodic variations are barely audible, but can lead to a greater perceived naturalness of the synthesized speech in certain cases.
doi_str_mv	10.1121/10.0005876
format	Article
fullrecord	<record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_scitation_primary_10_1121_10_0005876</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2568596198</sourcerecordid><originalsourceid>FETCH-LOGICAL-c300t-d88d536c5057f7aaaf5e3a966ad8a023d9921e1f81b5f469fffdf2d195ab74c63</originalsourceid><addsrcrecordid>eNp9kEtLAzEUhYMoWKsbf0GWoowmmUkmWUrxBRU3uhOG2zw0MjOpSUbovze1Xbs6l8PH5ZyD0Dkl15QyelOUEMJlKw7QjHJGKslZc4hmxaVVo4Q4Ricpff1BtZqh9-dgbN_78QMPXsewjiEF4zW2zlmdE9Yw4t6CwTngcsJk_Kq32A-F_LGDHTP2xY7Z66mHHOIGp82YP23y6RQdOeiTPdvrHL3d370uHqvly8PT4nZZ6ZqQXBkpDa-F5oS3rgUAx20NJSsYCYTVRilGLXWSrrhrhHLOGccMVRxWbaNFPUcXu78l0_dkU-4Gn3SpBaMNU-oYF5IrQZUs6OUOLV1TitZ16-gHiJuOkm474Vb3Exb4agcn7TNkH8b_6F8qg3KX</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2568596198</pqid></control><display><type>article</type><title>Modelling microprosodic effects can lead to an audible improvement in articulatory synthesis</title><source>AIP Journals Complete</source><source>Alma/SFX Local Collection</source><source>AIP Acoustical Society of America</source><creator>Krug, Paul Konstantin ; Gerazov, Branislav ; van Niekerk, Daniel R. ; Xu, Anqi ; Xu, Yi ; Birkholz, Peter</creator><creatorcontrib>Krug, Paul Konstantin ; Gerazov, Branislav ; van Niekerk, Daniel R. ; Xu, Anqi ; Xu, Yi ; Birkholz, Peter</creatorcontrib><description>When pitch is explicitly modelled for parametric speech synthesis, microprosodic variations of the fundamental frequency f0 are usually disregarded by current intonation models. While there are numerous studies dealing with the nature and the origin of microprosody, little research has been done on its audibility and its effect on the naturalness of synthetic speech. In this work, the influence of obstruent-related microprosodic variations on the perceived naturalness of articulatory speech synthesis was studied. 
A small corpus of 20 German words and sentences was re-synthesized using the state-of-the-art articulatory synthesizer VocalTractLab. The pitch contours of the real utterances were extracted and fitted with the Target-Approximation-Model. After the real microprosodic variations were removed from the obtained pitch contours, synthetic variations were applied based on a microprosody model. Subsequently, multiple stimuli with different microprosody amplitudes were synthesized and evaluated in a listening experiment. The results indicate that microprosodic variations are barely audible, but can lead to a greater perceived naturalness of the synthesized speech in certain cases.</description><identifier>ISSN: 0001-4966</identifier><identifier>EISSN: 1520-8524</identifier><identifier>DOI: 10.1121/10.0005876</identifier><identifier>CODEN: JASMAN</identifier><language>eng</language><ispartof>The Journal of the Acoustical Society of America, 2021-08, Vol.150 (2), p.1209-1217</ispartof><rights>Acoustical Society of America</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c300t-d88d536c5057f7aaaf5e3a966ad8a023d9921e1f81b5f469fffdf2d195ab74c63</citedby><cites>FETCH-LOGICAL-c300t-d88d536c5057f7aaaf5e3a966ad8a023d9921e1f81b5f469fffdf2d195ab74c63</cites><orcidid>0000-0002-4331-6676</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/jasa/article-lookup/doi/10.1121/10.0005876$$EHTML$$P50$$Gscitation$$H</linktohtml><link.rule.ids>207,208,314,776,780,790,1559,4498,27903,27904,76130</link.rule.ids></links><search><creatorcontrib>Krug, Paul Konstantin</creatorcontrib><creatorcontrib>Gerazov, Branislav</creatorcontrib><creatorcontrib>van Niekerk, Daniel R.</creatorcontrib><creatorcontrib>Xu, Anqi</creatorcontrib><creatorcontrib>Xu, 
Yi</creatorcontrib><creatorcontrib>Birkholz, Peter</creatorcontrib><title>Modelling microprosodic effects can lead to an audible improvement in articulatory synthesis</title><title>The Journal of the Acoustical Society of America</title><description>When pitch is explicitly modelled for parametric speech synthesis, microprosodic variations of the fundamental frequency f0 are usually disregarded by current intonation models. While there are numerous studies dealing with the nature and the origin of microprosody, little research has been done on its audibility and its effect on the naturalness of synthetic speech. In this work, the influence of obstruent-related microprosodic variations on the perceived naturalness of articulatory speech synthesis was studied. A small corpus of 20 German words and sentences was re-synthesized using the state-of-the-art articulatory synthesizer VocalTractLab. The pitch contours of the real utterances were extracted and fitted with the Target-Approximation-Model. After the real microprosodic variations were removed from the obtained pitch contours, synthetic variations were applied based on a microprosody model. Subsequently, multiple stimuli with different microprosody amplitudes were synthesized and evaluated in a listening experiment. 
The results indicate that microprosodic variations are barely audible, but can lead to a greater perceived naturalness of the synthesized speech in certain cases.</description><issn>0001-4966</issn><issn>1520-8524</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLAzEUhYMoWKsbf0GWoowmmUkmWUrxBRU3uhOG2zw0MjOpSUbovze1Xbs6l8PH5ZyD0Dkl15QyelOUEMJlKw7QjHJGKslZc4hmxaVVo4Q4Ricpff1BtZqh9-dgbN_78QMPXsewjiEF4zW2zlmdE9Yw4t6CwTngcsJk_Kq32A-F_LGDHTP2xY7Z66mHHOIGp82YP23y6RQdOeiTPdvrHL3d370uHqvly8PT4nZZ6ZqQXBkpDa-F5oS3rgUAx20NJSsYCYTVRilGLXWSrrhrhHLOGccMVRxWbaNFPUcXu78l0_dkU-4Gn3SpBaMNU-oYF5IrQZUs6OUOLV1TitZ16-gHiJuOkm474Vb3Exb4agcn7TNkH8b_6F8qg3KX</recordid><startdate>202108</startdate><enddate>202108</enddate><creator>Krug, Paul Konstantin</creator><creator>Gerazov, Branislav</creator><creator>van Niekerk, Daniel R.</creator><creator>Xu, Anqi</creator><creator>Xu, Yi</creator><creator>Birkholz, Peter</creator><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-4331-6676</orcidid></search><sort><creationdate>202108</creationdate><title>Modelling microprosodic effects can lead to an audible improvement in articulatory synthesis</title><author>Krug, Paul Konstantin ; Gerazov, Branislav ; van Niekerk, Daniel R. 
; Xu, Anqi ; Xu, Yi ; Birkholz, Peter</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c300t-d88d536c5057f7aaaf5e3a966ad8a023d9921e1f81b5f469fffdf2d195ab74c63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Krug, Paul Konstantin</creatorcontrib><creatorcontrib>Gerazov, Branislav</creatorcontrib><creatorcontrib>van Niekerk, Daniel R.</creatorcontrib><creatorcontrib>Xu, Anqi</creatorcontrib><creatorcontrib>Xu, Yi</creatorcontrib><creatorcontrib>Birkholz, Peter</creatorcontrib><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>The Journal of the Acoustical Society of America</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Krug, Paul Konstantin</au><au>Gerazov, Branislav</au><au>van Niekerk, Daniel R.</au><au>Xu, Anqi</au><au>Xu, Yi</au><au>Birkholz, Peter</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Modelling microprosodic effects can lead to an audible improvement in articulatory synthesis</atitle><jtitle>The Journal of the Acoustical Society of America</jtitle><date>2021-08</date><risdate>2021</risdate><volume>150</volume><issue>2</issue><spage>1209</spage><epage>1217</epage><pages>1209-1217</pages><issn>0001-4966</issn><eissn>1520-8524</eissn><coden>JASMAN</coden><abstract>When pitch is explicitly modelled for parametric speech synthesis, microprosodic variations of the fundamental frequency f0 are usually disregarded by current intonation models. While there are numerous studies dealing with the nature and the origin of microprosody, little research has been done on its audibility and its effect on the naturalness of synthetic speech. 
In this work, the influence of obstruent-related microprosodic variations on the perceived naturalness of articulatory speech synthesis was studied. A small corpus of 20 German words and sentences was re-synthesized using the state-of-the-art articulatory synthesizer VocalTractLab. The pitch contours of the real utterances were extracted and fitted with the Target-Approximation-Model. After the real microprosodic variations were removed from the obtained pitch contours, synthetic variations were applied based on a microprosody model. Subsequently, multiple stimuli with different microprosody amplitudes were synthesized and evaluated in a listening experiment. The results indicate that microprosodic variations are barely audible, but can lead to a greater perceived naturalness of the synthesized speech in certain cases.</abstract><doi>10.1121/10.0005876</doi><tpages>9</tpages><orcidid>https://orcid.org/0000-0002-4331-6676</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0001-4966
ispartof	The Journal of the Acoustical Society of America, 2021-08, Vol.150 (2), p.1209-1217
issn	0001-4966 1520-8524
language	eng
recordid	cdi_scitation_primary_10_1121_10_0005876
source	AIP Journals Complete; Alma/SFX Local Collection; AIP Acoustical Society of America
title	Modelling microprosodic effects can lead to an audible improvement in articulatory synthesis
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T16%3A40%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Modelling%20microprosodic%20effects%20can%20lead%20to%20an%20audible%20improvement%20in%20articulatory%20synthesis&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Krug,%20Paul%20Konstantin&rft.date=2021-08&rft.volume=150&rft.issue=2&rft.spage=1209&rft.epage=1217&rft.pages=1209-1217&rft.issn=0001-4966&rft.eissn=1520-8524&rft.coden=JASMAN&rft_id=info:doi/10.1121/10.0005876&rft_dat=%3Cproquest_scita%3E2568596198%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2568596198&rft_id=info:pmid/&rfr_iscdi=true