Listening test materials for "A study of speaker adaptation for DNN-based speech synthesis"

Bibliographic details
Main author:	Wu, Zhizheng
Format:	Dataset
Language:	eng
Subjects:	acoustic model; adaptation; DNN adaptation; speech synthesis

creator	Wu, Zhizheng
description	The dataset contains the testing stimuli and listeners' MUSHRA test responses for the Interspeech 2015 paper, "A study of speaker adaptation for DNN-based speech synthesis". In this paper, we conduct an experimental analysis of speaker adaptation for Deep Neural Network (DNN) based speech synthesis at different levels. In particular, we augment a low-dimensional speaker-specific vector with linguistic features as input to represent speaker identity, perform model adaptation to scale the hidden activation weights, and perform a feature space transformation at the output layer to modify generated acoustic features. We systematically analyse the performance of each individual adaptation technique and that of their combinations. Experimental results confirm the adaptability of the DNN, and listening tests demonstrate that the DNN can achieve significantly better adaptation performance than the hidden Markov model (HMM) baseline in terms of naturalness and speaker similarity.
doi_str_mv	10.7488/ds/259
format	Dataset
publisher	University of Edinburgh. The Centre for Speech Technology Research (CSTR)
creationdate	2015
oa	free_for_read
linktorsrc	https://commons.datacite.org/doi.org/10.7488/ds/259
fulltext	fulltext_linktorsrc
identifier	DOI: 10.7488/ds/259
language	eng
recordid	cdi_datacite_primary_10_7488_ds_259
source	DataCite
subjects	acoustic model; adaptation; DNN adaptation; speech synthesis
title	Listening test materials for "A study of speaker adaptation for DNN-based speech synthesis"
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T14%3A00%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-datacite_PQ8&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.au=Wu,%20Zhizheng&rft.date=2015&rft_id=info:doi/10.7488/ds/259&rft_dat=%3Cdatacite_PQ8%3E10_7488_ds_259%3C/datacite_PQ8%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true