Neural Speech Synthesis for Estonian

This technical report describes the results of a collaboration between the NLP research group at the University of Tartu and the Institute of Estonian Language on improving neural speech synthesis for Estonian. The report (written in Estonian) describes the project results, the summary of which is:...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2020-10
Hauptverfasser: Rätsep, Liisa, Piits, Liisi, Hille Pajupuu, Hein, Indrek, Fišel, Mark
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Rätsep, Liisa
Piits, Liisi
Hille Pajupuu
Hein, Indrek
Fišel, Mark
description This technical report describes the results of a collaboration between the NLP research group at the University of Tartu and the Institute of Estonian Language on improving neural speech synthesis for Estonian. The report (written in Estonian) describes the project results, the summary of which is: (1) Speech synthesis data from 6 speakers for a total of 92.4 hours is collected and openly released (CC-BY-4.0). Data available at https://konekorpus.tartunlp.ai and https://www.eki.ee/litsents/. (2) software and models for neural speech synthesis is released open-source (MIT license). Available at https://koodivaramu.eesti.ee/tartunlp/text-to-speech . (3) We ran evaluations of the new models and compared them to other existing solutions (HMM-based HTS models from EKI, http://www.eki.ee/heli/, and Google's speech synthesis for Estonian, accessed via https://translate.google.com). Evaluation includes voice acceptability MOS scores for sentence-level and longer excerpts, detailed error analysis and evaluation of the pre-processing module.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2449044289</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2449044289</sourcerecordid><originalsourceid>FETCH-proquest_journals_24490442893</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRQ8UstLUrMUQguSE1NzlAIrswryUgtzixWSMsvUnAtLsnPy0zM42FgTUvMKU7lhdLcDMpuriHOHroFRfmFpanFJfFZ-aVFeUCpeCMTE0sDExMjC0tj4lQBAMQ2LpY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2449044289</pqid></control><display><type>article</type><title>Neural Speech Synthesis for Estonian</title><source>Free E- Journals</source><creator>Rätsep, Liisa ; Piits, Liisi ; Hille Pajupuu ; Hein, Indrek ; Fišel, Mark</creator><creatorcontrib>Rätsep, Liisa ; Piits, Liisi ; Hille Pajupuu ; Hein, Indrek ; Fišel, Mark</creatorcontrib><description>This technical report describes the results of a collaboration between the NLP research group at the University of Tartu and the Institute of Estonian Language on improving neural speech synthesis for Estonian. The report (written in Estonian) describes the project results, the summary of which is: (1) Speech synthesis data from 6 speakers for a total of 92.4 hours is collected and openly released (CC-BY-4.0). Data available at https://konekorpus.tartunlp.ai and https://www.eki.ee/litsents/. (2) software and models for neural speech synthesis is released open-source (MIT license). Available at https://koodivaramu.eesti.ee/tartunlp/text-to-speech . (3) We ran evaluations of the new models and compared them to other existing solutions (HMM-based HTS models from EKI, http://www.eki.ee/heli/, and Google's speech synthesis for Estonian, accessed via https://translate.google.com). Evaluation includes voice acceptability MOS scores for sentence-level and longer excerpts, detailed error analysis and evaluation of the pre-processing module.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Error analysis ; Source code ; Speech ; Speech recognition</subject><ispartof>arXiv.org, 2020-10</ispartof><rights>2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Rätsep, Liisa</creatorcontrib><creatorcontrib>Piits, Liisi</creatorcontrib><creatorcontrib>Hille Pajupuu</creatorcontrib><creatorcontrib>Hein, Indrek</creatorcontrib><creatorcontrib>Fišel, Mark</creatorcontrib><title>Neural Speech Synthesis for Estonian</title><title>arXiv.org</title><description>This technical report describes the results of a collaboration between the NLP research group at the University of Tartu and the Institute of Estonian Language on improving neural speech synthesis for Estonian. The report (written in Estonian) describes the project results, the summary of which is: (1) Speech synthesis data from 6 speakers for a total of 92.4 hours is collected and openly released (CC-BY-4.0). Data available at https://konekorpus.tartunlp.ai and https://www.eki.ee/litsents/. (2) software and models for neural speech synthesis is released open-source (MIT license). Available at https://koodivaramu.eesti.ee/tartunlp/text-to-speech . (3) We ran evaluations of the new models and compared them to other existing solutions (HMM-based HTS models from EKI, http://www.eki.ee/heli/, and Google's speech synthesis for Estonian, accessed via https://translate.google.com). Evaluation includes voice acceptability MOS scores for sentence-level and longer excerpts, detailed error analysis and evaluation of the pre-processing module.</description><subject>Error analysis</subject><subject>Source code</subject><subject>Speech</subject><subject>Speech recognition</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRQ8UstLUrMUQguSE1NzlAIrswryUgtzixWSMsvUnAtLsnPy0zM42FgTUvMKU7lhdLcDMpuriHOHroFRfmFpanFJfFZ-aVFeUCpeCMTE0sDExMjC0tj4lQBAMQ2LpY</recordid><startdate>20201006</startdate><enddate>20201006</enddate><creator>Rätsep, Liisa</creator><creator>Piits, Liisi</creator><creator>Hille Pajupuu</creator><creator>Hein, Indrek</creator><creator>Fišel, Mark</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20201006</creationdate><title>Neural Speech Synthesis for Estonian</title><author>Rätsep, Liisa ; Piits, Liisi ; Hille Pajupuu ; Hein, Indrek ; Fišel, Mark</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_24490442893</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Error analysis</topic><topic>Source code</topic><topic>Speech</topic><topic>Speech recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Rätsep, Liisa</creatorcontrib><creatorcontrib>Piits, Liisi</creatorcontrib><creatorcontrib>Hille Pajupuu</creatorcontrib><creatorcontrib>Hein, Indrek</creatorcontrib><creatorcontrib>Fišel, Mark</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rätsep, Liisa</au><au>Piits, Liisi</au><au>Hille Pajupuu</au><au>Hein, Indrek</au><au>Fišel, Mark</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Neural Speech Synthesis for Estonian</atitle><jtitle>arXiv.org</jtitle><date>2020-10-06</date><risdate>2020</risdate><eissn>2331-8422</eissn><abstract>This technical report describes the results of a collaboration between the NLP research group at the University of Tartu and the Institute of Estonian Language on improving neural speech synthesis for Estonian. The report (written in Estonian) describes the project results, the summary of which is: (1) Speech synthesis data from 6 speakers for a total of 92.4 hours is collected and openly released (CC-BY-4.0). Data available at https://konekorpus.tartunlp.ai and https://www.eki.ee/litsents/. (2) software and models for neural speech synthesis is released open-source (MIT license). Available at https://koodivaramu.eesti.ee/tartunlp/text-to-speech . (3) We ran evaluations of the new models and compared them to other existing solutions (HMM-based HTS models from EKI, http://www.eki.ee/heli/, and Google's speech synthesis for Estonian, accessed via https://translate.google.com). Evaluation includes voice acceptability MOS scores for sentence-level and longer excerpts, detailed error analysis and evaluation of the pre-processing module.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2020-10
issn 2331-8422
language eng
recordid cdi_proquest_journals_2449044289
source Free E- Journals
subjects Error analysis
Source code
Speech
Speech recognition
title Neural Speech Synthesis for Estonian
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T19%3A22%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Neural%20Speech%20Synthesis%20for%20Estonian&rft.jtitle=arXiv.org&rft.au=R%C3%A4tsep,%20Liisa&rft.date=2020-10-06&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2449044289%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2449044289&rft_id=info:pmid/&rfr_iscdi=true