Speech synthesis method and speech synthesis model training method and device

The invention discloses a speech synthesis method, a speech synthesis model training method, and a speech synthesis model training device. In one embodiment, the speech synthesis method can comprise the following steps: performing text encoding on a first text to be synthesized to obtain a first synthesis feature; performing acoustic encoding on a first acoustic feature to obtain a second synthesis feature; performing alignment processing on the first synthesis feature, the second synthesis feature, and a pre-selected emotion expression parameter to obtain a third synthesis feature; and performing acoustic decoding on the third synthesis feature to obtain a second acoustic feature of the first text. Speech with a specific degree of emotion can thus be synthesized on the basis of preset emotion expression parameters, meeting practical application requirements.
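To make the four steps concrete, here is a minimal PyTorch sketch of the pipeline the abstract describes. It is an illustration only: the abstract does not specify the encoder architectures, the alignment mechanism, or how the emotion expression parameter enters the model, so the LSTM encoders, the cross-attention alignment, and all names and dimensions below are assumptions, not the patent's actual implementation.

```python
import torch
import torch.nn as nn

class EmotionalTTS(nn.Module):
    """Hypothetical realization of the abstract's four steps.
    All module choices are assumptions for illustration."""

    def __init__(self, vocab_size=256, feat_dim=80, hidden=256, emo_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # Step 1: text encoding of the first text -> first synthesis feature.
        self.text_encoder = nn.LSTM(hidden, hidden, batch_first=True)
        # Step 2: acoustic encoding of a first acoustic feature
        # (e.g., reference mel-spectrogram frames) -> second synthesis feature.
        self.acoustic_encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Step 3: align text features, acoustic features, and the emotion
        # expression parameter; cross-attention is one plausible reading.
        self.emotion_proj = nn.Linear(emo_dim, hidden)
        self.align = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        # Step 4: acoustic decoding -> second acoustic feature of the text.
        self.decoder = nn.Linear(hidden, feat_dim)

    def forward(self, text_ids, first_acoustic, emotion_param):
        first_synth, _ = self.text_encoder(self.embed(text_ids))    # (B, T_text, H)
        second_synth, _ = self.acoustic_encoder(first_acoustic)     # (B, T_ac, H)
        # Inject the pre-selected emotion parameter into the query,
        # then attend over the acoustic features to align all three inputs.
        query = first_synth + self.emotion_proj(emotion_param).unsqueeze(1)
        third_synth, _ = self.align(query, second_synth, second_synth)
        return self.decoder(third_synth)                             # (B, T_text, feat_dim)

model = EmotionalTTS()
out = model(
    torch.randint(0, 256, (1, 20)),  # first text: 20 token ids
    torch.randn(1, 50, 80),          # first acoustic feature: 50 mel frames
    torch.randn(1, 16),              # pre-selected emotion expression parameter
)
print(out.shape)  # torch.Size([1, 20, 80]) -- the second acoustic feature
```

Varying the emotion parameter at inference time would, under this reading, shift the synthesized acoustic features toward a chosen degree of emotional expression, which matches the stated goal of the patent.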

Bibliographic Details
Author: HU DAMENG
Format: Patent
Publication Number: CN113112987A
Publication Date: 2021-07-13
Language: Chinese; English
Online Access: Order full text
Record ID: cdi_epo_espacenet_CN113112987A
Source: esp@cenet
Subjects: ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T18%3A39%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=HU%20DAMENG&rft.date=2021-07-13&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN113112987A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true