Cross-language speech synthesis method based on end-to-end tone and emotion migration



Bibliographic details
Main authors: GUO YONGBIN, ZHANG LIUJIAN, LIU JIANGFENG, MAO AIHUA
Format: Patent
Language: chi ; eng
creator GUO YONGBIN
ZHANG LIUJIAN
LIU JIANGFENG
MAO AIHUA
description The invention discloses a cross-language speech synthesis method based on end-to-end tone and emotion migration, comprising the following steps: S1, collecting and processing Chinese and English speech training data and extracting the required speech features; S2, training a learning network architecture for Chinese and English speech synthesis, the architecture comprising a speaker encoder, a synthesizer and a vocoder; and S3, performing cross-language speech synthesis on real-time speech input by the speaker using the trained network, so that the synthesized speech effectively retains the speaker's tone and emotion. With the method provided by the invention, cross-language speech can be synthesized given only a small amount of the speaker's speech, while the speaker's tone and emotion are preserved in the synthesized speech.
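The abstract names the three components (speaker encoder, synthesizer, vocoder) and the three steps S1-S3 but does not disclose their internals. The following is a minimal sketch, assuming a typical speaker-embedding-conditioned TTS pipeline of that shape; the module names, layer choices and the synthesize helper are illustrative placeholders, not the patented architecture.

# Minimal sketch of the S1-S3 pipeline described in the abstract:
# speaker encoder -> synthesizer -> vocoder. Module internals are
# placeholders chosen for brevity, not the method claimed in the patent.
import torch
import torch.nn as nn


class SpeakerEncoder(nn.Module):
    """Maps a reference mel-spectrogram to a fixed-length speaker/emotion embedding."""
    def __init__(self, n_mels=80, embed_dim=256):
        super().__init__()
        self.rnn = nn.GRU(n_mels, embed_dim, batch_first=True)

    def forward(self, ref_mels):                          # (B, T_ref, n_mels)
        _, h = self.rnn(ref_mels)                         # h: (1, B, embed_dim)
        return torch.nn.functional.normalize(h[-1], dim=-1)


class Synthesizer(nn.Module):
    """Predicts a mel-spectrogram from text tokens, conditioned on the speaker embedding."""
    def __init__(self, vocab_size=512, embed_dim=256, n_mels=80):
        super().__init__()
        self.text_emb = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.GRU(embed_dim * 2, embed_dim, batch_first=True)
        self.to_mel = nn.Linear(embed_dim, n_mels)

    def forward(self, tokens, spk_embed):                 # tokens: (B, T_text)
        txt = self.text_emb(tokens)                       # (B, T_text, embed_dim)
        spk = spk_embed.unsqueeze(1).expand(-1, txt.size(1), -1)
        out, _ = self.decoder(torch.cat([txt, spk], dim=-1))
        return self.to_mel(out)                           # (B, T_text, n_mels)


class Vocoder(nn.Module):
    """Converts the predicted mel-spectrogram to a waveform (placeholder upsampler)."""
    def __init__(self, n_mels=80, hop=256):
        super().__init__()
        self.proj = nn.Linear(n_mels, hop)

    def forward(self, mels):                              # (B, T, n_mels)
        return self.proj(mels).flatten(1)                 # (B, T * hop) samples


def synthesize(ref_mels, tokens, encoder, synthesizer, vocoder):
    """S3: from a short reference clip (ref_mels) and target-language text tokens,
    produce speech conditioned on the reference speaker/emotion embedding."""
    with torch.no_grad():
        spk_embed = encoder(ref_mels)                     # S2-trained speaker encoder
        mels = synthesizer(tokens, spk_embed)             # conditioned mel prediction
        return vocoder(mels)                              # waveform


if __name__ == "__main__":
    enc, syn, voc = SpeakerEncoder(), Synthesizer(), Vocoder()
    ref = torch.randn(1, 120, 80)                         # reference mel features (S1 output)
    txt = torch.randint(0, 512, (1, 40))                  # token IDs of the target-language text
    wav = synthesize(ref, txt, enc, syn, voc)
    print(wav.shape)                                      # torch.Size([1, 10240])

In such a design the speaker encoder is trained (or pre-trained) to produce embeddings that capture timbre and emotion, so at inference only a small amount of the speaker's speech is needed to condition synthesis in the other language, which matches the claim in the abstract.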
format Patent
language chi ; eng
recordid cdi_epo_espacenet_CN115359774A
source esp@cenet
subjects ACOUSTICS
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Cross-language speech synthesis method based on end-to-end tone and emotion migration
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T08%3A23%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=GUO%20YONGBIN&rft.date=2022-11-18&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN115359774A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true