Cross-language speech synthesis method based on end-to-end tone and emotion migration



Bibliographic details
Main authors: GUO YONGBIN, ZHANG LIUJIAN, LIU JIANGFENG, MAO AIHUA
Format: Patent
Language: chi ; eng
creator GUO YONGBIN
ZHANG LIUJIAN
LIU JIANGFENG
MAO AIHUA
description The invention discloses a cross-language speech synthesis method based on end-to-end tone and emotion migration, comprising the following steps: S1, collecting and processing Chinese and English speech training data and extracting the required speech features; S2, training a learning network architecture for Chinese and English speech synthesis, the architecture comprising a speaker encoder, a synthesizer and a vocoder; and S3, performing cross-language speech synthesis on real-time speech input by the speaker using the trained network, so that the synthesized speech effectively retains the speaker's tone and emotion. With the method provided by the invention, cross-language speech can be synthesized given only a small amount of the speaker's speech, while the speaker's tone and emotion are preserved in the synthesized speech.
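The abstract names the three components (speaker encoder, synthesizer, vocoder) and the three steps S1-S3 but does not disclose their internals. The following is a minimal sketch, assuming a typical speaker-embedding-conditioned TTS pipeline of that shape; the module names, layer choices and the synthesize helper are illustrative placeholders, not the patented architecture.

# Minimal sketch of the S1-S3 pipeline described in the abstract:
# speaker encoder -> synthesizer -> vocoder. Module internals are
# placeholders chosen for brevity, not the method claimed in the patent.
import torch
import torch.nn as nn


class SpeakerEncoder(nn.Module):
    """Maps a reference mel-spectrogram to a fixed-length speaker/emotion embedding."""
    def __init__(self, n_mels=80, embed_dim=256):
        super().__init__()
        self.rnn = nn.GRU(n_mels, embed_dim, batch_first=True)

    def forward(self, ref_mels):                          # (B, T_ref, n_mels)
        _, h = self.rnn(ref_mels)                         # h: (1, B, embed_dim)
        return torch.nn.functional.normalize(h[-1], dim=-1)


class Synthesizer(nn.Module):
    """Predicts a mel-spectrogram from text tokens, conditioned on the speaker embedding."""
    def __init__(self, vocab_size=512, embed_dim=256, n_mels=80):
        super().__init__()
        self.text_emb = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.GRU(embed_dim * 2, embed_dim, batch_first=True)
        self.to_mel = nn.Linear(embed_dim, n_mels)

    def forward(self, tokens, spk_embed):                 # tokens: (B, T_text)
        txt = self.text_emb(tokens)                       # (B, T_text, embed_dim)
        spk = spk_embed.unsqueeze(1).expand(-1, txt.size(1), -1)
        out, _ = self.decoder(torch.cat([txt, spk], dim=-1))
        return self.to_mel(out)                           # (B, T_text, n_mels)


class Vocoder(nn.Module):
    """Converts the predicted mel-spectrogram to a waveform (placeholder upsampler)."""
    def __init__(self, n_mels=80, hop=256):
        super().__init__()
        self.proj = nn.Linear(n_mels, hop)

    def forward(self, mels):                              # (B, T, n_mels)
        return self.proj(mels).flatten(1)                 # (B, T * hop) samples


def synthesize(ref_mels, tokens, encoder, synthesizer, vocoder):
    """S3: from a short reference clip (ref_mels) and target-language text tokens,
    produce speech conditioned on the reference speaker/emotion embedding."""
    with torch.no_grad():
        spk_embed = encoder(ref_mels)                     # S2-trained speaker encoder
        mels = synthesizer(tokens, spk_embed)             # conditioned mel prediction
        return vocoder(mels)                              # waveform


if __name__ == "__main__":
    enc, syn, voc = SpeakerEncoder(), Synthesizer(), Vocoder()
    ref = torch.randn(1, 120, 80)                         # reference mel features (S1 output)
    txt = torch.randint(0, 512, (1, 40))                  # token IDs of the target-language text
    wav = synthesize(ref, txt, enc, syn, voc)
    print(wav.shape)                                      # torch.Size([1, 10240])

In such a design the speaker encoder is trained (or pre-trained) to produce embeddings that capture timbre and emotion, so at inference only a small amount of the speaker's speech is needed to condition synthesis in the other language, which matches the claim in the abstract.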
format Patent
language chi ; eng
recordid cdi_epo_espacenet_CN115359774A
source esp@cenet
subjects ACOUSTICS
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Cross-language speech synthesis method based on end-to-end tone and emotion migration
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T08%3A23%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=GUO%20YONGBIN&rft.date=2022-11-18&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN115359774A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true