Synthetic speech processing method and related device

An embodiment of the invention discloses a synthetic speech processing method and a related device. The method comprises: acquiring an original synthetic speech for a first user; extracting amplitude spectra and phase spectra of the original synthetic speech, wherein the amp...

Detailed description

Saved in:
Bibliographic details
Main authors: WANG ZHIGUO, HU GUOPING, WU HONGCHUAN, JIANG YUAN
Format: Patent
Language: chi ; eng
Subjects:
Online access: Order full text
creator WANG ZHIGUO
HU GUOPING
WU HONGCHUAN
JIANG YUAN
description An embodiment of the invention discloses a synthetic speech processing method and a related device. The method comprises: acquiring an original synthetic speech for a first user; extracting the amplitude spectra and phase spectra of the original synthetic speech, wherein the amplitude spectra comprise an energy-dimension amplitude spectrum and the amplitude spectra of the remaining dimensions; processing the remaining-dimension amplitude spectra through a pre-trained forward generator model to obtain a corresponding enhanced amplitude spectrum; and generating a target synthetic speech for the first user from the energy-dimension amplitude spectrum, the enhanced amplitude spectrum, and the phase spectrum. According to the synthetic speech processing method provided by the invention, the naturalness and similarity of the synthetic speech can be improved, so that the synthetic speech is closer to natural speech, and the interaction experience o
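The steps listed in the description can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the record does not say how the spectra are computed or which dimension carries the energy, so the short-time Fourier transform, the frame and hop sizes, the choice of column 0 as the "energy dimension", and the identity function `placeholder_generator` standing in for the pre-trained forward generator model are all assumptions made here.

```python
import numpy as np

FRAME = 256  # hypothetical frame length (not given in the record)
HOP = 128    # hypothetical hop size, 50% overlap


def stft(signal):
    """Split the signal into Hann-windowed frames and take the real FFT of each."""
    window = np.hanning(FRAME)
    n_frames = 1 + (len(signal) - FRAME) // HOP
    frames = np.stack(
        [signal[i * HOP:i * HOP + FRAME] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, length):
    """Weighted overlap-add inverse of stft()."""
    window = np.hanning(FRAME)
    frames = np.fft.irfft(spec, n=FRAME, axis=1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * HOP
        out[start:start + FRAME] += frame * window
        norm[start:start + FRAME] += window ** 2
    # Guard against division by zero where the window weight vanishes.
    return out / np.maximum(norm, 1e-8)


def placeholder_generator(other_dims):
    """Stand-in for the pre-trained forward generator (assumption: identity)."""
    return other_dims


def enhance(signal, generator=placeholder_generator):
    """Extract amplitude/phase, enhance the non-energy dimensions, resynthesize."""
    spec = stft(signal)
    amplitude = np.abs(spec)
    phase = np.angle(spec)
    energy_dim = amplitude[:, :1]   # assumption: column 0 is the energy dimension
    other_dims = amplitude[:, 1:]   # all remaining amplitude dimensions
    enhanced = generator(other_dims)
    full_amplitude = np.concatenate([energy_dim, enhanced], axis=1)
    # Recombine the amplitudes with the untouched phase spectrum.
    return istft(full_amplitude * np.exp(1j * phase), len(signal))
```

With the identity placeholder, `enhance` returns the input essentially unchanged away from the signal edges; a trained model would be passed in through the `generator` argument instead.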
format Patent
date 2021-07-02
creationdate 2021
oa free_for_read
linktohtml https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210702&DB=EPODOC&CC=CN&NR=113066472A
fulltext fulltext_linktorsrc
language chi ; eng
recordid cdi_epo_espacenet_CN113066472A
source esp@cenet
subjects ACOUSTICS
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Synthetic speech processing method and related device
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T02%3A13%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=WANG%20ZHIGUO&rft.date=2021-07-02&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN113066472A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true