Synthetic speech processing method and related device
The embodiment of the invention discloses a synthetic speech processing method and a related device. The method comprises the following steps: acquiring an original synthetic speech for a first user; extracting amplitude spectrums and phase spectrums of the original synthetic speech, wherein the amp...
Main authors: | WANG ZHIGUO, HU GUOPING, WU HONGCHUAN, JIANG YUAN |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | WANG ZHIGUO ; HU GUOPING ; WU HONGCHUAN ; JIANG YUAN |
description | The embodiment of the invention discloses a synthetic speech processing method and a related device. The method comprises the following steps: acquiring an original synthetic speech for a first user; extracting amplitude spectrums and phase spectrums of the original synthetic speech, wherein the amplitude spectrums comprise an energy-dimension amplitude spectrum and other dimensions of amplitude spectrums except the energy-dimension amplitude spectrum; processing the other dimensions of amplitude spectrums through a pre-trained forward generator model to obtain a corresponding enhanced amplitude spectrum; and generating a target synthetic speech for the first user according to the energy-dimension amplitude spectrum, the enhanced amplitude spectrum and the phase spectrum. According to the synthetic speech processing method provided by the invention, the naturalness and similarity of the synthetic speech can be improved, so that the synthetic speech is closer to natural speech, and the interaction experience o |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN113066472A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN113066472A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN113066472A3</originalsourceid><addsrcrecordid>eNrjZDANrswryUgtyUxWKC5ITU3OUCgoyk9OLS7OzEtXyE0tychPUUjMS1EoSs1JLElNUUhJLctMTuVhYE1LzClO5YXS3AyKbq4hzh66qQX58anFBYnJqXmpJfHOfoaGxgZmZibmRo7GxKgBAICILU0</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Synthetic speech processing method and related device</title><source>esp@cenet</source><creator>WANG ZHIGUO ; HU GUOPING ; WU HONGCHUAN ; JIANG YUAN</creator><creatorcontrib>WANG ZHIGUO ; HU GUOPING ; WU HONGCHUAN ; JIANG YUAN</creatorcontrib><description>The embodiment of the invention discloses a synthetic speech processing method and a related device. The method comprises the following steps: acquiring an original synthetic speech for a first user; extracting amplitude spectrums and phase spectrums of the original synthetic speech, wherein the amplitude spectrums comprise an energy-dimension amplitude spectrum and other dimensions of amplitude spectrums except the energy-dimension amplitude spectrum; processing the other dimensions of amplitude spectrums through a pre-trained forward generator model to obtain a corresponding enhanced amplitude spectrum; and generating a target synthetic speech for the first user according to the energy-dimension amplitude spectrum, the enhanced amplitude spectrum and the phase spectrum. 
According to the synthetic speech processing method provided by the invention, the naturalness and similarity of the synthetic speech can be improved, so that the synthetic speech is closer to natural speech, and the interaction experience o</description><language>chi ; eng</language><subject>ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2021</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210702&DB=EPODOC&CC=CN&NR=113066472A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,777,882,25545,76296</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210702&DB=EPODOC&CC=CN&NR=113066472A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>WANG ZHIGUO</creatorcontrib><creatorcontrib>HU GUOPING</creatorcontrib><creatorcontrib>WU HONGCHUAN</creatorcontrib><creatorcontrib>JIANG YUAN</creatorcontrib><title>Synthetic speech processing method and related device</title><description>The embodiment of the invention discloses a synthetic speech processing method and a related device. 
The method comprises the following steps: acquiring an original synthetic speech for a first user; extracting amplitude spectrums and phase spectrums of the original synthetic speech, wherein the amplitude spectrums comprise an energy-dimension amplitude spectrum and other dimensions of amplitude spectrums except the energy-dimension amplitude spectrum; processing the other dimensions of amplitude spectrums through a pre-trained forward generator model to obtain a corresponding enhanced amplitude spectrum; and generating a target synthetic speech for the first user according to the energy-dimension amplitude spectrum, the enhanced amplitude spectrum and the phase spectrum. According to the synthetic speech processing method provided by the invention, the naturalness and similarity of the synthetic speech can be improved, so that the synthetic speech is closer to natural speech, and the interaction experience o</description><subject>ACOUSTICS</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2021</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZDANrswryUgtyUxWKC5ITU3OUCgoyk9OLS7OzEtXyE0tychPUUjMS1EoSs1JLElNUUhJLctMTuVhYE1LzClO5YXS3AyKbq4hzh66qQX58anFBYnJqXmpJfHOfoaGxgZmZibmRo7GxKgBAICILU0</recordid><startdate>20210702</startdate><enddate>20210702</enddate><creator>WANG ZHIGUO</creator><creator>HU GUOPING</creator><creator>WU HONGCHUAN</creator><creator>JIANG YUAN</creator><scope>EVB</scope></search><sort><creationdate>20210702</creationdate><title>Synthetic speech processing method and related device</title><author>WANG ZHIGUO ; HU GUOPING ; WU HONGCHUAN ; JIANG 
YUAN</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN113066472A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2021</creationdate><topic>ACOUSTICS</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>WANG ZHIGUO</creatorcontrib><creatorcontrib>HU GUOPING</creatorcontrib><creatorcontrib>WU HONGCHUAN</creatorcontrib><creatorcontrib>JIANG YUAN</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>WANG ZHIGUO</au><au>HU GUOPING</au><au>WU HONGCHUAN</au><au>JIANG YUAN</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Synthetic speech processing method and related device</title><date>2021-07-02</date><risdate>2021</risdate><abstract>The embodiment of the invention discloses a synthetic speech processing method and a related device. The method comprises the following steps: acquiring an original synthetic speech for a first user; extracting amplitude spectrums and phase spectrums of the original synthetic speech, wherein the amplitude spectrums comprise an energy-dimension amplitude spectrum and other dimensions of amplitude spectrums except the energy-dimension amplitude spectrum; processing the other dimensions of amplitude spectrums through a pre-trained forward generator model to obtain a corresponding enhanced amplitude spectrum; and generating a target synthetic speech for the first user according to the energy-dimension amplitude spectrum, the enhanced amplitude spectrum and the phase spectrum. 
According to the synthetic speech processing method provided by the invention, the naturalness and similarity of the synthetic speech can be improved, so that the synthetic speech is closer to natural speech, and the interaction experience o</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN113066472A |
source | esp@cenet |
subjects | ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION |
title | Synthetic speech processing method and related device |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T02%3A13%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=WANG%20ZHIGUO&rft.date=2021-07-02&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN113066472A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
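
The processing steps listed in the description field (extract amplitude and phase spectra, set the energy dimension aside, enhance the remaining dimensions with a generator, then recombine) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent text: the function names are hypothetical, the "energy dimension" is interpreted as the per-frame spectral energy with the remaining dimensions as the energy-normalised spectral shape, frames are processed without overlap, and `placeholder_generator` is an identity stand-in for the pre-trained forward generator model, whose architecture the record does not describe.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT; keeps only the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def split_frame(frame):
    """Split one frame into (energy, shape, phase).

    Assumption: 'energy' is the L2 norm of the amplitude spectrum and
    'shape' is the amplitude spectrum divided by that energy.
    """
    spec = dft(frame)
    amplitude = [abs(c) for c in spec]
    phase = [cmath.phase(c) for c in spec]
    energy = math.sqrt(sum(a * a for a in amplitude))
    shape = [a / energy if energy > 0 else 0.0 for a in amplitude]
    return energy, shape, phase


def placeholder_generator(shape):
    """Hypothetical stand-in for the pre-trained forward generator (identity)."""
    return list(shape)


def rebuild_frame(energy, enhanced_shape, phase):
    """Recombine the untouched energy, the enhanced shape and the original phase."""
    spec = [cmath.rect(energy * a, p) for a, p in zip(enhanced_shape, phase)]
    return idft(spec)


def process(frames, generator=placeholder_generator):
    """Apply split -> enhance -> rebuild to every frame."""
    out = []
    for frame in frames:
        energy, shape, phase = split_frame(frame)
        out.append(rebuild_frame(energy, generator(shape), phase))
    return out


if __name__ == "__main__":
    # One 16-sample sine frame; with the identity generator the frame is reconstructed.
    frame = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]
    restored = process([frame])[0]
    print(max(abs(a - b) for a, b in zip(frame, restored)))
```

With the identity placeholder the pipeline only reconstructs its input, which makes the split and recombination easy to check; a real generator would change `shape` while the energy and phase pass through unmodified, which is the separation the abstract describes.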