Information processing method, device and equipment
The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according...
Main authors: | ZHONG RONGXIU, DENG CHAO, YANG HUIBAO, LIU YING, ZHANG SHILEI |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | ZHONG RONGXIU DENG CHAO YANG HUIBAO LIU YING ZHANG SHILEI |
description | The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. The problem of noise or |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN116911251A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN116911251A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN116911251A3</originalsourceid><addsrcrecordid>eNrjZDD2zEvLL8pNLMnMz1MoKMpPTi0uzsxLV8hNLcnIT9FRSEkty0xOVUjMS1FILSzNLMhNzSvhYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmaWhoZGpoaOxsSoAQAb0yyb</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Information processing method, device and equipment</title><source>esp@cenet</source><creator>ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI</creator><creatorcontrib>ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI</creatorcontrib><description>The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. 
The problem of noise or</description><language>chi ; eng</language><subject>ACOUSTICS ; CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20231020&DB=EPODOC&CC=CN&NR=116911251A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25563,76318</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20231020&DB=EPODOC&CC=CN&NR=116911251A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>ZHONG RONGXIU</creatorcontrib><creatorcontrib>DENG CHAO</creatorcontrib><creatorcontrib>YANG HUIBAO</creatorcontrib><creatorcontrib>LIU YING</creatorcontrib><creatorcontrib>ZHANG SHILEI</creatorcontrib><title>Information processing method, device and equipment</title><description>The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. 
According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. The problem of noise or</description><subject>ACOUSTICS</subject><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZDD2zEvLL8pNLMnMz1MoKMpPTi0uzsxLV8hNLcnIT9FRSEkty0xOVUjMS1FILSzNLMhNzSvhYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmaWhoZGpoaOxsSoAQAb0yyb</recordid><startdate>20231020</startdate><enddate>20231020</enddate><creator>ZHONG RONGXIU</creator><creator>DENG CHAO</creator><creator>YANG HUIBAO</creator><creator>LIU YING</creator><creator>ZHANG SHILEI</creator><scope>EVB</scope></search><sort><creationdate>20231020</creationdate><title>Information processing method, device and equipment</title><author>ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN116911251A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; 
eng</language><creationdate>2023</creationdate><topic>ACOUSTICS</topic><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>ZHONG RONGXIU</creatorcontrib><creatorcontrib>DENG CHAO</creatorcontrib><creatorcontrib>YANG HUIBAO</creatorcontrib><creatorcontrib>LIU YING</creatorcontrib><creatorcontrib>ZHANG SHILEI</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>ZHONG RONGXIU</au><au>DENG CHAO</au><au>YANG HUIBAO</au><au>LIU YING</au><au>ZHANG SHILEI</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Information processing method, device and equipment</title><date>2023-10-20</date><risdate>2023</risdate><abstract>The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. 
According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. The problem of noise or</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN116911251A |
source | esp@cenet |
subjects | ACOUSTICS CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING MUSICAL INSTRUMENTS PHYSICS SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION |
title | Information processing method, device and equipment |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T16%3A48%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=ZHONG%20RONGXIU&rft.date=2023-10-20&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN116911251A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |