Information processing method, device and equipment

The invention provides an information processing method, device and equipment. The method comprises: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according...

Detailed description

Bibliographic details
Main authors: ZHONG RONGXIU, DENG CHAO, YANG HUIBAO, LIU YING, ZHANG SHILEI
Format: Patent
Language: chi ; eng
creator ZHONG RONGXIU
DENG CHAO
YANG HUIBAO
LIU YING
ZHANG SHILEI
description The invention provides an information processing method, device and equipment. The method comprises: obtaining first text coding content corresponding to to-be-processed audio data; and using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and target sound feature information, wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. Under this scheme, the generative adversarial network model predicts the voice waveform directly (that is, it produces the target audio data), so no vocoder is needed to synthesize the waveform. This realizes end-to-end voice conversion, avoids the mismatch problem caused by cascading a vocoder as well as the noise and tone-quality damage present in vocoder output, and improves voice conversion efficiency. The problem of noise or
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN116911251A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN116911251A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN116911251A3</originalsourceid><addsrcrecordid>eNrjZDD2zEvLL8pNLMnMz1MoKMpPTi0uzsxLV8hNLcnIT9FRSEkty0xOVUjMS1FILSzNLMhNzSvhYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmaWhoZGpoaOxsSoAQAb0yyb</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Information processing method, device and equipment</title><source>esp@cenet</source><creator>ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI</creator><creatorcontrib>ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI</creatorcontrib><description>The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. 
The problem of noise or</description><language>chi ; eng</language><subject>ACOUSTICS ; CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20231020&amp;DB=EPODOC&amp;CC=CN&amp;NR=116911251A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25563,76318</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20231020&amp;DB=EPODOC&amp;CC=CN&amp;NR=116911251A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>ZHONG RONGXIU</creatorcontrib><creatorcontrib>DENG CHAO</creatorcontrib><creatorcontrib>YANG HUIBAO</creatorcontrib><creatorcontrib>LIU YING</creatorcontrib><creatorcontrib>ZHANG SHILEI</creatorcontrib><title>Information processing method, device and equipment</title><description>The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. 
According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. The problem of noise or</description><subject>ACOUSTICS</subject><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZDD2zEvLL8pNLMnMz1MoKMpPTi0uzsxLV8hNLcnIT9FRSEkty0xOVUjMS1FILSzNLMhNzSvhYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmaWhoZGpoaOxsSoAQAb0yyb</recordid><startdate>20231020</startdate><enddate>20231020</enddate><creator>ZHONG RONGXIU</creator><creator>DENG CHAO</creator><creator>YANG HUIBAO</creator><creator>LIU YING</creator><creator>ZHANG SHILEI</creator><scope>EVB</scope></search><sort><creationdate>20231020</creationdate><title>Information processing method, device and equipment</title><author>ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN116911251A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; 
eng</language><creationdate>2023</creationdate><topic>ACOUSTICS</topic><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>ZHONG RONGXIU</creatorcontrib><creatorcontrib>DENG CHAO</creatorcontrib><creatorcontrib>YANG HUIBAO</creatorcontrib><creatorcontrib>LIU YING</creatorcontrib><creatorcontrib>ZHANG SHILEI</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>ZHONG RONGXIU</au><au>DENG CHAO</au><au>YANG HUIBAO</au><au>LIU YING</au><au>ZHANG SHILEI</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Information processing method, device and equipment</title><date>2023-10-20</date><risdate>2023</risdate><abstract>The invention provides an information processing method, device and equipment, and the method comprises the steps: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and the target sound feature information; wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. 
According to the scheme, the generative adversarial network model can be adopted to predict the voice waveform (that is, the target audio data is obtained), a vocoder is not needed to synthesize the voice waveform, end-to-end voice conversion is realized, the mismatch problem caused by vocoder cascade and the defects of noise or tone quality damage and the like existing in a result output by the vocoder are avoided, and the voice conversion efficiency is improved. The problem of noise or</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
language chi ; eng
recordid cdi_epo_espacenet_CN116911251A
source esp@cenet
subjects ACOUSTICS
CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Information processing method, device and equipment
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T16%3A48%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=ZHONG%20RONGXIU&rft.date=2023-10-20&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN116911251A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true