Information processing method, device and equipment

The invention provides an information processing method, device and equipment. The method comprises: obtaining first text coding content corresponding to to-be-processed audio data; using a target generator in a generative adversarial network model to obtain target audio data according...

Detailed description

Bibliographic details
Main authors:	ZHONG RONGXIU, DENG CHAO, YANG HUIBAO, LIU YING, ZHANG SHILEI
Format:	Patent
Language:	chi ; eng
Subjects:	ACOUSTICS ; CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION

creator	ZHONG RONGXIU ; DENG CHAO ; YANG HUIBAO ; LIU YING ; ZHANG SHILEI
description	The invention provides an information processing method, device and equipment. The method comprises: obtaining first text coding content corresponding to to-be-processed audio data; and using a target generator in a generative adversarial network model to obtain target audio data according to the first text coding content and target sound feature information, wherein the target sound feature information comprises at least one of target loudness information, target tone information and target timbre information. Under this scheme, the generative adversarial network model predicts the voice waveform directly (that is, it produces the target audio data), so no vocoder is needed to synthesize the waveform. This realizes end-to-end voice conversion, avoids the mismatch problem caused by cascading a vocoder as well as the noise and tone-quality damage present in vocoder output, and improves voice conversion efficiency. The problem of noise or
format	Patent
fulltext	fulltext_linktorsrc
language	chi ; eng
recordid	cdi_epo_espacenet_CN116911251A
source	esp@cenet
subjects	ACOUSTICS ; CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION
title	Information processing method, device and equipment
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T16%3A48%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=ZHONG%20RONGXIU&rft.date=2023-10-20&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN116911251A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true