Data processing method and device, equipment and storage medium

The invention provides a data processing method and device, equipment, and a storage medium, and relates to the technical field of speech recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle...
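
The full description in this record says the error-corrected speech recognition text is screened against the subtitle text recognized from the corresponding video frames to obtain training data. The record does not say how that screening is done, so the following is only a minimal sketch under stated assumptions: the `Segment` type, the character-level similarity measure, and the `threshold` value are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Segment:
    """One aligned audio segment (hypothetical structure, not from the patent)."""
    start: float          # segment start time in seconds, from forced alignment
    end: float            # segment end time in seconds
    corrected_text: str   # speech recognition text after error correction
    ocr_text: str         # subtitle text recognized from the matching frames


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed screening metric."""
    if not a and not b:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def screen_segments(segments, threshold=0.9):
    """Keep segments whose corrected text closely agrees with the subtitle text.

    Segments without any subtitle text are dropped, because there is
    nothing to check the transcript against.
    """
    return [
        s for s in segments
        if s.ocr_text and similarity(s.corrected_text, s.ocr_text) >= threshold
    ]
```

Under these assumptions, a segment whose corrected transcript matches its subtitle text exactly is kept, while a segment with a clearly different or missing subtitle is discarded; the surviving segments would serve as (audio span, text) pairs for training. A stricter or looser `threshold` trades data quantity against label quality.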

Detailed description

Bibliographic details
Main authors:	FAN LU, HE XIAODONG, WU YOUZHENG, LI FANGZHU, DENG LIPING, FU LI
Format:	Patent
Language:	chi ; eng
Subjects:	ACOUSTICS, MUSICAL INSTRUMENTS, PHYSICS, SPEECH ANALYSIS OR SYNTHESIS, SPEECH OR AUDIO CODING OR DECODING, SPEECH OR VOICE PROCESSING, SPEECH RECOGNITION

creator	FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI
description	The invention provides a data processing method and device, equipment, and a storage medium, and relates to the technical field of speech recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on the corresponding to-be-processed video frames; performing speech recognition on the audio data and performing forced alignment to obtain an aligned text; performing error correction on the aligned text to obtain an error-corrected text; and screening the error-corrected text against the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model. 本公开提供一种数据处理方法、装置、设备及存储介质，涉及语音识别技术领域。该方法包括：获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本，对待处理视频的音频数据及进行语音识别并进
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN117789705A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN117789705A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN117789705A3</originalsourceid><addsrcrecordid>eNrjZLB3SSxJVCgoyk9OLS7OzEtXyE0tychPUUjMS1FISS3LTE7VUUgtLM0syE3NKwGLFpfkFyWmpwIVpmSW5vIwsKYl5hSn8kJpbgZFN9cQZw_d1IL8-NTigsTk1LzUknhnP0NDc3MLS3MDU0djYtQAAHTtMMg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Data processing method and device, equipment and storage medium</title><source>esp@cenet</source><creator>FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI</creator><creatorcontrib>FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI</creatorcontrib><description>The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on a corresponding to-be-processed video frame, performing voice recognition on the audio data of the to-be-processed video and performing forced alignment processing to obtain an aligned text, and then performing error correction processing on the aligned text to obtain an error correction result of the to-be-processed video. And obtaining an error-corrected text, and screening the error-corrected text by referring to the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model. 
本公开提供一种数据处理方法、装置、设备及存储介质，涉及语音识别技术领域。该方法包括：获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本，对待处理视频的音频数据及进行语音识别并进</description><language>chi ; eng</language><subject>ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240329&DB=EPODOC&CC=CN&NR=117789705A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240329&DB=EPODOC&CC=CN&NR=117789705A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>FAN LU</creatorcontrib><creatorcontrib>HE XIAODONG</creatorcontrib><creatorcontrib>WU YOUZHENG</creatorcontrib><creatorcontrib>LI FANGZHU</creatorcontrib><creatorcontrib>DENG LIPING</creatorcontrib><creatorcontrib>FU LI</creatorcontrib><title>Data processing method and device, equipment and storage medium</title><description>The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. 
The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on a corresponding to-be-processed video frame, performing voice recognition on the audio data of the to-be-processed video and performing forced alignment processing to obtain an aligned text, and then performing error correction processing on the aligned text to obtain an error correction result of the to-be-processed video. And obtaining an error-corrected text, and screening the error-corrected text by referring to the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model. 本公开提供一种数据处理方法、装置、设备及存储介质，涉及语音识别技术领域。该方法包括：获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本，对待处理视频的音频数据及进行语音识别并进</description><subject>ACOUSTICS</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZLB3SSxJVCgoyk9OLS7OzEtXyE0tychPUUjMS1FISS3LTE7VUUgtLM0syE3NKwGLFpfkFyWmpwIVpmSW5vIwsKYl5hSn8kJpbgZFN9cQZw_d1IL8-NTigsTk1LzUknhnP0NDc3MLS3MDU0djYtQAAHTtMMg</recordid><startdate>20240329</startdate><enddate>20240329</enddate><creator>FAN LU</creator><creator>HE XIAODONG</creator><creator>WU YOUZHENG</creator><creator>LI FANGZHU</creator><creator>DENG LIPING</creator><creator>FU LI</creator><scope>EVB</scope></search><sort><creationdate>20240329</creationdate><title>Data processing method and device, equipment and storage medium</title><author>FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU 
LI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN117789705A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2024</creationdate><topic>ACOUSTICS</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>FAN LU</creatorcontrib><creatorcontrib>HE XIAODONG</creatorcontrib><creatorcontrib>WU YOUZHENG</creatorcontrib><creatorcontrib>LI FANGZHU</creatorcontrib><creatorcontrib>DENG LIPING</creatorcontrib><creatorcontrib>FU LI</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>FAN LU</au><au>HE XIAODONG</au><au>WU YOUZHENG</au><au>LI FANGZHU</au><au>DENG LIPING</au><au>FU LI</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Data processing method and device, equipment and storage medium</title><date>2024-03-29</date><risdate>2024</risdate><abstract>The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on a corresponding to-be-processed video frame, performing voice recognition on the audio data of the to-be-processed video and performing forced alignment processing to obtain an aligned text, and then performing error correction processing on the aligned text to obtain an error correction result of the to-be-processed video. 
And obtaining an error-corrected text, and screening the error-corrected text by referring to the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model. 本公开提供一种数据处理方法、装置、设备及存储介质，涉及语音识别技术领域。该方法包括：获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本，对待处理视频的音频数据及进行语音识别并进</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
language	chi ; eng
recordid	cdi_epo_espacenet_CN117789705A
source	esp@cenet
subjects	ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION
title	Data processing method and device, equipment and storage medium
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T02%3A36%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=FAN%20LU&rft.date=2024-03-29&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN117789705A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true