Data processing method and device, equipment and storage medium

The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of speech recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle...

Detailed description

Saved in:
Bibliographic details
Main authors: FAN LU, HE XIAODONG, WU YOUZHENG, LI FANGZHU, DENG LIPING, FU LI
Format: Patent
Language: chi ; eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator FAN LU
HE XIAODONG
WU YOUZHENG
LI FANGZHU
DENG LIPING
FU LI
description The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of speech recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on the corresponding to-be-processed video frames; performing speech recognition on the audio data and applying forced alignment to obtain an aligned text; performing error correction on the aligned text to obtain an error-corrected text; and screening the error-corrected text against the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model.
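The final screening step in the abstract can be illustrated with a minimal sketch. The abstract does not say how the error-corrected text is compared with the image recognition (subtitle) text, so the character-level similarity ratio, the 0.8 threshold, and the names `similarity`, `screen_segments`, `corrected_text`, `ocr_text` and `audio` are all assumptions made for illustration, not the patented method.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (assumed metric, not from the patent)."""
    return SequenceMatcher(None, a, b).ratio()


def screen_segments(segments, threshold=0.8):
    """Keep segments whose error-corrected text agrees with the subtitle OCR text.

    Each segment is a dict with hypothetical keys:
      "audio"          - reference to the audio clip
      "corrected_text" - speech recognition output after alignment and error correction
      "ocr_text"       - subtitle text recognized from the matching video frames
    Returns (audio, text) pairs usable as speech recognition training data.
    """
    kept = []
    for seg in segments:
        score = similarity(seg["corrected_text"], seg["ocr_text"])
        if score >= threshold:  # hypothetical acceptance threshold
            kept.append({"audio": seg["audio"], "text": seg["corrected_text"]})
    return kept


if __name__ == "__main__":
    demo = [
        {"audio": "a.wav", "corrected_text": "hello world", "ocr_text": "hello world"},
        {"audio": "b.wav", "corrected_text": "hello world", "ocr_text": "xyz"},
    ]
    print(screen_segments(demo))
```

In this sketch a segment is kept only when the two independent transcriptions (speech and subtitle) largely agree, which is one plausible way to filter out noisy labels before training.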
format Patent
sourcerecordid CN117789705A
date 2024-03-29
oa free_for_read
linktohtml https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240329&DB=EPODOC&CC=CN&NR=117789705A
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language chi ; eng
recordid cdi_epo_espacenet_CN117789705A
source esp@cenet
subjects ACOUSTICS
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Data processing method and device, equipment and storage medium
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T02%3A36%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=FAN%20LU&rft.date=2024-03-29&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN117789705A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true