Data processing method and device, equipment and storage medium
The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle...
Main authors: | FAN LU, HE XIAODONG, WU YOUZHENG, LI FANGZHU, DENG LIPING, FU LI |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI |
description | The invention provides a data processing method, apparatus, device, and storage medium, and relates to the technical field of speech recognition. The method comprises: acquiring audio data of a to-be-processed video together with image recognition text obtained by performing subtitle text recognition on the corresponding video frames; performing speech recognition on the audio data and applying forced alignment to obtain aligned text; performing error correction on the aligned text to obtain error-corrected text; and screening the error-corrected text against the corresponding image recognition text to obtain training data for training a speech recognition model. The method thereby expands the training data set of the speech recognition model.
本公开提供一种数据处理方法、装置、设备及存储介质,涉及语音识别技术领域。该方法包括:获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本,对待处理视频的音频数据及进行语音识别并进 |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN117789705A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN117789705A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN117789705A3</originalsourceid><addsrcrecordid>eNrjZLB3SSxJVCgoyk9OLS7OzEtXyE0tychPUUjMS1FISS3LTE7VUUgtLM0syE3NKwGLFpfkFyWmpwIVpmSW5vIwsKYl5hSn8kJpbgZFN9cQZw_d1IL8-NTigsTk1LzUknhnP0NDc3MLS3MDU0djYtQAAHTtMMg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Data processing method and device, equipment and storage medium</title><source>esp@cenet</source><creator>FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI</creator><creatorcontrib>FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI</creatorcontrib><description>The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on a corresponding to-be-processed video frame, performing voice recognition on the audio data of the to-be-processed video and performing forced alignment processing to obtain an aligned text, and then performing error correction processing on the aligned text to obtain an error correction result of the to-be-processed video. And obtaining an error-corrected text, and screening the error-corrected text by referring to the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model.
本公开提供一种数据处理方法、装置、设备及存储介质,涉及语音识别技术领域。该方法包括:获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本,对待处理视频的音频数据及进行语音识别并进</description><language>chi ; eng</language><subject>ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240329&DB=EPODOC&CC=CN&NR=117789705A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240329&DB=EPODOC&CC=CN&NR=117789705A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>FAN LU</creatorcontrib><creatorcontrib>HE XIAODONG</creatorcontrib><creatorcontrib>WU YOUZHENG</creatorcontrib><creatorcontrib>LI FANGZHU</creatorcontrib><creatorcontrib>DENG LIPING</creatorcontrib><creatorcontrib>FU LI</creatorcontrib><title>Data processing method and device, equipment and storage medium</title><description>The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. 
The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on a corresponding to-be-processed video frame, performing voice recognition on the audio data of the to-be-processed video and performing forced alignment processing to obtain an aligned text, and then performing error correction processing on the aligned text to obtain an error correction result of the to-be-processed video. And obtaining an error-corrected text, and screening the error-corrected text by referring to the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model.
本公开提供一种数据处理方法、装置、设备及存储介质,涉及语音识别技术领域。该方法包括:获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本,对待处理视频的音频数据及进行语音识别并进</description><subject>ACOUSTICS</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZLB3SSxJVCgoyk9OLS7OzEtXyE0tychPUUjMS1FISS3LTE7VUUgtLM0syE3NKwGLFpfkFyWmpwIVpmSW5vIwsKYl5hSn8kJpbgZFN9cQZw_d1IL8-NTigsTk1LzUknhnP0NDc3MLS3MDU0djYtQAAHTtMMg</recordid><startdate>20240329</startdate><enddate>20240329</enddate><creator>FAN LU</creator><creator>HE XIAODONG</creator><creator>WU YOUZHENG</creator><creator>LI FANGZHU</creator><creator>DENG LIPING</creator><creator>FU LI</creator><scope>EVB</scope></search><sort><creationdate>20240329</creationdate><title>Data processing method and device, equipment and storage medium</title><author>FAN LU ; HE XIAODONG ; WU YOUZHENG ; LI FANGZHU ; DENG LIPING ; FU LI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN117789705A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2024</creationdate><topic>ACOUSTICS</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>FAN LU</creatorcontrib><creatorcontrib>HE XIAODONG</creatorcontrib><creatorcontrib>WU YOUZHENG</creatorcontrib><creatorcontrib>LI FANGZHU</creatorcontrib><creatorcontrib>DENG LIPING</creatorcontrib><creatorcontrib>FU 
LI</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>FAN LU</au><au>HE XIAODONG</au><au>WU YOUZHENG</au><au>LI FANGZHU</au><au>DENG LIPING</au><au>FU LI</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Data processing method and device, equipment and storage medium</title><date>2024-03-29</date><risdate>2024</risdate><abstract>The invention provides a data processing method and device, equipment and a storage medium, and relates to the technical field of voice recognition. The method comprises the following steps: acquiring audio data of a to-be-processed video and an image recognition text obtained by performing subtitle text recognition on a corresponding to-be-processed video frame, performing voice recognition on the audio data of the to-be-processed video and performing forced alignment processing to obtain an aligned text, and then performing error correction processing on the aligned text to obtain an error correction result of the to-be-processed video. And obtaining an error-corrected text, and screening the error-corrected text by referring to the corresponding image recognition text to obtain training data for training a speech recognition model. The method expands the training data set of the speech recognition model.
本公开提供一种数据处理方法、装置、设备及存储介质,涉及语音识别技术领域。该方法包括:获取待处理视频的音频数据及对对应的待处理视频帧进行字幕文本识别得到的图像识别文本,对待处理视频的音频数据及进行语音识别并进</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN117789705A |
source | esp@cenet |
subjects | ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION |
title | Data processing method and device, equipment and storage medium |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T02%3A36%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=FAN%20LU&rft.date=2024-03-29&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN117789705A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
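
The screening step named in the abstract (comparing error-corrected speech recognition text with the subtitle text recognized from the video frames) could be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not say how the comparison is performed, so the character-level similarity ratio, the 0.9 threshold, and the names `Segment`, `similarity`, and `screen_segments` are all hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Segment:
    """One aligned utterance of a video (hypothetical structure)."""
    start: float         # segment start time in seconds
    end: float           # segment end time in seconds
    corrected_text: str  # ASR text after forced alignment and error correction
    subtitle_text: str   # subtitle text recognized from the matching frame


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] (assumed criterion)."""
    return SequenceMatcher(None, a, b).ratio()


def screen_segments(segments, threshold=0.9):
    """Keep segments whose corrected text agrees with the subtitle text.

    Segments without subtitle text are dropped because there is nothing
    to check them against. The threshold value is an assumption.
    """
    return [
        s for s in segments
        if s.subtitle_text
        and similarity(s.corrected_text, s.subtitle_text) >= threshold
    ]


segments = [
    Segment(0.0, 1.8, "hello world", "hello world"),
    Segment(1.8, 3.5, "speech recognition", "weather forecast today"),
    Segment(3.5, 5.0, "training data", ""),
]
kept = screen_segments(segments)
```

In this sketch only the first segment survives screening; the retained audio span and its text would then serve as one candidate training pair for a speech recognition model.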