Method for training speech recognition model, device and storage medium

A method for training a speech recognition model, a device, and a storage medium are disclosed, relating to the field of computer technologies and in particular to speech recognition and deep learning technologies. The method includes: obtaining a fusion probability for each of at least one candidate text corresponding to a speech, based on an acoustic decoding model and a language model; selecting a preset number of candidate texts based on their fusion probabilities and determining a predicted text from the selected candidates; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.
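The abstract describes a candidate-scoring-and-selection loop that a short sketch can make concrete. The following is a minimal illustration, not the patented method itself: it assumes log-linear shallow fusion with a weight `lm_weight` as one realization of the "fusion probability", a fixed `beam_size` as the "preset number", and an expected-sentence-error loss over the kept candidates as one common choice of "loss function". All scores, candidate texts, and helper names (`fusion_scores`, `select_candidates`) are invented for the example.

```python
import torch
import torch.nn.functional as F

def fusion_scores(acoustic_log_probs, lm_log_probs, lm_weight=0.3):
    # Log-linear (shallow) fusion of acoustic-model and language-model
    # scores per candidate; one common way to obtain a fused score.
    return acoustic_log_probs + lm_weight * lm_log_probs

def select_candidates(candidates, scores, beam_size=4):
    # Keep a preset number of candidate texts, ranked by fused score.
    top = torch.topk(scores, k=min(beam_size, len(candidates)))
    return [candidates[i] for i in top.indices], top.values

# Toy data: three candidate transcriptions for one utterance, with
# invented log-probabilities standing in for real model outputs.
candidates = ["recognize speech", "wreck a nice beach", "recognized speech"]
acoustic_lp = torch.tensor([-1.2, -1.0, -2.5])  # acoustic decoding model
lm_lp = torch.tensor([-0.8, -3.0, -1.1])        # language model

scores = fusion_scores(acoustic_lp, lm_lp)
kept, kept_scores = select_candidates(candidates, scores, beam_size=2)
predicted_text = kept[0]  # highest fused score -> predicted text

# Loss from the selected candidates against the reference ("standard")
# text: expected sentence error over the beam, a stand-in for the
# discriminative loss a real training setup would use.
reference = "recognize speech"
probs = F.softmax(kept_scores, dim=0)
errors = torch.tensor([float(c != reference) for c in kept])
loss = (probs * errors).sum()  # differentiable w.r.t. the fused scores

print(predicted_text, float(loss))
```

In a real system the two sets of scores would come from trained networks, and the 0/1 sentence error would typically be replaced by a word-level edit distance so that training minimizes expected word error rate over the beam.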

Bibliographic Details
Main Authors: Chen, Zhijie; Shao, Junyao; Qian, Sheng; Zang, Qiguang; Liang, Mingxin; Zheng, Huanxin; Fu, Xiaoyin
Format: Patent
Language: English
Subjects: ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Online Access: Order full text
Record ID: cdi_epo_espacenet_US12033616B2
Source: esp@cenet
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T05%3A29%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Chen,%20Zhijie&rft.date=2024-07-09&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS12033616B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true