Distributed speech recognition using dynamically determined feature vector codebook size
In a mobile wireless communication system automatic speech recognition is performed in a distributed manner using a mobile station based near or front end stage which extracts and vector quantizes recognition feature parameters from frames of an utterance and an infrastructure based back or far end...
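The front-end/back-end split summarized above can be illustrated with a minimal sketch. It assumes a nearest-neighbour vector quantizer with Euclidean distance and a tiny hand-made codebook; the function names `quantize` and `dequantize`, the codebook, and the sample frames are hypothetical and are not taken from the patent record.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def quantize(frames, codebook):
    """Front end (assumed): map each feature vector to its nearest codeword index."""
    return [
        min(range(len(codebook)), key=lambda i: dist(frame, codebook[i]))
        for frame in frames
    ]


def dequantize(indices, codebook):
    """Back end (assumed): reverse the quantization by looking up each codeword."""
    return [codebook[i] for i in indices]


# Hypothetical 2-dimensional codebook of size Sz = 4 and four feature frames.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (9.0, 9.0)]
frames = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (8.5, 9.3)]

indices = quantize(frames, codebook)        # what the mobile station would send
recovered = dequantize(indices, codebook)   # what the back end feeds to the HMM stage
print(indices)
print(recovered)
```

Only the indices cross the network in this sketch; the recovered vectors are approximations of the original frames, which is why a smaller codebook would lower the bitrate at some cost in recognition accuracy.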
Main author: | |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | YANG, YIN-PIN |
description | In a mobile wireless communication system, automatic speech recognition is performed in a distributed manner. A front-end (near-end) stage based in the mobile station extracts recognition feature parameters from frames of an utterance and vector quantizes them; a back-end (far-end) stage based in the infrastructure reverses the vector quantization to recover the feature parameters and subjects them to a hidden Markov model (HMM) evaluation to obtain a recognition decision for the utterance. To conserve network capacity, the size (Sz) of the codebook used for the vector quantization, and the corresponding number of bits (B) per codebook index, are adapted on a dialogue-by-dialogue basis in relation to the vocabulary size |V| for the dialogue. The adaptation, which is performed at the front end, accomplishes a tradeoff between expected recognition rate RR and expected bitrate BR by optimizing a metric which is a function of both. In addition to the frame-wise compression of an utterance into a string of code indices (q-string), further "timewise" compression is obtained by run-length coding the string. The data transmitted from the front end to the back end includes the number of bits (B) per codebook index, which also indicates the codebook size (Sz). |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_TW541516BB</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>TW541516BB</sourcerecordid><originalsourceid>FETCH-epo_espacenet_TW541516BB3</originalsourceid><addsrcrecordid>eNqFyj0OgkAQBlAaC6OewbmABVE8AP7EA5BoR5bdD5i47JDdwQRPr4W91WveMnucOWnkZlI4SiNge4qw0gVWlkBT4tCRm4MZ2BrvZ3JQxIHD97cwOkXQC1YlkhWHRuRJid9YZ4vW-ITNz1W2vV6q022HUWqk0VgEaF3di0Ne5Mey3P8fH1TaOts</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Distributed speech recognition using dynamically determined feature vector codebook size</title><source>esp@cenet</source><creator>YANG, YIN-PIN</creator><creatorcontrib>YANG, YIN-PIN</creatorcontrib><description>In a mobile wireless communication system automatic speech recognition is performed in a distributed manner using a mobile station based near or front end stage which extracts and vector quantizes recognition feature parameters from frames of an utterance and an infrastructure based back or far end stage which reverses the vector quantization to recover the feature parameters and subjects the feature parameters to a hidden markov model (HMM) evaluation to obtain a recognition decision for the utterance. In order to conserve network capacity, the size (Sz) of the codebook used for the vector quantization, and the corresponding number of bits (B) per codebook index B, are adapted on a dialogue-by-dialogue basis in relation to the vocabulary size |V| for the dialogue. The adaptation, which is performed at the front end, accomplishes a tradeoff between expected recognition rate RR and expected bitrate RR by optimizing a metric which is a function of both. In addition to the frame-wise compression of an utterance into a string of code indices (q-string), farther ""timewise"" compression is obtained by run-length coding the string. 
The data transmitted from the front end to the back end includes the number of bits (B) per codebook value, which also indicates the codebook size (Sz).</description><edition>7</edition><language>eng</language><subject>ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2003</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20030711&DB=EPODOC&CC=TW&NR=541516B$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,777,882,25546,76297</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20030711&DB=EPODOC&CC=TW&NR=541516B$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>YANG, YIN-PIN</creatorcontrib><title>Distributed speech recognition using dynamically determined feature vector codebook size</title><description>In a mobile wireless communication system automatic speech recognition is performed in a distributed manner using a mobile station based near or front end stage which extracts and vector quantizes recognition feature parameters from frames of an utterance and an infrastructure based back or far end stage which reverses the vector quantization to recover the feature parameters and subjects the feature parameters to a hidden markov model (HMM) evaluation to obtain a recognition decision for the utterance. 
In order to conserve network capacity, the size (Sz) of the codebook used for the vector quantization, and the corresponding number of bits (B) per codebook index B, are adapted on a dialogue-by-dialogue basis in relation to the vocabulary size |V| for the dialogue. The adaptation, which is performed at the front end, accomplishes a tradeoff between expected recognition rate RR and expected bitrate RR by optimizing a metric which is a function of both. In addition to the frame-wise compression of an utterance into a string of code indices (q-string), farther ""timewise"" compression is obtained by run-length coding the string. The data transmitted from the front end to the back end includes the number of bits (B) per codebook value, which also indicates the codebook size (Sz).</description><subject>ACOUSTICS</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2003</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqFyj0OgkAQBlAaC6OewbmABVE8AP7EA5BoR5bdD5i47JDdwQRPr4W91WveMnucOWnkZlI4SiNge4qw0gVWlkBT4tCRm4MZ2BrvZ3JQxIHD97cwOkXQC1YlkhWHRuRJid9YZ4vW-ITNz1W2vV6q022HUWqk0VgEaF3di0Ne5Mey3P8fH1TaOts</recordid><startdate>20030711</startdate><enddate>20030711</enddate><creator>YANG, YIN-PIN</creator><scope>EVB</scope></search><sort><creationdate>20030711</creationdate><title>Distributed speech recognition using dynamically determined feature vector codebook size</title><author>YANG, YIN-PIN</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_TW541516BB3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2003</creationdate><topic>ACOUSTICS</topic><topic>MUSICAL 
INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>YANG, YIN-PIN</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>YANG, YIN-PIN</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Distributed speech recognition using dynamically determined feature vector codebook size</title><date>2003-07-11</date><risdate>2003</risdate><abstract>In a mobile wireless communication system automatic speech recognition is performed in a distributed manner using a mobile station based near or front end stage which extracts and vector quantizes recognition feature parameters from frames of an utterance and an infrastructure based back or far end stage which reverses the vector quantization to recover the feature parameters and subjects the feature parameters to a hidden markov model (HMM) evaluation to obtain a recognition decision for the utterance. In order to conserve network capacity, the size (Sz) of the codebook used for the vector quantization, and the corresponding number of bits (B) per codebook index B, are adapted on a dialogue-by-dialogue basis in relation to the vocabulary size |V| for the dialogue. The adaptation, which is performed at the front end, accomplishes a tradeoff between expected recognition rate RR and expected bitrate RR by optimizing a metric which is a function of both. In addition to the frame-wise compression of an utterance into a string of code indices (q-string), farther ""timewise"" compression is obtained by run-length coding the string. 
The data transmitted from the front end to the back end includes the number of bits (B) per codebook value, which also indicates the codebook size (Sz).</abstract><edition>7</edition><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | eng |
recordid | cdi_epo_espacenet_TW541516BB |
source | esp@cenet |
subjects | ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION |
title | Distributed speech recognition using dynamically determined feature vector codebook size |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T11%3A39%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=YANG,%20YIN-PIN&rft.date=2003-07-11&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ETW541516BB%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
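The description field mentions two compression steps: choosing the number of bits B per codebook index, and run-length coding the q-string. The sketch below illustrates both under stated assumptions: that B is the smallest bit count able to address Sz codewords (so Sz = 2^B when Sz is a power of two), and that run-length coding stores (index, run length) pairs. The record does not specify the actual wire format or the optimization metric, so all names here (`bits_per_index`, `run_length_encode`, `run_length_decode`) and the sample q-string are hypothetical.

```python
from itertools import groupby


def bits_per_index(codebook_size):
    """Assumed relation: smallest B such that 2**B >= Sz (at least 1 bit)."""
    return max(1, (codebook_size - 1).bit_length())


def run_length_encode(q_string):
    """Collapse consecutive repeats of a code index into (index, count) pairs."""
    return [(index, len(list(run))) for index, run in groupby(q_string)]


def run_length_decode(pairs):
    """Expand (index, count) pairs back into the original q-string."""
    return [index for index, count in pairs for _ in range(count)]


# Hypothetical q-string: one code index per frame; adjacent frames often repeat.
q_string = [5, 5, 5, 2, 2, 7, 7, 7, 7]

pairs = run_length_encode(q_string)
print(pairs)                              # (index, run length) pairs
print(run_length_decode(pairs) == q_string)
print(bits_per_index(64))                 # bits needed to address 64 codewords
```

Because the decoder only needs B to know how wide each index is, sending B alongside the coded string is enough for the back end to infer the codebook size in this sketch.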