Robust speech recognition using probabilistic union models
This paper introduces a new statistical approach, namely the probabilistic union model, for speech recognition involving partial, unknown frequency-band corruption. Partial frequency-band corruption accounts for the effect of a family of real-world noises. Previous methods based on the missing featu...
Published in: | IEEE transactions on speech and audio processing 2002-09, Vol.10 (6), p.403-414 |
---|---|
Main authors: | Ji Ming ; Jancovic, P. ; Smith, F.J. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 414 |
---|---|
container_issue | 6 |
container_start_page | 403 |
container_title | IEEE transactions on speech and audio processing |
container_volume | 10 |
creator | Ji Ming ; Jancovic, P. ; Smith, F.J. |
description | This paper introduces a new statistical approach, namely the probabilistic union model, for speech recognition involving partial, unknown frequency-band corruption. Partial frequency-band corruption accounts for the effect of a family of real-world noises. Previous methods based on the missing feature theory usually require the identity of the noisy bands. This identification can be difficult for unexpected noise with unknown, time-varying band characteristics. The new model combines the local frequency-band information based on the union of random events, to reduce the dependence of the model on information about the noise. This model partially accomplishes the target: offering robustness to partial frequency-band corruption, while requiring no information about the noise. This paper introduces the theory and implementation of the union model, and is focused on several important advances. These new developments include a new algorithm for automatic order selection, a generalization of the modeling principle to accommodate partial feature stream corruption, and a combination of the union model with conventional noise reduction techniques to deal with a mixture of stationary noise and unknown, nonstationary noise. For the evaluation, we used the TIDIGITS database for speaker-independent connected digit recognition. The utterances were corrupted by various types of additive noise, stationary or time-varying, assuming no knowledge about the noise characteristics. The results indicate that the new model offers significantly improved robustness in comparison to other models. |
doi_str_mv | 10.1109/TSA.2002.803439 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pascalfrancis_primary_13973499</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1040264</ieee_id><sourcerecordid>2430083021</sourcerecordid><originalsourceid>FETCH-LOGICAL-c379t-58420be02af5c375303e9562852d4a3863bd49f8a9ab0820935b71d759f8c4913</originalsourceid><addsrcrecordid>eNp9kE1PwzAMhisEEmNw5sClQgJO3Zw4SRNu08SXNAkJxjlKs3Rk6trRtAf-PamGBOLAyZb9-LX9Jsk5gQkhoKbL19mEAtCJBGSoDpIR4VxmFDkexhwEZkLk4jg5CWEDAJLkbJTcvjRFH7o07Jyz72nrbLOufeebOu2Dr9fprm0KU_jKh87btK-HzrZZuSqcJkelqYI7-47j5O3-bjl_zBbPD0_z2SKzmKsu45JRKBxQU_JY4QjoFBdUcrpiBqXAYsVUKY0yBUgKCnmRk1XOY80yRXCc3Ox14ykfvQud3vpgXVWZ2jV90IoohZTkGMnrf8m4UwmEQfLyD7hp-raOX2gpGeZEMhGh6R6ybRNC60q9a_3WtJ-agB4s19FyPViu95bHiatvWROsqcrW1NaHnzFUOTI1cBd7zjvnfqkyoILhF9Pih2g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>884371846</pqid></control><display><type>article</type><title>Robust speech recognition using probabilistic union models</title><source>IEEE Electronic Library (IEL)</source><creator>Ji Ming ; Jancovic, P. ; Smith, F.J.</creator><creatorcontrib>Ji Ming ; Jancovic, P. ; Smith, F.J.</creatorcontrib><description>This paper introduces a new statistical approach, namely the probabilistic union model, for speech recognition involving partial, unknown frequency-band corruption. Partial frequency-band corruption accounts for the effect of a family of real-world noises. Previous methods based on the missing feature theory usually require the identity of the noisy bands. This identification can be difficult for unexpected noise with unknown, time-varying band characteristics. The new model combines the local frequency-band information based on the union of random events, to reduce the dependence of the model on information about the noise. 
This model partially accomplishes the target: offering robustness to partial frequency-band corruption, while requiring no information about the noise. This paper introduces the theory and implementation of the union model, and is focused on several important advances. These new developments include a new algorithm for automatic order selection, a generalization of the modeling principle to accommodate partial feature stream corruption, and a combination of the union model with conventional noise reduction techniques to deal with a mixture of stationary noise and unknown, nonstationary noise. For the evaluation, we used the TIDIGITS database for speaker-independent connected digit recognition. The utterances were corrupted by various types of additive noise, stationary or time-varying, assuming no knowledge about the noise characteristics. The results indicate that the new model offers significantly improved robustness in comparison to other models.</description><identifier>ISSN: 1063-6676</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-2353</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TSA.2002.803439</identifier><identifier>CODEN: IESPEJ</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Additive noise ; Applied sciences ; Artificial intelligence ; Band theory ; Computer science; control theory; systems ; Corruption ; Exact sciences and technology ; Filtering ; Frequency ; Information, signal and communications theory ; Lifting equipment ; Noise ; Noise reduction ; Noise robustness ; Probabilistic methods ; Probability theory ; Robustness ; Signal processing ; Spatial databases ; Speech and sound recognition and synthesis. 
Linguistics ; Speech processing ; Speech recognition ; Studies ; Telecommunications and information theory ; Telephony ; Unions ; Working environment noise</subject><ispartof>IEEE transactions on speech and audio processing, 2002-09, Vol.10 (6), p.403-414</ispartof><rights>2003 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2002</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c379t-58420be02af5c375303e9562852d4a3863bd49f8a9ab0820935b71d759f8c4913</citedby><cites>FETCH-LOGICAL-c379t-58420be02af5c375303e9562852d4a3863bd49f8a9ab0820935b71d759f8c4913</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1040264$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1040264$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=13973499$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Ji Ming</creatorcontrib><creatorcontrib>Jancovic, P.</creatorcontrib><creatorcontrib>Smith, F.J.</creatorcontrib><title>Robust speech recognition using probabilistic union models</title><title>IEEE transactions on speech and audio processing</title><addtitle>T-SAP</addtitle><description>This paper introduces a new statistical approach, namely the probabilistic union model, for speech recognition involving partial, unknown frequency-band corruption. Partial frequency-band corruption accounts for the effect of a family of real-world noises. Previous methods based on the missing feature theory usually require the identity of the noisy bands. 
This identification can be difficult for unexpected noise with unknown, time-varying band characteristics. The new model combines the local frequency-band information based on the union of random events, to reduce the dependence of the model on information about the noise. This model partially accomplishes the target: offering robustness to partial frequency-band corruption, while requiring no information about the noise. This paper introduces the theory and implementation of the union model, and is focused on several important advances. These new developments include a new algorithm for automatic order selection, a generalization of the modeling principle to accommodate partial feature stream corruption, and a combination of the union model with conventional noise reduction techniques to deal with a mixture of stationary noise and unknown, nonstationary noise. For the evaluation, we used the TIDIGITS database for speaker-independent connected digit recognition. The utterances were corrupted by various types of additive noise, stationary or time-varying, assuming no knowledge about the noise characteristics. 
The results indicate that the new model offers significantly improved robustness in comparison to other models.</description><subject>Additive noise</subject><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Band theory</subject><subject>Computer science; control theory; systems</subject><subject>Corruption</subject><subject>Exact sciences and technology</subject><subject>Filtering</subject><subject>Frequency</subject><subject>Information, signal and communications theory</subject><subject>Lifting equipment</subject><subject>Noise</subject><subject>Noise reduction</subject><subject>Noise robustness</subject><subject>Probabilistic methods</subject><subject>Probability theory</subject><subject>Robustness</subject><subject>Signal processing</subject><subject>Spatial databases</subject><subject>Speech and sound recognition and synthesis. Linguistics</subject><subject>Speech processing</subject><subject>Speech recognition</subject><subject>Studies</subject><subject>Telecommunications and information theory</subject><subject>Telephony</subject><subject>Unions</subject><subject>Working environment noise</subject><issn>1063-6676</issn><issn>2329-9290</issn><issn>1558-2353</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNp9kE1PwzAMhisEEmNw5sClQgJO3Zw4SRNu08SXNAkJxjlKs3Rk6trRtAf-PamGBOLAyZb9-LX9Jsk5gQkhoKbL19mEAtCJBGSoDpIR4VxmFDkexhwEZkLk4jg5CWEDAJLkbJTcvjRFH7o07Jyz72nrbLOufeebOu2Dr9fprm0KU_jKh87btK-HzrZZuSqcJkelqYI7-47j5O3-bjl_zBbPD0_z2SKzmKsu45JRKBxQU_JY4QjoFBdUcrpiBqXAYsVUKY0yBUgKCnmRk1XOY80yRXCc3Ox14ykfvQud3vpgXVWZ2jV90IoohZTkGMnrf8m4UwmEQfLyD7hp-raOX2gpGeZEMhGh6R6ybRNC60q9a_3WtJ-agB4s19FyPViu95bHiatvWROsqcrW1NaHnzFUOTI1cBd7zjvnfqkyoILhF9Pih2g</recordid><startdate>20020901</startdate><enddate>20020901</enddate><creator>Ji Ming</creator><creator>Jancovic, P.</creator><creator>Smith, 
F.J.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7SP</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20020901</creationdate><title>Robust speech recognition using probabilistic union models</title><author>Ji Ming ; Jancovic, P. ; Smith, F.J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c379t-58420be02af5c375303e9562852d4a3863bd49f8a9ab0820935b71d759f8c4913</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Additive noise</topic><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Band theory</topic><topic>Computer science; control theory; systems</topic><topic>Corruption</topic><topic>Exact sciences and technology</topic><topic>Filtering</topic><topic>Frequency</topic><topic>Information, signal and communications theory</topic><topic>Lifting equipment</topic><topic>Noise</topic><topic>Noise reduction</topic><topic>Noise robustness</topic><topic>Probabilistic methods</topic><topic>Probability theory</topic><topic>Robustness</topic><topic>Signal processing</topic><topic>Spatial databases</topic><topic>Speech and sound recognition and synthesis. 
Linguistics</topic><topic>Speech processing</topic><topic>Speech recognition</topic><topic>Studies</topic><topic>Telecommunications and information theory</topic><topic>Telephony</topic><topic>Unions</topic><topic>Working environment noise</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ji Ming</creatorcontrib><creatorcontrib>Jancovic, P.</creatorcontrib><creatorcontrib>Smith, F.J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Electronics & Communications Abstracts</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on speech and audio processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ji Ming</au><au>Jancovic, P.</au><au>Smith, F.J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Robust speech recognition using probabilistic union models</atitle><jtitle>IEEE transactions on speech and audio processing</jtitle><stitle>T-SAP</stitle><date>2002-09-01</date><risdate>2002</risdate><volume>10</volume><issue>6</issue><spage>403</spage><epage>414</epage><pages>403-414</pages><issn>1063-6676</issn><issn>2329-9290</issn><eissn>1558-2353</eissn><eissn>2329-9304</eissn><coden>IESPEJ</coden><abstract>This 
paper introduces a new statistical approach, namely the probabilistic union model, for speech recognition involving partial, unknown frequency-band corruption. Partial frequency-band corruption accounts for the effect of a family of real-world noises. Previous methods based on the missing feature theory usually require the identity of the noisy bands. This identification can be difficult for unexpected noise with unknown, time-varying band characteristics. The new model combines the local frequency-band information based on the union of random events, to reduce the dependence of the model on information about the noise. This model partially accomplishes the target: offering robustness to partial frequency-band corruption, while requiring no information about the noise. This paper introduces the theory and implementation of the union model, and is focused on several important advances. These new developments include a new algorithm for automatic order selection, a generalization of the modeling principle to accommodate partial feature stream corruption, and a combination of the union model with conventional noise reduction techniques to deal with a mixture of stationary noise and unknown, nonstationary noise. For the evaluation, we used the TIDIGITS database for speaker-independent connected digit recognition. The utterances were corrupted by various types of additive noise, stationary or time-varying, assuming no knowledge about the noise characteristics. The results indicate that the new model offers significantly improved robustness in comparison to other models.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TSA.2002.803439</doi><tpages>12</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1063-6676 |
ispartof | IEEE transactions on speech and audio processing, 2002-09, Vol.10 (6), p.403-414 |
issn | 1063-6676 2329-9290 1558-2353 2329-9304 |
language | eng |
recordid | cdi_pascalfrancis_primary_13973499 |
source | IEEE Electronic Library (IEL) |
subjects | Additive noise ; Applied sciences ; Artificial intelligence ; Band theory ; Computer science; control theory; systems ; Corruption ; Exact sciences and technology ; Filtering ; Frequency ; Information, signal and communications theory ; Lifting equipment ; Noise ; Noise reduction ; Noise robustness ; Probabilistic methods ; Probability theory ; Robustness ; Signal processing ; Spatial databases ; Speech and sound recognition and synthesis. Linguistics ; Speech processing ; Speech recognition ; Studies ; Telecommunications and information theory ; Telephony ; Unions ; Working environment noise |
title | Robust speech recognition using probabilistic union models |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T19%3A02%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20speech%20recognition%20using%20probabilistic%20union%20models&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Ji%20Ming&rft.date=2002-09-01&rft.volume=10&rft.issue=6&rft.spage=403&rft.epage=414&rft.pages=403-414&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2002.803439&rft_dat=%3Cproquest_RIE%3E2430083021%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=884371846&rft_id=info:pmid/&rft_ieee_id=1040264&rfr_iscdi=true |
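
The description field says the model combines local frequency-band information "based on the union of random events", with an order that can be selected automatically. The record does not give the formula, so the following is only a minimal sketch under an assumption: the score of order M over N bands is taken to be the sum, over every choice of N - M bands, of the product of those bands' likelihoods. The function name `union_score` and the example likelihood values are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import prod


def union_score(band_likelihoods, order):
    """Illustrative union-style score (an assumed form, not the paper's code).

    band_likelihoods: per-band likelihoods p_1..p_N for one frame.
    order: assumed number M of bands that may be corrupted (0 <= M < N).
    Returns the sum of products over all subsets of N - M bands.
    """
    n = len(band_likelihoods)
    if not 0 <= order < n:
        raise ValueError("order must satisfy 0 <= order < number of bands")
    keep = n - order  # number of bands trusted in each term
    return sum(prod(subset) for subset in combinations(band_likelihoods, keep))


# Hypothetical frame: three clean bands and one band ruined by noise.
clean = [0.9, 0.8, 0.7, 0.6]
noisy = [0.9, 0.8, 0.7, 1e-6]

product_clean = union_score(clean, 0)  # order 0 is the plain product
product_noisy = union_score(noisy, 0)  # collapses because of the bad band
union_noisy = union_score(noisy, 1)    # still has one term without the bad band
```

With order 0 the score reduces to the ordinary product of all bands, so a single corrupted band drives it toward zero. With order 1 at least one term omits the corrupted band, and no knowledge of which band is noisy is needed, which is the property the abstract claims for the model.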