Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments
This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is mot...
Published in: | IEEE transactions on speech and audio processing, 2003-09, Vol.11 (5), p.435-446 |
---|---|
Main authors: | Hong Kook Kim ; Rose, R.C. |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
container_end_page | 446 |
---|---|
container_issue | 5 |
container_start_page | 435 |
container_title | IEEE transactions on speech and audio processing |
container_volume | 11 |
creator | Hong Kook Kim ; Rose, R.C. |
description | This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is motivated by techniques for decomposing speech and noise that were originally developed for noisy speech enhancement. This approach is applied in combination with other feature compensation algorithms to compensating ASR features obtained from a mel-filterbank cepstrum coefficient front-end. Performance comparisons are made with respect to the application of the minimum mean squared error log spectral amplitude (MMSE-LSA) estimator based speech enhancement algorithm prior to feature analysis. An experimental study is presented where the feature compensation approaches described in the paper are found to greatly reduce ASR word error rate compared to uncompensated features under environmental and channel mismatched conditions. |
doi_str_mv | 10.1109/TSA.2003.815515 |
format | Article |
publisher | New York, NY: IEEE |
coden | IESPEJ |
ieee_id | 1223593 |
rights | 2003 INIST-CNRS ; Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2003 |
tpages | 12 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1063-6676 |
ispartof | IEEE transactions on speech and audio processing, 2003-09, Vol.11 (5), p.435-446 |
issn | 1063-6676 ; 2329-9290 ; 1558-2353 ; 2329-9304 |
language | eng |
recordid | cdi_proquest_miscellaneous_919932620 |
source | IEEE Electronic Library (IEL) |
subjects | Acoustic noise ; Acoustics ; Algorithms ; Amplitude estimation ; Applied sciences ; Automatic speech recognition ; Cepstral analysis ; Cepstrum ; Compensation ; Decomposition ; Error analysis ; Exact sciences and technology ; Information, signal and communications theory ; Noise ; Performance analysis ; Signal processing ; Speech ; Speech analysis ; Speech enhancement ; Speech processing ; Speech recognition ; Telecommunications and information theory ; Working environment noise |
title | Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T19%3A02%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cepstrum-domain%20acoustic%20feature%20compensation%20based%20on%20decomposition%20of%20speech%20and%20noise%20for%20ASR%20in%20noisy%20environments&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Hong%20Kook%20Kim&rft.date=2003-09-01&rft.volume=11&rft.issue=5&rft.spage=435&rft.epage=446&rft.pages=435-446&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2003.815515&rft_dat=%3Cproquest_RIE%3E28203513%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=884132690&rft_id=info:pmid/&rft_ieee_id=1223593&rfr_iscdi=true |