Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments

This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is mot...

Bibliographic details
Published in: IEEE transactions on speech and audio processing 2003-09, Vol.11 (5), p.435-446
Main authors: Hong Kook Kim; Rose, R.C.
Format: Article
Language: eng
Subjects:
container_end_page 446
container_issue 5
container_start_page 435
container_title IEEE transactions on speech and audio processing
container_volume 11
creator Hong Kook Kim
Rose, R.C.
description This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is motivated by techniques for decomposing speech and noise that were originally developed for noisy speech enhancement. This approach is applied in combination with other feature compensation algorithms to compensating ASR features obtained from a mel-filterbank cepstrum coefficient front-end. Performance comparisons are made with respect to the application of the minimum mean squared error log spectral amplitude (MMSE-LSA) estimator based speech enhancement algorithm prior to feature analysis. An experimental study is presented where the feature compensation approaches described in the paper are found to greatly reduce ASR word error rate compared to uncompensated features under environmental and channel mismatched conditions.
doi_str_mv 10.1109/TSA.2003.815515
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_919932620</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1223593</ieee_id><sourcerecordid>28203513</sourcerecordid><originalsourceid>FETCH-LOGICAL-c474t-1d9b57ecf8ea59d15e73b7b87dd2a9925961439681047da9f58f3ea7489469663</originalsourceid><addsrcrecordid>eNp9kUtrHDEQhAeTgB0n5xxyEYbYp1lLo9HruCx-gSHgx1lopRaR2ZHG0ozB_z4ar8GQg0_dVH9dUFTT_CR4RQhW5w_361WHMV1JwhhhB81RnbLtKKNf6o45bTkX_LD5VsoTxlgS0R81LxsYy5TnoXVpMCEiY9NcpmCRBzPNGZBNwwixmCmkiLamgEN1cbDoqYQ3OXlURgD7F5noUEyhAPIpo_X9Haqei_CKIL6EnOIAcSrfm6_e7Ar8eJ_HzePlxcPmur39c3WzWd-2thf91BKntkyA9RIMU44wEHQrtlI41xmlOqY46anikuBeOKM8k56CEb1UPVec0-PmbO875vQ8Q5n0EIqF3c5EqDm1IkrRjne4kqefkp2sECO0gif_gU9pzrGm0FL2pLqpxe18D9mcSsng9ZjDYPKrJlgvdelal17q0vu66sfvd1tTrNn5bKIN5eONYUHoW6Jfey4AwMe5qz0rSv8B7a-eAw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>884132690</pqid></control><display><type>article</type><title>Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments</title><source>IEEE Electronic Library (IEL)</source><creator>Hong Kook Kim ; Rose, R.C.</creator><creatorcontrib>Hong Kook Kim ; Rose, R.C.</creatorcontrib><description>This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is motivated by techniques for decomposing speech and noise that were originally developed for noisy speech enhancement. This approach is applied in combination with other feature compensation algorithms to compensating ASR features obtained from a mel-filterbank cepstrum coefficient front-end. 
Performance comparisons are made with respect to the application of the minimum mean squared error log spectral amplitude (MMSE-LSA) estimator based speech enhancement algorithm prior to feature analysis. An experimental study is presented where the feature compensation approaches described in the paper are found to greatly reduce ASR word error rate compared to uncompensated features under environmental and channel mismatched conditions.</description><identifier>ISSN: 1063-6676</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-2353</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TSA.2003.815515</identifier><identifier>CODEN: IESPEJ</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Acoustic noise ; Acoustics ; Algorithms ; Amplitude estimation ; Applied sciences ; Automatic speech recognition ; Cepstral analysis ; Cepstrum ; Compensation ; Decomposition ; Error analysis ; Exact sciences and technology ; Information, signal and communications theory ; Noise ; Performance analysis ; Signal processing ; Speech ; Speech analysis ; Speech enhancement ; Speech processing ; Speech recognition ; Telecommunications and information theory ; Working environment noise</subject><ispartof>IEEE transactions on speech and audio processing, 2003-09, Vol.11 (5), p.435-446</ispartof><rights>2003 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2003</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c474t-1d9b57ecf8ea59d15e73b7b87dd2a9925961439681047da9f58f3ea7489469663</citedby><cites>FETCH-LOGICAL-c474t-1d9b57ecf8ea59d15e73b7b87dd2a9925961439681047da9f58f3ea7489469663</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1223593$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1223593$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=15071366$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Hong Kook Kim</creatorcontrib><creatorcontrib>Rose, R.C.</creatorcontrib><title>Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments</title><title>IEEE transactions on speech and audio processing</title><addtitle>T-SAP</addtitle><description>This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is motivated by techniques for decomposing speech and noise that were originally developed for noisy speech enhancement. This approach is applied in combination with other feature compensation algorithms to compensating ASR features obtained from a mel-filterbank cepstrum coefficient front-end. 
Performance comparisons are made with respect to the application of the minimum mean squared error log spectral amplitude (MMSE-LSA) estimator based speech enhancement algorithm prior to feature analysis. An experimental study is presented where the feature compensation approaches described in the paper are found to greatly reduce ASR word error rate compared to uncompensated features under environmental and channel mismatched conditions.</description><subject>Acoustic noise</subject><subject>Acoustics</subject><subject>Algorithms</subject><subject>Amplitude estimation</subject><subject>Applied sciences</subject><subject>Automatic speech recognition</subject><subject>Cepstral analysis</subject><subject>Cepstrum</subject><subject>Compensation</subject><subject>Decomposition</subject><subject>Error analysis</subject><subject>Exact sciences and technology</subject><subject>Information, signal and communications theory</subject><subject>Noise</subject><subject>Performance analysis</subject><subject>Signal processing</subject><subject>Speech</subject><subject>Speech analysis</subject><subject>Speech enhancement</subject><subject>Speech processing</subject><subject>Speech recognition</subject><subject>Telecommunications and information theory</subject><subject>Working environment 
noise</subject><issn>1063-6676</issn><issn>2329-9290</issn><issn>1558-2353</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNp9kUtrHDEQhAeTgB0n5xxyEYbYp1lLo9HruCx-gSHgx1lopRaR2ZHG0ozB_z4ar8GQg0_dVH9dUFTT_CR4RQhW5w_361WHMV1JwhhhB81RnbLtKKNf6o45bTkX_LD5VsoTxlgS0R81LxsYy5TnoXVpMCEiY9NcpmCRBzPNGZBNwwixmCmkiLamgEN1cbDoqYQ3OXlURgD7F5noUEyhAPIpo_X9Haqei_CKIL6EnOIAcSrfm6_e7Ar8eJ_HzePlxcPmur39c3WzWd-2thf91BKntkyA9RIMU44wEHQrtlI41xmlOqY46anikuBeOKM8k56CEb1UPVec0-PmbO875vQ8Q5n0EIqF3c5EqDm1IkrRjne4kqefkp2sECO0gif_gU9pzrGm0FL2pLqpxe18D9mcSsng9ZjDYPKrJlgvdelal17q0vu66sfvd1tTrNn5bKIN5eONYUHoW6Jfey4AwMe5qz0rSv8B7a-eAw</recordid><startdate>20030901</startdate><enddate>20030901</enddate><creator>Hong Kook Kim</creator><creator>Rose, R.C.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7SP</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20030901</creationdate><title>Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments</title><author>Hong Kook Kim ; Rose, R.C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c474t-1d9b57ecf8ea59d15e73b7b87dd2a9925961439681047da9f58f3ea7489469663</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Acoustic noise</topic><topic>Acoustics</topic><topic>Algorithms</topic><topic>Amplitude estimation</topic><topic>Applied sciences</topic><topic>Automatic speech recognition</topic><topic>Cepstral analysis</topic><topic>Cepstrum</topic><topic>Compensation</topic><topic>Decomposition</topic><topic>Error analysis</topic><topic>Exact sciences and technology</topic><topic>Information, signal and communications theory</topic><topic>Noise</topic><topic>Performance analysis</topic><topic>Signal processing</topic><topic>Speech</topic><topic>Speech analysis</topic><topic>Speech enhancement</topic><topic>Speech processing</topic><topic>Speech recognition</topic><topic>Telecommunications and information theory</topic><topic>Working environment noise</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hong Kook Kim</creatorcontrib><creatorcontrib>Rose, R.C.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology 
Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on speech and audio processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hong Kook Kim</au><au>Rose, R.C.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments</atitle><jtitle>IEEE transactions on speech and audio processing</jtitle><stitle>T-SAP</stitle><date>2003-09-01</date><risdate>2003</risdate><volume>11</volume><issue>5</issue><spage>435</spage><epage>446</epage><pages>435-446</pages><issn>1063-6676</issn><issn>2329-9290</issn><eissn>1558-2353</eissn><eissn>2329-9304</eissn><coden>IESPEJ</coden><abstract>This paper presents a set of acoustic feature pre-processing techniques that are applied to improving automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR which is motivated by techniques for decomposing speech and noise that were originally developed for noisy speech enhancement. This approach is applied in combination with other feature compensation algorithms to compensating ASR features obtained from a mel-filterbank cepstrum coefficient front-end. 
Performance comparisons are made with respect to the application of the minimum mean squared error log spectral amplitude (MMSE-LSA) estimator based speech enhancement algorithm prior to feature analysis. An experimental study is presented where the feature compensation approaches described in the paper are found to greatly reduce ASR word error rate compared to uncompensated features under environmental and channel mismatched conditions.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TSA.2003.815515</doi><tpages>12</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1063-6676
ispartof IEEE transactions on speech and audio processing, 2003-09, Vol.11 (5), p.435-446
issn 1063-6676
2329-9290
1558-2353
2329-9304
language eng
recordid cdi_proquest_miscellaneous_919932620
source IEEE Electronic Library (IEL)
subjects Acoustic noise
Acoustics
Algorithms
Amplitude estimation
Applied sciences
Automatic speech recognition
Cepstral analysis
Cepstrum
Compensation
Decomposition
Error analysis
Exact sciences and technology
Information, signal and communications theory
Noise
Performance analysis
Signal processing
Speech
Speech analysis
Speech enhancement
Speech processing
Speech recognition
Telecommunications and information theory
Working environment noise
title Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T19%3A02%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cepstrum-domain%20acoustic%20feature%20compensation%20based%20on%20decomposition%20of%20speech%20and%20noise%20for%20ASR%20in%20noisy%20environments&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Hong%20Kook%20Kim&rft.date=2003-09-01&rft.volume=11&rft.issue=5&rft.spage=435&rft.epage=446&rft.pages=435-446&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2003.815515&rft_dat=%3Cproquest_RIE%3E28203513%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=884132690&rft_id=info:pmid/&rft_ieee_id=1223593&rfr_iscdi=true