Robust AM-FM features for speech recognition
In this letter, a nonlinear AM-FM speech model is used to extract robust features for speech recognition. The proposed features measure the amount of amplitude and frequency modulation that exists in speech resonances and attempt to model aspects of the speech acoustic information that the commonly...
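
For orientation, an AM-FM model of a single speech resonance is commonly written as a sinusoid whose amplitude and frequency both vary with time. The form below is a generic textbook sketch, not taken from this record: the symbols $a(t)$ (amplitude envelope), $f_c$ (resonance center frequency), $q(t)$ (frequency deviation), and $\theta$ (phase offset) are assumptions introduced here for illustration.

```latex
r(t) = a(t)\,\cos\!\Big( 2\pi \Big[ f_c\, t + \int_0^t q(\tau)\, d\tau \Big] + \theta \Big),
\qquad
f(t) = f_c + q(t)
```

Under this reading, "amount of amplitude modulation" would refer to how much $a(t)$ varies and "amount of frequency modulation" to how much the instantaneous frequency $f(t)$ deviates from $f_c$; the record itself does not spell out how the features are computed.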
Published in: | IEEE signal processing letters 2005-09, Vol.12 (9), p.621-624 |
---|---|
Main authors: | Dimitriadis, D. ; Maragos, P. ; Potamianos, A. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 624 |
---|---|
container_issue | 9 |
container_start_page | 621 |
container_title | IEEE signal processing letters |
container_volume | 12 |
creator | Dimitriadis, D. ; Maragos, P. ; Potamianos, A. |
description | In this letter, a nonlinear AM-FM speech model is used to extract robust features for speech recognition. The proposed features measure the amount of amplitude and frequency modulation that exists in speech resonances and attempt to model aspects of the speech acoustic information that the commonly used linear source-filter model fails to capture. The robustness and discriminability of the AM-FM features are investigated in combination with mel cepstrum coefficients (MFCCs). It is shown that these hybrid features perform well in the presence of noise, both in terms of phoneme discrimination (J-measure) and in terms of speech recognition performance in several different tasks. An average relative error rate reduction of up to 11% for clean and 46% for mismatched noisy conditions is achieved when AM-FM features are combined with MFCCs. |
doi_str_mv | 10.1109/LSP.2005.853050 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_883386779</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1495427</ieee_id><sourcerecordid>28073080</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-79111291fdb4032e11198c06d26d92f4c6ea360b204e0681e0acfa909a8965db3</originalsourceid><addsrcrecordid>eNpdkM9LwzAUx4MoOKdnD16KB092e0maNDmO4VTYUPxxDmn6qh1bU5P24H9vRwXB0_s--Hwfjw8hlxRmlIKer1-fZwxAzJTgIOCITKgQKmVc0uMhQw6p1qBOyVmMWwBQVIkJuX3xRR-7ZLFJV5ukQtv1AWNS-ZDEFtF9JgGd_2jqrvbNOTmp7C7ixe-ckvfV3dvyIV0_3T8uF-vU8Ux1aa4ppUzTqiwy4AyHTSsHsmSy1KzKnETLJRQMMgSpKIJ1ldWgrdJSlAWfkpvxbhv8V4-xM_s6OtztbIO-j4YpyDkoGMDrf-DW96EZfjNKca5knusBmo-QCz7GgJVpQ7234dtQMAd1ZlBnDurMqG5oXI2NGhH_6EyLjOX8B9JhZy0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>883386779</pqid></control><display><type>article</type><title>Robust AM-FM features for speech recognition</title><source>IEEE Xplore (Online service)</source><creator>Dimitriadis, D. ; Maragos, P. ; Potamianos, A.</creator><creatorcontrib>Dimitriadis, D. ; Maragos, P. ; Potamianos, A.</creatorcontrib><description>In this letter, a nonlinear AM-FM speech model is used to extract robust features for speech recognition. The proposed features measure the amount of amplitude and frequency modulation that exists in speech resonances and attempt to model aspects of the speech acoustic information that the commonly used linear source-filter model fails to capture. The robustness and discriminability of the AM-FM features is investigated in combination with mel cepstrum coefficients (MFCCs). It is shown that these hybrid features perform well in the presence of noise, both in terms of phoneme-discrimination (J-measure) and in terms of speech recognition performance in several different tasks. 
Average relative error rate reduction up to 11% for clean and 46% for mismatched noisy conditions is achieved when AM-FM features are combined with MFCCs.</description><identifier>ISSN: 1070-9908</identifier><identifier>EISSN: 1558-2361</identifier><identifier>DOI: 10.1109/LSP.2005.853050</identifier><identifier>CODEN: ISPLEM</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Acoustic measurements ; Acoustic noise ; AM-FM ; ASR ; Cepstrum ; Data mining ; Feature extraction ; features ; Frequency measurement ; Frequency modulation ; Noise robustness ; nonlinear ; Resonance ; speech ; Speech recognition</subject><ispartof>IEEE signal processing letters, 2005-09, Vol.12 (9), p.621-624</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2005</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c348t-79111291fdb4032e11198c06d26d92f4c6ea360b204e0681e0acfa909a8965db3</citedby><cites>FETCH-LOGICAL-c348t-79111291fdb4032e11198c06d26d92f4c6ea360b204e0681e0acfa909a8965db3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1495427$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1495427$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Dimitriadis, D.</creatorcontrib><creatorcontrib>Maragos, P.</creatorcontrib><creatorcontrib>Potamianos, A.</creatorcontrib><title>Robust AM-FM features for speech recognition</title><title>IEEE signal processing letters</title><addtitle>LSP</addtitle><description>In this letter, a nonlinear AM-FM speech model is used to extract robust features for speech recognition. 
The proposed features measure the amount of amplitude and frequency modulation that exists in speech resonances and attempt to model aspects of the speech acoustic information that the commonly used linear source-filter model fails to capture. The robustness and discriminability of the AM-FM features is investigated in combination with mel cepstrum coefficients (MFCCs). It is shown that these hybrid features perform well in the presence of noise, both in terms of phoneme-discrimination (J-measure) and in terms of speech recognition performance in several different tasks. Average relative error rate reduction up to 11% for clean and 46% for mismatched noisy conditions is achieved when AM-FM features are combined with MFCCs.</description><subject>Acoustic measurements</subject><subject>Acoustic noise</subject><subject>AM-FM</subject><subject>ASR</subject><subject>Cepstrum</subject><subject>Data mining</subject><subject>Feature extraction</subject><subject>features</subject><subject>Frequency measurement</subject><subject>Frequency modulation</subject><subject>Noise robustness</subject><subject>nonlinear</subject><subject>Resonance</subject><subject>speech</subject><subject>Speech recognition</subject><issn>1070-9908</issn><issn>1558-2361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2005</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkM9LwzAUx4MoOKdnD16KB092e0maNDmO4VTYUPxxDmn6qh1bU5P24H9vRwXB0_s--Hwfjw8hlxRmlIKer1-fZwxAzJTgIOCITKgQKmVc0uMhQw6p1qBOyVmMWwBQVIkJuX3xRR-7ZLFJV5ukQtv1AWNS-ZDEFtF9JgGd_2jqrvbNOTmp7C7ixe-ckvfV3dvyIV0_3T8uF-vU8Ux1aa4ppUzTqiwy4AyHTSsHsmSy1KzKnETLJRQMMgSpKIJ1ldWgrdJSlAWfkpvxbhv8V4-xM_s6OtztbIO-j4YpyDkoGMDrf-DW96EZfjNKca5knusBmo-QCz7GgJVpQ7234dtQMAd1ZlBnDurMqG5oXI2NGhH_6EyLjOX8B9JhZy0</recordid><startdate>20050901</startdate><enddate>20050901</enddate><creator>Dimitriadis, D.</creator><creator>Maragos, P.</creator><creator>Potamianos, A.</creator><general>IEEE</general><general>The 
Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20050901</creationdate><title>Robust AM-FM features for speech recognition</title><author>Dimitriadis, D. ; Maragos, P. ; Potamianos, A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-79111291fdb4032e11198c06d26d92f4c6ea360b204e0681e0acfa909a8965db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Acoustic measurements</topic><topic>Acoustic noise</topic><topic>AM-FM</topic><topic>ASR</topic><topic>Cepstrum</topic><topic>Data mining</topic><topic>Feature extraction</topic><topic>features</topic><topic>Frequency measurement</topic><topic>Frequency modulation</topic><topic>Noise robustness</topic><topic>nonlinear</topic><topic>Resonance</topic><topic>speech</topic><topic>Speech recognition</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Dimitriadis, D.</creatorcontrib><creatorcontrib>Maragos, P.</creatorcontrib><creatorcontrib>Potamianos, A.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE Xplore (Online service)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE signal processing letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dimitriadis, D.</au><au>Maragos, P.</au><au>Potamianos, A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Robust AM-FM features for speech recognition</atitle><jtitle>IEEE signal processing letters</jtitle><stitle>LSP</stitle><date>2005-09-01</date><risdate>2005</risdate><volume>12</volume><issue>9</issue><spage>621</spage><epage>624</epage><pages>621-624</pages><issn>1070-9908</issn><eissn>1558-2361</eissn><coden>ISPLEM</coden><abstract>In this letter, a nonlinear AM-FM speech model is used to extract robust features for speech recognition. The proposed features measure the amount of amplitude and frequency modulation that exists in speech resonances and attempt to model aspects of the speech acoustic information that the commonly used linear source-filter model fails to capture. The robustness and discriminability of the AM-FM features is investigated in combination with mel cepstrum coefficients (MFCCs). It is shown that these hybrid features perform well in the presence of noise, both in terms of phoneme-discrimination (J-measure) and in terms of speech recognition performance in several different tasks. Average relative error rate reduction up to 11% for clean and 46% for mismatched noisy conditions is achieved when AM-FM features are combined with MFCCs.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/LSP.2005.853050</doi><tpages>4</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1070-9908 |
ispartof | IEEE signal processing letters, 2005-09, Vol.12 (9), p.621-624 |
issn | 1070-9908 ; 1558-2361 |
language | eng |
recordid | cdi_proquest_journals_883386779 |
source | IEEE Xplore (Online service) |
subjects | Acoustic measurements ; Acoustic noise ; AM-FM ; ASR ; Cepstrum ; Data mining ; Feature extraction ; features ; Frequency measurement ; Frequency modulation ; Noise robustness ; nonlinear ; Resonance ; speech ; Speech recognition |
title | Robust AM-FM features for speech recognition |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T10%3A24%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20AM-FM%20features%20for%20speech%20recognition&rft.jtitle=IEEE%20signal%20processing%20letters&rft.au=Dimitriadis,%20D.&rft.date=2005-09-01&rft.volume=12&rft.issue=9&rft.spage=621&rft.epage=624&rft.pages=621-624&rft.issn=1070-9908&rft.eissn=1558-2361&rft.coden=ISPLEM&rft_id=info:doi/10.1109/LSP.2005.853050&rft_dat=%3Cproquest_RIE%3E28073080%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=883386779&rft_id=info:pmid/&rft_ieee_id=1495427&rfr_iscdi=true |
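
The `url` field above is an OpenURL (Z39.88-2004) link whose `rft.*` query keys repeat the citation metadata in percent-encoded form. As a minimal sketch, assuming only Python's standard library, the referent fields can be recovered as shown below. The helper name `openurl_fields` is hypothetical, and the URL in the example is a shortened copy of the record's link that keeps only a few of its keys.

```python
from urllib.parse import urlsplit, parse_qs


def openurl_fields(url: str) -> dict[str, list[str]]:
    """Return the decoded `rft.*` (referent) keys of an OpenURL query string.

    Values are lists because a query key may legally appear more than once.
    """
    query = parse_qs(urlsplit(url).query)
    # Strip the "rft." prefix so keys read as plain citation field names.
    return {key[len("rft."):]: values
            for key, values in query.items()
            if key.startswith("rft.")}


# Shortened copy of the record's url field (most keys omitted for brevity).
url = (
    "https://sfx.bib-bvb.de/sfx_tum?url_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Robust%20AM-FM%20features%20for%20speech%20recognition"
    "&rft.jtitle=IEEE%20signal%20processing%20letters"
    "&rft.volume=12&rft.issue=9&rft.spage=621&rft.epage=624"
    "&rft.issn=1070-9908&rft.eissn=1558-2361"
)

fields = openurl_fields(url)
for name, values in sorted(fields.items()):
    print(f"{name}: {values[0]}")
```

Keys that use an underscore rather than a dot after `rft` (for example `rft_id`, which carries the DOI in the full link) are deliberately left out by this filter; widening the prefix test would include them.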