Speech recognition using cepstral articulatory features
Though speech recognition has been widely investigated in the past decades, the role of articulation in recognition has received scant attention. Recognition accuracy increases when recognizers are trained with acoustic features in conjunction with articulatory ones. Traditionally, acoustic features are represented by mel-frequency cepstral coefficients (MFCCs) while articulatory features are represented by the locations or trajectories of the articulators. We propose the articulatory cepstral coefficients (ACCs) as features which are the cepstral coefficients of the time-location articulatory signal. We show that ACCs yield state-of-the-art results in phoneme classification and recognition on benchmark datasets over a wide range of experiments. The similarity of MFCCs and ACCs and their superior performance in isolation and conjunction indicate that common algorithms can be effectively used for acoustic and articulatory signals.
Published in: | Speech communication, 2019-02, Vol. 107, p. 26-37 |
---|---|
Main authors: | Najnin, Shamima; Banerjee, Bonny |
Format: | Article |
Language: | English |
Publisher: | Amsterdam: Elsevier B.V. |
Source: | Elsevier ScienceDirect Journals |
ISSN: | 0167-6393 |
EISSN: | 1872-7182 |
DOI: | 10.1016/j.specom.2019.01.002 |
Subjects: | Acoustic feature; Acoustics; Algorithms; Articulation; Cepstral articulatory feature; Coefficients; Deep neural network; Feature recognition; General regression neural network; Inversion mapping; Phoneme recognition; Phonemes; Recognition; Speech recognition; Voice recognition |
Online access: | Full text |
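The abstract's central idea is that ACCs apply to a time-location articulatory signal the same cepstral analysis that MFCCs apply to the acoustic waveform. As a minimal sketch (not the authors' published pipeline), the Python code below runs the generic framing, windowing, power-spectrum, log, and DCT chain over a single articulator trajectory; the 200 Hz sampling rate, the hypothetical `tongue_tip_x` channel, the frame and hop lengths, and the omission of a mel-style filterbank are all assumptions made for illustration.

```python
import numpy as np
from scipy.fft import dct

def cepstral_coefficients(signal, sr, frame_ms=200.0, hop_ms=100.0, n_ceps=13):
    """Generic cepstral analysis of a 1-D signal: frame, window, power
    spectrum, log, DCT. Applied to an acoustic waveform (with a mel
    filterbank, omitted here) this yields MFCC-like features; applied to
    an articulator trajectory it yields ACC-like features."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    # Number of cepstra cannot exceed the number of spectrum bins.
    assert n_ceps <= frame_len // 2 + 1
    window = np.hamming(frame_len)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    ceps = np.empty((n_frames, n_ceps))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        log_spec = np.log(power + 1e-10)  # small floor avoids log(0)
        ceps[i] = dct(log_spec, type=2, norm='ortho')[:n_ceps]
    return ceps

# Hypothetical example: 2 s of a 200 Hz EMA trajectory (tongue-tip
# x-coordinate); random data stands in for real articulatory recordings.
ema_sr = 200
tongue_tip_x = np.random.randn(2 * ema_sr)
acc = cepstral_coefficients(tongue_tip_x, ema_sr)
print(acc.shape)  # (19, 13): frames x cepstral coefficients
```

Stacking such per-channel coefficient matrices across articulators, possibly alongside MFCCs from the parallel audio, would give the kind of combined acoustic-plus-articulatory feature set the abstract reports as improving recognition.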