Speech recognition using cepstral articulatory features

Though speech recognition has been widely investigated in the past decades, the role of articulation in recognition has received scant attention. Recognition accuracy increases when recognizers are trained with acoustic features in conjunction with articulatory ones. Traditionally, acoustic features are represented by mel-frequency cepstral coefficients (MFCCs) while articulatory features are represented by the locations or trajectories of the articulators. We propose the articulatory cepstral coefficients (ACCs) as features which are the cepstral coefficients of the time-location articulatory signal. We show that ACCs yield state-of-the-art results in phoneme classification and recognition on benchmark datasets over a wide range of experiments. The similarity of MFCCs and ACCs and their superior performance in isolation and conjunction indicate that common algorithms can be effectively used for acoustic and articulatory signals.
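
The abstract defines ACCs only as the cepstral coefficients of the time-location articulatory signal. As a minimal sketch of that idea (not the authors' implementation: the function name, frame sizes, and the synthetic trajectory are illustrative assumptions), the following Python snippet computes generic cepstral coefficients, i.e. the DCT of each frame's log-magnitude spectrum, for a one-dimensional signal:

import numpy as np
from scipy.fftpack import dct

def cepstral_coefficients(signal, frame_len, hop, n_coeffs=13):
    # Hypothetical helper: frame a 1-D signal and take, per frame,
    # the DCT of the log-magnitude spectrum (the cepstrum as computed
    # in MFCC pipelines, here without a mel filterbank). Applied to a
    # waveform this resembles MFCC extraction; applied to an
    # articulator's time-location trajectory it gives ACC-style features.
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    coeffs = np.empty((n_frames, n_coeffs))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        log_spec = np.log(np.abs(np.fft.rfft(frame)) + 1e-10)  # avoid log(0)
        coeffs[i] = dct(log_spec, norm='ortho')[:n_coeffs]
    return coeffs

# Toy articulator trajectory (e.g., a tongue-tip height signal over time),
# purely synthetic for illustration.
t = np.linspace(0, 1, 400)
trajectory = np.sin(2 * np.pi * 5 * t) + 0.05 * np.random.randn(t.size)
acc = cepstral_coefficients(trajectory, frame_len=80, hop=40)
print(acc.shape)  # (9, 13)

The same routine can in principle be pointed at either an acoustic waveform or an articulator trajectory, which reflects the paper's observation that common algorithms serve both signal types.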

Bibliographic details
Published in: Speech communication, 2019-02, Vol. 107, p. 26-37
Authors: Najnin, Shamima; Banerjee, Bonny
Format: Article
Language: English
Subjects: see the list below
Online access: Full text
DOI: 10.1016/j.specom.2019.01.002
Publisher: Elsevier B.V. (Amsterdam)
ISSN: 0167-6393
EISSN: 1872-7182
Source: Elsevier ScienceDirect Journals
Subjects:
Acoustic feature
Acoustics
Algorithms
Articulation
Cepstral articulatory feature
Coefficients
Deep neural network
Feature recognition
General regression neural network
Inversion mapping
Phoneme recognition
Phonemes
Recognition
Speech recognition
Voice recognition
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T23%3A46%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Speech%20recognition%20using%20cepstral%20articulatory%20features&rft.jtitle=Speech%20communication&rft.au=Najnin,%20Shamima&rft.date=2019-02&rft.volume=107&rft.spage=26&rft.epage=37&rft.pages=26-37&rft.issn=0167-6393&rft.eissn=1872-7182&rft_id=info:doi/10.1016/j.specom.2019.01.002&rft_dat=%3Cproquest_cross%3E2195865745%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2195865745&rft_id=info:pmid/&rft_els_id=S0167639318300669&rfr_iscdi=true