Continuous Punjabi speech recognition model based on Kaldi ASR toolkit

In this paper, a continuous Punjabi speech recognition model is presented using the Kaldi toolkit. For speech recognition, Mel frequency cepstral coefficient (MFCC) features and perceptual linear prediction (PLP) features were extracted from Punjabi continuous speech samples. The perfor...

Detailed description

Bibliographic details
Published in:	International journal of speech technology 2018-06, Vol.21 (2), p.211-216
Main authors:	Guglani, Jyoti; Mishra, A. N.
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Automatic speech recognition; Continuous speech; Engineering; Feature extraction; Feature recognition; Linear prediction; N-Gram language models; Punjabi language; Signal, Image and Speech Processing; Social Sciences; Speech perception; Speech recognition; Voice recognition
Online access:	Full text

container_end_page	216
container_issue	2
container_start_page	211
container_title	International journal of speech technology
container_volume	21
creator	Guglani, Jyoti; Mishra, A. N.
description	In this paper, a continuous Punjabi speech recognition model is presented using the Kaldi toolkit. For speech recognition, Mel frequency cepstral coefficient (MFCC) features and perceptual linear prediction (PLP) features were extracted from Punjabi continuous speech samples. The performance of the automatic speech recognition (ASR) system is reported for both monophone and triphone models (tri1, tri2 and tri3) using an N-gram language model. The performance of the ASR system was computed in terms of word error rate (WER). A significant reduction in WER was observed using the triphone model over the monophone model. Also, the performance of ASR using the tri3 model is improved over the tri2 model, and the performance of the tri2 model is improved over the tri1 model. Further, it was found that MFCC features provide higher speech recognition accuracy than PLP features for continuous Punjabi speech.
doi_str_mv	10.1007/s10772-018-9497-6
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2038767307</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2038767307</sourcerecordid><originalsourceid>FETCH-LOGICAL-c316t-a7def04bbd8a52aaa7d5f349cfcaee63dbfbd96c1fcc3717603e6f992caf8aed3</originalsourceid><addsrcrecordid>eNp1kEtLAzEUhYMoWKs_wF3AdTSPmWSyLMWqWFB8rEMmj5o6ndRkZuG_N2UEV67uPZdzzoUPgEuCrwnG4iYTLARFmDRIVlIgfgRmpC6XhhB8XHbWEEQrwk_BWc5bjLEUks7Aahn7IfRjHDN8HvutbgPMe-fMB0zOxE0fhhB7uIvWdbDV2VlY5KPubICL1xc4xNh9huEcnHjdZXfxO-fgfXX7trxH66e7h-VijQwjfEBaWOdx1ba20TXVuujas0oab7RznNnWt1ZyQ7wxTBDBMXPcS0mN9o12ls3B1dS7T_FrdHlQ2zimvrxUFLNGcMGwKC4yuUyKOSfn1T6FnU7fimB1wKUmXKrgUgdcipcMnTK5ePuNS3_N_4d-ABzibrc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2038767307</pqid></control><display><type>article</type><title>Continuous Punjabi speech recognition model based on Kaldi ASR toolkit</title><source>Springer Nature - Complete Springer Journals</source><creator>Guglani, Jyoti ; Mishra, A. N.</creator><creatorcontrib>Guglani, Jyoti ; Mishra, A. N.</creatorcontrib><description>In this paper, continuous Punjabi speech recognition model is presented using Kaldi toolkit. For speech recognition, the extraction of Mel frequency cepstral coefficients (MFCC) features and perceptual linear prediction (PLP) features were extracted from Punjabi continuous speech samples. The performance of automatic speech recognition (ASR) system for both monophone and triphone model i.e., tri1, tri2 and tri3 model using N-gram language model is reported. The performance of ASR system were computed in terms of word error rate (WER). A significant reduction in WER was observed using the tri phone model over mono phone model ASR .Also the performance of ASR using tri3 model is improved over tri2 model and the performance of tri2 model is improved over tri1 model ASR. 
Further, it was found that MFCC feature provides higher speech recognition accuracy than PLP features for continuous Punjabi speech.</description><identifier>ISSN: 1381-2416</identifier><identifier>EISSN: 1572-8110</identifier><identifier>DOI: 10.1007/s10772-018-9497-6</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Artificial Intelligence ; Automatic speech recognition ; Continuous speech ; Engineering ; Feature extraction ; Feature recognition ; Linear prediction ; N-Gram language models ; Punjabi language ; Signal,Image and Speech Processing ; Social Sciences ; Speech perception ; Speech recognition ; Voice recognition</subject><ispartof>International journal of speech technology, 2018-06, Vol.21 (2), p.211-216</ispartof><rights>Springer Science+Business Media, LLC, part of Springer Nature 2018</rights><rights>Copyright Springer Science & Business Media 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c316t-a7def04bbd8a52aaa7d5f349cfcaee63dbfbd96c1fcc3717603e6f992caf8aed3</citedby><cites>FETCH-LOGICAL-c316t-a7def04bbd8a52aaa7d5f349cfcaee63dbfbd96c1fcc3717603e6f992caf8aed3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10772-018-9497-6$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10772-018-9497-6$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51297</link.rule.ids></links><search><creatorcontrib>Guglani, Jyoti</creatorcontrib><creatorcontrib>Mishra, A. 
N.</creatorcontrib><title>Continuous Punjabi speech recognition model based on Kaldi ASR toolkit</title><title>International journal of speech technology</title><addtitle>Int J Speech Technol</addtitle><description>In this paper, continuous Punjabi speech recognition model is presented using Kaldi toolkit. For speech recognition, the extraction of Mel frequency cepstral coefficients (MFCC) features and perceptual linear prediction (PLP) features were extracted from Punjabi continuous speech samples. The performance of automatic speech recognition (ASR) system for both monophone and triphone model i.e., tri1, tri2 and tri3 model using N-gram language model is reported. The performance of ASR system were computed in terms of word error rate (WER). A significant reduction in WER was observed using the tri phone model over mono phone model ASR .Also the performance of ASR using tri3 model is improved over tri2 model and the performance of tri2 model is improved over tri1 model ASR. Further, it was found that MFCC feature provides higher speech recognition accuracy than PLP features for continuous Punjabi speech.</description><subject>Artificial Intelligence</subject><subject>Automatic speech recognition</subject><subject>Continuous speech</subject><subject>Engineering</subject><subject>Feature extraction</subject><subject>Feature recognition</subject><subject>Linear prediction</subject><subject>N-Gram language models</subject><subject>Punjabi language</subject><subject>Signal,Image and Speech Processing</subject><subject>Social Sciences</subject><subject>Speech perception</subject><subject>Speech recognition</subject><subject>Voice 
recognition</subject><issn>1381-2416</issn><issn>1572-8110</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp1kEtLAzEUhYMoWKs_wF3AdTSPmWSyLMWqWFB8rEMmj5o6ndRkZuG_N2UEV67uPZdzzoUPgEuCrwnG4iYTLARFmDRIVlIgfgRmpC6XhhB8XHbWEEQrwk_BWc5bjLEUks7Aahn7IfRjHDN8HvutbgPMe-fMB0zOxE0fhhB7uIvWdbDV2VlY5KPubICL1xc4xNh9huEcnHjdZXfxO-fgfXX7trxH66e7h-VijQwjfEBaWOdx1ba20TXVuujas0oab7RznNnWt1ZyQ7wxTBDBMXPcS0mN9o12ls3B1dS7T_FrdHlQ2zimvrxUFLNGcMGwKC4yuUyKOSfn1T6FnU7fimB1wKUmXKrgUgdcipcMnTK5ePuNS3_N_4d-ABzibrc</recordid><startdate>20180601</startdate><enddate>20180601</enddate><creator>Guglani, Jyoti</creator><creator>Mishra, A. N.</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7T9</scope></search><sort><creationdate>20180601</creationdate><title>Continuous Punjabi speech recognition model based on Kaldi ASR toolkit</title><author>Guglani, Jyoti ; Mishra, A. N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c316t-a7def04bbd8a52aaa7d5f349cfcaee63dbfbd96c1fcc3717603e6f992caf8aed3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Artificial Intelligence</topic><topic>Automatic speech recognition</topic><topic>Continuous speech</topic><topic>Engineering</topic><topic>Feature extraction</topic><topic>Feature recognition</topic><topic>Linear prediction</topic><topic>N-Gram language models</topic><topic>Punjabi language</topic><topic>Signal,Image and Speech Processing</topic><topic>Social Sciences</topic><topic>Speech perception</topic><topic>Speech recognition</topic><topic>Voice recognition</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Guglani, Jyoti</creatorcontrib><creatorcontrib>Mishra, A. 
N.</creatorcontrib><collection>CrossRef</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><jtitle>International journal of speech technology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Guglani, Jyoti</au><au>Mishra, A. N.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Continuous Punjabi speech recognition model based on Kaldi ASR toolkit</atitle><jtitle>International journal of speech technology</jtitle><stitle>Int J Speech Technol</stitle><date>2018-06-01</date><risdate>2018</risdate><volume>21</volume><issue>2</issue><spage>211</spage><epage>216</epage><pages>211-216</pages><issn>1381-2416</issn><eissn>1572-8110</eissn><abstract>In this paper, continuous Punjabi speech recognition model is presented using Kaldi toolkit. For speech recognition, the extraction of Mel frequency cepstral coefficients (MFCC) features and perceptual linear prediction (PLP) features were extracted from Punjabi continuous speech samples. The performance of automatic speech recognition (ASR) system for both monophone and triphone model i.e., tri1, tri2 and tri3 model using N-gram language model is reported. The performance of ASR system were computed in terms of word error rate (WER). A significant reduction in WER was observed using the tri phone model over mono phone model ASR .Also the performance of ASR using tri3 model is improved over tri2 model and the performance of tri2 model is improved over tri1 model ASR. Further, it was found that MFCC feature provides higher speech recognition accuracy than PLP features for continuous Punjabi speech.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10772-018-9497-6</doi><tpages>6</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1381-2416
ispartof	International journal of speech technology, 2018-06, Vol.21 (2), p.211-216
issn	1381-2416; 1572-8110
language	eng
recordid	cdi_proquest_journals_2038767307
source	Springer Nature - Complete Springer Journals
subjects	Artificial Intelligence; Automatic speech recognition; Continuous speech; Engineering; Feature extraction; Feature recognition; Linear prediction; N-Gram language models; Punjabi language; Signal, Image and Speech Processing; Social Sciences; Speech perception; Speech recognition; Voice recognition
title	Continuous Punjabi speech recognition model based on Kaldi ASR toolkit
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T09%3A34%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Continuous%20Punjabi%20speech%20recognition%20model%20based%20on%20Kaldi%20ASR%20toolkit&rft.jtitle=International%20journal%20of%20speech%20technology&rft.au=Guglani,%20Jyoti&rft.date=2018-06-01&rft.volume=21&rft.issue=2&rft.spage=211&rft.epage=216&rft.pages=211-216&rft.issn=1381-2416&rft.eissn=1572-8110&rft_id=info:doi/10.1007/s10772-018-9497-6&rft_dat=%3Cproquest_cross%3E2038767307%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2038767307&rft_id=info:pmid/&rfr_iscdi=true