Feature Extraction Techniques with Analysis of Confusing Words for Speech Recognition in the Hindi Language

The research work presents experimental work to build a speaker-independent connected word Hindi speech recognition system using different feature extraction techniques with comparative analysis of confusing words. Comparative analysis of confusing words is essential to understand the reason for the...

Bibliographic details
Published in:	Wireless personal communications 2021-06, Vol.118 (4), p.3303-3333
Main authors:	Bhatt, Shobha; Jain, Anurag; Dev, Amita
Format:	Article
Language:	eng
Subjects:	Coefficients; Communications Engineering; Comparative analysis; Computer Communication Networks; Confusion; Engineering; Error analysis; Feature extraction; Feature recognition; Linear prediction; Markov chains; Mathematical models; Networks; Pattern analysis; Performance evaluation; Signal,Image and Speech Processing; Speech; Speech recognition; Voice recognition; Words (language)
Online access:	Full text

container_end_page	3333
container_issue	4
container_start_page	3303
container_title	Wireless personal communications
container_volume	118
creator	Bhatt, Shobha ; Jain, Anurag ; Dev, Amita
description	This research presents experimental work to build a speaker-independent, connected-word Hindi speech recognition system using different feature extraction techniques, together with a comparative analysis of confusing words. Comparative analysis of confusing words is essential to understand the reasons for speech recognition errors. Based on the error analysis, different feature extraction techniques, classification techniques, acoustic models, and pronunciation dictionaries can be selected to improve the speech recognition system's performance. Earlier studies of Hindi speech recognition lack a detailed comparative analysis of confusing words for different feature extraction methods. As speaker-independent systems are developed for all, comparative analysis of confusing words is also presented for all feature extraction techniques. A speaker-independent system was proposed with a five-state, monophone-based hidden Markov model (HMM) using the HMM-based toolkit HTK. A self-created Hindi speech corpus was used in the experiment. Feature extraction techniques such as linear predictive coding cepstral coefficients (LPCCs), mel frequency cepstral coefficients (MFCCs), and perceptual linear prediction coefficients (PLPs) were applied using delta, double delta, and energy parameters to evaluate the performance of the proposed methodology. The system was assessed using the different feature extraction techniques in speaker-independent mode. Research findings reveal that PLP coefficients show the highest recognition score, while LPCCs show the lowest recognition score. Investigations also reveal that both PLP and MFCC coefficients are better than LPCCs in speech recognition. Comparative analysis of confusing words shows that PLPs and MFCCs produce fewer confusions than LPCCs and exhibit mostly the same pattern in the confusion analysis. Research outcomes also reveal that substitution errors are a significant cause of low recognition. It was also found that some words were recognized with individual feature extraction techniques only. Confusion analysis of the words indicates that words with nasal, liquid, or fricative sounds in the first position exhibit more confusions. The investigation could improve speech recognition by choosing an appropriate feature extraction method and by mixing the various feature extraction methods. The research outcomes can also be utilized to build linguistic resources for improving speech recognition. The results show that the developed recognition framework achieved the highest word recognition accuracy of 76.68% with PLPs for the speaker-independent model. The proposed system was also compared with existing similar work.
doi_str_mv	10.1007/s11277-021-08181-0
format	Article
fullrecord	<record>
<control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2533363928</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2533363928</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-6971e80b7f50dd96c48c91c7edd6000aff0d259047bc004c1ad7b731131fb58b3</originalsourceid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2533363928</pqid></control>
<display><type>article</type><title>Feature Extraction Techniques with Analysis of Confusing Words for Speech Recognition in the Hindi Language</title><source>SpringerLink Journals - AutoHoldings</source><creator>Bhatt, Shobha ; Jain, Anurag ; Dev, Amita</creator><creatorcontrib>Bhatt, Shobha ; Jain, Anurag ; Dev, Amita</creatorcontrib>
<description>This research presents experimental work to build a speaker-independent, connected-word Hindi speech recognition system using different feature extraction techniques, together with a comparative analysis of confusing words. Comparative analysis of confusing words is essential to understand the reasons for speech recognition errors. Based on the error analysis, different feature extraction techniques, classification techniques, acoustic models, and pronunciation dictionaries can be selected to improve the speech recognition system's performance. Earlier studies of Hindi speech recognition lack a detailed comparative analysis of confusing words for different feature extraction methods. As speaker-independent systems are developed for all, comparative analysis of confusing words is also presented for all feature extraction techniques. A speaker-independent system was proposed with a five-state, monophone-based hidden Markov model (HMM) using the HMM-based toolkit HTK. A self-created Hindi speech corpus was used in the experiment. Feature extraction techniques such as linear predictive coding cepstral coefficients (LPCCs), mel frequency cepstral coefficients (MFCCs), and perceptual linear prediction coefficients (PLPs) were applied using delta, double delta, and energy parameters to evaluate the performance of the proposed methodology. The system was assessed using the different feature extraction techniques in speaker-independent mode. Research findings reveal that PLP coefficients show the highest recognition score, while LPCCs show the lowest recognition score. Investigations also reveal that both PLP and MFCC coefficients are better than LPCCs in speech recognition. Comparative analysis of confusing words shows that PLPs and MFCCs produce fewer confusions than LPCCs and exhibit mostly the same pattern in the confusion analysis. Research outcomes also reveal that substitution errors are a significant cause of low recognition. It was also found that some words were recognized with individual feature extraction techniques only. Confusion analysis of the words indicates that words with nasal, liquid, or fricative sounds in the first position exhibit more confusions. The investigation could improve speech recognition by choosing an appropriate feature extraction method and by mixing the various feature extraction methods. The research outcomes can also be utilized to build linguistic resources for improving speech recognition. The results show that the developed recognition framework achieved the highest word recognition accuracy of 76.68% with PLPs for the speaker-independent model. The proposed system was also compared with existing similar work.</description>
<identifier>ISSN: 0929-6212</identifier><identifier>EISSN: 1572-834X</identifier><identifier>DOI: 10.1007/s11277-021-08181-0</identifier><language>eng</language><publisher>New York: Springer US</publisher>
<subject>Coefficients ; Communications Engineering ; Comparative analysis ; Computer Communication Networks ; Confusion ; Engineering ; Error analysis ; Feature extraction ; Feature recognition ; Linear prediction ; Markov chains ; Mathematical models ; Networks ; Pattern analysis ; Performance evaluation ; Signal,Image and Speech Processing ; Speech ; Speech recognition ; Voice recognition ; Words (language)</subject>
<ispartof>Wireless personal communications, 2021-06, Vol.118 (4), p.3303-3333</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC part of Springer Nature 2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-6971e80b7f50dd96c48c91c7edd6000aff0d259047bc004c1ad7b731131fb58b3</citedby><cites>FETCH-LOGICAL-c319t-6971e80b7f50dd96c48c91c7edd6000aff0d259047bc004c1ad7b731131fb58b3</cites></display>
<links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11277-021-08181-0$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11277-021-08181-0$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,778,782,27907,27908,41471,42540,51302</link.rule.ids></links>
<search><creatorcontrib>Bhatt, Shobha</creatorcontrib><creatorcontrib>Jain, Anurag</creatorcontrib><creatorcontrib>Dev, Amita</creatorcontrib><title>Feature Extraction Techniques with Analysis of Confusing Words for Speech Recognition in the Hindi Language</title><title>Wireless personal communications</title><addtitle>Wireless Pers Commun</addtitle>
<subject>Coefficients</subject><subject>Communications Engineering</subject><subject>Comparative analysis</subject><subject>Computer Communication Networks</subject><subject>Confusion</subject><subject>Engineering</subject><subject>Error analysis</subject><subject>Feature extraction</subject><subject>Feature recognition</subject><subject>Linear prediction</subject><subject>Markov chains</subject><subject>Mathematical models</subject><subject>Networks</subject><subject>Pattern analysis</subject><subject>Performance evaluation</subject><subject>Signal,Image and Speech Processing</subject><subject>Speech</subject><subject>Speech recognition</subject><subject>Voice recognition</subject><subject>Words (language)</subject>
<issn>0929-6212</issn><issn>1572-834X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><startdate>20210601</startdate><enddate>20210601</enddate><creator>Bhatt, Shobha</creator><creator>Jain, Anurag</creator><creator>Dev, Amita</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search>
<sort><creationdate>20210601</creationdate><title>Feature Extraction Techniques with Analysis of Confusing Words for Speech Recognition in the Hindi Language</title><author>Bhatt, Shobha ; Jain, Anurag ; Dev, Amita</author></sort>
<facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-6971e80b7f50dd96c48c91c7edd6000aff0d259047bc004c1ad7b731131fb58b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate>
<topic>Coefficients</topic><topic>Communications Engineering</topic><topic>Comparative analysis</topic><topic>Computer Communication Networks</topic><topic>Confusion</topic><topic>Engineering</topic><topic>Error analysis</topic><topic>Feature extraction</topic><topic>Feature recognition</topic><topic>Linear prediction</topic><topic>Markov chains</topic><topic>Mathematical models</topic><topic>Networks</topic><topic>Pattern analysis</topic><topic>Performance evaluation</topic><topic>Signal,Image and Speech Processing</topic><topic>Speech</topic><topic>Speech recognition</topic><topic>Voice recognition</topic><topic>Words (language)</topic>
<toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bhatt, Shobha</creatorcontrib><creatorcontrib>Jain, Anurag</creatorcontrib><creatorcontrib>Dev, Amita</creatorcontrib><collection>CrossRef</collection><jtitle>Wireless personal communications</jtitle></facets>
<delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery>
<addata><au>Bhatt, Shobha</au><au>Jain, Anurag</au><au>Dev, Amita</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Feature Extraction Techniques with Analysis of Confusing Words for Speech Recognition in the Hindi Language</atitle><jtitle>Wireless personal communications</jtitle><stitle>Wireless Pers Commun</stitle><date>2021-06-01</date><risdate>2021</risdate><volume>118</volume><issue>4</issue><spage>3303</spage><epage>3333</epage><pages>3303-3333</pages><issn>0929-6212</issn><eissn>1572-834X</eissn><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11277-021-08181-0</doi><tpages>31</tpages></addata>
</record>
fulltext	fulltext
identifier	ISSN: 0929-6212
ispartof	Wireless personal communications, 2021-06, Vol.118 (4), p.3303-3333
issn	0929-6212 1572-834X
language	eng
recordid	cdi_proquest_journals_2533363928
source	SpringerLink Journals - AutoHoldings
subjects	Coefficients ; Communications Engineering ; Comparative analysis ; Computer Communication Networks ; Confusion ; Engineering ; Error analysis ; Feature extraction ; Feature recognition ; Linear prediction ; Markov chains ; Mathematical models ; Networks ; Pattern analysis ; Performance evaluation ; Signal,Image and Speech Processing ; Speech ; Speech recognition ; Voice recognition ; Words (language)
title	Feature Extraction Techniques with Analysis of Confusing Words for Speech Recognition in the Hindi Language
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T18%3A15%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Feature%20Extraction%20Techniques%20with%20Analysis%20of%20Confusing%20Words%20for%20Speech%20Recognition%20in%20the%20Hindi%20Language&rft.jtitle=Wireless%20personal%20communications&rft.au=Bhatt,%20Shobha&rft.date=2021-06-01&rft.volume=118&rft.issue=4&rft.spage=3303&rft.epage=3333&rft.pages=3303-3333&rft.issn=0929-6212&rft.eissn=1572-834X&rft_id=info:doi/10.1007/s11277-021-08181-0&rft_dat=%3Cproquest_cross%3E2533363928%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2533363928&rft_id=info:pmid/&rfr_iscdi=true