Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits

Speech is a one-dimensional, quasi non-stationary, time-varying signal produced by a sequence of sounds. Speech signals are random in nature. Because speech signals are easily corrupted by noise, recognition plays an important role in speech processing. Many researchers have designed recognition systems with cha...

Full description

Bibliographic details
Published in:	Journal of physics. Conference series 2021-08, Vol.1998 (1), p.12024
Main authors:	Sudhakaran, Prathibha; Yadav, Ashwani Kumar; Karamchandani, Sunil
Format:	Article
Language:	eng
Subjects:	Acoustics; Algorithms; Speech; Speech processing; Speech recognition; Toolkits
Online access:	Full text

container_end_page
container_issue	1
container_start_page	12024
container_title	Journal of physics. Conference series
container_volume	1998
creator	Sudhakaran, Prathibha; Yadav, Ashwani Kumar; Karamchandani, Sunil
description	Speech is a one-dimensional, quasi non-stationary, time-varying signal produced by a sequence of sounds. Speech signals are random in nature. Because speech signals are easily corrupted by noise, recognition plays an important role in speech processing. Many researchers have designed recognition systems with challenging parameters. A speech corpus can vary with environment, region, dialect, age, and the rate at which words are spoken. Pre-processing is the first step and includes framing, de-noising and filtering. This paper focuses on speech techniques and statistical open-source tools such as HTK, Julius, CMUSphinx and Kaldi. The word error rate (WER) obtained using all the toolkits on the WSJ1 corpus makes it clear that Kaldi stands out, offering the most advanced recipes and scripts for speech recognition systems. An Indian English corpus by IITM, implemented in Kaldi, yields a WER of 6.41 and has been compared with other Indian and international languages and well-known corpora.
doi_str_mv	10.1088/1742-6596/1998/1/012024
format	Article
fullrecord	<record><control><sourceid>proquest_iop_j</sourceid><recordid>TN_cdi_proquest_journals_2563806597</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2563806597</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3284-72945cc12369d5591afecb8912df3910d5f1a762d7685de4970573e4bfdf22103</originalsourceid><addsrcrecordid>eNqFkF1LwzAUhoMoOKe_wYB3Ql0-mia5lOEnAwX1UkKWpltn19QkQ_bvTa1MBMHcnBzO-77n8ABwitEFRkJMMM9JVjBZTLCUqZ0gTBDJ98BoN9nf_YU4BEchrBCi6fEReH3UXoc61gYG552v4xa6CobOWrOEnXfGhlC3C6ibRT9drgP8SAXqFuqQLHFt2_hliTrWIQXpBkbnmrc6hmNwUOkm2JPvOgYv11fP09ts9nBzN72cZYYSkWecyJwZgwktZMmYxLqyZi4kJmVFJUYlq7DmBSl5IVhpc8kR49Tm86qsCMGIjsHZkJsOft_YENXKbXybVirCCipQosCTig8q410I3laq8_Va-63CSPUsVU9J9cRUz1JhNbBMzvPBWbvuJ_r-cfr0W6i6dPAY0D_E_634BBRMhMw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2563806597</pqid></control><display><type>article</type><title>Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits</title><source>IOP Publishing Free Content</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>IOPscience extra</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Sudhakaran, Prathibha ; Yadav, Ashwani Kumar ; Karamchandani, Sunil</creator><creatorcontrib>Sudhakaran, Prathibha ; Yadav, Ashwani Kumar ; Karamchandani, Sunil</creatorcontrib><description>Speech is a one-dimensional quasi non-stationary time varying signal produced by a sequence of sounds. Speech signals are random in nature. Speech signals are easily corrupted by noise so recognition is an important role in speech processing. Many researches have designed recognition system with challenging parameters. Speech corpus can vary from environment, region, dialects, age, rate at which words are spoken. 
Pre-processing is the first step which includes framing, de-noisingand filtering. This paper focuses on speech techniques and statistical open source tools such as HTK, Julius, CMUSphinx and Kaldi. The word error rate obtained using all the toolkits on WSJ1 corpus gives us a clear understanding that Kaldi stands out as the most advanced recipes and scripts for speech recognition systems. An Indian English corpus by IITM was implemented in Kaldi yeilds WER of 6.41 and has been compared to other indian and international languages and well known corpuses.</description><identifier>ISSN: 1742-6588</identifier><identifier>EISSN: 1742-6596</identifier><identifier>DOI: 10.1088/1742-6596/1998/1/012024</identifier><language>eng</language><publisher>Bristol: IOP Publishing</publisher><subject>Acoustics ; Algorithms ; Speech ; Speech processing ; Speech recognition ; Toolkits</subject><ispartof>Journal of physics. Conference series, 2021-08, Vol.1998 (1), p.12024</ispartof><rights>Published under licence by IOP Publishing Ltd</rights><rights>2021. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3284-72945cc12369d5591afecb8912df3910d5f1a762d7685de4970573e4bfdf22103</citedby><cites>FETCH-LOGICAL-c3284-72945cc12369d5591afecb8912df3910d5f1a762d7685de4970573e4bfdf22103</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://iopscience.iop.org/article/10.1088/1742-6596/1998/1/012024/pdf$$EPDF$$P50$$Giop$$Hfree_for_read</linktopdf><link.rule.ids>314,778,782,27911,27912,38855,38877,53827,53854</link.rule.ids></links><search><creatorcontrib>Sudhakaran, Prathibha</creatorcontrib><creatorcontrib>Yadav, Ashwani Kumar</creatorcontrib><creatorcontrib>Karamchandani, Sunil</creatorcontrib><title>Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits</title><title>Journal of physics. Conference series</title><addtitle>J. Phys.: Conf. Ser</addtitle><description>Speech is a one-dimensional quasi non-stationary time varying signal produced by a sequence of sounds. Speech signals are random in nature. Speech signals are easily corrupted by noise so recognition is an important role in speech processing. Many researches have designed recognition system with challenging parameters. Speech corpus can vary from environment, region, dialects, age, rate at which words are spoken. Pre-processing is the first step which includes framing, de-noisingand filtering. This paper focuses on speech techniques and statistical open source tools such as HTK, Julius, CMUSphinx and Kaldi. 
The word error rate obtained using all the toolkits on WSJ1 corpus gives us a clear understanding that Kaldi stands out as the most advanced recipes and scripts for speech recognition systems. An Indian English corpus by IITM was implemented in Kaldi yeilds WER of 6.41 and has been compared to other indian and international languages and well known corpuses.</description><subject>Acoustics</subject><subject>Algorithms</subject><subject>Speech</subject><subject>Speech processing</subject><subject>Speech recognition</subject><subject>Toolkits</subject><issn>1742-6588</issn><issn>1742-6596</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>O3W</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqFkF1LwzAUhoMoOKe_wYB3Ql0-mia5lOEnAwX1UkKWpltn19QkQ_bvTa1MBMHcnBzO-77n8ABwitEFRkJMMM9JVjBZTLCUqZ0gTBDJ98BoN9nf_YU4BEchrBCi6fEReH3UXoc61gYG552v4xa6CobOWrOEnXfGhlC3C6ibRT9drgP8SAXqFuqQLHFt2_hliTrWIQXpBkbnmrc6hmNwUOkm2JPvOgYv11fP09ts9nBzN72cZYYSkWecyJwZgwktZMmYxLqyZi4kJmVFJUYlq7DmBSl5IVhpc8kR49Tm86qsCMGIjsHZkJsOft_YENXKbXybVirCCipQosCTig8q410I3laq8_Va-63CSPUsVU9J9cRUz1JhNbBMzvPBWbvuJ_r-cfr0W6i6dPAY0D_E_634BBRMhMw</recordid><startdate>20210801</startdate><enddate>20210801</enddate><creator>Sudhakaran, Prathibha</creator><creator>Yadav, Ashwani Kumar</creator><creator>Karamchandani, Sunil</creator><general>IOP 
Publishing</general><scope>O3W</scope><scope>TSCCA</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>H8D</scope><scope>HCIFZ</scope><scope>L7M</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope></search><sort><creationdate>20210801</creationdate><title>Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits</title><author>Sudhakaran, Prathibha ; Yadav, Ashwani Kumar ; Karamchandani, Sunil</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3284-72945cc12369d5591afecb8912df3910d5f1a762d7685de4970573e4bfdf22103</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Acoustics</topic><topic>Algorithms</topic><topic>Speech</topic><topic>Speech processing</topic><topic>Speech recognition</topic><topic>Toolkits</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sudhakaran, Prathibha</creatorcontrib><creatorcontrib>Yadav, Ashwani Kumar</creatorcontrib><creatorcontrib>Karamchandani, Sunil</creatorcontrib><collection>IOP Publishing Free Content</collection><collection>IOPscience (Open Access)</collection><collection>CrossRef</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology 
Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Aerospace Database</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>Journal of physics. Conference series</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sudhakaran, Prathibha</au><au>Yadav, Ashwani Kumar</au><au>Karamchandani, Sunil</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits</atitle><jtitle>Journal of physics. Conference series</jtitle><addtitle>J. Phys.: Conf. Ser</addtitle><date>2021-08-01</date><risdate>2021</risdate><volume>1998</volume><issue>1</issue><spage>12024</spage><pages>12024-</pages><issn>1742-6588</issn><eissn>1742-6596</eissn><abstract>Speech is a one-dimensional quasi non-stationary time varying signal produced by a sequence of sounds. Speech signals are random in nature. Speech signals are easily corrupted by noise so recognition is an important role in speech processing. Many researches have designed recognition system with challenging parameters. Speech corpus can vary from environment, region, dialects, age, rate at which words are spoken. Pre-processing is the first step which includes framing, de-noisingand filtering. 
This paper focuses on speech techniques and statistical open source tools such as HTK, Julius, CMUSphinx and Kaldi. The word error rate obtained using all the toolkits on WSJ1 corpus gives us a clear understanding that Kaldi stands out as the most advanced recipes and scripts for speech recognition systems. An Indian English corpus by IITM was implemented in Kaldi yeilds WER of 6.41 and has been compared to other indian and international languages and well known corpuses.</abstract><cop>Bristol</cop><pub>IOP Publishing</pub><doi>10.1088/1742-6596/1998/1/012024</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1742-6588
ispartof	Journal of physics. Conference series, 2021-08, Vol.1998 (1), p.12024
issn	1742-6588 1742-6596
language	eng
recordid	cdi_proquest_journals_2563806597
source	IOP Publishing Free Content; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; IOPscience extra; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects	Acoustics; Algorithms; Speech; Speech processing; Speech recognition; Toolkits
title	Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T17%3A15%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_iop_j&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Parasitic%20sorority%20of%20speech%20processing%20algorithms%20with%20an%20assortment%20of%20statistical%20toolkits&rft.jtitle=Journal%20of%20physics.%20Conference%20series&rft.au=Sudhakaran,%20Prathibha&rft.date=2021-08-01&rft.volume=1998&rft.issue=1&rft.spage=12024&rft.pages=12024-&rft.issn=1742-6588&rft.eissn=1742-6596&rft_id=info:doi/10.1088/1742-6596/1998/1/012024&rft_dat=%3Cproquest_iop_j%3E2563806597%3C/proquest_iop_j%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2563806597&rft_id=info:pmid/&rfr_iscdi=true