Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits
Speech is a one-dimensional, quasi non-stationary, time-varying signal produced by a sequence of sounds. Speech signals are random in nature and are easily corrupted by noise, so recognition plays an important role in speech processing. Many researchers have designed recognition systems with cha...
Published in: | Journal of physics. Conference series 2021-08, Vol.1998 (1), p.12024 |
---|---|
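Main authors: | Sudhakaran, Prathibha; Yadav, Ashwani Kumar; Karamchandani, Sunil |
Format: | Article |
Language: | eng |
Subjects: | Acoustics; Algorithms; Speech; Speech processing; Speech recognition; Toolkits |
Online access: | Full text |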
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | 12024 |
container_title | Journal of physics. Conference series |
container_volume | 1998 |
creator | Sudhakaran, Prathibha; Yadav, Ashwani Kumar; Karamchandani, Sunil |
description | Speech is a one-dimensional, quasi non-stationary, time-varying signal produced by a sequence of sounds. Speech signals are random in nature and are easily corrupted by noise, so recognition plays an important role in speech processing. Many researchers have designed recognition systems with challenging parameters. A speech corpus can vary with environment, region, dialect, age, and the rate at which words are spoken. Pre-processing is the first step and includes framing, de-noising, and filtering. This paper focuses on speech techniques and statistical open-source tools such as HTK, Julius, CMUSphinx and Kaldi. The word error rates obtained using all the toolkits on the WSJ1 corpus make it clear that Kaldi stands out, offering the most advanced recipes and scripts for speech recognition systems. An Indian English corpus by IITM implemented in Kaldi yields a WER of 6.41, which is compared with other Indian and international languages and well-known corpora. |
doi_str_mv | 10.1088/1742-6596/1998/1/012024 |
format | Article |
fullrecord | Publisher: Bristol: IOP Publishing. ISSN: 1742-6588. EISSN: 1742-6596. DOI: 10.1088/1742-6596/1998/1/012024. Total pages: 9. Peer reviewed; free to read. Rights: Published under licence by IOP Publishing Ltd. 2021. This work is published under http://creativecommons.org/licenses/by/3.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. Full-text PDF: https://iopscience.iop.org/article/10.1088/1742-6596/1998/1/012024/pdf |
fulltext | fulltext |
identifier | ISSN: 1742-6588 |
ispartof | Journal of physics. Conference series, 2021-08, Vol.1998 (1), p.12024 |
issn | 1742-6588; 1742-6596 |
language | eng |
recordid | cdi_proquest_journals_2563806597 |
source | IOP Publishing Free Content; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; IOPscience extra; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry |
subjects | Acoustics; Algorithms; Speech; Speech processing; Speech recognition; Toolkits |
title | Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T17%3A15%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_iop_j&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Parasitic%20sorority%20of%20speech%20processing%20algorithms%20with%20an%20assortment%20of%20statistical%20toolkits&rft.jtitle=Journal%20of%20physics.%20Conference%20series&rft.au=Sudhakaran,%20Prathibha&rft.date=2021-08-01&rft.volume=1998&rft.issue=1&rft.spage=12024&rft.pages=12024-&rft.issn=1742-6588&rft.eissn=1742-6596&rft_id=info:doi/10.1088/1742-6596/1998/1/012024&rft_dat=%3Cproquest_iop_j%3E2563806597%3C/proquest_iop_j%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2563806597&rft_id=info:pmid/&rfr_iscdi=true |