An HMM-based speech recognizer using overlapping articulatory features

State-of-the-art speech recognition is accomplished by using stochastic models (hidden Markov models) to represent small, nonoverlapping segments of speech (phonemes or allophones). In these traditional HMM speech recognizers, the control strategy does not draw extensively on the underlying structur...

Bibliographic details
Published in:	The Journal of the Acoustical Society of America 1996-10, Vol.100 (4), p.2500-2513
Main authors:	Erler, Kevin; Freeman, George H.
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	2513
container_issue	4
container_start_page	2500
container_title	The Journal of the Acoustical Society of America
container_volume	100
creator	Erler, Kevin ; Freeman, George H.
description	State-of-the-art speech recognition is accomplished by using stochastic models (hidden Markov models) to represent small, nonoverlapping segments of speech (phonemes or allophones). In these traditional HMM speech recognizers, the control strategy does not draw extensively on the underlying structure of speech, but rather models speech as a set of disjoint “segmental” units. Such a strategy does not easily accommodate the influence that phonemes have on neighboring phonemes, nor does it attach any meaning to the internal states of the model. In this work, an alternative HMM control strategy is presented which draws on the idea that the production of speech is a process governed by the mechanical motion of a set of relatively slow moving “articulators.” The articulatory feature model is defined as an HMM in which each internal state of the model represents one possible configuration of the (quantized) articulatory system. Rather than modeling disjoint segments, this model represents the acoustic patterns associated with the various articulatory configurations of the speech production system. Instead of a set of disjoint models, this scheme represents the entire vocabulary with a single, large HMM. The internal model states now have meaning due to their correlation with the physical state of the production system. This allows the incorporation of linguistic and physiological knowledge to improve performance. System philosophy, implementation, and results are discussed.
doi_str_mv	10.1121/1.417358
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_85551660</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>85551660</sourcerecordid><originalsourceid>FETCH-LOGICAL-c287t-af66198829b4cb3077ae622ed58c824a9dbe5c180527ffc5bd249e41176670e73</originalsourceid><addsrcrecordid>eNqFkE1Lw0AURQdRMFbBn5CVuEmdN5nPZSnWCi1udD1MJi81kiZxJhHqr29L3Lu698LhLg4h90DnAAyeYM5B5UJfkAQEo5kWjF-ShFIKGTdSXpObGL9OU-jcJGS1aNP1dpsVLmKZxh7Rf6YBfbdr618M6Rjrdpd2Pxga1_fn7sJQ-7FxQxcOaYVuGAPGW3JVuSbi3V_OyMfq-X25zjZvL6_LxSbzTKshc5WUYLRmpuC-yKlSDiVjWArtNePOlAUKD5oKpqrKi6Jk3CAHUFIqiiqfkYfptw_d94hxsPs6emwa12I3RquFECAl_RcEoSQ3uTyBjxPoQxdjwMr2od67cLBA7dmoBTsZzY_sz2d4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>15764936</pqid></control><display><type>article</type><title>An HMM-based speech recognizer using overlapping articulatory features</title><source>Acoustical Society of America Digital Library</source><creator>Erler, Kevin ; Freeman, George H.</creator><creatorcontrib>Erler, Kevin ; Freeman, George H.</creatorcontrib><description>State-of-the-art speech recognition is accomplished by using stochastic models (hidden Markov models) to represent small, nonoverlapping segments of speech (phonemes or allophones). In these traditional HMM speech recognizers, the control strategy does not draw extensively on the underlying structure of speech, but rather models speech as a set of disjoint ‘‘segmental’’ units. Such a strategy does not easily accommodate the influence that phonemes have on neighboring phonemes, nor does it attach any meaning to the internal states of the model. 
In this work, an alternative HMM control strategy is presented which draws on the idea that the production of speech is a process governed by the mechanical motion of a set of relatively slow moving ‘‘articulators.’’ The articulatory feature model is defined as an HMM in which each internal state of the model represents one possible configuration of the (quantized) articulatory system. Rather than modeling disjoint segments, this model represents the acoustic patterns associated with the various articulatory configurations of the speech production system. Instead of a set of disjoint models, this scheme represents the entire vocabulary with a single, large HMM. The internal model states now have meaning due to their correlation with the physical state of the production system. This allows the incorporation of linguistic and physiological knowledge to improve performance. System philosophy, implementation, and results are discussed.</description><identifier>ISSN: 0001-4966</identifier><identifier>EISSN: 1520-8524</identifier><identifier>DOI: 10.1121/1.417358</identifier><identifier>CODEN: JASMAN</identifier><language>eng</language><ispartof>The Journal of the Acoustical Society of America, 1996-10, Vol.100 (4), p.2500-2513</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c287t-af66198829b4cb3077ae622ed58c824a9dbe5c180527ffc5bd249e41176670e73</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>207,314,777,781,27905,27906</link.rule.ids></links><search><creatorcontrib>Erler, Kevin</creatorcontrib><creatorcontrib>Freeman, George H.</creatorcontrib><title>An HMM-based speech recognizer using overlapping articulatory features</title><title>The Journal of the Acoustical Society of America</title><description>State-of-the-art speech recognition is accomplished by using 
stochastic models (hidden Markov models) to represent small, nonoverlapping segments of speech (phonemes or allophones). In these traditional HMM speech recognizers, the control strategy does not draw extensively on the underlying structure of speech, but rather models speech as a set of disjoint ‘‘segmental’’ units. Such a strategy does not easily accommodate the influence that phonemes have on neighboring phonemes, nor does it attach any meaning to the internal states of the model. In this work, an alternative HMM control strategy is presented which draws on the idea that the production of speech is a process governed by the mechanical motion of a set of relatively slow moving ‘‘articulators.’’ The articulatory feature model is defined as an HMM in which each internal state of the model represents one possible configuration of the (quantized) articulatory system. Rather than modeling disjoint segments, this model represents the acoustic patterns associated with the various articulatory configurations of the speech production system. Instead of a set of disjoint models, this scheme represents the entire vocabulary with a single, large HMM. The internal model states now have meaning due to their correlation with the physical state of the production system. This allows the incorporation of linguistic and physiological knowledge to improve performance. 
System philosophy, implementation, and results are discussed.</description><issn>0001-4966</issn><issn>1520-8524</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1996</creationdate><recordtype>article</recordtype><recordid>eNqFkE1Lw0AURQdRMFbBn5CVuEmdN5nPZSnWCi1udD1MJi81kiZxJhHqr29L3Lu698LhLg4h90DnAAyeYM5B5UJfkAQEo5kWjF-ShFIKGTdSXpObGL9OU-jcJGS1aNP1dpsVLmKZxh7Rf6YBfbdr618M6Rjrdpd2Pxga1_fn7sJQ-7FxQxcOaYVuGAPGW3JVuSbi3V_OyMfq-X25zjZvL6_LxSbzTKshc5WUYLRmpuC-yKlSDiVjWArtNePOlAUKD5oKpqrKi6Jk3CAHUFIqiiqfkYfptw_d94hxsPs6emwa12I3RquFECAl_RcEoSQ3uTyBjxPoQxdjwMr2od67cLBA7dmoBTsZzY_sz2d4</recordid><startdate>19961001</startdate><enddate>19961001</enddate><creator>Erler, Kevin</creator><creator>Freeman, George H.</creator><scope>AAYXX</scope><scope>CITATION</scope><scope>7TK</scope><scope>7T9</scope></search><sort><creationdate>19961001</creationdate><title>An HMM-based speech recognizer using overlapping articulatory features</title><author>Erler, Kevin ; Freeman, George H.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c287t-af66198829b4cb3077ae622ed58c824a9dbe5c180527ffc5bd249e41176670e73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1996</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Erler, Kevin</creatorcontrib><creatorcontrib>Freeman, George H.</creatorcontrib><collection>CrossRef</collection><collection>Neurosciences Abstracts</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><jtitle>The Journal of the Acoustical Society of America</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Erler, Kevin</au><au>Freeman, George H.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An HMM-based speech recognizer using overlapping articulatory features</atitle><jtitle>The 
Journal of the Acoustical Society of America</jtitle><date>1996-10-01</date><risdate>1996</risdate><volume>100</volume><issue>4</issue><spage>2500</spage><epage>2513</epage><pages>2500-2513</pages><issn>0001-4966</issn><eissn>1520-8524</eissn><coden>JASMAN</coden><abstract>State-of-the-art speech recognition is accomplished by using stochastic models (hidden Markov models) to represent small, nonoverlapping segments of speech (phonemes or allophones). In these traditional HMM speech recognizers, the control strategy does not draw extensively on the underlying structure of speech, but rather models speech as a set of disjoint ‘‘segmental’’ units. Such a strategy does not easily accommodate the influence that phonemes have on neighboring phonemes, nor does it attach any meaning to the internal states of the model. In this work, an alternative HMM control strategy is presented which draws on the idea that the production of speech is a process governed by the mechanical motion of a set of relatively slow moving ‘‘articulators.’’ The articulatory feature model is defined as an HMM in which each internal state of the model represents one possible configuration of the (quantized) articulatory system. Rather than modeling disjoint segments, this model represents the acoustic patterns associated with the various articulatory configurations of the speech production system. Instead of a set of disjoint models, this scheme represents the entire vocabulary with a single, large HMM. The internal model states now have meaning due to their correlation with the physical state of the production system. This allows the incorporation of linguistic and physiological knowledge to improve performance. System philosophy, implementation, and results are discussed.</abstract><doi>10.1121/1.417358</doi><tpages>14</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0001-4966
ispartof	The Journal of the Acoustical Society of America, 1996-10, Vol.100 (4), p.2500-2513
issn	0001-4966 1520-8524
language	eng
recordid	cdi_proquest_miscellaneous_85551660
source	Acoustical Society of America Digital Library
title	An HMM-based speech recognizer using overlapping articulatory features
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T18%3A53%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20HMM-based%20speech%20recognizer%20using%20overlapping%20articulatory%20features&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Erler,%20Kevin&rft.date=1996-10-01&rft.volume=100&rft.issue=4&rft.spage=2500&rft.epage=2513&rft.pages=2500-2513&rft.issn=0001-4966&rft.eissn=1520-8524&rft.coden=JASMAN&rft_id=info:doi/10.1121/1.417358&rft_dat=%3Cproquest_cross%3E85551660%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=15764936&rft_id=info:pmid/&rfr_iscdi=true