Bayesian adaptation in HMM training and decoding using a mixture of feature transforms

Adaptive training under a Bayesian framework addresses some limitations of the standard maximum likelihood approaches. Also, the adaptively trained system can be directly used in unsupervised inference. The Bayesian framework uses a distribution of the transform rather than a point estimate. A conti...


Bibliographic details
Main authors:	Tsakalidis, S., Matsoukas, S.
Format:	Conference proceeding
Language:	eng
Subjects:	Acoustic testing; adaptive training; Bayesian inference; Bayesian methods; Broadcasting; Hidden Markov models; Loudspeakers; Maximum likelihood decoding; Maximum likelihood estimation; Speech recognition; Stochastic processes; Training data
Online access:	Order full text

container_end_page	334
container_issue
container_start_page	329
container_title
container_volume
creator	Tsakalidis, S. Matsoukas, S.
description	Adaptive training under a Bayesian framework addresses some limitations of the standard maximum likelihood approaches. Also, the adaptively trained system can be directly used in unsupervised inference. The Bayesian framework uses a distribution of the transform rather than a point estimate. A continuous transform distribution makes the integral associated with the Bayesian framework intractable and therefore various approximations have been proposed. In this paper we model the transform distribution via a mixture of transforms. Under this model, the likelihood of an utterance is computed as a weighted sum of the likelihoods obtained by transforming its features based on each of the transforms in the mixture, with weights set to the transform priors. Experimental results on Arabic broadcast news exhibit increased likelihood on acoustic training data and improved speech recognition performance on unseen test data, compared to speaker independent and standard adaptive models.
doi_str_mv	10.1109/ASRU.2007.4430133
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4430133</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4430133</ieee_id><sourcerecordid>4430133</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-bfcdb6e95ea7912b0f287270ffda429eb62a5d6e0c08ae847f43cdc3dbcbe9db3</originalsourceid><addsrcrecordid>eNo1kM9KAzEYxCMiqHUfQLzkBXbNv91sjrVoK7QIar2WL8kXibjZstmCfXu11tPMb2DmMIRcc1Zxzszt9OV5XQnGdKWUZFzKE3LJlVCKa9XUp6Qwuv3nWpyTIucPxhjXjeKNuiBvd7DHHCFR8LAdYYx9ojHRxWpFxwFiiumdQvLUo-v9L-zyIaJd_Bp3A9I-0IBwsD-FlEM_dPmKnAX4zFgcdULWD_evs0W5fJo_zqbLMnJdj6UNztsGTY2gDReWBdFqoVkIHpQwaBsBtW-QOdYCtkoHJZ130ltn0XgrJ-Tmbzci4mY7xA6G_eZ4hfwG2eRUAA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Bayesian adaptation in HMM training and decoding using a mixture of feature transforms</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Tsakalidis, S. ; Matsoukas, S.</creator><creatorcontrib>Tsakalidis, S. ; Matsoukas, S.</creatorcontrib><description>Adaptive training under a Bayesian framework addresses some limitations of the standard maximum likelihood approaches. Also, the adaptively trained system can be directly used in unsupervised inference. The Bayesian framework uses a distribution of the transform rather than a point estimate. A continuous transform distribution makes the integral associated with the Bayesian framework intractable and therefore various approximations have been proposed. In this paper we model the transform distribution via a mixture of transforms. Under this model, the likelihood of an utterance is computed as a weighted sum of the likelihoods obtained by transforming its features based on each of the transforms in the mixture, with weights set to the transform priors. 
Experimental results on Arabic broadcast news exhibit increased likelihood on acoustic training data and improved speech recognition performance on unseen test data, compared to speaker independent and standard adaptive models.</description><identifier>ISBN: 9781424417452</identifier><identifier>ISBN: 1424417457</identifier><identifier>EISBN: 1424417465</identifier><identifier>EISBN: 9781424417469</identifier><identifier>DOI: 10.1109/ASRU.2007.4430133</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustic testing ; adaptive training ; Bayesian inference ; Bayesian methods ; Broadcasting ; Hidden Markov models ; Loudspeakers ; Maximum likelihood decoding ; Maximum likelihood estimation ; Speech recognition ; Stochastic processes ; Training data</subject><ispartof>2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007, p.329-334</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4430133$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4430133$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Tsakalidis, S.</creatorcontrib><creatorcontrib>Matsoukas, S.</creatorcontrib><title>Bayesian adaptation in HMM training and decoding using a mixture of feature transforms</title><title>2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)</title><addtitle>ASRU</addtitle><description>Adaptive training under a Bayesian framework addresses some limitations of the standard maximum likelihood approaches. Also, the adaptively trained system can be directly used in unsupervised inference. 
The Bayesian framework uses a distribution of the transform rather than a point estimate. A continuous transform distribution makes the integral associated with the Bayesian framework intractable and therefore various approximations have been proposed. In this paper we model the transform distribution via a mixture of transforms. Under this model, the likelihood of an utterance is computed as a weighted sum of the likelihoods obtained by transforming its features based on each of the transforms in the mixture, with weights set to the transform priors. Experimental results on Arabic broadcast news exhibit increased likelihood on acoustic training data and improved speech recognition performance on unseen test data, compared to speaker independent and standard adaptive models.</description><subject>Acoustic testing</subject><subject>adaptive training</subject><subject>Bayesian inference</subject><subject>Bayesian methods</subject><subject>Broadcasting</subject><subject>Hidden Markov models</subject><subject>Loudspeakers</subject><subject>Maximum likelihood decoding</subject><subject>Maximum likelihood estimation</subject><subject>Speech recognition</subject><subject>Stochastic processes</subject><subject>Training data</subject><isbn>9781424417452</isbn><isbn>1424417457</isbn><isbn>1424417465</isbn><isbn>9781424417469</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2007</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1kM9KAzEYxCMiqHUfQLzkBXbNv91sjrVoK7QIar2WL8kXibjZstmCfXu11tPMb2DmMIRcc1Zxzszt9OV5XQnGdKWUZFzKE3LJlVCKa9XUp6Qwuv3nWpyTIucPxhjXjeKNuiBvd7DHHCFR8LAdYYx9ojHRxWpFxwFiiumdQvLUo-v9L-zyIaJd_Bp3A9I-0IBwsD-FlEM_dPmKnAX4zFgcdULWD_evs0W5fJo_zqbLMnJdj6UNztsGTY2gDReWBdFqoVkIHpQwaBsBtW-QOdYCtkoHJZ130ltn0XgrJ-Tmbzci4mY7xA6G_eZ4hfwG2eRUAA</recordid><startdate>200712</startdate><enddate>200712</enddate><creator>Tsakalidis, S.</creator><creator>Matsoukas, 
S.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200712</creationdate><title>Bayesian adaptation in HMM training and decoding using a mixture of feature transforms</title><author>Tsakalidis, S. ; Matsoukas, S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-bfcdb6e95ea7912b0f287270ffda429eb62a5d6e0c08ae847f43cdc3dbcbe9db3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Acoustic testing</topic><topic>adaptive training</topic><topic>Bayesian inference</topic><topic>Bayesian methods</topic><topic>Broadcasting</topic><topic>Hidden Markov models</topic><topic>Loudspeakers</topic><topic>Maximum likelihood decoding</topic><topic>Maximum likelihood estimation</topic><topic>Speech recognition</topic><topic>Stochastic processes</topic><topic>Training data</topic><toplevel>online_resources</toplevel><creatorcontrib>Tsakalidis, S.</creatorcontrib><creatorcontrib>Matsoukas, S.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Tsakalidis, S.</au><au>Matsoukas, S.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Bayesian adaptation in HMM training and decoding using a mixture of feature transforms</atitle><btitle>2007 IEEE Workshop on Automatic Speech Recognition & Understanding 
(ASRU)</btitle><stitle>ASRU</stitle><date>2007-12</date><risdate>2007</risdate><spage>329</spage><epage>334</epage><pages>329-334</pages><isbn>9781424417452</isbn><isbn>1424417457</isbn><eisbn>1424417465</eisbn><eisbn>9781424417469</eisbn><abstract>Adaptive training under a Bayesian framework addresses some limitations of the standard maximum likelihood approaches. Also, the adaptively trained system can be directly used in unsupervised inference. The Bayesian framework uses a distribution of the transform rather than a point estimate. A continuous transform distribution makes the integral associated with the Bayesian framework intractable and therefore various approximations have been proposed. In this paper we model the transform distribution via a mixture of transforms. Under this model, the likelihood of an utterance is computed as a weighted sum of the likelihoods obtained by transforming its features based on each of the transforms in the mixture, with weights set to the transform priors. Experimental results on Arabic broadcast news exhibit increased likelihood on acoustic training data and improved speech recognition performance on unseen test data, compared to speaker independent and standard adaptive models.</abstract><pub>IEEE</pub><doi>10.1109/ASRU.2007.4430133</doi><tpages>6</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISBN: 9781424417452
ispartof	2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007, p.329-334
issn
language	eng
recordid	cdi_ieee_primary_4430133
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Acoustic testing; adaptive training; Bayesian inference; Bayesian methods; Broadcasting; Hidden Markov models; Loudspeakers; Maximum likelihood decoding; Maximum likelihood estimation; Speech recognition; Stochastic processes; Training data
title	Bayesian adaptation in HMM training and decoding using a mixture of feature transforms
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T04%3A45%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Bayesian%20adaptation%20in%20HMM%20training%20and%20decoding%20using%20a%20mixture%20of%20feature%20transforms&rft.btitle=2007%20IEEE%20Workshop%20on%20Automatic%20Speech%20Recognition%20&%20Understanding%20(ASRU)&rft.au=Tsakalidis,%20S.&rft.date=2007-12&rft.spage=329&rft.epage=334&rft.pages=329-334&rft.isbn=9781424417452&rft.isbn_list=1424417457&rft_id=info:doi/10.1109/ASRU.2007.4430133&rft_dat=%3Cieee_6IE%3E4430133%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424417465&rft.eisbn_list=9781424417469&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4430133&rfr_iscdi=true
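
Illustrative note on the abstract: the description states that the likelihood of an utterance is a weighted sum of the likelihoods obtained by transforming its features with each transform in the mixture, with weights set to the transform priors. The Python sketch below shows one way that sum could be computed in the log domain. It is not the authors' implementation. Several things are assumptions made only for illustration: a single diagonal-covariance Gaussian stands in for the HMM, each transform is taken to be affine (a matrix A and an offset b), a log-determinant Jacobian term is added for the feature-space transform, and all function and variable names are hypothetical.

```python
import numpy as np


def diag_gauss_loglik(frames, mean, var):
    """Log-likelihood of all frames under one diagonal Gaussian.

    Assumption: this single Gaussian is a stand-in for the HMM acoustic model.
    """
    dim = frames.shape[1]
    diff = frames - mean
    per_frame = -0.5 * (
        dim * np.log(2.0 * np.pi)
        + np.sum(np.log(var))
        + np.sum(diff ** 2 / var, axis=1)
    )
    return float(np.sum(per_frame))


def mixture_transform_loglik(frames, transforms, priors, mean, var):
    """log sum_k priors[k] * p(utterance | transform k).

    transforms: list of (A, b) pairs (assumed affine feature transforms).
    priors:     transform priors, used as the mixture weights.
    """
    num_frames = frames.shape[0]
    log_terms = []
    for (A, b), prior in zip(transforms, priors):
        transformed = frames @ A.T + b
        # Assumed Jacobian term for a feature-space transform, once per frame.
        _, logabsdet = np.linalg.slogdet(A)
        loglik = diag_gauss_loglik(transformed, mean, var) + num_frames * logabsdet
        log_terms.append(np.log(prior) + loglik)
    log_terms = np.array(log_terms)
    # Log-sum-exp keeps the weighted sum numerically stable.
    peak = np.max(log_terms)
    return float(peak + np.log(np.sum(np.exp(log_terms - peak))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(50, 3))          # synthetic "utterance"
    mean, var = np.zeros(3), np.ones(3)
    transforms = [
        (np.eye(3), np.zeros(3)),              # identity transform
        (0.9 * np.eye(3), 0.1 * np.ones(3)),   # a second, made-up transform
    ]
    priors = [0.6, 0.4]
    print(mixture_transform_loglik(frames, transforms, priors, mean, var))
```

Working in the log domain is a design choice for this sketch: per-utterance likelihoods are products over many frames and underflow quickly if the weighted sum is formed directly.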