Front-end feature transforms with context filtering for speaker adaptation

Feature-space transforms such as feature-space maximum likelihood linear regression (FMLLR) are a very effective speaker adaptation technique, especially on mismatched test data. In this study, we extend the full-rank square matrix of FMLLR to a non-square matrix that uses neighboring feature vectors...
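
The idea summarized above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: conventional FMLLR is taken in its usual affine form y = A x + b with a square d x d matrix A, and the context-filtering variant is assumed to apply a d x (2k+1)d matrix to a stack of the current frame and its k neighbors on each side. The function names, the bias term in the context case, and the edge handling (repeating the first or last frame) are hypothetical choices made for this sketch. Estimating the matrix by optimizing the objective function is not shown.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def stack_context(frames: List[Vector], t: int, k: int) -> Vector:
    """Concatenate frames t-k..t+k into one long vector.

    Assumption for this sketch: at utterance boundaries the edge frame is repeated.
    """
    last = len(frames) - 1
    stacked: Vector = []
    for offset in range(-k, k + 1):
        idx = min(max(t + offset, 0), last)  # clamp index to the valid range
        stacked.extend(frames[idx])
    return stacked


def apply_transform(A: Matrix, b: Vector, x: Vector) -> Vector:
    """Compute A x + b; A may be square (FMLLR) or non-square (context case)."""
    return [sum(a * v for a, v in zip(row, x)) + bias for row, bias in zip(A, b)]


def adapt(frames: List[Vector], A: Matrix, b: Vector, k: int) -> List[Vector]:
    """Adapt every frame from its stacked context using a d x (2k+1)d matrix."""
    return [
        apply_transform(A, b, stack_context(frames, t, k))
        for t in range(len(frames))
    ]


def fmllr_as_context_matrix(A_sq: Matrix, k: int) -> Matrix:
    """Embed a square d x d matrix in a d x (2k+1)d matrix with zero neighbor blocks."""
    d = len(A_sq)
    return [[0.0] * (k * d) + list(row) + [0.0] * (k * d) for row in A_sq]


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three 2-dimensional frames
A_sq = [[2.0, 0.0], [0.0, 1.0]]                # toy square FMLLR matrix
b = [0.5, -1.0]                                # toy bias
A_ctx = fmllr_as_context_matrix(A_sq, k=1)     # 2 x 6 matrix, zeros on neighbors
adapted = adapt(frames, A_ctx, b, k=1)
```

With the neighbor blocks set to zero, the non-square matrix reproduces plain FMLLR frame by frame, which shows that the square transform is a special case of the context-filtering form. Nonzero neighbor blocks let the transform draw on correlation across adjacent frames.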

Bibliographic details
Main authors:	Jing Huang; Visweswariah, Karthik; Olsen, Peder; Goel, Vaibhava
Format:	Conference proceeding
Language:	eng
Subjects:	Adaptation models; Context; context filtering; Context modeling; Data models; feature-space maximum likelihood linear regression; Feature-space transforms; Hidden Markov models; Noise measurement; Transforms

container_end_page	4443
container_issue
container_start_page	4440
container_title
container_volume
creator	Jing Huang; Visweswariah, Karthik; Olsen, Peder; Goel, Vaibhava
description	Feature-space transforms such as feature-space maximum likelihood linear regression (FMLLR) are a very effective speaker adaptation technique, especially on mismatched test data. In this study, we extend the full-rank square matrix of FMLLR to a non-square matrix that uses neighboring feature vectors in estimating the adapted central feature vector. By optimizing an appropriate objective function, we aim to filter and transform features through the correlation of the feature context. We compare against FMLLR, which considers only the current feature vector. Our experiments are conducted on automobile data with different speed conditions. Results show that context filtering improves word error rate by 23% over conventional FMLLR on noisy 60 mph data with an adapted ML model, and by 7%/9% over the discriminatively trained FMMI/BMMI models.
doi_str_mv	10.1109/ICASSP.2011.5947339
format	Conference Proceeding
fulltext	fulltext_linktorsrc
identifier	ISSN: 1520-6149
ispartof	2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, p.4440-4443
issn	1520-6149 2379-190X
language	eng
recordid	cdi_ieee_primary_5947339
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Adaptation models; Context; context filtering; Context modeling; Data models; feature-space maximum likelihood linear regression; Feature-space transforms; Hidden Markov models; Noise measurement; Transforms
title	Front-end feature transforms with context filtering for speaker adaptation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T18%3A36%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Front-end%20feature%20transforms%20with%20context%20filtering%20for%20speaker%20adaptation&rft.btitle=2011%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing%20(ICASSP)&rft.au=Jing%20Huang&rft.date=2011-05&rft.spage=4440&rft.epage=4443&rft.pages=4440-4443&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9781457705380&rft.isbn_list=1457705389&rft_id=info:doi/10.1109/ICASSP.2011.5947339&rft_dat=%3Cieee_6IE%3E5947339%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1457705397&rft.eisbn_list=9781457705373&rft.eisbn_list=9781457705397&rft.eisbn_list=1457705370&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5947339&rfr_iscdi=true