Subspace Gaussian Mixture Models for speech recognition

We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subs...

Bibliographic details
Main authors:	Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel
Format:	Conference Proceeding
Language:	eng
Subjects:	Acoustic testing ; Costs ; Equations ; Gaussian Mixture Models ; Hidden Markov models ; Loudspeakers ; Natural languages ; Software testing ; Software tools ; Speech recognition ; Training data

container_end_page	4333
container_issue
container_start_page	4330
container_title
container_volume
creator	Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel
description	We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.
doi_str_mv	10.1109/ICASSP.2010.5495662
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5495662</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5495662</ieee_id><sourcerecordid>5495662</sourcerecordid><originalsourceid>FETCH-LOGICAL-c291t-a6aee211d9c42647f8d1755cf3cf77de02e4ec78b4c73d2534707b83a5b3a3113</originalsourceid><addsrcrecordid>eNpVj81Kw0AUhUdUsNY-QTfzAqlzZ-78LaVoFVoUousymdzoSE1CJgF9ewt24-pwvsXhO4wtQawAhL99Wt-V5ctKiiPQ6LUx8owtvHWAEhGlN-b8X9f-gs1AS1EYQH_FrnP-FEI4i27GbDlVuQ-R-CZMOafQ8l36HqeB-K6r6ZB50w0890Txgw8Uu_c2jalrb9hlEw6ZFqecs7eH-9f1Y7F93hwNt0WUHsYimEAkAWofURq0javBah0bFRtraxKSkKJ1FUaraqkVWmErp4KuVFAAas6Wf7uJiPb9kL7C8LM__Va_mzNJsA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Subspace Gaussian Mixture Models for speech recognition</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel</creator><creatorcontrib>Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel</creatorcontrib><description>We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. 
This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424442959</identifier><identifier>ISBN: 1424442958</identifier><identifier>EISBN: 9781424442966</identifier><identifier>EISBN: 1424442966</identifier><identifier>DOI: 10.1109/ICASSP.2010.5495662</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustic testing ; Costs ; Equations ; Gaussian Mixture Models ; Hidden Markov models ; Loudspeakers ; Natural languages ; Software testing ; Software tools ; Speech recognition ; Training data</subject><ispartof>2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4330-4333</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c291t-a6aee211d9c42647f8d1755cf3cf77de02e4ec78b4c73d2534707b83a5b3a3113</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5495662$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5495662$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Povey, Daniel</creatorcontrib><creatorcontrib>Burget, Lukáš</creatorcontrib><creatorcontrib>Agarwal, Mohit</creatorcontrib><creatorcontrib>Akyazi, Pinar</creatorcontrib><creatorcontrib>Kai Feng</creatorcontrib><creatorcontrib>Ghoshal, Arnab</creatorcontrib><creatorcontrib>Glembek, Ondřej</creatorcontrib><creatorcontrib>Goel, Nagendra Kumar</creatorcontrib><creatorcontrib>Karafiát, Martin</creatorcontrib><creatorcontrib>Rastrow, Ariya</creatorcontrib><creatorcontrib>Rose, Richard 
C</creatorcontrib><creatorcontrib>Schwarz, Petr</creatorcontrib><creatorcontrib>Thomas, Samuel</creatorcontrib><title>Subspace Gaussian Mixture Models for speech recognition</title><title>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</title><addtitle>ICASSP</addtitle><description>We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.</description><subject>Acoustic testing</subject><subject>Costs</subject><subject>Equations</subject><subject>Gaussian Mixture Models</subject><subject>Hidden Markov models</subject><subject>Loudspeakers</subject><subject>Natural languages</subject><subject>Software testing</subject><subject>Software tools</subject><subject>Speech recognition</subject><subject>Training data</subject><issn>1520-6149</issn><isbn>9781424442959</isbn><isbn>1424442958</isbn><isbn>9781424442966</isbn><isbn>1424442966</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpVj81Kw0AUhUdUsNY-QTfzAqlzZ-78LaVoFVoUousymdzoSE1CJgF9ewt24-pwvsXhO4wtQawAhL99Wt-V5ctKiiPQ6LUx8owtvHWAEhGlN-b8X9f-gs1AS1EYQH_FrnP-FEI4i27GbDlVuQ-R-CZMOafQ8l36HqeB-K6r6ZB50w0890Txgw8Uu_c2jalrb9hlEw6ZFqecs7eH-9f1Y7F93hwNt0WUHsYimEAkAWofURq0javBah0bFRtraxKSkKJ1FUaraqkVWmErp4KuVFAAas6Wf7uJiPb9kL7C8LM__Va_mzNJsA</recordid><startdate>201003</startdate><enddate>201003</enddate><creator>Povey, Daniel</creator><creator>Burget, Lukáš</creator><creator>Agarwal, 
Mohit</creator><creator>Akyazi, Pinar</creator><creator>Kai Feng</creator><creator>Ghoshal, Arnab</creator><creator>Glembek, Ondřej</creator><creator>Goel, Nagendra Kumar</creator><creator>Karafiát, Martin</creator><creator>Rastrow, Ariya</creator><creator>Rose, Richard C</creator><creator>Schwarz, Petr</creator><creator>Thomas, Samuel</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>201003</creationdate><title>Subspace Gaussian Mixture Models for speech recognition</title><author>Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c291t-a6aee211d9c42647f8d1755cf3cf77de02e4ec78b4c73d2534707b83a5b3a3113</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Acoustic testing</topic><topic>Costs</topic><topic>Equations</topic><topic>Gaussian Mixture Models</topic><topic>Hidden Markov models</topic><topic>Loudspeakers</topic><topic>Natural languages</topic><topic>Software testing</topic><topic>Software tools</topic><topic>Speech recognition</topic><topic>Training data</topic><toplevel>online_resources</toplevel><creatorcontrib>Povey, Daniel</creatorcontrib><creatorcontrib>Burget, Lukáš</creatorcontrib><creatorcontrib>Agarwal, Mohit</creatorcontrib><creatorcontrib>Akyazi, Pinar</creatorcontrib><creatorcontrib>Kai Feng</creatorcontrib><creatorcontrib>Ghoshal, Arnab</creatorcontrib><creatorcontrib>Glembek, Ondřej</creatorcontrib><creatorcontrib>Goel, Nagendra Kumar</creatorcontrib><creatorcontrib>Karafiát, Martin</creatorcontrib><creatorcontrib>Rastrow, Ariya</creatorcontrib><creatorcontrib>Rose, Richard 
C</creatorcontrib><creatorcontrib>Schwarz, Petr</creatorcontrib><creatorcontrib>Thomas, Samuel</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Povey, Daniel</au><au>Burget, Lukáš</au><au>Agarwal, Mohit</au><au>Akyazi, Pinar</au><au>Kai Feng</au><au>Ghoshal, Arnab</au><au>Glembek, Ondřej</au><au>Goel, Nagendra Kumar</au><au>Karafiát, Martin</au><au>Rastrow, Ariya</au><au>Rose, Richard C</au><au>Schwarz, Petr</au><au>Thomas, Samuel</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Subspace Gaussian Mixture Models for speech recognition</atitle><btitle>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</btitle><stitle>ICASSP</stitle><date>2010-03</date><risdate>2010</risdate><spage>4330</spage><epage>4333</epage><pages>4330-4333</pages><issn>1520-6149</issn><isbn>9781424442959</isbn><isbn>1424442958</isbn><eisbn>9781424442966</eisbn><eisbn>1424442966</eisbn><abstract>We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2010.5495662</doi><tpages>4</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1520-6149
ispartof	2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4330-4333
issn	1520-6149
language	eng
recordid	cdi_ieee_primary_5495662
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Acoustic testing ; Costs ; Equations ; Gaussian Mixture Models ; Hidden Markov models ; Loudspeakers ; Natural languages ; Software testing ; Software tools ; Speech recognition ; Training data
title	Subspace Gaussian Mixture Models for speech recognition
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T17%3A32%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Subspace%20Gaussian%20Mixture%20Models%20for%20speech%20recognition&rft.btitle=2010%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing&rft.au=Povey,%20Daniel&rft.date=2010-03&rft.spage=4330&rft.epage=4333&rft.pages=4330-4333&rft.issn=1520-6149&rft.isbn=9781424442959&rft.isbn_list=1424442958&rft_id=info:doi/10.1109/ICASSP.2010.5495662&rft_dat=%3Cieee_6IE%3E5495662%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424442966&rft.eisbn_list=1424442966&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5495662&rfr_iscdi=true