Subspace Gaussian Mixture Models for speech recognition

We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subs...
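
As a rough illustration of the idea in the abstract, the sketch below derives one state's means and mixture weights from a low-dimensional state vector and globally shared projections. The abstract itself gives no formulas, so the linear map for the means and the softmax for the weights are assumptions based on how the model is commonly written; the names `sgmm_state_params`, `M`, `w` and `v`, and all sizes, are hypothetical.

```python
import math


def sgmm_state_params(M, w, v):
    """Derive one state's Gaussian means and mixture weights.

    M: list of I shared matrices, each D x S (assumed layout: list of rows)
    w: list of I shared weight-projection vectors, each of length S
    v: the state-specific vector of length S (the only per-state parameters here)
    """
    S = len(v)
    # Assumed form: mean of Gaussian i in this state is the projection M_i v.
    means = [[sum(row[s] * v[s] for s in range(S)) for row in M_i] for M_i in M]
    # Assumed form: weights are a softmax over the scores w_i . v.
    scores = [sum(w_i[s] * v[s] for s in range(S)) for w_i in w]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return means, weights


if __name__ == "__main__":
    # Two shared Gaussians (I=2), feature dimension D=2, subspace dimension S=2.
    M = [[[1.0, 0.0], [0.0, 1.0]],
         [[2.0, 0.0], [0.0, 2.0]]]
    w = [[0.0, 0.0], [0.0, 0.0]]
    v = [2.0, 3.0]
    means, weights = sgmm_state_params(M, w, v)
    print(means)
    print(weights)
```

In this sketch each state stores only `v`, while `M` and `w` are shared across all states; that sharing is what confines the per-state means and weights to a subspace of the full parameter space.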

Bibliographic details
Main authors: Povey, Daniel; Burget, Lukáš; Agarwal, Mohit; Akyazi, Pinar; Kai Feng; Ghoshal, Arnab; Glembek, Ondřej; Goel, Nagendra Kumar; Karafiát, Martin; Rastrow, Ariya; Rose, Richard C; Schwarz, Petr; Thomas, Samuel
Format: Conference proceeding
Language: eng
container_end_page 4333
container_issue
container_start_page 4330
container_title
container_volume
creator Povey, Daniel
Burget, Lukáš
Agarwal, Mohit
Akyazi, Pinar
Kai Feng
Ghoshal, Arnab
Glembek, Ondřej
Goel, Nagendra Kumar
Karafiát, Martin
Rastrow, Ariya
Rose, Richard C
Schwarz, Petr
Thomas, Samuel
description We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.
doi_str_mv 10.1109/ICASSP.2010.5495662
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5495662</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5495662</ieee_id><sourcerecordid>5495662</sourcerecordid><originalsourceid>FETCH-LOGICAL-c291t-a6aee211d9c42647f8d1755cf3cf77de02e4ec78b4c73d2534707b83a5b3a3113</originalsourceid><addsrcrecordid>eNpVj81Kw0AUhUdUsNY-QTfzAqlzZ-78LaVoFVoUousymdzoSE1CJgF9ewt24-pwvsXhO4wtQawAhL99Wt-V5ctKiiPQ6LUx8owtvHWAEhGlN-b8X9f-gs1AS1EYQH_FrnP-FEI4i27GbDlVuQ-R-CZMOafQ8l36HqeB-K6r6ZB50w0890Txgw8Uu_c2jalrb9hlEw6ZFqecs7eH-9f1Y7F93hwNt0WUHsYimEAkAWofURq0javBah0bFRtraxKSkKJ1FUaraqkVWmErp4KuVFAAas6Wf7uJiPb9kL7C8LM__Va_mzNJsA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Subspace Gaussian Mixture Models for speech recognition</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel</creator><creatorcontrib>Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel</creatorcontrib><description>We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. 
This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424442959</identifier><identifier>ISBN: 1424442958</identifier><identifier>EISBN: 9781424442966</identifier><identifier>EISBN: 1424442966</identifier><identifier>DOI: 10.1109/ICASSP.2010.5495662</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustic testing ; Costs ; Equations ; Gaussian Mixture Models ; Hidden Markov models ; Loudspeakers ; Natural languages ; Software testing ; Software tools ; Speech recognition ; Training data</subject><ispartof>2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4330-4333</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c291t-a6aee211d9c42647f8d1755cf3cf77de02e4ec78b4c73d2534707b83a5b3a3113</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5495662$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5495662$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Povey, Daniel</creatorcontrib><creatorcontrib>Burget, Lukáš</creatorcontrib><creatorcontrib>Agarwal, Mohit</creatorcontrib><creatorcontrib>Akyazi, Pinar</creatorcontrib><creatorcontrib>Kai Feng</creatorcontrib><creatorcontrib>Ghoshal, Arnab</creatorcontrib><creatorcontrib>Glembek, Ondřej</creatorcontrib><creatorcontrib>Goel, Nagendra Kumar</creatorcontrib><creatorcontrib>Karafiát, Martin</creatorcontrib><creatorcontrib>Rastrow, Ariya</creatorcontrib><creatorcontrib>Rose, Richard 
C</creatorcontrib><creatorcontrib>Schwarz, Petr</creatorcontrib><creatorcontrib>Thomas, Samuel</creatorcontrib><title>Subspace Gaussian Mixture Models for speech recognition</title><title>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</title><addtitle>ICASSP</addtitle><description>We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.</description><subject>Acoustic testing</subject><subject>Costs</subject><subject>Equations</subject><subject>Gaussian Mixture Models</subject><subject>Hidden Markov models</subject><subject>Loudspeakers</subject><subject>Natural languages</subject><subject>Software testing</subject><subject>Software tools</subject><subject>Speech recognition</subject><subject>Training data</subject><issn>1520-6149</issn><isbn>9781424442959</isbn><isbn>1424442958</isbn><isbn>9781424442966</isbn><isbn>1424442966</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpVj81Kw0AUhUdUsNY-QTfzAqlzZ-78LaVoFVoUousymdzoSE1CJgF9ewt24-pwvsXhO4wtQawAhL99Wt-V5ctKiiPQ6LUx8owtvHWAEhGlN-b8X9f-gs1AS1EYQH_FrnP-FEI4i27GbDlVuQ-R-CZMOafQ8l36HqeB-K6r6ZB50w0890Txgw8Uu_c2jalrb9hlEw6ZFqecs7eH-9f1Y7F93hwNt0WUHsYimEAkAWofURq0javBah0bFRtraxKSkKJ1FUaraqkVWmErp4KuVFAAas6Wf7uJiPb9kL7C8LM__Va_mzNJsA</recordid><startdate>201003</startdate><enddate>201003</enddate><creator>Povey, Daniel</creator><creator>Burget, Lukáš</creator><creator>Agarwal, 
Mohit</creator><creator>Akyazi, Pinar</creator><creator>Kai Feng</creator><creator>Ghoshal, Arnab</creator><creator>Glembek, Ondřej</creator><creator>Goel, Nagendra Kumar</creator><creator>Karafiát, Martin</creator><creator>Rastrow, Ariya</creator><creator>Rose, Richard C</creator><creator>Schwarz, Petr</creator><creator>Thomas, Samuel</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>201003</creationdate><title>Subspace Gaussian Mixture Models for speech recognition</title><author>Povey, Daniel ; Burget, Lukáš ; Agarwal, Mohit ; Akyazi, Pinar ; Kai Feng ; Ghoshal, Arnab ; Glembek, Ondřej ; Goel, Nagendra Kumar ; Karafiát, Martin ; Rastrow, Ariya ; Rose, Richard C ; Schwarz, Petr ; Thomas, Samuel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c291t-a6aee211d9c42647f8d1755cf3cf77de02e4ec78b4c73d2534707b83a5b3a3113</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Acoustic testing</topic><topic>Costs</topic><topic>Equations</topic><topic>Gaussian Mixture Models</topic><topic>Hidden Markov models</topic><topic>Loudspeakers</topic><topic>Natural languages</topic><topic>Software testing</topic><topic>Software tools</topic><topic>Speech recognition</topic><topic>Training data</topic><toplevel>online_resources</toplevel><creatorcontrib>Povey, Daniel</creatorcontrib><creatorcontrib>Burget, Lukáš</creatorcontrib><creatorcontrib>Agarwal, Mohit</creatorcontrib><creatorcontrib>Akyazi, Pinar</creatorcontrib><creatorcontrib>Kai Feng</creatorcontrib><creatorcontrib>Ghoshal, Arnab</creatorcontrib><creatorcontrib>Glembek, Ondřej</creatorcontrib><creatorcontrib>Goel, Nagendra Kumar</creatorcontrib><creatorcontrib>Karafiát, Martin</creatorcontrib><creatorcontrib>Rastrow, Ariya</creatorcontrib><creatorcontrib>Rose, Richard 
C</creatorcontrib><creatorcontrib>Schwarz, Petr</creatorcontrib><creatorcontrib>Thomas, Samuel</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Povey, Daniel</au><au>Burget, Lukáš</au><au>Agarwal, Mohit</au><au>Akyazi, Pinar</au><au>Kai Feng</au><au>Ghoshal, Arnab</au><au>Glembek, Ondřej</au><au>Goel, Nagendra Kumar</au><au>Karafiát, Martin</au><au>Rastrow, Ariya</au><au>Rose, Richard C</au><au>Schwarz, Petr</au><au>Thomas, Samuel</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Subspace Gaussian Mixture Models for speech recognition</atitle><btitle>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</btitle><stitle>ICASSP</stitle><date>2010-03</date><risdate>2010</risdate><spage>4330</spage><epage>4333</epage><pages>4330-4333</pages><issn>1520-6149</issn><isbn>9781424442959</isbn><isbn>1424442958</isbn><eisbn>9781424442966</eisbn><eisbn>1424442966</eisbn><abstract>We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2010.5495662</doi><tpages>4</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4330-4333
issn 1520-6149
language eng
recordid cdi_ieee_primary_5495662
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Acoustic testing
Costs
Equations
Gaussian Mixture Models
Hidden Markov models
Loudspeakers
Natural languages
Software testing
Software tools
Speech recognition
Training data
title Subspace Gaussian Mixture Models for speech recognition
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T17%3A32%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Subspace%20Gaussian%20Mixture%20Models%20for%20speech%20recognition&rft.btitle=2010%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing&rft.au=Povey,%20Daniel&rft.date=2010-03&rft.spage=4330&rft.epage=4333&rft.pages=4330-4333&rft.issn=1520-6149&rft.isbn=9781424442959&rft.isbn_list=1424442958&rft_id=info:doi/10.1109/ICASSP.2010.5495662&rft_dat=%3Cieee_6IE%3E5495662%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424442966&rft.eisbn_list=1424442966&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5495662&rfr_iscdi=true