Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification
This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The...
Published in: | IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.414-428 |
---|---|
Main author: | Borgstrom, Bengt J. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 428 |
---|---|
container_issue | |
container_start_page | 414 |
container_title | IEEE/ACM transactions on audio, speech, and language processing |
container_volume | 30 |
creator | Borgstrom, Bengt J. |
description | This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set. Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels, and for data sets with naturally occurring missing or unreliable labels. |
doi_str_mv | 10.1109/TASLP.2021.3130980 |
format | Article |
fullrecord | Record ID: TN_cdi_crossref_primary_10_1109_TASLP_2021_3130980 ; IEEE document ID: 9627797 ; ProQuest ID: 2621792958 ; Record type: article ; Author: Borgstrom, Bengt J. ; ORCID: https://orcid.org/0000-0001-8529-5378 ; ISSN: 2329-9290 ; EISSN: 2329-9304 ; DOI: 10.1109/TASLP.2021.3130980 ; CODEN: ITASFA ; Publisher: Piscataway: IEEE ; Rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 ; Peer reviewed ; Pages: 414-428 (15 pages) ; Source: IEEE Electronic Library (IEL) ; Link: https://ieeexplore.ieee.org/document/9627797 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2329-9290 |
ispartof | IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.414-428 |
issn | 2329-9290 2329-9304 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TASLP_2021_3130980 |
source | IEEE Electronic Library (IEL) |
subjects | Adaptation models ; Bayes methods ; Bayesian analysis ; Data models ; Datasets ; Discriminant analysis ; Estimation ; Labeling ; Labels ; Noise measurement ; noisy labels ; probabilistic linear discriminant analysis ; Random variables ; Speaker verification ; Speech recognition ; Training ; variational bayes ; Verification |
title | Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T08%3A47%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Bayesian%20Estimation%20of%20PLDA%20in%20the%20Presence%20of%20Noisy%20Training%20Labels,%20With%20Applications%20to%20Speaker%20Verification&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Borgstrom,%20Bengt%20J.&rft.date=2022&rft.volume=30&rft.spage=414&rft.epage=428&rft.pages=414-428&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASFA&rft_id=info:doi/10.1109/TASLP.2021.3130980&rft_dat=%3Cproquest_RIE%3E2621792958%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2621792958&rft_id=info:pmid/&rft_ieee_id=9627797&rfr_iscdi=true |