Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification

This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The...

Bibliographic details
Published in: IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.414-428
Main author: Borgstrom, Bengt J.
Format: Article
Language: eng
container_end_page 428
container_issue
container_start_page 414
container_title IEEE/ACM transactions on audio, speech, and language processing
container_volume 30
creator Borgstrom, Bengt J.
description This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set. Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels, and for data sets with naturally occurring missing or unreliable labels.
doi_str_mv 10.1109/TASLP.2021.3130980
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TASLP_2021_3130980</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9627797</ieee_id><sourcerecordid>2621792958</sourcerecordid><originalsourceid>FETCH-LOGICAL-c295t-a5d2ae44c9b6aaed133e1e78514825b509d64a04fd3f020aa1af202d560cd1db3</originalsourceid><addsrcrecordid>eNo9kMlOwzAQhi0EElXpC8DFEldSxnY2H0MpixRBpRY4Wk4yoS4lDnZ66NuTLnCa0cy_SB8hlwzGjIG8XWTzfDbmwNlYMAEyhRMy4ILLQAoIT_92LuGcjLxfAQCDRMokHBB7p7fojW7o1HfmW3fGNtTWdJbfZ9Q0tFsinTn02JS4u79Y47d04bRpTPNJc13g2t_QD9Mtada2a1PuIzztLJ23qL_Q0Xd0pj4-LshZrdceR8c5JG8P08XkKchfH58nWR6UXEZdoKOKawzDUhax1lgxIZBhkkYsTHlURCCrONQQ1pWogYPWTNc9gCqKoaxYVYghuT7kts7-bNB3amU3rukrFY85S3oaUdqr-EFVOuu9w1q1rofgtoqB2rFVe7Zqx1Yd2famq4PJIOK_QcY8SWQifgHlJXWM</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2621792958</pqid></control><display><type>article</type><title>Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification</title><source>IEEE Electronic Library (IEL)</source><creator>Borgstrom, Bengt J.</creator><creatorcontrib>Borgstrom, Bengt J.</creatorcontrib><description>This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined.
The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set. Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels, and for data sets with naturally occurring missing or unreliable labels.</description><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASLP.2021.3130980</identifier><identifier>CODEN: ITASFA</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Adaptation models ; Bayes methods ; Bayesian analysis ; Data models ; Datasets ; Discriminant analysis ; Estimation ; Labeling ; Labels ; Noise measurement ; noisy labels ; probabilistic linear discriminant analysis ; Random variables ; Speaker verification ; Speech recognition ; Training ; variational bayes ; Verification</subject><ispartof>IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.414-428</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c295t-a5d2ae44c9b6aaed133e1e78514825b509d64a04fd3f020aa1af202d560cd1db3</citedby><cites>FETCH-LOGICAL-c295t-a5d2ae44c9b6aaed133e1e78514825b509d64a04fd3f020aa1af202d560cd1db3</cites><orcidid>0000-0001-8529-5378</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9627797$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,4024,27923,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9627797$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Borgstrom, Bengt J.</creatorcontrib><title>Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification</title><title>IEEE/ACM transactions on audio, speech, and language processing</title><addtitle>TASLP</addtitle><description>This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed.
Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set. Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels, and for data sets with naturally occurring missing or unreliable labels.</description><subject>Adaptation models</subject><subject>Bayes methods</subject><subject>Bayesian analysis</subject><subject>Data models</subject><subject>Datasets</subject><subject>Discriminant analysis</subject><subject>Estimation</subject><subject>Labeling</subject><subject>Labels</subject><subject>Noise measurement</subject><subject>noisy labels</subject><subject>probabilistic linear discriminant analysis</subject><subject>Random variables</subject><subject>Speaker verification</subject><subject>Speech recognition</subject><subject>Training</subject><subject>variational bayes</subject><subject>Verification</subject><issn>2329-9290</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kMlOwzAQhi0EElXpC8DFEldSxnY2H0MpixRBpRY4Wk4yoS4lDnZ66NuTLnCa0cy_SB8hlwzGjIG8XWTzfDbmwNlYMAEyhRMy4ILLQAoIT_92LuGcjLxfAQCDRMokHBB7p7fojW7o1HfmW3fGNtTWdJbfZ9Q0tFsinTn02JS4u79Y47d04bRpTPNJc13g2t_QD9Mtada2a1PuIzztLJ23qL_Q0Xd0pj4-LshZrdceR8c5JG8P08XkKchfH58nWR6UXEZdoKOKawzDUhax1lgxIZBhkkYsTHlURCCrONQQ1pWogYPWTNc9gCqKoaxYVYghuT7kts7-bNB3amU3rukrFY85S3oaUdqr-EFVOuu9w1q1rofgtoqB2rFVe7Zqx1Yd2famq4PJIOK_QcY8SWQifgHlJXWM</recordid><startdate>2022</startdate><enddate>2022</enddate><creator>Borgstrom, Bengt J.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-8529-5378</orcidid></search><sort><creationdate>2022</creationdate><title>Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification</title><author>Borgstrom, Bengt J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c295t-a5d2ae44c9b6aaed133e1e78514825b509d64a04fd3f020aa1af202d560cd1db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Adaptation models</topic><topic>Bayes methods</topic><topic>Bayesian analysis</topic><topic>Data models</topic><topic>Datasets</topic><topic>Discriminant analysis</topic><topic>Estimation</topic><topic>Labeling</topic><topic>Labels</topic><topic>Noise measurement</topic><topic>noisy labels</topic><topic>probabilistic linear discriminant analysis</topic><topic>Random variables</topic><topic>Speaker verification</topic><topic>Speech recognition</topic><topic>Training</topic><topic>variational bayes</topic><topic>Verification</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Borgstrom, Bengt J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Borgstrom, Bengt J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification</atitle><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle><stitle>TASLP</stitle><date>2022</date><risdate>2022</risdate><volume>30</volume><spage>414</spage><epage>428</epage><pages>414-428</pages><issn>2329-9290</issn><eissn>2329-9304</eissn><coden>ITASFA</coden><abstract>This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set.
Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels, and for data sets with naturally occurring missing or unreliable labels.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TASLP.2021.3130980</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0001-8529-5378</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2329-9290
ispartof IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.414-428
issn 2329-9290
2329-9304
language eng
recordid cdi_crossref_primary_10_1109_TASLP_2021_3130980
source IEEE Electronic Library (IEL)
subjects Adaptation models
Bayes methods
Bayesian analysis
Data models
Datasets
Discriminant analysis
Estimation
Labeling
Labels
Noise measurement
noisy labels
probabilistic linear discriminant analysis
Random variables
Speaker verification
Speech recognition
Training
variational bayes
Verification
title Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T08%3A47%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Bayesian%20Estimation%20of%20PLDA%20in%20the%20Presence%20of%20Noisy%20Training%20Labels,%20With%20Applications%20to%20Speaker%20Verification&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Borgstrom,%20Bengt%20J.&rft.date=2022&rft.volume=30&rft.spage=414&rft.epage=428&rft.pages=414-428&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASFA&rft_id=info:doi/10.1109/TASLP.2021.3130980&rft_dat=%3Cproquest_RIE%3E2621792958%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2621792958&rft_id=info:pmid/&rft_ieee_id=9627797&rfr_iscdi=true
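
The abstract describes the labeling process as a Discrete Memoryless Channel (DMC): each true class label is passed independently through a noisy channel and received as an observed speaker label. The following is a minimal sketch of that idea only, not the paper's VB-MAP estimator. It assumes a symmetric channel with a single flip probability spread evenly over the wrong classes; the record does not state the paper's channel parameterization, and the names `symmetric_dmc`, `transmit_labels`, and `flip_prob` are hypothetical.

```python
import random


def symmetric_dmc(num_classes, flip_prob):
    """Build a row-stochastic matrix P[true][observed] for a symmetric DMC.

    Assumption (not from the record): a label is kept with probability
    1 - flip_prob and otherwise replaced uniformly by one of the other classes.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if not 0.0 <= flip_prob <= 1.0:
        raise ValueError("flip_prob must lie in [0, 1]")
    off_diagonal = flip_prob / (num_classes - 1)
    return [
        [1.0 - flip_prob if obs == true else off_diagonal
         for obs in range(num_classes)]
        for true in range(num_classes)
    ]


def transmit_labels(true_labels, channel, rng):
    """Send each label through the channel independently (memoryless)."""
    classes = range(len(channel))
    return [rng.choices(classes, weights=channel[t], k=1)[0]
            for t in true_labels]


rng = random.Random(0)
channel = symmetric_dmc(num_classes=5, flip_prob=0.2)
true_labels = [rng.randrange(5) for _ in range(10_000)]
observed_labels = transmit_labels(true_labels, channel, rng)

# Empirical fraction of corrupted labels; should be close to flip_prob.
mislabel_rate = sum(o != t for o, t in zip(observed_labels, true_labels)) / len(true_labels)
print(f"empirical mislabel rate: {mislabel_rate:.3f}")
```

Simulated mislabels of this kind correspond to the "data sets with simulated mislabels" setting mentioned in the abstract; in an estimation setting the true labels would be latent and only `observed_labels` would be available.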