Curriculum Learning Based Approaches for Noise Robust Speaker Recognition

Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches...

Full description

Bibliographic details
Published in:	IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210
Main authors:	Ranjan, Shivesh; Hansen, John H. L.
Format:	Article
Language:	eng
Subjects:	Curriculum learning (CL); Estimation; Noise measurement; noise robust; Noise robustness; probabilistic linear discriminant (PLDA); Rats; Signal to noise ratio; speaker verification; Speech; Training
Online access:	Order full text

container_end_page	210
container_issue	1
container_start_page	197
container_title	IEEE/ACM transactions on audio, speech, and language processing
container_volume	26
creator	Ranjan, Shivesh; Hansen, John H. L.
description	Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.
doi_str_mv	10.1109/TASLP.2017.2765832
format	Article
fullrecord	<record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TASLP_2017_2765832</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8080267</ieee_id><sourcerecordid>10_1109_TASLP_2017_2765832</sourcerecordid><originalsourceid>FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</originalsourceid><addsrcrecordid>eNo9kM1OAjEUhRujiQR5Ad30BQZvOz9tl0hUSCZqANeTtnMHqzCdtMzCt3cQdHXPXXwnJx8htwymjIG638zW5duUAxNTLopcpvyCjHjKVaJSyC7_MldwTSYxfgIAA6GUyEZkOe9DcLbf9Xtaog6ta7f0QUes6azrgtf2AyNtfKAv3kWkK2_6eKDrDvUXBrpC67etOzjf3pCrRu8iTs53TN6fHjfzRVK-Pi_nszKxvBCHJC8E1AxYIQ1nuZWFsgDScptraxutZaO5QqNRmzozosiGT2YoGlvX0pg0HRN-6rXBxxiwqbrg9jp8Vwyqo4_q10d19FGdfQzQ3QlyiPgPSJAwjEp_ACUcXZ0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</title><source>IEEE Electronic Library (IEL)</source><creator>Ranjan, Shivesh ; Hansen, John H. L.</creator><creatorcontrib>Ranjan, Shivesh ; Hansen, John H. L.</creatorcontrib><description>Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. 
Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.</description><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASLP.2017.2765832</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>IEEE</publisher><subject>Curriculum learning (CL) ; Estimation ; Noise measurement ; noise robust ; Noise robustness ; probabilistic linear discriminant (PLDA) ; Rats ; Signal to noise ratio ; speaker verification ; Speech ; Training</subject><ispartof>IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</citedby><cites>FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</cites><orcidid>0000-0002-7365-0253 ; 
0000-0003-1382-9929</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8080267$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,27905,27906,54739</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8080267$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ranjan, Shivesh</creatorcontrib><creatorcontrib>Hansen, John H. L.</creatorcontrib><title>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</title><title>IEEE/ACM transactions on audio, speech, and language processing</title><addtitle>TASLP</addtitle><description>Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. 
We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.</description><subject>Curriculum learning (CL)</subject><subject>Estimation</subject><subject>Noise measurement</subject><subject>noise robust</subject><subject>Noise robustness</subject><subject>probabilistic linear discriminant (PLDA)</subject><subject>Rats</subject><subject>Signal to noise ratio</subject><subject>speaker verification</subject><subject>Speech</subject><subject>Training</subject><issn>2329-9290</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kM1OAjEUhRujiQR5Ad30BQZvOz9tl0hUSCZqANeTtnMHqzCdtMzCt3cQdHXPXXwnJx8htwymjIG638zW5duUAxNTLopcpvyCjHjKVaJSyC7_MldwTSYxfgIAA6GUyEZkOe9DcLbf9Xtaog6ta7f0QUes6azrgtf2AyNtfKAv3kWkK2_6eKDrDvUXBrpC67etOzjf3pCrRu8iTs53TN6fHjfzRVK-Pi_nszKxvBCHJC8E1AxYIQ1nuZWFsgDScptraxutZaO5QqNRmzozosiGT2YoGlvX0pg0HRN-6rXBxxiwqbrg9jp8Vwyqo4_q10d19FGdfQzQ3QlyiPgPSJAwjEp_ACUcXZ0</recordid><startdate>201801</startdate><enddate>201801</enddate><creator>Ranjan, Shivesh</creator><creator>Hansen, John H. L.</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-7365-0253</orcidid><orcidid>https://orcid.org/0000-0003-1382-9929</orcidid></search><sort><creationdate>201801</creationdate><title>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</title><author>Ranjan, Shivesh ; Hansen, John H. 
L.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Curriculum learning (CL)</topic><topic>Estimation</topic><topic>Noise measurement</topic><topic>noise robust</topic><topic>Noise robustness</topic><topic>probabilistic linear discriminant (PLDA)</topic><topic>Rats</topic><topic>Signal to noise ratio</topic><topic>speaker verification</topic><topic>Speech</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ranjan, Shivesh</creatorcontrib><creatorcontrib>Hansen, John H. L.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ranjan, Shivesh</au><au>Hansen, John H. L.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</atitle><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle><stitle>TASLP</stitle><date>2018-01</date><risdate>2018</risdate><volume>26</volume><issue>1</issue><spage>197</spage><epage>210</epage><pages>197-210</pages><issn>2329-9290</issn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. 
This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.</abstract><pub>IEEE</pub><doi>10.1109/TASLP.2017.2765832</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-7365-0253</orcidid><orcidid>https://orcid.org/0000-0003-1382-9929</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 2329-9290
ispartof	IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210
issn	2329-9290 2329-9304
language	eng
recordid	cdi_crossref_primary_10_1109_TASLP_2017_2765832
source	IEEE Electronic Library (IEL)
subjects	Curriculum learning (CL); Estimation; Noise measurement; noise robust; Noise robustness; probabilistic linear discriminant (PLDA); Rats; Signal to noise ratio; speaker verification; Speech; Training
title	Curriculum Learning Based Approaches for Noise Robust Speaker Recognition
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A39%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Curriculum%20Learning%20Based%20Approaches%20for%20Noise%20Robust%20Speaker%20Recognition&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Ranjan,%20Shivesh&rft.date=2018-01&rft.volume=26&rft.issue=1&rft.spage=197&rft.epage=210&rft.pages=197-210&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASLP.2017.2765832&rft_dat=%3Ccrossref_RIE%3E10_1109_TASLP_2017_2765832%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8080267&rfr_iscdi=true
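
The curriculum scheme summarized in the description field (split the training data into progressively harder subsets, start training on the cleanest one, then move to harder ones) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the difficulty criterion is an estimated signal-to-noise ratio per utterance and that each stage adds the next subset to a growing training pool; the record confirms neither detail. The names Utterance, make_curriculum and curriculum_stages are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical utterance record: (utterance_id, estimated SNR in dB).
# Using SNR as the difficulty criterion is an assumption, not taken from the record.
Utterance = Tuple[str, float]


def make_curriculum(utts: Sequence[Utterance], n_subsets: int) -> List[List[Utterance]]:
    """Split utterances into n_subsets ordered from cleanest to noisiest."""
    if n_subsets < 1:
        raise ValueError("n_subsets must be >= 1")
    ordered = sorted(utts, key=lambda u: u[1], reverse=True)  # highest SNR first
    size, extra = divmod(len(ordered), n_subsets)
    subsets: List[List[Utterance]] = []
    start = 0
    for i in range(n_subsets):
        # The first `extra` subsets take one additional utterance each.
        end = start + size + (1 if i < extra else 0)
        subsets.append(ordered[start:end])
        start = end
    return subsets


def curriculum_stages(subsets: List[List[Utterance]]) -> Iterator[List[Utterance]]:
    """Yield the training pool for each stage, cleanest subset first.

    Growing the pool cumulatively is an assumption; a variant could train
    on each subset alone instead.
    """
    pool: List[Utterance] = []
    for subset in subsets:
        pool = pool + subset  # new list, so earlier stages stay unchanged
        yield pool


data = [("u1", 5.0), ("u2", 25.0), ("u3", 15.0),
        ("u4", 0.0), ("u5", 20.0), ("u6", 10.0)]
stages = list(curriculum_stages(make_curriculum(data, 3)))
stage_ids = [[utt_id for utt_id, _ in stage] for stage in stages]
```

In a real system each yielded pool would be passed to one training round of the i-Vector extractor or the PLDA back-end, with the model from the previous stage used as initialization.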