Curriculum Learning Based Approaches for Noise Robust Speaker Recognition

Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches...

Bibliographic details
Published in: IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210
Main authors: Ranjan, Shivesh, Hansen, John H. L.
Format: Article
Language: eng
container_end_page 210
container_issue 1
container_start_page 197
container_title IEEE/ACM transactions on audio, speech, and language processing
container_volume 26
creator Ranjan, Shivesh
Hansen, John H. L.
description Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant analysis (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with the subset that is closest to a clean, noise-free set, and then move progressively to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA-based back-end.
doi_str_mv 10.1109/TASLP.2017.2765832
format Article
fullrecord <record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TASLP_2017_2765832</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8080267</ieee_id><sourcerecordid>10_1109_TASLP_2017_2765832</sourcerecordid><originalsourceid>FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</originalsourceid><addsrcrecordid>eNo9kM1OAjEUhRujiQR5Ad30BQZvOz9tl0hUSCZqANeTtnMHqzCdtMzCt3cQdHXPXXwnJx8htwymjIG638zW5duUAxNTLopcpvyCjHjKVaJSyC7_MldwTSYxfgIAA6GUyEZkOe9DcLbf9Xtaog6ta7f0QUes6azrgtf2AyNtfKAv3kWkK2_6eKDrDvUXBrpC67etOzjf3pCrRu8iTs53TN6fHjfzRVK-Pi_nszKxvBCHJC8E1AxYIQ1nuZWFsgDScptraxutZaO5QqNRmzozosiGT2YoGlvX0pg0HRN-6rXBxxiwqbrg9jp8Vwyqo4_q10d19FGdfQzQ3QlyiPgPSJAwjEp_ACUcXZ0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</title><source>IEEE Electronic Library (IEL)</source><creator>Ranjan, Shivesh ; Hansen, John H. L.</creator><creatorcontrib>Ranjan, Shivesh ; Hansen, John H. L.</creatorcontrib><description>Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. 
Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.</description><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASLP.2017.2765832</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>IEEE</publisher><subject>Curriculum learning (CL) ; Estimation ; Noise measurement ; noise robust ; Noise robustness ; probabilistic linear discriminant (PLDA) ; Rats ; Signal to noise ratio ; speaker verification ; Speech ; Training</subject><ispartof>IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</citedby><cites>FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</cites><orcidid>0000-0002-7365-0253 ; 
0000-0003-1382-9929</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8080267$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,27905,27906,54739</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8080267$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ranjan, Shivesh</creatorcontrib><creatorcontrib>Hansen, John H. L.</creatorcontrib><title>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</title><title>IEEE/ACM transactions on audio, speech, and language processing</title><addtitle>TASLP</addtitle><description>Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. 
We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.</description><subject>Curriculum learning (CL)</subject><subject>Estimation</subject><subject>Noise measurement</subject><subject>noise robust</subject><subject>Noise robustness</subject><subject>probabilistic linear discriminant (PLDA)</subject><subject>Rats</subject><subject>Signal to noise ratio</subject><subject>speaker verification</subject><subject>Speech</subject><subject>Training</subject><issn>2329-9290</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kM1OAjEUhRujiQR5Ad30BQZvOz9tl0hUSCZqANeTtnMHqzCdtMzCt3cQdHXPXXwnJx8htwymjIG638zW5duUAxNTLopcpvyCjHjKVaJSyC7_MldwTSYxfgIAA6GUyEZkOe9DcLbf9Xtaog6ta7f0QUes6azrgtf2AyNtfKAv3kWkK2_6eKDrDvUXBrpC67etOzjf3pCrRu8iTs53TN6fHjfzRVK-Pi_nszKxvBCHJC8E1AxYIQ1nuZWFsgDScptraxutZaO5QqNRmzozosiGT2YoGlvX0pg0HRN-6rXBxxiwqbrg9jp8Vwyqo4_q10d19FGdfQzQ3QlyiPgPSJAwjEp_ACUcXZ0</recordid><startdate>201801</startdate><enddate>201801</enddate><creator>Ranjan, Shivesh</creator><creator>Hansen, John H. L.</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-7365-0253</orcidid><orcidid>https://orcid.org/0000-0003-1382-9929</orcidid></search><sort><creationdate>201801</creationdate><title>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</title><author>Ranjan, Shivesh ; Hansen, John H. 
L.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c267t-5670d10168b215c869c008c2c5accfaa8fa29ebaeabd4b76429e84e7fcdd8bb33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Curriculum learning (CL)</topic><topic>Estimation</topic><topic>Noise measurement</topic><topic>noise robust</topic><topic>Noise robustness</topic><topic>probabilistic linear discriminant (PLDA)</topic><topic>Rats</topic><topic>Signal to noise ratio</topic><topic>speaker verification</topic><topic>Speech</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ranjan, Shivesh</creatorcontrib><creatorcontrib>Hansen, John H. L.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ranjan, Shivesh</au><au>Hansen, John H. L.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Curriculum Learning Based Approaches for Noise Robust Speaker Recognition</atitle><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle><stitle>TASLP</stitle><date>2018-01</date><risdate>2018</risdate><volume>26</volume><issue>1</issue><spage>197</spage><epage>210</epage><pages>197-210</pages><issn>2329-9290</issn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. 
This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with a subset that is closest to a clean noise-free set, and progressively moving to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA based back-end.</abstract><pub>IEEE</pub><doi>10.1109/TASLP.2017.2765832</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-7365-0253</orcidid><orcidid>https://orcid.org/0000-0003-1382-9929</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2329-9290
ispartof IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210
issn 2329-9290
2329-9304
language eng
recordid cdi_crossref_primary_10_1109_TASLP_2017_2765832
source IEEE Electronic Library (IEL)
subjects Curriculum learning (CL)
Estimation
Noise measurement
noise robust
Noise robustness
probabilistic linear discriminant (PLDA)
Rats
Signal to noise ratio
speaker verification
Speech
Training
title Curriculum Learning Based Approaches for Noise Robust Speaker Recognition
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A39%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Curriculum%20Learning%20Based%20Approaches%20for%20Noise%20Robust%20Speaker%20Recognition&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Ranjan,%20Shivesh&rft.date=2018-01&rft.volume=26&rft.issue=1&rft.spage=197&rft.epage=210&rft.pages=197-210&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASLP.2017.2765832&rft_dat=%3Ccrossref_RIE%3E10_1109_TASLP_2017_2765832%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8080267&rfr_iscdi=true
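
The abstract in the description field above outlines a general recipe: sort the training data into progressively harder subsets, start training on the subset closest to clean, and move on to harder subsets as training proceeds. The following Python sketch illustrates only that scheduling idea and is not the authors' implementation. Several things here are assumptions: the use of an estimated signal-to-noise ratio as the difficulty criterion (the record says only "a suitable difficulty criterion"), the cumulative reuse of earlier subsets, and the hypothetical names `build_curriculum`, `train_with_curriculum`, and `update`.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

# Hypothetical data layout: (utterance id, estimated SNR in dB).
Utterance = Tuple[str, float]
Model = TypeVar("Model")


def build_curriculum(utts: Sequence[Utterance], n_stages: int) -> List[List[Utterance]]:
    """Split utterances into n_stages subsets, easiest (highest SNR) first."""
    if n_stages < 1:
        raise ValueError("n_stages must be >= 1")
    # Assumption: higher SNR means an easier (cleaner) training example.
    ranked = sorted(utts, key=lambda u: u[1], reverse=True)
    size, extra = divmod(len(ranked), n_stages)
    stages: List[List[Utterance]] = []
    start = 0
    for i in range(n_stages):
        # The first `extra` stages take one additional utterance each.
        end = start + size + (1 if i < extra else 0)
        stages.append(ranked[start:end])
        start = end
    return stages


def train_with_curriculum(
    stages: Sequence[Sequence[Utterance]],
    update: Callable[[Model, List[Utterance]], Model],
    model: Model,
) -> Model:
    """Call update(model, data) once per stage, easiest stage first.

    Assumption: each stage trains on all data seen so far (cumulative);
    the record does not say whether earlier subsets are retained.
    """
    seen: List[Utterance] = []
    for stage in stages:
        seen.extend(stage)
        model = update(model, list(seen))
    return model


if __name__ == "__main__":
    demo = [("a", 5.0), ("b", 20.0), ("c", 0.0), ("d", 15.0), ("e", 10.0)]
    for i, stage in enumerate(build_curriculum(demo, 2)):
        print(i, [utt_id for utt_id, _ in stage])
```

In a real system, `update` would stand in for one round of i-Vector extractor estimation or PLDA training; here it is left as a caller-supplied function so the sketch stays independent of any particular toolkit.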