Curriculum Learning Based Approaches for Noise Robust Speaker Recognition
Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise robust speaker recognition. We introduce CL-based approaches...
Published in: | IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210 |
---|---|
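Main authors: | Ranjan, Shivesh ; Hansen, John H. L. |
Format: | Article |
Language: | eng |
Subjects: | Curriculum learning (CL) ; Estimation ; Noise measurement ; noise robust ; Noise robustness ; probabilistic linear discriminant (PLDA) ; Rats ; Signal to noise ratio ; speaker verification ; Speech ; Training |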
container_end_page | 210 |
---|---|
container_issue | 1 |
container_start_page | 197 |
container_title | IEEE/ACM transactions on audio, speech, and language processing |
container_volume | 26 |
creator | Ranjan, Shivesh ; Hansen, John H. L. |
description | Performance of speaker identification (SID) systems is known to degrade rapidly in the presence of mismatch such as noise and channel degradations. This study introduces a novel class of curriculum learning (CL) based algorithms for noise-robust speaker recognition. We introduce CL-based approaches at two stages within a state-of-the-art speaker verification system: at the i-Vector extractor estimation and at the probabilistic linear discriminant analysis (PLDA) back-end. Our proposed CL-based approaches operate by categorizing the available training data into progressively more challenging subsets using a suitable difficulty criterion. Next, the corresponding training algorithms are initialized with the subset that is closest to a clean, noise-free set, and progressively move to subsets that are more challenging for training as the algorithms progress. We evaluate the performance of our proposed approaches on the noisy and severely degraded data from the DARPA RATS SID task, and show consistent and significant improvement across multiple test sets over a baseline SID framework with a standard i-Vector extractor and multisession PLDA-based back-end. We also construct a very challenging evaluation set by adding noise to the NIST SRE 2010 C5 extended condition trials, where our proposed CL-based PLDA is shown to offer significant improvements over a traditional PLDA-based back-end. |
doi_str_mv | 10.1109/TASLP.2017.2765832 |
format | Article |
eissn | 2329-9304 |
coden | ITASD8 |
publisher | IEEE |
tpages | 14 |
orcid | 0000-0002-7365-0253 ; 0000-0003-1382-9929 |
ieee_id | 8080267 |
linktohtml | https://ieeexplore.ieee.org/document/8080267 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2329-9290 |
ispartof | IEEE/ACM transactions on audio, speech, and language processing, 2018-01, Vol.26 (1), p.197-210 |
issn | 2329-9290 ; 2329-9304 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TASLP_2017_2765832 |
source | IEEE Electronic Library (IEL) |
subjects | Curriculum learning (CL) ; Estimation ; Noise measurement ; noise robust ; Noise robustness ; probabilistic linear discriminant (PLDA) ; Rats ; Signal to noise ratio ; speaker verification ; Speech ; Training |
title | Curriculum Learning Based Approaches for Noise Robust Speaker Recognition |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A39%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Curriculum%20Learning%20Based%20Approaches%20for%20Noise%20Robust%20Speaker%20Recognition&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Ranjan,%20Shivesh&rft.date=2018-01&rft.volume=26&rft.issue=1&rft.spage=197&rft.epage=210&rft.pages=197-210&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASLP.2017.2765832&rft_dat=%3Ccrossref_RIE%3E10_1109_TASLP_2017_2765832%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8080267&rfr_iscdi=true |
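
The description field above outlines the general curriculum-learning recipe: rank the training data by a difficulty criterion, split it into progressively harder subsets, and start training on the subset closest to clean data. The record does not say which criterion the authors use or whether earlier subsets are retained at later stages, so the following is only a minimal sketch under stated assumptions: difficulty is taken to be an estimated SNR in dB (higher is easier), the names `split_by_difficulty` and `curriculum_stages` are hypothetical, and the `cumulative` flag is an illustrative choice, not the paper's method.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical record layout: (utterance id, estimated SNR in dB).
Utterance = Tuple[str, float]


def split_by_difficulty(utts: Sequence[Utterance], n_subsets: int) -> List[List[Utterance]]:
    """Order utterances from easiest (highest SNR) to hardest, then cut
    the ordered list into n_subsets contiguous, near-equal parts."""
    if n_subsets < 1:
        raise ValueError("n_subsets must be at least 1")
    ordered = sorted(utts, key=lambda u: u[1], reverse=True)
    size, extra = divmod(len(ordered), n_subsets)
    subsets: List[List[Utterance]] = []
    start = 0
    for i in range(n_subsets):
        # The first `extra` subsets take one additional utterance each.
        end = start + size + (1 if i < extra else 0)
        subsets.append(ordered[start:end])
        start = end
    return subsets


def curriculum_stages(
    subsets: Sequence[Sequence[Utterance]], cumulative: bool = True
) -> Iterator[List[Utterance]]:
    """Yield the training pool for each stage, easiest subset first.
    With cumulative=True (an assumption), earlier subsets stay in the pool."""
    pool: List[Utterance] = []
    for subset in subsets:
        pool = pool + list(subset) if cumulative else list(subset)
        yield pool


if __name__ == "__main__":
    data = [("a", 20.0), ("b", 0.0), ("c", 10.0), ("d", 5.0)]
    for stage, pool in enumerate(curriculum_stages(split_by_difficulty(data, 2))):
        print(stage, [utt_id for utt_id, _ in pool])
```

In an actual system, each yielded pool would be handed to one training stage (for example, a few re-estimation iterations of the i-Vector extractor or the PLDA model) before moving on to the next, harder pool.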