Robust Speaker Diarization in a Multi-Speaker Environment Using Autocorrelation-based Noise Subtraction

This paper presents research on speaker diarization in multi-speaker environments. It examines the algorithms and the implementation of an offline speaker segmentation and indexing system for recorded speech data in which more than one speaker is usually present. Speaker diariza...


Bibliographic details
Main authors:	Mirrezaie, S.M.; Ahadi, S.M.; Kashi, A.
Format:	Conference proceeding
Language:	eng
Subjects:	Autocorrelation; Broadcasting; Clustering algorithms; Genetics; Indexing; meetings indexing; Noise robustness; noisy speech; Robust speaker diarization; speaker segmentation and clustering; Speech recognition; Streaming media; Working environment noise

container_end_page	296
container_issue
container_start_page	291
container_title
container_volume
creator	Mirrezaie, S.M. ; Ahadi, S.M. ; Kashi, A.
description	This paper presents research on speaker diarization in multi-speaker environments. It examines the algorithms and the implementation of an offline speaker segmentation and indexing system for recorded speech data in which more than one speaker is usually present. Speaker diarization is a well-studied topic in the domain of broadcast news recordings. Most of the proposed systems involve hierarchical clustering of the data, where the number of speakers and their identities are known a priori. Speaker diarization is the task of assigning a unique label to all speech segments in an audio stream that come from the same speaker. There are two key challenges: processing speed and robustness in the presence of noise. In this paper we address the robustness issue by using a method that has already proved successful in speech recognition applications. Using ANS (Autocorrelation-Based Noise Subtraction) for robust genetic algorithm-based speaker diarization, we compare the results with the baseline MFCC-based system in clean and noisy conditions.
doi_str_mv	10.1109/ISSPIT.2007.4458171
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4458171</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4458171</ieee_id><sourcerecordid>4458171</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-f268e2050a0d9e05e3daf0ec409528f58066b0b13bcdc4ba8ec1be616aeee3093</originalsourceid><addsrcrecordid>eNpVkMtOwzAURI0Aiar0C7rxD6RcP-I4y6oUGqk8RMq6spObypAmle0gwdfzKgtWozmjmcUQMmUwYwzyq6IsH4vNjANkMylTzTJ2QiZ5ppnkUjItUjj956U4IyPOFE8yLcUFmYTwAgAsU5IpNSK7p94OIdLygOYVPb12xrsPE13fUddRQ--GNrrkL152b8733R67SJ-D63Z0PsS-6r3H9qeUWBOwpve9C0jLwUZvqm9-Sc4b0wacHHVMNjfLzWKVrB9ui8V8nbgcYtJwpZFDCgbqHCFFUZsGsJKQp1w3qQalLFgmbFVX0hqNFbOomDKIKCAXYzL9nXVfYHvwbm_8-_b4lPgETupc5Q</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Robust Speaker Diarization in a Multi-Speaker Environment Using Autocorrelation-based Noise Subtraction</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Mirrezaie, S.M. ; Ahadi, S.M. ; Kashi, A.</creator><creatorcontrib>Mirrezaie, S.M. ; Ahadi, S.M. ; Kashi, A.</creatorcontrib><description>This paper shows research performed into the topic of speaker diarization for multi-speaker environment. It looks into the algorithms and the implementation of an offline speaker segmentation and indexing system for recorded speech data where usually more than one speaker is present. Speaker diarization is a well studied topic in the domain of broadcast news recordings. Most of the proposed systems involve hierarchical clustering of the data, where the number of speakers and their identities are known a priori. Speaker diarization is the task of assigning a unique label to all speech segments in an audio stream by the same speaker. There are two key challenges: processing speed and robustness in the presence of noise. 
In this paper we address the robustness issue by using a method already successful in speech recognition application. Using ANS (Autocorrelation-Based Noise Subtraction) for robust genetic algorithm-based speaker diarization, we compare the results with the baseline MFCC-based system in clean and noisy conditions.</description><identifier>ISSN: 2162-7843</identifier><identifier>ISBN: 9781424418343</identifier><identifier>ISBN: 1424418348</identifier><identifier>EISBN: 9781424418350</identifier><identifier>EISBN: 1424418356</identifier><identifier>DOI: 10.1109/ISSPIT.2007.4458171</identifier><language>eng</language><publisher>IEEE</publisher><subject>Autocorrelation ; Broadcasting ; Clustering algorithms ; Genetics ; Indexing ; meetings indexing ; Noise robustness ; noisy speech ; Robust speaker diarization ; speaker segmentation and clustering ; Speech recognition ; Streaming media ; Working environment noise</subject><ispartof>2007 IEEE International Symposium on Signal Processing and Information Technology, 2007, p.291-296</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4458171$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4458171$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mirrezaie, S.M.</creatorcontrib><creatorcontrib>Ahadi, S.M.</creatorcontrib><creatorcontrib>Kashi, A.</creatorcontrib><title>Robust Speaker Diarization in a Multi-Speaker Environment Using Autocorrelation-based Noise Subtraction</title><title>2007 IEEE International Symposium on Signal Processing and Information Technology</title><addtitle>ISSPIT</addtitle><description>This paper shows research performed 
into the topic of speaker diarization for multi-speaker environment. It looks into the algorithms and the implementation of an offline speaker segmentation and indexing system for recorded speech data where usually more than one speaker is present. Speaker diarization is a well studied topic in the domain of broadcast news recordings. Most of the proposed systems involve hierarchical clustering of the data, where the number of speakers and their identities are known a priori. Speaker diarization is the task of assigning a unique label to all speech segments in an audio stream by the same speaker. There are two key challenges: processing speed and robustness in the presence of noise. In this paper we address the robustness issue by using a method already successful in speech recognition application. Using ANS (Autocorrelation-Based Noise Subtraction) for robust genetic algorithm-based speaker diarization, we compare the results with the baseline MFCC-based system in clean and noisy conditions.</description><subject>Autocorrelation</subject><subject>Broadcasting</subject><subject>Clustering algorithms</subject><subject>Genetics</subject><subject>Indexing</subject><subject>meetings indexing</subject><subject>Noise robustness</subject><subject>noisy speech</subject><subject>Robust speaker diarization</subject><subject>speaker segmentation and clustering</subject><subject>Speech recognition</subject><subject>Streaming media</subject><subject>Working environment 
noise</subject><issn>2162-7843</issn><isbn>9781424418343</isbn><isbn>1424418348</isbn><isbn>9781424418350</isbn><isbn>1424418356</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2007</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpVkMtOwzAURI0Aiar0C7rxD6RcP-I4y6oUGqk8RMq6spObypAmle0gwdfzKgtWozmjmcUQMmUwYwzyq6IsH4vNjANkMylTzTJ2QiZ5ppnkUjItUjj956U4IyPOFE8yLcUFmYTwAgAsU5IpNSK7p94OIdLygOYVPb12xrsPE13fUddRQ--GNrrkL152b8733R67SJ-D63Z0PsS-6r3H9qeUWBOwpve9C0jLwUZvqm9-Sc4b0wacHHVMNjfLzWKVrB9ui8V8nbgcYtJwpZFDCgbqHCFFUZsGsJKQp1w3qQalLFgmbFVX0hqNFbOomDKIKCAXYzL9nXVfYHvwbm_8-_b4lPgETupc5Q</recordid><startdate>200712</startdate><enddate>200712</enddate><creator>Mirrezaie, S.M.</creator><creator>Ahadi, S.M.</creator><creator>Kashi, A.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200712</creationdate><title>Robust Speaker Diarization in a Multi-Speaker Environment Using Autocorrelation-based Noise Subtraction</title><author>Mirrezaie, S.M. ; Ahadi, S.M. 
; Kashi, A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-f268e2050a0d9e05e3daf0ec409528f58066b0b13bcdc4ba8ec1be616aeee3093</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Autocorrelation</topic><topic>Broadcasting</topic><topic>Clustering algorithms</topic><topic>Genetics</topic><topic>Indexing</topic><topic>meetings indexing</topic><topic>Noise robustness</topic><topic>noisy speech</topic><topic>Robust speaker diarization</topic><topic>speaker segmentation and clustering</topic><topic>Speech recognition</topic><topic>Streaming media</topic><topic>Working environment noise</topic><toplevel>online_resources</toplevel><creatorcontrib>Mirrezaie, S.M.</creatorcontrib><creatorcontrib>Ahadi, S.M.</creatorcontrib><creatorcontrib>Kashi, A.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mirrezaie, S.M.</au><au>Ahadi, S.M.</au><au>Kashi, A.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Robust Speaker Diarization in a Multi-Speaker Environment Using Autocorrelation-based Noise Subtraction</atitle><btitle>2007 IEEE International Symposium on Signal Processing and Information 
Technology</btitle><stitle>ISSPIT</stitle><date>2007-12</date><risdate>2007</risdate><spage>291</spage><epage>296</epage><pages>291-296</pages><issn>2162-7843</issn><isbn>9781424418343</isbn><isbn>1424418348</isbn><eisbn>9781424418350</eisbn><eisbn>1424418356</eisbn><abstract>This paper shows research performed into the topic of speaker diarization for multi-speaker environment. It looks into the algorithms and the implementation of an offline speaker segmentation and indexing system for recorded speech data where usually more than one speaker is present. Speaker diarization is a well studied topic in the domain of broadcast news recordings. Most of the proposed systems involve hierarchical clustering of the data, where the number of speakers and their identities are known a priori. Speaker diarization is the task of assigning a unique label to all speech segments in an audio stream by the same speaker. There are two key challenges: processing speed and robustness in the presence of noise. In this paper we address the robustness issue by using a method already successful in speech recognition application. Using ANS (Autocorrelation-Based Noise Subtraction) for robust genetic algorithm-based speaker diarization, we compare the results with the baseline MFCC-based system in clean and noisy conditions.</abstract><pub>IEEE</pub><doi>10.1109/ISSPIT.2007.4458171</doi><tpages>6</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 2162-7843
ispartof	2007 IEEE International Symposium on Signal Processing and Information Technology, 2007, p.291-296
issn	2162-7843
language	eng
recordid	cdi_ieee_primary_4458171
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Autocorrelation ; Broadcasting ; Clustering algorithms ; Genetics ; Indexing ; meetings indexing ; Noise robustness ; noisy speech ; Robust speaker diarization ; speaker segmentation and clustering ; Speech recognition ; Streaming media ; Working environment noise
title	Robust Speaker Diarization in a Multi-Speaker Environment Using Autocorrelation-based Noise Subtraction
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T00%3A38%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Robust%20Speaker%20Diarization%20in%20a%20Multi-Speaker%20Environment%20Using%20Autocorrelation-based%20Noise%20Subtraction&rft.btitle=2007%20IEEE%20International%20Symposium%20on%20Signal%20Processing%20and%20Information%20Technology&rft.au=Mirrezaie,%20S.M.&rft.date=2007-12&rft.spage=291&rft.epage=296&rft.pages=291-296&rft.issn=2162-7843&rft.isbn=9781424418343&rft.isbn_list=1424418348&rft_id=info:doi/10.1109/ISSPIT.2007.4458171&rft_dat=%3Cieee_6IE%3E4458171%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424418350&rft.eisbn_list=1424418356&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4458171&rfr_iscdi=true