Performance evaluation for transform domain model-based single-channel speech separation

It has already been demonstrated that the selected features have a much larger effect on overall accuracy in speech applications than the selected generative models do. In this paper, we propose a subband perceptually weighted transformation (SPWT) applied to the magnitude spectrum to improve the...

Bibliographic details
Main authors:	Mowlaee, P., Sayadiyan, A.
Format:	Conference proceeding
Language:	eng
Subjects:	Computational complexity; Crosstalk; Hidden Markov models; magnitude spectrum; Noise measurement; Performance evaluation; Power harmonic filters; Spectral Distortion; Speech analysis; Statistical analysis; Time frequency analysis; Transform domain; Vector quantization

container_end_page	942
container_issue
container_start_page	935
container_title
container_volume
creator	Mowlaee, P. ; Sayadiyan, A.
description	It has already been demonstrated that the selected features have a much larger effect on overall accuracy in speech applications than the selected generative models do. In this paper, we propose a subband perceptually weighted transformation (SPWT) applied to the magnitude spectrum to improve the performance of single-channel speech separation (SCSS). In particular, we compare three feature types, namely the log-spectrum, the magnitude spectrum, and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the speakers' quantized feature vectors, on which the main separation evaluation is performed. The simulation results show that the proposed transformation is an attractive candidate for improving the separation performance of model-based SCSS. It is also observed that the proposed feature can yield a lower error bound in terms of spectral distortion (SD) as well as a higher SSNR than the other features.
doi_str_mv	10.1109/AICCSA.2009.5069444
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5069444</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5069444</ieee_id><sourcerecordid>5069444</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-4044278e77a00462c7a78e5db8c9792370145c1278a92099c737ffac7b2277763</originalsourceid><addsrcrecordid>eNo9kNtqwzAMQL1LYW3XL-iLfyCdLTtR_FjCLoXCBttgb0V1lDUjcUrcDfb3S3fTi5COOEgSYq7VQmvlrparonhcLkApt0hV5qy1J2KiLVhrcpWZUzEGnekkNUadiZnD_I9hev7PAEZicnQ4pSCHCzGL8U0NYVPIjR2Llwfuq65vKXiW_EHNOx3qLsihJw89hXiEsuxaqoNsu5KbZEuRSxnr8Npw4ncUAjcy7pn9TkbeU_9tuBSjiprIs988Fc8310_FXbK-v10Vy3VSa0wPiVXWAuaMSMNSGXikoUrLbe4dOjCotE29HkbIwXCGR4NVRR63AIiYmamY_3hrZt7s-7ql_nPz-zHzBXADWNw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Performance evaluation for transform domain model-based single-channel speech separation</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Mowlaee, P. ; Sayadiyan, A.</creator><creatorcontrib>Mowlaee, P. ; Sayadiyan, A.</creatorcontrib><description>It is already demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. 
The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</description><identifier>ISSN: 2161-5322</identifier><identifier>ISBN: 9781424438075</identifier><identifier>ISBN: 1424438071</identifier><identifier>EISSN: 2161-5330</identifier><identifier>EISBN: 1424438063</identifier><identifier>EISBN: 9781424438068</identifier><identifier>DOI: 10.1109/AICCSA.2009.5069444</identifier><identifier>LCCN: 2009900282</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computational complexity ; Crosstalk ; Hidden Markov models ; magnitude spectrum ; Noise measurement ; Performance evaluation ; Power harmonic filters ; Spectral Distortion ; Speech analysis ; Statistical analysis ; Time frequency analysis ; Transform domain ; Vector quantization</subject><ispartof>2009 IEEE/ACS International Conference on Computer Systems and Applications, 2009, p.935-942</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5069444$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5069444$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mowlaee, P.</creatorcontrib><creatorcontrib>Sayadiyan, A.</creatorcontrib><title>Performance evaluation for transform domain model-based single-channel speech separation</title><title>2009 IEEE/ACS International Conference on Computer Systems and 
Applications</title><addtitle>AICCSA</addtitle><description>It is already demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</description><subject>Computational complexity</subject><subject>Crosstalk</subject><subject>Hidden Markov models</subject><subject>magnitude spectrum</subject><subject>Noise measurement</subject><subject>Performance evaluation</subject><subject>Power harmonic filters</subject><subject>Spectral Distortion</subject><subject>Speech analysis</subject><subject>Statistical analysis</subject><subject>Time frequency analysis</subject><subject>Transform domain</subject><subject>Vector 
quantization</subject><issn>2161-5322</issn><issn>2161-5330</issn><isbn>9781424438075</isbn><isbn>1424438071</isbn><isbn>1424438063</isbn><isbn>9781424438068</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2009</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kNtqwzAMQL1LYW3XL-iLfyCdLTtR_FjCLoXCBttgb0V1lDUjcUrcDfb3S3fTi5COOEgSYq7VQmvlrparonhcLkApt0hV5qy1J2KiLVhrcpWZUzEGnekkNUadiZnD_I9hev7PAEZicnQ4pSCHCzGL8U0NYVPIjR2Llwfuq65vKXiW_EHNOx3qLsihJw89hXiEsuxaqoNsu5KbZEuRSxnr8Npw4ncUAjcy7pn9TkbeU_9tuBSjiprIs988Fc8310_FXbK-v10Vy3VSa0wPiVXWAuaMSMNSGXikoUrLbe4dOjCotE29HkbIwXCGR4NVRR63AIiYmamY_3hrZt7s-7ql_nPz-zHzBXADWNw</recordid><startdate>200905</startdate><enddate>200905</enddate><creator>Mowlaee, P.</creator><creator>Sayadiyan, A.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200905</creationdate><title>Performance evaluation for transform domain model-based single-channel speech separation</title><author>Mowlaee, P. 
; Sayadiyan, A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-4044278e77a00462c7a78e5db8c9792370145c1278a92099c737ffac7b2277763</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Computational complexity</topic><topic>Crosstalk</topic><topic>Hidden Markov models</topic><topic>magnitude spectrum</topic><topic>Noise measurement</topic><topic>Performance evaluation</topic><topic>Power harmonic filters</topic><topic>Spectral Distortion</topic><topic>Speech analysis</topic><topic>Statistical analysis</topic><topic>Time frequency analysis</topic><topic>Transform domain</topic><topic>Vector quantization</topic><toplevel>online_resources</toplevel><creatorcontrib>Mowlaee, P.</creatorcontrib><creatorcontrib>Sayadiyan, A.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library Online</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mowlaee, P.</au><au>Sayadiyan, A.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Performance evaluation for transform domain model-based single-channel speech separation</atitle><btitle>2009 IEEE/ACS International Conference on Computer Systems and Applications</btitle><stitle>AICCSA</stitle><date>2009-05</date><risdate>2009</risdate><spage>935</spage><epage>942</epage><pages>935-942</pages><issn>2161-5322</issn><eissn>2161-5330</eissn><isbn>9781424438075</isbn><isbn>1424438071</isbn><eisbn>1424438063</eisbn><eisbn>9781424438068</eisbn><abstract>It is already 
demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</abstract><pub>IEEE</pub><doi>10.1109/AICCSA.2009.5069444</doi><tpages>8</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 2161-5322
ispartof	2009 IEEE/ACS International Conference on Computer Systems and Applications, 2009, p.935-942
issn	2161-5322 2161-5330
language	eng
recordid	cdi_ieee_primary_5069444
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Computational complexity; Crosstalk; Hidden Markov models; magnitude spectrum; Noise measurement; Performance evaluation; Power harmonic filters; Spectral Distortion; Speech analysis; Statistical analysis; Time frequency analysis; Transform domain; Vector quantization
title	Performance evaluation for transform domain model-based single-channel speech separation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T13%3A49%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Performance%20evaluation%20for%20transform%20domain%20model-based%20single-channel%20speech%20separation&rft.btitle=2009%20IEEE/ACS%20International%20Conference%20on%20Computer%20Systems%20and%20Applications&rft.au=Mowlaee,%20P.&rft.date=2009-05&rft.spage=935&rft.epage=942&rft.pages=935-942&rft.issn=2161-5322&rft.eissn=2161-5330&rft.isbn=9781424438075&rft.isbn_list=1424438071&rft_id=info:doi/10.1109/AICCSA.2009.5069444&rft_dat=%3Cieee_6IE%3E5069444%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424438063&rft.eisbn_list=9781424438068&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5069444&rfr_iscdi=true