Performance evaluation for transform domain model-based single-channel speech separation

It has already been demonstrated that the choice of features has a much larger effect on the overall accuracy of speech applications than the choice of generative model. In this paper, we propose a subband perceptually weighted transformation (SPWT), applied to the magnitude spectrum, to improve the...

Bibliographic details
Main authors: Mowlaee, P., Sayadiyan, A.
Format: Conference proceeding
Language: eng
container_end_page 942
container_issue
container_start_page 935
container_title
container_volume
creator Mowlaee, P.
Sayadiyan, A.
description It has already been demonstrated that the choice of features has a much larger effect on the overall accuracy of speech applications than the choice of generative model. In this paper, we propose a subband perceptually weighted transformation (SPWT), applied to the magnitude spectrum, to improve the performance of single-channel speech separation (SCSS). In particular, we compare three feature types: the log-spectrum, the magnitude spectrum, and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the speakers' quantized feature vectors, with which the main evaluation of the separation is performed. The simulation results show that the proposed transformation is an attractive candidate for improving the separation performance of model-based SCSS. It is also observed that the proposed feature can yield a lower error bound in terms of spectral distortion (SD), as well as a higher SSNR, than the other features.
doi_str_mv 10.1109/AICCSA.2009.5069444
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5069444</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5069444</ieee_id><sourcerecordid>5069444</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-4044278e77a00462c7a78e5db8c9792370145c1278a92099c737ffac7b2277763</originalsourceid><addsrcrecordid>eNo9kNtqwzAMQL1LYW3XL-iLfyCdLTtR_FjCLoXCBttgb0V1lDUjcUrcDfb3S3fTi5COOEgSYq7VQmvlrparonhcLkApt0hV5qy1J2KiLVhrcpWZUzEGnekkNUadiZnD_I9hev7PAEZicnQ4pSCHCzGL8U0NYVPIjR2Llwfuq65vKXiW_EHNOx3qLsihJw89hXiEsuxaqoNsu5KbZEuRSxnr8Npw4ncUAjcy7pn9TkbeU_9tuBSjiprIs988Fc8310_FXbK-v10Vy3VSa0wPiVXWAuaMSMNSGXikoUrLbe4dOjCotE29HkbIwXCGR4NVRR63AIiYmamY_3hrZt7s-7ql_nPz-zHzBXADWNw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Performance evaluation for transform domain model-based single-channel speech separation</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Mowlaee, P. ; Sayadiyan, A.</creator><creatorcontrib>Mowlaee, P. ; Sayadiyan, A.</creatorcontrib><description>It is already demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. 
The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</description><identifier>ISSN: 2161-5322</identifier><identifier>ISBN: 9781424438075</identifier><identifier>ISBN: 1424438071</identifier><identifier>EISSN: 2161-5330</identifier><identifier>EISBN: 1424438063</identifier><identifier>EISBN: 9781424438068</identifier><identifier>DOI: 10.1109/AICCSA.2009.5069444</identifier><identifier>LCCN: 2009900282</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computational complexity ; Crosstalk ; Hidden Markov models ; magnitude spectrum ; Noise measurement ; Performance evaluation ; Power harmonic filters ; Spectral Distortion ; Speech analysis ; Statistical analysis ; Time frequency analysis ; Transform domain ; Vector quantization</subject><ispartof>2009 IEEE/ACS International Conference on Computer Systems and Applications, 2009, p.935-942</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5069444$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5069444$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mowlaee, P.</creatorcontrib><creatorcontrib>Sayadiyan, A.</creatorcontrib><title>Performance evaluation for transform domain model-based single-channel speech separation</title><title>2009 IEEE/ACS International Conference on Computer Systems and 
Applications</title><addtitle>AICCSA</addtitle><description>It is already demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</description><subject>Computational complexity</subject><subject>Crosstalk</subject><subject>Hidden Markov models</subject><subject>magnitude spectrum</subject><subject>Noise measurement</subject><subject>Performance evaluation</subject><subject>Power harmonic filters</subject><subject>Spectral Distortion</subject><subject>Speech analysis</subject><subject>Statistical analysis</subject><subject>Time frequency analysis</subject><subject>Transform domain</subject><subject>Vector 
quantization</subject><issn>2161-5322</issn><issn>2161-5330</issn><isbn>9781424438075</isbn><isbn>1424438071</isbn><isbn>1424438063</isbn><isbn>9781424438068</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2009</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kNtqwzAMQL1LYW3XL-iLfyCdLTtR_FjCLoXCBttgb0V1lDUjcUrcDfb3S3fTi5COOEgSYq7VQmvlrparonhcLkApt0hV5qy1J2KiLVhrcpWZUzEGnekkNUadiZnD_I9hev7PAEZicnQ4pSCHCzGL8U0NYVPIjR2Llwfuq65vKXiW_EHNOx3qLsihJw89hXiEsuxaqoNsu5KbZEuRSxnr8Npw4ncUAjcy7pn9TkbeU_9tuBSjiprIs988Fc8310_FXbK-v10Vy3VSa0wPiVXWAuaMSMNSGXikoUrLbe4dOjCotE29HkbIwXCGR4NVRR63AIiYmamY_3hrZt7s-7ql_nPz-zHzBXADWNw</recordid><startdate>200905</startdate><enddate>200905</enddate><creator>Mowlaee, P.</creator><creator>Sayadiyan, A.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200905</creationdate><title>Performance evaluation for transform domain model-based single-channel speech separation</title><author>Mowlaee, P. 
; Sayadiyan, A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-4044278e77a00462c7a78e5db8c9792370145c1278a92099c737ffac7b2277763</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Computational complexity</topic><topic>Crosstalk</topic><topic>Hidden Markov models</topic><topic>magnitude spectrum</topic><topic>Noise measurement</topic><topic>Performance evaluation</topic><topic>Power harmonic filters</topic><topic>Spectral Distortion</topic><topic>Speech analysis</topic><topic>Statistical analysis</topic><topic>Time frequency analysis</topic><topic>Transform domain</topic><topic>Vector quantization</topic><toplevel>online_resources</toplevel><creatorcontrib>Mowlaee, P.</creatorcontrib><creatorcontrib>Sayadiyan, A.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library Online</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mowlaee, P.</au><au>Sayadiyan, A.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Performance evaluation for transform domain model-based single-channel speech separation</atitle><btitle>2009 IEEE/ACS International Conference on Computer Systems and Applications</btitle><stitle>AICCSA</stitle><date>2009-05</date><risdate>2009</risdate><spage>935</spage><epage>942</epage><pages>935-942</pages><issn>2161-5322</issn><eissn>2161-5330</eissn><isbn>9781424438075</isbn><isbn>1424438071</isbn><eisbn>1424438063</eisbn><eisbn>9781424438068</eisbn><abstract>It is already 
demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</abstract><pub>IEEE</pub><doi>10.1109/AICCSA.2009.5069444</doi><tpages>8</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2161-5322
ispartof 2009 IEEE/ACS International Conference on Computer Systems and Applications, 2009, p.935-942
issn 2161-5322
2161-5330
language eng
recordid cdi_ieee_primary_5069444
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Computational complexity
Crosstalk
Hidden Markov models
magnitude spectrum
Noise measurement
Performance evaluation
Power harmonic filters
Spectral Distortion
Speech analysis
Statistical analysis
Time frequency analysis
Transform domain
Vector quantization
title Performance evaluation for transform domain model-based single-channel speech separation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T13%3A49%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Performance%20evaluation%20for%20transform%20domain%20model-based%20single-channel%20speech%20separation&rft.btitle=2009%20IEEE/ACS%20International%20Conference%20on%20Computer%20Systems%20and%20Applications&rft.au=Mowlaee,%20P.&rft.date=2009-05&rft.spage=935&rft.epage=942&rft.pages=935-942&rft.issn=2161-5322&rft.eissn=2161-5330&rft.isbn=9781424438075&rft.isbn_list=1424438071&rft_id=info:doi/10.1109/AICCSA.2009.5069444&rft_dat=%3Cieee_6IE%3E5069444%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424438063&rft.eisbn_list=9781424438068&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5069444&rfr_iscdi=true