Performance evaluation for transform domain model-based single-channel speech separation
It has already been demonstrated that the selected features have a much larger effect on the overall accuracy of speech applications than the selected generative models. In this paper, we propose a subband perceptually weighted transformation (SPWT), applied to the magnitude spectrum, to improve the...
Main authors: | Mowlaee, P., Sayadiyan, A. |
---|---|
Format: | Conference proceedings |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | 942 |
---|---|
container_issue | |
container_start_page | 935 |
container_title | |
container_volume | |
creator | Mowlaee, P. ; Sayadiyan, A. |
description | It has already been demonstrated that the selected features have a much larger effect on the overall accuracy of speech applications than the selected generative models. In this paper, we propose a subband perceptually weighted transformation (SPWT), applied to the magnitude spectrum, to improve the performance of single-channel speech separation (SCSS). In particular, we compare three feature types, namely the log-spectrum, the magnitude spectrum, and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the speakers' quantized feature vectors, on which the main separation evaluation is performed. The simulation results show that the proposed transformation is an attractive candidate for improving the separation performance of model-based SCSS. It is also observed that the proposed feature yields a lower error bound in terms of spectral distortion (SD), as well as a higher SSNR, than the other features. |
doi_str_mv | 10.1109/AICCSA.2009.5069444 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5069444</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5069444</ieee_id><sourcerecordid>5069444</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-4044278e77a00462c7a78e5db8c9792370145c1278a92099c737ffac7b2277763</originalsourceid><addsrcrecordid>eNo9kNtqwzAMQL1LYW3XL-iLfyCdLTtR_FjCLoXCBttgb0V1lDUjcUrcDfb3S3fTi5COOEgSYq7VQmvlrparonhcLkApt0hV5qy1J2KiLVhrcpWZUzEGnekkNUadiZnD_I9hev7PAEZicnQ4pSCHCzGL8U0NYVPIjR2Llwfuq65vKXiW_EHNOx3qLsihJw89hXiEsuxaqoNsu5KbZEuRSxnr8Npw4ncUAjcy7pn9TkbeU_9tuBSjiprIs988Fc8310_FXbK-v10Vy3VSa0wPiVXWAuaMSMNSGXikoUrLbe4dOjCotE29HkbIwXCGR4NVRR63AIiYmamY_3hrZt7s-7ql_nPz-zHzBXADWNw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Performance evaluation for transform domain model-based single-channel speech separation</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Mowlaee, P. ; Sayadiyan, A.</creator><creatorcontrib>Mowlaee, P. ; Sayadiyan, A.</creatorcontrib><description>It is already demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. 
The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</description><identifier>ISSN: 2161-5322</identifier><identifier>ISBN: 9781424438075</identifier><identifier>ISBN: 1424438071</identifier><identifier>EISSN: 2161-5330</identifier><identifier>EISBN: 1424438063</identifier><identifier>EISBN: 9781424438068</identifier><identifier>DOI: 10.1109/AICCSA.2009.5069444</identifier><identifier>LCCN: 2009900282</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computational complexity ; Crosstalk ; Hidden Markov models ; magnitude spectrum ; Noise measurement ; Performance evaluation ; Power harmonic filters ; Spectral Distortion ; Speech analysis ; Statistical analysis ; Time frequency analysis ; Transform domain ; Vector quantization</subject><ispartof>2009 IEEE/ACS International Conference on Computer Systems and Applications, 2009, p.935-942</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5069444$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5069444$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mowlaee, P.</creatorcontrib><creatorcontrib>Sayadiyan, A.</creatorcontrib><title>Performance evaluation for transform domain model-based single-channel speech separation</title><title>2009 IEEE/ACS International Conference on Computer Systems and 
Applications</title><addtitle>AICCSA</addtitle><description>It is already demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</description><subject>Computational complexity</subject><subject>Crosstalk</subject><subject>Hidden Markov models</subject><subject>magnitude spectrum</subject><subject>Noise measurement</subject><subject>Performance evaluation</subject><subject>Power harmonic filters</subject><subject>Spectral Distortion</subject><subject>Speech analysis</subject><subject>Statistical analysis</subject><subject>Time frequency analysis</subject><subject>Transform domain</subject><subject>Vector 
quantization</subject><issn>2161-5322</issn><issn>2161-5330</issn><isbn>9781424438075</isbn><isbn>1424438071</isbn><isbn>1424438063</isbn><isbn>9781424438068</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2009</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kNtqwzAMQL1LYW3XL-iLfyCdLTtR_FjCLoXCBttgb0V1lDUjcUrcDfb3S3fTi5COOEgSYq7VQmvlrparonhcLkApt0hV5qy1J2KiLVhrcpWZUzEGnekkNUadiZnD_I9hev7PAEZicnQ4pSCHCzGL8U0NYVPIjR2Llwfuq65vKXiW_EHNOx3qLsihJw89hXiEsuxaqoNsu5KbZEuRSxnr8Npw4ncUAjcy7pn9TkbeU_9tuBSjiprIs988Fc8310_FXbK-v10Vy3VSa0wPiVXWAuaMSMNSGXikoUrLbe4dOjCotE29HkbIwXCGR4NVRR63AIiYmamY_3hrZt7s-7ql_nPz-zHzBXADWNw</recordid><startdate>200905</startdate><enddate>200905</enddate><creator>Mowlaee, P.</creator><creator>Sayadiyan, A.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200905</creationdate><title>Performance evaluation for transform domain model-based single-channel speech separation</title><author>Mowlaee, P. 
; Sayadiyan, A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-4044278e77a00462c7a78e5db8c9792370145c1278a92099c737ffac7b2277763</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Computational complexity</topic><topic>Crosstalk</topic><topic>Hidden Markov models</topic><topic>magnitude spectrum</topic><topic>Noise measurement</topic><topic>Performance evaluation</topic><topic>Power harmonic filters</topic><topic>Spectral Distortion</topic><topic>Speech analysis</topic><topic>Statistical analysis</topic><topic>Time frequency analysis</topic><topic>Transform domain</topic><topic>Vector quantization</topic><toplevel>online_resources</toplevel><creatorcontrib>Mowlaee, P.</creatorcontrib><creatorcontrib>Sayadiyan, A.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library Online</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mowlaee, P.</au><au>Sayadiyan, A.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Performance evaluation for transform domain model-based single-channel speech separation</atitle><btitle>2009 IEEE/ACS International Conference on Computer Systems and Applications</btitle><stitle>AICCSA</stitle><date>2009-05</date><risdate>2009</risdate><spage>935</spage><epage>942</epage><pages>935-942</pages><issn>2161-5322</issn><eissn>2161-5330</eissn><isbn>9781424438075</isbn><isbn>1424438071</isbn><eisbn>1424438063</eisbn><eisbn>9781424438068</eisbn><abstract>It is already 
demonstrated that selected features have a much larger effect to the overall performance in speech applications accuracy than the selected generative models have. In this paper, we propose subband perceptually weighted transformation (SPWT) applied on magnitude spectrum to improve the performance of single-channel separation scenario (SCSS). In particular, we compare three feature types namely, log-spectrum, magnitude spectrum and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of the lower error bound. At the core of this approach are two trained codebooks of the quantized feature vectors of speakers, whereby the main evaluation for separation is performed. The simulation results show that the proposed transformation offers an attractive candidate to improve the separation performance of model-based SCSS. It is also observed that the proposed feature can result in a lower-error bound in terms of the spectral distortion (SD) as well as higher SSNR in comparison with other features.</abstract><pub>IEEE</pub><doi>10.1109/AICCSA.2009.5069444</doi><tpages>8</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2161-5322 |
ispartof | 2009 IEEE/ACS International Conference on Computer Systems and Applications, 2009, p.935-942 |
issn | 2161-5322 ; 2161-5330 |
language | eng |
recordid | cdi_ieee_primary_5069444 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Computational complexity ; Crosstalk ; Hidden Markov models ; magnitude spectrum ; Noise measurement ; Performance evaluation ; Power harmonic filters ; Spectral Distortion ; Speech analysis ; Statistical analysis ; Time frequency analysis ; Transform domain ; Vector quantization |
title | Performance evaluation for transform domain model-based single-channel speech separation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T13%3A49%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Performance%20evaluation%20for%20transform%20domain%20model-based%20single-channel%20speech%20separation&rft.btitle=2009%20IEEE/ACS%20International%20Conference%20on%20Computer%20Systems%20and%20Applications&rft.au=Mowlaee,%20P.&rft.date=2009-05&rft.spage=935&rft.epage=942&rft.pages=935-942&rft.issn=2161-5322&rft.eissn=2161-5330&rft.isbn=9781424438075&rft.isbn_list=1424438071&rft_id=info:doi/10.1109/AICCSA.2009.5069444&rft_dat=%3Cieee_6IE%3E5069444%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424438063&rft.eisbn_list=9781424438068&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5069444&rfr_iscdi=true |