Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra

We present an approach for separating two speech signals when only a single recording of their linear mixture is available. The log spectra of the sources are estimated from the mixture's log spectrum using a minimum mean square error (MMSE) approach. The estimation is obtained from the assumpt...

Bibliographic details
Main authors:	Radfar, M.H. ; Dansereau, R.M.
Format:	Conference proceeding
Language:	eng
Subjects:	Estimation error ; Filtering ; Filters ; Mean square error methods ; Probability density function ; Source separation ; Speech coding ; Speech processing ; State estimation ; Systems engineering and theory
Online access:	Order full text

container_end_page	132
container_issue
container_start_page	128
container_title
container_volume
creator	Radfar, M.H. ; Dansereau, R.M.
description	We present an approach for separating two speech signals when only a single recording of their linear mixture is available. The log spectra of the sources are estimated from the mixture's log spectrum using a minimum mean square error (MMSE) approach. The estimate is derived under the assumption that the sources are modelled by a set of Gaussian subsources that are related to the mixture through the MIXMAX approximation. The resulting estimator has a closed form and is expressed in terms of the means and variances of the Gaussian subsources. To obtain the two subsources most likely to have generated the mixture, we use the estimation-detection technique. We also show that binary mask filtering, which has been used empirically and without mathematical justification in speech separation techniques, is in fact a simplified form of the MMSE estimator. The proposed technique is compared with the binary mask on male-male, female-female, and female-male mixtures. Experimental results in terms of segmental SNR show that the MMSE estimator outperforms binary mask filtering.
doi_str_mv	10.1109/MLSP.2007.4414294
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4414294</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4414294</ieee_id><sourcerecordid>4414294</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-fd8d2bb8b9fb2ec13a1c4ac897786c714e1532c6440a1148f9ad9f1d69d4deaa3</originalsourceid><addsrcrecordid>eNo1UD1PwzAUNF8SpfQHIBZvTAl-jhPbI6rKh5QKpHRgq16cl9aoTYqTDPx7glqmu9OdTqdj7A5EDCDs4zIvPmIphI6VAiWtOmMzq81IR51mWXLOJjLRJrLSfF6wm38jtZdsAmkKkUwVXLNZ130JIUBnoysmzBW-2eyIz7fYNLTjxYHIbXlBBwzY-7bhQzcm-NI3fj_s-ZKw4cX3gIH4IoQ28EXX-_0x2ta8aIfgqHvgebv5K3N9wFt2VeOuo9kJp2z1vFjNX6P8_eVt_pRH3oo-qitTybI0pa1LSQ4SBKfQGau1yZwGRZAm0mVKCQRQprZY2RqqzFaqIsRkyu6PtZ6I1ocwrgo_69NdyS_4AFr1</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Radfar, M.H. ; Dansereau, R.M.</creator><creatorcontrib>Radfar, M.H. ; Dansereau, R.M.</creatorcontrib><description>We present an approach for separating two speech signals when only one single recording of their linear mixture is available. The log spectra of the sources are estimated from the mixture's log spectrum using minimum mean square error (MMSE) approach. The estimation is obtained from the assumption that the sources are modelled using a set of Gaussian subsources which are related to the mixture using MIXMAX approximation. The resulting estimator has a closed form and is expressed using the mean and variance of Gaussian subsources. In order to obtain the two most likely subsources which generate the mixture, we use the estimation-detection technique. 
We also show that the binary mask filtering which has been empirically - and with no mathematical justification - used in speech separation techniques is, in fact, a simplified form of the MMSE estimator. The proposed technique is compared with the binary mask when the input consists of male-male, female-female, and female-male mixtures. The experimental results in terms of segmental SNR show that the MMSE estimator outperforms binary mask filtering.</description><identifier>ISSN: 1551-2541</identifier><identifier>ISBN: 1424415659</identifier><identifier>ISBN: 9781424415656</identifier><identifier>EISSN: 2378-928X</identifier><identifier>EISBN: 9781424415663</identifier><identifier>EISBN: 1424415667</identifier><identifier>DOI: 10.1109/MLSP.2007.4414294</identifier><language>eng</language><publisher>IEEE</publisher><subject>Estimation error ; Filtering ; Filters ; Mean square error methods ; Probability density function ; Source separation ; Speech coding ; Speech processing ; State estimation ; Systems engineering and theory</subject><ispartof>2007 IEEE Workshop on Machine Learning for Signal Processing, 2007, p.128-132</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4414294$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4414294$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Radfar, M.H.</creatorcontrib><creatorcontrib>Dansereau, R.M.</creatorcontrib><title>Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra</title><title>2007 IEEE Workshop on Machine Learning for Signal Processing</title><addtitle>MLSP</addtitle><description>We 
present an approach for separating two speech signals when only one single recording of their linear mixture is available. The log spectra of the sources are estimated from the mixture's log spectrum using minimum mean square error (MMSE) approach. The estimation is obtained from the assumption that the sources are modelled using a set of Gaussian subsources which are related to the mixture using MIXMAX approximation. The resulting estimator has a closed form and is expressed using the mean and variance of Gaussian subsources. In order to obtain the two most likely subsources which generate the mixture, we use the estimation-detection technique. We also show that the binary mask filtering which has been empirically - and with no mathematical justification - used in speech separation techniques is, in fact, a simplified form of the MMSE estimator. The proposed technique is compared with the binary mask when the input consists of male-male, female-female, and female-male mixtures. The experimental results in terms of segmental SNR show that the MMSE estimator outperforms binary mask filtering.</description><subject>Estimation error</subject><subject>Filtering</subject><subject>Filters</subject><subject>Mean square error methods</subject><subject>Probability density function</subject><subject>Source separation</subject><subject>Speech coding</subject><subject>Speech processing</subject><subject>State estimation</subject><subject>Systems engineering and 
theory</subject><issn>1551-2541</issn><issn>2378-928X</issn><isbn>1424415659</isbn><isbn>9781424415656</isbn><isbn>9781424415663</isbn><isbn>1424415667</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2007</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1UD1PwzAUNF8SpfQHIBZvTAl-jhPbI6rKh5QKpHRgq16cl9aoTYqTDPx7glqmu9OdTqdj7A5EDCDs4zIvPmIphI6VAiWtOmMzq81IR51mWXLOJjLRJrLSfF6wm38jtZdsAmkKkUwVXLNZ130JIUBnoysmzBW-2eyIz7fYNLTjxYHIbXlBBwzY-7bhQzcm-NI3fj_s-ZKw4cX3gIH4IoQ28EXX-_0x2ta8aIfgqHvgebv5K3N9wFt2VeOuo9kJp2z1vFjNX6P8_eVt_pRH3oo-qitTybI0pa1LSQ4SBKfQGau1yZwGRZAm0mVKCQRQprZY2RqqzFaqIsRkyu6PtZ6I1ocwrgo_69NdyS_4AFr1</recordid><startdate>200708</startdate><enddate>200708</enddate><creator>Radfar, M.H.</creator><creator>Dansereau, R.M.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200708</creationdate><title>Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra</title><author>Radfar, M.H. 
; Dansereau, R.M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-fd8d2bb8b9fb2ec13a1c4ac897786c714e1532c6440a1148f9ad9f1d69d4deaa3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Estimation error</topic><topic>Filtering</topic><topic>Filters</topic><topic>Mean square error methods</topic><topic>Probability density function</topic><topic>Source separation</topic><topic>Speech coding</topic><topic>Speech processing</topic><topic>State estimation</topic><topic>Systems engineering and theory</topic><toplevel>online_resources</toplevel><creatorcontrib>Radfar, M.H.</creatorcontrib><creatorcontrib>Dansereau, R.M.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Radfar, M.H.</au><au>Dansereau, R.M.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra</atitle><btitle>2007 IEEE Workshop on Machine Learning for Signal Processing</btitle><stitle>MLSP</stitle><date>2007-08</date><risdate>2007</risdate><spage>128</spage><epage>132</epage><pages>128-132</pages><issn>1551-2541</issn><eissn>2378-928X</eissn><isbn>1424415659</isbn><isbn>9781424415656</isbn><eisbn>9781424415663</eisbn><eisbn>1424415667</eisbn><abstract>We present an approach for separating two speech signals when only one single recording of their linear mixture is available. 
The log spectra of the sources are estimated from the mixture's log spectrum using minimum mean square error (MMSE) approach. The estimation is obtained from the assumption that the sources are modelled using a set of Gaussian subsources which are related to the mixture using MIXMAX approximation. The resulting estimator has a closed form and is expressed using the mean and variance of Gaussian subsources. In order to obtain the two most likely subsources which generate the mixture, we use the estimation-detection technique. We also show that the binary mask filtering which has been empirically - and with no mathematical justification - used in speech separation techniques is, in fact, a simplified form of the MMSE estimator. The proposed technique is compared with the binary mask when the input consists of male-male, female-female, and female-male mixtures. The experimental results in terms of segmental SNR show that the MMSE estimator outperforms binary mask filtering.</abstract><pub>IEEE</pub><doi>10.1109/MLSP.2007.4414294</doi><tpages>5</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1551-2541
ispartof	2007 IEEE Workshop on Machine Learning for Signal Processing, 2007, p.128-132
issn	1551-2541 ; 2378-928X
language	eng
recordid	cdi_ieee_primary_4414294
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Estimation error ; Filtering ; Filters ; Mean square error methods ; Probability density function ; Source separation ; Speech coding ; Speech processing ; State estimation ; Systems engineering and theory
title	Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T05%3A01%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Single%20Channel%20Speech%20Separation%20using%20Minimum%20Mean%20Square%20Error%20Estimation%20of%20Sources'%20Log%20Spectra&rft.btitle=2007%20IEEE%20Workshop%20on%20Machine%20Learning%20for%20Signal%20Processing&rft.au=Radfar,%20M.H.&rft.date=2007-08&rft.spage=128&rft.epage=132&rft.pages=128-132&rft.issn=1551-2541&rft.eissn=2378-928X&rft.isbn=1424415659&rft.isbn_list=9781424415656&rft_id=info:doi/10.1109/MLSP.2007.4414294&rft_dat=%3Cieee_6IE%3E4414294%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424415663&rft.eisbn_list=1424415667&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4414294&rfr_iscdi=true
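
As a rough illustration of the ideas named in the description field, the following Python sketch shows the MIXMAX approximation (the mixture's log spectrum taken as the element-wise maximum of the sources' log spectra), a per-bin MMSE estimate of one source's log spectrum, and a hard binary-mask decision for comparison. It assumes a single independent Gaussian per source and frequency bin, which is a simplification of the Gaussian subsource model the abstract mentions. The function names, the `floor` parameter, and the closed form used here (a textbook truncated-Gaussian derivation) are chosen for illustration and are not taken from the paper.

```python
import math

def gauss_pdf(z, mu, sigma):
    """Density of N(mu, sigma^2) at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def gauss_cdf(z, mu, sigma):
    """Cumulative distribution of N(mu, sigma^2) at z."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def mixmax(x_log, y_log):
    """MIXMAX approximation: mixture log spectrum ~ element-wise max of the sources."""
    return [max(x, y) for x, y in zip(x_log, y_log)]

def mmse_log_estimate(z, mu_x, sig_x, mu_y, sig_y):
    """E[x | z] for one bin, assuming z = max(x, y) with independent Gaussian x and y.

    Hypothetical helper: a single Gaussian per source stands in for the
    subsource pair that a detection step would select.
    """
    tiny = 1e-300  # floor to avoid division by zero in the far tails
    fx, Fx = gauss_pdf(z, mu_x, sig_x), gauss_cdf(z, mu_x, sig_x)
    fy, Fy = gauss_pdf(z, mu_y, sig_y), gauss_cdf(z, mu_y, sig_y)
    px = fx * Fy  # density of z when x is the larger source
    py = fy * Fx  # density of z when y is the larger source
    w = px / max(px + py, tiny)  # posterior probability that x equals z
    # Mean of x truncated to x < z, used when y is the larger source.
    below = mu_x - sig_x ** 2 * fx / max(Fx, tiny)
    return w * z + (1.0 - w) * below

def binary_mask_estimate(z, mu_x, mu_y, floor=-30.0):
    """Hard decision: keep the mixture bin if x's model dominates, else a floor value."""
    return z if mu_x >= mu_y else floor

if __name__ == "__main__":
    z = mixmax([1.0, 5.0], [3.0, 2.0])
    print(z)
    print(mmse_log_estimate(z[0], 1.0, 1.0, 3.0, 1.0))
    print(binary_mask_estimate(z[0], 1.0, 3.0))
```

In this sketch the soft weight `w` plays the role that the 0/1 decision plays in `binary_mask_estimate`, which is one way to picture the abstract's remark that binary masking is a simplified form of the MMSE estimator; the paper's own derivation may differ.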