Single Channel Speech Separation using Minimum Mean Square Error Estimation of Sources' Log Spectra

We present an approach for separating two speech signals when only one single recording of their linear mixture is available. The log spectra of the sources are estimated from the mixture's log spectrum using minimum mean square error (MMSE) approach. The estimation is obtained from the assumpt...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Radfar, M.H., Dansereau, R.M.
Format:	Tagungsbericht
Sprache:	eng
Schlagworte:	Estimation error Filtering Filters Mean square error methods Probability density function Source separation Speech coding Speech processing State estimation Systems engineering and theory
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We present an approach for separating two speech signals when only one single recording of their linear mixture is available. The log spectra of the sources are estimated from the mixture's log spectrum using minimum mean square error (MMSE) approach. The estimation is obtained from the assumption that the sources are modelled using a set of Gaussian subsources which are related to the mixture using MIXMAX approximation. The resulting estimator has a closed form and is expressed using the mean and variance of Gaussian subsources. In order to obtain the two most likely subsources which generate the mixture, we use the estimation-detection technique. We also show that the binary mask filtering which has been empirically - and with no mathematical justification - used in speech separation techniques is, in fact, a simplified form of the MMSE estimator. The proposed technique is compared with the binary mask when the input consists of male-male, female-female, and female-male mixtures. The experimental results in terms of segmental SNR show that the MMSE estimator outperforms binary mask filtering.
ISSN:	1551-2541 2378-928X
DOI:	10.1109/MLSP.2007.4414294