Performance evaluation for transform domain model-based single-channel speech separation
Saved in:
Main Authors:
Format: Conference Proceedings
Language: English
Subjects:
Online Access: Order full text
Abstract: It has already been demonstrated that the choice of features has a much larger effect on the overall accuracy of speech applications than the choice of generative model. In this paper, we propose a subband perceptually weighted transformation (SPWT) applied to the magnitude spectrum to improve the performance of the single-channel speech separation (SCSS) scenario. In particular, we compare three feature types, namely the log-spectrum, the magnitude spectrum, and the proposed SPWT. A comprehensive statistical analysis is performed to evaluate the performance of a VQ-based SCSS framework in terms of its lower error bound. At the core of this approach are two trained codebooks of quantized feature vectors, one per speaker, on which the main separation evaluation is performed. The simulation results show that the proposed transformation is an attractive candidate for improving the separation performance of model-based SCSS. It is also observed that the proposed feature yields a lower error bound in terms of spectral distortion (SD) as well as a higher segmental SNR (SSNR) than the other features.
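The record gives no implementation details beyond the abstract, so the following is only a minimal sketch of the kind of VQ-based, model-driven separation the abstract describes: a k-means codebook is trained on spectral feature frames for each speaker, and each mixture frame is explained by the codeword pair whose sum is closest to it. The names used here (train_codebook, separate_frame, codebook_size) and the choices of Euclidean k-means and an exhaustive codeword-pair search are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def train_codebook(frames, codebook_size=64, iters=20, seed=0):
    """Train a simple k-means codebook on spectral feature frames (rows = frames).

    Illustrative stand-in for the speaker codebooks mentioned in the abstract.
    """
    frames = np.asarray(frames, dtype=float)
    rng = np.random.default_rng(seed)
    # initialize codewords with randomly chosen training frames
    codebook = frames[rng.choice(len(frames), codebook_size, replace=False)]
    for _ in range(iters):
        # assign each frame to its nearest codeword (Euclidean distance)
        dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each codeword to the mean of its assigned frames
        for k in range(codebook_size):
            members = frames[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook

def separate_frame(mix_frame, cb_a, cb_b):
    """Return the codeword pair (one per speaker) whose sum best matches the mixture frame."""
    mix_frame = np.asarray(mix_frame, dtype=float)
    # exhaustive search over all codeword pairs from the two speaker codebooks
    sums = cb_a[:, None, :] + cb_b[None, :, :]
    err = np.linalg.norm(sums - mix_frame, axis=2)
    i, j = np.unravel_index(err.argmin(), err.shape)
    return cb_a[i], cb_b[j]
```

In practice the frames would be magnitude-spectrum (or log-spectrum / SPWT) vectors taken from an STFT of each speaker's training data, and the selected codeword pair would then be used, for example, to build a mask or to reconstruct each source with the mixture phase before computing metrics such as SD or SSNR.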
ISSN: 2161-5322, 2161-5330
DOI: 10.1109/AICCSA.2009.5069444