Phase-aware subspace decomposition for single channel speech separation
Published in: | IET Signal Processing 2020-06, Vol.14 (4), p.214-222 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Single channel speech separation (SCSS) is often required as a post-processing step in applications that facilitate automatic human-to-human or human-to-machine communication in challenging acoustic environments, such as voice command for smart homes or robotics. The proposed SCSS system, which the authors call phase-aware subspace decomposition (PASD), relies on subspace decomposition for speech separation followed by a phase-aware mask for final subspace recovery. The approach decomposes the mixture into a sparse subspace and a low-rank subspace in the frequency domain by rank minimisation. The minimisation is iterative, applies adaptive thresholding in each iteration to achieve soft estimation, and takes phase information into account for reconstruction. Separation results are reported in terms of both intrusive and non-intrusive metrics using realistic recordings corrupted with real-life noises. As speech separation systems are expected to have maximal interference rejection without speech distortion, the authors also evaluate the proposed system by recognising speech from a target speaker in the presence of either concurrent speech or noise. Recognition results show that the separated signals are highly intelligible, so they can be exploited by other automatic applications. The extensive evaluation under different test scenarios shows that PASD consistently improves the quality of the separated signals compared to other benchmark approaches. |
---|---|
ISSN: | 1751-9675, 1751-9683 |
DOI: | 10.1049/iet-spr.2019.0373 |
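
This record does not give the details of PASD, so the following is only a rough sketch of the general idea the summary describes, not the authors' algorithm. It splits a complex time-frequency matrix into a low-rank part and a sparse part by alternating singular-value thresholding and entrywise soft thresholding, then applies a phase-sensitive mask. Everything specific here is an assumption made for illustration: the function names, the decaying threshold schedule (standing in for the adaptive thresholding), the choice of the phase-sensitive mask as the phase-aware mask, and the treatment of the sparse part as the target.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink each entry's magnitude by tau, keeping its phase."""
    mag = np.abs(x)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)
    return x * scale


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau, which lowers its rank."""
    u, s, vh = np.linalg.svd(x, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vh


def low_rank_sparse(mixture, n_iter=30, decay=0.8, floor=0.05):
    """Alternately estimate a low-rank and a sparse part of `mixture`.

    The thresholds start high and decay towards a floor. This schedule is
    an assumption standing in for the paper's adaptive thresholding.
    """
    tau_l0 = 0.5 * np.linalg.norm(mixture, 2)  # half the largest singular value
    tau_s0 = 0.5 * np.abs(mixture).max()       # half the largest entry magnitude
    low_rank = np.zeros_like(mixture)
    sparse = np.zeros_like(mixture)
    for i in range(n_iter):
        shrink = max(decay ** i, floor)
        low_rank = svd_threshold(mixture - sparse, shrink * tau_l0)
        sparse = soft_threshold(mixture - low_rank, shrink * tau_s0)
    return low_rank, sparse


def phase_sensitive_mask(source, mixture, eps=1e-12):
    """|S|/|Y| * cos(phase difference), clipped to [0, 1]."""
    mask = np.real(source * np.conj(mixture)) / (np.abs(mixture) ** 2 + eps)
    return np.clip(mask, 0.0, 1.0)


# Toy complex "spectrogram": rank-1 background plus a few isolated spikes.
rng = np.random.default_rng(0)
background = np.outer(rng.standard_normal(40), rng.standard_normal(60)).astype(complex)
spikes = np.zeros((40, 60), dtype=complex)
spikes[rng.integers(0, 40, 25), rng.integers(0, 60, 25)] = (
    5.0 * np.exp(1j * rng.uniform(0, 2 * np.pi, 25))
)
mixture = background + spikes

low_rank, sparse = low_rank_sparse(mixture)
mask = phase_sensitive_mask(sparse, mixture)  # assumes the target is the sparse part
target_estimate = mask * mixture              # masked mixture, still in the frequency domain
```

In practice `mixture` would be the short-time Fourier transform of a recording (for example from `scipy.signal.stft`), and `target_estimate` would be inverted back to a waveform with the matching inverse transform. Which of the two parts corresponds to the target speaker depends on the signal model and is not stated in this record.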