CLC: Complex Linear Coding for the DNS 2020 Challenge

Bibliographic Details
Published in: arXiv.org 2020-06
Main authors: Schröter, Hendrik; Rosenkranz, Tobias; Escalante-B, Alberto N; Maier, Andreas
Format: Article
Language: English
Online access: Full text
Description
Summary: Complex-valued processing has brought deep learning-based speech enhancement and signal extraction to a new level. Typically, noise reduction is based on a time-frequency (TF) mask that is applied to a noisy spectrogram. Complex masks (CM) usually outperform real-valued masks because they can modify the phase. Recent work proposed using a complex linear combination of coefficients, called complex linear coding (CLC), instead of a point-wise multiplication with a mask. This makes it possible to incorporate information from previous and, optionally, future time steps, which results in superior performance over mask-based enhancement for certain noise conditions. In fact, the linear combination makes it possible to model quasi-steady properties such as the spectrum within a frequency band. In this work, we apply CLC to the Deep Noise Suppression (DNS) challenge and propose CLC as an alternative to traditional mask-based processing, e.g. as used by the baseline. We evaluated our models using the provided test set and an additional validation set with real-world stationary and non-stationary noises. On the published test set, we outperform the baseline w.r.t. the scale-invariant signal-to-distortion ratio (SI-SDR) by about 3 dB.
ISSN:2331-8422
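
Illustration: the summary contrasts point-wise complex masking with a complex linear combination over neighbouring time steps. The NumPy sketch below shows one way to read that difference. It is not the authors' implementation: the function names, the array layout (frames x frequency bins), the tap ordering, the zero-padding at the edges, and the parameters `order` and `lookahead` are all assumptions made for illustration. In the paper the coefficients would be predicted by a neural network; here they are simply passed in.

```python
import numpy as np


def apply_complex_mask(noisy_spec, mask):
    """Point-wise complex mask: every TF bin is scaled and rotated on its own."""
    return mask * noisy_spec


def apply_clc(noisy_spec, coefs, lookahead=0):
    """Complex linear combination over time, separately for each frequency bin.

    noisy_spec: complex array of shape (T, F), frames x frequency bins.
    coefs:      complex array of shape (T, order, F); coefs[k, i, f] weights
                frame k - i + lookahead (assumed tap ordering).
    lookahead:  number of future frames used (0 means past and current only).
    """
    num_frames, _ = noisy_spec.shape
    order = coefs.shape[1]
    if not 0 <= lookahead <= order - 1:
        raise ValueError("lookahead must lie in [0, order - 1]")

    # Zero-pad along time so every frame has `order` taps available.
    past = order - 1 - lookahead
    padded = np.pad(noisy_spec, ((past, lookahead), (0, 0)))

    enhanced = np.zeros(noisy_spec.shape, dtype=complex)
    for i in range(order):
        # Frame k - i + lookahead sits at index k + (order - 1 - i) in `padded`.
        start = order - 1 - i
        enhanced += coefs[:, i, :] * padded[start:start + num_frames, :]
    return enhanced


# Toy input: 3 frames, 1 frequency bin, all coefficients set to 1.
spec = np.array([[1.0], [2.0], [3.0]], dtype=complex)
ones = np.ones((3, 2, 1), dtype=complex)
causal = apply_clc(spec, ones, lookahead=0)      # current + previous frame
with_future = apply_clc(spec, ones, lookahead=1) # next + current frame
```

Under these assumptions, a single tap with no lookahead reduces the linear combination to an ordinary complex mask, which is one way to see the mask as a special case of the linear combination.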