Speech enhancement using a DNN-augmented colored-noise Kalman filter

Bibliographic details
Published in:	Speech Communication, 2020-12, Vol. 125, pp. 142-151
Main authors:	Yu, Hongjiang; Zhu, Wei-Ping; Champagne, Benoit
Format:	Article
Language:	English
Subjects:	Acoustic mapping; Algorithms; Artificial neural networks; Autoregressive models; Autoregressive processes; Colored-noise Kalman filter; Deep neural network; Denoising; Kalman filters; Linear prediction; Noise; Noise reduction; Optimization; Parameter estimation; Process parameters; Spectral subtraction; Speech; Speech enhancement; Speech processing; Subtraction
Online access:	Full text

Description
Summary:
• A colored-noise Kalman filter is adopted in our system, as it is better able to handle complex noise and alleviate speech distortion.
• A multi-objective DNN is first employed to jointly estimate the parameters of the clean speech autoregressive (AR) model and the noise AR model. Two kinds of DNN are adopted: a fully-connected feed-forward network (FNN) and a long short-term memory (LSTM) network.
• A post-subtraction technique is employed to further remove the residual noise in the Kalman-filtered speech.
• The proposed system takes advantage of both the DNN-based method and Kalman filtering, and generalizes well in both seen and unseen noise environments.

In this paper, we propose a new speech enhancement system using a deep neural network (DNN)-augmented colored-noise Kalman filter. In our system, both clean speech and noise are modelled as autoregressive (AR) processes, whose parameters comprise the linear prediction coefficients (LPCs) and the driving noise variances. The LPCs are obtained by training a multi-objective DNN that learns the mapping from the noisy acoustic features to the line spectrum frequencies (LSFs), while the driving noise variances are obtained by solving an optimization problem that minimizes the difference between the modelled and observed AR spectra of the noisy speech. The colored-noise Kalman filter with DNN-estimated parameters is then applied to the noisy speech for denoising. Finally, a post-subtraction technique is adopted to further remove the residual noise in the Kalman-filtered speech. Extensive computer simulations show that the proposed speech enhancement system achieves significant performance gains over conventional Kalman-filter-based algorithms as well as recent DNN-based methods under both seen and unseen noise conditions.
ISSN:	0167-6393 1872-7182
DOI:	10.1016/j.specom.2020.10.007
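
Illustrative note: the summary describes a Kalman filter in which both speech and noise follow AR models. The following Python sketch shows one common way to set up such a colored-noise Kalman filter, by stacking the speech and noise AR states into a single state vector. It is not the authors' implementation. The function names `companion` and `colored_noise_kalman`, the state ordering, and the sign convention x(n) = sum_k a_k x(n-k) + w(n) are assumptions made for illustration. The AR coefficients and driving noise variances are taken as given here; the DNN-based parameter estimation and the post-subtraction step described in the summary are not covered.

```python
import numpy as np


def companion(a):
    """Companion matrix for x(n) = sum_k a[k-1] * x(n-k) + w(n).

    Assumed state ordering: [x(n-p+1), ..., x(n)].
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)   # shift older samples up by one slot
    F[-1, :] = a[::-1]           # last row predicts the newest sample
    return F


def colored_noise_kalman(y, a_s, var_s, a_v, var_v):
    """Estimate speech s(n) from y(n) = s(n) + v(n).

    a_s, var_s: AR coefficients and driving noise variance of the speech.
    a_v, var_v: AR coefficients and driving noise variance of the noise.
    (All four are assumed known in this sketch.)
    """
    p, q = len(a_s), len(a_v)
    d = p + q

    # Block-diagonal transition matrix of the augmented (speech + noise) state.
    F = np.zeros((d, d))
    F[:p, :p] = companion(a_s)
    F[p:, p:] = companion(a_v)

    # Driving noise enters only the newest speech and noise samples.
    Q = np.zeros((d, d))
    Q[p - 1, p - 1] = var_s
    Q[d - 1, d - 1] = var_v

    # Observation picks s(n) + v(n); there is no separate measurement noise.
    H = np.zeros(d)
    H[p - 1] = 1.0
    H[d - 1] = 1.0

    x = np.zeros(d)
    P = np.eye(d)
    s_hat = np.empty(len(y))
    for n, yn in enumerate(y):
        # Prediction step.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step; the small constant guards against division by zero.
        S = H @ P @ H + 1e-12
        K = (P @ H) / S
        x = x + K * (yn - H @ x)
        P = P - np.outer(K, H @ P)
        s_hat[n] = x[p - 1]      # current speech sample estimate
    return s_hat
```

In a frame-based system the AR parameters would typically be refreshed for every frame, whereas this sketch keeps them fixed over the whole signal for brevity.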