Online Speech Dereverberation Using Kalman Filter and EM Algorithm

Bibliographic Details
Published in:	IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015-02, Vol. 23 (2), p. 394-406
Main authors:	Schwartz, Boaz; Gannot, Sharon; Habets, Emanuel A. P.
Format:	Article
Language:	eng
Keywords:	Acoustic noise; Acoustics; Algorithms; Cleaning; Convergence; convolution in STFT; Dereverberation; Estimates; Kalman filters; Maximization; Microphones; recursive expectation-maximization; recursive parameter estimation; Sound design; Speech; Speech enhancement; Tracking

Description
Abstract:	Speech signals recorded in a room are commonly degraded by reverberation. In most cases, both the speech signal and the acoustic system of the room are unknown and time-varying. In this paper, a scenario with a single desired sound source and slowly time-varying, spatially white noise is considered, and a multi-microphone algorithm that simultaneously estimates the clean speech signal and the time-varying acoustic system is proposed. The recursive expectation-maximization scheme is employed to obtain both the clean speech signal and the acoustic system in an online manner. In the expectation step, the Kalman filter is applied to extract a new sample of the clean signal, and in the maximization step, the system estimate is updated according to the output of the Kalman filter. Experimental results show that the proposed method is able to significantly reduce reverberation and increase the speech quality. Moreover, the tracking ability of the algorithm was validated in practical scenarios using human speakers moving in a natural manner.
ISSN:	2329-9290, 2329-9304
DOI:	10.1109/TASLP.2014.2372342
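
Illustration:	The alternation described in the abstract (a Kalman filter in the expectation step, a system update in the maximization step) can be sketched on a toy problem. The following is a minimal sketch under strong simplifying assumptions and is not the authors' algorithm: the signal is a real scalar AR(1) process rather than STFT-domain speech, the "acoustic system" is a single gain per microphone rather than a convolutive filter, and the noise variances are taken as known. All names (`recursive_em_kalman`, `h0`, `q`, `r`, `a`, `beta`) are hypothetical and chosen only for this example.

```python
import numpy as np


def recursive_em_kalman(y, h0, q=1.0, r=0.1, a=0.9, beta=0.98):
    """Toy online EM for y[t] = h * x[t] + v[t], x[t] = a * x[t-1] + w[t].

    y    : (T, M) array of microphone observations.
    h0   : (M,) initial guess of the per-microphone gains.
    q, r : process and spatially white noise variances (assumed known).
    a    : AR(1) coefficient of the toy source model (assumed known).
    beta : forgetting factor for the recursive statistics, 0 < beta < 1.
    Returns the filtered signal estimates (T,) and the final gains (M,).
    """
    y = np.asarray(y, dtype=float)
    T, M = y.shape
    h = np.asarray(h0, dtype=float).copy()
    x_hat, p = 0.0, 1.0        # filtered state mean and variance
    r_yx = h.copy()            # smoothed cross statistic, approx. E[y x]
    r_xx = 1.0                 # smoothed second moment, approx. E[x^2]
    xs = np.empty(T)

    for t in range(T):
        # E-step: one Kalman predict/update using the current gains h.
        x_pred = a * x_hat
        p_pred = a * a * p + q
        s = p_pred * np.outer(h, h) + r * np.eye(M)   # innovation covariance
        k = p_pred * np.linalg.solve(s, h)            # Kalman gain, shape (M,)
        x_hat = x_pred + k @ (y[t] - h * x_pred)
        p = (1.0 - k @ h) * p_pred
        xs[t] = x_hat

        # M-step: exponentially weighted statistics, then a gain update.
        r_yx = beta * r_yx + (1.0 - beta) * y[t] * x_hat
        r_xx = beta * r_xx + (1.0 - beta) * (x_hat ** 2 + p)
        h = r_yx / r_xx

    return xs, h
```

The forgetting factor `beta` is what lets the gain estimate follow a slowly changing system: values closer to 1 average over more past frames and adapt more slowly. Note that in this toy model the signal and the gains are only identifiable up to a common scale, so the sketch illustrates the structure of the recursion rather than a usable dereverberation method.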