Blind Subband Beamforming With Time-Delay Constraints for Moving Source Speech Enhancement

Bibliographic details
Published in:	IEEE Transactions on Audio, Speech, and Language Processing, 2007, Vol. 15 (8), p. 2360-2372
Main authors:	Yermeche, Z., Grbic, N., Claesson, I.
Format:	Article
Language:	English
Subjects:	Acoustic noise; Adaptive arrays; Applied sciences; Array signal processing; Arrays; Beamforming; Blinds delay estimation; Detection, estimation, filtering, equalization, prediction; Distortion; Electrical Engineering, Electronic Engineering, Information Engineering; Elektroteknik och elektronik; Engineering and Technology; Exact sciences and technology; Information, signal and communications theory; Microphone arrays; Miscellaneous; Noise generators; Noise levels; Robustness; Signal and communications theory; Signal generators; Signal processing; Signal, noise; Speech; Speech analysis; Speech enhancement; Speech processing; Spreading; Studies; Teknik; Telecommunications and information theory; Working environment noise

Description
Abstract:	A new robust microphone array method to enhance speech signals generated by a moving person in a noisy environment is presented. This blind approach is based on a two-stage scheme. First, a subband time-delay estimation method is used to localize the dominant speech source. The second stage involves speech enhancement, based on the acquired spatial information, by means of a soft-constrained subband beamformer. The novelty of the proposed method lies in treating the spatial spreading of the sound source as equivalent to a time-delay spreading, thus allowing the estimated intersensor time delays to be used directly in the beamforming operations. In comparison to previous approaches, this new method requires no special array geometry, knowledge of the array manifold, or acquisition of calibration data to adapt the array weights. Furthermore, such a scheme allows the beamformer to adapt efficiently to speaker movement. The robustness of the time-delay estimation of speech signals in high noise levels is improved by making use of the non-Gaussian nature of speech through a subband kurtosis-weighted structure. Evaluation in a real environment with a moving speaker shows promising results, with suppression levels of up to 16 dB for background noise and interfering (speech) signals, and relatively little speech distortion.
ISSN:	1558-7916, 2329-9290, 1558-7924, 2329-9304
DOI:	10.1109/TASL.2007.903309
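
The two-stage idea summarized in the abstract (estimate intersensor time delays, then use them directly for beamforming) can be illustrated with a much simpler stand-in. The sketch below is not the authors' algorithm: it uses plain full-band cross-correlation instead of the subband kurtosis-weighted estimator, and a basic delay-and-sum combiner instead of the soft-constrained subband beamformer. The function names, the two-microphone setup, the Laplacian "speech-like" source, and the noise level are all assumptions made for illustration. The `excess_kurtosis` helper only shows why kurtosis can separate a speech-like (super-Gaussian) signal from Gaussian noise; how the paper turns this into a weighting is not described in this record.

```python
import random


def estimate_delay(ref, sig, max_lag):
    """Return the integer lag (in samples) that maximizes the
    cross-correlation between ref and sig.
    A positive result means sig is a delayed copy of ref."""
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate ref[i] with sig[i + lag] over the overlapping samples.
        score = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            score += ref[i] * sig[i + lag]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(channels, delays):
    """Time-align each channel by advancing it by its delay, then average."""
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay
            if 0 <= j < n:
                out[i] += channel[j]
    return [value / len(channels) for value in out]


def excess_kurtosis(x):
    """Sample excess kurtosis: near 0 for Gaussian data, positive for
    heavier-tailed (super-Gaussian) data."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2) - 3.0


def mean_squared_error(a, b, count):
    """Mean squared difference over the first `count` samples."""
    return sum((a[i] - b[i]) ** 2 for i in range(count)) / count


# Synthetic two-microphone scene (all values are illustrative assumptions).
rng = random.Random(0)
n, true_delay = 2000, 3

# Laplacian-distributed samples as a crude stand-in for speech.
source = [rng.choice((-1, 1)) * rng.expovariate(1.0) for _ in range(n)]
delayed_source = [0.0] * true_delay + source[: n - true_delay]

# Each microphone sees the source plus independent Gaussian sensor noise.
mic0 = [s + 0.3 * rng.gauss(0, 1) for s in source]
mic1 = [s + 0.3 * rng.gauss(0, 1) for s in delayed_source]
gaussian_noise = [rng.gauss(0, 1) for _ in range(n)]

# Stage 1: estimate the intersensor delay. Stage 2: use it to align and sum.
estimated_delay = estimate_delay(mic0, mic1, max_lag=10)
enhanced = delay_and_sum([mic0, mic1], [0, estimated_delay])

valid = n - true_delay  # skip the tail where mic1 has no aligned samples
mse_single = mean_squared_error(mic0, source, valid)
mse_enhanced = mean_squared_error(enhanced, source, valid)

print("estimated delay (samples):", estimated_delay)
print("MSE single mic vs. delay-and-sum:", mse_single, mse_enhanced)
print("excess kurtosis, source vs. Gaussian noise:",
      excess_kurtosis(source), excess_kurtosis(gaussian_noise))
```

Because the delay estimate feeds the combiner directly, re-running the estimation on successive short frames would let the alignment follow a moving source; that framing step is omitted here to keep the sketch short.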