Adaptive Kalman Filtering and Smoothing for Tracking Vocal Tract Resonances Using a Continuous-Valued Hidden Dynamic Model

Bibliographic details
Published in:	IEEE Transactions on Audio, Speech, and Language Processing, 2007, Vol. 15 (1), p. 13-23
Main authors:	Deng, Li; Lee, L.J.; Attias, H.; Acero, A.
Format:	Article
Language:	English
Subjects:	Adaptive filters; Adaptive piecewise linearization; adaptive residual parameter learning; Algorithms; Applied sciences; Bandwidth; Coding, codes; continuous dynamics; Detection, estimation, filtering, equalization, prediction; Exact sciences and technology; Filtering algorithms; formant analysis; Frequency estimation; hidden dynamic model; Information, signal and communications theory; Kalman filtering; Kalman filters; Mathematical analysis; Mathematical models; nonlinear prediction; Nonlinearity; Resonance; Resonant frequency; Resonator filters; Signal and communications theory; Signal processing; Signal, noise; Smoothing; Smoothing methods; Speech; Speech processing; state-space model; Studies; Telecommunications and information theory; Tracking; vocal tract resonance

Description
Abstract:	A novel Kalman filtering/smoothing algorithm is presented for efficient and accurate estimation of vocal tract resonances or formants, which are natural frequencies and bandwidths of the resonator from larynx to lips, in fluent speech. The algorithm uses a hidden dynamic model, with a state-space formulation, where the resonance frequency and bandwidth values are treated as continuous-valued hidden state variables. The observation equation of the model is constructed by an analytical predictive function from the resonance frequencies and bandwidths to LPC cepstra as the observation vectors. This nonlinear function is adaptively linearized, and a residual or bias term, which is adaptively trained, is added to the nonlinear function to represent the iteratively reduced piecewise linear approximation error. Details of the piecewise linearization design process are described. An iterative tracking algorithm is presented, which embeds both the adaptive residual training and piecewise linearization design in the Kalman filtering/smoothing framework. Experiments on estimating resonances in Switchboard speech data show accurate estimation results. In particular, the effectiveness of the adaptive residual training is demonstrated. Our approach provides a solution to the traditional "hidden formant problem," and produces meaningful results even during consonantal closures when the supra-laryngeal source may cause no spectral prominences in speech acoustics.
ISSN:	1558-7916 2329-9290 1558-7924 2329-9304
DOI:	10.1109/TASL.2006.876724
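
Illustrative sketch:	The abstract describes a state-space model whose observation equation maps resonance frequencies and bandwidths to LPC cepstra, with a linearized Kalman update and an additive residual term. The Python sketch below is one way such an update might look and is not the authors' implementation. Several things are assumptions that do not come from this record: the mapping is taken to be the standard all-pole cepstrum expression c_n = (2/n) * sum_k exp(-pi*n*b_k/fs) * cos(2*pi*n*f_k/fs); the state dynamics are a plain random walk; the linearization is a single finite-difference Jacobian at the predicted state (an extended-Kalman-style step standing in for the paper's adaptive piecewise linearization); and the names `lpc_cepstra`, `predict_obs`, `jacobian`, `kalman_step`, and the 8 kHz sampling rate are hypothetical.

```python
import numpy as np

FS = 8000.0  # assumed sampling rate in Hz (not stated in the record)


def lpc_cepstra(freqs, bws, order=15, fs=FS):
    """Map resonance frequencies and bandwidths (Hz) to cepstra c_1..c_order.

    Assumed form: c_n = (2/n) * sum_k exp(-pi*n*b_k/fs) * cos(2*pi*n*f_k/fs).
    """
    n = np.arange(1, order + 1)[:, None]            # shape (order, 1)
    f = np.asarray(freqs, dtype=float)[None, :]     # shape (1, K)
    b = np.asarray(bws, dtype=float)[None, :]       # shape (1, K)
    terms = np.exp(-np.pi * n * b / fs) * np.cos(2.0 * np.pi * n * f / fs)
    return (2.0 / n[:, 0]) * terms.sum(axis=1)


def predict_obs(x, order=15):
    """State layout (assumption): first half frequencies, second half bandwidths."""
    k = x.size // 2
    return lpc_cepstra(x[:k], x[k:], order)


def jacobian(x, order=15, eps=1e-3):
    """Central finite-difference Jacobian of predict_obs at x."""
    J = np.empty((order, x.size))
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        J[:, i] = (predict_obs(x + d, order) - predict_obs(x - d, order)) / (2.0 * eps)
    return J


def kalman_step(x, P, y, Q, R, residual):
    """One predict/update step with a locally linearized observation model.

    `residual` is the additive bias term on the predicted cepstra; here it is
    simply passed in, whereas the abstract describes training it adaptively.
    """
    # Predict: random-walk dynamics (assumption), so the mean is unchanged.
    x_pred = x
    P_pred = P + Q
    # Linearize the nonlinear observation function around the predicted state.
    H = jacobian(x_pred, order=y.size)
    innovation = y - (predict_obs(x_pred, y.size) + residual)
    S = H @ P_pred @ H.T + R
    # Kalman gain K = P_pred H^T S^{-1}, computed with a solve (S is symmetric).
    K = np.linalg.solve(S, H @ P_pred).T
    x_new = x_pred + K @ innovation
    P_new = (np.eye(x.size) - K @ H) @ P_pred
    return x_new, P_new
```

In use, `kalman_step` would be called once per frame with that frame's LPC cepstra as `y`. A backward smoothing pass and the iterative re-estimation of `residual` mentioned in the abstract are left out of this sketch.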