Multi-band speech recognition in noisy environments

This paper presents a new approach for multi-band based automatic speech recognition (ASR). Previous work by Bourlard et al. (see Proc. Int. Conf. on Spoken Language Processing, Philadelphia, p.426-9, 1996) and Hermansky et al. (see Proc. Int. Conf. on Spoken Language Processing, Philadelphia, p.157...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Okawa, S., Bocchieri, E., Potamianos, A.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper presents a new approach for multi-band based automatic speech recognition (ASR). Previous work by Bourlard et al. (see Proc. Int. Conf. on Spoken Language Processing, Philadelphia, p.426-9, 1996) and Hermansky et al. (see Proc. Int. Conf. on Spoken Language Processing, Philadelphia, p.1579-82, 1996) suggests that multi-band ASR gives a more accurate recognition, especially in noisy acoustic environments, by combining the likelihoods of different frequency bands. Here we evaluate this likelihood recombination (LC) approach to multi-band ASR, and propose an alternative method, namely feature recombination (FC). In the FC system, after different acoustic analyzers are applied to each sub-band individually, a vector is composed by combining the sub-band features. The speech classifier then calculates the likelihood from the single vector. Thus, band-limited noise affects only a few of the feature components, as in the multi-band LC system, but, at the same time, all feature components are jointly modeled, as in conventional ASR. The experimental results show that the FC system can yield better performance than both the conventional ASR and the LC strategy for noisy speech.
ISSN:1520-6149
2379-190X
DOI:10.1109/ICASSP.1998.675346