Bounded cepstral marginalization of missing data for robust speech recognition

Detailed description

Bibliographic details
Published in: Computer speech & language 2016-03, Vol.36, p.1-23
Main authors: Ebrahim Kafoori, Kian, Ahadi, Seyed Mohammad
Format: Article
Language: English
Online access: Full text
Description
Summary:
• Robust recognition of noisy speech achieved via a novel missing data technique.
• Proposed modified bounded marginalization compatible with MFCC-trained models.
• The second proposed technique is more accurate, but still fast and simple.
• The third method competes with imputation techniques in terms of accuracy.
• Proposed techniques are all simpler and faster than imputation techniques.

Spectral imputation and classifier modification are the two main missing data approaches for robust automatic speech recognition (ASR). Despite their potential, classifier modification techniques have received little attention. In this paper, we show that transferring bounded marginalization, a classifier modification method, from the spectral to the cepstral domain is beneficial for robust ASR. We also propose two refinements of this transfer that improve performance. The first still requires no extra model training. It exploits an observed characteristic of cepstral features and raises the accuracy of the previously proposed method to a level comparable with that of a classic imputation method. The second combines our originally proposed method with an imputation technique but replaces spectral reconstruction with a simpler and faster estimation of the possible range of the missing components. We show that the resulting method is more accurate than either of the two combined methods. The proposed techniques also remain robust when used with an inaccurate spectrographic mask.
ISSN: 0885-2308, 1095-8363
DOI: 10.1016/j.csl.2015.07.005
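
For readers unfamiliar with the term, the following is a minimal sketch of bounded marginalization in its conventional spectral-domain form, not the cepstral-domain variants the paper proposes, whose details this record does not give. It assumes a single diagonal-covariance Gaussian per state, a binary reliability mask, and non-negative spectral energies, so that each masked component is taken to lie between a lower bound and the noisy observed value. All function and parameter names are hypothetical. Some formulations also divide the bounded integral by the width of the bound; that normalization is omitted here.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, mean, std):
    """Gaussian CDF, written with erfc so small tail values stay accurate."""
    return 0.5 * math.erfc(-(x - mean) / (std * math.sqrt(2.0)))


def bounded_marginal_loglik(obs, reliable, means, stds, lower=0.0, floor=1e-300):
    """Log-likelihood of one frame under a diagonal Gaussian (illustrative only).

    Reliable components contribute their ordinary log density.
    Unreliable components contribute the log of the probability mass
    between `lower` and the observed value, which acts as an upper
    bound on the clean speech energy.
    """
    total = 0.0
    for x, ok, mu, sd in zip(obs, reliable, means, stds):
        if ok:
            total += gaussian_logpdf(x, mu, sd)
        else:
            mass = gaussian_cdf(x, mu, sd) - gaussian_cdf(lower, mu, sd)
            # Floor the mass so the logarithm is defined when it underflows.
            total += math.log(max(mass, floor))
    return total
```

With this scoring, a state whose mean lies far above the observed value of a masked component is penalized, because little of its probability mass falls inside the bound, whereas plain marginalization would ignore that component entirely.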