Spectral amplitude nonlinearities for improved noise robustness of spectral features for use in automatic speech recognition


Bibliographic details
Published in: The Journal of the Acoustical Society of America, 2011-10, Vol. 130 (4_Supplement), p. 2524
Main authors: Zahorian, Stephen; Wong, Brian
Format: Article
Language: English
Online access: Full text
Description
Summary: Auditory models for outer periphery processing include a sigmoid-shaped nonlinearity that is even more compressed than standard logarithmic scaling at very low and very high amplitudes. Studies at Carnegie Mellon University have shown that this compressive nonlinearity is the most important aspect of the Seneff auditory model for improving the accuracy of automatic speech recognition in the presence of noise. In that previous work, however, the nonlinearity was trained for each frequency band of the Mel frequency cepstrum coefficients, making it impractical to incorporate in automatic speech recognition systems. In the current study, a compressive nonlinearity is represented parametrically and constructed without training, allowing various degrees of steepness and “rounding” of corners at low and high amplitudes. Using this nonlinearity, experimental results for phone recognition were obtained on the TIMIT and NTIMIT databases under various noise conditions, including mismatches in noise between training and test data. The results imply that a fixed compressive nonlinearity can be used to improve the robustness of automatic speech recognition to mismatches between training and test data.
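The summary does not give the functional form of the nonlinearity, so the following is only an illustrative sketch of what a fixed, untrained, parametric compressive curve could look like. It assumes a soft-clipped mapping of the level in dB, built from two softplus terms. The function names and the parameters `low_db`, `high_db`, `slope`, and `sharpness` are hypothetical and are not taken from the study.

```python
import numpy as np


def softplus(z, sharpness):
    """Smooth version of max(z, 0); larger sharpness gives a harder corner."""
    # np.logaddexp(0, x) computes log(1 + exp(x)) without overflow.
    return np.logaddexp(0.0, sharpness * z) / sharpness


def compress_db(level_db, low_db=20.0, high_db=80.0, slope=1.0, sharpness=0.5):
    """Sigmoid-shaped mapping of a level in dB (hypothetical parameterization).

    Between low_db and high_db the output follows ordinary log scaling
    (times `slope`); below low_db and above high_db it flattens out,
    so it is more compressed than the logarithm at both extremes.
    `sharpness` controls how rounded the two corners are.
    """
    level_db = np.asarray(level_db, dtype=float)
    core = softplus(level_db - low_db, sharpness) - softplus(level_db - high_db, sharpness)
    return slope * core


def compress_spectrum(power, **kwargs):
    """Apply the curve to a power spectrum (linear units, assumed non-negative)."""
    # The floor avoids log10(0); its value is an arbitrary choice for this sketch.
    level_db = 10.0 * np.log10(np.maximum(power, 1e-12))
    return compress_db(level_db, **kwargs)
```

In a feature pipeline, a curve of this kind would take the place of the plain logarithm applied to the filter-bank energies before the cepstral transform. Because the parameters are fixed rather than trained per frequency band, the same curve can be used for every band.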
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.3655077