Robust speech event detection using strictly temporal information

Bibliographic details
Published in: The Journal of the Acoustical Society of America, 2001-05, Vol. 109 (5_Supplement), p. 2493
Main authors: Deshmukh, Om D.; Wilson, Carol E.; Salomon, Ariel
Format: Article
Language: English
Online access: Full text
Description
Abstract: A major problem in the development of robust speech recognition systems is our understanding of how to deal with noise and/or reduced spectral information. Previous studies have shown that the temporal structure of speech is robust and a good source of information for recognizing manner. This work discusses the development of algorithms that specifically target temporal information in the speech signal that can be used to identify different manner classes. In particular, we extract events associated with sonorant regions and obstruent regions. We compared the performance of our algorithms on clean speech (sentences taken from the TIMIT database) and on one-channel spectrally impoverished speech [as described by Q.-J. Fu, House Ear Institute, Dept. of Auditory Implants and Perception, Tigersoft]. The results show no degradation in performance for the detection of landmarks associated with stops, affricates, strident fricatives, and vowels. There is a small degradation of six landmarks associated with the "weaker" consonants (nonstrident fricatives and sonorant consonants).
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.4744872