An integrated model of acoustics and language using semantic classification trees

Bibliographic details
Authors: Noth, E., De Mori, R., Fischer, J., Gebhard, A., Harbeck, S., Kompe, R., Kuhn, R., Niemann, H., Mast, M.
Format: Conference paper
Language: English
Description
Abstract: We propose multilevel semantic classification trees to combine different information sources for predicting speech events (e.g. word chains, phrases, etc.). Traditionally, in speech recognition systems, these information sources (acoustic evidence, language model) are calculated independently and combined via Bayes' rule. The proposed approach allows one to combine sources of different types; it is no longer necessary for each source to yield a probability. Moreover, the tree can look at several information sources simultaneously. The approach is demonstrated for the prediction of prosodically marked phrase boundaries, combining information about the spoken word chain, word category information, prosodic parameters, and the result of a neural network that predicts the boundary on the basis of acoustic-prosodic features. Recognition rates of up to 90% for the two-class problem (boundary vs. no boundary) are already comparable to results achieved with the above-mentioned Bayes' rule approach, which combines the acoustic classifier with a 5-gram categorical language model. This is remarkable, since so far only a small set of questions combining information from different sources has been implemented.
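To make the combination idea concrete, the following minimal sketch trains a single decision tree over heterogeneous features for the two-class boundary problem. It is an illustration only, not the authors' semantic classification tree: the feature names (word category, pause duration, F0 reset, neural-network boundary score), the synthetic data, and the use of scikit-learn's DecisionTreeClassifier are all assumptions made for the example; the actual system additionally asks questions about the spoken word chain.

# Illustrative sketch (not the authors' implementation): one decision tree that
# combines heterogeneous feature types -- a word-category id, prosodic
# measurements, and the score of an acoustic-prosodic neural network -- to
# decide "boundary" vs. "no boundary" at a word transition.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 200

# Invented features, one row per word transition.
word_category = rng.integers(0, 8, size=n)          # categorical, coded as int
pause_duration = rng.uniform(0.0, 0.6, size=n)      # prosodic: pause length (s)
f0_reset = rng.uniform(-50.0, 80.0, size=n)         # prosodic: F0 change (Hz)
nn_boundary_score = rng.uniform(0.0, 1.0, size=n)   # acoustic-prosodic NN output

X = np.column_stack([word_category, pause_duration, f0_reset, nn_boundary_score])

# Synthetic labels: boundaries tend to follow long pauses or high NN scores.
y = ((pause_duration > 0.3) | (nn_boundary_score > 0.8)).astype(int)

# One tree asks questions about all sources at once; no source has to be
# expressed as a probability before the combination happens.
tree = DecisionTreeClassifier(max_depth=4).fit(X, y)
print("training accuracy:", tree.score(X, y))

The point of the sketch is only that all sources enter the same tree as raw features, so none of them has to be turned into a probability before being combined.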
ISSN: 1520-6149, 2379-190X
DOI:10.1109/ICASSP.1996.541122