A neural tree architecture for recognition of speech features

Bibliographic Details
Published in: The Journal of the Acoustical Society of America, October 1991, Vol. 90 (4_Supplement), pp. 2272-2273
Main Authors: Rahim, Mazin; Flanagan, James
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Neural tree networks provide an efficient technique for pattern classification. A typical structure grows the tree to accommodate the varying complexity of different problems. For recognition of speech features in continuous speech (as described by acoustic vectors), this architecture was found to grow unnecessarily when presented with large amounts of training material collected from many speakers. This paper describes a method for forward-pruning the tree through consideration of the degree of "confusion" among the applied patterns. This implementation was tested for recognition of 34 phonemes extracted from the TIMIT database. Results show that in addition to advantages in computational efficiency and recognition performance, the neural tree architecture provides important correlations among the types of speech features being classified. One observation is that the parent network (first root) performs a separation between two classes of sounds, namely (a) vowels, glides, semivowels, and nasals, and (b) fricatives, affricates, and stop consonants. Subsequent networks are found to classify phonemes according to place of articulation (e.g., alveolar: /s/ and /z/) or manner of articulation (e.g., nasals: /m/, /n/, and /ng/). These phonemic classifications are valuable in improving the networks' ability to perform feature recognition on data outside the training set.
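The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea: a neural tree that recursively splits the data with a small linear (perceptron-style) classifier at each node, and stops growing a branch once the residual class "confusion" there falls below a threshold. All names, the confusion measure, and the thresholds are assumptions for illustration, not the authors' actual method.

```python
# Illustrative neural-tree classifier with confusion-based forward pruning.
# This is a sketch under assumed design choices (perceptron splits, a
# majority-vs-rest split target, a fixed confusion threshold); it is NOT
# the procedure from Rahim & Flanagan's paper.
import numpy as np

class TreeNode:
    def __init__(self, depth=0):
        self.depth = depth
        self.w = None        # linear split weights; None means leaf
        self.children = {}   # one subtree per side of the split
        self.label = None    # majority class label at this node

def confusion(labels):
    """Assumed confusion measure: fraction of samples outside the
    majority class at a node (0.0 = pure, close to 1.0 = very mixed)."""
    _, counts = np.unique(labels, return_counts=True)
    return 1.0 - counts.max() / counts.sum()

def grow(X, y, depth=0, max_depth=5, prune_threshold=0.1):
    """Grow the tree recursively; forward-prune a branch as soon as
    its confusion drops below prune_threshold."""
    node = TreeNode(depth)
    classes, counts = np.unique(y, return_counts=True)
    node.label = classes[np.argmax(counts)]
    if depth >= max_depth or confusion(y) < prune_threshold:
        return node  # forward pruning: this node stays a leaf
    # Train a perceptron split: majority class vs. the rest.
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias term
    t = np.where(y == node.label, 1.0, -1.0)
    w = np.zeros(Xb.shape[1])
    for _ in range(100):
        miss = np.sign(Xb @ w) != t
        if not miss.any():
            break
        w += 0.1 * (t[miss] @ Xb[miss])
    side = Xb @ w >= 0
    if side.all() or not side.any():
        return node  # degenerate split: keep this node as a leaf
    node.w = w
    node.children[True] = grow(X[side], y[side], depth + 1,
                               max_depth, prune_threshold)
    node.children[False] = grow(X[~side], y[~side], depth + 1,
                                max_depth, prune_threshold)
    return node

def predict(node, x):
    """Route a sample down the tree and return the leaf's label."""
    while node.w is not None:
        xb = np.append(x, 1.0)
        node = node.children[bool(xb @ node.w >= 0)]
    return node.label
```

In this sketch, pruning happens *during* growth rather than after it: a branch whose samples are already nearly pure never spawns another network, which mirrors the abstract's point that unconstrained growth on large multi-speaker data is wasteful.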
ISSN: 0001-4966
DOI: 10.1121/1.401190