Using an adaptive network to recognize demisyllables in continuous speech

Bibliographic Details
Published in:	The Journal of the Acoustical Society of America 1988-05, Vol.83 (S1), p.S53-S53
Main Authors:	Kamm, Candace A., Landauer, Thomas K., Singhal, Sharad
Format:	Article
Language:	English
Online Access:	Full text
Description
Summary:	A nonlinear, multilayer associative network was trained on a speech recognition task using continuous speech. Naturally spoken 14-syllable “sentences” from one talker were preprocessed to produce a 15-band spectral representation incorporating several transformations introduced by the peripheral auditory system on acoustic signals. Input nodes to the network represented a 150-ms window through which the spectral representation passed in 2-ms steps. A single layer of 20 hidden nodes was used. Output nodes represented seven initial demisyllables whose target values were specified based on a human listener's identification of the sounds heard during the input segment. The network was trained to criterion using a variant of the back-propagation learning algorithm [Rumelhart et al., Nature 323, 533–536 (1986); Landauer et al., Proc. Cog. Sci. Soc., 531–536 (1987)]. A minimum-error-rate figure of merit (derived from signal detection theory) was used to evaluate the effect of the size of the training corpus on the network's performance. Minimum error rate on a test corpus of demisyllables spoken in different contexts decreased from 5.2% to less than 2% as the number of sentences in the training corpus was increased from one to seven.
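The layer sizes given in the summary can be made concrete with a small sketch. It is not the authors' implementation. The 15 bands, 150-ms window, 2-ms step, 20 hidden nodes and 7 outputs come from the summary; everything else is an assumption made for illustration. In particular, the sketch assumes one spectral frame per 2-ms step (giving 75 frames and 1125 input nodes), sigmoid units, random untrained weights, and a threshold sweep that counts misses and false alarms equally as one plausible reading of "minimum error rate". All function names are hypothetical, and training by back-propagation is not shown.

```python
import math
import random

N_BANDS = 15                           # spectral bands per frame (from the summary)
N_HIDDEN = 20                          # hidden nodes (from the summary)
N_OUTPUT = 7                           # initial demisyllables (from the summary)
FRAME_MS = 2                           # ASSUMPTION: one spectral frame per 2-ms step
FRAMES_PER_WINDOW = 150 // FRAME_MS    # 75 frames span the 150-ms window
N_INPUT = FRAMES_PER_WINDOW * N_BANDS  # 1125 input nodes under that assumption


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights; each row is [bias, w_1, ..., w_n_in]."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    """Apply one fully connected sigmoid layer (ASSUMED unit type)."""
    return [sigmoid(row[0] + sum(w * x for w, x in zip(row[1:], inputs)))
            for row in weights]


def windows(frames):
    """Yield flattened 150-ms windows, advancing one frame (2 ms) at a time."""
    for start in range(len(frames) - FRAMES_PER_WINDOW + 1):
        window = frames[start:start + FRAMES_PER_WINDOW]
        yield [value for frame in window for value in frame]


def forward(hidden_w, output_w, window):
    """Input window -> 20 hidden activations -> 7 output activations."""
    return layer_forward(output_w, layer_forward(hidden_w, window))


def min_error_rate(target_scores, nontarget_scores):
    """Lowest combined miss + false-alarm rate over all decision thresholds.

    ASSUMPTION: misses and false alarms are weighted equally; the summary
    does not give the exact definition of its figure of merit.
    """
    total = len(target_scores) + len(nontarget_scores)
    thresholds = sorted(set(target_scores) | set(nontarget_scores)) + [float("inf")]
    best = 1.0
    for t in thresholds:
        misses = sum(s < t for s in target_scores)
        false_alarms = sum(s >= t for s in nontarget_scores)
        best = min(best, (misses + false_alarms) / total)
    return best


# Usage with synthetic data: 80 frames (160 ms) of random "spectra".
rng = random.Random(0)
hidden_w = init_layer(N_INPUT, N_HIDDEN, rng)
output_w = init_layer(N_HIDDEN, N_OUTPUT, rng)
frames = [[rng.random() for _ in range(N_BANDS)] for _ in range(80)]
outputs = [forward(hidden_w, output_w, w) for w in windows(frames)]
```

With 80 synthetic frames the window fits at six positions, so `outputs` holds six vectors of seven activations each. Because the weights are random, the activations carry no meaning; the sketch only shows how the stated dimensions fit together and how an error rate could be minimized over thresholds.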
ISSN:	0001-4966 1520-8524
DOI:	10.1121/1.2025400