INFERRING ARTICULATION AND RECOGNIZING GESTURES FROM ACOUSTICS WITH A NEURAL NETWORK TRAINED ON X-RAY MICROBEAM DATA
Published in: | The Journal of the Acoustical Society of America, 1992-08, Vol. 92 (2), p. 688-700 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | This paper describes a method for inferring articulatory parameters from acoustics with a neural network trained on paired acoustic and articulatory data. An x-ray microbeam recorded the vertical movements of the lower lip, tongue tip, and tongue dorsum of three speakers saying the English stop consonants in repeated Ce syllables. A neural network was then trained to map from simultaneously recorded acoustic data to the articulatory data. To evaluate learning, acoustics from the training set were passed through the neural network. To evaluate generalization, acoustics from speakers or consonants excluded from the training set were passed through the network. The articulatory trajectories thus inferred were a good fit to the actual movements in both the learning and generalization conditions, as judged by root-mean-square error and correlation. Inferred trajectories were also matched to templates of lower lip, tongue tip, and tongue dorsum release gestures extracted from the original data. This technique correctly recognized from 94.4% to 98.9% of all gestures in the learning and cross-speaker generalization conditions, and 75% of gestures underlying consonants excluded from the training set. In addition, greater regularity was observed for movements of articulators that were critical in the formation of each consonant. |
---|---|
ISSN: | 0001-4966, 1520-8524 |
DOI: | 10.1121/1.403994 |
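
The abstract judges inferred trajectories by root-mean-square error and correlation, and recognizes gestures by matching them to templates. The sketch below illustrates those two measures and one possible matching rule. It is not the authors' implementation: the function names, the choice of Pearson correlation, the rule "pick the template with the smallest RMS error", and all numeric values are assumptions made for illustration only.

```python
import math


def rms_error(actual, inferred):
    """Root-mean-square difference between two equal-length trajectories."""
    if len(actual) != len(inferred) or not actual:
        raise ValueError("trajectories must be non-empty and equal in length")
    squared = sum((a - b) ** 2 for a, b in zip(actual, inferred))
    return math.sqrt(squared / len(actual))


def correlation(actual, inferred):
    """Pearson correlation between two equal-length trajectories."""
    n = len(actual)
    if n != len(inferred) or n < 2:
        raise ValueError("need two equal-length trajectories of 2+ samples")
    mean_a = sum(actual) / n
    mean_b = sum(inferred) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(actual, inferred))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_b = sum((b - mean_b) ** 2 for b in inferred)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant trajectory")
    return cov / math.sqrt(var_a * var_b)


def best_template(segment, templates):
    """Label of the template closest to the segment by RMS error.

    Assumed matching rule; the paper's actual procedure may differ.
    """
    return min(templates, key=lambda label: rms_error(templates[label], segment))


# Made-up vertical positions in arbitrary units, not data from the paper.
actual = [0.0, 1.0, 2.0, 1.0, 0.0]
inferred = [0.1, 0.9, 2.1, 1.1, -0.1]

# Hypothetical release-gesture templates, one per articulator.
templates = {
    "lower_lip": [0.0, 1.0, 2.0, 1.0, 0.0],
    "tongue_tip": [2.0, 1.0, 0.0, 1.0, 2.0],
}

print(rms_error(actual, inferred))
print(correlation(actual, inferred))
print(best_template(inferred, templates))
```

A low RMS error together with a high correlation would indicate that an inferred trajectory tracks both the size and the shape of the recorded movement, which is why reporting the two measures together is informative.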