Modeling the glottal volume-velocity waveform for three voice types

Bibliographic details
Published in: The Journal of the Acoustical Society of America, 1995-01, Vol. 97 (1), pp. 505-519
Main authors: Childers, D. G.; Ahn, C.
Format: Article
Language: English
Abstract: The purpose of this study was to model features of the glottal volume-velocity waveform for three voice types: modal voice, vocal fry, and breathy voice. The study analyzed data measured from two sustained vowels and one sentence uttered by nine adult male subjects who represented examples of the three voice types. The primary analysis procedure was glottal inverse filtering, which estimated the glottal volume-velocity waveform. The estimated glottal volume-velocity waveform was then fit to an LF model waveform. Four parameters of the LF model were adjusted to minimize the mean-squared error between the estimated glottal waveform and the LF model waveform. Statistical averages and standard deviations of the four parameters of the LF glottal waveform model were calculated using the data for each voice type. The four LF model parameters characterize important low-frequency features of the glottal waveform, namely, the glottal pulse width, pulse skewness, abruptness of closure of the glottal pulse, and the spectral tilt of the glottal pulse. Statistical analysis included ANOVA and multiple linear regression analysis. The ANOVA results demonstrated that there was a difference in three of the four LF model parameters for the three voice types. The linear regression analysis between the four LF model parameters and a formal rating by a listening test of the quality of the three voice types was used to determine the most significant LF model parameters for each voice type. A simple rule was devised for synthesizing the three voice types with a formant synthesizer using the LF glottal waveform model. Listener evaluations of the synthesized speech tended to confirm the results determined by the analysis procedures.
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.412276
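
Illustration: the fitting step named in the abstract (adjusting four LF parameters to minimize the mean-squared error against an estimated glottal waveform) can be sketched roughly as follows. This is not the authors' procedure. It assumes the common textbook form of the LF model with parameters tp, te, ta and Ee on a period normalized to 1 with closure at the period end; the abstract does not say which four parameters were adjusted. The optimizer is a plain grid search swapped in for simplicity, the target is a synthetic pulse standing in for an inverse-filtered one, and all function names are hypothetical.

```python
import math
from itertools import product

def lf_derivative(tp, te, ta, ee=1.0, n=100):
    """Sample one period (normalized so T0 = tc = 1) of the LF flow derivative.

    Assumes tp < te < min(2 * tp, 1) and 0 < ta < 1 - te.
    """
    tc = 1.0
    wg = math.pi / tp            # angular frequency of the open-phase sinusoid
    d = tc - te                  # duration of the return phase
    # Return-phase constant: solve eps * ta = 1 - exp(-eps * d) by fixed-point iteration.
    eps = 1.0 / ta
    for _ in range(100):
        eps = (1.0 - math.exp(-eps * d)) / ta
    # Signed area under the return phase (negative); uses (1 - exp(-eps * d)) / eps = ta.
    ret_area = -(ee / (eps * ta)) * (ta - d * math.exp(-eps * d))
    s, c = math.sin(wg * te), math.cos(wg * te)

    def e0(alpha):
        # Scale factor that makes the open phase reach -ee at t = te.
        return -ee / (math.exp(alpha * te) * s)

    def net_area(alpha):
        # Closed-form integral of e0 * exp(alpha * t) * sin(wg * t) over [0, te], plus return area.
        num = math.exp(alpha * te) * (alpha * s - wg * c) + wg
        return e0(alpha) * num / (alpha ** 2 + wg ** 2) + ret_area

    # Growth constant: bisect for zero net area, so the flow returns to baseline at tc.
    # The bracket [-100, 100] is an assumption that suits the parameter ranges used here.
    lo, hi = -100.0, 100.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if net_area(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)

    out = []
    for i in range(n):
        t = i / n
        if t <= te:   # open phase: exponentially growing sinusoid
            out.append(e0(alpha) * math.exp(alpha * t) * math.sin(wg * t))
        else:         # return phase: exponential recovery toward zero
            out.append(-(ee / (eps * ta)) * (math.exp(-eps * (t - te)) - math.exp(-eps * d)))
    return out

def lf_flow(tp, te, ta, ee=1.0, n=100):
    """Integrate the derivative (rectangle rule) to get a volume-velocity pulse."""
    flow, acc = [], 0.0
    for v in lf_derivative(tp, te, ta, ee, n):
        flow.append(acc)
        acc += v / n
    return flow

def mse(a, b):
    """Mean-squared error between two equal-length sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def fit_lf(target, grid):
    """Return the (tp, te, ta, ee) tuple in `grid` with the smallest MSE against `target`."""
    return min(grid, key=lambda p: mse(target, lf_flow(*p, n=len(target))))

# Candidate parameter values (illustrative only, not taken from the article).
grid = list(product([0.40, 0.45, 0.50], [0.55, 0.60, 0.65], [0.01, 0.02, 0.04], [0.5, 1.0]))
target = lf_flow(0.45, 0.60, 0.02, 1.0)   # synthetic stand-in for an inverse-filtered pulse
print(fit_lf(target, grid))
```

On real data the target would come from glottal inverse filtering, and a continuous optimizer would normally replace the coarse grid; the grid is used here only to keep the sketch short and deterministic.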