Leveraging the temporal dynamics of anticipatory vowel-to-vowel coarticulation in linguistic prediction: A statistical modeling approach


Bibliographic details
Published in: Journal of Phonetics, 2021-09, Vol. 88, p. 101093, Article 101093
Main authors: Flego, Stefon; Forrest, Jon
Format: Article
Language: English
Online access: Full text
Description
Summary:
• Anticipatory vowel-to-vowel coarticulatory effects are temporally non-linear.
• Whole-formant representations detect more anticipatory information in the signal.
• Multinomial regression models linguistic prediction during speech comprehension.

Previous research has shown that coarticulatory information in the signal orients listeners in spoken word recognition, and that articulatory and perceptual dynamics closely parallel one another. The current study uses statistical classification to test the power of time-varying anticipatory coarticulatory information present in the acoustic signal for predicting upcoming sounds in the speech stream. Bayesian mixed-effects multinomial logistic regression models were trained on several different representations of spectral variation present in V1 to predict the identity of V2 in naturally coarticulated transconsonantal V1…V2 sequences. Models trained on simple measures of spectral variation (e.g., formant measures taken at V1 midpoint) were compared with models trained on more sophisticated time-varying representations (e.g., the estimated coefficients of polynomial curves fit to whole formant trajectories of V1). Accuracy in predicting V2 was greater when models were trained on dynamic representations of spectral variation in V1. Models trained on quadratic and cubic polynomial representations achieved the greatest accuracy, with a gain of more than 15 percentage points in correct classification over models using midpoint formant frequencies alone. The results demonstrate that spectral representations with high temporal resolution capture more disambiguating anticipatory information available in the signal than representations with lower temporal resolution.
ISSN: 0095-4470, 1095-8576
DOI: 10.1016/j.wocn.2021.101093
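
The contrast the summary draws between a midpoint formant measure and polynomial coefficients fit to a whole formant trajectory can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 11-point sampling, the normalized time axis, and the synthetic F2 values are all assumptions made for illustration, and the Bayesian mixed-effects multinomial classifier itself is not shown.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def midpoint_feature(trajectory):
    """Static representation: the formant value at the temporal midpoint of V1."""
    trajectory = np.asarray(trajectory, dtype=float)
    return trajectory[len(trajectory) // 2]


def polynomial_features(trajectory, degree):
    """Dynamic representation: coefficients of a least-squares polynomial fit
    to the whole trajectory over normalized time [0, 1], ordered from the
    constant term upward."""
    trajectory = np.asarray(trajectory, dtype=float)
    t = np.linspace(0.0, 1.0, len(trajectory))
    return P.polyfit(t, trajectory, degree)


# Hypothetical F2 trajectories (Hz) for V1, sampled at 11 equally spaced points.
# Both pass through 1500 Hz at the midpoint, but only the second one drifts
# upward toward the end of the vowel (an invented stand-in for anticipation).
t = np.linspace(0.0, 1.0, 11)
f2_flat = np.full_like(t, 1500.0)
f2_late_rise = 1500.0 + 200.0 * (t ** 2 - 0.25)

for name, traj in [("flat", f2_flat), ("late rise", f2_late_rise)]:
    print(name, midpoint_feature(traj), np.round(polynomial_features(traj, 2), 2))
```

In this toy case the midpoint measure is identical for the two trajectories, while the quadratic coefficients differ, which is the kind of information a classifier trained on whole-trajectory representations could in principle exploit.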