Analyzing fricative confusions in healthy and pathological speech using modified S-transform

Bibliographic details
Published in: International journal of speech technology 2024, Vol.27 (4), p.977-985
Main authors: Roopa, S., Karjigi, Veena, Chandrashekar, H. M.
Format: Article
Language: eng
Description
Abstract: Fricatives are a class of speech sounds produced when air passes through a partial constriction in the vocal tract, resulting in turbulent airflow with prominent energy in the high-frequency region. The place of constriction determines the resonances, so fricatives differ by place of articulation. The present study considers three classes of fricatives, namely dental, alveolar and post-alveolar. To distinguish fricatives by place of articulation, it is important to have a signal representation with good frequency resolution at high frequencies. The standard S-transform has a frequency-dependent resolution, but its window width cannot be controlled: it gives good frequency resolution at low frequencies and good time resolution at high frequencies. The modified S-transform introduces two adjustable parameters to control the width of the Gaussian window. It provides better frequency resolution at high frequencies than the S-transform, which makes it suitable for classifying fricatives by place of articulation. Classification of fricatives in normal and pathological speech is attempted using S-transform and modified S-transform spectrograms. Experimental results show that the modified S-transform yields higher fricative classification accuracy: 93.4% for normal speech and 50% for pathological speech, compared with 91.7% and 44.54%, respectively, for the S-transform.
ISSN: 1381-2416, 1572-8110
DOI:10.1007/s10772-024-10139-z
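
The abstract describes the modified S-transform only in words, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes one commonly used parametrisation of the Gaussian window width, sigma(f) = m + k/f, where the names `m` and `k` and the function `modified_s_transform` are illustrative and not taken from the article. With m = 0 and k = 1 it reduces to the standard S-transform, and a positive m keeps the window from shrinking at high frequencies, which is what improves frequency resolution there.

```python
import numpy as np


def modified_s_transform(x, m=0.0, k=1.0):
    """Frequency-domain S-transform with an adjustable Gaussian window.

    Assumed window width (in samples): sigma(f) = m + k / f, with f in
    cycles per sample. m=0, k=1 gives the standard S-transform. This
    parametrisation is an assumption, not the article's stated formula.
    Returns an array of shape (N//2 + 1, N): rows are frequency bins,
    columns are time samples.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    alpha = np.fft.fftfreq(n)  # signed frequency offsets, cycles/sample
    n_voices = n // 2 + 1
    out = np.zeros((n_voices, n), dtype=complex)
    out[0, :] = x.mean()  # zero-frequency voice is the signal mean
    for row in range(1, n_voices):
        f = row / n
        sigma = m + k / f
        # Fourier transform of a unit-area Gaussian of width sigma
        window = np.exp(-2.0 * np.pi**2 * alpha**2 * sigma**2)
        shifted = np.roll(spectrum, -row)  # X[alpha + f]
        out[row, :] = np.fft.ifft(shifted * window)
    return out


# Usage: compare both transforms on a synthetic tone (not speech data).
n = 256
t = np.arange(n)
tone = np.cos(2 * np.pi * 32 * t / n)
st = modified_s_transform(tone)                # standard S-transform
mst = modified_s_transform(tone, m=8.0, k=1.0)  # wider high-frequency window
```

Taking `np.abs(...)` of either result gives the spectrogram-like image that a classifier could consume. The values m = 8.0 and k = 1.0 are arbitrary choices for the demonstration; suitable values for fricative classification would have to be tuned on real data.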