Comparison of Cepstral Analysis Based on Voiced-Segment Extraction and Voice Tasks for Discriminating Dysphonic and Normophonic Korean Speakers
Published in: | Journal of voice 2021-03, Vol.35 (2), p.328.e11-328.e22 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | This study investigated whether the discriminatory power of cepstral analysis differs according to the voiced-segment extraction method and the voice task used to identify dysphonic and normophonic Korean speakers.
A total of 2,863 subjects (2,595 with and 268 without dysphonia) were included in this study. A 3-second sustained vowel (SV) /a/ and one sentence from the “Sanchaek” passage were edited and analyzed using Praat scripts. Cepstral analyses (cepstral peak prominence [CPP], smoothed cepstral peak prominence [CPPS], and low/high spectral ratio [LHRatio]) were performed using three voice tasks, namely, SV, continuous speech (CS), and extracted continuous speech (EXT) samples. Additionally, auditory-perceptual (A-P) assessments were performed by three speech-language pathologists.
Significant differences were found between the dysphonic and normophonic voice groups for all cepstral parameters except LHRatio_EXT. Cepstral measurements of both SV and CS were highly correlated with A-P ratings. For CS, CPP and CPPS yielded an area under the receiver operating characteristic curve (AUC) of >0.919, a positive likelihood ratio (LR+) of ≥6.85, and a negative likelihood ratio (LR−) of ≤0.23. For EXT, the AUC was >0.816, LR+ was 3.10, and LR− was ≤0.33.
Both CS and EXT can predict dysphonia relatively well (AUC > 0.816), although EXT showed lower predictability than the original sample (CS) analysis. Subsequent studies should implement voiced-segment extraction methods using various algorithms. |
---|---|
ISSN: | 0892-1997, 1873-4588 |
DOI: | 10.1016/j.jvoice.2019.09.009 |
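
The abstract reports diagnostic power as AUC, LR+, and LR−. As a reading aid, the following is a minimal Python sketch of how these quantities are conventionally defined; it is not the authors' analysis code. The function names and the numbers in the usage note are hypothetical and are not taken from the study. The sketch assumes scores where higher values indicate the positive (dysphonic) class, so measures such as CPP, where lower values indicate dysphonia, would need to be negated first.

```python
from typing import Sequence, Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a binary test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("need 0 <= sensitivity <= 1 and 0 < specificity < 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

For example, with a hypothetical sensitivity of 0.9 and specificity of 0.8, `likelihood_ratios(0.9, 0.8)` gives an LR+ of about 4.5 and an LR− of about 0.125. The pairwise AUC loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation would be preferable for large samples.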