Speaker normalization based on subglottal resonances

Speaker normalization typically focuses on variabilities of the supra-glottal (vocal tract) resonances, which constitute a major cause of spectral mismatch. Recent studies show that the subglottal airways also affect spectral properties of speech sounds. This paper presents a speaker normalization m...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Shizhen Wang, Alwan, A., Lulich, S.M.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Speaker normalization typically focuses on variabilities of the supra-glottal (vocal tract) resonances, which constitute a major cause of spectral mismatch. Recent studies show that the subglottal airways also affect spectral properties of speech sounds. This paper presents a speaker normalization method based on estimating the second and third subglottal resonances. Since the subglottal airways do not change for a specific speaker, the subglottal resonances are independent of the sound type (i.e., vowel, consonant, etc.) and remain constant for a given speaker. This context-free property makes the proposed method suitable for limited data speaker adaptation. This method is computationally more efficient than maximum-likelihood based VTLN, with performance better than VTLN especially for limited adaptation data. Experimental results confirm that this method performs well in a variety of testing conditions and tasks.
ISSN:1520-6149
2379-190X
DOI:10.1109/ICASSP.2008.4518600