Gain-optimized spectral distortions for pronunciation training

This paper considers an assessment and evaluation of speech sound pronunciation quality in computer-aided language learning systems. We examine the gain optimization of spectral distortion measures between the speech signals of a native speaker and a learner. During training, a learner has to achiev...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Optimization letters 2022-09, Vol.16 (7), p.2095-2113
Hauptverfasser: Savchenko, Andrey V., Savchenko, Vladimir V., Savchenko, Lyudmila V.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper considers an assessment and evaluation of speech sound pronunciation quality in computer-aided language learning systems. We examine the gain optimization of spectral distortion measures between the speech signals of a native speaker and a learner. During training, a learner has to achieve stable pronunciation of all sounds. This is measured by computing the distances between the sounds produced by the learner and the model speaker. In order to improve pronunciation, it is proposed to adapt the linear prediction coding coefficients of reference sounds by using the gradient descent optimization of the gain-optimized dissimilarity. As a result, we demonstrate the possibility of synthesizing sounds that will be either close to the model pronunciation or achievable by a learner. An experimental study shows that the proposed procedure leads to high efficiency for pronunciation training even in the presence of noise in the observed utterance.
ISSN:1862-4472
1862-4480
DOI:10.1007/s11590-021-01790-5