Gain-optimized spectral distortions for pronunciation training
Published in: | Optimization letters 2022-09, Vol.16 (7), p.2095-2113 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | This paper considers the assessment of speech sound pronunciation quality in computer-aided language learning systems. We examine the gain optimization of spectral distortion measures between the speech signals of a native speaker and a learner. During training, a learner has to achieve stable pronunciation of all sounds. This is measured by computing the distances between the sounds produced by the learner and those of the model speaker. To improve pronunciation, it is proposed to adapt the linear prediction coding coefficients of the reference sounds by gradient descent optimization of the gain-optimized dissimilarity. As a result, we demonstrate the possibility of synthesizing sounds that are either close to the model pronunciation or achievable by a learner. An experimental study shows that the proposed procedure leads to high efficiency for pronunciation training even in the presence of noise in the observed utterance. |
---|---|
ISSN: | 1862-4472 1862-4480 |
DOI: | 10.1007/s11590-021-01790-5 |
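
The summary describes measuring the distance between a learner's sound and a reference sound with a gain-optimized spectral distortion computed from linear prediction coding (LPC) models. The record does not state which distortion measure the paper uses, so the following is only a minimal sketch under an assumption: it takes the Itakura-Saito divergence as the base measure, for which the optimal gain has a closed form and the minimized value is log(mean(X/Y)) - mean(log(X/Y)). The function names, the model order, and the synthetic test signals are all hypothetical and chosen for illustration; the gradient descent adaptation of the reference coefficients is not shown.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC; returns a[0..order] with a[0] = 1."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation estimates for lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Solve the Toeplitz normal equations R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))


def lpc_power_spectrum(a, n_freq=256):
    """Unit-gain all-pole power spectrum 1 / |A(e^{jw})|^2 (n_freq + 1 bins)."""
    A = np.fft.rfft(a, 2 * n_freq)
    return 1.0 / np.abs(A) ** 2


def gain_optimized_is(p_x, p_y):
    """Itakura-Saito divergence between p_x and g * p_y, minimized over g > 0.

    The minimizing gain is g = mean(p_x / p_y); substituting it gives the
    expression below, which is non-negative and ignores overall loudness.
    """
    ratio = np.asarray(p_x, dtype=float) / np.asarray(p_y, dtype=float)
    return np.log(np.mean(ratio)) - np.mean(np.log(ratio))


# Hypothetical frames standing in for a reference and a learner sound.
rng = np.random.default_rng(0)
t = np.arange(400)
reference = np.sin(0.30 * t) + 0.05 * rng.standard_normal(400)
learner = 2.0 * np.sin(0.35 * t) + 0.05 * rng.standard_normal(400)

p_ref = lpc_power_spectrum(lpc_coefficients(reference, order=8))
p_learner = lpc_power_spectrum(lpc_coefficients(learner, order=8))

d = gain_optimized_is(p_learner, p_ref)
print(f"gain-optimized distortion: {d:.4f}")
```

Because the gain is optimized out, rescaling either spectrum by a constant leaves the value unchanged, so the measure reflects spectral shape rather than how loudly the learner speaks.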