Vocal tract length normalization strategy based on maximum likelihood criterion
Main authors: | , , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | This paper presents the performance of automatic speech recognition systems that use vocal tract length normalization (VTN). Besides the standard procedure for VTN coefficient estimation, several variants based on robust statistical methods are introduced. All systems that use VTN performed better than the reference systems. The best performance was achieved by the system in which the VTN coefficient for a particular speaker is chosen as the one with the maximum sample mean of likelihoods per phoneme. Each phoneme likelihood is calculated as the sample median over the feature vectors corresponding to that phoneme. The relative improvement in performance for this system is about 20%. |
---|---|
DOI: | 10.1109/EURCON.2009.5167662 |
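
The selection rule described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-frame log-likelihoods have already been computed for each candidate VTN coefficient and grouped by phoneme. The function name `select_warp_factor`, the nested-dictionary input layout, the use of log-likelihoods instead of raw likelihoods, and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean, median


def select_warp_factor(frame_loglik):
    """Pick the VTN coefficient with the highest mean of per-phoneme medians.

    frame_loglik is a hypothetical layout:
        {warp_factor: {phoneme: [per-frame log-likelihoods]}}
    """
    scores = {}
    for alpha, per_phoneme in frame_loglik.items():
        # Robust per-phoneme score: sample median over that phoneme's frames.
        phoneme_scores = [median(v) for v in per_phoneme.values() if v]
        # Speaker-level score: sample mean of the per-phoneme scores.
        scores[alpha] = mean(phoneme_scores)
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (invented): three candidate coefficients, two phonemes.
# The large negative values play the role of outlier frames.
toy = {
    0.9: {"a": [-5.0, -4.0, -30.0], "s": [-6.0, -7.0]},
    1.0: {"a": [-3.0, -2.0, -40.0], "s": [-5.0, -4.0]},
    1.1: {"a": [-6.0, -6.5, -7.0], "s": [-8.0, -9.0]},
}

best, scores = select_warp_factor(toy)
print(best, scores)
```

Taking the median within each phoneme keeps a single outlier frame from dominating that phoneme's score, and averaging over phonemes gives every phoneme equal weight regardless of how many frames it has.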