Unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers

Bibliographic details
Published in:	2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), 2001-05, Vol.1, p.341-344
Main authors:	Yoshizawa, S., Baba, A., Matsunami, K., Mera, Y., Yamada, M., Shikano, K.
Format:	Article
Language:	English
Subjects:	Acoustic testing; Cepstrum; Hidden Markov models; Laboratories; Loudspeakers; Maximum likelihood linear regression; Mel frequency cepstral coefficient; Speech; Statistical analysis; Statistics

Description
Abstract:	Describes an efficient method for unsupervised speaker adaptation. The method (1) selects a subset of speakers who are acoustically close to the test speaker, and (2) calculates the adapted model parameters from the previously stored sufficient HMM statistics of the selected speakers' data. Only a small amount of unsupervised data from the test speaker is required for the adaptation. Using the sufficient HMM statistics of the selected speakers' data also makes the adaptation fast. Compared with a pre-clustering method, the proposed method can obtain a better-matched speaker cluster because the cluster is determined on-line from the test speaker's data. Experimental results show that the proposed method improves on the speaker-independent model more than MLLR does. Moreover, the proposed method uses only one unsupervised sentence utterance, whereas MLLR usually uses more than ten supervised sentence utterances.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2001.940837
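
Illustration
The two steps named in the abstract (select acoustically close speakers, then rebuild the model from their stored sufficient statistics) can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions, not the authors' implementation: each speaker is reduced to a single diagonal-covariance Gaussian instead of a full HMM, closeness is measured by the log-likelihood of the test frames under that Gaussian, and all names (`SpeakerStats`, `accumulate`, `select_speakers`, `adapt`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SpeakerStats:
    """Sufficient statistics of one Gaussian for one stored speaker (toy model)."""
    name: str
    count: float          # zeroth-order statistic: number of frames
    first: List[float]    # first-order statistic: sum of x, per dimension
    second: List[float]   # second-order statistic: sum of x^2, per dimension


def accumulate(name: str, frames: Sequence[Sequence[float]]) -> SpeakerStats:
    """Accumulate statistics off-line, assuming every frame has occupancy 1."""
    dim = len(frames[0])
    first = [sum(x[d] for x in frames) for d in range(dim)]
    second = [sum(x[d] ** 2 for x in frames) for d in range(dim)]
    return SpeakerStats(name, float(len(frames)), first, second)


def estimate(count: float, first: Sequence[float],
             second: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Maximum-likelihood mean and (floored) variance from the statistics."""
    mean = [f / count for f in first]
    var = [max(s / count - m * m, 1e-6) for s, m in zip(second, mean)]
    return mean, var


def log_likelihood(stats: SpeakerStats, frames: Sequence[Sequence[float]]) -> float:
    """Log-likelihood of the test frames under the speaker's own Gaussian."""
    mean, var = estimate(stats.count, stats.first, stats.second)
    total = 0.0
    for x in frames:
        for xi, m, v in zip(x, mean, var):
            total -= 0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
    return total


def select_speakers(pool: Sequence[SpeakerStats],
                    frames: Sequence[Sequence[float]], n: int) -> List[SpeakerStats]:
    """Step 1: keep the n stored speakers that score the test frames highest."""
    ranked = sorted(pool, key=lambda s: log_likelihood(s, frames), reverse=True)
    return ranked[:n]


def adapt(selected: Sequence[SpeakerStats]) -> Tuple[List[float], List[float]]:
    """Step 2: sum the stored statistics and re-estimate mean and variance."""
    dim = len(selected[0].first)
    count = sum(s.count for s in selected)
    first = [sum(s.first[d] for s in selected) for d in range(dim)]
    second = [sum(s.second[d] for s in selected) for d in range(dim)]
    return estimate(count, first, second)
```

In this sketch the adaptation step never touches the selected speakers' raw audio: it only adds up precomputed sums, which is the property the abstract credits for the speed of the method. No transcription of the test frames is used for selection, which matches the unsupervised setting, although the selection criterion itself is an assumption here.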