Mixture of Factor Analyzers Using Priors From Non-Parallel Speech for Voice Conversion

Detailed description

Bibliographic details
Published in:	IEEE Signal Processing Letters, 2012-12, Vol. 19 (12), p. 914-917
Main authors:	Zhizheng Wu; Kinnunen, T.; Eng Siong Chng; Haizhou Li
Format:	Article
Language:	English
Subjects:	Covariance matrix; Expectation-maximization algorithms; Factor analysis; Mixture of factor analyzers; Prior knowledge; Speech; Training data; Vectors; Voice conversion

Description
Abstract:	A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method that integrates prior knowledge from non-parallel speech into the training of the conversion function. Experiments on the CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. In both objective and subjective evaluations, the proposed method outperforms the baseline GMM method.
ISSN:	1070-9908, 1558-2361
DOI:	10.1109/LSP.2012.2225615