Non-parallel training for many-to-many eigenvoice conversion

Bibliographic Details
Main Authors: Ohtani, Yamato; Toda, Tomoki; Saruwatari, Hiroshi; Shikano, Kiyohiro
Format: Conference Paper
Language: English
Description
Abstract: This paper presents a novel training method of an eigenvoice Gaussian mixture model (EV-GMM) that effectively uses non-parallel data sets for many-to-many eigenvoice conversion, a technique for converting an arbitrary source speaker's voice into an arbitrary target speaker's voice. In the proposed method, an initial EV-GMM is trained with the conventional method using parallel data sets consisting of a single reference speaker and multiple pre-stored speakers. The initial EV-GMM is then further refined using non-parallel data sets covering a larger number of pre-stored speakers, with the reference speaker's voices treated as hidden variables. The experimental results demonstrate that the proposed method yields significant quality improvements in the converted speech by making it possible to use data from a larger number of pre-stored speakers.
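The abstract builds on the eigenvoice GMM idea, in which each speaker-dependent mean vector is a bias plus a weighted combination of eigenvectors, so adapting to a new speaker reduces to estimating a low-dimensional weight vector. The sketch below is a minimal, generic NumPy illustration of maximum-likelihood eigenvoice weight estimation with an EM-style loop; the function name adapt_weights, the diagonal-covariance assumption, and all variable names are illustrative assumptions, and the paper's actual contribution (non-parallel EV-GMM refinement with the reference speaker's voices as hidden variables) is not reproduced here.

```python
import numpy as np

def adapt_weights(X, bias, eigvecs, covars, mix_weights, n_iter=10):
    """Illustrative maximum-likelihood estimation of an eigenvoice weight
    vector w for one speaker (diagonal covariances assumed).

    X           : (T, D)    adaptation feature frames
    bias        : (M, D)    bias mean vectors of the EV-GMM components
    eigvecs     : (M, D, K) eigenvoice basis per component
    covars      : (M, D)    diagonal covariances
    mix_weights : (M,)      mixture weights
    """
    M, _, K = eigvecs.shape
    w = np.zeros(K)

    for _ in range(n_iter):
        # Speaker-dependent means: mu_m = b_m + E_m w
        means = bias + eigvecs @ w                                # (M, D)

        # E-step: component posteriors gamma[t, m] under the current means
        diff = X[:, None, :] - means[None, :, :]                  # (T, M, D)
        log_norm = -0.5 * np.sum(np.log(2 * np.pi * covars), axis=1)
        log_like = log_norm[None, :] - 0.5 * np.sum(diff**2 / covars[None, :, :], axis=2)
        log_post = np.log(mix_weights)[None, :] + log_like
        log_post -= log_post.max(axis=1, keepdims=True)
        gamma = np.exp(log_post)
        gamma /= gamma.sum(axis=1, keepdims=True)                 # (T, M)

        # M-step: closed-form update of w from the accumulated statistics
        A = np.zeros((K, K))
        g = np.zeros(K)
        for m in range(M):
            Em = eigvecs[m]                                       # (D, K)
            inv_cov = 1.0 / covars[m]                             # (D,)
            Nm = gamma[:, m].sum()
            xbar = gamma[:, m] @ X                                # soft frame sum, (D,)
            A += (Em.T * inv_cov) @ Em * Nm
            g += (Em.T * inv_cov) @ (xbar - Nm * bias[m])
        w = np.linalg.solve(A, g)

    return w
```

In a many-to-many setting, a weight vector estimated this way would characterize one speaker; the paper's method additionally refines the EV-GMM itself from non-parallel data, which the sketch does not attempt.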
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2010.5495139