Chinese speech cloning method for end-to-end tone and emotion migration

Bibliographic details
Main authors: GUO YONGBIN, ZHANG LIUJIAN, LIU JIANGFENG, LIU DINGWEI, CHEN HUAJUN, MAO AIHUA
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses an end-to-end Chinese speech cloning method that transfers tone and emotion. The method comprises the following steps: collecting Chinese speech recorded by a user as training data and extracting the required speech features; training a voice clone synthesis model comprising a tone and emotion encoder, a synthesizer, and a vocoder; and using the trained model either to generate speech in the voice of a specified speaker already present in the model, from speech or text content input by the user, or to rapidly clone the timbre and emotion of the user's voice from a short speech sample input by the user. The invention realizes end-to-end speech synthesis and cloning: a multi-speaker model synthesizes speech with different emotions and timbres by combining the same model with different speaker embedding vectors. According to the invention, the speaker embedding vector generated by short voices