Chinese speech cloning method for end-to-end tone and emotion migration

Bibliographic details
Main authors:	GUO YONGBIN, ZHANG LIUJIAN, LIU JIANGFENG, LIU DINGWEI, CHEN HUAJUN, MAO AIHUA
Format:	Patent
Language:	Chinese; English
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	The invention discloses a Chinese speech cloning method for end-to-end tone and emotion migration. The method comprises the following steps: collecting Chinese speech recorded by a user as training data and extracting the required speech features; training a voice cloning synthesis model that comprises a tone and emotion encoder, a synthesizer, and a vocoder; and using the trained model either to generate speech in the voice of a specified speaker already present in the model, from voice or text content input by the user, or to rapidly clone the timbre and emotion of the user's voice from a short voice sample input by the user. The invention realizes end-to-end speech synthesis and cloning: a single multi-speaker model, given different speaker embedding vectors, synthesizes speech with different emotions and timbres. According to the invention, the speaker embedding vector generated by short voices