Speech synthesis system and method based on rhythm emotion migration

Bibliographic Details
Main Authors:	QIN JITAO, NIU ZENGHUI, PANG PING
Format:	Patent
Language:	chi ; eng
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	The invention discloses a speech synthesis system and method based on rhythm emotion migration. The system comprises a text encoder module, a sequence alignment module, a multi-level style adapter, a content adapter module, and a decoder module. The text encoder module vectorizes the text input to the TTS system; the resulting encodings are mixed with some style attributes. The sequence alignment module performs sequence alignment for the multi-level style adapter and aligns speech with text; after alignment, the content adapter module removes the style attributes. The multi-level style adapter extracts multi-scale features from the reference audio and fuses them; the fused features and the content-adapted output are fed to the voice frame decoder, which outputs the Mel spectrogram. Finally, a vocoder converts the Mel spectrogram into a speech waveform.
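
The data flow in the summary can be sketched as a toy pipeline. This is only an illustration of the order in which the modules are connected, not the patented implementation: every function body is a simple stand-in chosen for this sketch (uniform stretching for alignment, mean subtraction for style removal, average pooling at several window sizes for multi-scale features, a fixed random projection for the decoder, a sinusoid bank for the vocoder), and all names and sizes (`DIM`, `N_MELS`, `HOP`, `synthesize`) are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Frames = List[Vector]

DIM, N_MELS, HOP = 8, 4, 16  # hypothetical sizes, chosen only to keep the toy small


def text_encoder(text: str) -> Frames:
    """Vectorize text: one deterministic pseudo-random vector per character."""
    return [[random.Random(ord(ch) * 31 + k).uniform(-1, 1) for k in range(DIM)]
            for ch in text]


def sequence_alignment(tokens: Frames, n_frames: int) -> Frames:
    """Speech-text alignment stand-in: stretch tokens uniformly over n_frames."""
    return [tokens[i * len(tokens) // n_frames] for i in range(n_frames)]


def content_adapter(frames: Frames) -> Frames:
    """Style removal stand-in: subtract the per-dimension utterance mean."""
    mean = [sum(col) / len(frames) for col in zip(*frames)]
    return [[x - m for x, m in zip(f, mean)] for f in frames]


def multi_level_style_adapter(ref_mel: Frames, scales=(1, 2, 4)) -> Vector:
    """Pool the reference at several window sizes, then fuse by concatenation."""
    style: Vector = []
    for w in scales:
        chunks = (ref_mel[i:i + w] for i in range(0, len(ref_mel), w))
        pooled = [[sum(col) / len(chunk) for col in zip(*chunk)] for chunk in chunks]
        style.extend(max(col) for col in zip(*pooled))
    return style


def decoder(content: Frames, style: Vector) -> Frames:
    """Frame decoder stand-in: fixed random projection of [content; style]."""
    rng = random.Random(0)
    in_dim = DIM + len(style)
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(N_MELS)]
    return [[sum(w * x for w, x in zip(row, frame + style)) for row in weights]
            for frame in content]


def vocoder(mel: Frames) -> Vector:
    """Vocoder stand-in: each Mel frame drives HOP samples of a sinusoid bank."""
    wave: Vector = []
    for t, frame in enumerate(mel):
        for n in range(HOP):
            phase = 2 * math.pi * (t * HOP + n) / HOP
            wave.append(sum(a * math.sin((b + 1) * phase)
                            for b, a in enumerate(frame)) / N_MELS)
    return wave


def synthesize(text: str, ref_mel: Frames, n_frames: int) -> Vector:
    """Chain the modules in the order given in the summary."""
    encoded = text_encoder(text)                     # text -> vectors
    aligned = sequence_alignment(encoded, n_frames)  # token rate -> frame rate
    content = content_adapter(aligned)               # strip style attributes
    style = multi_level_style_adapter(ref_mel)       # fused multi-scale style
    mel = decoder(content, style)                    # Mel spectrogram frames
    return vocoder(mel)                              # waveform samples
```

Calling `synthesize("hello", ref_mel, 10)` with a reference of six 4-bin frames returns `10 * HOP` samples. In a real system each stand-in would be a trained neural module, and the reference audio would supply the emotion and prosody to be transferred.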