Speech synthesis system and method based on rhythm emotion migration


Bibliographic details
Main authors: QIN JITAO, NIU ZENGHUI, PANG PING
Format: Patent
Language: chi; eng
Description
Summary: The invention discloses a speech synthesis system and method based on rhythm emotion migration. The system comprises a text encoder module, a sequence alignment module, a multi-level style adapter, a content adapter module, and a decoder module. The text encoder module performs vectorization coding on the text input to a TTS system; the resulting codes are mixed with some style attributes. The sequence alignment module performs voice-text alignment, and after alignment the content adapter module eliminates the style attributes. The multi-level style adapter extracts multi-scale features from the reference audio and fuses them; the fused features and the content-adapted output are fed to the voice frame decoder, which outputs the Mel spectrogram. Finally, a vocoder converts the Mel spectrogram into a speech waveform.
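
The data flow described in the summary can be illustrated with a toy sketch. This is not the patented implementation: every function name, the embedding width `DIM`, and the stand-in operations (character hashing for the encoder, uniform repetition for alignment, mean removal for the content adapter, average pooling for the style adapter, addition for the decoder) are assumptions chosen only to show how the modules connect. The vocoder stage is omitted.

```python
import random

DIM = 4  # toy feature width (assumption, not from the patent)


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(len(vectors[0]))]


def text_encoder(text):
    """Map each character to a deterministic toy vector (stand-in for a learned encoder)."""
    encoded = []
    for ch in text:
        rng = random.Random(ord(ch))
        encoded.append([rng.uniform(-1.0, 1.0) for _ in range(DIM)])
    return encoded


def align(encoded, n_frames):
    """Stretch token vectors to frame rate by uniform repetition
    (stand-in for the learned voice-text alignment)."""
    n = len(encoded)
    return [encoded[min(i * n // n_frames, n - 1)] for i in range(n_frames)]


def content_adapter(frames):
    """Subtract the utterance mean, a crude stand-in for removing style attributes."""
    mean = mean_vec(frames)
    return [[x - m for x, m in zip(frame, mean)] for frame in frames]


def multi_level_style(ref_mel, scales=(1, 2, 4)):
    """Average-pool the reference audio features at several window sizes,
    then fuse the per-scale summaries by averaging them."""
    summaries = []
    for s in scales:
        pooled = [mean_vec(ref_mel[i:i + s]) for i in range(0, len(ref_mel), s)]
        summaries.append(mean_vec(pooled))
    return mean_vec(summaries)


def decoder(content, style):
    """Combine content frames with the fused style vector by addition
    (stand-in for the voice frame decoder)."""
    return [[c + s for c, s in zip(frame, style)] for frame in content]


def synthesize_mel(text, ref_mel, n_frames):
    """Run the toy pipeline: encode, align, adapt content, extract style, decode."""
    content = content_adapter(align(text_encoder(text), n_frames))
    style = multi_level_style(ref_mel)
    return decoder(content, style)


# Usage: a toy reference "Mel spectrogram" of 8 frames, DIM bins each.
reference = [[0.1 * i + d for d in range(DIM)] for i in range(8)]
mel = synthesize_mel("hello", reference, n_frames=12)
print(len(mel), len(mel[0]))
```

In this sketch the content path carries no utterance-level offset after adaptation, so the mean of the output frames equals the fused style vector; a real system would learn each of these mappings rather than hard-code them.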