Bi-LSTM (Bidirectional-Long Short-Term Memory Recurrent Neural Networks) and WaveNet fused voice conversion method

Bibliographic details
Main authors: SUN MENG, CAO TIEYONG, ZENG XIN, LI LI, MIAO XIAOKONG, ZHANG XIONGWEI, ZHENG CHANGYAN
Format: Patent
Language: Chinese; English
Description
Summary: The invention provides a voice conversion method that fuses a Bi-LSTM (bidirectional long short-term memory recurrent neural network) with WaveNet. The method comprises the following steps: first, features are extracted from the voice to be converted, and its Mel-frequency cepstral coefficients (MFCCs) are sent to a feature conversion network to obtain converted MFCCs; next, the aperiodic frequency of the voice to be converted, the linearly converted fundamental frequency (F0), and the converted MFCCs are up-sampled and sent to a voice generation network to obtain a pre-generated voice; the MFCCs of the pre-generated voice are then sent to a post-processing network for post-processing; finally, the post-processed MFCCs, the aperiodic frequency of the voice to be converted, and the linearly converted F0 are up-sampled and sent to the voice generation network to gene
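The summary does not say how the fundamental frequency is "linearly converted" or how the frame-level features are up-sampled. The sketch below illustrates one common reading, and every part of it is an assumption rather than the patented method: F0 is mapped with a log-domain mean and variance transform, and frame-level features are up-sampled by repeating each frame. The function names, the statistics arguments, and `hop_length` are hypothetical.

```python
import math


def convert_f0_linear(f0, src_mean, src_std, tgt_mean, tgt_std):
    """Map source F0 to the target speaker's range in the log domain.

    Assumption: means and standard deviations are log-F0 statistics of the
    source and target speakers. Unvoiced frames (f0 == 0) are left at 0.
    """
    converted = []
    for value in f0:
        if value > 0:
            normalized = (math.log(value) - src_mean) / src_std
            converted.append(math.exp(normalized * tgt_std + tgt_mean))
        else:
            converted.append(0.0)
    return converted


def upsample_frames(frames, hop_length):
    """Repeat each frame-level vector hop_length times (nearest-neighbour),
    so that one conditioning vector exists per audio sample."""
    return [frame for frame in frames for _ in range(hop_length)]


def build_conditioning(mfcc, f0, aperiodic, hop_length):
    """Concatenate per-frame MFCCs, F0 and aperiodic features, then up-sample.

    The result is the kind of sample-rate conditioning sequence a WaveNet-style
    generator could consume; the network itself is outside this sketch.
    """
    frames = [list(m) + [f] + list(a) for m, f, a in zip(mfcc, f0, aperiodic)]
    return upsample_frames(frames, hop_length)
```

For example, `build_conditioning(mfcc, convert_f0_linear(f0, ...), aperiodic, hop_length=80)` would yield 80 conditioning vectors per analysis frame; the value 80 is only an illustrative hop size.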