A TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM, AND A METHOD OF CALCULATING AN EXPRESSIVITY SCORE
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng ; fre ; ger |
Summary: | A method includes receiving text and inputting the received text into a prediction network. The method further includes generating speech data using the prediction network. The prediction network comprises a neural network that is trained to generate expressive speech data from text. The neural network is trained by: receiving a first training dataset comprising audio data and corresponding text data; acquiring a respective expressivity score for each audio sample of the audio data; selecting, from the first training dataset, a first subset of training data based on the respective expressivity scores of the audio data in the first training dataset; generating, for the first subset of training data, prediction audio data for the corresponding text data; and comparing the prediction audio data to the audio data of the first subset of training data. |
---|
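
The training steps listed in the summary (score each audio sample, select a subset by score, generate predictions for the subset's text, compare predictions to the recorded audio) can be illustrated with a minimal Python sketch. The abstract does not say how the subset is chosen or how the comparison is made, so the threshold rule, the mean-squared-error comparison, and all names here (`Sample`, `select_subset`, `subset_loss`, `predict`) are assumptions made for illustration, not the patent's method.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Sample:
    text: str
    audio: List[float]     # stand-in for audio features (hypothetical representation)
    expressivity: float    # expressivity score acquired for this audio sample


def select_subset(dataset: Sequence[Sample], threshold: float) -> List[Sample]:
    """Keep samples whose score meets a threshold (assumed selection rule)."""
    return [s for s in dataset if s.expressivity >= threshold]


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Compare predicted and recorded audio (assumed comparison measure)."""
    if len(pred) != len(target) or not target:
        raise ValueError("pred and target must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def subset_loss(subset: Sequence[Sample],
                predict: Callable[[str], List[float]]) -> float:
    """Generate prediction audio for each text and average the comparison."""
    if not subset:
        raise ValueError("subset is empty")
    losses = [mean_squared_error(predict(s.text), s.audio) for s in subset]
    return sum(losses) / len(losses)


# Toy data and a toy predictor standing in for the prediction network.
dataset = [
    Sample("flat reading", [0.0, 0.0], expressivity=0.1),
    Sample("excited line", [1.0, 1.0], expressivity=0.8),
    Sample("angry line", [2.0, 0.0], expressivity=0.9),
]
subset = select_subset(dataset, threshold=0.5)
loss = subset_loss(subset, predict=lambda text: [0.0, 0.0])
```

In an actual training loop the value returned by `subset_loss` would drive a parameter update of the network; that step is omitted here because the summary does not describe it.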