A TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM, AND A METHOD OF CALCULATING AN EXPRESSIVITY SCORE

Bibliographic details
Main authors:	FLYNN, John; QURESHI, Zeenat
Format:	Patent
Language:	English; French; German
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Abstract:	A method includes receiving text and inputting the received text in a prediction network. The method further includes generating, using the prediction network, speech data. The prediction network comprises a neural network that is trained to generate expressive speech data from text. The neural network is trained by: receiving a first training dataset comprising audio data and corresponding text data; acquiring a respective expressivity score for each audio sample of the audio data; selecting, from the first training dataset, a first subset of training data based on the respective expressivity scores of the audio data in the first training dataset; generating, for the first subset of training data, prediction audio data for the corresponding text data; and comparing the prediction audio data to the audio data of the first subset of training data.
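
The training steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the subset is selected or how predicted and reference audio are compared, so the score threshold, the mean squared error, the Sample type, and the function names select_subset and subset_loss are all assumptions made for illustration. Audio is represented as a plain list of floats standing in for real audio features.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """One training pair: text, its audio, and an expressivity score (hypothetical type)."""
    text: str
    audio: List[float]       # stand-in for audio features such as a spectrogram
    expressivity: float      # acquired per audio sample; how it is computed is not shown here


def select_subset(dataset: List[Sample], threshold: float) -> List[Sample]:
    """Select training data based on expressivity scores.

    Assumption: a simple lower threshold. The abstract only says the
    selection is "based on" the scores.
    """
    return [s for s in dataset if s.expressivity >= threshold]


def mean_squared_error(predicted: List[float], reference: List[float]) -> float:
    """Compare predicted audio with reference audio (assumed comparison metric)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference audio must have equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)


def subset_loss(predict: Callable[[str], List[float]], subset: List[Sample]) -> float:
    """Generate prediction audio for each text in the subset and average the comparison."""
    if not subset:
        raise ValueError("subset is empty; lower the threshold")
    losses = [mean_squared_error(predict(s.text), s.audio) for s in subset]
    return sum(losses) / len(losses)
```

In a real system, predict would be the prediction network and the resulting loss would typically be used to update its parameters; that update step is not described in the abstract and is left out here.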