Text-to-speech (TTS) processing

A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarde...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Prateek, Nishant, Merritt, Thomas Edward, Breen, Andrew Paul, Putrycz, Bartosz, Chicote, Roberto Barra, Nadolski, Adam Franciszek, Aggarwal, Vatsal
Format:	Patent
Sprache:	eng
Schlagworte:	ACOUSTICS MUSICAL INSTRUMENTS PHYSICS SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarded at runtime.