Text-to-speech (TTS) processing

A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarde...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Prateek, Nishant, Merritt, Thomas Edward, Breen, Andrew Paul, Putrycz, Bartosz, Chicote, Roberto Barra, Nadolski, Adam Franciszek, Aggarwal, Vatsal
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarded at runtime.