SPEECH SYNTHESIS MODEL SELECTION

In some implementations, a text-to-speech system may perform a mapping of acoustic frames to linguistic model clusters in a pre-selection process for unit selection synthesis. An architecture may leverage data-driven models, such as neural networks that are trained using recorded speech samples, to...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Fructuoso Javier Gonzalvo, Chun Byungha
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In some implementations, a text-to-speech system may perform a mapping of acoustic frames to linguistic model clusters in a pre-selection process for unit selection synthesis. An architecture may leverage data-driven models, such as neural networks that are trained using recorded speech samples, to effectively map acoustic frames to linguistic model clusters during synthesis. This architecture may allow for improved handling and synthesis of combinations of unseen linguistic features.