Speech synthesis using dynamical modelling with global variance

Bibliographic details
Main authors: Vassilis Digalakis, Vassilis Tsiaras, Vassilis Diakoloukas, Ioannis Stylianou, Ranniery Maia
Format: Patent
Language: English
Description
Summary: A text-to-speech (TTS) system is trained according to a linear dynamic model (LDM). Text is converted to a sequence of linguistic units (e.g. phonemes, sub-phonemes), and each state of each unit is looked up in an acoustic model table to produce a sequence of speech vectors. Before being output as speech, this sequence is adjusted to increase the variance of the speech vectors vi(d) based on a predefined global variance v. A predefined number T of hidden vectors x_t evolve according to a state equation involving an observation matrix H, a state transformation matrix F, covariance matrices Q and R, and mean vectors m. Second-order LDMs may be constrained to be critically damped towards a target q, and speech parameter trajectories Y may be calculated according to a steepest ascent method.
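
The two ideas named in the summary, a second-order LDM that is critically damped towards a target q and a variance adjustment based on a global variance v, can be illustrated with a small scalar sketch. This is not the patented method: the record gives no equations, so the recursion below (a repeated real pole, named `pole` here), the scalar observation gain `h`, and the simple rescaling in `match_global_variance` are all assumptions chosen for illustration, and every function and parameter name is hypothetical.

```python
import random


def simulate_critically_damped(target, steps, x0=0.0, pole=0.8, h=1.0,
                               state_noise_std=0.0, obs_noise_std=0.0, seed=0):
    """Scalar second-order LDM with a repeated pole (assumed form, not from the patent).

    Hidden state:  x_t = 2*pole*x_{t-1} - pole**2*x_{t-2} + (1 - pole)**2*target + w_t
    Observation:   y_t = h*x_t + v_t
    The characteristic polynomial is (z - pole)**2, so for 0 < pole < 1 the
    noise-free state approaches `target` without oscillating.
    """
    rng = random.Random(seed)
    prev2 = prev1 = x0
    observations = []
    for _ in range(steps):
        w = rng.gauss(0.0, state_noise_std) if state_noise_std > 0 else 0.0
        v = rng.gauss(0.0, obs_noise_std) if obs_noise_std > 0 else 0.0
        x = 2 * pole * prev1 - pole ** 2 * prev2 + (1 - pole) ** 2 * target + w
        observations.append(h * x + v)
        prev2, prev1 = prev1, x
    return observations


def match_global_variance(trajectory, global_variance):
    """Rescale a trajectory about its mean so its variance equals global_variance.

    This plain rescaling is a stand-in for the variance adjustment the summary
    mentions; the actual adjustment in the patent may differ.
    """
    n = len(trajectory)
    mean = sum(trajectory) / n
    variance = sum((y - mean) ** 2 for y in trajectory) / n
    if variance == 0.0:
        return list(trajectory)  # a constant trajectory cannot be rescaled
    scale = (global_variance / variance) ** 0.5
    return [mean + scale * (y - mean) for y in trajectory]


if __name__ == "__main__":
    smooth = simulate_critically_damped(target=1.0, steps=20)
    widened = match_global_variance(smooth, global_variance=0.5)
    for t, (a, b) in enumerate(zip(smooth, widened)):
        print(f"t={t:2d}  ldm={a:.4f}  gv_adjusted={b:.4f}")
```

With the noise terms left at zero, the trajectory rises smoothly to the target and never overshoots it, which is the behaviour the critical-damping constraint is meant to enforce. The rescaling step keeps the trajectory's mean and changes only its spread.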