Speech synthesis using dynamical modelling with global variance
Format: | Patent |
---|---|
Language: | English |
Summary: | A text-to-speech (TTS) system is trained according to a linear dynamic model (LDM). Text is converted to a sequence of linguistic units (e.g. phonemes, sub-phonemes), and each state of these units is looked up in an acoustic model table to produce a sequence of speech vectors. Before being output as speech, the sequence is adjusted to increase the variance of the speech vectors vi(d) based on a predefined global variance v. A predefined number T of hidden vectors x_t evolve according to a state equation involving an observation matrix H, a state transformation matrix F, covariance matrices Q and R, and mean vectors m. Second-order LDMs may be constrained to be critically damped towards a target q, and speech parameter trajectories Y may be calculated by a steepest ascent method. |
---|
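
Two ideas in the summary can be illustrated with a minimal Python sketch: a second-order recursion that is critically damped towards a target q, and a rescaling that brings the variance of a trajectory to a predefined global variance. This is not the patent's method. The function names, the scalar (one-dimensional) setting, the `pole` parameter, the noise-free recursion, and the simple variance-matching rescale are all assumptions made for illustration; the abstract does not give the actual equations, and the steepest ascent step is not shown.

```python
import statistics


def critically_damped_trajectory(target, pole, steps, start=0.0):
    """Noise-free scalar second-order recursion with a repeated real pole.

    x_t = 2*pole*x_{t-1} - pole**2*x_{t-2} + (1 - pole)**2 * target

    The repeated pole makes the recursion critically damped, and its
    fixed point is `target` (hypothetical stand-in for the target q).
    """
    if not 0.0 < pole < 1.0:
        raise ValueError("pole must lie strictly between 0 and 1")
    prev2, prev1 = start, start  # assumed to start at rest
    trajectory = []
    for _ in range(steps):
        current = (2.0 * pole * prev1
                   - pole ** 2 * prev2
                   + (1.0 - pole) ** 2 * target)
        trajectory.append(current)
        prev2, prev1 = prev1, current
    return trajectory


def scale_to_global_variance(values, global_variance):
    """Rescale values about their mean so their population variance
    equals `global_variance` (an assumed, simplified adjustment)."""
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)
    if variance == 0.0:
        return list(values)  # a constant sequence cannot be rescaled
    gain = (global_variance / variance) ** 0.5
    return [mean + gain * (v - mean) for v in values]


if __name__ == "__main__":
    smooth = critically_damped_trajectory(target=1.0, pole=0.5, steps=20)
    widened = scale_to_global_variance(smooth, global_variance=0.25)
    print(smooth[-1], statistics.pvariance(widened))
```

In this sketch the trajectory approaches the target without oscillating, and the rescaling changes the spread of the values while leaving their mean unchanged.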