Text-based speech synthesis method containing synthetic speech comparisons and updates

The invention relates to the improvement of voice-controlled systems with text-based speech synthesis, in particular with the improvement of the synthetic reproduction of a stored trail of characters whose pronunciation is subject to certain peculiarities. The invention specifies a simple reproducti...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Buth, Peter, Dufhues, Frank
Format: Patent
Sprache:eng
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The invention relates to the improvement of voice-controlled systems with text-based speech synthesis, in particular with the improvement of the synthetic reproduction of a stored trail of characters whose pronunciation is subject to certain peculiarities. The invention specifies a simple reproduction method with improved pronunciation for voice-controlled systems with text-based speech synthesis even when the stored train of characters to be synthesized does not follow the general rules of speech reproduction. According to the invention, the method of "copying" the original spoken input text into the otherwise synthesized reproduction text, which is the current state of the art, is avoided, which will significantly increase the acceptance of the user of the voice-controlled system due to the process invented. More specifically, when there is actual spoken speech input that corresponds to a stored train of characters, the converted train of characters is compared to the speech input before reproduction of the train of characters described phonetically according to general rules and converted to a purely synthetic form. When the converted train of characters is found to deviate from the speech input by a value above a threshold value, at least one variation of the converted train of characters is created. This variation is then output instead of the converted train of characters as long as this variation deviates from the speech input by a value below the threshold value.