A bootstrapping approach to automating prosodic annotation for limited-domain synthesis

Most speech synthesis systems use symbolic prosody labels for marking emphasis and phrase structure, but in corpus-based approaches prosodic annotation of speech is a labor-intensive process, driving up the cost of developing new voices. This paper explores the potential for reducing that cost by...

Bibliographic Details
Main Authors:	Bulyko, I., Ostendorf, M.
Format:	Conference Proceeding
Language:	English
Subjects:	Accuracy; Automatic speech recognition; Control system synthesis; Costs; Decision trees; Frequency; Speech processing; Speech synthesis; Training data

Description
Summary:	Most speech synthesis systems use symbolic prosody labels for marking emphasis and phrase structure, but in corpus-based approaches prosodic annotation of speech is a labor-intensive process, driving up the cost of developing new voices. This paper explores the potential for reducing that cost by using a bootstrapping approach to automatic prosodic annotation, particularly in a limited-domain application. A perceptual experiment shows that, using predominantly automatic prosody labels, we can achieve nearly as high synthesis quality as if all data were hand-labeled.
DOI:	10.1109/WSS.2002.1224385