Speaker Adaptive Text-to-Speech with Timbre-Normalized Vector-Quantized Feature

Achieving high fidelity and speaker similarity in text-to-speech speaker adaptation with a limited amount of data is a challenging task. Most existing methods only consider adapting to the timbre of the target speakers but fail to capture their speaking styles from little data. In this work, we propos...

Bibliographic Details
Published in:	IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023-01, Vol.31, p.1-12
Main authors:	Du, Chenpeng; Guo, Yiwei; Chen, Xie; Yu, Kai
Format:	Article
Language:	English
Subjects:	Acoustics; Adaptation; Adaptation models; Decomposition; Embedding; Feature extraction; Lookup tables; Similarity; speaker adaptation; Speaking; Speech processing; Speech recognition; speech synthesis; Timbre; timbre normalization; Training; vector quantization; Vocoders