ACOUSTIC MODEL LEARNING DEVICE, VOICE SYNTHESIS DEVICE, METHOD, AND PROGRAM

Bibliographic Details
Main authors:	OHTANI, Yamato, MATSUNAGA, Noriyuki
Format:	Patent
Language:	eng ; fre ; ger
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	A DNN-based speech synthesis technique is presented that achieves low latency and appropriate modeling when computational resources are limited. An acoustic model learning apparatus includes a corpus storage unit configured to store natural linguistic feature sequences and natural speech parameter sequences, extracted from a plurality of speech data, per speech unit; a prediction model storage unit configured to store a feed-forward neural network type prediction model for predicting a synthesized speech parameter sequence from a natural linguistic feature sequence; a prediction unit configured to input the natural linguistic feature sequence and predict the synthesized speech parameter sequence using the prediction model; an error calculation device configured to calculate an error between the synthesized speech parameter sequence and the natural speech parameter sequence; and a learning unit configured to perform a predetermined optimization on the error and train the prediction model; wherein the error calculation device is configured to use a loss function that associates adjacent frames with respect to the output layer of the prediction model.
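
The summary does not give the form of the loss function, so the following is only a minimal sketch of one way a loss could associate adjacent frames: it adds an error on frame-to-frame differences to the ordinary per-frame error. The names `frame_deltas`, `mean_squared_error`, `adjacent_frame_loss` and the `delta_weight` parameter are hypothetical, and the use of first-order differences is an assumption for illustration, not the method claimed in the patent.

```python
from typing import List, Sequence

# A sequence of T frames, each holding D speech parameters.
Frames = Sequence[Sequence[float]]


def frame_deltas(frames: Frames) -> List[List[float]]:
    """Return first-order differences between each pair of adjacent frames."""
    return [
        [b - a for a, b in zip(prev, curr)]
        for prev, curr in zip(frames, frames[1:])
    ]


def mean_squared_error(x: Frames, y: Frames) -> float:
    """Mean squared error over all frames and parameter dimensions."""
    total, count = 0.0, 0
    for frame_x, frame_y in zip(x, y):
        for a, b in zip(frame_x, frame_y):
            total += (a - b) ** 2
            count += 1
    return total / count if count else 0.0


def adjacent_frame_loss(
    predicted: Frames, natural: Frames, delta_weight: float = 1.0
) -> float:
    """Per-frame error plus a weighted error on adjacent-frame differences.

    Assumption: the delta term stands in for "associating adjacent frames";
    the patent summary does not specify this form.
    """
    if len(predicted) != len(natural):
        raise ValueError("sequences must contain the same number of frames")
    static_term = mean_squared_error(predicted, natural)
    delta_term = mean_squared_error(
        frame_deltas(predicted), frame_deltas(natural)
    )
    return static_term + delta_weight * delta_term


# Two predictions with the same per-frame error against the same target.
natural = [[0.0], [1.0], [2.0]]
shifted = [[1.0], [2.0], [3.0]]   # constant offset, same trajectory shape
jittery = [[1.0], [0.0], [3.0]]   # same per-frame error, wrong trajectory

loss_shifted = adjacent_frame_loss(shifted, natural)
loss_jittery = adjacent_frame_loss(jittery, natural)
```

In this sketch both predictions have the same per-frame squared error, but the jittery one receives a larger loss because its frame-to-frame movement differs from the natural sequence. A term of this kind lets a feed-forward model, which predicts each frame independently, still be trained toward smooth trajectories without a recurrent structure.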