End-to-end long-time speech recognition method

Bibliographic details
Main authors: ZOU JUNWEI, LYU BAIYANG, MING YUE, WEN ZHIGANG, LI ZERUI
Format: Patent
Language: chi; eng
Description
Abstract: The invention provides an end-to-end long-time speech recognition method. The method comprises the following steps: selecting a corpus as the training data set, and performing data preprocessing and feature extraction on the speech data in the training data set to generate speech features; constructing an improved RNN-T model that fuses an external language model with a long-time speech recognition algorithm, and training it on the speech features to obtain a trained improved RNN-T model; and using the trained improved RNN-T model as the teacher model in a mutual learning knowledge distillation algorithm, training a student model with that algorithm, recognizing the long-time speech data with the trained and verified student model, and outputting the speech recognition result. According to the method, the external language model, the long-time speech recognition algorithm module a
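
The distillation step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss, so everything here is an assumption: the sketch follows the common deep-mutual-learning formulation, in which each model is trained on its own cross-entropy plus a KL term toward the other model's output distribution. The function names, the weight `alpha`, and the use of a single output distribution per step are hypothetical and are not taken from the patent.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Negative log-probability assigned to the reference label."""
    return -math.log(probs[target])


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mutual_learning_losses(teacher_logits, student_logits, target, alpha=0.5):
    """Return (teacher_loss, student_loss) for one output step.

    Assumed formulation, not the patent's: each model minimizes its own
    cross-entropy plus alpha times the KL divergence from the other
    model's distribution to its own.
    """
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    student_loss = (cross_entropy(p_student, target)
                    + alpha * kl_divergence(p_teacher, p_student))
    teacher_loss = (cross_entropy(p_teacher, target)
                    + alpha * kl_divergence(p_student, p_teacher))
    return teacher_loss, student_loss


if __name__ == "__main__":
    # Toy three-symbol vocabulary; label index 0 is the reference symbol.
    t_loss, s_loss = mutual_learning_losses([2.0, 0.5, -1.0], [0.3, 0.2, 0.1], target=0)
    print(f"teacher loss: {t_loss:.4f}, student loss: {s_loss:.4f}")
```

In a real RNN-T setup the distributions would come from the joint network over the time and label lattice, and the cross-entropy term would typically be replaced by the transducer loss; the sketch only shows how the two models' losses are coupled.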