Continuous Punjabi speech recognition model based on Kaldi ASR toolkit

Bibliographic details
Published in: International journal of speech technology 2018-06, Vol.21 (2), p.211-216
Main authors: Guglani, Jyoti; Mishra, A. N.
Format: Article
Language: English
Description
Abstract: In this paper, a continuous Punjabi speech recognition model is presented using the Kaldi toolkit. For speech recognition, Mel frequency cepstral coefficient (MFCC) features and perceptual linear prediction (PLP) features were extracted from continuous Punjabi speech samples. The performance of the automatic speech recognition (ASR) system is reported for both the monophone model and the triphone models (tri1, tri2 and tri3), using an N-gram language model. Performance was computed in terms of word error rate (WER). A significant reduction in WER was observed with the triphone models over the monophone model. In addition, the tri3 model outperformed the tri2 model, which in turn outperformed the tri1 model. Further, MFCC features were found to provide higher recognition accuracy than PLP features for continuous Punjabi speech.
ISSN:1381-2416
1572-8110
DOI:10.1007/s10772-018-9497-6
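
Note on the metric: the abstract reports results as word error rate (WER). As a minimal sketch of how WER is conventionally defined (word-level edit distance divided by the number of reference words), the following Python function may help. It is an illustration only, not the scoring procedure used in the paper; the function name and the placeholder sentences are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Placeholder word sequences, not data from the paper.
print(word_error_rate("a b c d", "a x c"))
```

In the example call, the hypothesis differs from the four-word reference by one substitution and one deletion, so the function returns 2 / 4. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.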