Bangladeshi Bangla speech corpus for automatic speech recognition research


Bibliographic details
Published in:	Speech Communication, 2022-01, Vol. 136, pp. 84-97
Main authors:	Kibria, Shafkat; Samin, Ahnaf Mozib; Kobir, M. Humayon; Rahman, M. Shahidur; Selim, M. Reza; Iqbal, M. Zafar
Format:	Article
Language:	English
Subjects:	Accentuation; Algorithms; Automatic speech recognition; Bangladeshi Bangla corpus; Bengali; Continuous speech; Corpora evaluation; Corpus linguistics; Error analysis; Neural networks; Quality assessment; Recurrent neural network; Recurrent neural networks; Speech; Speech recognition; Spoken language; Voice recognition

Description
Summary:
• Development of a language resource for Bangladeshi Bangla spoken language (BBSL).
• Development of a large speech corpus named সুবাক্য (SUBAK.KO).
• Evaluation of the strengths and weaknesses of the SUBAK.KO corpus by comparing it with a similar large open-source corpus.
• SUBAK.KO is a more balanced corpus than the large open-source corpus by Google.
• SUBAK.KO covers most of the regional accent variability among speakers of Bangladeshi Bangla.

This article reports the development of a language resource for Bangladeshi Bangla spoken language (BBSL). Bangladeshi Bangla lacks adequate large speech corpora for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. The accuracy of an automatic speech recognition (ASR) system depends on the quality of the speech corpus. This work discusses the common issues and activities involved in developing a large speech corpus named সুবাক্য (SUBAK.KO). The corpus is designed to support ASR research in Bangladeshi Bangla and is labeled sentence-wise. We trained a model on this corpus using one of the well-known current end-to-end ASR algorithms, Recurrent Neural Networks (RNNs) with Connectionist Temporal Classification (CTC). To identify the corpus's strengths and weaknesses, we observed the Character Error Rate (CER) and Word Error Rate (WER) of the trained RNN-CTC model. Another large open-source Bangla ASR corpus was used to train a model with the same ASR algorithm, and the two trained models were compared to assess the quality of the corpora. We found that SUBAK.KO is the more balanced corpus and covers more regional accent variability for an LVCSR system than that large open-source Bangla ASR corpus.
ISSN:	0167-6393 1872-7182
DOI:	10.1016/j.specom.2021.12.004
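
The summary above compares corpora by the CER and WER of the trained models. As a minimal sketch of how these two metrics are commonly computed (not the authors' evaluation code), both can be defined as the Levenshtein edit distance between a reference transcript and the model output, divided by the reference length. The function names below are hypothetical, and counting characters as Unicode code points is an assumption; for Bangla, an evaluation might instead count grapheme clusters, and the article's exact convention is not stated in this record.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate over Unicode code points (an assumption, see above)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only; these are not taken from either corpus.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

Because both rates are normalized by the reference length, they can exceed 1.0 when the model output contains many insertions.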