The convolutional neural networks for Amazigh speech recognition system

Bibliographic details
Published in:	Telkomnika 2021-04, Vol.19 (2), p.515-522
Main authors:	Telmem, Meryam; Ghanou, Youssef
Format:	Article
Language:	English
Subjects:	Accuracy; Age; Artificial neural networks; Audio data; Automatic speech recognition; Berber languages; Datasets; Deep learning; Experiments; Feature extraction; Model accuracy; Neural networks; Speech; Speech recognition; Voice recognition
Online access:	Full text

Description
Abstract:	In this paper, we present an approach based on convolutional neural networks to build an automatic speech recognition system for the Amazigh language. The system is built with TensorFlow and uses mel-frequency cepstral coefficients (MFCCs) to extract features. To test the effect of the speaker's gender and age on the accuracy of the model, the system was trained and tested on several datasets. In the first experiment, the dataset consists of 9240 audio files. In the second experiment, the dataset consists of 9240 audio files distributed between female and male speakers. In the third experiment, the dataset consists of 13860 audio files distributed among three age groups: 9-15, 16-30, and 30+. The results show that the model trained on the dataset of adult speakers in the 30+ age category achieves the best accuracy, 93.9%.
ISSN:	1693-6930 2302-9293
DOI:	10.12928/telkomnika.v19i2.16793