Amazigh CNN speech recognition system based on Mel spectrogram feature extraction method

Bibliographic details
Published in: International Journal of Speech Technology, 2024-03, Vol. 27 (1), pp. 287-296
Main authors: Boulal, Hossam; Hamidi, Mohamed; Abarkan, Mustapha; Barkani, Jamal
Format: Article
Language: English
Online access: Full text
Description
Summary: Speech recognition makes spoken interaction between humans and machines simpler. Number-oriented communication, such as giving a registration code, mobile number, score, or account number, can benefit from spoken-digit recognition. This paper presents our experience with Amazigh automatic speech recognition (ASR) based on a deep learning approach. Audio samples are converted to Mel spectrograms, which a convolutional neural network (CNN) then classifies. To recognize the Amazigh numerals, we use a database of the digits zero to nine collected from 42 native speakers in total, men and women between the ages of 20 and 40. Our experimental results show that spoken Amazigh digits can be identified with a maximum accuracy of 93.62%, with 94% precision and 94% recall.
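
The summary names the Mel spectrogram as the feature but gives no extraction settings. As a rough illustration only, the sketch below builds a triangular Mel filterbank and applies it to one frame of a power spectrum, using just the Python standard library. The function names, the HTK-style Mel formula, and the parameter values (n_mels, n_fft, sample_rate) are assumptions chosen for the example, not the authors' configuration.

```python
import math

def hz_to_mel(f_hz):
    """Hz -> mel, using the HTK-style formula (one common convention; assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters, each a list of weights over n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points equally spaced on the mel axis: outer edges plus filter centres.
    hz_points = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Centre frequency of every FFT bin in Hz.
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)     # slope from lower edge up to the centre
            falling = (hi - f) / (hi - mid)    # slope from the centre down to the upper edge
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def mel_spectrum(power_spectrum, bank):
    """Project one frame of a power spectrum onto the mel filters."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]

# Hypothetical settings, for illustration only.
bank = mel_filterbank(n_mels=8, n_fft=256, sample_rate=8000)
flat_frame = [1.0] * (256 // 2 + 1)            # a flat power spectrum as a stand-in frame
mel_frame = mel_spectrum(flat_frame, bank)
```

Stacking one such Mel frame per short-time analysis window, usually after a logarithm, gives the two-dimensional image that a CNN can take as input. In practice a library routine would normally replace this hand-written filterbank.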
ISSN: 1381-2416, 1572-8110
DOI: 10.1007/s10772-024-10100-0