End-to-End Transformer-based Automatic Speech Recognition for Northern Kurdish: A Pioneering Approach
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Automatic Speech Recognition (ASR) for low-resource languages remains a
challenging task due to limited training data. This paper introduces a
comprehensive study exploring the effectiveness of Whisper, a pre-trained ASR
model, for Northern Kurdish (Kurmanji), an under-resourced language spoken in
the Middle East. We investigate three fine-tuning strategies: vanilla, specific
parameters, and additional modules. Using a Northern Kurdish fine-tuning speech
corpus containing approximately 68 hours of validated transcribed data, our
experiments demonstrate that the additional module fine-tuning strategy
significantly improves ASR accuracy on a specialized test set, achieving a Word
Error Rate (WER) of 10.5% and Character Error Rate (CER) of 5.7% with Whisper
version 3. These results underscore the potential of sophisticated transformer
models for low-resource ASR and emphasize the importance of tailored
fine-tuning techniques for optimal performance. |
---|---|
DOI: | 10.48550/arxiv.2410.16330 |
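
The summary reports results as Word Error Rate (WER) and Character Error Rate (CER). As a reading aid, the sketch below shows how these two metrics are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. This record does not state which scoring tool or text normalization the authors used, so the function names (`edit_distance`, `wer`, `cer`) and the English example sentences are illustrative assumptions, not the paper's evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length.

    Spaces are counted as characters here; real toolkits differ on this
    (assumption made for simplicity).
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcripts, chosen only to illustrate the arithmetic.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

Under these definitions, a WER of 10.5% means roughly one word-level edit for every ten reference words. Reported scores also depend on how punctuation and casing are normalized before scoring, which this record does not specify.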