Improving Automatic Forced Alignment for Phoneme Segmentation in Quranic Recitation
Published in: | IEEE Access 2024, Vol.12, p.229-244 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Segmentation plays a crucial role in speech processing applications, where high accuracy is essential. Improving the accuracy of automatic segmentation, particularly for the Arabic language, has attracted substantial attention. However, the differences between Qur'an recitation and normal Arabic speech, especially the intonation rules that affect the lengthening of long vowels, pose challenges for segmenting Qur'an recitation. This research addresses these challenges by studying automatic segmentation for Qur'an recitation recognition. The proposed scheme employs a hidden Markov model (HMM) forced alignment algorithm. To enhance segmentation precision, several refinements are introduced, with a primary emphasis on the phonetic model of the Qur'an and Tajweed, particularly the intricate rules governing elongation. These enhancements encompass the adaptation of an acoustic model tailored for Qur'anic recitation as preprocessing and culminate in an algorithm that refines forced alignment based on the phonetic nuances of the Qur'an. They are integrated as post-processing components for the classic HMM-based forced alignment. The research uses a comprehensive database featuring recordings from 100 renowned Qur'an reciters, covering the recitation of 21 Qur'anic verses (Ayat). Additionally, 30 reciters were asked to record the same verses at various recitation speeds. To facilitate evaluation, a random sample of the Qur'anic database was manually segmented, comprising 21 Ayats, totaling 19,800 words, with 89 unique words (14 verses x 3 recitation levels: fast, slow and normal x 6 readers). The results show notable improvements in the alignment of long vowels within Qur'an recitation, while maintaining the precise alignment of vowels and consonants. 
Objective comparisons between the proposed automatic methods and manual segmentation were conducted to ascertain the superior approach. The findings affirm that the classic forced alignment method produces satisfactory outcomes when employed on verses lacking long vowels. However, its performance diminishes when confronted with verses containing long vowels. Therefore, the test samples were categorized into three groups based on the presence of long vowels, resulting in a Correct Classification Rate (CCR) that ranged fr |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2023.3345843 |