Evaluation of soft segment modeling on a context independent phoneme classification system

Detailed description

Bibliographic details
Published in:	The Arabian Journal for Science and Engineering. 2007, Vol.32 (1B), p.49-65
Main authors:	Razzazi, Farbod ; Sayadiyan, Abu al-Qasim
Format:	Article
Language:	ara ; eng
Subjects:	Automatic speech recognition ; Hidden Markov models
Online access:	Full text

Description
Summary:	The geometric distribution of state durations is one of the main performance-limiting assumptions of hidden Markov modeling of speech signals. Stochastic segment models in general, and segmental HMMs in particular, partly overcome this deficiency, at the cost of greater complexity in both the training and recognition phases. In addition to this assumption, the gradual temporal change of speech statistics is not modeled in HMMs. In this paper, a new duration modeling approach is presented. The main idea of the model is to consider the effect of adjacent segments on the estimation and evaluation of the probability density function of each acoustic segment. This idea not only makes the model robust against segmentation errors, but also models the gradual change from one segment to the next with a minimal set of parameters. The proposed idea is analytically formulated and tested on a TIMIT-based context-independent phoneme classification system.
During the test procedure, phoneme classification of different phoneme classes was performed by applying the various proposed recognition algorithms. The system was optimized, and the results were compared with a continuous density hidden Markov model (CDHMM) of similar computational complexity. The results show an 8–10% improvement in phoneme recognition rate over the standard continuous density hidden Markov model, which indicates that the proposed model is better matched to the nature of speech.
ISSN:	1319-8025 2191-4281
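
Note on the geometric duration assumption: the summary refers to the geometric distribution of state durations in a standard HMM. The sketch below illustrates only that textbook property and is not the soft segment model of the article. A state with self-transition probability a stays for exactly d frames with probability a^(d-1) * (1 - a), so the most likely duration is always a single frame. The function names and the values 0.5 and 0.9 are hypothetical, chosen for illustration.

```python
import random


def geometric_duration_pmf(self_loop_prob, max_duration):
    """Return P(d) = a**(d-1) * (1 - a) for d = 1..max_duration.

    `self_loop_prob` (a) is the probability of staying in the same
    HMM state from one frame to the next (illustrative parameter).
    """
    a = self_loop_prob
    return [a ** (d - 1) * (1 - a) for d in range(1, max_duration + 1)]


def expected_duration(self_loop_prob):
    """Mean of the geometric duration distribution: 1 / (1 - a)."""
    return 1.0 / (1.0 - self_loop_prob)


def sample_state_duration(self_loop_prob, rng):
    """Simulate how many frames a state is occupied before leaving it."""
    duration = 1
    while rng.random() < self_loop_prob:
        duration += 1
    return duration


if __name__ == "__main__":
    a = 0.9  # assumed self-transition probability, not from the article
    pmf = geometric_duration_pmf(a, 5)
    print("P(d) for d = 1..5:", [round(p, 4) for p in pmf])
    print("expected duration:", expected_duration(a))

    rng = random.Random(0)
    samples = [sample_state_duration(a, rng) for _ in range(20000)]
    print("simulated mean duration:", sum(samples) / len(samples))
```

Because the probability mass decreases monotonically with d, this duration model cannot place its peak at a typical phoneme length, which is the limitation that segment-based models address.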