A speaker-independent Thai polysyllabic word recognition using hidden Markov model

This correspondence presents a speaker-independent speech recognition system for Thai polysyllabic words. The system is based on a discrete hidden Markov model combined with a vector quantization algorithm, an endpoint detection algorithm for syllable endpoint detection and separation, and t...

Bibliographic details
Main authors:	Akhuputra, V.; Jitapunkul, S.; Pornsukchandra, W.; Luksaneeyanawin, S.
Format:	Conference proceeding
Language:	eng
Subjects:	Algorithm design and analysis; Hidden Markov models; Natural languages; Signal analysis; Signal processing; Signal processing algorithms; Smoothing methods; Speech analysis; Speech recognition; Vector quantization

container_end_page	599 vol.2
container_issue
container_start_page	593
container_title
container_volume	2
creator	Akhuputra, V. ; Jitapunkul, S. ; Pornsukchandra, W. ; Luksaneeyanawin, S.
description	This correspondence presents a speaker-independent speech recognition system for Thai polysyllabic words. The system is based on a discrete hidden Markov model combined with a vector quantization algorithm, an endpoint detection algorithm for syllable endpoint detection and separation, and a time normalization algorithm. The 70-word Thai vocabulary is divided into four sets: single-, double-, and triple-syllable words, with 20 words in each set, and a fourth set of the 10 Thai numeric words, zero to nine. The training set and the testing set are separate, and both are composed of male and female speakers aged 18 to 25. Because Thai is a tonal language, the algorithms and the model parameters are modified to suit it. Experiments examine how variations in the model parameters affect the recognition rate. The parameters varied are the number of codebooks, the number of model states, and the number of training speakers. The results show that increasing the number of codebooks and the number of model states has the major effect on the recognition rate, while the number of training speakers has less effect. The average recognition rate of this speaker-independent system is 89.906 percent on a 40-speaker testing set, using a 256-vector codebook of 10th-order linear prediction coefficients and 15-state models. The recognition rates for the four word sets are 86.750 percent for single-syllable words, 92.375 percent for double-syllable words, 96.250 percent for triple-syllable words, and 84.250 percent for the numeric words.
doi_str_mv	10.1109/PACRIM.1997.620333
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_620333</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>620333</ieee_id><sourcerecordid>620333</sourcerecordid><originalsourceid>FETCH-ieee_primary_6203333</originalsourceid><addsrcrecordid>eNp9jssKwjAURAMi-PwBV_cHrImxapYiii4EKd2XaK96bZqUxAf9ewu6dhYzB4aBYWwkeCQEV5Pjap3sD5FQahHNp1xK2WI9vlg2pHgsO2wYwp03msWxULLLkhWECnWBfkw2xwobsw9Ib5qgcqYOtTH6RGd4O5-Dx7O7WnqQs_AMZK9wo7wZwEH7wr2gdDmaAWtftAk4_GWfjbabdL0bEyJmladS-zr7vpN_yw9vFz_W</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>A speaker-independent Thai polysyllabic word recognition using hidden Markov model</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Akhuputra, V. ; Jitapunkul, S. ; Pornsukchandra, W. ; Luksaneeyanawin, S.</creator><creatorcontrib>Akhuputra, V. ; Jitapunkul, S. ; Pornsukchandra, W. ; Luksaneeyanawin, S.</creatorcontrib><description>This correspondence presents a speech recognition system of speaker-independent Thai polysyllabic words. This development is based on the discrete hidden Markov model in conjunction with vector quantization algorithm, endpoint detection algorithm for syllable endpoint detection and separation, and time normalization algorithm. The 70-Thai word vocabulary is subdivided into four sets comprising single, double, and triple syllabled words, 20 words in each set, and the last set consists of 10-Thai numeric words, zero to nine. The separated speech training set and testing set are composed of both male and female speakers within the range of 18 to 25 years old. For the tonal characteristics of the Thai language, the algorithms and the model parameters are modified in order to be applicable to the Thai language. 
The experiments on the effects of model parameter variations on recognition rate are conducted. The model parameters are number of codebooks, number of model states, and number of training speakers. The results show that the increase in the number of codebook and the number of model states have the major effect on the recognition rates. Also, the number of training speakers has less effect than the others. The average recognition rate of this speaker-independent recognition system is 89.906 percent for 40 speakers testing set using 256 vector codebook of 10-order linear prediction coefficients and 15-state model parameters. The recognition rate of the four sets of words are 86.750 percent for single-syllabled words, 92.375 percent for double-syllabled words, 96.250 percent for triple-syllabled words, and 84.250 percent for the numeric words.</description><identifier>ISBN: 0780339053</identifier><identifier>ISBN: 9780780339057</identifier><identifier>DOI: 10.1109/PACRIM.1997.620333</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Hidden Markov models ; Natural languages ; Signal analysis ; Signal processing ; Signal processing algorithms ; Smoothing methods ; Speech analysis ; Speech recognition ; Vector quantization</subject><ispartof>1997 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM. 
10 Years Networking the Pacific Rim, 1987-1997, 1997, Vol.2, p.593-599 vol.2</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/620333$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,4036,4037,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/620333$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Akhuputra, V.</creatorcontrib><creatorcontrib>Jitapunkul, S.</creatorcontrib><creatorcontrib>Pornsukchandra, W.</creatorcontrib><creatorcontrib>Luksaneeyanawin, S.</creatorcontrib><title>A speaker-independent Thai polysyllabic word recognition using hidden Markov model</title><title>1997 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM. 10 Years Networking the Pacific Rim, 1987-1997</title><addtitle>PACRIM</addtitle><description>This correspondence presents a speech recognition system of speaker-independent Thai polysyllabic words. This development is based on the discrete hidden Markov model in conjunction with vector quantization algorithm, endpoint detection algorithm for syllable endpoint detection and separation, and time normalization algorithm. The 70-Thai word vocabulary is subdivided into four sets comprising single, double, and triple syllabled words, 20 words in each set, and the last set consists of 10-Thai numeric words, zero to nine. The separated speech training set and testing set are composed of both male and female speakers within the range of 18 to 25 years old. For the tonal characteristics of the Thai language, the algorithms and the model parameters are modified in order to be applicable to the Thai language. 
The experiments on the effects of model parameter variations on recognition rate are conducted. The model parameters are number of codebooks, number of model states, and number of training speakers. The results show that the increase in the number of codebook and the number of model states have the major effect on the recognition rates. Also, the number of training speakers has less effect than the others. The average recognition rate of this speaker-independent recognition system is 89.906 percent for 40 speakers testing set using 256 vector codebook of 10-order linear prediction coefficients and 15-state model parameters. The recognition rate of the four sets of words are 86.750 percent for single-syllabled words, 92.375 percent for double-syllabled words, 96.250 percent for triple-syllabled words, and 84.250 percent for the numeric words.</description><subject>Algorithm design and analysis</subject><subject>Hidden Markov models</subject><subject>Natural languages</subject><subject>Signal analysis</subject><subject>Signal processing</subject><subject>Signal processing algorithms</subject><subject>Smoothing methods</subject><subject>Speech analysis</subject><subject>Speech recognition</subject><subject>Vector quantization</subject><isbn>0780339053</isbn><isbn>9780780339057</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1997</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNp9jssKwjAURAMi-PwBV_cHrImxapYiii4EKd2XaK96bZqUxAf9ewu6dhYzB4aBYWwkeCQEV5Pjap3sD5FQahHNp1xK2WI9vlg2pHgsO2wYwp03msWxULLLkhWECnWBfkw2xwobsw9Ib5qgcqYOtTH6RGd4O5-Dx7O7WnqQs_AMZK9wo7wZwEH7wr2gdDmaAWtftAk4_GWfjbabdL0bEyJmladS-zr7vpN_yw9vFz_W</recordid><startdate>1997</startdate><enddate>1997</enddate><creator>Akhuputra, V.</creator><creator>Jitapunkul, S.</creator><creator>Pornsukchandra, W.</creator><creator>Luksaneeyanawin, 
S.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>1997</creationdate><title>A speaker-independent Thai polysyllabic word recognition using hidden Markov model</title><author>Akhuputra, V. ; Jitapunkul, S. ; Pornsukchandra, W. ; Luksaneeyanawin, S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-ieee_primary_6203333</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1997</creationdate><topic>Algorithm design and analysis</topic><topic>Hidden Markov models</topic><topic>Natural languages</topic><topic>Signal analysis</topic><topic>Signal processing</topic><topic>Signal processing algorithms</topic><topic>Smoothing methods</topic><topic>Speech analysis</topic><topic>Speech recognition</topic><topic>Vector quantization</topic><toplevel>online_resources</toplevel><creatorcontrib>Akhuputra, V.</creatorcontrib><creatorcontrib>Jitapunkul, S.</creatorcontrib><creatorcontrib>Pornsukchandra, W.</creatorcontrib><creatorcontrib>Luksaneeyanawin, S.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Akhuputra, V.</au><au>Jitapunkul, S.</au><au>Pornsukchandra, W.</au><au>Luksaneeyanawin, S.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A speaker-independent Thai polysyllabic word recognition using hidden Markov model</atitle><btitle>1997 IEEE Pacific Rim Conference on 
Communications, Computers and Signal Processing, PACRIM. 10 Years Networking the Pacific Rim, 1987-1997</btitle><stitle>PACRIM</stitle><date>1997</date><risdate>1997</risdate><volume>2</volume><spage>593</spage><epage>599 vol.2</epage><pages>593-599 vol.2</pages><isbn>0780339053</isbn><isbn>9780780339057</isbn><abstract>This correspondence presents a speech recognition system of speaker-independent Thai polysyllabic words. This development is based on the discrete hidden Markov model in conjunction with vector quantization algorithm, endpoint detection algorithm for syllable endpoint detection and separation, and time normalization algorithm. The 70-Thai word vocabulary is subdivided into four sets comprising single, double, and triple syllabled words, 20 words in each set, and the last set consists of 10-Thai numeric words, zero to nine. The separated speech training set and testing set are composed of both male and female speakers within the range of 18 to 25 years old. For the tonal characteristics of the Thai language, the algorithms and the model parameters are modified in order to be applicable to the Thai language. The experiments on the effects of model parameter variations on recognition rate are conducted. The model parameters are number of codebooks, number of model states, and number of training speakers. The results show that the increase in the number of codebook and the number of model states have the major effect on the recognition rates. Also, the number of training speakers has less effect than the others. The average recognition rate of this speaker-independent recognition system is 89.906 percent for 40 speakers testing set using 256 vector codebook of 10-order linear prediction coefficients and 15-state model parameters. 
The recognition rate of the four sets of words are 86.750 percent for single-syllabled words, 92.375 percent for double-syllabled words, 96.250 percent for triple-syllabled words, and 84.250 percent for the numeric words.</abstract><pub>IEEE</pub><doi>10.1109/PACRIM.1997.620333</doi></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISBN: 0780339053
ispartof	1997 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM. 10 Years Networking the Pacific Rim, 1987-1997, 1997, Vol.2, p.593-599 vol.2
issn
language	eng
recordid	cdi_ieee_primary_620333
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Algorithm design and analysis ; Hidden Markov models ; Natural languages ; Signal analysis ; Signal processing ; Signal processing algorithms ; Smoothing methods ; Speech analysis ; Speech recognition ; Vector quantization
title	A speaker-independent Thai polysyllabic word recognition using hidden Markov model
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T12%3A05%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20speaker-independent%20Thai%20polysyllabic%20word%20recognition%20using%20hidden%20Markov%20model&rft.btitle=1997%20IEEE%20Pacific%20Rim%20Conference%20on%20Communications,%20Computers%20and%20Signal%20Processing,%20PACRIM.%2010%20Years%20Networking%20the%20Pacific%20Rim,%201987-1997&rft.au=Akhuputra,%20V.&rft.date=1997&rft.volume=2&rft.spage=593&rft.epage=599%20vol.2&rft.pages=593-599%20vol.2&rft.isbn=0780339053&rft.isbn_list=9780780339057&rft_id=info:doi/10.1109/PACRIM.1997.620333&rft_dat=%3Cieee_6IE%3E620333%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=620333&rfr_iscdi=true
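
Note on the figures in the description field: the record does not say how the 89.906 percent average was computed. As a quick arithmetic check (a sketch only, not the authors' evaluation procedure), the snippet below compares two plausible readings of the four per-set rates: an unweighted mean over the sets, and a mean weighted by the quoted set sizes (20, 20, 20, and 10 words). The names `set_rates`, `set_sizes`, `unweighted_mean`, and `weighted_mean` are illustrative and do not come from the record.

```python
# Per-set recognition rates (percent), copied from the description field.
set_rates = {
    "single-syllable": 86.750,
    "double-syllable": 92.375,
    "triple-syllable": 96.250,
    "numeric": 84.250,
}

# Words per set, copied from the description field (70 words in total).
set_sizes = {
    "single-syllable": 20,
    "double-syllable": 20,
    "triple-syllable": 20,
    "numeric": 10,
}


def unweighted_mean(rates):
    """Average the per-set rates, giving every set equal weight."""
    return sum(rates.values()) / len(rates)


def weighted_mean(rates, sizes):
    """Average the per-set rates, weighting each set by its word count."""
    total_words = sum(sizes.values())
    return sum(rates[name] * sizes[name] for name in rates) / total_words


if __name__ == "__main__":
    print(f"unweighted mean over sets: {unweighted_mean(set_rates):.3f}")
    print(f"mean weighted by set size: {weighted_mean(set_rates, set_sizes):.3f}")
```

The unweighted mean is 89.90625, which rounds to the quoted 89.906, while the size-weighted mean is about 90.714. This suggests, but does not prove, that the reported average treats the four word sets equally.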