Rule-Based Automatic Phonetic Transcription for the Romanian Language
In this paper, we report on an implemented rule-based tool for grapheme-to-phoneme conversion for the Romanian language and on experiments run on two different sets of words. A database of 4779 words, consisting of the most frequent words in a text corpus collected from 93 books representing genuine Romani...
Main authors: | Stefan-Adrian, T. ; Doru-Petru, M. |
---|---|
Format: | Conference proceeding |
Language: | eng |
container_end_page | 686 |
---|---|
container_issue | |
container_start_page | 682 |
container_title | |
container_volume | |
creator | Stefan-Adrian, T. ; Doru-Petru, M. |
description | In this paper, we report on an implemented rule-based tool for grapheme-to-phoneme conversion for the Romanian language and on experiments run on two different sets of words. A database of 4779 words, consisting of the most frequent words in a text corpus collected from 93 books representing genuine Romanian literature, foreign literary works translated into Romanian, and scientific texts, was transcribed using a set of 102 letter-to-sound rules, with more than 95% of the phonetic transcriptions accurate. Then, using the same rules, a larger set of 15599 words was transcribed, with 91.46% of the transcriptions accurate. Although the rules were not written by phoneticians and can be further improved, it is clear that even for mostly phonetic languages like Romanian, rule-based letter-to-sound systems with manually introduced rules are limited by the human inability to spot important patterns in the pronunciation. |
doi_str_mv | 10.1109/ComputationWorld.2009.59 |
format | Conference Proceeding |
fullrecord | ISBN: 9781424451661, 1424451663 ; EISBN: 9780769538624, 0769538622 ; publisher: IEEE ; date: 2009-11 ; pages: 682-686 (5 pages) ; genre: proceeding ; IEEE record ID: 5358878 |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 9781424451661 |
ispartof | 2009 Computation World: Future Computing, Service Computation, Cognitive, Adaptive, Content, Patterns, 2009, p.682-686 |
issn | |
language | eng |
recordid | cdi_ieee_primary_5358878 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Dictionaries ; Error analysis ; letter-to-sound ; Military computing ; Natural languages ; Neural networks ; phonetics ; rules ; Speech synthesis ; System testing ; Telecommunication computing ; XML |
title | Rule-Based Automatic Phonetic Transcription for the Romanian Language |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T03%3A08%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Rule-Based%20Automatic%20Phonetic%20Transcription%20for%20the%20Romanian%20Language&rft.btitle=2009%20Computation%20World:%20Future%20Computing,%20Service%20Computation,%20Cognitive,%20Adaptive,%20Content,%20Patterns&rft.au=Stefan-Adrian,%20T.&rft.date=2009-11&rft.spage=682&rft.epage=686&rft.pages=682-686&rft.isbn=9781424451661&rft.isbn_list=1424451663&rft_id=info:doi/10.1109/ComputationWorld.2009.59&rft_dat=%3Cieee_6IE%3E5358878%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9780769538624&rft.eisbn_list=0769538622&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5358878&rfr_iscdi=true |