Analysis of N-gram model on Telugu document classification

Document classification is a recent research area that has evolved as a result of exponential growth in the quantity of documents in electronic form. Various document representation methods based on linguistic knowledge have been revisited in the literature. Adaptability of N-gram models on various languages...

Bibliographic details
Main authors:	Rani, B.P.; Vardhan, B.V.; Durga, A.K.; Reddy, L.P.; Babu, A.V.
Format:	Conference proceeding
Language:	eng
Subjects:	Evolutionary computation

container_end_page	3203
container_issue
container_start_page	3199
container_title
container_volume
creator	Rani, B.P. ; Vardhan, B.V. ; Durga, A.K. ; Reddy, L.P. ; Babu, A.V.
description	Document classification is a recent research area that has evolved as a result of exponential growth in the quantity of documents in electronic form. Various document representation methods based on linguistic knowledge have been revisited in the literature, and adapting N-gram models to various languages is a recent trend. This paper analyzes a character N-gram model on Telugu documents. It describes the tokenization of syllables and the associated complexity of the Telugu script, and discusses a combination of a Bayes probabilistic classifier with the character N-gram model. The performance of the proposed classifier is evaluated in terms of overall accuracy.
doi_str_mv	10.1109/CEC.2008.4631231
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4631231</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4631231</ieee_id><sourcerecordid>4631231</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-1487c29cc0254f1a3f7c3191786bab2567a00efc5f02cd520b4ccaa092aed75a3</originalsourceid><addsrcrecordid>eNo1kEtrAjEURtOHULXuC93kD8z03ptkknQng32AtBsL3UnMJJIyj2J04b-voP02Z3HgLD7GHhBKRLBP9aIuCcCUshJIAq_YzGqDkqREQ0JfszFaiQUAVTds8i9I3p4EGFtobb5HbHJqaAu6suaOzXL-gdOkEhXSmD3Pe9cec8p8iPyj2O5cx7uhCS0fer4K7WF74M3gD13o99y3LucUk3f7NPT3bBRdm8Pswin7elms6rdi-fn6Xs-XRUKt9gVKoz1Z74GUjOhE1F6gRW2qjduQqrQDCNGrCOQbRbCR3jsHllxotHJiyh7P3RRCWP_uUud2x_XlE_EHxXFNoQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Analysis of N-gram model on Telugu document classification</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Rani, B.P. ; Vardhan, B.V. ; Durga, A.K. ; Reddy, L.P. ; Babu, A.V.</creator><creatorcontrib>Rani, B.P. ; Vardhan, B.V. ; Durga, A.K. ; Reddy, L.P. ; Babu, A.V.</creatorcontrib><description>Document classification is one of the recent areas of research evolved as a result of exponential growth in the quantum electronic form of documents. Various document representation methods based on linguistic knowledge are revisited in literature. Adaptability of N-gram models on various languages is the recent trend. In this paper an attempt is made to analyze character N-gram model on Telugu documents. Tokenization of syllables and the associated complexity of Telugu script is described. A combination of Bayes probabilistic classifier and character N-gram model is discussed in this paper. 
The performance of the proposed classifier is evaluated in terms of overall accuracy.</description><identifier>ISSN: 1089-778X</identifier><identifier>ISBN: 1424418224</identifier><identifier>ISBN: 9781424418220</identifier><identifier>EISSN: 1941-0026</identifier><identifier>EISBN: 9781424418237</identifier><identifier>EISBN: 1424418232</identifier><identifier>DOI: 10.1109/CEC.2008.4631231</identifier><identifier>LCCN: 2007907698</identifier><language>eng</language><publisher>IEEE</publisher><subject>Evolutionary computation</subject><ispartof>2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence), 2008, p.3199-3203</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4631231$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,792,2052,27902,54733,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4631231$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Rani, B.P.</creatorcontrib><creatorcontrib>Vardhan, B.V.</creatorcontrib><creatorcontrib>Durga, A.K.</creatorcontrib><creatorcontrib>Reddy, L.P.</creatorcontrib><creatorcontrib>Babu, A.V.</creatorcontrib><title>Analysis of N-gram model on Telugu document classification</title><title>2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence)</title><addtitle>CEC</addtitle><description>Document classification is one of the recent areas of research evolved as a result of exponential growth in the quantum electronic form of documents. Various document representation methods based on linguistic knowledge are revisited in literature. 
Adaptability of N-gram models on various languages is the recent trend. In this paper an attempt is made to analyze character N-gram model on Telugu documents. Tokenization of syllables and the associated complexity of Telugu script is described. A combination of Bayes probabilistic classifier and character N-gram model is discussed in this paper. The performance of the proposed classifier is evaluated in terms of overall accuracy.</description><subject>Evolutionary computation</subject><issn>1089-778X</issn><issn>1941-0026</issn><isbn>1424418224</isbn><isbn>9781424418220</isbn><isbn>9781424418237</isbn><isbn>1424418232</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2008</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1kEtrAjEURtOHULXuC93kD8z03ptkknQng32AtBsL3UnMJJIyj2J04b-voP02Z3HgLD7GHhBKRLBP9aIuCcCUshJIAq_YzGqDkqREQ0JfszFaiQUAVTds8i9I3p4EGFtobb5HbHJqaAu6suaOzXL-gdOkEhXSmD3Pe9cec8p8iPyj2O5cx7uhCS0fer4K7WF74M3gD13o99y3LucUk3f7NPT3bBRdm8Pswin7elms6rdi-fn6Xs-XRUKt9gVKoz1Z74GUjOhE1F6gRW2qjduQqrQDCNGrCOQbRbCR3jsHllxotHJiyh7P3RRCWP_uUud2x_XlE_EHxXFNoQ</recordid><startdate>200806</startdate><enddate>200806</enddate><creator>Rani, B.P.</creator><creator>Vardhan, B.V.</creator><creator>Durga, A.K.</creator><creator>Reddy, L.P.</creator><creator>Babu, A.V.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200806</creationdate><title>Analysis of N-gram model on Telugu document classification</title><author>Rani, B.P. ; Vardhan, B.V. ; Durga, A.K. ; Reddy, L.P. 
; Babu, A.V.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-1487c29cc0254f1a3f7c3191786bab2567a00efc5f02cd520b4ccaa092aed75a3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Evolutionary computation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rani, B.P.</creatorcontrib><creatorcontrib>Vardhan, B.V.</creatorcontrib><creatorcontrib>Durga, A.K.</creatorcontrib><creatorcontrib>Reddy, L.P.</creatorcontrib><creatorcontrib>Babu, A.V.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Rani, B.P.</au><au>Vardhan, B.V.</au><au>Durga, A.K.</au><au>Reddy, L.P.</au><au>Babu, A.V.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Analysis of N-gram model on Telugu document classification</atitle><btitle>2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence)</btitle><stitle>CEC</stitle><date>2008-06</date><risdate>2008</risdate><spage>3199</spage><epage>3203</epage><pages>3199-3203</pages><issn>1089-778X</issn><eissn>1941-0026</eissn><isbn>1424418224</isbn><isbn>9781424418220</isbn><eisbn>9781424418237</eisbn><eisbn>1424418232</eisbn><abstract>Document classification is one of the recent areas of research evolved as a result of exponential growth in the quantum electronic form of documents. 
Various document representation methods based on linguistic knowledge are revisited in literature. Adaptability of N-gram models on various languages is the recent trend. In this paper an attempt is made to analyze character N-gram model on Telugu documents. Tokenization of syllables and the associated complexity of Telugu script is described. A combination of Bayes probabilistic classifier and character N-gram model is discussed in this paper. The performance of the proposed classifier is evaluated in terms of overall accuracy.</abstract><pub>IEEE</pub><doi>10.1109/CEC.2008.4631231</doi><tpages>5</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1089-778X
ispartof	2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence), 2008, p.3199-3203
issn	1089-778X 1941-0026
language	eng
recordid	cdi_ieee_primary_4631231
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Evolutionary computation
title	Analysis of N-gram model on Telugu document classification
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T11%3A29%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Analysis%20of%20N-gram%20model%20on%20Telugu%20document%20classification&rft.btitle=2008%20IEEE%20Congress%20on%20Evolutionary%20Computation%20(IEEE%20World%20Congress%20on%20Computational%20Intelligence)&rft.au=Rani,%20B.P.&rft.date=2008-06&rft.spage=3199&rft.epage=3203&rft.pages=3199-3203&rft.issn=1089-778X&rft.eissn=1941-0026&rft.isbn=1424418224&rft.isbn_list=9781424418220&rft_id=info:doi/10.1109/CEC.2008.4631231&rft_dat=%3Cieee_6IE%3E4631231%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424418237&rft.eisbn_list=1424418232&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4631231&rfr_iscdi=true