An improved KNN text classification algorithm based on density
Text classification has attracted growing interest over the past few years. As a simple, effective, nonparametric classification method, the KNN method is widely used in document classification. However, an uneven distribution in the training set negatively affects the KNN classification result. Moreover,...
Main authors: | Kansheng Shi, Lemin Li, Haitao Liu, Jie He, Naitong Zhang, Wentao Song |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 117 |
---|---|
container_issue | |
container_start_page | 113 |
container_title | |
container_volume | |
creator | Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song |
description | Text classification has attracted growing interest over the past few years. As a simple, effective, nonparametric classification method, the KNN method is widely used in document classification. However, an uneven distribution in the training set negatively affects the KNN classification result. Moreover, unevenly distributed text is very common in documents on the Web. To tackle this problem, this paper proposes an improved KNN method, denoted DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents. |
doi_str_mv | 10.1109/CCIS.2011.6045043 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6045043</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6045043</ieee_id><sourcerecordid>6045043</sourcerecordid><originalsourceid>FETCH-LOGICAL-c223t-a2e5301574c852254d1bc16f3604d166b5dee2613df542b5b70aad2ce48fe8fa3</originalsourceid><addsrcrecordid>eNpVUNtKAzEUjDew1P0A8SU_sDXn5LLZF6EsXoqlPqjgW8kmWY3spWyC2L93wSI4LwMzw8AMIZfAFgCsvK6q1fMCGcBCMSGZ4EckKwsNClALZKiPyQx5oXJZyreTfx6Xp38e5-cki_GTTVBKa13MyM2yp6HbjcOXd_Rxs6HJfydqWxNjaII1KQw9Ne37MIb00dHaxCk3Sc73MaT9BTlrTBt9duA5eb27fake8vXT_aparnOLyFNu0EvOQBbCaokohYPagmr4tMeBUrV03qMC7hopsJZ1wYxxaL3QjdeN4XNy9dsbvPfb3Rg6M-63hzf4D-NITfw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>An improved KNN text classification algorithm based on density</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song</creator><creatorcontrib>Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song</creatorcontrib><description>Text classification has gained booming interest over the past few years. As a simple, effective and nonparametric classification method, KNN method is widely used in document classification. However, the uneven distribution in training set will affect the KNN classified result negatively. Moreover, the uneven distribution phenomenon of text is very common in documents on the Web. To tackling on this, this paper proposes an improved KNN method denoted by DBKNN. 
Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.</description><identifier>ISSN: 2376-5933</identifier><identifier>ISBN: 9781612842035</identifier><identifier>ISBN: 1612842038</identifier><identifier>EISSN: 2376-595X</identifier><identifier>EISBN: 9781612842028</identifier><identifier>EISBN: 1612842046</identifier><identifier>EISBN: 9781612842042</identifier><identifier>EISBN: 161284202X</identifier><identifier>DOI: 10.1109/CCIS.2011.6045043</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Classification algorithms ; decision function ; Equations ; KNN ; Mathematical model ; Support vector machine classification ; Text categorization ; Text classification ; Training ; VSM</subject><ispartof>2011 IEEE International Conference on Cloud Computing and Intelligence Systems, 2011, p.113-117</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c223t-a2e5301574c852254d1bc16f3604d166b5dee2613df542b5b70aad2ce48fe8fa3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6045043$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6045043$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kansheng Shi</creatorcontrib><creatorcontrib>Lemin Li</creatorcontrib><creatorcontrib>Haitao Liu</creatorcontrib><creatorcontrib>Jie He</creatorcontrib><creatorcontrib>Naitong Zhang</creatorcontrib><creatorcontrib>Wentao Song</creatorcontrib><title>An improved KNN text classification algorithm based on density</title><title>2011 IEEE International Conference on Cloud Computing and Intelligence 
Systems</title><addtitle>CCIS</addtitle><description>Text classification has gained booming interest over the past few years. As a simple, effective and nonparametric classification method, KNN method is widely used in document classification. However, the uneven distribution in training set will affect the KNN classified result negatively. Moreover, the uneven distribution phenomenon of text is very common in documents on the Web. To tackling on this, this paper proposes an improved KNN method denoted by DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.</description><subject>Algorithm design and analysis</subject><subject>Classification algorithms</subject><subject>decision function</subject><subject>Equations</subject><subject>KNN</subject><subject>Mathematical model</subject><subject>Support vector machine classification</subject><subject>Text categorization</subject><subject>Text classification</subject><subject>Training</subject><subject>VSM</subject><issn>2376-5933</issn><issn>2376-595X</issn><isbn>9781612842035</isbn><isbn>1612842038</isbn><isbn>9781612842028</isbn><isbn>1612842046</isbn><isbn>9781612842042</isbn><isbn>161284202X</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2011</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpVUNtKAzEUjDew1P0A8SU_sDXn5LLZF6EsXoqlPqjgW8kmWY3spWyC2L93wSI4LwMzw8AMIZfAFgCsvK6q1fMCGcBCMSGZ4EckKwsNClALZKiPyQx5oXJZyreTfx6Xp38e5-cki_GTTVBKa13MyM2yp6HbjcOXd_Rxs6HJfydqWxNjaII1KQw9Ne37MIb00dHaxCk3Sc73MaT9BTlrTBt9duA5eb27fake8vXT_aparnOLyFNu0EvOQBbCaokohYPagmr4tMeBUrV03qMC7hopsJZ1wYxxaL3QjdeN4XNy9dsbvPfb3Rg6M-63hzf4D-NITfw</recordid><startdate>201109</startdate><enddate>201109</enddate><creator>Kansheng Shi</creator><creator>Lemin Li</creator><creator>Haitao Liu</creator><creator>Jie He</creator><creator>Naitong 
Zhang</creator><creator>Wentao Song</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201109</creationdate><title>An improved KNN text classification algorithm based on density</title><author>Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c223t-a2e5301574c852254d1bc16f3604d166b5dee2613df542b5b70aad2ce48fe8fa3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Algorithm design and analysis</topic><topic>Classification algorithms</topic><topic>decision function</topic><topic>Equations</topic><topic>KNN</topic><topic>Mathematical model</topic><topic>Support vector machine classification</topic><topic>Text categorization</topic><topic>Text classification</topic><topic>Training</topic><topic>VSM</topic><toplevel>online_resources</toplevel><creatorcontrib>Kansheng Shi</creatorcontrib><creatorcontrib>Lemin Li</creatorcontrib><creatorcontrib>Haitao Liu</creatorcontrib><creatorcontrib>Jie He</creatorcontrib><creatorcontrib>Naitong Zhang</creatorcontrib><creatorcontrib>Wentao Song</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kansheng Shi</au><au>Lemin Li</au><au>Haitao Liu</au><au>Jie He</au><au>Naitong Zhang</au><au>Wentao 
Song</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>An improved KNN text classification algorithm based on density</atitle><btitle>2011 IEEE International Conference on Cloud Computing and Intelligence Systems</btitle><stitle>CCIS</stitle><date>2011-09</date><risdate>2011</risdate><spage>113</spage><epage>117</epage><pages>113-117</pages><issn>2376-5933</issn><eissn>2376-595X</eissn><isbn>9781612842035</isbn><isbn>1612842038</isbn><eisbn>9781612842028</eisbn><eisbn>1612842046</eisbn><eisbn>9781612842042</eisbn><eisbn>161284202X</eisbn><abstract>Text classification has gained booming interest over the past few years. As a simple, effective and nonparametric classification method, KNN method is widely used in document classification. However, the uneven distribution in training set will affect the KNN classified result negatively. Moreover, the uneven distribution phenomenon of text is very common in documents on the Web. To tackling on this, this paper proposes an improved KNN method denoted by DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.</abstract><pub>IEEE</pub><doi>10.1109/CCIS.2011.6045043</doi><tpages>5</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2376-5933 |
ispartof | 2011 IEEE International Conference on Cloud Computing and Intelligence Systems, 2011, p.113-117 |
issn | 2376-5933 2376-595X |
language | eng |
recordid | cdi_ieee_primary_6045043 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Algorithm design and analysis ; Classification algorithms ; decision function ; Equations ; KNN ; Mathematical model ; Support vector machine classification ; Text categorization ; Text classification ; Training ; VSM |
title | An improved KNN text classification algorithm based on density |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T02%3A57%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=An%20improved%20KNN%20text%20classification%20algorithm%20based%20on%20density&rft.btitle=2011%20IEEE%20International%20Conference%20on%20Cloud%20Computing%20and%20Intelligence%20Systems&rft.au=Kansheng%20Shi&rft.date=2011-09&rft.spage=113&rft.epage=117&rft.pages=113-117&rft.issn=2376-5933&rft.eissn=2376-595X&rft.isbn=9781612842035&rft.isbn_list=1612842038&rft_id=info:doi/10.1109/CCIS.2011.6045043&rft_dat=%3Cieee_6IE%3E6045043%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781612842028&rft.eisbn_list=1612842046&rft.eisbn_list=9781612842042&rft.eisbn_list=161284202X&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6045043&rfr_iscdi=true |