An improved KNN text classification algorithm based on density

Text classification has attracted growing interest over the past few years. As a simple, effective, nonparametric classification method, KNN is widely used in document classification. However, an uneven distribution in the training set negatively affects KNN classification results. Moreover,...

Bibliographic details
Main authors: Kansheng Shi, Lemin Li, Haitao Liu, Jie He, Naitong Zhang, Wentao Song
Format: Conference proceeding
Language: English
container_end_page 117
container_issue
container_start_page 113
container_title
container_volume
creator Kansheng Shi
Lemin Li
Haitao Liu
Jie He
Naitong Zhang
Wentao Song
description Text classification has attracted growing interest over the past few years. As a simple, effective, nonparametric classification method, KNN is widely used in document classification. However, an uneven distribution in the training set negatively affects KNN classification results. Moreover, unevenly distributed text is very common in documents on the Web. To tackle this problem, this paper proposes an improved KNN method, denoted DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.
doi_str_mv 10.1109/CCIS.2011.6045043
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6045043</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6045043</ieee_id><sourcerecordid>6045043</sourcerecordid><originalsourceid>FETCH-LOGICAL-c223t-a2e5301574c852254d1bc16f3604d166b5dee2613df542b5b70aad2ce48fe8fa3</originalsourceid><addsrcrecordid>eNpVUNtKAzEUjDew1P0A8SU_sDXn5LLZF6EsXoqlPqjgW8kmWY3spWyC2L93wSI4LwMzw8AMIZfAFgCsvK6q1fMCGcBCMSGZ4EckKwsNClALZKiPyQx5oXJZyreTfx6Xp38e5-cki_GTTVBKa13MyM2yp6HbjcOXd_Rxs6HJfydqWxNjaII1KQw9Ne37MIb00dHaxCk3Sc73MaT9BTlrTBt9duA5eb27fake8vXT_aparnOLyFNu0EvOQBbCaokohYPagmr4tMeBUrV03qMC7hopsJZ1wYxxaL3QjdeN4XNy9dsbvPfb3Rg6M-63hzf4D-NITfw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>An improved KNN text classification algorithm based on density</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song</creator><creatorcontrib>Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song</creatorcontrib><description>Text classification has gained booming interest over the past few years. As a simple, effective and nonparametric classification method, KNN method is widely used in document classification. However, the uneven distribution in training set will affect the KNN classified result negatively. Moreover, the uneven distribution phenomenon of text is very common in documents on the Web. To tackling on this, this paper proposes an improved KNN method denoted by DBKNN. 
Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.</description><identifier>ISSN: 2376-5933</identifier><identifier>ISBN: 9781612842035</identifier><identifier>ISBN: 1612842038</identifier><identifier>EISSN: 2376-595X</identifier><identifier>EISBN: 9781612842028</identifier><identifier>EISBN: 1612842046</identifier><identifier>EISBN: 9781612842042</identifier><identifier>EISBN: 161284202X</identifier><identifier>DOI: 10.1109/CCIS.2011.6045043</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Classification algorithms ; decision function ; Equations ; KNN ; Mathematical model ; Support vector machine classification ; Text categorization ; Text classification ; Training ; VSM</subject><ispartof>2011 IEEE International Conference on Cloud Computing and Intelligence Systems, 2011, p.113-117</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c223t-a2e5301574c852254d1bc16f3604d166b5dee2613df542b5b70aad2ce48fe8fa3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6045043$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6045043$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kansheng Shi</creatorcontrib><creatorcontrib>Lemin Li</creatorcontrib><creatorcontrib>Haitao Liu</creatorcontrib><creatorcontrib>Jie He</creatorcontrib><creatorcontrib>Naitong Zhang</creatorcontrib><creatorcontrib>Wentao Song</creatorcontrib><title>An improved KNN text classification algorithm based on density</title><title>2011 IEEE International Conference on Cloud Computing and Intelligence 
Systems</title><addtitle>CCIS</addtitle><description>Text classification has gained booming interest over the past few years. As a simple, effective and nonparametric classification method, KNN method is widely used in document classification. However, the uneven distribution in training set will affect the KNN classified result negatively. Moreover, the uneven distribution phenomenon of text is very common in documents on the Web. To tackling on this, this paper proposes an improved KNN method denoted by DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.</description><subject>Algorithm design and analysis</subject><subject>Classification algorithms</subject><subject>decision function</subject><subject>Equations</subject><subject>KNN</subject><subject>Mathematical model</subject><subject>Support vector machine classification</subject><subject>Text categorization</subject><subject>Text classification</subject><subject>Training</subject><subject>VSM</subject><issn>2376-5933</issn><issn>2376-595X</issn><isbn>9781612842035</isbn><isbn>1612842038</isbn><isbn>9781612842028</isbn><isbn>1612842046</isbn><isbn>9781612842042</isbn><isbn>161284202X</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2011</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpVUNtKAzEUjDew1P0A8SU_sDXn5LLZF6EsXoqlPqjgW8kmWY3spWyC2L93wSI4LwMzw8AMIZfAFgCsvK6q1fMCGcBCMSGZ4EckKwsNClALZKiPyQx5oXJZyreTfx6Xp38e5-cki_GTTVBKa13MyM2yp6HbjcOXd_Rxs6HJfydqWxNjaII1KQw9Ne37MIb00dHaxCk3Sc73MaT9BTlrTBt9duA5eb27fake8vXT_aparnOLyFNu0EvOQBbCaokohYPagmr4tMeBUrV03qMC7hopsJZ1wYxxaL3QjdeN4XNy9dsbvPfb3Rg6M-63hzf4D-NITfw</recordid><startdate>201109</startdate><enddate>201109</enddate><creator>Kansheng Shi</creator><creator>Lemin Li</creator><creator>Haitao Liu</creator><creator>Jie He</creator><creator>Naitong 
Zhang</creator><creator>Wentao Song</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201109</creationdate><title>An improved KNN text classification algorithm based on density</title><author>Kansheng Shi ; Lemin Li ; Haitao Liu ; Jie He ; Naitong Zhang ; Wentao Song</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c223t-a2e5301574c852254d1bc16f3604d166b5dee2613df542b5b70aad2ce48fe8fa3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Algorithm design and analysis</topic><topic>Classification algorithms</topic><topic>decision function</topic><topic>Equations</topic><topic>KNN</topic><topic>Mathematical model</topic><topic>Support vector machine classification</topic><topic>Text categorization</topic><topic>Text classification</topic><topic>Training</topic><topic>VSM</topic><toplevel>online_resources</toplevel><creatorcontrib>Kansheng Shi</creatorcontrib><creatorcontrib>Lemin Li</creatorcontrib><creatorcontrib>Haitao Liu</creatorcontrib><creatorcontrib>Jie He</creatorcontrib><creatorcontrib>Naitong Zhang</creatorcontrib><creatorcontrib>Wentao Song</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kansheng Shi</au><au>Lemin Li</au><au>Haitao Liu</au><au>Jie He</au><au>Naitong Zhang</au><au>Wentao 
Song</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>An improved KNN text classification algorithm based on density</atitle><btitle>2011 IEEE International Conference on Cloud Computing and Intelligence Systems</btitle><stitle>CCIS</stitle><date>2011-09</date><risdate>2011</risdate><spage>113</spage><epage>117</epage><pages>113-117</pages><issn>2376-5933</issn><eissn>2376-595X</eissn><isbn>9781612842035</isbn><isbn>1612842038</isbn><eisbn>9781612842028</eisbn><eisbn>1612842046</eisbn><eisbn>9781612842042</eisbn><eisbn>161284202X</eisbn><abstract>Text classification has gained booming interest over the past few years. As a simple, effective and nonparametric classification method, KNN method is widely used in document classification. However, the uneven distribution in training set will affect the KNN classified result negatively. Moreover, the uneven distribution phenomenon of text is very common in documents on the Web. To tackling on this, this paper proposes an improved KNN method denoted by DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.</abstract><pub>IEEE</pub><doi>10.1109/CCIS.2011.6045043</doi><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2376-5933
ispartof 2011 IEEE International Conference on Cloud Computing and Intelligence Systems, 2011, p.113-117
issn 2376-5933
2376-595X
language eng
recordid cdi_ieee_primary_6045043
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Algorithm design and analysis
Classification algorithms
decision function
Equations
KNN
Mathematical model
Support vector machine classification
Text categorization
Text classification
Training
VSM
title An improved KNN text classification algorithm based on density
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T02%3A57%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=An%20improved%20KNN%20text%20classification%20algorithm%20based%20on%20density&rft.btitle=2011%20IEEE%20International%20Conference%20on%20Cloud%20Computing%20and%20Intelligence%20Systems&rft.au=Kansheng%20Shi&rft.date=2011-09&rft.spage=113&rft.epage=117&rft.pages=113-117&rft.issn=2376-5933&rft.eissn=2376-595X&rft.isbn=9781612842035&rft.isbn_list=1612842038&rft_id=info:doi/10.1109/CCIS.2011.6045043&rft_dat=%3Cieee_6IE%3E6045043%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781612842028&rft.eisbn_list=1612842046&rft.eisbn_list=9781612842042&rft.eisbn_list=161284202X&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6045043&rfr_iscdi=true