An improved KNN text classification algorithm based on density

Bibliographic details
Main authors: Kansheng Shi, Lemin Li, Haitao Liu, Jie He, Naitong Zhang, Wentao Song
Format: Conference proceeding
Language: English
Description
Summary: Text classification has attracted rapidly growing interest over the past few years. As a simple, effective, and nonparametric classification method, the KNN method is widely used in document classification. However, an uneven distribution of samples in the training set negatively affects KNN classification results. Moreover, uneven distribution of text is very common in documents on the Web. To address this, the paper proposes an improved KNN method, denoted DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.
ISSN: 2376-5933, 2376-595X
DOI: 10.1109/CCIS.2011.6045043