Nearest Neighbor with Double Neighborhoods Algorithm for Imbalanced Classification

Bibliographic Details
Published in:	IAENG international journal of applied mathematics 2020-03, Vol.50 (1), p.1-13
Main authors:	Wang, Caiwen; Yang, Youlong
Format:	Article
Language:	English
Subjects:	Algorithms; Bias; Classification; Data mining; Datasets; Methods; Neighborhoods; Pattern recognition; Sparsity
Online access:	Full text

Description
Abstract:	Classification of imbalanced data is a challenge in data mining and pattern recognition tasks. The dominance of the majority classes often leads to poor performance of traditional classifiers when imbalanced data are processed. In this paper, we propose the nearest neighbor with double neighborhoods (NNDN) algorithm for binary-class imbalanced data classification. In the classification step, a double neighborhoods scheme based on the data distribution is used to judge the sparsity of the main neighborhood, and a tendency weighting scheme is used to increase the sensitivity of the algorithm to minority instances. Finally, we compare our method with six well-known algorithms on forty benchmark data sets. The results show that the proposed algorithm is suitable for imbalanced data classification and outperforms the re-sampling and cost-sensitive learning strategies with generality-oriented base learners on most data sets.
ISSN:	1992-9978 1992-9986
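
Illustrative sketch: the abstract names two ingredients, a second neighborhood used to judge whether the main neighborhood is sparse, and a weighting that favors minority instances. The Python below is a minimal toy version of that general idea and is not the published NNDN algorithm. The function name, the sparsity test (distance to the farthest main neighbor compared with a fixed radius), the fallback to a larger neighborhood, and all parameter values are assumptions made here for illustration; the record does not specify any of them.

```python
import math
from collections import Counter


def predict_double_neighborhood(train, labels, query, k_main=3, k_outer=7,
                                sparse_radius=1.0, minority_label=1,
                                minority_weight=2.0):
    """Toy weighted k-NN vote with a fallback neighborhood.

    Hypothetical illustration only; every rule below is an assumption,
    not the method described in the paper.
    """
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    main = order[:k_main]

    # Assumed sparsity test: the main neighborhood counts as sparse when
    # its farthest member lies beyond a fixed radius.
    is_sparse = math.dist(train[main[-1]], query) > sparse_radius

    # Assumed fallback: widen to the outer neighborhood when sparse.
    neighbors = order[:k_outer] if is_sparse else main

    # Assumed weighting: a minority neighbor's vote counts more.
    votes = Counter()
    for i in neighbors:
        votes[labels[i]] += minority_weight if labels[i] == minority_label else 1.0
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Three majority points (label 0) and two minority points (label 1).
    train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0)]
    labels = [0, 0, 0, 1, 1]
    print(predict_double_neighborhood(train, labels, (0.05, 0.05)))
    print(predict_double_neighborhood(train, labels, (1.05, 1.0)))
```

The fixed `sparse_radius` is the crudest possible stand-in for a distribution-based sparsity judgment; the full text should be consulted for the actual scheme.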