Prototype reduction of the nearest neighbor classifier by weighting training instances
Main authors: | , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The Nearest Neighbor (NN) algorithm is one of the most straightforward instance-based algorithms, and it faces the problem of deciding which instances to store for use during generalization. In its basic form, all training instances are stored and treated equally when finding the NN of an input test pattern. In the scheme proposed in this paper, a weight, which is a real number in the interval [0, ∞), is assigned to each training pattern. The weights of the training patterns are used during generalization to find the NN of a query pattern. To specify the weights, we propose a learning algorithm that minimizes the error rate of the classifier on the training data. At the same time, the algorithm reduces the size of the training set and can be viewed as a powerful instance reduction technique: an instance with zero weight is not used in the generalization phase and can effectively be removed from the training set. A classifier learned in this way might suffer from overfitting, especially on noisy data sets with highly overlapping classes. To tackle this problem, we use a noise-filtering algorithm to remove noisy patterns from the training set before applying the learning scheme. Using 11 data sets from the UCI repository, we show that the proposed method not only improves the generalization accuracy of the basic NN classifier but also provides prototype reduction and is more effective than other algorithms proposed for this purpose in the literature. |
---|---|
DOI: | 10.1109/IEEEGCC.2009.5734315 |
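
The summary describes two mechanics that a small example may make more concrete: instance weights influence which training pattern counts as the nearest neighbor, and zero-weight instances can be dropped. The sketch below is only an illustration under stated assumptions. The summary does not say how a weight enters the comparison, so the scoring rule `weight / distance` is an assumption, as are the function names `weighted_nn_predict` and `prune_zero_weight` and the toy data. The weight-learning algorithm and the noise filter are not reproduced here.

```python
import math


def weighted_nn_predict(query, prototypes, labels, weights, eps=1e-12):
    """Return the label of the prototype with the highest score.

    ASSUMPTION: score = weight / (Euclidean distance + eps). The summary
    only states that weights are used to find the nearest neighbor; the
    exact rule is not given there.
    """
    best_label, best_score = None, -1.0
    for x, y, w in zip(prototypes, labels, weights):
        if w == 0.0:
            continue  # zero-weight instances take no part in generalization
        score = w / (math.dist(query, x) + eps)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


def prune_zero_weight(prototypes, labels, weights):
    """Drop zero-weight instances; predictions are unaffected by this."""
    kept = [(x, y, w) for x, y, w in zip(prototypes, labels, weights) if w > 0.0]
    return [k[0] for k in kept], [k[1] for k in kept], [k[2] for k in kept]


# Hypothetical toy data: three 2-D training patterns with hand-picked weights.
prototypes = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
labels = ["a", "b", "c"]
weights = [1.0, 0.0, 3.0]

query = (1.5, 0.0)
prediction = weighted_nn_predict(query, prototypes, labels, weights)
reduced = prune_zero_weight(prototypes, labels, weights)
```

With these toy values, the unweighted nearest neighbor of the query would be the second pattern, but its weight is zero, so it is skipped; of the remaining two, the third pattern wins because its larger weight outweighs its larger distance. Pruning leaves two prototypes and gives the same prediction.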