Improved Adaboost algorithm for classification based on noise confidence degree and weighted feature selection
Published in: IEEE Access, 2020-01, Vol. 8, p. 1-1
Main authors: ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Adaboost is a typical ensemble learning algorithm that has been widely studied and used in classification tasks. To effectively improve the classification performance of existing Adaboost algorithms, a noise confidence degree and weighted feature selection based Adaboost algorithm (called NW_Ada) is proposed. First, to reduce the impact of sample set density on noise detection results, the concepts of clustering degree and deviated degree are introduced, and a new method of evaluating noise confidence is proposed. Then, building on traditional filter-based feature selection, a weighted feature selection method is proposed to select the features that can effectively distinguish the samples that are misclassified. Finally, building on the traditional error rate calculation, a category-recall-based classifier error rate calculation method is proposed to address the problem that traditional methods ignore the distribution of misclassified samples when dealing with unbalanced datasets. The experimental results show that the proposed method comprehensively considers the influences of sample density, sample weight, and dataset size on classification results, and achieves a significant improvement in classification performance over traditional Adaboost algorithms across different datasets, especially unbalanced ones.
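The record does not include the paper's formulas, so the sketch below only illustrates the general idea the abstract describes: a standard AdaBoost loop over decision stumps in which the plain weighted error rate is replaced by a per-class (recall-based) error, so that mistakes on a minority class are not drowned out on unbalanced data. All function names and the exact error formula here are assumptions for illustration, not the authors' NW_Ada.

```python
import numpy as np

def stump_predict(X, feat, thresh, sign):
    # A decision stump: threshold one feature, predict +1/-1.
    return sign * np.where(X[:, feat] <= thresh, 1, -1)

def best_stump(X, y, w):
    # Exhaustive search for the stump with the lowest weighted error.
    best = (0, 0.0, 1, np.inf)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for s in (1, -1):
                err = w[stump_predict(X, f, t, s) != y].sum()
                if err < best[3]:
                    best = (f, t, s, err)
    return best

def recall_based_error(y, pred):
    # Hypothetical stand-in for the paper's category-recall-based error:
    # average the miss rate over classes so each class counts equally.
    return np.mean([np.mean(pred[y == c] != c) for c in np.unique(y)])

def adaboost_fit(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(n_rounds):
        f, t, s, _ = best_stump(X, y, w)
        pred = stump_predict(X, f, t, s)
        eps = np.clip(recall_based_error(y, pred), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)   # classifier weight
        w *= np.exp(-alpha * y * pred)           # reweight samples
        w /= w.sum()
        model.append((f, t, s, alpha))
    return model

def adaboost_predict(model, X):
    # Weighted vote of all stumps.
    agg = sum(a * stump_predict(X, f, t, s) for f, t, s, a in model)
    return np.sign(agg)
```

The only change from textbook AdaBoost is `recall_based_error`: on an unbalanced dataset, a stump that misclassifies the entire minority class can still have a low plain weighted error, whereas the per-class average penalizes it heavily.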
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3017164