A new incomplete pattern belief classification method with multiple estimations based on KNN


Bibliographic details
Published in: Applied soft computing, 2020-05, Vol. 90, p. 106175, Article 106175
Main authors: Ma, Zong-fang; Tian, Hong-peng; Liu, Ze-chao; Zhang, Zuo-wei
Format: Article
Language: English
Description
Abstract: Classifying patterns with missing data is challenging because the missing attributes introduce uncertainty into the classification result, and most classification methods produce only one estimation, which carries a risk of misclassification. A new incomplete pattern belief classification (PBC) method with multiple estimations based on K-nearest neighbors (KNNs) is proposed to deal with missing data. PBC first preliminarily classifies the incomplete pattern using its KNNs, which are found using the known attributes. A pattern whose KNNs all belong to a single class is assigned directly to that class. Otherwise, when the KNNs of the pattern cover p classes (p ≤ c), p estimations are computed from the KNNs in each of those classes, and the chosen classifier yields p classification results. A weighted possibility distance method then further discriminates among the p classification results using the classification information of their KNNs. A pattern with similar possibility distances to different classes is assigned to an appropriate meta-class under the framework of belief function theory, which reflects the uncertainty caused by the missing values and reduces the error rate. Experiments on both artificial and real data sets show that PBC is effective for dealing with missing data.
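
The first stage described in the abstract (finding the KNNs on the known attributes, assigning directly when they share one class, and otherwise producing one estimation per class) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the distance measure or the estimation rule, so the Euclidean distance on known attributes and the unweighted per-class neighbor mean used here are assumptions, and the function names `knn_on_known` and `class_wise_estimations` are hypothetical. The later stages (the chosen classifier, the weighted possibility distance, and the meta-class assignment) are not covered.

```python
import math
from collections import defaultdict


def knn_on_known(x, train_X, train_y, k):
    """Return the k nearest complete training patterns to x.

    Distances use only the attributes known in x (None marks a missing value).
    Assumption: Euclidean distance; the abstract does not name the measure.
    """
    known = [i for i, v in enumerate(x) if v is not None]
    scored = []
    for row, label in zip(train_X, train_y):
        d = math.sqrt(sum((x[i] - row[i]) ** 2 for i in known))
        scored.append((d, row, label))
    scored.sort(key=lambda t: t[0])  # sort by distance only
    return scored[:k]


def class_wise_estimations(x, train_X, train_y, k):
    """Preliminary step: direct label, or one filled-in pattern per class."""
    by_class = defaultdict(list)
    for _, row, label in knn_on_known(x, train_X, train_y, k):
        by_class[label].append(row)

    # All neighbors agree: the pattern goes directly to that class.
    if len(by_class) == 1:
        return {"label": next(iter(by_class)), "estimations": {}}

    # Otherwise build p estimations, one per class present in the KNNs.
    # Assumption: each missing value is the mean over that class's neighbors.
    estimations = {}
    for label, rows in by_class.items():
        estimations[label] = [
            v if v is not None else sum(r[i] for r in rows) / len(rows)
            for i, v in enumerate(x)
        ]
    return {"label": None, "estimations": estimations}


if __name__ == "__main__":
    train_X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.8], [5.0, 5.0]]
    train_y = ["a", "a", "b", "b", "b"]
    print(class_wise_estimations([0.05, None], train_X, train_y, k=2))
    print(class_wise_estimations([0.55, None], train_X, train_y, k=4))
```

In the second call the four nearest neighbors span two classes, so the sketch returns two candidate completions of the pattern, one per class, which the method would then pass to a classifier and compare.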
ISSN: 1568-4946, 1872-9681
DOI: 10.1016/j.asoc.2020.106175