A misclassification cost risk bound based on hybrid particle swarm optimization heuristic

Bibliographic details
Published in: Expert Systems with Applications, 2014-03, Vol. 41 (4), pp. 1483-1491
Main author: Pendharkar, Parag C.
Format: Article
Language: English
Description
Summary:
• Proposed a misclassification cost risk bound (MCRB) for binary linear classification problems.
• Developed a hybrid particle swarm optimization procedure to solve MCRB.
• Tested the MCRB on simulated and real-world datasets for several linear and non-linear cost-sensitive classifiers.

Linear discriminant analysis models that minimize misclassification cost have recently gained popularity. It is well known that the misclassification cost minimizing linear discriminant analysis problem is NP-complete and is difficult to solve to optimality for large-scale datasets. As a result, heuristic techniques have gained popularity, but it is difficult to assess how well these heuristic techniques perform. One way to aid this assessment is to establish a lower bound on the optimal value of the misclassification cost. In this paper, we propose and use a hybrid particle swarm optimization (PSO) and Lagrangian relaxation (LR) based heuristic to establish a misclassification cost lower bound (MCLB) for two-group linear classifiers. We use the subgradient optimization procedure to tighten the MCLB. Using simulated and real-world datasets, we test a misclassification cost minimizing linear genetic algorithm classifier and two commercial non-linear classifiers (C5.0 and C&RT) and compare their performance with the MCLB. Our holdout sample tests indicate that the proposed MCLB works well for both linear and non-linear classifiers when class data distributions are normal. Additionally, as misclassification cost asymmetry increases, the proposed MCLB appears to provide better results.
ISSN: 0957-4174
EISSN: 1873-6793
DOI: 10.1016/j.eswa.2013.08.045
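
The summary describes judging a heuristic classifier by comparing its misclassification cost with a lower bound. The sketch below illustrates that comparison only; it is not the paper's PSO/Lagrangian procedure. The function names, the toy data, the cost values, and the lower bound of 3.0 are all hypothetical placeholders chosen for illustration.

```python
import numpy as np


def misclassification_cost(w, b, X, y, cost_fp=1.0, cost_fn=1.0):
    """Asymmetric cost of the linear rule 'predict 1 if w.x + b > 0'.

    y holds labels in {0, 1}; cost_fp and cost_fn are assumed per-error costs.
    """
    pred = (X @ w + b > 0).astype(int)
    false_pos = np.sum((pred == 1) & (y == 0))
    false_neg = np.sum((pred == 0) & (y == 1))
    return cost_fp * false_pos + cost_fn * false_neg


def relative_gap(heuristic_cost, lower_bound):
    """Share of the heuristic's cost that the lower bound cannot account for."""
    if heuristic_cost == 0:
        return 0.0
    return (heuristic_cost - lower_bound) / heuristic_cost


# Toy two-group data (hypothetical), with false negatives five times as costly.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([0, 1, 0, 1])
w = np.array([1.0, 1.0])
b = -3.0

cost = misclassification_cost(w, b, X, y, cost_fp=1.0, cost_fn=5.0)
gap = relative_gap(cost, lower_bound=3.0)  # 3.0 is a made-up bound, not a computed one
print(cost, gap)
```

A small gap suggests the heuristic is close to the best achievable cost; a large gap may mean either a weak heuristic or a loose bound, which is why the summary mentions tightening the bound with subgradient optimization.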