Support Vector Machines with the Ramp Loss and the Hard Margin Loss

Bibliographic details
Published in:	Operations Research 2011-03, Vol. 59 (2), p. 467-479
Main author:	Brooks, J. Paul
Format:	Article
Language:	English
Subjects:	Analysis; Applied sciences; Computer science control theory systems; Data processing. List processing. Character string processing; Datasets; Exact sciences and technology; integer applications; Integer programming; Integers; Integrality; Kernel functions; Machine learning; Mathematical programming; Mathematical vectors; Memory organisation. Data processing; Methods; Observational research; Operational research and scientific management; Operational research. Management science; Optimal solutions; Optimization algorithms; Outliers; pattern analysis; programming; quadratic integer programming; Quadratic programming; Software; statistics; Structural optimization; Studies; Support vector machines
Online access:	Full text

Description
Summary:	In the interest of deriving classifiers that are robust to outlier observations, we present integer programming formulations of Vapnik's support vector machine (SVM) with the ramp loss and hard margin loss. The ramp loss allows a maximum error of 2 for each training observation, while the hard margin loss calculates error by counting the number of training observations that are in the margin or misclassified outside of the margin. SVM with these loss functions is shown to be a consistent estimator when used with certain kernel functions. In computational studies with simulated and real-world data, SVM with the robust loss functions ignores outlier observations effectively, providing an advantage over SVM with the traditional hinge loss when using the linear kernel. Despite the fact that training SVM with the robust loss functions requires the solution of a quadratic mixed-integer program (QMIP) and is NP-hard, while traditional SVM requires only the solution of a continuous quadratic program (QP), we are able to find good solutions and prove optimality for instances with up to 500 observations. Solution methods are presented for the new formulations that improve computational performance over industry-standard integer programming solvers alone.
ISSN:	0030-364X, 1526-5463
DOI:	10.1287/opre.1100.0854
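
Illustration:	The three loss functions named in the summary can be compared on a handful of observations. The sketch below is not taken from the article; it assumes the usual convention that each observation is summarized by its signed margin y * f(x), that the hinge loss is max(0, 1 - margin), that the ramp loss is the hinge loss capped at 2, and that the hard margin loss counts 1 for every observation with margin strictly below 1. The function names and the sample margins are hypothetical.

```python
def hinge_loss(margin: float) -> float:
    """Traditional SVM loss: grows without bound as the margin decreases."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin: float) -> float:
    """Hinge loss capped at 2, so a single observation contributes at most 2."""
    return min(2.0, hinge_loss(margin))


def hard_margin_loss(margin: float) -> int:
    """Counts 1 for an observation in the margin or misclassified.

    Assumption: an observation exactly on the margin boundary (margin == 1)
    incurs no error.
    """
    return 1 if margin < 1.0 else 0


def total_loss(loss, margins):
    """Sum a per-observation loss over a list of signed margins."""
    return sum(loss(m) for m in margins)


# Hypothetical signed margins y * f(x); the last one mimics a far-off outlier.
margins = [2.0, 1.0, 0.5, -0.5, -9.0]

if __name__ == "__main__":
    for name, loss in [("hinge", hinge_loss),
                       ("ramp", ramp_loss),
                       ("hard margin", hard_margin_loss)]:
        print(f"{name}: {total_loss(loss, margins)}")
```

Under these assumptions the outlier with margin -9 contributes 10 to the hinge total but only 2 to the ramp total and 1 to the hard margin total, which is the bounded-influence property the summary describes. Training with the bounded losses is what leads to the mixed-integer formulation; this sketch only evaluates the losses and does not solve that program.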