Empirical comparison of cross-validation and internal metrics for tuning SVM hyperparameters
Highlights:
• Compared cross-validation with 5 internal metrics to select SVM hyperparameters.
• Cross-validation results in hyperparameters with better accuracy on new data.
• Distance between two classes (DBTC) is the second-best algorithm.
• DBTC has the lowest execution time and is a very competitive alternative to 5-fold CV.
Published in: Pattern Recognition Letters, 2017-03, Vol. 88, pp. 6–11
Format: Article
Language: English
Online access: Full text
Abstract:
Hyperparameter tuning is a mandatory step in building a support vector machine classifier. In this work, we study methods based on metrics computed on the training set itself, rather than on the classifier's performance on a separate test set, which is the usual cross-validation approach. We compare 5-fold cross-validation with Xi-alpha, the radius-margin bound, generalized approximate cross-validation, maximum discrepancy, and distance between two classes on 110 public binary data sets. Cross-validation selected the best hyperparameters, but it also had one of the longest execution times. Distance between two classes (DBTC) is the fastest and the second-best-ranked method. We argue that DBTC is a reasonable alternative to cross-validation when training/hyperparameter-selection time is an issue, and that the loss in accuracy when using DBTC is reasonably small.
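The 5-fold cross-validation baseline described in the abstract can be sketched as follows. This is an illustration only, not the authors' code: the synthetic data set, the RBF kernel, and the C/gamma grid ranges are all assumptions standing in for the paper's 110 public binary data sets and its actual search space.

```python
# Illustrative sketch (not the paper's code): selecting SVM hyperparameters
# with 5-fold cross-validation, the baseline method the paper compares against.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# A synthetic binary data set stands in for one of the 110 public data sets.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Grid over the RBF-SVM hyperparameters C and gamma (ranges are assumptions).
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}

# cv=5 gives the 5-fold cross-validation studied in the paper; each (C, gamma)
# pair is scored by mean held-out accuracy over the 5 folds.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

The internal metrics the paper studies (Xi-alpha, radius-margin bound, generalized approximate cross-validation, maximum discrepancy, DBTC) would replace the held-out accuracy scoring with a quantity computed from the training set alone, avoiding the repeated refitting that makes cross-validation slow.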
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2017.01.007