The trace kernel bandwidth criterion for support vector data description

Bibliographic details
Published in: Pattern Recognition, 2021-03, Vol. 111, p. 107662, Article 107662
Main authors: Chaudhuri, Arin; Sadek, Carol; Kakde, Deovrat; Wang, Haoyu; Hu, Wenhao; Jiang, Hansi; Kong, Seunghyun; Liao, Yuwei; Peredriy, Sergiy
Format: Article
Language: English
Description
Summary:
• Support Vector Data Description (SVDD) is a popular kernel-based unsupervised one-class classification method. The Gaussian kernel is the most commonly used kernel.
• The Gaussian kernel has a tuning parameter, the kernel bandwidth, and it is important to choose it correctly.
• We propose an automated, unsupervised bandwidth selection method for SVDD.
• Our proposed bandwidth is also appropriate for One-Class Support Vector Machines (OCSVM).

Support vector data description (SVDD) is a popular anomaly detection technique. The computation of the SVDD classifier requires a kernel function, for which the Gaussian kernel is a common choice. The Gaussian kernel has a bandwidth parameter, and it is important to set the value of this parameter correctly to ensure good results. A small bandwidth leads to overfitting, and the resulting SVDD classifier overestimates the number of anomalies, whereas a large bandwidth leads to underfitting and an inability to detect many anomalies. In this paper, we present a new, unsupervised method for selecting the Gaussian kernel bandwidth. Our method exploits a low-rank representation of the kernel matrix to suggest a kernel bandwidth value. Our new technique is competitive with the current state of the art for low-dimensional data and performs extremely well for many classes of high-dimensional data. This method is also applicable to one-class support vector machines (OCSVM).
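To make the link between bandwidth and a low-rank kernel matrix concrete, the following Python sketch builds a Gaussian kernel matrix for several bandwidths and counts how many eigenvalues are needed to account for 99% of its trace. It is only an illustration and not the selection criterion from the paper, which this record does not spell out. The parameterization exp(-||x - y||^2 / (2 s^2)), the 99% threshold, the synthetic data, and the helper names `gaussian_kernel_matrix` and `effective_rank` are all assumptions made for the example.

```python
import numpy as np


def gaussian_kernel_matrix(X, bandwidth):
    """Gaussian kernel matrix with K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).

    This is one common parameterization, assumed here for illustration.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Rounding can produce tiny negative squared distances; clamp them to zero.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def effective_rank(K, trace_fraction=0.99):
    """Number of leading eigenvalues needed to reach `trace_fraction` of trace(K)."""
    eigvals = np.linalg.eigvalsh(K)[::-1]   # sorted from largest to smallest
    eigvals = np.clip(eigvals, 0.0, None)   # remove small negative round-off
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    count = int(np.searchsorted(cumulative, trace_fraction)) + 1
    return min(count, len(eigvals))


# Synthetic two-dimensional data, used only for demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

ranks = {}
for s in (0.01, 0.1, 1.0, 10.0, 100.0):
    ranks[s] = effective_rank(gaussian_kernel_matrix(X, s))
    print(f"bandwidth={s:>6}: effective rank = {ranks[s]}")
```

With a very small bandwidth the kernel matrix is close to the identity, so almost every eigenvalue is needed, which is the overfitting regime described above. With a very large bandwidth the matrix is close to all ones and a single eigenvalue dominates, which is the underfitting regime. A useful bandwidth lies somewhere between these two extremes.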
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2020.107662