Support vector machine in big data: smoothing strategy and adaptive distributed inference

Bibliographic details
Published in:	Statistics and computing 2024-12, Vol.34 (6), Article 188
Main authors:	Wang, Kangning; Liu, Jin; Sun, Xiaofei
Format:	Article
Language:	English
Subjects:	Adaptive sampling; Algorithms; Artificial Intelligence; Big Data; Computer Science; Data smoothing; Original Paper; Probability and Statistics in Computer Science; Randomness; Smoothing; Smoothness; Statistical Theory and Methods; Statistics and Computing/Statistics Programs; Support vector machines
Online access:	Full text
Description
Abstract:	The support vector machine (SVM) is a powerful binary classification tool, but the growing size of modern data sets poses challenges for it. First, the non-smoothness of the hinge loss makes large-scale computation difficult. Second, existing large-scale distributed algorithms rely heavily on uniformity and randomness conditions, which are frequently violated in practice. To address these issues, we first construct a convolution-smoothed SVM, which has a smooth and convex objective function. We then develop a distributed SVM whose estimator can be computed conveniently by minimizing a distributed surrogate loss based on a pilot sample. In particular, the method is adaptive when the uniformity or randomness condition is violated. The theoretical results and numerical experiments on both synthetic and real data support the proposed methods.
ISSN:	0960-3174 1573-1375
DOI:	10.1007/s11222-024-10506-5
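
Illustrative note: the convolution smoothing step mentioned in the abstract can be sketched in a few lines. The abstract does not state which kernel or bandwidth the authors use, so the sketch below assumes a Gaussian kernel with a hypothetical bandwidth `h`; the function names are also illustrative and not taken from the paper. Under that assumption, convolving the hinge loss max(0, 1 - u) with the kernel has the closed form a * Phi(a / h) + h * phi(a / h) with a = 1 - u, which is smooth and convex in u.

```python
import math


def hinge(u: float) -> float:
    """Standard hinge loss max(0, 1 - u), where u = y * f(x) is the margin."""
    return max(0.0, 1.0 - u)


def _norm_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_hinge(u: float, h: float) -> float:
    """Hinge loss convolved with a Gaussian kernel of bandwidth h > 0.

    Assumption: Gaussian kernel (the abstract does not name one).
    Closed form: a * Phi(a / h) + h * phi(a / h), with a = 1 - u.
    """
    a = 1.0 - u
    return a * _norm_cdf(a / h) + h * _norm_pdf(a / h)


def smoothed_hinge_grad(u: float, h: float) -> float:
    """Derivative of smoothed_hinge with respect to u: -Phi((1 - u) / h)."""
    return -_norm_cdf((1.0 - u) / h)


if __name__ == "__main__":
    # Compare the two losses around the kink of the hinge loss at u = 1.
    for u in (-1.0, 0.5, 1.0, 2.0):
        print(u, hinge(u), smoothed_hinge(u, 0.25), smoothed_hinge_grad(u, 0.25))
```

Because the gradient exists everywhere, standard gradient-based solvers can be applied to the smoothed objective; as `h` shrinks, the smoothed loss approaches the original hinge loss.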