FSVM-CIL: Fuzzy Support Vector Machines for Class Imbalance Learning

Support vector machines (SVMs) is a popular machine learning technique, which works effectively with balanced datasets. However, when it comes to imbalanced datasets, SVMs produce suboptimal classification models. On the other hand, the SVM algorithm is sensitive to outliers and noise present in the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on fuzzy systems 2010-06, Vol.18 (3), p.558-571
Hauptverfasser: Batuwita, Rukshan, Palade, Vasile
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Support vector machines (SVMs) is a popular machine learning technique, which works effectively with balanced datasets. However, when it comes to imbalanced datasets, SVMs produce suboptimal classification models. On the other hand, the SVM algorithm is sensitive to outliers and noise present in the datasets. Therefore, although the existing class imbalance learning (CIL) methods can make SVMs less sensitive to class imbalance, they can still suffer from the problem of outliers and noise. Fuzzy SVMs (FSVMs) is a variant of the SVM algorithm, which has been proposed to handle the problem of outliers and noise. In FSVMs, training examples are assigned different fuzzy-membership values based on their importance, and these membership values are incorporated into the SVM learning algorithm to make it less sensitive to outliers and noise. However, like the normal SVM algorithm, FSVMs can also suffer from the problem of class imbalance. In this paper, we present a method to improve FSVMs for CIL (called FSVM-CIL), which can be used to handle the class imbalance problem in the presence of outliers and noise. We thoroughly evaluated the proposed FSVM-CIL method on ten real-world imbalanced datasets and compared its performance with five existing CIL methods, which are available for normal SVM training. Based on the overall results, we can conclude that the proposed FSVM-CIL method is a very effective method for CIL, especially in the presence of outliers and noise in datasets.
ISSN:1063-6706
1941-0034
DOI:10.1109/TFUZZ.2010.2042721