GLSVM: Integrating Structured Feature Selection and Large Margin Classification

High dimensional data challenges current feature selection methods. For many real world problems we often have prior knowledge about the relationship of features. For example in microarray data analysis, genes from the same biological pathways are expected to have similar relationship to the outcome...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Hongliang Fei, Quanz, B., Jun Huan
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:High dimensional data challenges current feature selection methods. For many real world problems we often have prior knowledge about the relationship of features. For example in microarray data analysis, genes from the same biological pathways are expected to have similar relationship to the outcome that we target to predict. Recent regularization methods on support vector machine (SVM) have achieved great success to perform feature selection and model selection simultaneously for high dimensional data, but neglect such relationship among features. To build interpretable SVM models, the structure information of features should be incorporated. In this paper, we propose an algorithm GLSVM that automatically perform model selection and feature selection in SVMs. To incorporate the prior knowledge of feature relationship, we extend standard 2 norm SVM and use a penalty function that employs a L 2 norm regularization term including the normalized Laplacian of the graph and L 1 penalty. We have demonstrated the effectiveness of our methods and compare them to the state-of-the-art using two real-world benchmarks.
ISSN:2375-9232
2375-9259
DOI:10.1109/ICDMW.2009.39