Efficient classification using parallel and scalable compressed model and its application on intrusion detection


Bibliographic details
Published in: Expert Systems with Applications, 2014-10, Vol. 41 (13), pp. 5972-5983
Main authors: Chen, Tieming; Zhang, Xu; Jin, Shichao; Kim, Okhee
Format: Article
Language: English
Description
Summary:
• We propose a compressed model composed of horizontal and vertical compression.
• We employ OneR for horizontal compression and AP clustering for vertical compression.
• We implement a MapReduce-based scalable, parallel framework for compression.
• We use KNN and SVM to build an intrusion detector on the proposed compressed model.
• Both KDD99 and CDMC2012 are evaluated to show detection efficiency and accuracy.

To achieve highly efficient classification in intrusion detection, this paper proposes a compressed model that combines horizontal compression with vertical compression. OneR serves as the horizontal compression for attribute reduction, and affinity propagation serves as the vertical compression, selecting a small set of representative exemplars from large training data. To compress larger volumes of training data scalably, a MapReduce-based parallelization approach is implemented and evaluated for each step of the compression process; common but efficient classification methods can then be applied directly to the compressed model. An experimental study on two publicly available intrusion detection datasets, KDD99 and CDMC2012, shows that classification with the proposed compressed model speeds up detection by up to 184 times, at the cost of an accuracy difference of less than 1% on average.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2014.04.009
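The two compression steps named in the summary can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: it leaves out the MapReduce parallelization and the KNN/SVM stage, the function names (`oner_error`, `select_attributes`, `affinity_propagation`) are hypothetical, the toy records are invented rather than taken from KDD99 or CDMC2012, and the median-similarity preference and the damping value are assumed defaults, not settings reported in the paper. OneR ranking here assumes discrete attributes, and the number of exemplars that affinity propagation returns depends on the preference value.

```python
from collections import Counter, defaultdict


def oner_error(rows, labels, attr):
    """Training error of the one-rule classifier built on one discrete attribute."""
    by_value = defaultdict(Counter)
    for row, label in zip(rows, labels):
        by_value[row[attr]][label] += 1
    # Each attribute value predicts its majority class.
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return 1 - correct / len(rows)


def select_attributes(rows, labels, keep):
    """Horizontal compression: indices of the `keep` attributes with lowest OneR error."""
    ranked = sorted(range(len(rows[0])), key=lambda a: oner_error(rows, labels, a))
    return sorted(ranked[:keep])


def affinity_propagation(points, damping=0.5, iterations=200):
    """Vertical compression: indices of exemplar points (needs at least 2 points)."""
    n = len(points)
    # Similarity = negative squared Euclidean distance.
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    off_diag = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    preference = off_diag[len(off_diag) // 2]  # assumed choice: median similarity
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            for k in range(n):
                best_other = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus self-availability is positive.
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


# Invented toy records: (protocol, service, flag) with a class label each.
rows = [
    ("tcp", "http", "SF"), ("tcp", "http", "S0"), ("tcp", "private", "SF"),
    ("udp", "private", "SF"), ("udp", "private", "S0"), ("udp", "http", "SF"),
]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
kept = select_attributes(rows, labels, keep=2)
print("attributes kept:", kept)

# Invented numeric points forming two loose groups.
points = [(0.0, 0.1), (0.3, 0.0), (0.1, 0.4), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
exemplars = affinity_propagation(points)
print("exemplar indices:", exemplars)
```

In a pipeline of the kind the summary describes, a classifier such as KNN or SVM would then be trained only on the exemplar rows restricted to the kept attributes, which is where the reported detection speed-up would come from.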