Automatic malware classification and new malware detection using machine learning

The explosive growth ofmalware variants poses a major threat to information security. Traditional anti-virus systems based on signatures fail to classify unknown malware into their corresponding families and to detect new kinds of malware pro- grams. Therefore, we propose a machine learning based ma...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Frontiers of information technology & electronic engineering 2017-09, Vol.18 (9), p.1336-1347
Hauptverfasser: Liu, Liu, Wang, Bao-sheng, Yu, Bo, Zhong, Qiu-xi
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The explosive growth ofmalware variants poses a major threat to information security. Traditional anti-virus systems based on signatures fail to classify unknown malware into their corresponding families and to detect new kinds of malware pro- grams. Therefore, we propose a machine learning based malware analysis system, which is composed of three modules: data processing, decision making, and new malware detection. The data processing module deals with gray-scale images, Opcode n-gram, and import fimctions, which are employed to extract the features of the malware. The decision-making module uses the features to classify the malware and to identify suspicious malware. Finally, the detection module uses the shared nearest neighbor (SNN) clustering algorithm to discover new malware families. Our approach is evaluated on more than 20 000 malware instances, which were collected by Kingsoft, ESET NOD32, and Anubis. The results show that our system can effectively classify the un- known malware with a best accuracy of 98.9%, and successfully detects 86.7% of the new malware.
ISSN:2095-9184
2095-9230
DOI:10.1631/FITEE.1601325