Deep Learning and Visualization for Identifying Malware Families
The growing threat of malware is becoming more and more difficult to ignore. In this paper, a malware feature images generation method is used to combine the static analysis of malicious code with the methods of recurrent neural networks (RNN) and convolutional neural networks (CNN). By using an RNN...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on dependable and secure computing 2021-01, Vol.18 (1), p.283-295 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The growing threat of malware is becoming more and more difficult to ignore. In this paper, a malware feature images generation method is used to combine the static analysis of malicious code with the methods of recurrent neural networks (RNN) and convolutional neural networks (CNN). By using an RNN, our method considers not only the original information of malware but also the ability to associate the original code with timing characteristics; furthermore, the process reduces the dependence on category labels of malware. Then, we use minhash to generate feature images from the fusion of the original codes and the predictive codes from the RNN. Finally, we train a CNN to classify feature images. When we trained very few samples (the proportion of the sample size of training dataset to validation dataset was 1:30), we obtained accuracy over 92 percent. When we adjust the proportion to 3:1, the accuracy exceeds 99.5 percent. As shown in confusion matrices, our method obtains a good result, where the worst false positive rate of all the malware families is 0.0147 and the average false positive rate is 0.0058. |
---|---|
ISSN: | 1545-5971 1941-0018 |
DOI: | 10.1109/TDSC.2018.2884928 |