Channel Features and API Frequency-Based Transformer Model for Malware Identification

Malicious software (malware), in various forms and variants, continues to pose significant threats to user information security. Researchers have identified the effectiveness of utilizing API call sequences to identify malware. However, the evasion techniques employed by malware, such as obfuscation...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Sensors (Basel, Switzerland) Switzerland), 2024-01, Vol.24 (2), p.580
Hauptverfasser:	Qian, Liping, Cong, Lin
Format:	Artikel
Sprache:	eng
Schlagworte:	Ablation Access control Accuracy API sequence Classification Computational linguistics deep learning dynamic analysis Electric transformers Language processing Malware malware identification Natural language interfaces Natural language processing Neural networks Ransomware Researchers Safety and security measures Semantics Software Spyware transformer
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Malicious software (malware), in various forms and variants, continues to pose significant threats to user information security. Researchers have identified the effectiveness of utilizing API call sequences to identify malware. However, the evasion techniques employed by malware, such as obfuscation and complex API call sequences, challenge existing detection methods. This research addresses this issue by introducing CAFTrans, a novel transformer-based model for malware detection. We enhance the traditional transformer encoder with a one-dimensional channel attention module (1D-CAM) to improve the correlation between API call vector features, thereby enhancing feature embedding. A word frequency reinforcement module is also implemented to refine API features by preserving low-frequency API features. To capture subtle relationships between APIs and achieve more accurate identification of features for different types of malware, we leverage convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. Experimental results demonstrate the effectiveness of CAFTrans, achieving state-of-the-art performance on the mal-api-2019 dataset with an F1 score of 0.65252 and an AUC of 0.8913. The findings suggest that CAFTrans improves accuracy in distinguishing between various types of malware and exhibits enhanced recognition capabilities for unknown samples and adversarial attacks.
ISSN:	1424-8220 1424-8220
DOI:	10.3390/s24020580