Multi-perspective API call sequence behavior analysis and fusion for malware classification

The growing variety of malicious software, i.e., malware, has caused great damage and economic loss to computer systems. The API call sequence of malware reflects its dynamic behavior during execution, which is difficult to disguise. Therefore, API call sequence can serve as a robust feature for the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Computers & security 2025-01, Vol.148, p.104177, Article 104177
Hauptverfasser: Wu, Peng, Gao, Mohan, Sun, Fuhui, Wang, Xiaoyan, Pan, Li
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The growing variety of malicious software, i.e., malware, has caused great damage and economic loss to computer systems. The API call sequence of malware reflects its dynamic behavior during execution, which is difficult to disguise. Therefore, API call sequence can serve as a robust feature for the detection and classification of malware. The statistical analysis presented in this paper reveals two distinct characteristics within the API call sequences of different malware: (1) the API existence feature caused by frequent calls to the APIs with some special functions, and (2) the API transition feature caused by frequent calls to some special API subsequence patterns. Based on these two characteristics, this paper proposes MINES, a Multi-perspective apI call sequeNce bEhavior fuSion malware classification Method. Specifically, the API existence features from different perspectives are described by two graphs that model diverse rich and complex existence relationships between APIs, and we adopt the graph contrastive learning framework to extract the consistent shared API existence feature from two graphs. Similarly, the API transition features of different hops are described by the multi-order transition probability matrices. By treat each order as a channel, a CNN-based contrastive learning framework is adopted to extract the API transition feature. Finally, the two kinds of extracted features are fused to classify malware. Experiments on five datasets demonstrate the superiority of MINES over various state-of-the-arts by a large margin.
ISSN:0167-4048
DOI:10.1016/j.cose.2024.104177