Context-Aware, Adaptive, and Scalable Android Malware Detection Through Online Learning


Bibliographic details
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence, 2017-06, Vol. 1 (3), p. 157-175
Main authors: Narayanan, Annamalai; Chandramohan, Mahinthan; Chen, Lihui; Liu, Yang
Format: Article
Language: English
Description
Summary: It is well known that Android malware constantly evolves so as to evade detection. This causes the entire malware population to be nonstationary. Despite this, most prior work on machine-learning-based Android malware detection has assumed that the distribution of the observed malware characteristics (i.e., features) does not change over time. In this paper, we address the problem of malware population drift and propose a novel online-learning-based framework to detect malware, named CASANDRA (Context-aware, Adaptive and Scalable ANDRoid mAlware detector). To perform accurate detection, we propose a novel graph kernel that captures apps' security-sensitive behaviors, along with their context information, from dependence graphs. Besides being accurate and scalable, CASANDRA has two specific advantages: first, it adapts to the evolution of malware features over time; second, it explains the significant features that led to an app's classification as malicious or benign. In a large-scale comparative analysis, CASANDRA outperforms two state-of-the-art techniques on a benchmark dataset, achieving 99.23% F-measure. When evaluated on more than 87 000 apps collected in the wild, CASANDRA achieves 89.92% accuracy, outperforming existing techniques by more than 25% in their typical batch learning setting and by more than 7% when they are continuously retrained, while maintaining comparable efficiency.
ISSN: 2471-285X
DOI: 10.1109/TETCI.2017.2699220