Effective detection of android malware based on the usage of data flow APIs and machine learning

Context. Android has been ranked as the top smartphone platform nowadays. Studies show that Android malware have increased dramatically and that personal privacy theft has become a major form of attack in recent years. These critical security circumstances have generated a strong interest in develop...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Information and software technology 2016-07, Vol.75, p.17-25
Hauptverfasser:	Wu, Songyang, Wang, Pan, Li, Xun, Zhang, Yong
Format:	Artikel
Sprache:	eng
Schlagworte:	Android security Application programming interface Applications programs Artificial intelligence Classification Computer viruses Data transmission Lists Machine learning Malware Malware detection Operating systems Privacy Privacy leakage Smartphones Studies
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Context. Android has been ranked as the top smartphone platform nowadays. Studies show that Android malware have increased dramatically and that personal privacy theft has become a major form of attack in recent years. These critical security circumstances have generated a strong interest in developing systems that automatically detect malicious behaviour in Android applications (apps). However, most methods of detecting sensitive data leakage have certain shortcomings, including computational expensiveness and false positives. Objective. This study proposes an Android malware detecting system that provides highly accurate classification and efficient sensitive data transmission analysis. Method. The study adopts a machine learning approach that leverages the use of dataflow application program interfaces (APIs) as classification features to detect Android malware. We conduct a thorough analysis to extract dataflow-related API-level features and improve the k-nearest neighbour classification model. The dataflow-related API list is further optimized through machine learning, which enables us to improve considerably the efficiency of sensitive data transmission analysis, whereas analytical accuracy is approximated to that of the experiment using a full dataflow-related API list. Results. The proposed scheme is evaluated using 1160 benign and 1050 malicious samples. Results show that the system can achieve an accuracy rate of as high as 97.66% in detecting unknown Android malware. Our experiment of static dataflow analysis shows that more than 85% of sensitive data transmission paths can be determined using the refined API subset, whereas time of analysis decreases by nearly 40%. Conclusion. The usage of dataflow-related APIs is a valid feature for identifying Android malware. The proposed scheme provides an efficient approach to detecting Android malware and investigating privacy violations in malicious apps.
ISSN:	0950-5849 1873-6025
DOI:	10.1016/j.infsof.2016.03.004