Predicting sensitive information leakage in IoT applications using flows-aware machine learning approach
This paper presents an approach for identification of vulnerable IoT applications. The approach focuses on a category of vulnerabilities that leads to sensitive information leakage which can be identified by using taint flow analysis. Tainted flows vulnerability is very much impacted by the structur...
Gespeichert in:
Veröffentlicht in: | Empirical software engineering : an international journal 2022-11, Vol.27 (6), Article 137 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper presents an approach for identification of vulnerable IoT applications. The approach focuses on a category of vulnerabilities that leads to sensitive information leakage which can be identified by using taint flow analysis. Tainted flows vulnerability is very much impacted by the structure of the program and the order of the statements in the code, designing an approach to detect such vulnerability needs to take into consideration such information in order to provide precise results. In this paper, we propose and develop an approach, FlowsMiner, that mines features from the code related to program structure such as control statements and methods, in addition to program’s statement order. FlowsMiner, generates features in the form of tainted flows. We developed, Flows2Vec, a tool that transform the features recovered by FlowsMiner into vectors, which are then used to aid the process of machine learning by providing a flow’s aware model building process. The resulting model is capable of accurately classify applications as vulnerable if the vulnerability is exhibited by changes in the order of statements in source code. When compared to a base Bag of Words (BoW) approach, the experiments show that the proposed approach has improved the AUC of the prediction models for all algorithms and the best case for
C
o
r
p
u
s
1 dataset is improved from 0.91 to 0.94 and for
C
o
r
p
u
s
2 from 0.56 to 0.96. |
---|---|
ISSN: | 1382-3256 1573-7616 |
DOI: | 10.1007/s10664-022-10157-y |