A Novel Data-Driven Attack Method on Machine Learning Models

Bibliographic details
Published in:	J.UCS (Annual print and CD-ROM archive ed.) 2024-03, Vol.30 (3), p.402-417
Main authors:	Sadikoglu, Emre; Kösesoy, Irfan; Gök, Murat
Format:	Article
Language:	English
Subjects:	Adversarial da; Algorithms; Analysis; Artificial intelligence; Cyber security; Cybersecurity; Data driven attack; Data mining; Datasets; Effectiveness; Machine learning; Methods; Reverse engineering; Security systems
Online access:	Full text

Description
Abstract:	With the increasing popularity and usage of artificial intelligence systems, it has become crucial to address their vulnerability to cyber-attacks. In this study, we propose a novel gradient descent-based method to generate fake data that can be accepted as positive by a targeted machine learning model. Our method is designed to generate a large number of positive samples with a minimal number of probes to the model, making it difficult for security systems to detect. Additionally, we develop an alternative model to the attacked model using a reverse engineering approach, trained on a dataset composed of the samples generated by our method. We evaluate the success of our proposed method and the alternative model through a series of experiments on six distinct datasets, each used to train models with three separate machine learning algorithms, for a total of eighteen unique models that were evaluated and compared in our analysis. The results were evaluated with the metrics most commonly used in the literature: effective attack rate (EAR), accuracy, precision, recall, and F1 score. Focusing particularly on EAR-oriented assessments, our method achieves a notably high EAR of 97% for the combination of the kNN method and the Cancer dataset. According to the results of our experiments, the proposed method is highly effective as a data-driven attack method.
ISSN:	0948-695X 0948-6968
DOI:	10.3897/jucs.108445
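
Illustrative note: as a rough sketch of the evaluation metrics named in the abstract, the Python below computes an effective attack rate alongside accuracy, precision, recall, and F1 score. The EAR definition used here (the share of generated samples that the target model labels positive) is an assumption, because this record does not define the metric. The function names, the `toy_target` stand-in model, and the sample values are hypothetical and are not taken from the article.

```python
from typing import Callable, Dict, Sequence, Tuple

Sample = Tuple[float, ...]


def effective_attack_rate(
    target_predict: Callable[[Sample], int],
    generated: Sequence[Sample],
) -> float:
    """ASSUMED definition: fraction of generated samples labeled positive (1)."""
    if len(generated) == 0:
        raise ValueError("need at least one generated sample")
    accepted = sum(1 for s in generated if target_predict(s) == 1)
    return accepted / len(generated)


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Standard binary classification metrics from true and predicted labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def toy_target(sample: Sample) -> int:
    """Hypothetical stand-in for an attacked model: positive if features sum above 1."""
    return 1 if sum(sample) > 1.0 else 0


# Hypothetical generated samples; three of the four are accepted by toy_target.
generated = [(1.0, 0.5), (0.2, 0.1), (0.9, 0.9), (2.0, 0.0)]
ear = effective_attack_rate(toy_target, generated)
```

In this toy setup, `binary_metrics` could be used to compare a surrogate model's predictions against the target model's labels, which is one plausible reading of how the alternative model is scored.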