Anomaly Detection Dataset for Industrial Control Systems

Over the past few decades, Industrial Control Systems (ICS) have been targeted by cyberattacks and are becoming increasingly vulnerable as more ICSs are connected to the internet. Using Machine Learning (ML) for Intrusion Detection Systems (IDS) is a promising approach for ICS cyber protection, but...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE access 2023, Vol.11, p.107982-107996
Hauptverfasser: Dehlaghi-Ghadim, Alireza, Moghadam, Mahshid Helali, Balador, Ali, Hansson, Hans
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Over the past few decades, Industrial Control Systems (ICS) have been targeted by cyberattacks and are becoming increasingly vulnerable as more ICSs are connected to the internet. Using Machine Learning (ML) for Intrusion Detection Systems (IDS) is a promising approach for ICS cyber protection, but the lack of suitable datasets for evaluating ML algorithms is a challenge. Although a few commonly used datasets may not reflect realistic ICS network data, lack necessary features for effective anomaly detection, or be outdated. This paper introduces the 'ICS-Flow' dataset, which offers network data and process state variables logs for supervised and unsupervised ML-based IDS assessment. The network data includes normal and anomalous network packets and flows captured from simulated ICS components and emulated networks, where the anomalies were applied to the system through various cyberattacks. We also proposed an open-source tool, "ICSFlowGenerator," for generating network flow parameters from Raw network packets. The final dataset comprises over 25,000,000 raw network packets, network flow records, and process variable logs. The paper describes the methodology used to collect and label the dataset and provides a detailed data analysis. Finally, we implement several ML models, including the decision tree, random forest, and artificial neural network to detect anomalies and attacks, demonstrating that our dataset can be used effectively for training intrusion detection ML models.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2023.3320928