Unleashing the Power of Unlabeled Data: A Self-supervised Learning Framework for Cyber Attack Detection in Smart Grids
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Modern power grids are undergoing significant changes driven by information
and communication technologies (ICTs), and evolving into smart grids with
higher efficiency and lower operating costs. Using ICTs, however, comes with an
inevitable side effect that makes the power system more vulnerable to cyber
attacks. In this paper, we propose a self-supervised learning-based framework
to detect and identify various types of cyber attacks. Different from existing
approaches, the proposed framework does not rely on large amounts of
well-curated labeled data but makes use of the massive unlabeled data in the
wild which are easily accessible. Specifically, the proposed framework adopts
the BERT model from the natural language processing domain and learns
generalizable and effective representations from the unlabeled sensing data,
which capture the distinctive patterns of different attacks. Using the learned
representations, together with a very small amount of labeled data, we can
train a task-specific classifier to detect various types of cyber attacks.
Meanwhile, real-world training datasets are usually imbalanced, i.e., there are
only a limited number of data samples containing attacks. In order to cope with
such data imbalance, we propose a new loss function, separate mean error (SME),
which pays equal attention to the large and small categories to better train
the model. Experimental results on a 5-area power grid system with 37 buses
demonstrate the superior performance of our framework over existing approaches,
especially when only a very limited portion of labeled data is available, e.g., as
low as 0.002%. We believe such a framework can be easily adopted to detect a
variety of cyber attacks in other power grid scenarios. |
---|---|
DOI: | 10.48550/arxiv.2405.13965 |