Certified Adversarial Robustness of Machine Learning-based Malware Detectors via (De)Randomized Smoothing
Saved in:
Main Authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Abstract: | Deep learning-based malware detection systems are vulnerable to adversarial
EXEmples: carefully crafted malicious programs that evade detection with
minimal perturbation. As such, the community is dedicating effort to developing
mechanisms to defend against adversarial EXEmples. However, current randomized
smoothing-based defenses are still vulnerable to attacks that inject blocks of
adversarial content. In this paper, we introduce a certifiable defense against
patch attacks that guarantees, for a given executable and an adversarial patch
size, that no adversarial EXEmple exists. Our method is inspired by (de)randomized
smoothing, which provides deterministic robustness certificates. During
training, a base classifier is trained on subsets of contiguous bytes. At
inference time, our defense splits the executable into non-overlapping chunks,
classifies each chunk independently, and computes the final prediction through
majority voting to minimize the influence of injected content. Furthermore, we
introduce a preprocessing step that fixes the size of the sections and headers
to a multiple of the chunk size. As a consequence, the injected content is
confined to an integer number of chunks without tampering with the other chunks
containing the real bytes of the input examples, allowing us to extend our
certified robustness guarantees to content-insertion attacks. We perform an
extensive ablation study, comparing our defense with randomized
smoothing-based defenses against a plethora of content-manipulation attacks and
neural network architectures. Results show that our method exhibits unmatched
robustness against strong content-insertion attacks, outperforming the randomized
smoothing-based defenses in the literature. |
DOI: | 10.48550/arxiv.2405.00392 |
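
The following is a minimal, illustrative sketch (not the authors' implementation) of the chunk-and-vote inference described in the abstract: the executable bytes are padded to a multiple of an assumed chunk size, split into non-overlapping contiguous chunks, each chunk is classified independently, and the final label is the majority vote. The chunk size, padding byte, function names, and the `base_classifier` interface are assumptions made here for illustration only.

```python
# Minimal sketch of chunk-based (de)randomized smoothing inference.
# `base_classifier` is a hypothetical callable mapping a fixed-size byte
# chunk to a label (e.g. 0 = benign, 1 = malware); it stands in for the
# trained base model and is not part of the paper's released code.

from collections import Counter
from typing import Callable

CHUNK_SIZE = 512  # assumed chunk length in bytes; the paper's value may differ


def pad_to_multiple(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Pad the byte sequence so its length is a multiple of chunk_size.

    This mirrors (in simplified form) the idea of aligning sections and
    headers to chunk boundaries, here applied to the whole file.
    """
    remainder = len(data) % chunk_size
    if remainder:
        data += b"\x00" * (chunk_size - remainder)
    return data


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split the padded executable into non-overlapping, contiguous chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def smoothed_predict(data: bytes,
                     base_classifier: Callable[[bytes], int],
                     chunk_size: int = CHUNK_SIZE) -> int:
    """Classify each chunk independently and return the majority-vote label.

    Intuition behind the certificate: an adversarial patch of p bytes can
    overlap at most ceil(p / chunk_size) + 1 chunks (exactly an integer
    number of chunks once content is aligned to chunk boundaries), so a
    sufficiently large margin between the top two vote counts guarantees
    the prediction cannot be flipped by such a patch.
    """
    chunks = split_into_chunks(pad_to_multiple(data, chunk_size), chunk_size)
    votes = Counter(base_classifier(chunk) for chunk in chunks)
    return votes.most_common(1)[0][0]
```

Design note: majority voting over independently classified chunks is what makes the certificate deterministic rather than probabilistic, since no random sampling of ablations is involved at inference time.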