Robust Evaluation of Diffusion-Based Adversarial Purification
Format: | Article |
---|---|
Language: | English |
Abstract: | We question the current evaluation practice on diffusion-based purification
methods. Diffusion-based purification methods aim to remove adversarial effects
from an input data point at test time. The approach has gained increasing attention
as an alternative to adversarial training because it decouples training from
testing. Well-known white-box attacks are often employed to
measure the robustness of the purification. However, it is unknown whether
these attacks are the most effective against diffusion-based purification, since
they are often tailored to adversarial training. We analyze the current
practices and provide a new guideline for measuring the robustness of
purification methods against adversarial attacks. Based on our analysis, we
further propose a new purification strategy improving robustness compared to
the current diffusion-based purification methods. |
DOI: | 10.48550/arxiv.2303.09051 |