Adversarial Fine-tune with Dynamically Regulated Adversary
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Adversarial training is an effective method to boost model robustness to
malicious adversarial attacks. However, such improvement in model robustness
often comes at a significant cost to standard performance on clean images. In
many real-world applications, such as health diagnosis and autonomous surgical
robotics, standard performance is valued more highly than robustness against
such extremely malicious attacks. This leads to the question: to what extent
can we boost model robustness without sacrificing standard performance? This
work tackles this problem and proposes a simple yet effective transfer
learning-based adversarial training strategy that disentangles the negative
effects of adversarial samples on the model's standard performance. In
addition, we introduce a training-friendly adversarial attack algorithm that
boosts adversarial robustness without introducing significant training
complexity. Extensive experimentation indicates that the proposed method
outperforms previous adversarial training algorithms on the target: improving
model robustness while preserving the model's standard performance on clean
data. |
---|---|
DOI: | 10.48550/arxiv.2204.13232 |
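For readers unfamiliar with the adversarial training the abstract builds on, the idea can be sketched with a toy example: generate a perturbed (adversarial) version of each training sample via the fast gradient sign method (FGSM), then take the gradient step on the perturbed batch instead of the clean one. The sketch below uses a plain NumPy logistic-regression model; all function names and hyperparameters here are illustrative and are not taken from the paper, which proposes its own transfer-learning strategy and attack algorithm on top of this basic loop.

```python
import numpy as np

def fgsm_perturb(x, y, w, b, eps):
    """Craft an FGSM adversarial example for logistic regression:
    move x by eps in the sign of the input-gradient of the BCE loss."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoid prediction
    grad_x = (p - y) * w                    # dL/dx for binary cross-entropy
    return x + eps * np.sign(grad_x)

def adversarial_train(X, y, eps=0.2, lr=0.5, steps=200, seed=0):
    """Toy adversarial training loop: each outer step trains on FGSM
    perturbations of the data rather than on the clean samples."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1]) * 0.01
    b = 0.0
    for _ in range(steps):
        # Inner step: build the adversarial version of every sample.
        X_adv = np.stack([fgsm_perturb(xi, yi, w, b, eps)
                          for xi, yi in zip(X, y)])
        # Outer step: gradient descent on the adversarial batch.
        p = 1.0 / (1.0 + np.exp(-(X_adv @ w + b)))
        w -= lr * (X_adv.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Linearly separable toy data: the label is the sign of the first feature.
X = np.array([[2.0, 0.3], [1.5, -0.4], [-2.0, 0.2], [-1.7, -0.5]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w, b = adversarial_train(X, y, eps=0.2)
preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
```

The trade-off the abstract discusses shows up directly in `eps`: a larger perturbation budget during training tends to buy more robustness at the cost of accuracy on clean inputs, which is the tension the paper's transfer-learning strategy is designed to relieve.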