Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks
Saved in:

Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text

Abstract:
Model quantization is widely used to compress and accelerate deep neural networks. However, recent studies have revealed the feasibility of weaponizing model quantization by implanting quantization-conditioned backdoors (QCBs). These special backdoors stay dormant in released full-precision models but come into effect after standard quantization. Due to the peculiarity of QCBs, existing defenses have little effect on reducing their threat or are even infeasible. In this paper, we conduct the first in-depth analysis of QCBs. We reveal that the activation of existing QCBs primarily stems from the nearest rounding operation and is closely related to the norms of the neuron-wise truncation errors (i.e., the differences between the continuous full-precision weights and their quantized versions). Motivated by these insights, we propose Error-guided Flipped Rounding with Activation Preservation (EFRAP), an effective and practical defense against QCBs. Specifically, EFRAP learns a non-nearest rounding strategy with neuron-wise error norm and layer-wise activation preservation guidance, flipping the rounding strategies of neurons crucial for backdoor effects but with minimal impact on clean accuracy. Extensive evaluations on benchmark datasets demonstrate that EFRAP can defeat state-of-the-art QCB attacks under various settings. Code is available at https://github.com/AntigoneRandy/QuantBackdoor_EFRAP.

DOI: 10.48550/arxiv.2405.12725
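As a rough illustration of the mechanism the abstract describes (not the authors' implementation; see the linked repository for that), the sketch below quantizes a neuron's weights with standard nearest rounding, computes the neuron-wise truncation error norm, and shows the non-nearest "flipped" alternative in which each weight is rounded to the other neighboring grid point. The step size `SCALE` and the example weights are hypothetical.

```python
import math

SCALE = 0.1  # hypothetical uniform quantization step size


def nearest_round(w, scale=SCALE):
    # Standard quantization: snap each weight to the nearest grid point.
    return [round(x / scale) * scale for x in w]


def flipped_round(w, scale=SCALE):
    # Non-nearest rounding: snap each weight to the *other* neighboring
    # grid point, illustrating the flipped direction EFRAP can choose for
    # neurons whose large error norms suggest backdoor involvement.
    out = []
    for x in w:
        lo = math.floor(x / scale) * scale
        hi = math.ceil(x / scale) * scale
        nearest = lo if abs(x - lo) <= abs(x - hi) else hi
        out.append(hi if nearest == lo else lo)
    return out


def truncation_error_norm(w, w_q):
    # L2 norm of the truncation error for one neuron: the difference
    # between the full-precision weights and their quantized versions.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w_q)))


# Hypothetical weights of a single neuron.
weights = [0.23, -0.41]
q_nearest = nearest_round(weights)   # ~[0.2, -0.4]
q_flipped = flipped_round(weights)   # ~[0.3, -0.5]
print(truncation_error_norm(weights, q_nearest))
```

A real defense would, of course, choose rounding directions per neuron under an activation-preservation constraint rather than flipping every weight; this sketch only makes the nearest-vs-flipped distinction and the error-norm quantity concrete.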