Fixed Inter-Neuron Covariability Induces Adversarial Robustness
Format: | Article |
Language: | eng |
Abstract: | The vulnerability to adversarial perturbations is a major flaw of Deep
Neural Networks (DNNs) that raises questions about their reliability in
real-world scenarios. On the other hand, human perception, which DNNs are
supposed to emulate, is highly robust to such perturbations, indicating that
there may be certain features of human perception that make it robust but are
not represented in the current class of DNNs. One such feature is that the
activity of biological neurons is correlated and the structure of this
correlation tends to be rather rigid over long spans of time, even if it
hampers performance and learning. We hypothesize that integrating such
constraints on the activations of a DNN would improve its adversarial
robustness, and, to test this hypothesis, we have developed the
Self-Consistent Activation (SCA) layer, which comprises neurons whose
activations are consistent with each other, as they conform to a fixed, but
learned, covariability pattern. When evaluated on image and sound recognition
tasks, the models with an SCA layer achieved high accuracy and exhibited
significantly greater robustness than multi-layer perceptron models to
state-of-the-art Auto-PGD adversarial attacks without being trained on
adversarially perturbed data. |
DOI: | 10.48550/arxiv.2308.03956 |
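The abstract describes the SCA layer only at a high level: neurons whose activations conform to a fixed, learned covariability pattern. The sketch below is one plausible reading, not the paper's implementation: the covariability constraint is approximated by projecting each activation vector onto a learned low-rank subspace whose outer product defines the covariance structure. The class name `SCALayer`, the `rank` parameter, and the projection mechanism are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class SCALayer(nn.Module):
    """Hypothetical sketch of a Self-Consistent Activation (SCA) layer.

    The covariability pattern is encoded by a low-rank factor U, so that
    activations are constrained to the subspace spanned by U (i.e. they
    co-vary according to U @ U.T). This is an assumed mechanism; the
    abstract does not specify how the constraint is enforced.
    """

    def __init__(self, dim: int, rank: int):
        super().__init__()
        # Learned low-rank factor; its column span fixes how neurons co-vary.
        self.U = nn.Parameter(torch.randn(dim, rank) / dim ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Orthonormalize U so the map below is a proper subspace projection.
        Q, _ = torch.linalg.qr(self.U)   # (dim, rank)
        coeffs = x @ Q                   # coordinates within the subspace
        return coeffs @ Q.T              # activations consistent with U @ U.T

# Usage: drop the layer into an otherwise ordinary MLP classifier.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    SCALayer(256, rank=32),              # constrain inter-neuron covariability
    nn.Linear(256, 10),
)
print(model(torch.randn(8, 784)).shape)  # torch.Size([8, 10])
```

After training, one could freeze the factor (`model[2].U.requires_grad_(False)`) so the covariability pattern stays rigid over time, mirroring the biological observation the abstract cites.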