ADVERSARIAL CORRUPTION FOR ATTRIBUTION-BASED EXPLANATIONS VALIDATION
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | Described herein are machine learning (ML) explainability (MLX) techniques that perturb a non-anomalous tuple to generate an anomalous tuple, which serves as adversarial input to any explainer based on feature attribution. In an embodiment, a computer generates, from a non-anomalous tuple, an anomalous tuple that contains a perturbed value of a perturbed feature. In the anomalous tuple, the perturbed value is modified to cause a change in the tuple's reconstruction error. The change includes a decrease in the reconstruction error of the perturbed feature and/or an increase in the summed reconstruction error of all other features. After the perturbed value is modified, an attribution-based explainer automatically generates an explanation that identifies a feature as the cause of the anomalous tuple being anomalous. The computer then detects whether the identified feature is the perturbed feature. |
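
The validation loop in the summary can be illustrated with a small Python sketch. It is not the patented method, only one possible reading of it under stated assumptions: a rank-1 linear projection `V` stands in for a trained reconstructive anomaly detector, `explain` is a toy explainer that blames the feature with the largest reconstruction error, and `THRESHOLD`, `candidates`, and all function names are hypothetical. The search in `corrupt` keeps the tuple anomalous while favoring perturbed values that lower the perturbed feature's error relative to the error of the other features.

```python
import numpy as np

# Assumption: a rank-1 linear projection stands in for a trained
# reconstructive anomaly detector (for example an autoencoder).
V = np.array([0.9, 0.3, 0.3])
V = V / np.linalg.norm(V)
THRESHOLD = 0.05  # hypothetical cutoff on total reconstruction error


def feature_errors(x):
    """Per-feature squared reconstruction error of tuple x."""
    reconstruction = (x @ V) * V
    return (x - reconstruction) ** 2


def is_anomalous(x):
    """A tuple is anomalous if its total reconstruction error exceeds THRESHOLD."""
    return feature_errors(x).sum() > THRESHOLD


def explain(x):
    """Toy attribution-based explainer: blame the feature with the largest error."""
    return int(np.argmax(feature_errors(x)))


def corrupt(normal, feature, candidates):
    """Return an anomalous copy of `normal` whose perturbed value maximizes
    (error of the other features) - (error of the perturbed feature)."""
    best, best_score = None, -np.inf
    for value in candidates:
        x = normal.copy()
        x[feature] = value
        if not is_anomalous(x):
            continue  # the corrupted tuple must be anomalous
        errors = feature_errors(x)
        score = errors.sum() - 2.0 * errors[feature]
        if score > best_score:
            best, best_score = x, score
    return best


def explainer_finds(normal, feature, candidates):
    """True if the explainer blames the feature that was actually perturbed."""
    anomalous = corrupt(normal, feature, candidates)
    if anomalous is None:
        raise ValueError("no candidate value makes the tuple anomalous")
    return explain(anomalous) == feature


normal = np.array([1.8, 0.6, 0.6])      # lies on the model's subspace, error ~0
candidates = np.linspace(-2.0, 6.0, 9)  # hypothetical grid of perturbed values

for feature in range(3):
    print(feature, explainer_finds(normal, feature, candidates))
```

With these toy values, the explainer blames the correct feature when feature 1 or 2 is perturbed but not when feature 0 is perturbed: feature 0 dominates the projection, so most of the error it introduces shows up on the other features. This is the kind of mismatch the detection step in the summary is meant to surface.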