Breaking Fair Binary Classification with Optimal Flipping Attacks
Saved in:
Main Authors:
Format: Article
Language: eng
Subjects:
Online Access: Order full text
Abstract: Minimizing risk subject to fairness constraints is a popular approach to learning a fair classifier. Recent work has shown that this approach yields an unfair classifier if the training set is corrupted. In this work, we study the minimum amount of data corruption required for a successful flipping attack. First, we find lower and upper bounds on this quantity and show that these bounds are tight when the target model is the unique unconstrained risk minimizer. Second, we propose a computationally efficient data poisoning attack algorithm that can compromise the performance of fair learning algorithms.
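The abstract's two ingredients, risk minimization under a fairness constraint and a label-flipping attack that corrupts the training set, can be illustrated with a toy sketch. Everything below is hypothetical and only illustrative of the general setting, not the paper's actual bounds or algorithm: a logistic model penalized by a squared demographic-parity gap, and a greedy heuristic that flips the `k` labels whose flip most unbalances the group-conditional positive rates.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, s, lam=1.0, lr=0.1, steps=500):
    """Minimize logistic risk + lam * (demographic-parity gap)^2.

    s is a binary group attribute; the gap is the difference in mean
    predicted score between the two groups (an illustrative surrogate
    for a fairness constraint, not the paper's formulation).
    """
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_risk = X.T @ (p - y) / len(y)
        gap = p[s == 1].mean() - p[s == 0].mean()
        # d(gap)/dw, using dp_i/dw = p_i (1 - p_i) x_i
        dgap = (X[s == 1].T @ (p * (1 - p))[s == 1] / (s == 1).sum()
                - X[s == 0].T @ (p * (1 - p))[s == 0] / (s == 0).sum())
        w -= lr * (grad_risk + 2.0 * lam * gap * dgap)
    return w

def flip_attack(X, y, s, k):
    """Greedy flipping heuristic: flip the k labels whose individual flip
    most widens the gap between group-conditional positive label rates."""
    rate_gap = lambda yy: abs(yy[s == 1].mean() - yy[s == 0].mean())
    scores = []
    for i in range(len(y)):
        yy = y.copy()
        yy[i] = 1 - yy[i]
        scores.append(rate_gap(yy))
    idx = np.argsort(scores)[-k:]
    y_adv = y.copy()
    y_adv[idx] = 1 - y_adv[idx]
    return y_adv, idx

# Toy data: the group attribute shifts the feature distribution.
n = 200
s = rng.integers(0, 2, n)
X = rng.normal(size=(n, 2)) + s[:, None] * 0.8
y = (X[:, 0] + 0.3 * rng.normal(size=n) > 0.4).astype(float)

w_clean = fit(X, y, s)            # fair-penalized model on clean labels
y_adv, idx = flip_attack(X, y, s, k=10)
w_poison = fit(X, y_adv, s)       # same learner on the corrupted labels
```

The paper's question, in these terms, is how small `k` can be while still forcing the constrained learner to a chosen (unfair) target model; the greedy scoring above is just one naive attack, whereas the paper derives bounds on the minimal corruption.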
DOI: 10.48550/arxiv.2204.05472