Say My Name: A Model's Bias Discovery Framework
Saved in:
| Field | Value |
|---|---|
| Main authors | , , , , , , |
| Format | Article |
| Language | eng |
| Subjects | |
| Online access | Order full text |
Abstract: In the last few years, owing to the broad applicability of deep learning to downstream tasks and its end-to-end training capabilities, increasing concern has been raised about potential biases toward specific, non-representative patterns. Many works on unsupervised debiasing leverage the tendency of deep models to learn "easier" samples, for example by clustering the latent space to obtain bias pseudo-labels. However, interpreting such pseudo-labels is not trivial, especially for a non-expert end user, as they provide no semantic information about the bias features. To address this issue, we introduce "Say My Name" (SaMyNa), the first tool to identify biases within deep models semantically. Unlike existing methods, our approach focuses on the biases learned by the model. Our text-based pipeline enhances explainability and supports debiasing efforts: applicable either during training or for post-hoc validation, it can disentangle task-related information and serves as a tool for bias analysis. Evaluation on traditional benchmarks demonstrates its effectiveness in detecting biases and even disclaiming them, showcasing its broad applicability for model diagnosis.
DOI: 10.48550/arxiv.2408.09570
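The abstract notes that unsupervised debiasing methods often obtain bias pseudo-labels by clustering a model's latent space. As background, here is a minimal sketch of that generic step; it is not SaMyNa's text-based pipeline, and the `feature_extractor` interface, cluster count, and data loader are illustrative assumptions:

```python
import numpy as np
import torch
from sklearn.cluster import KMeans

@torch.no_grad()
def bias_pseudo_labels(feature_extractor, loader, n_clusters=2, device="cpu"):
    """Assign a bias pseudo-label to each sample by clustering latent features.

    feature_extractor: a hypothetical module mapping inputs to embeddings,
    e.g. the penultimate layer of a trained classifier.
    """
    feature_extractor.eval().to(device)
    feats = []
    for x, _ in loader:
        # Extract latent embeddings and flatten them to vectors.
        z = feature_extractor(x.to(device))
        feats.append(z.flatten(1).cpu().numpy())
    feats = np.concatenate(feats, axis=0)
    # Samples falling in the same cluster are treated as sharing the same
    # (unknown) bias attribute; the cluster index serves as the pseudo-label.
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
```

As the abstract points out, such cluster indices carry no semantic meaning on their own; naming the bias they correspond to is the gap SaMyNa aims to fill.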