The Clever Hans Effect in Unsupervised Learning
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Unsupervised learning has become an essential building block of AI systems.
The representations it produces, e.g. in foundation models, are critical to a
wide variety of downstream applications. It is therefore important to carefully
examine unsupervised models to ensure not only that they produce accurate
predictions, but also that these predictions are not "right for the wrong
reasons", the so-called Clever Hans (CH) effect. Using specially developed
Explainable AI techniques, we show for the first time that CH effects are
widespread in unsupervised learning. Our empirical findings are enriched by
theoretical insights, which interestingly point to inductive biases in the
unsupervised learning machine as a primary source of CH effects. Overall, our
work sheds light on unexplored risks associated with practical applications of
unsupervised learning and suggests ways to make unsupervised learning more
robust. |
DOI: | 10.48550/arxiv.2408.08041 |