On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Format: Article
Language: English
Abstract: Recent work has shown that deep learning models in NLP are highly sensitive to low-level correlations between simple features and specific output labels, leading to overfitting and lack of generalization. To mitigate this problem, a common practice is to balance datasets by adding new instances or by filtering out "easy" instances (Sakaguchi et al., 2020), culminating in a recent proposal to eliminate single-word correlations altogether (Gardner et al., 2021). In this opinion paper, we identify that despite these efforts, increasingly powerful models keep exploiting ever-smaller spurious correlations, and as a result even balancing all single-word features is insufficient for mitigating all of these correlations. In parallel, a truly balanced dataset may be bound to "throw the baby out with the bathwater" and miss important signal encoding common sense and world knowledge. We highlight several alternatives to dataset balancing, focusing on enhancing datasets with richer contexts, allowing models to abstain and interact with users, and turning from large-scale fine-tuning to zero- or few-shot setups.
DOI: 10.48550/arxiv.2204.12708
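To make the abstract's notion of single-word spurious correlations concrete, the following is a minimal sketch (not from the paper) of the kind of check that balancing and filtering approaches such as Gardner et al. (2021) build on: flagging words whose conditional label distribution deviates sharply from the overall label prior. The toy dataset, the MARGIN threshold, and all variable names are hypothetical and chosen only for illustration.

```python
# Minimal sketch: flag single-word features whose label distribution deviates
# strongly from the overall label prior -- the kind of spurious correlation
# that dataset balancing and filtering try to remove.
from collections import Counter, defaultdict

# Hypothetical labeled dataset of (text, label) pairs.
data = [
    ("the movie was great fun", 1),
    ("great acting and great pacing", 1),
    ("no fun at all", 0),
    ("the plot was a mess", 0),
    ("a great mess of a movie", 0),
]

label_prior = Counter(label for _, label in data)
n = len(data)

# Count how often each word co-occurs with each label (once per instance).
word_label = defaultdict(Counter)
for text, label in data:
    for word in set(text.split()):
        word_label[word][label] += 1

# Flag words whose conditional label distribution exceeds the prior by more
# than a (hypothetical) margin -- a crude stand-in for the statistical tests
# used in the balancing literature.
MARGIN = 0.3
for word, counts in word_label.items():
    total = sum(counts.values())
    if total < 2:
        continue  # too rare to judge
    for label, count in counts.items():
        p_cond = count / total
        p_prior = label_prior[label] / n
        if p_cond - p_prior > MARGIN:
            print(f"{word!r}: p({label}|word)={p_cond:.2f} vs prior {p_prior:.2f}")
```

On this toy data the check flags not only a content word ("mess") but also the function word "a", which illustrates the paper's concern: as such filters become stricter, they increasingly remove benign signal along with genuine artifacts.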