ID and OOD Performance Are Sometimes Inversely Correlated on Real-world Datasets
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Several studies have compared the in-distribution (ID) and out-of-distribution (OOD) performance of models in computer vision and NLP. They report a frequent positive correlation, and some, surprisingly, never even observe an inverse correlation, which would indicate a necessary trade-off. Whether inverse patterns can occur is important for deciding whether ID performance can serve as a proxy for OOD generalization capabilities. This paper shows, on multiple datasets, that inverse correlations between ID and OOD performance do happen in real-world data, not only in theoretical worst-case settings. We also explain theoretically how these cases can arise even in a minimal linear setting, and why past studies could miss such cases due to a biased selection of models. Our observations lead to recommendations that contradict those found in much of the current literature.
- High OOD performance sometimes requires trading off ID performance.
- Focusing on ID performance alone may not lead to optimal OOD performance; it may produce diminishing (eventually negative) returns in OOD performance.
- In these cases, studies on OOD generalization that use ID performance for model selection (a commonly recommended practice) will necessarily miss the best-performing models, making these studies blind to a whole range of phenomena. |
DOI: | 10.48550/arxiv.2209.00613 |
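The abstract's claim that inverse ID/OOD correlations can arise even in a minimal linear setting can be illustrated with a short simulation. The sketch below is an assumed construction for illustration, not the paper's own: a stable feature predicts the target in both distributions, while a spurious feature's correlation with the target flips sign out of distribution. Interpolating from a robust predictor to the best ID fit then lowers ID error while raising OOD error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

def make_data(spurious_sign):
    # Target depends on a stable feature; a second, spurious feature tracks
    # the target with a sign that differs between ID and OOD data.
    stable = rng.normal(size=n)
    y = stable + rng.normal(scale=0.5, size=n)
    spurious = spurious_sign * y + rng.normal(scale=0.5, size=n)
    return np.column_stack([stable, spurious]), y

X_id, y_id = make_data(+1.0)    # in-distribution data
X_ood, y_ood = make_data(-1.0)  # out-of-distribution: spurious correlation flips

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

w_robust = np.array([1.0, 0.0])                       # ignores the spurious feature
w_erm, *_ = np.linalg.lstsq(X_id, y_id, rcond=None)   # best possible ID fit

print(" alpha   ID MSE  OOD MSE")
for alpha in np.linspace(0.0, 1.0, 6):
    w = (1 - alpha) * w_robust + alpha * w_erm        # slide from robust to ID-optimal
    print(f"{alpha:6.1f} {mse(w, X_id, y_id):8.3f} {mse(w, X_ood, y_ood):8.3f}")
```

In this toy setting the printed table shows ID MSE falling while OOD MSE rises along the interpolation path, so selecting the model with the best ID error would pick the worst OOD model, the kind of inverse pattern the paper argues ID-based model selection can hide.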