Maximum Mean Discrepancy for Generalization in the Presence of Distribution and Missingness Shift
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Covariate shifts are a common problem in predictive modeling on real-world
problems. This paper proposes addressing the covariate shift problem by
minimizing Maximum Mean Discrepancy (MMD) statistics between the training and
test sets in either feature input space, feature representation space, or both.
We designed three techniques that we call MMD Representation, MMD Mask, and MMD
Hybrid to deal with the scenarios where only a distribution shift exists, only
a missingness shift exists, or both types of shift exist, respectively. We find
that integrating an MMD loss component helps models use the best features for
generalization and avoid dangerous extrapolation as much as possible for each
test sample. Models treated with this MMD approach show better performance,
calibration, and extrapolation on the test set. |
DOI: | 10.48550/arxiv.2111.10344 |
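
The summary describes minimizing an MMD statistic between the training and test sets. As a rough illustration of what such a statistic measures, the sketch below computes the standard biased estimate of squared MMD with a Gaussian (RBF) kernel in plain Python. It is not taken from the paper: the function names, the kernel choice, the `bandwidth` parameter, and the toy data are all assumptions made for this example, and the paper's MMD Representation, MMD Mask, and MMD Hybrid losses may be defined differently.

```python
import math


def rbf_kernel(u, v, bandwidth=1.0):
    """Gaussian kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Hypothetical toy data: one test set matches the training set, one is shifted.
train = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
test_same = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
test_shifted = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]

mmd_same = mmd2_biased(train, test_same)
mmd_shifted = mmd2_biased(train, test_shifted)
```

In this sketch the statistic is zero when the two samples coincide and grows as the test sample moves away from the training sample. A training procedure in the spirit of the summary would add a term like this to the loss, computed on input features, learned representations, or both.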