Automated Domain Discovery from Multiple Sources to Improve Zero-Shot Generalization
Main authors: , , ,
Format: Article
Language: English
Abstract: Domain generalization (DG) methods aim to develop models that generalize to settings where the test distribution differs from the training data. In this paper, we focus on the challenging problem of multi-source zero-shot DG (MDG), where labeled training data from multiple source domains is available but there is no access to data from the target domain. A wide range of solutions have been proposed for this problem, including state-of-the-art multi-domain ensembling approaches. Despite these advances, the naïve ERM solution of pooling all source data together and training a single classifier is surprisingly effective on standard benchmarks. In this paper, we hypothesize that explaining this behavior requires elucidating the link between pre-specified domain labels and MDG performance. More specifically, we consider two popular classes of MDG algorithms, distributionally robust optimization (DRO) and multi-domain ensembles, and demonstrate how inferring custom domain groups can lead to consistent improvements over the original domain labels that come with the dataset. To this end, we propose (i) Group-DRO++, which incorporates an explicit clustering step to identify custom domains in an existing DRO technique; and (ii) DReaME, which produces effective multi-domain ensembles through implicit domain re-labeling with a novel meta-optimization algorithm. Using empirical studies on multiple standard benchmarks, we show that our variants consistently outperform ERM by significant margins (1.5% to 9%) and produce state-of-the-art MDG performance. Our code can be found at https://github.com/kowshikthopalli/DREAME
DOI: 10.48550/arxiv.2112.09802
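The abstract only names the proposed algorithms, so the following is a minimal Python sketch of the core idea shared by both, inferring custom domain groups from pooled source data and using them in place of the dataset's pre-specified domain labels, here in the Group-DRO++ style. It assumes k-means clustering over pooled source features and a standard group-DRO exponentiated-gradient update; the helper names (`infer_domain_groups`, `group_dro_step`) and hyperparameters are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Illustrative sketch, not the authors' code: infer custom domain groups by
# clustering pooled source features, then apply group-DRO-style worst-group
# reweighting over the inferred groups instead of the dataset's domain labels.
import numpy as np
import torch
from sklearn.cluster import KMeans


def infer_domain_groups(features: np.ndarray, n_groups: int) -> np.ndarray:
    """Cluster pooled source-domain features into custom domain groups."""
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=0)
    return km.fit_predict(features)


def group_dro_step(per_sample_loss: torch.Tensor,
                   group_ids: torch.Tensor,
                   q: torch.Tensor,
                   eta: float = 0.01):
    """One group-DRO update: upweight the worst-performing inferred group.

    q holds the mixture weights over groups (kept on the probability
    simplex) and is updated with the usual exponentiated-gradient rule.
    """
    n_groups = q.numel()
    group_losses = []
    for g in range(n_groups):
        mask = group_ids == g
        # Empty groups (possible within a mini-batch) contribute zero loss.
        group_losses.append(per_sample_loss[mask].mean() if mask.any()
                            else per_sample_loss.new_zeros(()))
    group_losses = torch.stack(group_losses)
    q = q * torch.exp(eta * group_losses.detach())
    q = q / q.sum()
    return (q * group_losses).sum(), q


# Toy usage with stand-in features and losses (no real model or data).
feats = np.random.randn(256, 64).astype(np.float32)
groups = torch.as_tensor(infer_domain_groups(feats, n_groups=4))
losses = torch.rand(256, requires_grad=True)  # placeholder per-sample losses
q = torch.full((4,), 0.25)                    # uniform initial group weights
robust_loss, q = group_dro_step(losses, groups, q)
robust_loss.backward()                        # would drive the model update
```

In the paper's framing, the inferred groups stand in for the dataset's pre-specified domain labels wherever the downstream DRO method expects them; DReaME instead arrives at such a grouping implicitly, via meta-optimization.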