CBDMoE: Consistent-but-Diverse Mixture of Experts for Domain Generalization
Published in: IEEE Transactions on Multimedia, 2024, Vol. 26, pp. 9814-9824
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Machine learning models often suffer severe performance degradation due to distributional shifts between training and test data. To address this issue, researchers have focused on domain generalization (DG), which aims to generalize a model trained on source domains to arbitrary unseen target domains. Recently, ensemble learning has emerged as a popular strategy for the DG problem, typically built around domain-specific experts. However, existing methods neither sufficiently consider the generalizability of individual experts nor leverage the consistency and diversity among them, which limits the generalizability of the constructed models. In this paper, we propose a consistent-but-diverse mixture of experts (CBDMoE) algorithm, an improved MoE framework that effectively harnesses ensemble learning for the DG problem. Specifically, we introduce individual expert learning (IEL), which incorporates a novel domain-class-balanced subset division (DCBSD)-based sampling strategy to facilitate generalizable expert learning. Additionally, we present consistent-but-diverse learning (CBDL), which employs two regularizing losses to encourage consistency and diversity in the experts' predictions. The proposed strategy significantly enhances the generalizability of the MoE framework. Extensive experiments on three popular DG benchmark datasets demonstrate that our method outperforms state-of-the-art approaches.
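The abstract names the DCBSD sampling strategy and the two CBDL regularizers without giving their exact form. As a rough illustration of the DCBSD idea only, the sketch below divides training samples into subsets balanced over (domain, class) pairs; the function name `dcbsd_subsets` and the sample-dictionary layout are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict

def dcbsd_subsets(samples, num_subsets):
    """Hypothetical domain-class-balanced division (DCBSD-style sketch).

    Groups samples by (domain, class) and deals each group out evenly,
    so every subset sees a balanced mix of domains and classes.
    """
    groups = defaultdict(list)
    for s in samples:
        groups[(s["domain"], s["label"])].append(s)
    subsets = [[] for _ in range(num_subsets)]
    for group in groups.values():
        random.shuffle(group)
        for i, s in enumerate(group):
            subsets[i % num_subsets].append(s)
    return subsets
```

Likewise, one plausible reading of the two CBDL regularizers is sketched below: a KL term pulling each expert toward the ensemble mean for consistency, and a pairwise-similarity penalty on expert logits for diversity. These concrete loss choices are assumptions for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def cbdl_losses(expert_logits):
    """Hypothetical consistency/diversity regularizers over expert outputs.

    expert_logits: list of [batch, num_classes] tensors, one per expert.
    """
    log_probs = [F.log_softmax(z, dim=-1) for z in expert_logits]
    mean_prob = torch.stack([lp.exp() for lp in log_probs]).mean(dim=0)

    # Consistency: pull each expert's distribution toward the ensemble mean.
    consistency = sum(
        F.kl_div(lp, mean_prob, reduction="batchmean") for lp in log_probs
    ) / len(log_probs)

    # Diversity: discourage identical experts by penalizing the pairwise
    # cosine similarity of their raw logits.
    pairs, diversity = 0, expert_logits[0].new_zeros(())
    for i in range(len(expert_logits)):
        for j in range(i + 1, len(expert_logits)):
            diversity = diversity + F.cosine_similarity(
                expert_logits[i], expert_logits[j], dim=-1
            ).mean()
            pairs += 1
    return consistency, diversity / max(pairs, 1)
```

In an actual MoE training loop these terms would presumably be weighted and added to the task loss; the weighting scheme and the gating mechanism are likewise unspecified in the abstract.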
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2024.3399468