Mitigating Domain Shift in Federated Learning via Intra- and Inter-Domain Prototypes
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Federated Learning (FL) has emerged as a decentralized machine learning
technique, allowing clients to train a global model collaboratively without
sharing private data. However, most FL studies ignore the crucial challenge of
heterogeneous domains where each client has a distinct feature distribution,
which is common in real-world scenarios. Prototype learning, which leverages
the mean feature vectors within the same classes, has become a prominent
solution for federated learning under domain skew. However, existing federated
prototype learning methods only consider inter-domain prototypes on the server
and overlook intra-domain characteristics. In this work, we introduce a novel
federated prototype learning method, namely I$^2$PFL, which incorporates
$\textbf{I}$ntra-domain and $\textbf{I}$nter-domain $\textbf{P}$rototypes, to
mitigate domain shifts and learn a generalized global model across multiple
domains in federated learning. To construct intra-domain prototypes, we propose
feature alignment with MixUp-based augmented prototypes to capture the
diversity of local domains and enhance the generalization of local features.
Additionally, we introduce a reweighting mechanism for inter-domain prototypes
to generate generalized prototypes to provide inter-domain knowledge and reduce
domain skew across multiple clients. Extensive experiments on the Digits,
Office-10, and PACS datasets illustrate the superior performance of our method
compared to other baselines. |
---|---|
DOI: | 10.48550/arxiv.2501.08521 |
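
The summary describes class prototypes as the mean feature vectors within each class and mentions MixUp-based augmentation. The following is a minimal sketch of those two generic building blocks only, not the I$^2$PFL method itself: the function names `class_prototypes` and `mixup`, the toy feature vectors, and the choice to interpolate feature vectors directly are all assumptions made for illustration, since the record does not give the paper's actual formulation.

```python
from collections import defaultdict


def class_prototypes(features, labels):
    """Return {label: mean feature vector} for one client's local features.

    Hypothetical helper: illustrates the generic definition of a class
    prototype (per-class mean), not the paper's exact procedure.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        # Accumulate the running element-wise sum for this class.
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    # Divide each class sum by its sample count to get the mean vector.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def mixup(vec_a, vec_b, lam):
    """Convex combination of two vectors, the basic MixUp interpolation.

    Assumption: lam is a mixing weight in [0, 1]; how the paper samples it
    or which vectors it mixes is not stated in the summary.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(vec_a, vec_b)]


# Toy data: four 2-dimensional feature vectors from two classes.
features = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [2.0, 2.0]]
labels = [0, 0, 1, 1]

protos = class_prototypes(features, labels)
mixed = mixup(features[0], features[1], 0.5)
```

In a federated setting, each client would compute such prototypes locally and send only those to the server; how the server reweights them is specific to the paper and is not sketched here.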