Adaptive Test-Time Personalization for Federated Learning
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Personalized federated learning algorithms have shown promising results in
adapting models to various distribution shifts. However, most of these methods
require labeled data on testing clients for personalization, which is usually
unavailable in real-world scenarios. In this paper, we introduce a novel
setting called test-time personalized federated learning (TTPFL), where clients
locally adapt a global model in an unsupervised way without relying on any
labeled data at test time. Although traditional test-time adaptation (TTA)
methods can be used in this scenario, most of them inherently assume that the
training data come from a single domain, whereas in FL they come from multiple
clients (source domains)
with different distributions. Overlooking these domain interrelationships can
result in suboptimal generalization. Moreover, most TTA algorithms are designed
for a specific kind of distribution shift and lack the flexibility to handle
multiple kinds of distribution shifts in FL. In this paper, we find that this
lack of flexibility partially results from their pre-defining which modules to
adapt in the model. To tackle this challenge, we propose a novel algorithm
called ATP, which adaptively learns the adaptation rates for each module in the
model from distribution shifts among source domains. Theoretical analysis
proves the strong generalization of ATP. Extensive experiments demonstrate its
superiority in handling various distribution shifts including label shift,
image corruptions, and domain shift, outperforming existing TTA methods across
multiple datasets and model architectures. Our code is available at
https://github.com/baowenxuan/ATP . |
---|---|
DOI: | 10.48550/arxiv.2310.18816 |
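
The abstract's central idea, a separate adaptation rate for each module of the model, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy linear softmax classifier whose two "modules" are a weight matrix `W` and a bias `b`, and it assumes entropy minimization as the unsupervised test-time objective, a common choice in TTA that this record does not confirm for ATP. The rates are set by hand here; how ATP learns them from the source clients is not described in this record and is not shown. All names (`adapt`, `entropy_grads`, `rates`) are hypothetical.

```python
import numpy as np


def softmax(z):
    # Row-wise softmax with the usual max-shift for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def mean_entropy(params, x):
    # Average prediction entropy on an unlabeled batch (assumed objective).
    p = softmax(x @ params["W"] + params["b"])
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())


def entropy_grads(params, x):
    # Gradient of the mean entropy with respect to each module.
    p = softmax(x @ params["W"] + params["b"])
    log_p = np.log(p + 1e-12)
    h = -(p * log_p).sum(axis=1, keepdims=True)
    # For H = -sum_i p_i log p_i, dH/dz_j = -p_j * (log p_j + H).
    g_logits = -p * (log_p + h) / x.shape[0]
    return {"W": x.T @ g_logits, "b": g_logits.sum(axis=0)}


def adapt(params, rates, x):
    # One unsupervised step; each module moves by its own adaptation rate.
    grads = entropy_grads(params, x)
    return {name: params[name] - rates[name] * grads[name] for name in params}


rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))                      # unlabeled test batch
global_params = {"W": rng.normal(size=(4, 3)), "b": np.zeros(3)}
rates = {"W": 0.01, "b": 0.0}                     # hand-set placeholder rates
local_params = adapt(global_params, rates, x)
```

A rate of zero freezes a module, so fixing in advance which modules to adapt amounts to fixing this set of rates to zeros and a shared nonzero value. Making the rates free parameters is what allows different kinds of distribution shift to favor different modules.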