Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | An oft-cited challenge of federated learning is the presence of heterogeneity. *Data heterogeneity* refers to the fact that data from different clients may follow very different distributions. *System heterogeneity* refers to the fact that client devices have different system capabilities. A considerable number of federated optimization methods address this challenge. In the literature, empirical evaluations usually start federated training from random initialization. However, in many practical applications of federated learning, the server has access to proxy data for the training task that can be used to pre-train a model before starting federated training. We empirically study the impact of starting from a pre-trained model in federated learning using four standard federated learning benchmark datasets. Unsurprisingly, starting from a pre-trained model reduces the training time required to reach a target error rate and enables the training of more accurate models (up to 40%) than is possible when starting from random initialization. Surprisingly, we also find that starting federated learning from a pre-trained initialization reduces the effect of both data and system heterogeneity. We recommend that future work proposing and evaluating federated optimization methods evaluate the performance when starting from random and pre-trained initializations. We also believe this study raises several questions for further work on understanding the role of heterogeneity in federated optimization. |
DOI: | 10.48550/arxiv.2210.08090 |