Accelerating Federated Learning with a Global Biased Optimiser
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Federated Learning (FL) is a recent development in distributed machine
learning that collaboratively trains models without training data leaving
client devices, preserving data privacy. In real-world FL, the training set is
distributed over clients in a highly non-Independent and Identically
Distributed (non-IID) fashion, harming model convergence speed and final
performance. To address this challenge, we propose a novel, generalised
approach for incorporating adaptive optimisation into FL with the Federated
Global Biased Optimiser (FedGBO) algorithm. FedGBO accelerates FL by employing
a set of global biased optimiser values during training, reducing
'client-drift' from non-IID data whilst benefiting from adaptive optimisation.
We show that in FedGBO, updates to the global model can be reformulated as
centralised training using biased gradients and optimiser updates, and apply
this framework to prove FedGBO's convergence on nonconvex objectives when using
the momentum-SGD (SGDm) optimiser. We also conduct extensive experiments using
four FL benchmark datasets (CIFAR100, Sent140, FEMNIST, Shakespeare) and three popular
optimisers (SGDm, RMSProp, Adam) to compare FedGBO against six state-of-the-art
FL algorithms. The results demonstrate that FedGBO displays superior or
competitive performance across the datasets whilst having low data-upload and
computational costs, and provide practical insights into the trade-offs
associated with different adaptive-FL algorithms and optimisers. |
DOI: | 10.48550/arxiv.2108.09134 |
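
The summary describes FedGBO only at a high level, so the following is a minimal sketch of one plausible reading of it with the SGDm optimiser, not the paper's algorithm as published. The assumptions are: clients keep the downloaded global momentum fixed during local training and upload only their model, and the server recovers the average (biased) gradient from the change in the averaged model to refresh the global momentum. All names (`local_train`, `fedgbo_round`, `run_demo`) and the toy quadratic client objectives are hypothetical.

```python
import numpy as np

def local_train(x_global, m_global, grad_fn, lr, beta, steps):
    """Client side: local steps using a *fixed* copy of the global momentum.

    Assumption: the momentum buffer is not updated locally, so every client
    applies the same (biased) optimiser state regardless of its own data.
    """
    x = x_global.copy()
    for _ in range(steps):
        x = x - lr * (beta * m_global + grad_fn(x))
    return x

def fedgbo_round(x_global, m_global, client_grads, lr=0.1, beta=0.9, steps=5):
    """One communication round: clients upload models only (assumption)."""
    local_models = [
        local_train(x_global, m_global, g, lr, beta, steps) for g in client_grads
    ]
    x_new = np.mean(local_models, axis=0)
    # Each local step moved x by lr * (beta * m_global + gradient), so the
    # average per-step gradient can be recovered from the model difference.
    avg_grad = (x_global - x_new) / (lr * steps) - beta * m_global
    # Server-side momentum update using the recovered average gradient.
    m_new = beta * m_global + avg_grad
    return x_new, m_new

def run_demo(rounds=200):
    # Toy non-IID setup: client i minimises 0.5 * ||x - c_i||^2, so each
    # client pulls towards a different optimum c_i.
    centres = [np.array([1.0, 0.0]), np.array([0.0, 3.0]), np.array([-4.0, 3.0])]
    client_grads = [lambda x, c=c: x - c for c in centres]
    x, m = np.zeros(2), np.zeros(2)
    for _ in range(rounds):
        x, m = fedgbo_round(x, m, client_grads)
    # The minimiser of the averaged objective is the mean of the centres.
    return x, np.mean(centres, axis=0)

if __name__ == "__main__":
    x, target = run_demo()
    print("global model:", x, "target:", target)
```

In this toy setting every client shares the same momentum term, so the local models differ only through their own gradients; this is the sense in which a shared optimiser state could limit client-drift. The behaviour on real non-IID data and with RMSProp or Adam is not captured by this sketch.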