Communication-Efficient Federated Distillation: Theoretical Analysis and Performance Enhancement


Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE internet of things journal 2024-12, Vol. 11 (23), p. 37959-37973
Main authors: Liu, Lumin, Zhang, Jun, Song, Shenghui, Letaief, Khaled B.
Format: Article
Language: English
Subjects:
Online access: Order full text
Description
Summary: Federated learning (FL) is a promising paradigm for privacy-preserving deep learning using data distributed on Internet of Things devices. Traditional model sharing-based methods, e.g., federated averaging (FedAvg), suffer from high communication overhead and difficulty in accommodating heterogeneous model architectures. Federated distillation (FD) is a recently proposed alternative that enables communication-efficient and robust FL, as well as heterogeneous client models. However, there is a lack of theoretical understanding of FD-based methods, and their design guidelines remain elusive. This article presents a generic meta-algorithm for FD that generalizes most existing FD training algorithms. By studying a linear classification problem, we show that, with sufficient distillation samples, the training performance of the meta-algorithm matches that of vanilla FedAvg. To guide algorithm design and improve communication efficiency, we further investigate the binary classification problem with a Gaussian mixture model, which shows that using more distillation data and sampling data with higher confidence improves the training performance. Furthermore, we propose an effective distillation data sampling technique to improve the performance of the FD-meta algorithm, which also reduces communication overhead. Simulations on benchmark data sets validate the theoretical findings and demonstrate that the proposed algorithm effectively reduces communication overhead while achieving satisfactory performance.
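The confidence-based distillation data sampling described in the summary can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: it assumes clients exchange logits on a shared unlabeled distillation set and keep only the samples whose predicted-class probability (confidence) is highest, reducing the number of logit vectors that must be communicated. The function and parameter names (`select_confident`, `k`) are hypothetical.

```python
# Hedged sketch of confidence-based distillation sample selection.
# Assumption: each client evaluates its local model on a public
# unlabeled distillation set and shares logits only for the k
# samples it is most confident about.
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def select_confident(logits, k):
    """Return indices of the k distillation samples whose maximum
    predicted class probability (confidence) is highest."""
    conf = softmax(logits).max(axis=1)   # per-sample confidence in [0, 1]
    return np.argsort(conf)[::-1][:k]    # indices of top-k most confident

# Toy example: 5 distillation samples, 3 classes.
logits = np.array([[4.0, 0.1, 0.2],   # confident in class 0
                   [0.5, 0.4, 0.6],   # uncertain
                   [0.1, 3.0, 0.1],   # confident in class 1
                   [1.0, 1.0, 1.1],   # uncertain
                   [0.2, 0.1, 5.0]])  # very confident in class 2
idx = select_confident(logits, k=2)   # picks the two most confident samples
```

In this toy run only 2 of the 5 logit vectors would be uploaded, illustrating how sampling by confidence trades communication volume against coverage of the distillation set.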
ISSN: 2327-4662
DOI: 10.1109/JIOT.2024.3446751