Breaking the Memory Wall for Heterogeneous Federated Learning via Model Splitting
Saved in:

| Field | Value |
|---|---|
| Main authors | , , , , |
| Format | Article |
| Language | English |
| Subjects | |
| Online access | Order full text |
Abstract:
Federated Learning (FL) enables multiple devices to collaboratively train a shared model while preserving data privacy. Ever-increasing model complexity coupled with limited memory resources on the participating devices severely bottlenecks the deployment of FL in real-world scenarios. A framework that can effectively break the memory wall while jointly accounting for the hardware and statistical heterogeneity in FL is therefore urgently needed. In this paper, we propose SmartSplit, a framework that effectively reduces the memory footprint on the device side while guaranteeing the training progress and model accuracy for heterogeneous FL through model splitting. Towards this end, SmartSplit employs a hierarchical structure to adaptively guide the overall training process. In each training round, the central manager, hosted on the server, dynamically selects the participating devices and sets the cutting layer by jointly considering the memory budget, training capacity, and data distribution of each device. The MEC manager, deployed within the edge server, splits the local model and trains the server-side portion; meanwhile, it fine-tunes the splitting points based on their time-evolving statistical importance. The on-device manager, embedded inside each mobile device, continuously monitors the local training status while employing cost-aware checkpointing to match the runtime dynamic memory budget. Extensive experiments on representative datasets are conducted on commercial off-the-shelf mobile device testbeds. The results show that SmartSplit excels in FL training on highly memory-constrained mobile SoCs, offering up to a 94% peak-latency reduction and 100-fold memory savings. It improves accuracy by 1.49%-57.18% and adapts to dynamic memory budgets through cost-aware recomputation.
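To make the cut-layer partitioning concrete, below is a minimal sketch of split training between a device-side and a server-side model portion, in the spirit of the partitioning the abstract describes. The layer stack, the `cut_layer` choice, and the tensor shapes are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of split training at a chosen cut layer. All names
# (cut_layer, device_side, server_side) are illustrative assumptions,
# not SmartSplit's actual API.
import torch
import torch.nn as nn

# Toy model: a stack of layers that can be split at any index.
layers = nn.ModuleList([
    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 32, 10)),
])

cut_layer = 1  # would be chosen per device from memory budget, capacity, data stats

device_side = nn.Sequential(*layers[:cut_layer])   # runs on the mobile device
server_side = nn.Sequential(*layers[cut_layer:])   # runs on the MEC server

x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))

# The device computes activations up to the cut and ships them to the server.
activations = device_side(x)
smashed = activations.detach().requires_grad_(True)  # boundary tensor

# The server finishes the forward pass and backpropagates to the boundary.
loss = nn.functional.cross_entropy(server_side(smashed), y)
loss.backward()

# The device resumes backprop from the gradient returned at the cut,
# so only the device-side portion's activations ever live in device memory.
activations.backward(smashed.grad)
```

Because only `device_side` and its activations reside on the device, the peak device memory shrinks as the cut moves toward the input, which is the lever the central manager tunes per round.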
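Similarly, the cost-aware checkpointing performed by the on-device manager can be approximated with activation recomputation. The sketch below uses PyTorch's `torch.utils.checkpoint` under an assumed memory-budget threshold; the all-or-nothing selection rule is a deliberately coarse stand-in for the paper's cost model.

```python
# A minimal sketch of trading compute for memory via activation
# checkpointing (recomputation). The memory_budget_mb threshold and the
# coarse "checkpoint everything when tight" rule are placeholder
# assumptions; a real cost model would pick which blocks to recompute.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

blocks = nn.ModuleList(
    nn.Sequential(nn.Linear(256, 256), nn.ReLU()) for _ in range(8)
)
head = nn.Linear(256, 10)

def forward(x, memory_budget_mb):
    tight = memory_budget_mb < 512  # assumed threshold
    for block in blocks:
        if tight:
            # Don't store this block's activations; recompute them
            # during the backward pass instead.
            x = checkpoint(block, x, use_reentrant=False)
        else:
            x = block(x)
    return head(x)

x = torch.randn(32, 256, requires_grad=True)
loss = forward(x, memory_budget_mb=256).sum()
loss.backward()  # checkpointed blocks are recomputed here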
DOI: 10.48550/arxiv.2410.11577