Federated Clustering: An Unsupervised Cluster-Wise Training for Decentralized Data Distributions
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Federated Learning (FL) is a pivotal approach in decentralized machine
learning, especially when data privacy is crucial and direct data sharing is
impractical. While FL is typically associated with supervised learning, its
potential in unsupervised scenarios is underexplored. This paper introduces a
novel unsupervised federated learning methodology designed to identify the
complete set of categories (global K) across multiple clients within
label-free, non-uniform data distributions, a process known as Federated
Clustering. Our approach, Federated Cluster-Wise Refinement (FedCRef), involves
clients that collaboratively train models on clusters with similar data
distributions. Initially, clients with diverse local data distributions (local
K) train models on their clusters to generate compressed data representations.
These local models are then shared across the network, enabling clients to
compare them through reconstruction error analysis, leading to the formation of
federated groups. In these groups, clients collaboratively train a shared model
representing each data distribution, while continuously refining their local
clusters to enhance data association accuracy. This iterative process allows
our system to identify all potential data distributions across the network and
develop robust representation models for each. To validate our approach, we
compare it with traditional centralized methods, establishing a performance
baseline and showcasing the advantages of our distributed solution. We also
conduct experiments on the EMNIST and KMNIST datasets, demonstrating FedCRef's
ability to refine and align cluster models with actual data distributions,
significantly improving data representation precision in unsupervised federated
settings. |
---|---|
DOI: | 10.48550/arxiv.2408.10664 |
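
The grouping step described in the summary (clients share per-cluster models and compare them through reconstruction error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: PCA stands in for the compression model, the data is synthetic, and the names `fit_pca_model`, `reconstruction_error`, `group_clusters`, and the `tolerance` threshold are hypothetical. The iterative refinement of local clusters and the collaborative training inside each group are not shown.

```python
import numpy as np

def fit_pca_model(data, n_components=2):
    """Fit a linear compressor (PCA) to one local cluster.

    Assumption: PCA is only a stand-in for the representation model in the paper.
    """
    mean = data.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(model, data):
    """Mean squared error after compressing and decompressing data with a model."""
    mean, components = model
    centered = data - mean
    reconstructed = centered @ components.T @ components
    return float(np.mean((centered - reconstructed) ** 2))

def group_clusters(models, clusters, tolerance=2.0):
    """Greedy grouping by reconstruction error (hypothetical rule).

    Cluster j joins an existing group if that group's first model reconstructs
    j's data almost as well as j's own model does; otherwise j starts a new group.
    """
    groups = []
    for j, data in enumerate(clusters):
        own_error = reconstruction_error(models[j], data)
        for group in groups:
            representative = group[0]
            shared_error = reconstruction_error(models[representative], data)
            if shared_error <= tolerance * own_error + 1e-12:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups

# Synthetic setup: two underlying distributions in 5-D, each close to a
# different 2-D subspace, spread over four local clusters.
rng = np.random.default_rng(0)
basis_a = np.eye(5)[[0, 1]]
basis_b = np.eye(5)[[3, 4]]

def make_cluster(basis, n=200):
    latent = 3.0 * rng.normal(size=(n, 2))
    noise = 0.05 * rng.normal(size=(n, 5))
    return latent @ basis + noise

# Clusters 0 and 1 belong to one client; clusters 2 and 3 to two other clients.
clusters = [make_cluster(basis_a), make_cluster(basis_b),
            make_cluster(basis_a), make_cluster(basis_b)]
models = [fit_pca_model(c) for c in clusters]

groups = group_clusters(models, clusters)
print("federated groups:", groups)
print("estimated global K:", len(groups))
```

In this toy setting the number of groups plays the role of the global K from the summary. A fixed threshold on the error ratio is the simplest possible grouping rule and was chosen only to keep the sketch short.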