FedImpute: Personalized federated learning for data imputation with clusterer and auxiliary classifier

Bibliographic details
Published in: Expert Systems with Applications, 2025-04, Vol. 270, p. 126543, Article 126543
Main authors: Li, Yanan; Guo, Shaocong; Guo, Xinyuan; Zhao, Peng; Ren, Xuebin; Wang, Hui
Format: Article
Language: English
Online access: Full text
Description
Summary: Missing data is a prevalent challenge in real-world applications, hindering the usability and quality of datasets. Data imputation, which substitutes estimates for missing values, offers a solution. While existing techniques achieve promising results, they often rely on centralized data collection, raising privacy concerns. Existing methods for data imputation mainly utilize heterogeneous federation or optimization methods such as meta-learning. Although they can achieve personalized imputation, they do not consider the impact of different missing rates on imputation quality, so the imputed data may perform poorly in subsequent tasks. To address this, we propose FedImpute, a personalized federated approach that works at the data level. FedImpute balances global model performance with local data customization to achieve robust imputation; its core concept is to leverage the strengths of both global and local data perspectives. The global model captures universal patterns, while local models adapt to the unique characteristics of each participant’s private data. To achieve this, FedImpute incorporates modules, including a clusterer and an auxiliary classifier, that extract and use latent class information during the imputation process. This enables the model to prioritize similarities among samples of the same category, leading to more precise and personalized imputations. Extensive evaluations on four real-world datasets demonstrate that FedImpute overall outperforms existing methods, especially at high missing rates.
Highlights:
• Introduce a personalized federated learning algorithm for missing data imputation.
• Learn characteristics of each participant’s dataset while safeguarding privacy and security.
• Use clusterer and auxiliary classifier modules to extract latent class information.
• Enable personalized treatment of each client’s data distribution and enhance imputation accuracy.
ISSN:0957-4174
DOI:10.1016/j.eswa.2025.126543
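
The summary says that latent class information lets the imputation prioritize similarities within a category, while a global view supplies universal patterns. The record does not give the algorithm, so the following is only a minimal sketch of that general idea and not FedImpute itself: it swaps in plain cluster-wise mean imputation with a global-mean fallback, involves no federated training, and assumes cluster labels are already available from some clusterer. The function names (`column_means`, `cluster_aware_impute`) and the toy data are hypothetical.

```python
from statistics import mean


def column_means(rows, n_cols):
    """Mean of the observed (non-None) values in each column; None if a column has none."""
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(mean(observed) if observed else None)
    return means


def cluster_aware_impute(rows, labels):
    """Fill each missing entry with its cluster's column mean.

    Falls back to the global column mean when the cluster has no observed
    value for that column. If the column is empty everywhere, the entry
    stays None. Illustrative stand-in only, not the method from the article.
    """
    n_cols = len(rows[0])
    global_means = column_means(rows, n_cols)  # "global" view: all rows
    cluster_means = {}
    for c in set(labels):  # "local" view: rows sharing a latent class
        members = [r for r, lab in zip(rows, labels) if lab == c]
        cluster_means[c] = column_means(members, n_cols)

    imputed = []
    for r, lab in zip(rows, labels):
        new_row = []
        for j, v in enumerate(r):
            if v is None:
                local = cluster_means[lab][j]
                v = local if local is not None else global_means[j]
            new_row.append(v)
        imputed.append(new_row)
    return imputed


# Hypothetical toy data: two well-separated groups, None marks a missing value.
rows = [
    [1.0, 10.0],
    [None, 12.0],
    [3.0, None],
    [9.0, 100.0],
    [None, 104.0],
]
labels = [0, 0, 0, 1, 1]  # assumed output of some clusterer
result = cluster_aware_impute(rows, labels)
```

In this toy case the missing first-column value in the second row is filled from its own group (mean of 1.0 and 3.0) rather than from the overall column mean, which the distant second group would pull upward. That is the intuition behind using class information during imputation.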