Hybrid Federated Learning for Multimodal IoT Systems

Bibliographic details
Published in: IEEE Internet of Things Journal, 2024-11, Vol. 11 (21), p. 34055-34064
Main authors: Peng, Yuanzhe; Wu, Yusen; Bian, Jieming; Xu, Jie
Format: Article
Language: English
Abstract: Multimodal federated learning (FL) targets the intersection of two promising research directions in Internet of Things (IoT) scenarios: 1) leveraging complementary multimodal information to enhance downstream inference performance and 2) conducting distributed training with privacy protection. However, the majority of existing works primarily focus on applying different FL methods in a straightforward manner after the multimodal feature fusion stage without fundamentally disentangling the multimodal FL across both the feature space and the sample space. There still exists an important tradeoff between the computationally demanding nature of multimodal information and the limited computing resources in IoT systems. To tackle this challenge, we propose a hybrid FL algorithm tailored for multimodal IoT systems (HFM). HFM utilizes vertical FL (VFL) to distribute computing resources across the feature space and horizontal FL (HFL) to distribute computing resources across the sample space. This algorithm necessitates consideration of both stale information from the VFL component and perturbed gradients from the HFL component, a combination that is not fully understood from a theoretical standpoint. In this article, we theoretically prove that the convergence of HFM depends on the frequency of VFL communication and HFL communication, as well as the number of vertical partitions and horizontal partitions. Furthermore, we empirically demonstrate that HFM outperforms three types of baselines based on two public multimodal data sets, thereby making it practical for multimodal IoT systems that require rapid and accurate downstream inference tasks, such as classification and prediction.
ISSN:2327-4662
DOI:10.1109/JIOT.2024.3443267
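
Illustrative note: the abstract describes splitting work across the feature space (vertical partitions) and the sample space (horizontal partitions). The following is a minimal sketch of that data layout only, not the HFM algorithm from the article. The function names `split_indices` and `partition_grid`, the contiguous near-equal split, and the toy data are all assumptions made for illustration.

```python
def split_indices(n, parts):
    """Split range(n) into `parts` contiguous, near-equal chunks."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for p in range(parts):
        # The first `extra` chunks receive one additional index.
        size = base + (1 if p < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def partition_grid(data, n_horizontal, n_vertical):
    """Cut a samples-by-features table into a grid of blocks.

    Hypothetical helper: block (h, v) holds the samples of horizontal
    partition h restricted to the features of vertical partition v.
    """
    row_groups = split_indices(len(data), n_horizontal)    # sample space
    col_groups = split_indices(len(data[0]), n_vertical)   # feature space
    grid = {}
    for h, rows in enumerate(row_groups):
        for v, cols in enumerate(col_groups):
            grid[(h, v)] = [[data[r][c] for c in cols] for r in rows]
    return grid


# Toy data (made up): 6 samples with 4 features each.
data = [[10 * r + c for c in range(4)] for r in range(6)]
grid = partition_grid(data, n_horizontal=3, n_vertical=2)
```

With 3 horizontal and 2 vertical partitions, the table is cut into 6 blocks, each holding 2 samples and 2 features. In a hybrid setting, one might assign each block to a separate device, but how HFM actually assigns and synchronizes partitions is specified in the article itself, not here.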