Heterogeneous Federated Learning Using Knowledge Codistillation

Federated Averaging, and many federated learning algorithm variants which build upon it, have a limitation: all clients must share the same model architecture. This results in unused modeling capacity on many clients, which limits model performance. To address this issue, we propose a method that involves training a small model on the entire pool of clients and a larger model on a subset of clients with higher capacity. The models exchange information bidirectionally via knowledge distillation, utilizing an unlabeled dataset on a server, without sharing parameters. We present two variants of our method, which improve upon Federated Averaging on image classification and language modeling tasks. We show this technique can be useful even if only out-of-domain or limited in-domain distillation data is available. Additionally, the bidirectional knowledge distillation allows for domain transfer between the models when different pool populations introduce domain shift.

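The method described above centers on a bidirectional knowledge-distillation step carried out on an unlabeled dataset held by the server, so that the small and large models exchange information without sharing parameters. As an illustration only, and not the authors' implementation, the following minimal NumPy sketch shows how such a step might be expressed: each model is pulled toward the other's temperature-softened predictions. The function names, the softmax temperature, and the use of KL divergence as the distillation loss are assumptions, not details taken from the paper.

# Minimal sketch (not the authors' implementation) of bidirectional knowledge
# codistillation between a small model trained on the full client pool and a
# large model trained on a subset of higher-capacity clients. The server holds
# an unlabeled distillation dataset; no model parameters are exchanged.
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution at a given temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of probability distributions."""
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def codistillation_losses(small_logits, large_logits, temperature=2.0):
    """Distillation losses for one batch of unlabeled server-side data.

    Each model is trained toward the other's temperature-softened predictions,
    so knowledge flows in both directions without any parameter sharing.
    """
    p_small = softmax(small_logits, temperature)
    p_large = softmax(large_logits, temperature)
    loss_for_small = kl_divergence(p_large, p_small)  # small model mimics large
    loss_for_large = kl_divergence(p_small, p_large)  # large model mimics small
    return loss_for_small, loss_for_large

if __name__ == "__main__":
    # Stand-in logits for a batch of 4 unlabeled examples and 10 classes.
    rng = np.random.default_rng(0)
    small_logits = rng.normal(size=(4, 10))
    large_logits = rng.normal(size=(4, 10))
    l_small, l_large = codistillation_losses(small_logits, large_logits)
    print(f"distillation loss for small model: {l_small:.4f}")
    print(f"distillation loss for large model: {l_large:.4f}")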
Bibliographic Details

Main Authors: Lichtarge, Jared; Amid, Ehsan; Kumar, Shankar; Yang, Tien-Ju; Anil, Rohan; Mathews, Rajiv
Format: Article
Language: English
Publication Date: 2023-10-03
Subjects: Computer Science - Learning
DOI: 10.48550/arxiv.2310.02549
Rights: http://creativecommons.org/licenses/by/4.0
Online Access: https://arxiv.org/abs/2310.02549
Source: arXiv.org