Heterogeneous Federated Learning Using Knowledge Codistillation

Federated Averaging, and many federated learning algorithm variants which build upon it, have a limitation: all clients must share the same model architecture. This results in unused modeling capacity on many clients, which limits model performance. To address this issue, we propose a method that involves training a small model on the entire pool of clients and a larger model on a subset of clients with higher capacity. The models exchange information bidirectionally via knowledge distillation, utilizing an unlabeled dataset on a server, without sharing parameters. We present two variants of our method, which improve upon Federated Averaging on image classification and language modeling tasks. We show this technique can be useful even if only out-of-domain or limited in-domain distillation data is available. Additionally, the bidirectional knowledge distillation allows for domain transfer between the models when different pool populations introduce domain shift.

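The method described above centers on a bidirectional knowledge-distillation step carried out on an unlabeled dataset held by the server, so that the small and large models exchange information without sharing parameters. As an illustration only, and not the authors' implementation, the following minimal NumPy sketch shows how such a step might be expressed: each model is pulled toward the other's temperature-softened predictions. The function names, the softmax temperature, and the use of KL divergence as the distillation loss are assumptions, not details taken from the paper.

# Minimal sketch (not the authors' implementation) of bidirectional knowledge
# codistillation between a small model trained on the full client pool and a
# large model trained on a subset of higher-capacity clients. The server holds
# an unlabeled distillation dataset; no model parameters are exchanged.
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution at a given temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of probability distributions."""
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def codistillation_losses(small_logits, large_logits, temperature=2.0):
    """Distillation losses for one batch of unlabeled server-side data.

    Each model is trained toward the other's temperature-softened predictions,
    so knowledge flows in both directions without any parameter sharing.
    """
    p_small = softmax(small_logits, temperature)
    p_large = softmax(large_logits, temperature)
    loss_for_small = kl_divergence(p_large, p_small)  # small model mimics large
    loss_for_large = kl_divergence(p_small, p_large)  # large model mimics small
    return loss_for_small, loss_for_large

if __name__ == "__main__":
    # Stand-in logits for a batch of 4 unlabeled examples and 10 classes.
    rng = np.random.default_rng(0)
    small_logits = rng.normal(size=(4, 10))
    large_logits = rng.normal(size=(4, 10))
    l_small, l_large = codistillation_losses(small_logits, large_logits)
    print(f"distillation loss for small model: {l_small:.4f}")
    print(f"distillation loss for large model: {l_large:.4f}")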
Bibliographic Details

Main Authors: Lichtarge, Jared; Amid, Ehsan; Kumar, Shankar; Yang, Tien-Ju; Anil, Rohan; Mathews, Rajiv
Format: Article
Language: English
Publication Date: 2023-10-03
Subjects: Computer Science - Learning
DOI: 10.48550/arxiv.2310.02549
Rights: http://creativecommons.org/licenses/by/4.0
Online Access: https://arxiv.org/abs/2310.02549
Source: arXiv.org