An Empirical Evaluation of Federated Contextual Bandit Algorithms

As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be...

Bibliographic details
Main authors:	Agarwal, Alekh; McMahan, H. Brendan; Xu, Zheng
Format:	Article
Language:	English
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Learning

creator	Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng
description	As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks. We approach such problems with the framework of federated contextual bandits, and develop variants of prominent contextual bandit algorithms from the centralized setting for the federated setting. We carefully evaluate these algorithms in a range of scenarios simulated using publicly available datasets. Our simulations model typical setups encountered in the real world, such as various misalignments between an initial pre-trained model and the subsequent user interactions due to non-stationarity in the data and/or heterogeneity across clients. Our experiments reveal the surprising effectiveness of the simple and commonly used softmax heuristic in balancing the well-known exploration-exploitation tradeoff across the breadth of our settings.
doi_str_mv	10.48550/arxiv.2303.10218
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2303_10218</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2303_10218</sourcerecordid><originalsourceid>FETCH-LOGICAL-a678-89fdedb8eddd04f6df5e3368b5c67089891985330f3c62d1cb457a8fef0c73373</originalsourceid><addsrcrecordid>eNotz71OwzAUhmEvDKhwAUz4BhLsnPgnY4hSQKrE0j068bHBUn4q163K3QOF6VtefdLD2IMUZW2VEk-YLvFcViCglKKS9pa17cL7-RBTdDjx_ozTCXNcF74GvvXkE2ZPvFuX7C_59JM840Ix83b6WFPMn_Pxjt0EnI7-_n83bL_t991rsXt_eevaXYHa2MI2gTyN1hORqIOmoDyAtqNy2gjb2EY2VgGIAE5XJN1YK4M2-CCcATCwYY9_t1fDcEhxxvQ1_FqGqwW-AV-iRDc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>An Empirical Evaluation of Federated Contextual Bandit Algorithms</title><source>arXiv.org</source><creator>Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng</creator><creatorcontrib>Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng</creatorcontrib><description>As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks. We approach such problems with the framework of federated contextual bandits, and develop variants of prominent contextual bandit algorithms from the centralized seting for the federated setting. We carefully evaluate these algorithms in a range of scenarios simulated using publicly available datasets. Our simulations model typical setups encountered in the real-world, such as various misalignments between an initial pre-trained model and the subsequent user interactions due to non-stationarity in the data and/or heterogeneity across clients. 
Our experiments reveal the surprising effectiveness of the simple and commonly used softmax heuristic in balancing the well-know exploration-exploitation tradeoff across the breadth of our settings.</description><identifier>DOI: 10.48550/arxiv.2303.10218</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject><creationdate>2023-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,777,882</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2303.10218$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2303.10218$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Agarwal, Alekh</creatorcontrib><creatorcontrib>McMahan, H. Brendan</creatorcontrib><creatorcontrib>Xu, Zheng</creatorcontrib><title>An Empirical Evaluation of Federated Contextual Bandit Algorithms</title><description>As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks. We approach such problems with the framework of federated contextual bandits, and develop variants of prominent contextual bandit algorithms from the centralized seting for the federated setting. We carefully evaluate these algorithms in a range of scenarios simulated using publicly available datasets. 
Our simulations model typical setups encountered in the real-world, such as various misalignments between an initial pre-trained model and the subsequent user interactions due to non-stationarity in the data and/or heterogeneity across clients. Our experiments reveal the surprising effectiveness of the simple and commonly used softmax heuristic in balancing the well-know exploration-exploitation tradeoff across the breadth of our settings.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71OwzAUhmEvDKhwAUz4BhLsnPgnY4hSQKrE0j068bHBUn4q163K3QOF6VtefdLD2IMUZW2VEk-YLvFcViCglKKS9pa17cL7-RBTdDjx_ozTCXNcF74GvvXkE2ZPvFuX7C_59JM840Ix83b6WFPMn_Pxjt0EnI7-_n83bL_t991rsXt_eevaXYHa2MI2gTyN1hORqIOmoDyAtqNy2gjb2EY2VgGIAE5XJN1YK4M2-CCcATCwYY9_t1fDcEhxxvQ1_FqGqwW-AV-iRDc</recordid><startdate>20230317</startdate><enddate>20230317</enddate><creator>Agarwal, Alekh</creator><creator>McMahan, H. Brendan</creator><creator>Xu, Zheng</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230317</creationdate><title>An Empirical Evaluation of Federated Contextual Bandit Algorithms</title><author>Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a678-89fdedb8eddd04f6df5e3368b5c67089891985330f3c62d1cb457a8fef0c73373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Agarwal, Alekh</creatorcontrib><creatorcontrib>McMahan, H. 
Brendan</creatorcontrib><creatorcontrib>Xu, Zheng</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Agarwal, Alekh</au><au>McMahan, H. Brendan</au><au>Xu, Zheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Empirical Evaluation of Federated Contextual Bandit Algorithms</atitle><date>2023-03-17</date><risdate>2023</risdate><abstract>As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks. We approach such problems with the framework of federated contextual bandits, and develop variants of prominent contextual bandit algorithms from the centralized seting for the federated setting. We carefully evaluate these algorithms in a range of scenarios simulated using publicly available datasets. Our simulations model typical setups encountered in the real-world, such as various misalignments between an initial pre-trained model and the subsequent user interactions due to non-stationarity in the data and/or heterogeneity across clients. Our experiments reveal the surprising effectiveness of the simple and commonly used softmax heuristic in balancing the well-know exploration-exploitation tradeoff across the breadth of our settings.</abstract><doi>10.48550/arxiv.2303.10218</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2303.10218
language	eng
recordid	cdi_arxiv_primary_2303_10218
source	arXiv.org
subjects	Computer Science - Artificial Intelligence ; Computer Science - Learning
title	An Empirical Evaluation of Federated Contextual Bandit Algorithms
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T09%3A13%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Empirical%20Evaluation%20of%20Federated%20Contextual%20Bandit%20Algorithms&rft.au=Agarwal,%20Alekh&rft.date=2023-03-17&rft_id=info:doi/10.48550/arxiv.2303.10218&rft_dat=%3Carxiv_GOX%3E2303_10218%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
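
The description above credits the "softmax heuristic" with balancing exploration and exploitation. As a rough illustration of that general idea only, the sketch below samples an action with probability proportional to the exponential of its estimated reward. This record does not describe the paper's actual algorithms or federated training procedure, so the function names (softmax_probs, softmax_explore), the temperature parameter, and the hard-coded scores are all assumptions made for this example.

```python
import math
import random


def softmax_probs(scores, temperature=1.0):
    """Map per-action reward estimates to a sampling distribution.

    `scores` stands in for a model's estimated reward per action
    (hypothetical input). Lower `temperature` concentrates mass on the
    best-scoring action; higher `temperature` explores more uniformly.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_explore(scores, temperature=1.0, rng=random):
    """Sample one action index according to the softmax distribution."""
    probs = softmax_probs(scores, temperature)
    action = rng.choices(range(len(scores)), weights=probs, k=1)[0]
    return action, probs


if __name__ == "__main__":
    # Illustrative reward estimates for three actions (made-up numbers).
    estimates = [0.1, 0.5, 0.4]
    chosen, distribution = softmax_explore(
        estimates, temperature=0.5, rng=random.Random(0)
    )
    print("sampled action:", chosen)
    print("sampling distribution:", distribution)
```

Sampling rather than always taking the highest-scoring action is what provides exploration: actions with lower estimates are still tried occasionally, so feedback on them can correct a poor initial estimate.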