An Empirical Evaluation of Federated Contextual Bandit Algorithms
As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be...
Saved in:
Main authors: | Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng |
description | As the adoption of federated learning increases for learning from sensitive
data local to user devices, it is natural to ask if the learning can be done
using implicit signals generated as users interact with the applications of
interest, rather than requiring access to explicit labels which can be
difficult to acquire in many tasks. We approach such problems with the
framework of federated contextual bandits, and develop variants of prominent
contextual bandit algorithms from the centralized setting for the federated
setting. We carefully evaluate these algorithms in a range of scenarios
simulated using publicly available datasets. Our simulations model typical
setups encountered in the real world, such as various misalignments between an
initial pre-trained model and the subsequent user interactions due to
non-stationarity in the data and/or heterogeneity across clients. Our
experiments reveal the surprising effectiveness of the simple and commonly used
softmax heuristic in balancing the well-known exploration-exploitation tradeoff
across the breadth of our settings. |
doi_str_mv | 10.48550/arxiv.2303.10218 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2303_10218</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2303_10218</sourcerecordid><originalsourceid>FETCH-LOGICAL-a678-89fdedb8eddd04f6df5e3368b5c67089891985330f3c62d1cb457a8fef0c73373</originalsourceid><addsrcrecordid>eNotz71OwzAUhmEvDKhwAUz4BhLsnPgnY4hSQKrE0j068bHBUn4q163K3QOF6VtefdLD2IMUZW2VEk-YLvFcViCglKKS9pa17cL7-RBTdDjx_ozTCXNcF74GvvXkE2ZPvFuX7C_59JM840Ix83b6WFPMn_Pxjt0EnI7-_n83bL_t991rsXt_eevaXYHa2MI2gTyN1hORqIOmoDyAtqNy2gjb2EY2VgGIAE5XJN1YK4M2-CCcATCwYY9_t1fDcEhxxvQ1_FqGqwW-AV-iRDc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>An Empirical Evaluation of Federated Contextual Bandit Algorithms</title><source>arXiv.org</source><creator>Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng</creator><creatorcontrib>Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng</creatorcontrib><description>As the adoption of federated learning increases for learning from sensitive
data local to user devices, it is natural to ask if the learning can be done
using implicit signals generated as users interact with the applications of
interest, rather than requiring access to explicit labels which can be
difficult to acquire in many tasks. We approach such problems with the
framework of federated contextual bandits, and develop variants of prominent
contextual bandit algorithms from the centralized setting for the federated
setting. We carefully evaluate these algorithms in a range of scenarios
simulated using publicly available datasets. Our simulations model typical
setups encountered in the real world, such as various misalignments between an
initial pre-trained model and the subsequent user interactions due to
non-stationarity in the data and/or heterogeneity across clients. Our
experiments reveal the surprising effectiveness of the simple and commonly used
softmax heuristic in balancing the well-known exploration-exploitation tradeoff
across the breadth of our settings.</description><identifier>DOI: 10.48550/arxiv.2303.10218</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject><creationdate>2023-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,777,882</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2303.10218$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2303.10218$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Agarwal, Alekh</creatorcontrib><creatorcontrib>McMahan, H. Brendan</creatorcontrib><creatorcontrib>Xu, Zheng</creatorcontrib><title>An Empirical Evaluation of Federated Contextual Bandit Algorithms</title><description>As the adoption of federated learning increases for learning from sensitive
data local to user devices, it is natural to ask if the learning can be done
using implicit signals generated as users interact with the applications of
interest, rather than requiring access to explicit labels which can be
difficult to acquire in many tasks. We approach such problems with the
framework of federated contextual bandits, and develop variants of prominent
contextual bandit algorithms from the centralized setting for the federated
setting. We carefully evaluate these algorithms in a range of scenarios
simulated using publicly available datasets. Our simulations model typical
setups encountered in the real world, such as various misalignments between an
initial pre-trained model and the subsequent user interactions due to
non-stationarity in the data and/or heterogeneity across clients. Our
experiments reveal the surprising effectiveness of the simple and commonly used
softmax heuristic in balancing the well-known exploration-exploitation tradeoff
across the breadth of our settings.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71OwzAUhmEvDKhwAUz4BhLsnPgnY4hSQKrE0j068bHBUn4q163K3QOF6VtefdLD2IMUZW2VEk-YLvFcViCglKKS9pa17cL7-RBTdDjx_ozTCXNcF74GvvXkE2ZPvFuX7C_59JM840Ix83b6WFPMn_Pxjt0EnI7-_n83bL_t991rsXt_eevaXYHa2MI2gTyN1hORqIOmoDyAtqNy2gjb2EY2VgGIAE5XJN1YK4M2-CCcATCwYY9_t1fDcEhxxvQ1_FqGqwW-AV-iRDc</recordid><startdate>20230317</startdate><enddate>20230317</enddate><creator>Agarwal, Alekh</creator><creator>McMahan, H. Brendan</creator><creator>Xu, Zheng</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230317</creationdate><title>An Empirical Evaluation of Federated Contextual Bandit Algorithms</title><author>Agarwal, Alekh ; McMahan, H. Brendan ; Xu, Zheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a678-89fdedb8eddd04f6df5e3368b5c67089891985330f3c62d1cb457a8fef0c73373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Agarwal, Alekh</creatorcontrib><creatorcontrib>McMahan, H. Brendan</creatorcontrib><creatorcontrib>Xu, Zheng</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Agarwal, Alekh</au><au>McMahan, H. 
Brendan</au><au>Xu, Zheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Empirical Evaluation of Federated Contextual Bandit Algorithms</atitle><date>2023-03-17</date><risdate>2023</risdate><abstract>As the adoption of federated learning increases for learning from sensitive
data local to user devices, it is natural to ask if the learning can be done
using implicit signals generated as users interact with the applications of
interest, rather than requiring access to explicit labels which can be
difficult to acquire in many tasks. We approach such problems with the
framework of federated contextual bandits, and develop variants of prominent
contextual bandit algorithms from the centralized setting for the federated
setting. We carefully evaluate these algorithms in a range of scenarios
simulated using publicly available datasets. Our simulations model typical
setups encountered in the real world, such as various misalignments between an
initial pre-trained model and the subsequent user interactions due to
non-stationarity in the data and/or heterogeneity across clients. Our
experiments reveal the surprising effectiveness of the simple and commonly used
softmax heuristic in balancing the well-known exploration-exploitation tradeoff
across the breadth of our settings.</abstract><doi>10.48550/arxiv.2303.10218</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2303.10218 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2303_10218 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence ; Computer Science - Learning |
title | An Empirical Evaluation of Federated Contextual Bandit Algorithms |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T09%3A13%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Empirical%20Evaluation%20of%20Federated%20Contextual%20Bandit%20Algorithms&rft.au=Agarwal,%20Alekh&rft.date=2023-03-17&rft_id=info:doi/10.48550/arxiv.2303.10218&rft_dat=%3Carxiv_GOX%3E2303_10218%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |