Regret of Age-of-Information Bandits
We consider a system with a single source that measures/tracks a time-varying quantity and periodically attempts to report these measurements to a monitoring station. Each update from the source has to be scheduled on one of K available communication channels. The probability of success of each at...
Published in: | IEEE transactions on communications 2022-01, Vol.70 (1), p.87-100 |
---|---|
Main authors: | Fatale, Santosh ; Bhandari, Kavya ; Narula, Urvidh ; Moharir, Sharayu ; Hanawal, Manjesh K. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 100 |
---|---|
container_issue | 1 |
container_start_page | 87 |
container_title | IEEE transactions on communications |
container_volume | 70 |
creator | Fatale, Santosh ; Bhandari, Kavya ; Narula, Urvidh ; Moharir, Sharayu ; Hanawal, Manjesh K. |
description | We consider a system with a single source that measures/tracks a time-varying quantity and periodically attempts to report these measurements to a monitoring station. Each update from the source has to be scheduled on one of K available communication channels. The probability of success of each attempted communication is a function of the channel used. This function is unknown to the scheduler. The metric of interest is the Age-of-Information (AoI), formally defined as the time elapsed since the destination received the most recent update from the source. We model our scheduling problem as a variant of the multi-armed bandit problem with communication channels as arms. We characterize a lower bound on the AoI regret achievable by any policy and characterize the performance of UCB, Thompson Sampling, and their variants. Our analytical results show that UCB and Thompson Sampling are order-optimal for AoI bandits. In addition, we propose novel policies which, unlike UCB and Thompson Sampling, use the current AoI to make scheduling decisions. Via simulations, we show that the proposed AoI-aware policies outperform existing AoI-agnostic policies. |
doi_str_mv | 10.1109/TCOMM.2021.3118037 |
format | Article |
coden | IECMBT |
orcidid | 0000-0001-5688-4640 ; 0000-0001-9393-9276 ; 0000-0002-1807-5487 ; 0000-0001-6328-1709 ; 0000-0002-7417-3882 |
publisher | New York: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
toplevel | peer_reviewed ; online_resources |
tpages | 14 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0090-6778 |
ispartof | IEEE transactions on communications, 2022-01, Vol.70 (1), p.87-100 |
issn | 0090-6778 1558-0857 |
language | eng |
recordid | cdi_ieee_primary_9559999 |
source | IEEE Electronic Library (IEL) |
subjects | age-of-information (AOI) ; Channels ; communication channel ; Communication channels ; internet-of-things (IOT) ; Lower bounds ; Measurement ; Monitoring ; multi-armed bandit (MAB) ; Multi-armed bandit problems ; Policies ; Sampling ; Schedules ; Scheduling ; Scheduling algorithms ; sensors ; Time measurement ; Time-varying systems ; Upper bound |
title | Regret of Age-of-Information Bandits |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T18%3A45%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Regret%20of%20Age-of-Information%20Bandits&rft.jtitle=IEEE%20transactions%20on%20communications&rft.au=Fatale,%20Santosh&rft.date=2022-01&rft.volume=70&rft.issue=1&rft.spage=87&rft.epage=100&rft.pages=87-100&rft.issn=0090-6778&rft.eissn=1558-0857&rft.coden=IECMBT&rft_id=info:doi/10.1109/TCOMM.2021.3118037&rft_dat=%3Cproquest_RIE%3E2619589179%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2619589179&rft_id=info:pmid/&rft_ieee_id=9559999&rfr_iscdi=true |
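
The scheduling problem summarized in the description can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes Bernoulli channels, the convention that the AoI resets to 1 after a successful update and grows by 1 otherwise, and the textbook UCB1 index; the names `run_policy`, `ucb1`, `make_oracle`, and `estimate_regret` are hypothetical. It estimates AoI regret as the gap in cumulative AoI between UCB1 and an oracle that always uses the best channel.

```python
import math
import random


def ucb1(t, pulls, successes):
    """Textbook UCB1: try every channel once, then maximize the index."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    return max(
        range(len(pulls)),
        key=lambda a: successes[a] / pulls[a]
        + math.sqrt(2.0 * math.log(t) / pulls[a]),
    )


def make_oracle(success_probs):
    """Policy that knows the success probabilities and always uses the best channel."""
    best = max(range(len(success_probs)), key=lambda a: success_probs[a])
    return lambda t, pulls, successes: best


def run_policy(success_probs, horizon, choose_arm, rng):
    """Return the cumulative AoI over `horizon` slots under `choose_arm`."""
    k = len(success_probs)
    pulls = [0] * k
    successes = [0] * k
    aoi = 0          # assumed initial age (an assumption of this sketch)
    total_aoi = 0
    for t in range(1, horizon + 1):
        arm = choose_arm(t, pulls, successes)
        delivered = rng.random() < success_probs[arm]
        pulls[arm] += 1
        successes[arm] += int(delivered)
        # Assumed AoI dynamics: reset to 1 on success, otherwise age by one slot.
        aoi = 1 if delivered else aoi + 1
        total_aoi += aoi
    return total_aoi


def estimate_regret(success_probs, horizon, runs=200, seed=0):
    """Monte Carlo estimate of cumulative AoI(UCB1) minus cumulative AoI(oracle)."""
    rng = random.Random(seed)
    oracle = make_oracle(success_probs)
    gap = 0
    for _ in range(runs):
        gap += run_policy(success_probs, horizon, ucb1, rng)
        gap -= run_policy(success_probs, horizon, oracle, rng)
    return gap / runs


if __name__ == "__main__":
    # Illustrative channel success probabilities, not taken from the paper.
    print(estimate_regret([0.3, 0.5, 0.8], horizon=1000))
```

Both policies here are AoI-agnostic: the channel choice never reads `aoi`. An AoI-aware variant in the spirit of the description would pass the current age to `choose_arm`, but the specific rules proposed in the paper are not given in this record and are not reproduced here.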