Transmission Control in NB-IoT with Model-Based Reinforcement Learning

Full Description

In Narrowband Internet of Things (NB-IoT), the control of uplink transmissions is a complex task involving device scheduling, resource allocation within the carrier, and the configuration of link-adaptation parameters. Existing heuristic proposals address the problem only partially, whereas reinforcement learning (RL) appears a priori to be the most effective approach, given its success in similar control problems. However, the low sample efficiency of conventional (model-free) RL algorithms is an important limitation for their deployment in real systems: during their initial learning stages, RL agents need to explore the policy space by selecting actions that are, in general, highly ineffective. In an NB-IoT access network, this implies a disproportionate increase in transmission delays. In this paper, we make two contributions to enable the adoption of RL in NB-IoT. First, we present a multi-agent architecture based on the principle of task division. Second, we propose a new model-based RL algorithm for link adaptation characterized by its high sample efficiency. The combination of these two strategies results in an algorithm that, during the learning phase, is able to keep the transmission delay on the order of hundreds of milliseconds, whereas model-free RL algorithms cause delays of up to several seconds. This allows our approach to be deployed, without prior training, in an operating NB-IoT network and to learn to control it efficiently without degrading its performance.
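The abstract does not reproduce the algorithm itself, but the model-based idea it describes can be illustrated with a minimal sketch: an agent that fits a simple statistical model of transmission outcomes from every interaction and then plans link-adaptation decisions against that model, rather than learning purely from trial and error. Everything below is hypothetical (the channel buckets, the configuration set, the delay figures, and the simulated channel); this is not the authors' algorithm, only a toy Python illustration of the general technique under stated assumptions.

```python
# Minimal model-based link-adaptation sketch (illustrative only, not the paper's method).
# The agent learns an empirical success-probability model per (channel bucket, config)
# pair and picks the configuration with the lowest expected delay under that model.

import random

N_CHANNEL_BUCKETS = 4          # coarse quantisation of reported channel quality (assumed)
CONFIGS = [0, 1, 2, 3]         # abstract link-adaptation configurations (assumed)
TX_TIME_MS = [8, 16, 32, 64]   # assumed airtime per configuration (more repetitions = slower)

# Learned model: success/attempt counts per (bucket, config); weak prior so every
# configuration starts with a finite probability estimate.
successes = [[1] * len(CONFIGS) for _ in range(N_CHANNEL_BUCKETS)]
attempts  = [[2] * len(CONFIGS) for _ in range(N_CHANNEL_BUCKETS)]

def expected_delay_ms(bucket: int, cfg: int) -> float:
    """Expected delay if failed transmissions are retried until success (geometric model)."""
    p = successes[bucket][cfg] / attempts[bucket][cfg]
    return TX_TIME_MS[cfg] / max(p, 1e-3)

def plan(bucket: int, epsilon: float = 0.05) -> int:
    """Greedy planning against the learned model, with a small exploration rate."""
    if random.random() < epsilon:
        return random.choice(CONFIGS)
    return min(CONFIGS, key=lambda c: expected_delay_ms(bucket, c))

def simulated_outcome(bucket: int, cfg: int) -> bool:
    """Stand-in for the real network: better channels and more robust configs succeed more often."""
    base = 0.2 + 0.2 * bucket
    return random.random() < min(0.95, base + 0.18 * cfg)

def run(episodes: int = 5000) -> None:
    for _ in range(episodes):
        bucket = random.randrange(N_CHANNEL_BUCKETS)   # observed channel quality
        cfg = plan(bucket)
        ok = simulated_outcome(bucket, cfg)
        # Model update: every interaction refines the success-probability estimate,
        # which is what gives the model-based approach its sample efficiency.
        attempts[bucket][cfg] += 1
        successes[bucket][cfg] += int(ok)

if __name__ == "__main__":
    run()
    for b in range(N_CHANNEL_BUCKETS):
        best = min(CONFIGS, key=lambda c: expected_delay_ms(b, c))
        print(f"channel bucket {b}: preferred config {best}")
```

The point of the sketch is the trade-off the abstract argues for: because candidate configurations are evaluated against a learned model rather than discovered by blind trial and error, the cost of exploration (in the paper's setting, real transmission delay) stays bounded while the controller improves.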

Bibliographic Details
Published in: IEEE Access, 2023-01, Vol. 11, p. 1-1
Main Authors: Alcaraz, Juan J.; Losilla, Fernando; Gonzalez-Castano, Francisco-Javier
Format: Article
Language: English
Subjects: Adaptation; Algorithms; Delays; Downlink; Internet of Things; link adaptation; Machine learning; Multiagent systems; Narrowband; Narrowband Internet of Things (NB-IoT); NPUSCH; Performance degradation; Reinforcement Learning; Resource allocation; Resource management; Resource scheduling; Task analysis; Task scheduling; Time-frequency analysis; Uplink; uplink scheduling
Online Access: Full text
DOI: 10.1109/ACCESS.2023.3284990
ISSN: 2169-3536
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals