Transmission Control in NB-IoT with Model-Based Reinforcement Learning
In Narrowband Internet of Things (NB-IoT), the control of uplink transmissions is a complex task involving device scheduling, resource allocation in the carrier, and the configuration of link-adaptation parameters. Existing heuristic proposals partially address the problem, but reinforcement learning (RL) seems, a priori, to be the most effective approach, given its success in similar control problems. However, the low sample efficiency of conventional (model-free) RL algorithms is an important limitation for their deployment in real systems. During their initial learning stages, RL agents need to explore the policy space by selecting actions that are, in general, highly ineffective. In an NB-IoT access network, this implies a disproportionate increase in transmission delays. In this paper, we make two contributions to enable the adoption of RL in NB-IoT: first, we present a multi-agent architecture based on the principle of task division; second, we propose a new model-based RL algorithm for link adaptation characterized by its high sample efficiency. The combination of these two strategies results in an algorithm that, during the learning phase, is able to keep the transmission delay in the order of hundreds of milliseconds, whereas model-free RL algorithms cause delays of up to several seconds. This allows our approach to be deployed, without prior training, in an operating NB-IoT network and to learn to control it efficiently without degrading its performance.
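The abstract's central idea is that learning an explicit model of transmission outcomes lets a controller adapt link parameters with far fewer costly exploratory transmissions than model-free RL. The sketch below is not the paper's algorithm; it is only a minimal, generic illustration of that idea: a Beta-Bernoulli success model per candidate (MCS, repetitions) configuration, explored with Thompson sampling, choosing the configuration with the lowest expected airtime per delivered bit. All names, constants, and the toy channel (OPTIONS, TBS_BITS, SLOT_MS, simulate_channel) are hypothetical placeholders.

```python
# Minimal sketch (NOT the paper's algorithm): a Bayesian bandit-style link
# adapter that keeps an explicit Beta model of decoding success for each
# candidate (MCS, repetitions) option and selects, via Thompson sampling,
# the option with the lowest expected airtime per delivered bit.
# All constants and the toy channel below are hypothetical.
import math
import random

# Hypothetical candidate uplink configurations: (MCS index, repetitions).
OPTIONS = [(0, 8), (2, 4), (5, 2), (8, 1)]
TBS_BITS = {0: 256, 2: 504, 5: 936, 8: 1352}   # toy transport-block sizes (bits)
SLOT_MS = 8.0                                  # toy airtime of one repetition (ms)


class BayesianLinkAdapter:
    """Beta-Bernoulli success model per option, explored with Thompson sampling."""

    def __init__(self, options):
        self.alpha = {o: 1.0 for o in options}  # prior pseudo-count of successes
        self.beta = {o: 1.0 for o in options}   # prior pseudo-count of failures

    def select(self):
        best, best_cost = None, float("inf")
        for opt in self.alpha:
            mcs, reps = opt
            # Sample a plausible success probability from the posterior.
            p = random.betavariate(self.alpha[opt], self.beta[opt])
            # Expected airtime per delivered bit, assuming geometric retries.
            cost = (reps * SLOT_MS) / (max(p, 1e-6) * TBS_BITS[mcs])
            if cost < best_cost:
                best, best_cost = opt, cost
        return best

    def update(self, option, success):
        if success:
            self.alpha[option] += 1.0
        else:
            self.beta[option] += 1.0


def simulate_channel(option, snr_db=2.0):
    """Toy stand-in for NPUSCH decoding: higher MCS needs more SNR, repetitions help."""
    mcs, reps = option
    margin = snr_db - 0.8 * mcs + 1.5 * (reps - 1)
    return random.random() < 1.0 / (1.0 + math.exp(-margin))


if __name__ == "__main__":
    agent = BayesianLinkAdapter(OPTIONS)
    for _ in range(2000):
        choice = agent.select()
        agent.update(choice, simulate_channel(choice))
    means = {o: round(agent.alpha[o] / (agent.alpha[o] + agent.beta[o]), 2) for o in OPTIONS}
    print("posterior success estimates:", means)
```

Because the success model is updated after every transmission, the sketch converges after a few hundred samples; this is the kind of sample-efficiency argument the abstract makes for model-based methods, although the paper's multi-agent, task-division design is considerably more elaborate.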
Saved in:

Published in: | IEEE Access, 2023-01, Vol. 11, p. 1-1 |
---|---|
Main authors: | Alcaraz, Juan J.; Losilla, Fernando; Gonzalez-Castano, Francisco-Javier |
Format: | Article |
Language: | eng |
Subjects: | Adaptation; Algorithms; Delays; Downlink; Internet of Things; link adaptation; Machine learning; Multiagent systems; Narrowband; Narrowband Internet of Things (NB-IoT); NPUSCH; Performance degradation; Reinforcement Learning; Resource allocation; Resource management; Resource scheduling; Task analysis; Task scheduling; Time-frequency analysis; Uplink; uplink scheduling |
Online access: | Full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE access |
container_volume | 11 |
creator | Alcaraz, Juan J.; Losilla, Fernando; Gonzalez-Castano, Francisco-Javier |
description | In Narrowband Internet of Things (NB-IoT), the control of uplink transmissions is a complex task involving device scheduling, resource allocation in the carrier, and the configuration of link-adaptation parameters. Existing heuristic proposals partially address the problem, but reinforcement learning (RL) seems, a priori, to be the most effective approach, given its success in similar control problems. However, the low sample efficiency of conventional (model-free) RL algorithms is an important limitation for their deployment in real systems. During their initial learning stages, RL agents need to explore the policy space by selecting actions that are, in general, highly ineffective. In an NB-IoT access network, this implies a disproportionate increase in transmission delays. In this paper, we make two contributions to enable the adoption of RL in NB-IoT: first, we present a multi-agent architecture based on the principle of task division; second, we propose a new model-based RL algorithm for link adaptation characterized by its high sample efficiency. The combination of these two strategies results in an algorithm that, during the learning phase, is able to keep the transmission delay in the order of hundreds of milliseconds, whereas model-free RL algorithms cause delays of up to several seconds. This allows our approach to be deployed, without prior training, in an operating NB-IoT network and to learn to control it efficiently without degrading its performance. |
doi_str_mv | 10.1109/ACCESS.2023.3284990 |
format | Article |
publisher | Piscataway: IEEE |
ieee_id | 10147823 |
doaj_id | oai_doaj_org_article_6c3b488358344d9c9648925f54797829 |
linktohtml | https://ieeexplore.ieee.org/document/10147823 |
orcidid | 0000-0001-6938-1540; 0000-0002-1756-0130; 0000-0001-5225-8378 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2023-01, Vol.11, p.1-1 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2826477782 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals |
subjects | Adaptation; Algorithms; Delays; Downlink; Internet of Things; link adaptation; Machine learning; Multiagent systems; Narrowband; Narrowband Internet of Things (NB-IoT); NPUSCH; Performance degradation; Reinforcement Learning; Resource allocation; Resource management; Resource scheduling; Task analysis; Task scheduling; Time-frequency analysis; Uplink; uplink scheduling |
title | Transmission Control in NB-IoT with Model-Based Reinforcement Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T11%3A56%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Transmission%20Control%20in%20NB-IoT%20with%20Model-Based%20Reinforcement%20Learning&rft.jtitle=IEEE%20access&rft.au=Alcaraz,%20Juan%20J.&rft.date=2023-01-01&rft.volume=11&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2023.3284990&rft_dat=%3Cproquest_doaj_%3E2826477782%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2826477782&rft_id=info:pmid/&rft_ieee_id=10147823&rft_doaj_id=oai_doaj_org_article_6c3b488358344d9c9648925f54797829&rfr_iscdi=true |