Transmission Control in NB-IoT with Model-Based Reinforcement Learning
In Narrowband Internet of Things (NB-IoT), the control of uplink transmissions is a complex task involving device scheduling, resource allocation in the carrier, and the configuration of link-adaptation parameters. Existing heuristic proposals partially address the problem, but reinforcement learning (RL) seems, a priori, to be the most effective approach, given its success in similar control problems. However, the low sample efficiency of conventional (model-free) RL algorithms is an important limitation for their deployment in real systems. During their initial learning stages, RL agents need to explore the policy space by selecting actions that are, in general, highly ineffective. In an NB-IoT access network, this implies a disproportionate increase in transmission delays. In this paper, we make two contributions to enable the adoption of RL in NB-IoT: first, we present a multi-agent architecture based on the principle of task division; second, we propose a new model-based RL algorithm for link adaptation characterized by its high sample efficiency. The combination of these two strategies results in an algorithm that, during the learning phase, is able to keep the transmission delay in the order of hundreds of milliseconds, whereas model-free RL algorithms cause delays of up to several seconds. This allows our approach to be deployed, without prior training, in an operating NB-IoT network and to learn to control it efficiently without degrading its performance.
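The abstract's central idea is that learning an explicit model of transmission outcomes lets a controller adapt link parameters with far fewer costly exploratory transmissions than model-free RL. The sketch below is not the paper's algorithm; it is only a minimal, generic illustration of that idea: a Beta-Bernoulli success model per candidate (MCS, repetitions) configuration, explored with Thompson sampling, choosing the configuration with the lowest expected airtime per delivered bit. All names, constants, and the toy channel (OPTIONS, TBS_BITS, SLOT_MS, simulate_channel) are hypothetical placeholders.

```python
# Minimal sketch (NOT the paper's algorithm): a Bayesian bandit-style link
# adapter that keeps an explicit Beta model of decoding success for each
# candidate (MCS, repetitions) option and selects, via Thompson sampling,
# the option with the lowest expected airtime per delivered bit.
# All constants and the toy channel below are hypothetical.
import math
import random

# Hypothetical candidate uplink configurations: (MCS index, repetitions).
OPTIONS = [(0, 8), (2, 4), (5, 2), (8, 1)]
TBS_BITS = {0: 256, 2: 504, 5: 936, 8: 1352}   # toy transport-block sizes (bits)
SLOT_MS = 8.0                                  # toy airtime of one repetition (ms)


class BayesianLinkAdapter:
    """Beta-Bernoulli success model per option, explored with Thompson sampling."""

    def __init__(self, options):
        self.alpha = {o: 1.0 for o in options}  # prior pseudo-count of successes
        self.beta = {o: 1.0 for o in options}   # prior pseudo-count of failures

    def select(self):
        best, best_cost = None, float("inf")
        for opt in self.alpha:
            mcs, reps = opt
            # Sample a plausible success probability from the posterior.
            p = random.betavariate(self.alpha[opt], self.beta[opt])
            # Expected airtime per delivered bit, assuming geometric retries.
            cost = (reps * SLOT_MS) / (max(p, 1e-6) * TBS_BITS[mcs])
            if cost < best_cost:
                best, best_cost = opt, cost
        return best

    def update(self, option, success):
        if success:
            self.alpha[option] += 1.0
        else:
            self.beta[option] += 1.0


def simulate_channel(option, snr_db=2.0):
    """Toy stand-in for NPUSCH decoding: higher MCS needs more SNR, repetitions help."""
    mcs, reps = option
    margin = snr_db - 0.8 * mcs + 1.5 * (reps - 1)
    return random.random() < 1.0 / (1.0 + math.exp(-margin))


if __name__ == "__main__":
    agent = BayesianLinkAdapter(OPTIONS)
    for _ in range(2000):
        choice = agent.select()
        agent.update(choice, simulate_channel(choice))
    means = {o: round(agent.alpha[o] / (agent.alpha[o] + agent.beta[o]), 2) for o in OPTIONS}
    print("posterior success estimates:", means)
```

Because the success model is updated after every transmission, the sketch converges after a few hundred samples; this is the kind of sample-efficiency argument the abstract makes for model-based methods, although the paper's multi-agent, task-division design is considerably more elaborate.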
Saved in:

Published in: | IEEE Access, 2023-01, Vol. 11, p. 1-1 |
---|---|
Main authors: | Alcaraz, Juan J.; Losilla, Fernando; Gonzalez-Castano, Francisco-Javier |
Format: | Article |
Language: | eng |
Subjects: | Adaptation; Algorithms; Delays; Downlink; Internet of Things; link adaptation; Machine learning; Multiagent systems; Narrowband; Narrowband Internet of Things (NB-IoT); NPUSCH; Performance degradation; Reinforcement Learning; Resource allocation; Resource management; Resource scheduling; Task analysis; Task scheduling; Time-frequency analysis; Uplink; uplink scheduling |
Online access: | Full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE access |
container_volume | 11 |
creator | Alcaraz, Juan J.; Losilla, Fernando; Gonzalez-Castano, Francisco-Javier |
description | In Narrowband Internet of Things (NB-IoT), the control of uplink transmissions is a complex task involving device scheduling, resource allocation in the carrier, and the configuration of link-adaptation parameters. Existing heuristic proposals partially address the problem, but reinforcement learning (RL) seems, a priori, to be the most effective approach, given its success in similar control problems. However, the low sample efficiency of conventional (model-free) RL algorithms is an important limitation for their deployment in real systems. During their initial learning stages, RL agents need to explore the policy space by selecting actions that are, in general, highly ineffective. In an NB-IoT access network, this implies a disproportionate increase in transmission delays. In this paper, we make two contributions to enable the adoption of RL in NB-IoT: first, we present a multi-agent architecture based on the principle of task division; second, we propose a new model-based RL algorithm for link adaptation characterized by its high sample efficiency. The combination of these two strategies results in an algorithm that, during the learning phase, is able to keep the transmission delay in the order of hundreds of milliseconds, whereas model-free RL algorithms cause delays of up to several seconds. This allows our approach to be deployed, without prior training, in an operating NB-IoT network and to learn to control it efficiently without degrading its performance. |
doi_str_mv | 10.1109/ACCESS.2023.3284990 |
format | Article |
publisher | Piscataway: IEEE |
ieee_id | 10147823 |
doaj_id | oai_doaj_org_article_6c3b488358344d9c9648925f54797829 |
linktohtml | https://ieeexplore.ieee.org/document/10147823 |
orcidid | 0000-0001-6938-1540; 0000-0002-1756-0130; 0000-0001-5225-8378 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2023-01, Vol.11, p.1-1 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2826477782 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals |
subjects | Adaptation; Algorithms; Delays; Downlink; Internet of Things; link adaptation; Machine learning; Multiagent systems; Narrowband; Narrowband Internet of Things (NB-IoT); NPUSCH; Performance degradation; Reinforcement Learning; Resource allocation; Resource management; Resource scheduling; Task analysis; Task scheduling; Time-frequency analysis; Uplink; uplink scheduling |
title | Transmission Control in NB-IoT with Model-Based Reinforcement Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T11%3A56%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Transmission%20Control%20in%20NB-IoT%20with%20Model-Based%20Reinforcement%20Learning&rft.jtitle=IEEE%20access&rft.au=Alcaraz,%20Juan%20J.&rft.date=2023-01-01&rft.volume=11&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2023.3284990&rft_dat=%3Cproquest_doaj_%3E2826477782%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2826477782&rft_id=info:pmid/&rft_ieee_id=10147823&rft_doaj_id=oai_doaj_org_article_6c3b488358344d9c9648925f54797829&rfr_iscdi=true |