Online reinforcement learning for adaptive interference coordination
Heterogeneous networks (HetNets), in which small cells overlay macro cells, are a cost‐effective approach to increase the capacity of cellular networks. However, HetNets have raised new issues related to cell association and interference management. In particular, the optimal configuration of interference coordination (IC) parameters is a challenging task because it depends on multiple stochastic processes such as the locations of the users, the traffic demands, or the strength of the received signals. This work proposes a self‐optimization algorithm capable of finding the optimal configuration in an operating network.
Saved in:
Published in: | Transactions on emerging telecommunications technologies 2020-10, Vol.31 (10), p.n/a |
---|---|
Main authors: | Alcaraz, Juan J.; Ayala‐Romero, Jose A.; Vales‐Alonso, Javier; Losilla‐López, Fernando |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | n/a |
---|---|
container_issue | 10 |
container_start_page | |
container_title | Transactions on emerging telecommunications technologies |
container_volume | 31 |
creator | Alcaraz, Juan J.; Ayala‐Romero, Jose A.; Vales‐Alonso, Javier; Losilla‐López, Fernando |
description | Heterogeneous networks (HetNets), in which small cells overlay macro cells, are a cost‐effective approach to increase the capacity of cellular networks. However, HetNets have raised new issues related to cell association and interference management. In particular, the optimal configuration of interference coordination (IC) parameters is a challenging task because it depends on multiple stochastic processes such as the locations of the users, the traffic demands, or the strength of the received signals. This work proposes a self‐optimization algorithm capable of finding the optimal configuration in an operating network. We address the problem using a reinforcement learning (RL) approach, in which the actions are the IC configurations, whose performances are initially unknown. The main difficulty is that, due to the variable network conditions, the performance of each action may change over time. Our proposal is based on two main elements: the sequential exploration of subsets of actions (exploration regions), and an optimal stopping (OS) strategy for deciding when to end current exploration and start a new one. For our algorithm, referred to as local exploration with optimal stopping (LEOS), we provide theoretical bounds on its long‐term regret per sample and its convergence time. We compare LEOS to state‐of‐the‐art learning algorithms based on multiarmed bandits and policy gradient RL. Considering different changing rates in the network conditions, our numerical results show that LEOS outperforms the first alternative by 22%, and the second one by 48% in terms of average regret per sample.
We propose an online reinforcement learning algorithm for adjusting the interference coordination parameters in an operating heterogeneous network. Our proposal combines elements from multi‐armed bandits, sequential hypothesis testing, and stochastic approximation. |
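The description above outlines the core LEOS idea: explore only a small subset of actions (an exploration region) at a time, and use a stopping rule to decide when the current exploration has gathered enough evidence to commit and move on. A minimal toy sketch of that loop follows — this is not the authors' algorithm; the bandit environment, region size, round count, and margin threshold are all illustrative assumptions standing in for the paper's optimal-stopping test:

```python
import random

def leos_sketch(arm_means, region_size=3, max_pulls=200, threshold=0.2, seed=0):
    """Toy 'local exploration with optimal stopping' loop (illustrative only).

    arm_means: true mean rewards of each configuration (unknown to the agent).
    Explores a window of `region_size` arms around the current best, pulling
    each in round-robin until the empirical leader's margin over the runner-up
    exceeds `threshold` (a crude stand-in for a sequential stopping test).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    best = 0
    for _ in range(5):  # a few exploration rounds
        lo = max(0, best - region_size // 2)
        region = list(range(lo, min(k, lo + region_size)))
        sums = {a: 0.0 for a in region}
        counts = {a: 0 for a in region}
        for t in range(max_pulls):
            a = region[t % len(region)]
            reward = arm_means[a] + rng.gauss(0.0, 0.1)  # noisy observation
            sums[a] += reward
            counts[a] += 1
            if min(counts.values()) >= 5:
                means = sorted((sums[b] / counts[b] for b in region), reverse=True)
                if len(means) < 2 or means[0] - means[1] > threshold:
                    break  # stopping rule: leader is clearly ahead, end exploration
        best = max(region, key=lambda b: sums[b] / counts[b])
    return best

# The agent converges to the arm with the highest true mean reward.
print(leos_sketch([0.1, 0.3, 0.9, 0.5, 0.2]))
```

In the paper the stopping decision is an optimal-stopping problem with theoretical regret guarantees; the fixed margin test here merely illustrates the explore-then-stop structure.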
doi_str_mv | 10.1002/ett.4087 |
format | Article |
rights | 2020 John Wiley & Sons, Ltd. |
orcidid | https://orcid.org/0000-0002-1756-0130 |
tpages | 24 |
fulltext | fulltext |
identifier | ISSN: 2161-3915 |
ispartof | Transactions on emerging telecommunications technologies, 2020-10, Vol.31 (10), p.n/a |
issn | 2161-3915 2161-3915 |
language | eng |
recordid | cdi_crossref_primary_10_1002_ett_4087 |
source | Wiley Online Library Journals Frontfile Complete |
title | Online reinforcement learning for adaptive interference coordination |