Online reinforcement learning for adaptive interference coordination
Heterogeneous networks (HetNets), in which small cells overlay macro cells, are a cost‐effective approach to increase the capacity of cellular networks. However, HetNets have raised new issues related to cell association and interference management. In particular, the optimal configuration of interference coordination (IC) parameters is a challenging task because it depends on multiple stochastic processes such as the locations of the users, the traffic demands, or the strength of the received signals. This work proposes a self‐optimization algorithm capable of finding the optimal configuration in an operating network.
Saved in:
Published in: | Transactions on emerging telecommunications technologies 2020-10, Vol.31 (10), p.n/a |
---|---|
Main authors: | Alcaraz, Juan J.; Ayala‐Romero, Jose A.; Vales‐Alonso, Javier; Losilla‐López, Fernando |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | n/a |
---|---|
container_issue | 10 |
container_start_page | |
container_title | Transactions on emerging telecommunications technologies |
container_volume | 31 |
creator | Alcaraz, Juan J.; Ayala‐Romero, Jose A.; Vales‐Alonso, Javier; Losilla‐López, Fernando |
description | Heterogeneous networks (HetNets), in which small cells overlay macro cells, are a cost‐effective approach to increase the capacity of cellular networks. However, HetNets have raised new issues related to cell association and interference management. In particular, the optimal configuration of interference coordination (IC) parameters is a challenging task because it depends on multiple stochastic processes such as the locations of the users, the traffic demands, or the strength of the received signals. This work proposes a self‐optimization algorithm capable of finding the optimal configuration in an operating network. We address the problem using a reinforcement learning (RL) approach, in which the actions are the IC configurations, whose performances are initially unknown. The main difficulty is that, due to the variable network conditions, the performance of each action may change over time. Our proposal is based on two main elements: the sequential exploration of subsets of actions (exploration regions), and an optimal stopping (OS) strategy for deciding when to end current exploration and start a new one. For our algorithm, referred to as local exploration with optimal stopping (LEOS), we provide theoretical bounds on its long‐term regret per sample and its convergence time. We compare LEOS to state‐of‐the‐art learning algorithms based on multiarmed bandits and policy gradient RL. Considering different changing rates in the network conditions, our numerical results show that LEOS outperforms the first alternative by 22%, and the second one by 48% in terms of average regret per sample.
We propose an online reinforcement learning algorithm for adjusting the interference coordination parameters in an operating heterogeneous network. Our proposal combines elements from multi‐armed bandits, sequential hypothesis testing, and stochastic approximation. |
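The description above outlines the core LEOS idea: explore only a small subset of actions (an exploration region) at a time, and use a stopping rule to decide when the current exploration has gathered enough evidence to commit and move on. A minimal toy sketch of that loop follows — this is not the authors' algorithm; the bandit environment, region size, round count, and margin threshold are all illustrative assumptions standing in for the paper's optimal-stopping test:

```python
import random

def leos_sketch(arm_means, region_size=3, max_pulls=200, threshold=0.2, seed=0):
    """Toy 'local exploration with optimal stopping' loop (illustrative only).

    arm_means: true mean rewards of each configuration (unknown to the agent).
    Explores a window of `region_size` arms around the current best, pulling
    each in round-robin until the empirical leader's margin over the runner-up
    exceeds `threshold` (a crude stand-in for a sequential stopping test).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    best = 0
    for _ in range(5):  # a few exploration rounds
        lo = max(0, best - region_size // 2)
        region = list(range(lo, min(k, lo + region_size)))
        sums = {a: 0.0 for a in region}
        counts = {a: 0 for a in region}
        for t in range(max_pulls):
            a = region[t % len(region)]
            reward = arm_means[a] + rng.gauss(0.0, 0.1)  # noisy observation
            sums[a] += reward
            counts[a] += 1
            if min(counts.values()) >= 5:
                means = sorted((sums[b] / counts[b] for b in region), reverse=True)
                if len(means) < 2 or means[0] - means[1] > threshold:
                    break  # stopping rule: leader is clearly ahead, end exploration
        best = max(region, key=lambda b: sums[b] / counts[b])
    return best

# The agent converges to the arm with the highest true mean reward.
print(leos_sketch([0.1, 0.3, 0.9, 0.5, 0.2]))
```

In the paper the stopping decision is an optimal-stopping problem with theoretical regret guarantees; the fixed margin test here merely illustrates the explore-then-stop structure.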
doi_str_mv | 10.1002/ett.4087 |
format | Article |
rights | 2020 John Wiley & Sons, Ltd. |
orcidid | https://orcid.org/0000-0002-1756-0130 |
tpages | 24 |
fulltext | fulltext |
identifier | ISSN: 2161-3915 |
ispartof | Transactions on emerging telecommunications technologies, 2020-10, Vol.31 (10), p.n/a |
issn | 2161-3915 2161-3915 |
language | eng |
recordid | cdi_crossref_primary_10_1002_ett_4087 |
source | Wiley Online Library Journals Frontfile Complete |
title | Online reinforcement learning for adaptive interference coordination |