Multi-Agent Reinforcement Learning for Dynamic Topology Optimization of Mesh Wireless Networks

In Mesh Wireless Networks (MWNs), network coverage is extended by connecting Access Points (APs) in a mesh topology, where frames forwarded over multi-hop routes must sustain performance metrics such as end-to-end (E2E) delay and channel efficiency. Several recent studies have focused on minimizing E2E delay, but these methods cannot adapt to the dynamic nature of MWNs. Reinforcement-learning-based methods offer better adaptability to such dynamics but suffer from high-dimensional action spaces, leading to slow convergence. In this paper, we propose a multi-agent actor-critic reinforcement learning (MACRL) algorithm to optimize multiple objectives, specifically the minimization of E2E delay and the enhancement of channel efficiency. First, to reduce the action space and speed up convergence in the dynamic optimization process, a centralized-critic-distributed-actor scheme is proposed. Then, a multi-objective reward balancing method is designed to dynamically balance MWN performance between E2E delay and channel efficiency. Finally, the trained MACRL algorithm is deployed in the QaulNet simulator to verify its effectiveness.
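The abstract names two mechanisms: a centralized-critic-distributed-actor scheme and a multi-objective reward that balances E2E delay against channel efficiency. As a rough illustration of those two ideas only, and not the paper's actual MACRL implementation, the PyTorch sketch below gives each AP a small local actor while a single critic scores the joint observation-action vector, and scalarizes the two objectives into one reward. All names (`Actor`, `CentralCritic`, `balanced_reward`, the weight `w`) and the layer sizes are hypothetical.

```python
# Hypothetical sketch of a centralized-critic, distributed-actor layout with a
# scalarized two-objective reward; not the authors' MACRL implementation.
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Per-AP policy: maps a local observation to logits over candidate links."""

    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Each actor conditions only on its own observation, so the
        # per-agent action space stays small as the mesh grows.
        return self.net(obs)


class CentralCritic(nn.Module):
    """Centralized critic: scores the concatenated observations and actions
    of all APs, so training uses global information while execution stays local."""

    def __init__(self, n_agents: int, obs_dim: int, act_dim: int):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(nn.Linear(joint_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, joint_obs_act: torch.Tensor) -> torch.Tensor:
        # Scalar value estimate for the joint state-action.
        return self.net(joint_obs_act)


def balanced_reward(e2e_delay: float, channel_eff: float, w: float = 0.5) -> float:
    """Scalarize the two objectives. The paper balances the trade-off
    dynamically; here w is a fixed constant purely for illustration."""
    return -w * e2e_delay + (1.0 - w) * channel_eff
```

This split is the property the abstract credits for faster convergence: the critic that needs the joint (high-dimensional) input exists only during training, while each deployed actor acts from local state alone.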

Bibliographic Details
Published in: IEEE Transactions on Wireless Communications, 2024-09, Vol. 23 (9), p. 10501-10513
Main authors: Sun, Wei; Lv, Qiushuo; Xiao, Yang; Liu, Zhi; Tang, Qingwei; Li, Qiyue; Mu, Daoming
Format: Article
Language: English
Subjects: Actor-critic; ad hoc wireless network; Algorithms; Convergence; Delay; Delays; Efficiency; Logic gates; Machine learning; mesh wireless network; Multiagent systems; Network topologies; Network topology; reinforcement learning; Topology; Topology optimization; Trajectory; Vectors; Wireless networks
Online access: Order full text
DOI: 10.1109/TWC.2024.3372694
ISSN: 1536-1276
EISSN: 1558-2248
Source: IEEE Electronic Library (IEL)