Multi-Agent Reinforcement Learning for Dynamic Topology Optimization of Mesh Wireless Networks
In Mesh Wireless Networks (MWNs), network coverage is extended by connecting Access Points (APs) in a mesh topology, where frames forwarded over multi-hop routes must sustain performance metrics such as end-to-end (E2E) delay and channel efficiency. Several recent studies have focused on minimizing E2E delay, but these methods cannot adapt to the dynamic nature of MWNs. Reinforcement-learning-based methods offer better adaptability to such dynamics but suffer from high-dimensional action spaces, which slow convergence. In this paper, we propose a multi-agent actor-critic reinforcement learning (MACRL) algorithm that optimizes multiple objectives, specifically minimizing E2E delay and enhancing channel efficiency. First, to reduce the action space and speed up convergence in the dynamic optimization process, a centralized-critic-distributed-actor scheme is proposed. Then, a multi-objective reward balancing method is designed to dynamically balance the MWN's performance between E2E delay and channel efficiency. Finally, the trained MACRL algorithm is deployed in the QaulNet simulator to verify its effectiveness.
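The centralized-critic-distributed-actor scheme named in the abstract follows the common centralized-training, decentralized-execution (CTDE) pattern. Below is a minimal sketch of that pattern in PyTorch; the agent count, observation and action dimensions, and network shapes are illustrative assumptions, not the paper's MACRL implementation.

```python
# Minimal centralized-critic / distributed-actor sketch in PyTorch.
# Illustrative assumptions only (agent count, dimensions, wiring);
# this is not the paper's MACRL implementation.
import torch
import torch.nn as nn

N_AGENTS = 4   # hypothetical number of mesh APs acting as agents
OBS_DIM = 8    # per-agent local observation (e.g., link qualities)
ACT_DIM = 3    # per-agent discrete action (e.g., candidate parent links)

class Actor(nn.Module):
    """Each AP runs its own small policy over its local observation,
    so the per-agent action space stays low-dimensional."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 32), nn.ReLU(), nn.Linear(32, ACT_DIM))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

class CentralCritic(nn.Module):
    """One critic scores the joint observation-action vector of all
    agents, which is what makes the training 'centralized'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_AGENTS * (OBS_DIM + ACT_DIM), 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, all_obs, all_act_onehot):
        joint = torch.cat([all_obs.flatten(), all_act_onehot.flatten()])
        return self.net(joint)

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

# One decentralized decision step: each actor sees only its local state.
obs = torch.randn(N_AGENTS, OBS_DIM)
actions = torch.stack([actors[i](obs[i]).sample() for i in range(N_AGENTS)])
value = critic(obs, nn.functional.one_hot(actions, ACT_DIM).float())
print(actions.tolist(), value.item())
```

Because each actor conditions only on local observations, execution stays fully distributed even though the critic required global state during training, which is what lets the per-agent action space stay small.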
Published in: | IEEE transactions on wireless communications 2024-09, Vol.23 (9), p.10501-10513 |
---|---|
Main authors: | Sun, Wei; Lv, Qiushuo; Xiao, Yang; Liu, Zhi; Tang, Qingwei; Li, Qiyue; Mu, Daoming |
Format: | Article |
Language: | eng |
Subjects: | Actor-critic; ad hoc wireless network; Algorithms; Convergence; Delay; Delays; Efficiency; Logic gates; Machine learning; mesh wireless network; Multiagent systems; Network topologies; Network topology; reinforcement learning; Topology; Topology optimization; Trajectory; Vectors; Wireless networks |
Online access: | Order full text |
container_end_page | 10513 |
---|---|
container_issue | 9 |
container_start_page | 10501 |
container_title | IEEE transactions on wireless communications |
container_volume | 23 |
creator | Sun, Wei; Lv, Qiushuo; Xiao, Yang; Liu, Zhi; Tang, Qingwei; Li, Qiyue; Mu, Daoming |
description | In Mesh Wireless Networks (MWNs), network coverage is extended by connecting Access Points (APs) in a mesh topology, where frames forwarded over multi-hop routes must sustain performance metrics such as end-to-end (E2E) delay and channel efficiency. Several recent studies have focused on minimizing E2E delay, but these methods cannot adapt to the dynamic nature of MWNs. Reinforcement-learning-based methods offer better adaptability to such dynamics but suffer from high-dimensional action spaces, which slow convergence. In this paper, we propose a multi-agent actor-critic reinforcement learning (MACRL) algorithm that optimizes multiple objectives, specifically minimizing E2E delay and enhancing channel efficiency. First, to reduce the action space and speed up convergence in the dynamic optimization process, a centralized-critic-distributed-actor scheme is proposed. Then, a multi-objective reward balancing method is designed to dynamically balance the MWN's performance between E2E delay and channel efficiency (an illustrative sketch of this balancing step follows at the end of this record). Finally, the trained MACRL algorithm is deployed in the QaulNet simulator to verify its effectiveness. |
doi_str_mv | 10.1109/TWC.2024.3372694 |
format | Article |
eissn | 1558-2248 |
coden | ITWCAX |
publisher | New York: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
orcidid | 0000-0003-0537-4522; 0000-0001-8549-6794; 0000-0003-4075-0597; 0000-0002-9399-8759; 0000-0002-1692-3796 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1536-1276 |
ispartof | IEEE transactions on wireless communications, 2024-09, Vol.23 (9), p.10501-10513 |
issn | 1536-1276; 1558-2248 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TWC_2024_3372694 |
source | IEEE Electronic Library (IEL) |
subjects | Actor-critic; ad hoc wireless network; Algorithms; Convergence; Delay; Delays; Efficiency; Logic gates; Machine learning; mesh wireless network; Multiagent systems; Network topologies; Network topology; reinforcement learning; Topology; Topology optimization; Trajectory; Vectors; Wireless networks |
title | Multi-Agent Reinforcement Learning for Dynamic Topology Optimization of Mesh Wireless Networks |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T05%3A55%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-Agent%20Reinforcement%20Learning%20for%20Dynamic%20Topology%20Optimization%20of%20Mesh%20Wireless%20Networks&rft.jtitle=IEEE%20transactions%20on%20wireless%20communications&rft.au=Sun,%20Wei&rft.date=2024-09-01&rft.volume=23&rft.issue=9&rft.spage=10501&rft.epage=10513&rft.pages=10501-10513&rft.issn=1536-1276&rft.eissn=1558-2248&rft.coden=ITWCAX&rft_id=info:doi/10.1109/TWC.2024.3372694&rft_dat=%3Cproquest_RIE%3E3102974480%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3102974480&rft_id=info:pmid/&rft_ieee_id=10466475&rfr_iscdi=true |
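The multi-objective reward balancing referenced in the abstract and description fields above can be illustrated as a weighted scalarization of the two objectives, with the weight shifted toward whichever objective currently misses its target. The update rule, the 20 ms delay target, and the function names below are generic assumptions for illustration, not the paper's specific balancing method.

```python
# Sketch of balancing an E2E-delay objective against channel
# efficiency via an adaptive weighted-sum reward. The weight-update
# rule and the delay target are assumptions, not the paper's method.

def balanced_reward(e2e_delay_ms, channel_eff, w_delay, target_ms=20.0):
    """Scalarize the two objectives: lower delay is better (negated and
    normalized by the target), higher channel efficiency (0..1) is better."""
    delay_score = -e2e_delay_ms / target_ms
    return w_delay * delay_score + (1.0 - w_delay) * channel_eff

def update_weight(w_delay, e2e_delay_ms, target_ms=20.0, step=0.05):
    """Shift emphasis toward the violated objective: raise the delay
    weight while measured delay exceeds its target, lower it otherwise."""
    if e2e_delay_ms > target_ms:
        return min(1.0, w_delay + step)
    return max(0.0, w_delay - step)

# Example: delay above target pulls the weight toward delay reduction.
w = 0.5
for delay, eff in [(35.0, 0.60), (28.0, 0.65), (18.0, 0.70)]:
    r = balanced_reward(delay, eff, w)
    w = update_weight(w, delay)
    print(f"delay={delay:5.1f} ms  eff={eff:.2f}  reward={r:+.3f}  w_delay={w:.2f}")
```

A dynamically adjusted weight like this lets a single scalar reward track shifting network conditions, which matches the abstract's stated goal of balancing the two performance metrics at run time.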