Hierarchical Reinforcement Learning-Based Joint Allocation of Jamming Task and Power for Countering Networked Radar
The detection fusion and anti-jamming of a networked radar (NR) create a significant dynamic game between the NR and the jammer, making immediate jamming strategies that maximize the current jamming benefit unsuitable. Therefore, this paper proposes a long-term joint optimization problem of jamming...
Saved in:
Published in: | IEEE transactions on aerospace and electronic systems 2024-09, p.1-19 |
---|---|
Main authors: | Wang, Yuedong; Liang, Yan; Wang, Zengfu |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 19 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE transactions on aerospace and electronic systems |
container_volume | |
creator | Wang, Yuedong; Liang, Yan; Wang, Zengfu |
description | The detection fusion and anti-jamming of a networked radar (NR) create a significant dynamic game between the NR and the jammer, making immediate jamming strategies that maximize the current jamming benefit unsuitable. Therefore, this paper proposes a long-term joint optimization problem of jamming tasks and power allocation in the NR anti-jamming fusion game. Specifically, a jammer is utilized to disrupt the joint detection capability of the NR, which possesses detection fusion and jamming suppression mechanisms. By simulating task decomposition and hierarchical control ideas from human decision-making processes, a hierarchical reinforcement learning-based jamming resource allocation scheme is established. This scheme designs a hierarchical policy network with a shared evaluation network, achieving joint optimization of hybrid discrete (jamming task) and continuous (jamming power) control variables. The value loss, top-level policy loss, and low-level policy loss, which are constructed based on the total reward, are optimized to update the parameters of the evaluation network and hierarchical policy network, thereby improving allocation strategies. Moreover, the state features and total reward are suitably designed based on the jamming mission to aid the jammer's strategy exploration. Finally, our approach is compared with state-of-the-art deep reinforcement learning (DRL) algorithms and timely optimization methods, demonstrating superior jamming performance and shorter decision-making time under typical parameters considering radar deployment and formation motion. |
doi_str_mv | 10.1109/TAES.2024.3467041 |
format | Article |
fullrecord | publisher: IEEE; published: 2024-09-24; pages: 1-19; ISSN: 0018-9251; EISSN: 1557-9603; CODEN: IEARAX; DOI: 10.1109/TAES.2024.3467041; ORCID iDs: 0000-0003-0424-3010, 0000-0003-4798-4257, 0000-0002-7476-4201 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9251 |
ispartof | IEEE transactions on aerospace and electronic systems, 2024-09, p.1-19 |
issn | 0018-9251; 1557-9603 |
language | eng |
recordid | cdi_ieee_primary_10693358 |
source | IEEE Electronic Library (IEL) |
subjects | Airborne radar; Aircraft; Deep reinforcement learning; Jamming; networked radar; noise jamming; Optimization; Radar cross-sections; Radar detection; resource allocation; Resource management |
title | Hierarchical Reinforcement Learning-Based Joint Allocation of Jamming Task and Power for Countering Networked Radar |
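The abstract describes a hierarchical policy network with a shared evaluation network that jointly handles a discrete control variable (which jamming task to execute) and a continuous one (how much jamming power to emit). As a rough illustration of that hybrid action structure only, not the authors' implementation, here is a minimal NumPy sketch; the dimensions `N_RADARS`, `STATE_DIM`, and the power budget `P_MAX` are all hypothetical, and the linear heads stand in for the paper's policy and evaluation networks:

```python
import numpy as np

rng = np.random.default_rng(0)

N_RADARS = 4    # hypothetical number of radars in the network (= discrete tasks)
STATE_DIM = 8   # hypothetical dimension of the designed state features
P_MAX = 100.0   # hypothetical per-step jamming power budget

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

class HierarchicalPolicy:
    """Toy stand-in for the paper's architecture: a top-level head picks the
    discrete jamming task, a low-level head conditioned on that task emits a
    continuous power level, and a shared linear 'evaluation' head scores the
    state (all heads are untrained random linear maps here)."""

    def __init__(self, state_dim, n_tasks):
        self.W_top = rng.normal(0.0, 0.1, (n_tasks, state_dim))  # top-level policy
        self.W_low = rng.normal(0.0, 0.1, (n_tasks, state_dim))  # low-level policy
        self.w_val = rng.normal(0.0, 0.1, state_dim)             # shared critic

    def act(self, state):
        # Top level: categorical distribution over jamming tasks.
        probs = softmax(self.W_top @ state)
        task = int(rng.choice(len(probs), p=probs))
        # Low level: squash the task-conditioned head into (0, P_MAX).
        raw = self.W_low[task] @ state
        power = P_MAX / (1.0 + np.exp(-raw))
        return task, power

    def value(self, state):
        # Shared evaluation network: scalar state value used by all losses.
        return float(self.w_val @ state)

policy = HierarchicalPolicy(STATE_DIM, N_RADARS)
s = rng.normal(size=STATE_DIM)
task, power = policy.act(s)
print("task:", task, "power:", round(power, 2), "value:", round(policy.value(s), 3))
```

In the paper this factorization is what lets a single actor-critic update optimize the value loss and both policy losses from one total reward; the sketch only shows why the action space splits cleanly into a categorical choice plus a bounded continuous one.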