Deep Reinforcement Learning for Dynamic Spectrum Access: Convergence Analysis and System Design

In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since SU-PU interaction is limited, deep reinforcement learning has been introduced to help SUs conduct spectrum access.
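The setting sketched above can be viewed as a simple sequential decision problem: in each time slot an SU senses and transmits on one of several PU channels, earning a positive reward when the chosen channel is idle and a penalty when it collides with the PU. Below is a minimal sketch of such an environment; the channel count, Markov busy/idle dynamics, and reward values are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

class DSAChannelEnv:
    """Toy dynamic spectrum access environment (illustrative only).

    Each of n_channels PU channels follows an independent two-state
    (idle/busy) Markov chain. In every slot the SU transmits on one
    channel and earns +1 if that channel is idle, -1 if it collides
    with the PU. All parameters are made up for illustration.
    """

    def __init__(self, n_channels=4, p_stay_idle=0.8, p_stay_busy=0.7, seed=0):
        self.n_channels = n_channels
        self.p_stay_idle = p_stay_idle
        self.p_stay_busy = p_stay_busy
        self.rng = np.random.default_rng(seed)
        self.idle = self.rng.random(n_channels) < 0.5  # True = channel idle

    def _evolve(self):
        # Each channel keeps its current state with the corresponding "stay" probability.
        stay = np.where(self.idle, self.p_stay_idle, self.p_stay_busy)
        keep = self.rng.random(self.n_channels) < stay
        self.idle = np.where(keep, self.idle, ~self.idle)

    def step(self, action):
        """SU transmits on channel `action`; returns (observation, reward)."""
        self._evolve()
        reward = 1.0 if self.idle[action] else -1.0
        # The SU only senses the channel it used, so the channel state is only
        # partially observed -- the reason recurrent Q-networks are used here.
        obs = np.zeros(self.n_channels)
        obs[action] = reward
        return obs, reward
```

An episode then consists of repeatedly calling step() with the action chosen by the learning agent.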


Bibliographic Details
Published in: IEEE Transactions on Wireless Communications, 2024-12, Vol. 23 (12), p. 18888-18902
Main Authors: Safavinejad, Ramin; Chang, Hao-Hsuan; Liu, Lingjia
Format: Article
Language: English
Online Access: Order full text
Description: In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since SU-PU interaction is limited, deep reinforcement learning has been introduced to help SUs conduct spectrum access. Specifically, the deep recurrent Q-network (DRQN) has been utilized in DSA networks so that SUs can aggregate information from recent experiences when making spectrum access decisions. DRQN, however, is notorious for its poor sample efficiency: it needs a rather large number of training samples to tune its parameters, which is computationally demanding. The deep echo state network (DEQN) has been introduced for DSA networks to address this sample efficiency issue. In this work, we compare the convergence of DRQN and DEQN by comparing the upper bounds we obtain on their covering numbers, a notion of model richness. Furthermore, we introduce a method to determine the right hyper-parameters for DEQN, providing system design guidance for DEQN-based DSA networks. Extensive performance evaluation confirms that the DEQN-based DSA strategy is the superior choice in terms of computational cost while outperforming DRQN-based strategies.
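The distinguishing idea behind an echo-state-network-based Q-agent, as opposed to DRQN, is that the recurrent part of the network is a fixed random reservoir that provides memory over recent observations, while only a linear readout producing the Q-values is trained; this is what keeps the sample and computation requirements low. The sketch below illustrates that general idea; it is not the paper's DEQN implementation, and the reservoir size, spectral radius, learning rate, and update rule are illustrative assumptions.

```python
import numpy as np

class ESNQAgent:
    """Minimal echo-state-network Q-agent (illustrative sketch, not the paper's DEQN).

    A fixed random reservoir summarizes the observation history; only the
    linear readout mapping the reservoir state to Q-values is trained, so
    far fewer samples and much less computation are needed than for a
    fully trained recurrent Q-network such as DRQN.
    """

    def __init__(self, obs_dim, n_actions, reservoir_size=64,
                 spectral_radius=0.9, lr=0.01, gamma=0.95, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W_in = self.rng.normal(0.0, 1.0, (reservoir_size, obs_dim))
        W = self.rng.normal(0.0, 1.0, (reservoir_size, reservoir_size))
        # Rescale so the largest eigenvalue magnitude equals spectral_radius < 1
        # (echo state property); W_in and W_res stay fixed after initialization.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W_res = W
        self.W_out = np.zeros((n_actions, reservoir_size))  # the only trained weights
        self.state = np.zeros(reservoir_size)
        self.lr, self.gamma, self.n_actions = lr, gamma, n_actions

    def update_state(self, obs):
        """Fold the latest observation into the reservoir (the recurrent memory)."""
        self.state = np.tanh(self.W_in @ obs + self.W_res @ self.state)
        return self.state.copy()

    def act(self, state, epsilon=0.1):
        """Epsilon-greedy action over the readout's Q-values."""
        if self.rng.random() < epsilon:
            return int(self.rng.integers(self.n_actions))
        return int(np.argmax(self.W_out @ state))

    def td_update(self, state, action, reward, next_state):
        """One-step Q-learning update applied only to the linear readout."""
        target = reward + self.gamma * np.max(self.W_out @ next_state)
        td_error = target - self.W_out[action] @ state
        self.W_out[action] += self.lr * td_error * state
```

Paired with an environment like the DSAChannelEnv sketched earlier, a training loop alternates update_state, act, env.step, and td_update. Hyper-parameters such as reservoir_size and spectral_radius are the kind of DEQN design choices the paper's guidance targets, but the values used here are arbitrary.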
DOI: 10.1109/TWC.2024.3414428
ISSN: 1536-1276
EISSN: 1558-2248
Source: IEEE/IET Electronic Library (IEL)
Subjects:
5G beyond and 6G
5G mobile communication
6G mobile communication
Convergence
covering numbers
Deep learning
Deep reinforcement learning (DRL)
dynamic spectrum access (DSA)
echo state network (ESN)
Networks
Parameters
Performance evaluation
Radio spectra
recurrent neural network
Spectrum allocation
Systems design
Training
Training data
Upper bound
Upper bounds
Wireless communication