Deep Reinforcement Learning for Dynamic Spectrum Access: Convergence Analysis and System Design

In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since SU-PU interaction is limited, deep reinforcement learning has been introduced to help SUs conduct spectrum access.
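The setting sketched above can be viewed as a simple sequential decision problem: in each time slot an SU senses and transmits on one of several PU channels, earning a positive reward when the chosen channel is idle and a penalty when it collides with the PU. Below is a minimal sketch of such an environment; the channel count, Markov busy/idle dynamics, and reward values are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

class DSAChannelEnv:
    """Toy dynamic spectrum access environment (illustrative only).

    Each of n_channels PU channels follows an independent two-state
    (idle/busy) Markov chain. In every slot the SU transmits on one
    channel and earns +1 if that channel is idle, -1 if it collides
    with the PU. All parameters are made up for illustration.
    """

    def __init__(self, n_channels=4, p_stay_idle=0.8, p_stay_busy=0.7, seed=0):
        self.n_channels = n_channels
        self.p_stay_idle = p_stay_idle
        self.p_stay_busy = p_stay_busy
        self.rng = np.random.default_rng(seed)
        self.idle = self.rng.random(n_channels) < 0.5  # True = channel idle

    def _evolve(self):
        # Each channel keeps its current state with the corresponding "stay" probability.
        stay = np.where(self.idle, self.p_stay_idle, self.p_stay_busy)
        keep = self.rng.random(self.n_channels) < stay
        self.idle = np.where(keep, self.idle, ~self.idle)

    def step(self, action):
        """SU transmits on channel `action`; returns (observation, reward)."""
        self._evolve()
        reward = 1.0 if self.idle[action] else -1.0
        # The SU only senses the channel it used, so the channel state is only
        # partially observed -- the reason recurrent Q-networks are used here.
        obs = np.zeros(self.n_channels)
        obs[action] = reward
        return obs, reward
```

An episode then consists of repeatedly calling step() with the action chosen by the learning agent.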


Bibliographic Details
Published in: IEEE Transactions on Wireless Communications, 2024-12, Vol. 23 (12), p. 18888-18902
Main Authors: Safavinejad, Ramin; Chang, Hao-Hsuan; Liu, Lingjia
Format: Article
Language: English
Online Access: Order full text
Description: In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since SU-PU interaction is limited, deep reinforcement learning has been introduced to help SUs conduct spectrum access. Specifically, the deep recurrent Q-network (DRQN) has been utilized in DSA networks so that SUs can aggregate information from recent experiences when making spectrum access decisions. DRQN, however, is notorious for its poor sample efficiency: it needs a rather large number of training samples to tune its parameters, which is computationally demanding. The deep echo state network (DEQN) has been introduced for DSA networks to address this sample efficiency issue. In this work, we compare the convergence of DRQN and DEQN by comparing the upper bounds we obtain on their covering numbers, a notion of model richness. Furthermore, we introduce a method to determine the right hyper-parameters for DEQN, providing system design guidance for DEQN-based DSA networks. Extensive performance evaluation confirms that the DEQN-based DSA strategy is the superior choice in terms of computational cost while outperforming DRQN-based strategies.
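The distinguishing idea behind an echo-state-network-based Q-agent, as opposed to DRQN, is that the recurrent part of the network is a fixed random reservoir that provides memory over recent observations, while only a linear readout producing the Q-values is trained; this is what keeps the sample and computation requirements low. The sketch below illustrates that general idea; it is not the paper's DEQN implementation, and the reservoir size, spectral radius, learning rate, and update rule are illustrative assumptions.

```python
import numpy as np

class ESNQAgent:
    """Minimal echo-state-network Q-agent (illustrative sketch, not the paper's DEQN).

    A fixed random reservoir summarizes the observation history; only the
    linear readout mapping the reservoir state to Q-values is trained, so
    far fewer samples and much less computation are needed than for a
    fully trained recurrent Q-network such as DRQN.
    """

    def __init__(self, obs_dim, n_actions, reservoir_size=64,
                 spectral_radius=0.9, lr=0.01, gamma=0.95, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W_in = self.rng.normal(0.0, 1.0, (reservoir_size, obs_dim))
        W = self.rng.normal(0.0, 1.0, (reservoir_size, reservoir_size))
        # Rescale so the largest eigenvalue magnitude equals spectral_radius < 1
        # (echo state property); W_in and W_res stay fixed after initialization.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W_res = W
        self.W_out = np.zeros((n_actions, reservoir_size))  # the only trained weights
        self.state = np.zeros(reservoir_size)
        self.lr, self.gamma, self.n_actions = lr, gamma, n_actions

    def update_state(self, obs):
        """Fold the latest observation into the reservoir (the recurrent memory)."""
        self.state = np.tanh(self.W_in @ obs + self.W_res @ self.state)
        return self.state.copy()

    def act(self, state, epsilon=0.1):
        """Epsilon-greedy action over the readout's Q-values."""
        if self.rng.random() < epsilon:
            return int(self.rng.integers(self.n_actions))
        return int(np.argmax(self.W_out @ state))

    def td_update(self, state, action, reward, next_state):
        """One-step Q-learning update applied only to the linear readout."""
        target = reward + self.gamma * np.max(self.W_out @ next_state)
        td_error = target - self.W_out[action] @ state
        self.W_out[action] += self.lr * td_error * state
```

Paired with an environment like the DSAChannelEnv sketched earlier, a training loop alternates update_state, act, env.step, and td_update. Hyper-parameters such as reservoir_size and spectral_radius are the kind of DEQN design choices the paper's guidance targets, but the values used here are arbitrary.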
DOI: 10.1109/TWC.2024.3414428
ISSN: 1536-1276
EISSN: 1558-2248
Source: IEEE/IET Electronic Library (IEL)
Subjects:
5G beyond and 6G
5G mobile communication
6G mobile communication
Convergence
covering numbers
Deep learning
Deep reinforcement learning (DRL)
dynamic spectrum access (DSA)
echo state network (ESN)
Networks
Parameters
Performance evaluation
Radio spectra
recurrent neural network
Spectrum allocation
Systems design
Training
Training data
Upper bound
Upper bounds
Wireless communication