Actor-Critic Methods for IRS Design in Correlated Channel Environments: A Closer Look Into the Neural Tangent Kernel of the Critic

The article studies the design of an Intelligent Reflecting Surface (IRS) in order to support a Multiple-Input-Single-Output (MISO) communication system operating in a mobile, spatiotemporally correlated channel environment. The design objective is to maximize the expected sum of Signal-to-Noise Rat...

Bibliographic Details
Published in: IEEE Transactions on Signal Processing, 2023, Vol. 71, pp. 4029-4044
Main authors: Evmorfos, Spilios; Petropulu, Athina P.; Poor, H. Vincent
Format: Article
Language: English
container_end_page 4044
container_issue
container_start_page 4029
container_title IEEE transactions on signal processing
container_volume 71
creator Evmorfos, Spilios
Petropulu, Athina P.
Poor, H. Vincent
description The article studies the design of an Intelligent Reflecting Surface (IRS) in order to support a Multiple-Input-Single-Output (MISO) communication system operating in a mobile, spatiotemporally correlated channel environment. The design objective is to maximize the expected sum of Signal-to-Noise Ratio (SNR) at the receiver over an infinite time horizon. The problem formulation gives rise to a Markov Decision Process (MDP). We propose an actor-critic algorithm for continuous control that accounts for both channel correlations and destination motion by constructing the state of the Reinforcement Learning algorithm to include history of destination positions and IRS phases. To account for the variability of the underlying value function, arising due to the channel variability, we propose to pre-process the input of the critic with a Fourier kernel, which enables stability in the process of neural value approximation. We also examine the use of the destination SNR as a component of the designed MDP state, which constitutes common practice in previous works. We empirically show that, when the channels are spatiotemporally varying, including the SNR in the state representation causes divergence. We provide insight on the aforementioned divergence by demonstrating the effect of the SNR inclusion on the Neural Tangent Kernel of the critic network. Based on our study, we propose a framework for designing actor-critic methods for IRS design and also for more general problems, that is predicated upon sufficient conditions of the critic's Neural Tangent Kernel for convergence under neural value learning.
doi_str_mv 10.1109/TSP.2023.3322830
format Article
fullrecord recordid: TN_cdi_ieee_primary_10285578
sourceid: proquest_RIE
ieee_id: 10285578
pqid: 2889728508
recordtype: article
identifier: ISSN: 1053-587X; EISSN: 1941-0476; DOI: 10.1109/TSP.2023.3322830; CODEN: ITPRED
publisher: New York: IEEE
rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
toplevel: peer_reviewed; online_resources
orcid: https://orcid.org/0000-0001-7380-7815; https://orcid.org/0000-0002-2062-131X; https://orcid.org/0000-0002-9899-4562
linktohtml: https://ieeexplore.ieee.org/document/10285578
stitle: TSP
genre: article
ristype: JOUR
creationdate: 2023
tpages: 16
collection: IEEE All-Society Periodicals Package (ASPP) 2005-present; IEEE All-Society Periodicals Package (ASPP) 1998-Present; IEEE Electronic Library (IEL); CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional
delcategory: Remote Search Resource
fulltext fulltext_linktorsrc
identifier ISSN: 1053-587X
ispartof IEEE transactions on signal processing, 2023, Vol.71, p.4029-4044
issn 1053-587X
1941-0476
language eng
recordid cdi_ieee_primary_10285578
source IEEE Electronic Library (IEL)
subjects Algorithms
Communications systems
Correlation
Deep learning
Divergence
Intelligent Reflecting Surfaces
IRS parameter design
Kernel
Kernels
Machine learning
Markov processes
Neural Tangent Kernels
reinforcement learning
Signal processing algorithms
Signal to noise ratio
Spatiotemporal phenomena
Supervised learning
title Actor-Critic Methods for IRS Design in Correlated Channel Environments: A Closer Look Into the Neural Tangent Kernel of the Critic
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T02%3A30%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Actor-Critic%20Methods%20for%20IRS%20Design%20in%20Correlated%20Channel%20Environments:%20A%20Closer%20Look%20Into%20the%20Neural%20Tangent%20Kernel%20of%20the%20Critic&rft.jtitle=IEEE%20transactions%20on%20signal%20processing&rft.au=Evmorfos,%20Spilios&rft.date=2023&rft.volume=71&rft.spage=4029&rft.epage=4044&rft.pages=4029-4044&rft.issn=1053-587X&rft.eissn=1941-0476&rft.coden=ITPRED&rft_id=info:doi/10.1109/TSP.2023.3322830&rft_dat=%3Cproquest_RIE%3E2889728508%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2889728508&rft_id=info:pmid/&rft_ieee_id=10285578&rfr_iscdi=true
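
The description field of this record mentions pre-processing the critic's input with a Fourier kernel and studying the critic's Neural Tangent Kernel. The record does not give the exact construction, so the following is only a minimal sketch of one common variant, a random Fourier feature mapping, and not the authors' implementation. The function names, the frequency scale, the number of frequencies, and the layout of the example input (past destination positions followed by IRS phases) are all assumptions made for illustration. For a model that is linear in these features, the tangent kernel reduces to the inner product of the feature vectors, which depends only on the difference between the two inputs.

```python
import math
import random


def make_frequencies(in_dim, n_freqs, scale=1.0, seed=0):
    """Draw a fixed random frequency matrix (n_freqs x in_dim) from N(0, scale^2).

    The Gaussian choice and the scale are assumptions, not taken from the record.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(n_freqs)]


def fourier_features(x, freqs):
    """Map x to [cos(2*pi*b.x) for each row b] followed by [sin(2*pi*b.x) for each row b]."""
    proj = [2.0 * math.pi * sum(b_i * x_i for b_i, x_i in zip(b, x)) for b in freqs]
    return [math.cos(p) for p in proj] + [math.sin(p) for p in proj]


def feature_kernel(x1, x2, freqs):
    """Inner product of the two feature vectors.

    For a model that is linear in the Fourier features (f(x) = w . phi(x)),
    the gradient with respect to w is phi(x), so this inner product is that
    model's tangent kernel.
    """
    f1 = fourier_features(x1, freqs)
    f2 = fourier_features(x2, freqs)
    return sum(a * b for a, b in zip(f1, f2))


# Hypothetical critic input: two past destination positions (x, y) and two IRS phases.
state_action = [0.1, 0.4, 0.2, 0.5, 1.3, 2.1]
freqs = make_frequencies(in_dim=len(state_action), n_freqs=8)
encoded = fourier_features(state_action, freqs)  # would be fed to the critic network
```

In this sketch the encoded vector has twice as many entries as there are frequencies, and every entry lies in [-1, 1] regardless of the scale of the raw input. A deeper critic placed on top of the mapping would have a different tangent kernel; the linear case is shown only because it can be written in closed form.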