Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels
We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected with a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all th...
Main authors: | Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan |
---|---|
Format: | Article |
Language: | eng |

container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan |
description | We track moving targets with a distributed multiple-input multiple-output
(MIMO) radar, for which the transmitters and receivers are appropriately paired
and selected with a limited number of radar stations. We aim to maximize the
sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets
by sensibly selecting the transmitter-receiver pairs during the tracking
period. The key is to model the selection of
transmitter-receiver pairs as a restless multi-armed bandit (RMAB) problem, which
captures the time-varying signals of the transceiver channels
whether or not the channels are being probed. We regard the estimated mean
reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated
mean reward of the arm is the weighted sum of the observed reward and the
predicted mean reward; otherwise, it is the predicted mean reward. We associate
the predicted mean reward with the estimated mean reward at the previous time
slot and the state of the target, which is estimated via the interacting
multiple model-unscented Kalman filter (IMM-UKF). At each time slot, the
transmitter-receiver pairs are selected by binary particle swarm
optimization (BPSO) over the indexes of the arms, each of which is
computed with the upper confidence bound (UCB1) algorithm. Overall, we develop a
multi-group combinatorial-restless-bandit technique, named MG-CRB-CL, that
accounts for different combinations of transmitters and receivers and closes
the loop between transmitter-receiver pair selection and target state
estimation, in order to achieve a near-optimal selection strategy and
improve multi-target tracking performance. Simulation results for different
scenarios are provided to verify the effectiveness and superior performance of
our MG-CRB-CL algorithm. |
doi_str_mv | 10.48550/arxiv.2306.09710 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2306_09710</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2306_09710</sourcerecordid><originalsourceid>FETCH-LOGICAL-a670-9cbeed3062cb48cdc7dfc46f76ca6aeaeda71eb65db0d08cf53a97fde485ef883</originalsourceid><addsrcrecordid>eNotkL9OwzAYxL0woMIDMOEXcHCaxE5GFP5VaonURmKMPtufVUupg2xTwcqTk7ZMt9yd7n6E3OU8K-uq4g8Qvt0xWxZcZLyROb8mv-10UM5DmoKDkQWMacQYmQJvXJoloqF9AB8PLiUMbIsa3RED7fzoPNIdjqiTmzy1U6BPLqbg1FeaU5vVpqNbMBAi_XBpT98nz3YJTmYIP7Tdg_c4xhtyZWGMePuvC9K_PPftG1t3r6v2cc1ASM4arRDNvHypVVlro6WxuhRWCg0CENCAzFGJyihueK1tVUAjrcH5Odq6Lhbk_lJ7hjB8BneYVwwnGMMZRvEHY8JexA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</title><source>arXiv.org</source><creator>Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan</creator><creatorcontrib>Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan</creatorcontrib><description>We track moving targets with a distributed multiple-input multiple-output
(MIMO) radar, for which the transmitters and receivers are appropriately paired
and selected with a limited number of radar stations. We aim to maximize the
sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets
by sensibly selecting the transmitter-receiver pairs during the tracking
period. A key is to model the optimization problem of selecting the
transmitter-receiver pairs by a restless multi-armed bandit (RMAB) model that
is able to formulate the time-varying signals of the transceiver channels
whenever the channels are being probed or not. We regard the estimated mean
reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated
mean reward of the arm is the weighted sum of the observed reward and the
predicted mean reward; otherwise, it is the predicted mean reward. We associate
the predicted mean reward with the estimated mean reward at the previous time
slot and the state of the target, which is estimated via the interacting
multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of
transmitter-receiver pairs at each time is accomplished by using Binary
Particle Swarm Optimization (BPSO) based on indexes of arms, each of which is
designed by the upper confidence bound (UCB1) algorithm. Above all, a
multi-group combinatorial-restless-bandit technique taking into account of
different combinations of transmitters and receivers and the closed-loop scheme
between transmitter-receiver pair selection and target state estimation, namely
MG-CRB-CL, is developed to achieve a near-optimal selection strategy and
improve multi-target tracking performance. Simulation results for different
scenarios are provided to verify the effectiveness and superior performance of
our MG-CRB-CL algorithm.</description><identifier>DOI: 10.48550/arxiv.2306.09710</identifier><language>eng</language><subject>Computer Science - Systems and Control</subject><creationdate>2023-06</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2306.09710$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2306.09710$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Hao, Yuhang</creatorcontrib><creatorcontrib>Wang, Zengfu</creatorcontrib><creatorcontrib>Fu, Jing</creatorcontrib><creatorcontrib>Bai, Xianglong</creatorcontrib><creatorcontrib>Li, Can</creatorcontrib><creatorcontrib>Pan, Quan</creatorcontrib><title>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</title><description>We track moving targets with a distributed multiple-input multiple-output
(MIMO) radar, for which the transmitters and receivers are appropriately paired
and selected with a limited number of radar stations. We aim to maximize the
sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets
by sensibly selecting the transmitter-receiver pairs during the tracking
period. A key is to model the optimization problem of selecting the
transmitter-receiver pairs by a restless multi-armed bandit (RMAB) model that
is able to formulate the time-varying signals of the transceiver channels
whenever the channels are being probed or not. We regard the estimated mean
reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated
mean reward of the arm is the weighted sum of the observed reward and the
predicted mean reward; otherwise, it is the predicted mean reward. We associate
the predicted mean reward with the estimated mean reward at the previous time
slot and the state of the target, which is estimated via the interacting
multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of
transmitter-receiver pairs at each time is accomplished by using Binary
Particle Swarm Optimization (BPSO) based on indexes of arms, each of which is
designed by the upper confidence bound (UCB1) algorithm. Above all, a
multi-group combinatorial-restless-bandit technique taking into account of
different combinations of transmitters and receivers and the closed-loop scheme
between transmitter-receiver pair selection and target state estimation, namely
MG-CRB-CL, is developed to achieve a near-optimal selection strategy and
improve multi-target tracking performance. Simulation results for different
scenarios are provided to verify the effectiveness and superior performance of
our MG-CRB-CL algorithm.</description><subject>Computer Science - Systems and Control</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotkL9OwzAYxL0woMIDMOEXcHCaxE5GFP5VaonURmKMPtufVUupg2xTwcqTk7ZMt9yd7n6E3OU8K-uq4g8Qvt0xWxZcZLyROb8mv-10UM5DmoKDkQWMacQYmQJvXJoloqF9AB8PLiUMbIsa3RED7fzoPNIdjqiTmzy1U6BPLqbg1FeaU5vVpqNbMBAi_XBpT98nz3YJTmYIP7Tdg_c4xhtyZWGMePuvC9K_PPftG1t3r6v2cc1ASM4arRDNvHypVVlro6WxuhRWCg0CENCAzFGJyihueK1tVUAjrcH5Odq6Lhbk_lJ7hjB8BneYVwwnGMMZRvEHY8JexA</recordid><startdate>20230616</startdate><enddate>20230616</enddate><creator>Hao, Yuhang</creator><creator>Wang, Zengfu</creator><creator>Fu, Jing</creator><creator>Bai, Xianglong</creator><creator>Li, Can</creator><creator>Pan, Quan</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230616</creationdate><title>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</title><author>Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a670-9cbeed3062cb48cdc7dfc46f76ca6aeaeda71eb65db0d08cf53a97fde485ef883</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Systems and Control</topic><toplevel>online_resources</toplevel><creatorcontrib>Hao, Yuhang</creatorcontrib><creatorcontrib>Wang, Zengfu</creatorcontrib><creatorcontrib>Fu, Jing</creatorcontrib><creatorcontrib>Bai, Xianglong</creatorcontrib><creatorcontrib>Li, Can</creatorcontrib><creatorcontrib>Pan, Quan</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hao, 
Yuhang</au><au>Wang, Zengfu</au><au>Fu, Jing</au><au>Bai, Xianglong</au><au>Li, Can</au><au>Pan, Quan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</atitle><date>2023-06-16</date><risdate>2023</risdate><abstract>We track moving targets with a distributed multiple-input multiple-output
(MIMO) radar, for which the transmitters and receivers are appropriately paired
and selected with a limited number of radar stations. We aim to maximize the
sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets
by sensibly selecting the transmitter-receiver pairs during the tracking
period. A key is to model the optimization problem of selecting the
transmitter-receiver pairs by a restless multi-armed bandit (RMAB) model that
is able to formulate the time-varying signals of the transceiver channels
whenever the channels are being probed or not. We regard the estimated mean
reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated
mean reward of the arm is the weighted sum of the observed reward and the
predicted mean reward; otherwise, it is the predicted mean reward. We associate
the predicted mean reward with the estimated mean reward at the previous time
slot and the state of the target, which is estimated via the interacting
multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of
transmitter-receiver pairs at each time is accomplished by using Binary
Particle Swarm Optimization (BPSO) based on indexes of arms, each of which is
designed by the upper confidence bound (UCB1) algorithm. Above all, a
multi-group combinatorial-restless-bandit technique taking into account of
different combinations of transmitters and receivers and the closed-loop scheme
between transmitter-receiver pair selection and target state estimation, namely
MG-CRB-CL, is developed to achieve a near-optimal selection strategy and
improve multi-target tracking performance. Simulation results for different
scenarios are provided to verify the effectiveness and superior performance of
our MG-CRB-CL algorithm.</abstract><doi>10.48550/arxiv.2306.09710</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2306.09710 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2306_09710 |
source | arXiv.org |
subjects | Computer Science - Systems and Control |
title | Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T05%3A05%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Combinatorial-restless-bandit-based%20Transmitter-Receiver%20Online%20Selection%20for%20Distributed%20MIMO%20Radars%20With%20Non-Stationary%20Channels&rft.au=Hao,%20Yuhang&rft.date=2023-06-16&rft_id=info:doi/10.48550/arxiv.2306.09710&rft_dat=%3Carxiv_GOX%3E2306_09710%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
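
The arm-state update and index computation summarized in the description can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the fixed `weight`, and the textbook UCB1 bonus `sqrt(2 ln t / n)` are assumptions, the predicted mean reward is taken as a given input (the record does not state how it is derived from the IMM-UKF target state), and a plain top-k pick stands in for the BPSO-based combinatorial selection.

```python
import math
from typing import List, Optional, Sequence


def update_estimate(predicted: float, observed: Optional[float],
                    weight: float = 0.5) -> float:
    """Estimated mean reward (arm state) for one time slot.

    Probed arm (observed is not None): weighted sum of the observed
    reward and the predicted mean reward.
    Unprobed arm: the predicted mean reward is carried over unchanged.
    `weight` is a hypothetical mixing coefficient in [0, 1].
    """
    if observed is None:
        return predicted
    return weight * observed + (1.0 - weight) * predicted


def ucb1_index(estimate: float, pulls: int, t: int) -> float:
    """Textbook UCB1 index: estimate plus an exploration bonus.

    Assumption: the standard bonus sqrt(2 ln t / n) is used; arms that
    were never probed get an infinite index so they are tried first.
    """
    if pulls == 0:
        return math.inf
    return estimate + math.sqrt(2.0 * math.log(t) / pulls)


def top_k_arms(indexes: Sequence[float], k: int) -> List[int]:
    """Pick the k arms with the largest index.

    This greedy pick is a simplified stand-in for the BPSO search named
    in the description; it ignores pairing constraints between
    transmitters and receivers.
    """
    order = sorted(range(len(indexes)), key=lambda i: indexes[i], reverse=True)
    return sorted(order[:k])


# Usage: three arms (transmitter-receiver pairs), two of them probed.
predicted = [0.5, 0.25, 0.5]
observed = [1.0, None, 0.5]
pulls = [4, 0, 4]          # how often each arm has been probed so far
t = 9                      # current time slot

estimates = [update_estimate(p, o) for p, o in zip(predicted, observed)]
indexes = [ucb1_index(e, n, t) for e, n in zip(estimates, pulls)]
selected = top_k_arms(indexes, k=2)
print(selected)  # -> [0, 1]
```

In this toy run the never-probed arm 1 is selected because its index is infinite, which illustrates the exploration side of the index; how the actual method balances this against station limits is not specified in this record.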