Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels

We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected with a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all th...

Bibliographic Details
Main Authors: Hao, Yuhang; Wang, Zengfu; Fu, Jing; Bai, Xianglong; Li, Can; Pan, Quan
Format: Article
Language: eng
Subjects:
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Hao, Yuhang
Wang, Zengfu
Fu, Jing
Bai, Xianglong
Li, Can
Pan, Quan
description We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected from a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets by sensibly selecting the transmitter-receiver pairs during the tracking period. A key step is to model the optimization problem of selecting the transmitter-receiver pairs as a restless multi-armed bandit (RMAB), which captures the time-varying signals of the transceiver channels whether or not the channels are being probed. We regard the estimated mean reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated mean reward of the arm is the weighted sum of the observed reward and the predicted mean reward; otherwise, it is the predicted mean reward. We associate the predicted mean reward with the estimated mean reward at the previous time slot and the state of the target, which is estimated via the interacting multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of transmitter-receiver pairs at each time is accomplished by Binary Particle Swarm Optimization (BPSO) based on the indices of the arms, each of which is designed by the upper confidence bound (UCB1) algorithm. Building on these components, we develop a multi-group combinatorial-restless-bandit technique, namely MG-CRB-CL, which takes into account the different combinations of transmitters and receivers and the closed-loop scheme between transmitter-receiver pair selection and target state estimation, to achieve a near-optimal selection strategy and improve multi-target tracking performance. Simulation results for different scenarios are provided to verify the effectiveness and superior performance of our MG-CRB-CL algorithm.
doi_str_mv 10.48550/arxiv.2306.09710
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2306_09710</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2306_09710</sourcerecordid><originalsourceid>FETCH-LOGICAL-a670-9cbeed3062cb48cdc7dfc46f76ca6aeaeda71eb65db0d08cf53a97fde485ef883</originalsourceid><addsrcrecordid>eNotkL9OwzAYxL0woMIDMOEXcHCaxE5GFP5VaonURmKMPtufVUupg2xTwcqTk7ZMt9yd7n6E3OU8K-uq4g8Qvt0xWxZcZLyROb8mv-10UM5DmoKDkQWMacQYmQJvXJoloqF9AB8PLiUMbIsa3RED7fzoPNIdjqiTmzy1U6BPLqbg1FeaU5vVpqNbMBAi_XBpT98nz3YJTmYIP7Tdg_c4xhtyZWGMePuvC9K_PPftG1t3r6v2cc1ASM4arRDNvHypVVlro6WxuhRWCg0CENCAzFGJyihueK1tVUAjrcH5Odq6Lhbk_lJ7hjB8BneYVwwnGMMZRvEHY8JexA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</title><source>arXiv.org</source><creator>Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan</creator><creatorcontrib>Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan</creatorcontrib><description>We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected with a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets by sensibly selecting the transmitter-receiver pairs during the tracking period. A key is to model the optimization problem of selecting the transmitter-receiver pairs by a restless multi-armed bandit (RMAB) model that is able to formulate the time-varying signals of the transceiver channels whenever the channels are being probed or not. We regard the estimated mean reward (i.e., SINR) as the state of an arm. 
If an arm is probed, the estimated mean reward of the arm is the weighted sum of the observed reward and the predicted mean reward; otherwise, it is the predicted mean reward. We associate the predicted mean reward with the estimated mean reward at the previous time slot and the state of the target, which is estimated via the interacting multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of transmitter-receiver pairs at each time is accomplished by using Binary Particle Swarm Optimization (BPSO) based on indexes of arms, each of which is designed by the upper confidence bound (UCB1) algorithm. Above all, a multi-group combinatorial-restless-bandit technique taking into account of different combinations of transmitters and receivers and the closed-loop scheme between transmitter-receiver pair selection and target state estimation, namely MG-CRB-CL, is developed to achieve a near-optimal selection strategy and improve multi-target tracking performance. Simulation results for different scenarios are provided to verify the effectiveness and superior performance of our MG-CRB-CL algorithm.</description><identifier>DOI: 10.48550/arxiv.2306.09710</identifier><language>eng</language><subject>Computer Science - Systems and Control</subject><creationdate>2023-06</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2306.09710$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2306.09710$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Hao, Yuhang</creatorcontrib><creatorcontrib>Wang, 
Zengfu</creatorcontrib><creatorcontrib>Fu, Jing</creatorcontrib><creatorcontrib>Bai, Xianglong</creatorcontrib><creatorcontrib>Li, Can</creatorcontrib><creatorcontrib>Pan, Quan</creatorcontrib><title>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</title><description>We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected with a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets by sensibly selecting the transmitter-receiver pairs during the tracking period. A key is to model the optimization problem of selecting the transmitter-receiver pairs by a restless multi-armed bandit (RMAB) model that is able to formulate the time-varying signals of the transceiver channels whenever the channels are being probed or not. We regard the estimated mean reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated mean reward of the arm is the weighted sum of the observed reward and the predicted mean reward; otherwise, it is the predicted mean reward. We associate the predicted mean reward with the estimated mean reward at the previous time slot and the state of the target, which is estimated via the interacting multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of transmitter-receiver pairs at each time is accomplished by using Binary Particle Swarm Optimization (BPSO) based on indexes of arms, each of which is designed by the upper confidence bound (UCB1) algorithm. 
Above all, a multi-group combinatorial-restless-bandit technique taking into account of different combinations of transmitters and receivers and the closed-loop scheme between transmitter-receiver pair selection and target state estimation, namely MG-CRB-CL, is developed to achieve a near-optimal selection strategy and improve multi-target tracking performance. Simulation results for different scenarios are provided to verify the effectiveness and superior performance of our MG-CRB-CL algorithm.</description><subject>Computer Science - Systems and Control</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotkL9OwzAYxL0woMIDMOEXcHCaxE5GFP5VaonURmKMPtufVUupg2xTwcqTk7ZMt9yd7n6E3OU8K-uq4g8Qvt0xWxZcZLyROb8mv-10UM5DmoKDkQWMacQYmQJvXJoloqF9AB8PLiUMbIsa3RED7fzoPNIdjqiTmzy1U6BPLqbg1FeaU5vVpqNbMBAi_XBpT98nz3YJTmYIP7Tdg_c4xhtyZWGMePuvC9K_PPftG1t3r6v2cc1ASM4arRDNvHypVVlro6WxuhRWCg0CENCAzFGJyihueK1tVUAjrcH5Odq6Lhbk_lJ7hjB8BneYVwwnGMMZRvEHY8JexA</recordid><startdate>20230616</startdate><enddate>20230616</enddate><creator>Hao, Yuhang</creator><creator>Wang, Zengfu</creator><creator>Fu, Jing</creator><creator>Bai, Xianglong</creator><creator>Li, Can</creator><creator>Pan, Quan</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230616</creationdate><title>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</title><author>Hao, Yuhang ; Wang, Zengfu ; Fu, Jing ; Bai, Xianglong ; Li, Can ; Pan, Quan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a670-9cbeed3062cb48cdc7dfc46f76ca6aeaeda71eb65db0d08cf53a97fde485ef883</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Systems and 
Control</topic><toplevel>online_resources</toplevel><creatorcontrib>Hao, Yuhang</creatorcontrib><creatorcontrib>Wang, Zengfu</creatorcontrib><creatorcontrib>Fu, Jing</creatorcontrib><creatorcontrib>Bai, Xianglong</creatorcontrib><creatorcontrib>Li, Can</creatorcontrib><creatorcontrib>Pan, Quan</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hao, Yuhang</au><au>Wang, Zengfu</au><au>Fu, Jing</au><au>Bai, Xianglong</au><au>Li, Can</au><au>Pan, Quan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels</atitle><date>2023-06-16</date><risdate>2023</risdate><abstract>We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected with a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets by sensibly selecting the transmitter-receiver pairs during the tracking period. A key is to model the optimization problem of selecting the transmitter-receiver pairs by a restless multi-armed bandit (RMAB) model that is able to formulate the time-varying signals of the transceiver channels whenever the channels are being probed or not. We regard the estimated mean reward (i.e., SINR) as the state of an arm. If an arm is probed, the estimated mean reward of the arm is the weighted sum of the observed reward and the predicted mean reward; otherwise, it is the predicted mean reward. 
We associate the predicted mean reward with the estimated mean reward at the previous time slot and the state of the target, which is estimated via the interacting multiple model-unscented Kalman filter (IMM-UKF). The optimized selection of transmitter-receiver pairs at each time is accomplished by using Binary Particle Swarm Optimization (BPSO) based on indexes of arms, each of which is designed by the upper confidence bound (UCB1) algorithm. Above all, a multi-group combinatorial-restless-bandit technique taking into account of different combinations of transmitters and receivers and the closed-loop scheme between transmitter-receiver pair selection and target state estimation, namely MG-CRB-CL, is developed to achieve a near-optimal selection strategy and improve multi-target tracking performance. Simulation results for different scenarios are provided to verify the effectiveness and superior performance of our MG-CRB-CL algorithm.</abstract><doi>10.48550/arxiv.2306.09710</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2306.09710
ispartof
issn
language eng
recordid cdi_arxiv_primary_2306_09710
source arXiv.org
subjects Computer Science - Systems and Control
title Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T05%3A05%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Combinatorial-restless-bandit-based%20Transmitter-Receiver%20Online%20Selection%20for%20Distributed%20MIMO%20Radars%20With%20Non-Stationary%20Channels&rft.au=Hao,%20Yuhang&rft.date=2023-06-16&rft_id=info:doi/10.48550/arxiv.2306.09710&rft_dat=%3Carxiv_GOX%3E2306_09710%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true