Universal Learning Waveform Selection Strategies for Adaptive Target Tracking
Published in: | IEEE Transactions on Aerospace and Electronic Systems, 2022-12, Vol. 58 (6), pp. 5798-5814 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Online selection of optimal waveforms for target tracking with active sensors has long been a problem of interest. Many conventional solutions use an estimation-theoretic interpretation, in which a waveform-specific Cramér-Rao lower bound on measurement error is used to select the optimal waveform for each tracking step. However, this approach is only valid in the high-SNR regime and requires a rather restrictive set of assumptions regarding the target motion and measurement models. Furthermore, due to computational concerns, many traditional approaches are limited to near-term, or myopic, optimization, even though radar scenes exhibit strong temporal correlation. More recently, reinforcement learning has been proposed for waveform selection, in which the problem is framed as a Markov decision process, allowing for long-term planning. However, a major limitation of reinforcement learning is that the memory length of the underlying Markov process is often unknown for realistic target and channel dynamics, so a more general framework is desirable. This work develops a universal sequential waveform selection scheme that asymptotically achieves Bellman optimality in any radar scene that can be modeled as a $U^{\text{th}}$-order Markov process for a finite, but unknown, integer $U$. Our approach is based on well-established tools from the field of universal source coding, where a stationary source is parsed into variable-length phrases in order to build a context tree, which is used as a probabilistic model for the scene's behavior. We show that an algorithm based on a multialphabet version of the context-tree weighting method can be used to optimally solve a broad class of waveform-agile tracking problems while making minimal assumptions about the environment's behavior. |
---|---|
ISSN: | 0018-9251, 1557-9603 |
DOI: | 10.1109/TAES.2022.3181554 |
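
The phrase parsing mentioned in the abstract can be illustrated with a minimal sketch. It assumes an LZ78-style incremental parse over a finite alphabet of quantized scene states, which is one common way a source is split into variable-length phrases that form a tree. The function name `lz78_parse` and the toy state sequence are hypothetical, and the sketch is not the multialphabet context-tree weighting algorithm that the paper develops.

```python
def lz78_parse(sequence):
    """Split a symbol sequence into LZ78-style phrases (illustrative only).

    Each new phrase is the shortest prefix of the remaining input that has
    not been seen as a phrase before. The phrases are stored in a trie,
    represented as nested dicts mapping symbol -> child node.
    """
    root = {}          # trie root; every completed phrase is a path from here
    phrases = []       # completed phrases, in order of appearance
    node = root        # current position in the trie
    current = []       # symbols of the phrase being built

    for sym in sequence:
        current.append(sym)
        if sym in node:
            # Still inside a previously seen phrase: descend one level.
            node = node[sym]
        else:
            # New phrase found: add a leaf and restart from the root.
            node[sym] = {}
            phrases.append(tuple(current))
            node = root
            current = []

    if current:
        # Trailing symbols that did not complete a new phrase.
        phrases.append(tuple(current))
    return phrases, root


# Toy sequence of quantized scene states (hypothetical data).
states = [0, 1, 0, 1, 1, 0, 0]
phrases, tree = lz78_parse(states)
print(phrases)
```

In this sketch, the depth of the tree grows with the amount of data observed, which is what lets a tree-based model adapt to an unknown memory length rather than fixing the Markov order in advance.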