Universal Learning Waveform Selection Strategies for Adaptive Target Tracking


Bibliographic details
Published in: IEEE Transactions on Aerospace and Electronic Systems, 2022-12, Vol. 58 (6), p. 5798-5814
Main authors: Thornton, Charles E., Buehrer, R. Michael, Dhillon, Harpreet S., Martone, Anthony F.
Format: Article
Language: English
Description
Abstract: Online selection of optimal waveforms for target tracking with active sensors has long been a problem of interest. Many conventional solutions utilize an estimation-theoretic interpretation, in which a waveform-specific Cramér-Rao lower bound on measurement error is used to select the optimal waveform for each tracking step. However, this approach is only valid in the high-SNR regime and requires a rather restrictive set of assumptions regarding the target motion and measurement models. Furthermore, due to computational concerns, many traditional approaches are limited to near-term, or myopic, optimization, even though radar scenes exhibit strong temporal correlation. More recently, reinforcement learning has been proposed for waveform selection, in which the problem is framed as a Markov decision process, allowing for long-term planning. However, a major limitation of reinforcement learning is that the memory length of the underlying Markov process is often unknown for realistic target and channel dynamics, so a more general framework is desirable. This work develops a universal sequential waveform selection scheme that asymptotically achieves Bellman optimality in any radar scene that can be modeled as a $U^{\text{th}}$-order Markov process for a finite, but unknown, integer $U$. Our approach is based on well-established tools from the field of universal source coding, where a stationary source is parsed into variable-length phrases in order to build a context tree, which is used as a probabilistic model for the scene's behavior. We show that an algorithm based on a multialphabet version of the context-tree weighting method can be used to optimally solve a broad class of waveform-agile tracking problems while making minimal assumptions about the environment's behavior.
ISSN: 0018-9251, 1557-9603
DOI: 10.1109/TAES.2022.3181554
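
The abstract's idea of parsing a source into variable-length phrases to build a tree that serves as a probabilistic model can be illustrated with a minimal Python sketch. This is not the multialphabet context-tree weighting algorithm of the article; it uses the simpler LZ78-style incremental parsing as a stand-in, with add-one smoothing for the predictions. The names `PhraseTreeNode`, `build_phrase_tree`, and `predict` are hypothetical, and the input symbols are assumed to be quantized scene states from a finite alphabet.

```python
class PhraseTreeNode:
    """One node of the phrase tree; children are keyed by symbol."""

    def __init__(self):
        self.children = {}
        self.counts = {}  # how often each symbol was observed at this node


def build_phrase_tree(sequence):
    """LZ78-style incremental parsing (illustrative stand-in, not the paper's method).

    Each phrase is the shortest prefix of the remaining input that has not
    already appeared as a phrase. Returns the tree root and the phrase list.
    """
    root = PhraseTreeNode()
    node, current, phrases = root, [], []
    for symbol in sequence:
        node.counts[symbol] = node.counts.get(symbol, 0) + 1
        current.append(symbol)
        if symbol in node.children:
            # Known continuation: keep extending the current phrase.
            node = node.children[symbol]
        else:
            # Unseen continuation: the phrase ends here and parsing restarts.
            node.children[symbol] = PhraseTreeNode()
            phrases.append(tuple(current))
            node, current = root, []
    return root, phrases


def predict(root, phrase_prefix, alphabet):
    """Add-one smoothed next-symbol distribution given the current phrase prefix."""
    node = root
    for symbol in phrase_prefix:
        if symbol not in node.children:
            break  # fall back to the deepest node reached so far
        node = node.children[symbol]
    total = sum(node.counts.values()) + len(alphabet)
    return {a: (node.counts.get(a, 0) + 1) / total for a in alphabet}


if __name__ == "__main__":
    root, phrases = build_phrase_tree("abababab")
    print(phrases)
    print(predict(root, "a", "ab"))
```

Because phrases grow by one symbol each time a context recurs, the depth of the tree adapts to the data, which is why schemes of this family do not need the memory length of the source in advance.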