Universal Learning Waveform Selection Strategies for Adaptive Target Tracking

Bibliographic details
Published in:	IEEE Transactions on Aerospace and Electronic Systems, 2022-12, Vol. 58 (6), pp. 5798-5814
Main authors:	Thornton, Charles E.; Buehrer, R. Michael; Dhillon, Harpreet S.; Martone, Anthony F.
Format:	Article
Language:	English
Subjects:	Algorithms; Behavioral sciences; Context; Cramér-Rao bounds; Error analysis; Interference mitigation; Learning; Lower bounds; Markov analysis; Markov processes; Optimization; Probabilistic models; Radar; Radar tracking; Radar waveform selection; Reinforcement learning; Sensors; Target tracking; Tracking; Universal prediction; Waveforms; Weighting methods

Description
Summary:	Online selection of optimal waveforms for target tracking with active sensors has long been a problem of interest. Many conventional solutions utilize an estimation-theoretic interpretation, in which a waveform-specific Cramér-Rao lower bound on measurement error is used to select the optimal waveform for each tracking step. However, this approach is only valid in the high-SNR regime and requires a rather restrictive set of assumptions regarding the target motion and measurement models. Furthermore, due to computational concerns, many traditional approaches are limited to near-term, or myopic, optimization, even though radar scenes exhibit strong temporal correlation. More recently, reinforcement learning has been proposed for waveform selection, in which the problem is framed as a Markov decision process, allowing for long-term planning. However, a major limitation of reinforcement learning is that the memory length of the underlying Markov process is often unknown for realistic target and channel dynamics, and a more general framework is desirable. This work develops a universal sequential waveform selection scheme that asymptotically achieves Bellman optimality in any radar scene that can be modeled as a $U^{\text{th}}$-order Markov process for a finite, but unknown, integer $U$. Our approach is based on well-established tools from the field of universal source coding, where a stationary source is parsed into variable-length phrases in order to build a context tree, which is used as a probabilistic model for the scene's behavior. We show that an algorithm based on a multialphabet version of the context-tree weighting method can be used to optimally solve a broad class of waveform-agile tracking problems while making minimal assumptions about the environment's behavior.
ISSN:	0018-9251 1557-9603
DOI:	10.1109/TAES.2022.3181554
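
Illustration:	The summary describes building a context tree over a scene's past behavior and using it as a probabilistic model over a multisymbol alphabet. The Python sketch below is a simplified illustration of that idea and is not the article's algorithm: it keeps symbol counts for every context up to a fixed maximum depth and predicts from the longest context seen so far, using the Krichevsky-Trofimov (add-one-half) estimator. It does not perform the weighting over tree depths that the context-tree weighting method uses. The names `ContextTreeModel`, `kt_probability`, `alphabet_size`, and `max_depth` are hypothetical, and the integer symbols stand in for an assumed quantization of scene states.

```python
from collections import defaultdict


def kt_probability(counts, symbol, alphabet_size):
    """Krichevsky-Trofimov estimate: (n_symbol + 1/2) / (n_total + M/2)."""
    total = sum(counts.values())
    return (counts.get(symbol, 0) + 0.5) / (total + alphabet_size / 2)


class ContextTreeModel:
    """Hypothetical, simplified context model (no depth weighting)."""

    def __init__(self, alphabet_size, max_depth):
        self.alphabet_size = alphabet_size
        self.max_depth = max_depth
        # Maps a context (tuple of recent symbols) to next-symbol counts.
        self.counts = defaultdict(lambda: defaultdict(int))

    def _context(self, history, depth):
        # The last `depth` symbols of the history; depth 0 gives ().
        return tuple(history[len(history) - depth:])

    def update(self, history, symbol):
        """Record `symbol` under every context of depth 0..max_depth."""
        for depth in range(min(self.max_depth, len(history)) + 1):
            self.counts[self._context(history, depth)][symbol] += 1

    def predict(self, history, symbol):
        """Probability of `symbol` given the longest context seen so far."""
        for depth in range(min(self.max_depth, len(history)), -1, -1):
            context = self._context(history, depth)
            if context in self.counts:
                return kt_probability(
                    self.counts[context], symbol, self.alphabet_size
                )
        return 1.0 / self.alphabet_size  # nothing observed yet


# Usage: an alternating binary sequence, standing in for quantized scene states.
sequence = [0, 1, 0, 1, 0, 1]
model = ContextTreeModel(alphabet_size=2, max_depth=2)
for i, s in enumerate(sequence):
    model.update(sequence[:i], s)

# Context (0, 1) was followed by 0 twice: (2 + 0.5) / (2 + 1), about 0.833.
p_next_zero = model.predict(sequence, 0)
```

A fixed `max_depth` is the main simplification here: the summary's point is that the true memory length $U$ is unknown, which is the case the weighting over depths is meant to handle.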