Stochastic optimal control under Poisson-distributed observations

Bibliographic details
Published in: IEEE Transactions on Automatic Control, 2000-01, Vol. 45 (1), p. 3-13
Main authors: Ades, M., Caines, P.E., Malhame, R.P.
Format: Article
Language: English
Description
Summary: Optimal control problems for linear stochastic continuous-time systems are considered in which the time domain is decomposed into a finite set of N disjoint random intervals of the form $[t_i, t_{i+1})$, with a complete state observation taken at each instant $t_i$, $0 \le i \le N-1$. Two optimal control problems, termed respectively the (piecewise) time-invariant control and the time-variant control, are considered in this framework. Concerning the observation point process, we first consider the general situation in which the increment intervals are i.i.d. random variables with unspecified probability distributions. The (piecewise) time-invariant solution is thoroughly developed in this general case, and computations are illustrated using the Erlang distribution as the observation interarrival distribution. Next, the problem is specialized so that the increments are exponentially distributed, and the particular optimal control structure that results from this assumption is presented. Finally, still under the Poisson assumption and for the time-variant case, we show that the control problem is closely related to linear quadratic Gaussian regulation with an exponentially discounted cost. The optimal control again consists of a sequence of piecewise open-loop controls corresponding, in this case, to linear feedback of the state predictor based on the most recent information on each interval. The feedback gains are time-varying matrices obtained from a sequence of algebraic Riccati equations, which are also computed off-line.
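The observation setup described in the summary can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the scalar model dx = (a*x + b*u) dt + noise, and the constant feedback gain are all assumptions made for illustration (the article treats matrix systems with time-varying gains). The sketch samples observation instants with i.i.d. Erlang increments, where shape 1 gives the exponential (Poisson) case, and propagates a state predictor open-loop from the most recent observation.

```python
import math
import random


def sample_observation_times(n_intervals, rate, shape=1, rng=None):
    """Return [t_0, ..., t_N] with t_0 = 0 and i.i.d. Erlang(shape, rate) increments.

    shape=1 gives exponential increments, i.e. Poisson-distributed observations.
    """
    rng = rng or random.Random()
    times = [0.0]
    for _ in range(n_intervals):
        # An Erlang(shape, rate) variable is the sum of `shape` independent
        # exponential variables with the same rate.
        increment = sum(rng.expovariate(rate) for _ in range(shape))
        times.append(times[-1] + increment)
    return times


def predicted_state(x_observed, elapsed, a, b, gain):
    """Predict a scalar state `elapsed` time units after a complete observation.

    Hypothetical model: dx = (a*x + b*u) dt + noise with u = -gain * x_hat,
    so the predictor obeys d(x_hat)/dt = (a - b*gain) * x_hat, starting
    from the observed state.
    """
    return x_observed * math.exp((a - b * gain) * elapsed)


# Exponential (Poisson) case with N = 5 intervals and a fixed seed.
times = sample_observation_times(n_intervals=5, rate=2.0, shape=1,
                                 rng=random.Random(0))
```

Passing `shape=2` or higher switches the increments to an Erlang distribution, which is the interarrival distribution the summary uses to illustrate the general case.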
ISSN: 0018-9286, 1558-2523
DOI: 10.1109/9.827351