Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning

Bibliographic Details
Published in:	IEEE Transactions on Automatic Control, 2014-11, Vol. 59 (11), p. 3051-3056
Main authors:	Modares, Hamidreza; Lewis, Frank L.
Format:	Article
Language:	English
Subjects:	Algorithms; Commands; Dynamical systems; Dynamics; Equations; Generators; Heuristic algorithms; Learning (artificial intelligence); Linear quadratic; Mathematical model; Optimal control; Quadratic forms; Reinforcement; Trajectory

Description
Summary:	In this technical note, an online learning algorithm is developed to solve the linear quadratic tracking (LQT) problem for partially-unknown continuous-time systems. It is shown that the value function is quadratic in terms of the state of the system and the command generator. Based on this quadratic form, an LQT Bellman equation and an LQT algebraic Riccati equation (ARE) are derived to solve the LQT problem. The integral reinforcement learning technique is used to find the solution to the LQT ARE online and without requiring the knowledge of the system drift dynamics or the command generator dynamics. The convergence of the proposed online algorithm to the optimal control solution is verified. To show the efficiency of the proposed approach, a simulation example is provided.
ISSN:	0018-9286 1558-2523
DOI:	10.1109/TAC.2014.2317301
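
Note on the summary:	The summary names a quadratic value function, an LQT Bellman equation and an LQT ARE without writing them out. The block below is a rough sketch of how a discounted LQT formulation of this kind is commonly set up. All symbols (A, B, C, F, T, B_1, C_1, Q, R, P, the discount factor \gamma and the interval \Delta) are assumed notation introduced here for illustration and are not quoted from this record; the article itself should be consulted for the authors' exact statements.

```latex
% Assumed setup (hypothetical notation, not quoted from the article):
%   plant              \dot{x}   = A x + B u,   y = C x
%   command generator  \dot{y}_d = F y_d
%   discount factor \gamma > 0, weights Q \succeq 0, R \succ 0
\begin{align*}
% Augmented state stacking the system state and the reference
X &= \begin{bmatrix} x \\ y_d \end{bmatrix}, \qquad
\dot{X} = T X + B_1 u, \qquad
T = \begin{bmatrix} A & 0 \\ 0 & F \end{bmatrix}, \quad
B_1 = \begin{bmatrix} B \\ 0 \end{bmatrix}, \quad
C_1 = \begin{bmatrix} C & -I \end{bmatrix} \\[4pt]
% Discounted tracking cost, quadratic in X under a linear feedback policy
V(X(t)) &= \int_t^{\infty} e^{-\gamma(\tau - t)}
  \left( X^\top C_1^\top Q C_1 X + u^\top R u \right) d\tau
  \;=\; X(t)^\top P X(t) \\[4pt]
% Discounted ARE obtained from the stationarity condition of the Bellman equation
0 &= T^\top P + P T - \gamma P + C_1^\top Q C_1 - P B_1 R^{-1} B_1^\top P \\[4pt]
% Minimizing control; only the input matrix B_1 appears, not T
u &= -R^{-1} B_1^\top P X \\[4pt]
% Integral (interval) form of the Bellman equation; T does not appear
X(t)^\top P X(t) &= \int_t^{t+\Delta} e^{-\gamma(\tau - t)}
  \left( X^\top C_1^\top Q C_1 X + u^\top R u \right) d\tau
  + e^{-\gamma \Delta}\, X(t+\Delta)^\top P X(t+\Delta)
\end{align*}
```

In this sketch the last equation involves only measured trajectories of X and u over an interval of length \Delta, and the control law involves only B_1. That is one way to read the summary's claim that neither the drift dynamics nor the command generator dynamics need to be known.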