Integration of Adaptive Control and Reinforcement Learning for Real-time Control and Learning

Detailed description

Bibliographic details
Published in:	IEEE transactions on automatic control 2023-12, Vol.68 (12), p.1-16
Main authors:	Annaswamy, Anuradha M.; Guha, Anubhav; Cui, Yingnan; Tang, Sunbochen; Fisher, Peter A.; Gaudio, Joseph E.
Format:	Article
Language:	English
Subjects:	Adaptive control; Algorithms; Closed loops; Convergence; Dynamic stability; Dynamical systems; Nonlinear dynamical systems; Nonlinear dynamics; Nonlinear systems; Numerical stability; Real time; Real-time systems; Stability analysis; Uncertainty

Description
Abstract:	This paper considers the problem of real-time control and learning in dynamic systems subject to parametric uncertainties. We propose a combination of a Reinforcement Learning (RL) based policy in the outer loop, suitably chosen to ensure stability and optimality for the nominal dynamics, together with Adaptive Control (AC) in the inner loop, so that in real time AC contracts the closed-loop dynamics towards a stable trajectory traced out by RL. Two classes of nonlinear dynamic systems are considered, both of which are control-affine. The first class of dynamic systems utilizes equilibrium points and a Lyapunov approach, while the second class of nonlinear systems uses contraction theory. AC-RL controllers are proposed for both classes of systems and shown to lead to online policies that guarantee stability using a high-order tuner and accommodate parametric uncertainties and magnitude limits on the input. In addition to establishing a stability guarantee with real-time control, the AC-RL controller is also shown to lead to parameter learning with persistent excitation for the first class of systems. Numerical validations of all algorithms are carried out using a quadrotor landing task on a moving platform.
ISSN:	0018-9286, 1558-2523
DOI:	10.1109/TAC.2023.3290037
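
Illustration:	The outer-loop/inner-loop structure described in the abstract can be sketched on a scalar toy problem. The example below is not the paper's algorithm: it assumes a hypothetical plant x' = a*x + b*(u + theta*x) with unknown theta, uses a fixed linear feedback (`nominal_policy`) as a stand-in for an RL-trained policy, and replaces the paper's high-order tuner with a plain textbook gradient adaptive law. Input magnitude limits are ignored, and all names and numerical values are assumptions chosen for illustration.

```python
def nominal_policy(x, k=3.0):
    """Hypothetical stand-in for a policy trained on the nominal model."""
    return -k * x


def simulate(a=1.0, b=1.0, theta_true=2.0, gamma=10.0, dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of a scalar adaptive loop (assumes b > 0).

    Returns the final plant state, reference state, and parameter estimate.
    """
    x, x_ref, theta_hat = 1.0, 1.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        e = x - x_ref  # tracking error against the nominal closed loop

        # Inner loop: nominal policy plus adaptive cancellation term.
        u = nominal_policy(x) - theta_hat * x

        # True plant, including the uncertainty theta_true (unknown to u).
        x_dot = a * x + b * (u + theta_true * x)

        # Reference trajectory: nominal dynamics under the nominal policy.
        x_ref_dot = a * x_ref + b * nominal_policy(x_ref)

        # Gradient adaptive law obtained from the Lyapunov function
        # V = e^2 / 2 + b * (theta_hat - theta_true)^2 / (2 * gamma).
        theta_hat_dot = gamma * e * x

        x += dt * x_dot
        x_ref += dt * x_ref_dot
        theta_hat += dt * theta_hat_dot
    return x, x_ref, theta_hat


if __name__ == "__main__":
    x, x_ref, theta_hat = simulate()
    print(f"final tracking error: {x - x_ref:.2e}")
    print(f"final parameter estimate: {theta_hat:.3f}")
```

In this toy setting the tracking error decays to zero, while the parameter estimate need not reach the true value because the state itself decays and does not keep exciting the adaptive law. This is consistent with the abstract's statement that parameter learning is tied to persistent excitation.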