Control of Unknown Nonlinear Systems With Efficient Transient Performance Using Concurrent Exploitation and Exploration

Bibliographic Details
Published in:	IEEE Transactions on Neural Networks and Learning Systems 2010-08, Vol.21 (8), p.1245-1261
Main Author:	Kosmatopoulos, E B
Format:	Article
Language:	English
Subjects:	Adaptation, Physiological - physiology; Adaptive control; Algorithms; Analytical models; Animals; Applied sciences; Artificial Intelligence; Computer science control theory systems; Computer simulation; Connectionism. Neural networks; Control Lyapunov function (CLF); Control systems; Dynamical systems; Exact sciences and technology; Exploitation; exploitation vs exploration; Exploration; Feedback; high-order neural networks (HONN); Humans; Learning; Learning systems; Lyapunov method; Mathematical Computing; Neural Networks (Computer); Nonlinear control systems; Nonlinear Dynamics; Nonlinear systems; persistence of excitation (PE); Programmable control; State feedback; State vectors; Studies; sum-of-squares (SoS); System performance

Description
Abstract:	Learning mechanisms that operate in unknown environments should be able to efficiently deal with the problem of controlling unknown dynamical systems. Many approaches that deal with such a problem face the so-called exploitation-exploration dilemma where the controller has to sacrifice efficient performance for the sake of learning "better" control strategies than the ones already known: during the exploration period, poor or even unstable closed-loop system performance may be exhibited. In this paper, we show that, in the case where the control goal is to stabilize an unknown dynamical system by means of state feedback, exploitation and exploration can be concurrently performed without the need of sacrificing efficiency. This is made possible through an appropriate combination of recent results developed by the author in the areas of adaptive control and adaptive optimization and a new result on the convex construction of control Lyapunov functions for nonlinear systems. The resulting scheme guarantees arbitrarily good performance in the regions where the system is controllable. Theoretical analysis as well as simulation results on a particularly challenging control problem verify such a claim.
ISSN:	1045-9227, 2162-237X, 1941-0093, 2162-2388
DOI:	10.1109/TNN.2010.2050211