Approximate Optimal Trajectory Tracking With Sparse Bellman Error Extrapolation

Bibliographic Details
Published in:	IEEE Transactions on Automatic Control, 2023-06, Vol. 68 (6), pp. 3618-3624
Main authors:	Greene, Max L.; Deptula, Patryk; Nivison, Scott; Dixon, Warren E.
Format:	Article
Language:	English
Subjects:	Adaptive control; Aerospace electronics; Computational modeling; Discontinuity; Dynamic programming; Extrapolation; Function approximation; Learning; Mathematical models; Neural networks; Nonlinear control; Nonlinear systems; Optimal control; Reinforcement learning; Segments; Stability analysis; Tracking control; Tracking problem; Trajectory; Trajectory optimization

Description
Abstract:	This article provides an approximate online adaptive solution to the infinite-horizon optimal tracking problem for control-affine continuous-time nonlinear systems with uncertain drift dynamics. A model-based approximate dynamic programming (ADP) approach, which is facilitated using a concurrent learning-based system identifier, approximates the optimal value function. To reduce the computational complexity of model-based ADP, the state space is segmented into user-defined segments (i.e., regions). Off-policy trajectories are selected within each segment to facilitate learning of the value function weight estimates; this process is called Bellman error (BE) extrapolation. Within certain segments of the state space, sparse neural networks are used to reduce the computational expense of BE extrapolation. Discontinuities occur in the weight update laws since different groupings of extrapolated BE trajectories are active in certain regions of the state space. A Lyapunov-like stability analysis is presented to prove boundedness of the overall system in the presence of discontinuities. Simulation results are included to demonstrate the performance and validity of the developed method. The simulation results demonstrate that using the sparse, switched BE extrapolation method developed in this article reduces the computation time by 85.6% when compared to the traditional BE extrapolation method.
ISSN:	0018-9286, 1558-2523
DOI:	10.1109/TAC.2022.3194040
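
Illustrative sketch (not taken from the article): the abstract describes evaluating extrapolated Bellman errors only at off-policy points that belong to the currently active segment, with a sparse approximator in some segments. The Python below is a minimal toy version of that idea for a scalar regulation problem rather than the article's tracking formulation. Every name in it (SEGMENTS, drift, input_gain, the polynomial basis, the 0/1 masks, the quadratic cost) is a hypothetical choice made for illustration, and the system identifier and weight update laws are omitted.

```python
from bisect import bisect_right

# Hypothetical segmentation of a scalar state space into two regions.
# "masks" zero out basis weights to mimic a sparse approximator in a segment.
SEGMENTS = {
    "edges": [-2.0, 0.0, 2.0],                       # segment 0: [-2, 0), segment 1: [0, 2]
    "points": [[-1.5, -1.0, -0.5], [0.5, 1.0, 1.5]],  # off-policy points per segment
    "masks": [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0]],     # segment 0 drops the last basis term
}

def drift(x):
    """Assumed drift dynamics f(x) for the toy system x_dot = f(x) + g(x) u."""
    return -x

def input_gain(x):
    """Assumed control effectiveness g(x)."""
    return 1.0

def active_segment(x, edges):
    """Index of the segment containing x, clamped to the outermost segments."""
    idx = bisect_right(edges, x) - 1
    return min(max(idx, 0), len(edges) - 2)

def basis_gradient(x):
    """Gradient of the assumed basis sigma(x) = [x^2, x^3, x^4]."""
    return [2.0 * x, 3.0 * x**2, 4.0 * x**3]

def value_gradient(x, weights):
    """Estimated dV/dx = W^T * d(sigma)/dx."""
    return sum(w * s for w, s in zip(weights, basis_gradient(x)))

def bellman_error(x, u, weights, f, g):
    """Running cost x^2 + u^2 plus the estimated value derivative along the dynamics."""
    return x**2 + u**2 + value_gradient(x, weights) * (f(x) + g(x) * u)

def extrapolated_bes(x, weights, segments, f, g):
    """Bellman errors at the off-policy points of the segment that contains x."""
    seg = active_segment(x, segments["edges"])
    masked = [w * m for w, m in zip(weights, segments["masks"][seg])]
    errors = []
    for xi in segments["points"][seg]:
        # Policy implied by the current value estimate for the cost x^2 + u^2.
        ui = -0.5 * input_gain(xi) * value_gradient(xi, masked)
        errors.append(bellman_error(xi, ui, masked, f, g))
    return seg, errors
```

Calling extrapolated_bes(1.0, [0.0, 0.0, 0.0], SEGMENTS, drift, input_gain) selects segment 1 and evaluates the error only at that segment's three points. The jump in the active point set when the state crosses a segment edge is a toy analogue of the discontinuities in the update laws that the abstract mentions.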