Learning-Based Attitude Tracking Control With High-Performance Parameter Estimation

Bibliographic details
Published in:	IEEE Transactions on Aerospace and Electronic Systems, 2022-06, Vol. 58 (3), pp. 2218-2230
Main authors:	Dong, Hongyang; Zhao, Xiaowei; Hu, Qinglei; Yang, Haoyang; Qi, Pengyuan
Format:	Article
Language:	English
Keywords:	Adaptive control; Adaptive dynamic programming (ADP); Attitude control; Attitude tracking control; Control tasks; Cost function; Dynamic programming; Hardware-in-the-loop simulation; Learning; Mathematical models; Optimal control; Parameter estimation; Rigid structures; Task analysis; Tracking; Tracking control; Tracking errors; Uncertainty

Description
Abstract:	This article addresses optimal attitude tracking control of rigid bodies via a reinforcement-learning-based control scheme, in which a constrained parameter estimator is designed to compensate accurately for system uncertainties. This estimator guarantees exponential convergence of the estimation errors and keeps every instantaneous estimate strictly within predetermined bounds. Building on it, a critic-only adaptive dynamic programming (ADP) control strategy is proposed to learn the optimal control policy with respect to a user-defined cost function. The matching condition on reference control signals, which is commonly imposed in related ADP designs, is not required in the proposed scheme. Using Lyapunov-based analysis, we prove that the tracking errors and the critic weights' estimation errors are uniformly ultimately bounded under finite excitation conditions. Moreover, an easy-to-implement initial control policy is designed to trigger the real-time learning process. The effectiveness and advantages of the proposed method are verified by both numerical simulations and hardware-in-the-loop experimental tests.
ISSN:	0018-9251, 1557-9603
DOI:	10.1109/TAES.2021.3130537