Attitude control for hypersonic reentry vehicles: An efficient deep reinforcement learning method


Bibliographic details
Published in: Applied Soft Computing, 2022-07, Vol. 123, p. 108865, Article 108865
Main authors: Liu, Yiheng; Wang, Honglun; Wu, Tiancai; Lun, Yuebin; Fan, Jiaxuan; Wu, Jianfa
Format: Article
Language: English
Online access: Full text
Description
Abstract: Aiming at the attitude control problem of hypersonic reentry vehicles (HRVs), a deep reinforcement learning (DRL) based anti-disturbance control method is proposed. First, a compound control framework consisting of a DRL-based auxiliary controller and a fixed-time anti-disturbance controller is proposed to improve the control performance under the premise of ensuring stability. Then, a novel value function approximation mechanism, named experience-based value expansion (EVE), is proposed to modify the value function update equation based on a two-dimensional replay buffer, which solves the DRL convergence problem brought by the HRV’s strong nonlinearities, tight coupling, and big flight envelope. Furthermore, a result-oriented encoder (ROE) is proposed to solve the DRL generalization problem brought by the HRV’s high uncertainties and unavailable real training environment. A bottleneck shape neural network structure is used for the DRL’s network structure to extract high-dimensional features and prevent overfitting to the training environment. Finally, abundant numerical comparative simulations demonstrate the effectiveness of the proposed efficient DRL algorithms and the DRL-based attitude controller.
• Integrating merits of classic and intelligent control by a compound control framework.
• A baseline controller is designed for disturbance rejection and stability guarantee.
• A value expansion method is proposed to improve DRL training effect.
• A result-oriented encoder is proposed to improve DRL generalization ability.
• A DRL auxiliary controller is proposed to improve the control performance.
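The compound control idea in the abstract (a baseline controller plus a DRL-based auxiliary controller) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' design: the record does not say how the two commands are combined, so the sketch assumes an additive, bounded correction; the baseline is a plain PD law standing in for the fixed-time anti-disturbance controller; and `auxiliary_policy`, `aux_limit`, and the gains are hypothetical placeholders.

```python
import numpy as np


def baseline_controller(att_error, rate_error, kp=4.0, kd=2.0):
    """Stand-in PD law (assumption). The paper's baseline is a fixed-time
    anti-disturbance controller, which is not reproduced here."""
    return -kp * att_error - kd * rate_error


def auxiliary_policy(observation):
    """Hypothetical placeholder for a trained DRL actor.
    Returns a 3-axis action with each component in (-1, 1)."""
    return np.tanh(observation[:3])


def compound_command(att_error, rate_error, aux_limit=0.2):
    """Assumed additive combination: baseline command plus a bounded
    auxiliary correction, so the learned part cannot dominate."""
    u_base = baseline_controller(att_error, rate_error)
    obs = np.concatenate([att_error, rate_error])
    u_aux = aux_limit * auxiliary_policy(obs)
    return u_base + u_aux


# Example: small attitude errors on three axes, zero rate error.
att_error = np.array([0.1, -0.05, 0.02])
rate_error = np.zeros(3)
u = compound_command(att_error, rate_error)
```

Bounding the auxiliary term is one simple way to keep the baseline controller in charge of stability while the learned term adjusts performance; whether the paper uses this particular mechanism is not stated in this record.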
ISSN: 1568-4946, 1872-9681
DOI: 10.1016/j.asoc.2022.108865