Attitude control for hypersonic reentry vehicles: An efficient deep reinforcement learning method
Published in: | Applied Soft Computing 2022-07, Vol.123, p.108865, Article 108865 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | A deep reinforcement learning (DRL) based anti-disturbance control method is proposed for the attitude control of hypersonic reentry vehicles (HRVs). First, a compound control framework consisting of a DRL-based auxiliary controller and a fixed-time anti-disturbance controller is proposed to improve control performance while ensuring stability. Then, a novel value function approximation mechanism, named experience-based value expansion (EVE), is proposed. It modifies the value function update equation using a two-dimensional replay buffer, which solves the DRL convergence problem caused by the HRV’s strong nonlinearities, tight coupling, and large flight envelope. Furthermore, a result-oriented encoder (ROE) is proposed to solve the DRL generalization problem caused by the HRV’s high uncertainties and the unavailability of a real training environment. A bottleneck-shaped neural network structure is used for the DRL networks to extract high-dimensional features and prevent overfitting to the training environment. Finally, extensive comparative numerical simulations demonstrate the effectiveness of the proposed efficient DRL algorithms and the DRL-based attitude controller.
• A compound control framework integrates the merits of classical and intelligent control.
• A baseline controller is designed for disturbance rejection and a stability guarantee.
• A value expansion method is proposed to improve the DRL training effect.
• A result-oriented encoder is proposed to improve DRL generalization ability.
• A DRL auxiliary controller is proposed to improve the control performance. |
---|---|
ISSN: | 1568-4946 1872-9681 |
DOI: | 10.1016/j.asoc.2022.108865 |
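
The compound control framework described in the abstract, a stabilizing baseline controller plus a DRL-based auxiliary controller, can be illustrated with a minimal sketch. This is not the authors' implementation: the PD law below merely stands in for the fixed-time anti-disturbance controller, the untrained network stands in for a trained DRL policy, and the names `baseline_controller`, `BottleneckPolicy`, `compound_control`, and `aux_limit`, as well as the wide-narrow-wide layer widths, are assumptions made for illustration only.

```python
import numpy as np


def baseline_controller(attitude_error, rate_error, kp=4.0, kd=2.0):
    """Placeholder PD law (assumption): stands in for the paper's
    fixed-time anti-disturbance controller, which is not reproduced here."""
    return -kp * attitude_error - kd * rate_error


class BottleneckPolicy:
    """Small MLP with hypothetical wide -> narrow -> wide hidden widths.

    The weights are random and untrained; a real auxiliary controller
    would be obtained by DRL training.
    """

    def __init__(self, obs_dim, act_dim, widths=(64, 8, 64), seed=0):
        rng = np.random.default_rng(seed)
        sizes = (obs_dim, *widths, act_dim)
        self.weights = [rng.normal(0.0, 0.1, (m, n))
                        for m, n in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(n) for n in sizes[1:]]

    def __call__(self, obs):
        h = obs
        for w, b in zip(self.weights[:-1], self.biases[:-1]):
            h = np.tanh(h @ w + b)
        # Final tanh keeps every output component within [-1, 1].
        return np.tanh(h @ self.weights[-1] + self.biases[-1])


def compound_control(attitude_error, rate_error, policy, aux_limit=0.1):
    """Total command = baseline command + bounded auxiliary correction."""
    obs = np.concatenate([attitude_error, rate_error])
    u_base = baseline_controller(attitude_error, rate_error)
    u_aux = aux_limit * policy(obs)  # magnitude bounded by aux_limit
    return u_base + u_aux


# Example: three attitude-angle errors (rad) and zero angular-rate errors.
policy = BottleneckPolicy(obs_dim=6, act_dim=3)
err = np.array([0.10, -0.05, 0.02])
rate = np.zeros(3)
u = compound_control(err, rate, policy)
```

Scaling the auxiliary output by `aux_limit` is one simple way to keep the learned correction small relative to the baseline command; how the paper itself combines the two controllers is not specified in the abstract.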