Two-Loop Acceleration Autopilot Design and Analysis Based on TD3 Strategy

A two-loop acceleration autopilot is designed using the twin-delayed deep deterministic policy gradient (TD3) strategy to avoid the tedious design process of conventional tactical missile acceleration autopilots and the difficulty of meeting the performance requirements of the full flight envelope....

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:International journal of aerospace engineering 2023, Vol.2023, p.1-15
Hauptverfasser: Fan, Junfang, Dou, Denghui, Ji, Yi, Liu, Ning, Chen, Shiwei, Yan, Huajie, Li, Junxian
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A two-loop acceleration autopilot is designed using the twin-delayed deep deterministic policy gradient (TD3) strategy to avoid the tedious design process of conventional tactical missile acceleration autopilots and the difficulty of meeting the performance requirements of the full flight envelope. First, a deep reinforcement learning model for the two-loop autopilot is developed. The flight state information serves as the state, the to-be-designed autopilot control parameters serve as the action, and a reward mechanism based on the stability margin index is designed. The TD3 strategy is subsequently used to offline learn the control parameters for the entire flight envelope. An autopilot control parameter fitting model that can be directly applied to the guidance loop is obtained. Finally, the obtained fitting model is combined with the impact angle constraint in the guidance system and verified online. The simulation results demonstrate that the autopilot based on the TD3 strategy can self-adjust the control parameters online based on the real-time flight state, ensuring system stability and achieving accurate acceleration command tracking.
ISSN:1687-5966
1687-5974
DOI:10.1155/2023/5759135