Deep reinforcement learning formation transformation method and system based on dynamic target allocation

Bibliographic details
Main authors: LU WEIWEI, YU HAO, YANG XIUXIA, CHU ZHENG, ZHANG YI, GAO HENGJIE, WANG HONG, JIANG ZIJIE, YANG LIN, WANG CHENLEI
Format: Patent
Language: Chinese; English
Description
Summary: The invention relates to a deep reinforcement learning formation transformation method and system based on dynamic target allocation. The method comprises the following steps: determining a state space, an action space and a reward function; initializing the network parameters, an experience pool and a training environment; judging whether the number of training rounds has reached the maximum; starting each aircraft in a given initial formation; calculating the optimally allocated target point for each aircraft, detecting surrounding own-side aircraft with a detector, and judging from an obstacle cone whether each aircraft needs to avoid an obstacle or a collision; calculating the course angles of the aircraft that need to avoid obstacles, after which the aircraft select actions and enter the next state; calculating a reward value; storing the current system state, the action, the reward value and the next system state as one tuple in the experience pool; updating the network parameters; judging whether rs is C2 + C3
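Three of the steps named in the summary (target allocation, the obstacle-cone check and the experience pool) can be illustrated with a minimal Python sketch. The summary does not disclose how any of them are computed, so everything here is an assumption made for illustration: the function and class names are hypothetical, the allocation minimizes total straight-line distance by brute force, the cone test is the common 2D collision-cone criterion on relative position and velocity, and the pool is a fixed-size buffer with uniform sampling.

```python
import itertools
import math
import random
from collections import deque


def assign_targets(positions, targets):
    """Assumed allocation rule: pick the aircraft-to-target assignment with
    the smallest total Euclidean distance. Brute force over permutations,
    so only suitable for small formations.
    Returns (perm, cost), where aircraft i flies to targets[perm[i]]."""
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(targets)), len(positions)):
        cost = sum(math.dist(positions[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


def inside_obstacle_cone(rel_pos, rel_vel, safe_radius):
    """Assumed 2D cone test. rel_pos is obstacle position minus aircraft
    position; rel_vel is aircraft velocity minus obstacle velocity.
    Returns True when the relative velocity points into the cone tangent
    to a circle of safe_radius around the obstacle."""
    dist = math.hypot(*rel_pos)
    if dist <= safe_radius:
        return True  # already inside the safety circle
    speed = math.hypot(*rel_vel)
    if speed == 0.0:
        return False  # no relative motion, no predicted conflict
    half_angle = math.asin(safe_radius / dist)
    cos_angle = (rel_pos[0] * rel_vel[0] + rel_pos[1] * rel_vel[1]) / (dist * speed)
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return angle < half_angle


class ExperiencePool:
    """Fixed-capacity store of (state, action, reward, next_state) tuples;
    the oldest tuple is dropped once the capacity is reached."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement (an assumption).
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)
```

In a training loop of the kind the summary outlines, `assign_targets` would be called again whenever the formation changes, `inside_obstacle_cone` would decide which aircraft need an avoidance course angle, and each step's tuple would go into the pool before the network update. For formations beyond a handful of aircraft, the brute-force search would need replacing with a polynomial-time assignment method.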