Deep reinforcement learning formation transformation method and system based on dynamic target allocation

Bibliographic Details
Main Authors:	LU WEIWEI, YU HAO, YANG XIUXIA, CHU ZHENG, ZHANG YI, GAO HENGJIE, WANG HONG, JIANG ZIJIE, YANG LIN, WANG CHENLEI
Format:	Patent
Language:	chi ; eng
Subjects:	CONTROLLING; PHYSICS; REGULATING; SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES

Description
Summary:	The invention relates to a deep reinforcement learning formation transformation method and system based on dynamic target allocation. The method comprises the following steps: determining a state space, an action space and a reward function; initializing the network parameters, an experience pool and a training environment; judging whether the number of training rounds has reached the maximum; starting each aircraft in a given initial formation; calculating the optimal target point allocation for the aircraft, detecting surrounding own-side aircraft with a detector, and judging from an obstacle cone whether each aircraft needs to avoid obstacles or collisions; calculating course angles for the aircraft that need to avoid obstacles, after which the aircraft select actions and enter the next state; calculating a reward value; storing the current system state, action, reward value and next system state as one tuple in the experience pool; updating the network parameters; judging whether rs is C2 + C3
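
Two of the steps named in the summary, target point allocation and storage of (state, action, reward, next state) tuples in an experience pool, can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `ExperiencePool`, `Transition` and `allocate_targets` are hypothetical, and the allocation criterion used here (minimum total Euclidean distance, found by brute force over all permutations) is an assumption, since the record does not state how the optimal allocation is computed.

```python
import itertools
import math
import random
from collections import deque, namedtuple

# One tuple of experience, as listed in the summary:
# current state, action, reward value, next state.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ExperiencePool:
    """Fixed-capacity pool; the oldest tuples are discarded first (assumed policy)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random minibatch, capped at the number of stored tuples.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def allocate_targets(positions, targets):
    """Assign one target point to each aircraft.

    Assumption: "optimal" means minimum total Euclidean distance.
    Brute force over permutations, so only practical for small formations.
    Returns (assignment, cost), where assignment[i] is the target index
    given to aircraft i.
    """
    if len(positions) != len(targets):
        raise ValueError("need exactly one target per aircraft")
    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(len(targets))):
        cost = sum(math.dist(positions[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost
```

In a training loop of the kind the summary outlines, `allocate_targets` would be re-run as the aircraft move (the "dynamic" part of the allocation), and `store` would be called once per step before a minibatch from `sample` is used to update the network parameters.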