Deep reinforcement learning formation transformation method and system based on dynamic target allocation
Main Authors: | , , , , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online Access: | Order full text |
Summary: | The invention relates to a deep reinforcement learning formation transformation method and system based on dynamic target allocation. The method comprises the following steps: determining a state space, an action space and a reward function; initializing the network parameters, an experience pool and a training environment; judging whether the number of training rounds has reached the maximum; starting each aircraft in a given initial formation; calculating the optimal target-point allocation for the aircraft, detecting surrounding own-side aircraft with a detector, and judging from an obstacle cone whether each aircraft needs to avoid an obstacle or a collision; calculating course angles for the aircraft that need to avoid obstacles, after which the aircraft select actions and enter the next state; calculating a reward value; storing the current system state, the action, the reward value and the next system state as one tuple in the experience pool; updating the network parameters; judging whether rs is C2 + C3 |
---|
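
Two of the steps named in the summary, allocating target points to aircraft and storing (state, action, reward, next state) tuples in an experience pool, can be illustrated with a minimal sketch. This is not the patent's implementation: the record does not say how the optimal allocation is computed, so the sketch assumes a brute-force search that minimizes total Euclidean distance, which is only practical for small formations. The names `assign_targets` and `ExperiencePool`, and the uniform random sampling, are hypothetical choices made for illustration.

```python
import itertools
import math
import random
from collections import deque


def assign_targets(positions, targets):
    """Assign one target point to each aircraft (illustrative assumption).

    Tries every one-to-one assignment and keeps the one with the smallest
    total Euclidean distance. Returns (assignment, cost), where
    assignment[i] is the index of the target given to aircraft i.
    """
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(targets)), len(positions)):
        cost = sum(math.dist(positions[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


class ExperiencePool:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest tuple once it is full.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement; this is an assumption,
        # since the record does not describe the sampling scheme.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)
```

For example, with aircraft at `(0, 0)` and `(10, 0)` and target points at `(10, 1)` and `(0, 1)`, `assign_targets` gives the first aircraft the second target and the second aircraft the first, for a total distance of 2.0. Recomputing the assignment at every step from the current positions is one way to read "dynamic" allocation, but the record does not confirm that reading.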