Multiagent reinforcement learning for strictly constrained tasks based on Reward Recorder
Published in: | International journal of intelligent systems 2022-11, Vol.37 (11), p.8387-8411 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Multiagent reinforcement learning (MARL) has been widely applied to engineering problems. However, many strictly constrained problems, such as distributed optimization in engineering applications, remain a major challenge for MARL. In particular, strict global constraints on agents' actions easily lead to sparse rewards. Moreover, existing studies cannot resolve the instability caused by partial observability while keeping the algorithm fully distributed, and algorithms with centralized training may encounter significant obstacles in real‐world deployment. For the first time, we provide a theoretical analysis for MARL that determines the adverse effects of partial observability on convergence, and we propose a fully distributed and convergent MARL algorithm based on Reward Recorder. Each agent runs an independent reinforcement learning algorithm and uses the average‐consensus protocol to estimate the global state‐action value locally, so as to achieve global optimization. To verify the performance of the algorithm, we propose a novel generalized constrained optimization model that includes local inequality constraints and strict global constraints. The proposed distributed reinforcement learning algorithm is supported by several simulation examples. The results reveal that the proposed algorithm has high stability and excellent decision‐making ability. |
---|---|
ISSN: | 0884-8173 1098-111X |
DOI: | 10.1002/int.22945 |
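
The summary says that each agent uses the average‐consensus protocol to estimate a global quantity locally. This record does not give the paper's exact update rule, weights, or communication graph, so the following is only a minimal sketch of the standard discrete-time average-consensus iteration, not the authors' implementation. The function name `average_consensus`, the ring topology, the step size, and the sample values are all assumptions made for illustration.

```python
def average_consensus(values, neighbors, step=0.3, iterations=200):
    """Textbook discrete-time average consensus on an undirected graph.

    values: initial local estimates, one per agent (hypothetical numbers).
    neighbors: adjacency mapping; neighbors[i] lists the agents agent i talks to.
    step: step size; assumed to be below 1 / (maximum node degree).
    """
    x = list(values)
    for _ in range(iterations):
        # Synchronous update: every agent moves toward its neighbors'
        # previous values, using only locally available information.
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical example: four agents on a ring, each holding a local estimate.
local_q = [1.0, 3.0, 5.0, 7.0]
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
estimates = average_consensus(local_q, ring)
```

Because the graph is undirected and connected, the sum of the estimates is preserved at every step, so each agent's value approaches the network-wide mean of the initial values without any central coordinator.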