MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
Main authors: 
Format: Article
Language: English
Keywords: 
Online access: Order full text
Abstract: In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse reward. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from the experience replay buffer. Under the widely used assumptions of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both the individual Q-values and the total Q-value. MASER then designs an individual intrinsic reward for each agent, based on an actionable representation relevant to Q-learning, so that each agent reaches its subgoal while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.
DOI: 10.48550/arxiv.2206.10607