RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | In recent years, reinforcement learning has faced several challenges in the
multi-agent domain, such as the credit assignment issue. Value function
factorization emerges as a promising way to handle the credit assignment issue
under the centralized training with decentralized execution (CTDE) paradigm.
However, existing value function factorization methods cannot deal with ad-hoc
cooperation, that is, adapting to new configurations of teammates at test time.
Specifically, these methods do not explicitly utilize the relationship between
agents and cannot adapt to different sizes of inputs. To address these
limitations, we propose a novel method, called Relation-Aware Credit Assignment
(RACA), which achieves zero-shot generalization in ad-hoc cooperation
scenarios. RACA takes advantage of a graph-based relation encoder to encode the
topological structure between agents. Furthermore, RACA utilizes an
attention-based observation abstraction mechanism that can generalize to an
arbitrary number of teammates with a fixed number of parameters. Experiments
demonstrate that our method outperforms baseline methods on the StarCraft II
micromanagement benchmark and in ad-hoc cooperation scenarios. |
---|---|
DOI: | 10.48550/arxiv.2206.01207 |
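
The summary states that the observation abstraction handles an arbitrary number of teammates with a fixed number of parameters. The sketch below illustrates that general idea with plain attention pooling; it is not the RACA implementation, and every name in it (`make_params`, `abstract_observation`, `w_key`, `query`, the dimensions) is a hypothetical choice made for illustration. The parameter shapes depend only on the feature and key dimensions, so the same parameters apply to teams of any size.

```python
import math
import random


def make_params(obs_dim, key_dim, seed=0):
    """Hypothetical parameters: a key projection and a query vector.

    Their shapes depend only on obs_dim and key_dim, never on team size.
    """
    rng = random.Random(seed)
    w_key = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(key_dim)]
    query = [rng.uniform(-1, 1) for _ in range(key_dim)]
    return w_key, query


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def abstract_observation(teammate_feats, w_key, query):
    """Attention-pool a variable-length list of teammate feature vectors.

    Returns a pooled vector of length obs_dim and the attention weights.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, matvec(w_key, feat))) / scale
        for feat in teammate_feats
    ]
    weights = softmax(scores)
    dim = len(teammate_feats[0])
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, teammate_feats))
        for i in range(dim)
    ]
    return pooled, weights


# Same parameters, two different team sizes (random features as stand-ins).
w_key, query = make_params(obs_dim=4, key_dim=3)
rng = random.Random(1)
team_of_3 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
team_of_7 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(7)]
pooled_3, _ = abstract_observation(team_of_3, w_key, query)
pooled_7, _ = abstract_observation(team_of_7, w_key, query)
print(len(pooled_3), len(pooled_7))
```

Because the pooled vector is a weighted sum, its length is independent of how many teammates are present, and reordering the teammates leaves it unchanged. That is the property a downstream network would need in order to be evaluated on team configurations it has not seen during training.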