Learning to Share in Multi-Agent Reinforcement Learning
Main Authors:
Format: Article
Language: eng
Subjects:
Online Access: Order full text
Summary: In this paper, we study networked multi-agent reinforcement learning (MARL), where a number of agents are deployed as a partially connected network and each interacts only with nearby agents. Networked MARL requires all agents to make decisions in a decentralized manner to optimize a global objective under restricted communication between neighbors over the network. Inspired by the fact that sharing plays a key role in how humans learn to cooperate, we propose LToS, a hierarchically decentralized MARL framework that enables agents to learn to dynamically share reward with neighbors and thereby cooperate on the global objective through collectives. For each agent, the high-level policy learns how to share reward with neighbors so as to decompose the global objective, while the low-level policy learns to optimize the local objective induced by the high-level policies of its neighborhood. The two policies form a bi-level optimization and are learned alternately. We empirically demonstrate that LToS outperforms existing methods in both social dilemma and networked MARL scenarios across scales.
DOI: 10.48550/arxiv.2112.08702
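To make the reward-sharing mechanism in the summary concrete, below is a minimal sketch, assuming each agent's high-level policy outputs weights that distribute that agent's own reward over its closed neighborhood; the function and variable names (`share_rewards`, `weights`, `adjacency`) are illustrative, not taken from the paper's code.

```python
import numpy as np

# Minimal sketch of the reward-sharing idea from the summary (not the
# paper's implementation). Assumption: each agent i distributes its own
# reward r_i over its closed neighborhood via weights produced by its
# high-level policy; because each row of weights sums to 1, the shaped
# rewards preserve the global sum, so the global objective is merely
# decomposed, not changed.

def share_rewards(rewards, weights, adjacency):
    """Return each agent's shaped reward under a sharing scheme.

    rewards:   (n,) array, reward r_i earned by each agent
    weights:   (n, n) array of raw sharing scores; entry (i, j) is how
               much of r_i agent i would give to agent j
    adjacency: (n, n) 0/1 array with self-loops, restricting sharing to
               each agent's closed neighborhood
    """
    shares = weights * adjacency                          # zero out non-neighbors
    shares = shares / shares.sum(axis=1, keepdims=True)   # each row sums to 1
    # Agent j receives the shares directed at it: r'_j = sum_i w_ij * r_i.
    return shares.T @ rewards

# Toy example: three agents on a line graph 0-1-2 (self-loops included).
adjacency = np.array([[1, 1, 0],
                      [1, 1, 1],
                      [0, 1, 1]], dtype=float)
weights = np.random.default_rng(0).random((3, 3))   # stand-in for policy output
rewards = np.array([1.0, 0.0, 2.0])

shaped = share_rewards(rewards, weights, adjacency)
assert np.isclose(shaped.sum(), rewards.sum())      # global sum preserved
print(shaped)
```

Normalizing each row of the sharing weights to 1 is one simple way to guarantee the decomposition leaves the global objective intact; the paper's actual parameterization of the high-level policy may differ.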