Scalable Anytime Planning for Multi-Agent MDPs
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | We present a scalable tree search planning algorithm for large multi-agent
sequential decision problems that require dynamic collaboration. Teams of
agents need to coordinate decisions in many domains, but naive approaches fail
due to the exponential growth of the joint action space with the number of
agents. We circumvent this complexity through an anytime approach that allows
us to trade computation for approximation quality and also dynamically
coordinate actions. Our algorithm comprises three elements: online planning
with Monte Carlo Tree Search (MCTS), factored representations of local agent
interactions with coordination graphs, and the iterative Max-Plus method for
joint action selection. We evaluate our approach on the benchmark SysAdmin
domain with static coordination graphs and achieve comparable performance with
much lower computation cost than our MCTS baselines. We also introduce a
multi-drone delivery domain with dynamic, i.e., state-dependent coordination
graphs, and demonstrate how our approach scales to large problems on this
domain that are intractable for other MCTS methods. We provide an open-source
implementation of our algorithm at
https://github.com/JuliaPOMDP/FactoredValueMCTS.jl. |
---|---|
DOI: | 10.48550/arxiv.2101.04788 |
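
The summary names Max-Plus over a coordination graph as the joint action selection step. The following is a minimal Python sketch of that general idea, not the authors' implementation (which is in Julia at the repository linked above). The function names `max_plus` and `brute_force`, the payoff layout, and the three-agent chain example are all assumptions made for illustration. It contrasts message passing along graph edges with exhaustive enumeration of the joint action space, whose size grows exponentially with the number of agents.

```python
from itertools import product


def max_plus(n_agents, n_actions, node_payoff, edge_payoff, n_iters=10):
    """Hypothetical helper: pick one action per agent by Max-Plus message passing.

    node_payoff[i][a] is agent i's local payoff for action a.
    edge_payoff[(i, j)][a_i][a_j] is the pairwise payoff on edge (i, j), i < j.
    """
    neighbors = {i: [] for i in range(n_agents)}
    for (i, j) in edge_payoff:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def pair(i, j, ai, aj):
        # Look up the pairwise payoff regardless of edge direction.
        if (i, j) in edge_payoff:
            return edge_payoff[(i, j)][ai][aj]
        return edge_payoff[(j, i)][aj][ai]

    # msg[(i, j)][aj]: message from agent i to neighbor j about j's action aj.
    msg = {(i, j): [0.0] * n_actions for i in neighbors for j in neighbors[i]}
    for _ in range(n_iters):
        new_msg = {}
        for (i, j) in msg:
            out = [
                max(
                    node_payoff[i][ai]
                    + pair(i, j, ai, aj)
                    + sum(msg[(k, i)][ai] for k in neighbors[i] if k != j)
                    for ai in range(n_actions)
                )
                for aj in range(n_actions)
            ]
            mean = sum(out) / n_actions
            # Subtract the mean so messages stay bounded on graphs with cycles.
            new_msg[(i, j)] = [v - mean for v in out]
        msg = new_msg

    # Each agent maximizes its local payoff plus all incoming messages.
    return tuple(
        max(
            range(n_actions),
            key=lambda a, i=i: node_payoff[i][a]
            + sum(msg[(k, i)][a] for k in neighbors[i]),
        )
        for i in range(n_agents)
    )


def brute_force(n_agents, n_actions, node_payoff, edge_payoff):
    """Enumerate all n_actions ** n_agents joint actions (exponential cost)."""
    def total(joint):
        local = sum(node_payoff[i][joint[i]] for i in range(n_agents))
        pairwise = sum(q[joint[i]][joint[j]] for (i, j), q in edge_payoff.items())
        return local + pairwise

    return max(product(range(n_actions), repeat=n_agents), key=total)


# Illustrative data: three agents on a chain 0 - 1 - 2, two actions each.
node_payoff = [[1.0, 0.0], [0.0, 0.5], [0.0, 0.2]]
edge_payoff = {
    (0, 1): [[2.0, 0.0], [0.0, 3.0]],
    (1, 2): [[1.0, 0.0], [0.0, 2.0]],
}

if __name__ == "__main__":
    print(max_plus(3, 2, node_payoff, edge_payoff))
    print(brute_force(3, 2, node_payoff, edge_payoff))
```

On an acyclic graph such as this chain, Max-Plus recovers the same joint action as enumeration once messages have propagated across the graph. On graphs with cycles it is only an approximation, and the iteration count then acts as a knob for trading computation against solution quality.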