Avoiding collaborative paradox in multi‐agent reinforcement learning

Published in: ETRI Journal, 2021, 43(6), pp. 1004-1012
Authors: Kim, Hyunseok; Kim, Seonghyun; Lee, Donghun; Jang, Ingook
Format: Article
Language: English
Abstract: Productive collaboration between multiple agents has become an emerging issue in real-world applications. In reinforcement learning, multi-agent environments present challenges beyond the tractable issues of single-agent settings. Such collaborative environments have the following highly complex attributes: sparse rewards for task completion, limited communication between agents, and only partial observations. In particular, adjustments in one agent's action policy make the environment nonstationary from the other agents' perspective, which causes high variance in the learned policies and prevents the direct use of reinforcement learning approaches. Unexpected social loafing caused by this high dispersion makes it difficult for all agents to succeed in collaborative tasks. We therefore address a paradox in which social loafing significantly reduces total returns after a certain timestep of multi-agent reinforcement learning. We further demonstrate that this collaborative paradox can be avoided by our proposed early-stop method, which leverages a metric for social loafing.
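
The abstract does not specify the authors' social-loafing metric or stopping rule, so the following is only a minimal sketch of the general idea under stated assumptions: a monitor tracks the dispersion of per-agent contributions as a proxy for social loafing and halts training once dispersion stays high while the team return stops improving. The class name SocialLoafingMonitor, the coefficient-of-variation metric, and the dispersion_threshold and patience parameters are illustrative choices, not taken from the paper.

```python
# Sketch of an early-stop monitor for multi-agent training (illustrative only).
import numpy as np


class SocialLoafingMonitor:
    def __init__(self, dispersion_threshold=0.5, patience=10):
        self.dispersion_threshold = dispersion_threshold  # max tolerated contribution spread
        self.patience = patience                          # consecutive bad evaluations before stopping
        self.best_return = -np.inf
        self.bad_checks = 0

    def dispersion(self, agent_contributions):
        # Coefficient of variation of per-agent contributions: high values mean
        # a few agents do most of the work (a crude proxy for social loafing).
        contributions = np.asarray(agent_contributions, dtype=float)
        mean = contributions.mean()
        if mean == 0:
            return 0.0
        return contributions.std() / abs(mean)

    def should_stop(self, team_return, agent_contributions):
        # Stop when the team return no longer improves and contribution
        # dispersion stays above the threshold for `patience` consecutive checks.
        loafing = self.dispersion(agent_contributions)
        if team_return > self.best_return:
            self.best_return = team_return
            self.bad_checks = 0
        elif loafing > self.dispersion_threshold:
            self.bad_checks += 1
        else:
            self.bad_checks = 0
        return self.bad_checks >= self.patience


# Usage: call once per evaluation interval during multi-agent training.
monitor = SocialLoafingMonitor(dispersion_threshold=0.5, patience=10)
for step in range(1000):
    # Placeholder values; in practice these come from evaluation rollouts.
    team_return = np.random.rand()
    per_agent_contributions = np.random.rand(4)
    if monitor.should_stop(team_return, per_agent_contributions):
        print(f"Early stop at evaluation {step}: persistent social loafing detected.")
        break
```

The design choice here is deliberately conservative: training stops only when both signals agree (stagnant return and high dispersion), which avoids halting on ordinary return variance alone.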
ISSN: 1225-6463; 2233-7326
DOI: 10.4218/etrij.2021-0010