LDR: Learning Discrete Representation to Improve Noise Robustness in Multiagent Tasks
In real-world applications of multiagent reinforcement learning (MARL), agents often face inaccurate environments due to unavoidable noise, presenting a challenge to their robustness. However, limited prior work focuses on addressing such noise in observations, hindering the deployment of multiagent...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on systems, man, and cybernetics. Systems man, and cybernetics. Systems, 2025-01, Vol.55 (1), p.513-525 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In real-world applications of multiagent reinforcement learning (MARL), agents often face inaccurate environments due to unavoidable noise, presenting a challenge to their robustness. However, limited prior work focuses on addressing such noise in observations, hindering the deployment of multiagent systems. In this article, we propose a method named learning discrete representation (LDR) to improve robustness against noise in multiagent tasks. Specifically, LDR employs a quantization module with a segment mechanism to encode observations and teammate actions, generating discrete representations from learnable codebooks. These representations are subsequently processed via a combiner for decision-making. Through discretization, LDR is able to mitigate the impact of minor noise on decision-making. To enhance the learning efficiency, we incorporate a set-input block that treats the joint observations of agents as a permutation-invariant set, thereby reducing the complexity of the joint observation space. Additionally, we theoretically analyze the expressiveness of discrete representation and the boundedness of discrete distortion. We evaluate the proposed method on StarCraft II micromanagement tasks and multiagent MuJoCo with noisy observations. Empirical results demonstrate that LDR outperforms existing algorithms, improving robustness in noisy cooperative MARL tasks while maintaining superior performance in clean observations. |
---|---|
ISSN: | 2168-2216 2168-2232 |
DOI: | 10.1109/TSMC.2024.3487535 |