Adversarial Environment Design via Regret-Guided Diffusion Models
Format: Article
Language: English
Abstract: Training agents that are robust to environmental changes remains a
significant challenge in deep reinforcement learning (RL). Unsupervised
environment design (UED) has recently emerged to address this issue by
generating a set of training environments tailored to the agent's capabilities.
While prior works demonstrate that UED has the potential to produce a robust
policy, its performance is constrained by the capabilities of the environment
generator. To this end, we propose a novel UED algorithm, adversarial
environment design via regret-guided diffusion models (ADD). The proposed
method guides a diffusion-based environment generator with the agent's regret
to produce environments that the agent finds challenging but conducive to
further improvement. By exploiting the representational power of diffusion
models, ADD can directly generate adversarial environments while maintaining
the diversity of training environments, enabling the agent to effectively learn
a robust policy. Our experimental results demonstrate that the proposed method
successfully generates an instructive curriculum of environments, outperforming
UED baselines in zero-shot generalization across novel, out-of-distribution
environments. Project page: https://rllab-snu.github.io/projects/ADD
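The abstract describes steering a diffusion-based environment generator with the
agent's regret during sampling. Below is a minimal, hypothetical sketch of what
such regret-guided sampling could look like, written in the style of
classifier-guided DDPM sampling; the `denoiser` and `regret_critic` networks,
the linear noise schedule, the environment shape, and the guidance scale are all
illustrative assumptions, not the authors' actual implementation.

```python
import torch

def sample_environment(denoiser, regret_critic, steps=1000,
                       guidance_scale=2.0, shape=(1, 1, 16, 16)):
    """Sample an environment layout, nudging denoising toward high regret.

    Assumed interfaces (hypothetical, for illustration only):
      denoiser(x, t)      -> predicted noise, same shape as x
      regret_critic(x, t) -> scalar per-sample estimate of the agent's regret
    """
    # Standard linear DDPM noise schedule (an assumption, not from the paper).
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape)  # start from pure Gaussian noise
    for t in reversed(range(steps)):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        with torch.no_grad():
            eps = denoiser(x, t_batch)  # unconditional noise prediction

        # Regret guidance, classifier-guidance style: follow the gradient of
        # the estimated regret so harder environments become more likely.
        x_in = x.detach().requires_grad_(True)
        regret = regret_critic(x_in, t_batch).sum()
        grad = torch.autograd.grad(regret, x_in)[0]
        eps = eps - guidance_scale * torch.sqrt(1.0 - alpha_bars[t]) * grad

        # Standard DDPM reverse update using the guided noise estimate.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```

In a sketch like this, `guidance_scale` would trade off difficulty against
diversity: a larger scale pushes samples harder toward high-regret layouts,
while a scale of zero recovers unguided sampling from the training
distribution.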
DOI: 10.48550/arxiv.2410.19715