Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games
Saved in:
Main authors: , ,
Format: Article
Language: eng
Subject headings:
Online access: Order full text
Abstract: Recent advances in Deep Reinforcement Learning (DRL) have largely focused on
improving the performance of agents with the aim of replacing humans in known
and well-defined environments. The use of these techniques as a game design
tool for video game production, where the aim is instead to create Non-Player
Character (NPC) behaviors, has received relatively little attention until
recently. Turn-based strategy games like Roguelikes, for example, present
unique challenges to DRL. In particular, the categorical nature of their
complex game state, composed of many entities with different attributes,
requires agents able to learn how to compare and prioritize these entities.
Moreover, this complexity often leads to agents that overfit to states seen
during training and that are unable to generalize in the face of design changes
made during development. In this paper we propose two network architectures
which, when combined with a *procedural loot generation* system, are able
to better handle complex categorical state spaces and to mitigate the need for
retraining forced by design decisions. The first is based on a dense embedding
of the categorical input space that abstracts the discrete observation model
and makes trained agents better able to generalize. The second proposed
architecture is more general and is based on a Transformer network able to
reason relationally about input and input attributes. Our experimental
evaluation demonstrates that the new agents adapt better than a baseline
architecture, making this framework more robust to
dynamic gameplay changes during development. Based on the results shown in this
paper, we believe that these solutions represent a step forward towards making
DRL more accessible to the gaming industry.
DOI: 10.48550/arxiv.2012.03532
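As a rough illustration of the two ideas in the abstract, the sketch below embeds each entity's categorical attributes into dense vectors and then applies a single self-attention step over the entity set before pooling to a fixed-size observation. All names, vocabulary sizes, and the pooling choice are our own assumptions for the sake of a runnable example, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative attribute vocabularies for roguelike loot entities
# (sizes and names are assumptions, not taken from the paper).
N_ITEM_TYPES = 16   # e.g. sword, potion, scroll, ...
N_MODIFIERS = 8     # e.g. cursed, blessed, ...
EMBED_DIM = 4

# One dense embedding table per categorical attribute; in a real agent
# these would be trainable parameters of the policy network.
type_table = rng.normal(size=(N_ITEM_TYPES, EMBED_DIM))
mod_table = rng.normal(size=(N_MODIFIERS, EMBED_DIM))

def embed_entity(item_type: int, modifier: int) -> np.ndarray:
    """Map one entity's categorical attributes to a dense vector."""
    return np.concatenate([type_table[item_type], mod_table[modifier]])

def self_attention(vecs: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product attention over the entity set,
    re-encoding each entity relative to the others, in the spirit of
    the relational reasoning the Transformer architecture provides."""
    scores = vecs @ vecs.T / np.sqrt(vecs.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ vecs

def encode_loot(entities: list[tuple[int, int]]) -> np.ndarray:
    """Embed a variable-length entity list and pool it to a fixed-size
    observation, so the policy input does not change shape when the
    designer adds new item types or rebalances the loot tables."""
    vecs = np.stack([embed_entity(t, m) for t, m in entities])
    return self_attention(vecs).mean(axis=0)

obs = encode_loot([(0, 1), (5, 3), (2, 1)])
print(obs.shape)  # fixed-size observation: (8,)
```

Because new loot only introduces new rows in the embedding tables rather than new observation dimensions, a design change during development perturbs the input space far less than it would with a one-hot encoding.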