Evaluating Long-Term Memory in 3D Mazes
Intelligent agents need to remember salient information to reason in partially-observed environments. For example, agents with a first-person view should remember the positions of relevant objects even if they go out of view. Similarly, to effectively navigate through rooms agents need to remember t...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Intelligent agents need to remember salient information to reason in
partially-observed environments. For example, agents with a first-person view
should remember the positions of relevant objects even if they go out of view.
Similarly, to effectively navigate through rooms agents need to remember the
floor plan of how rooms are connected. However, most benchmark tasks in
reinforcement learning do not test long-term memory in agents, slowing down
progress in this important research direction. In this paper, we introduce the
Memory Maze, a 3D domain of randomized mazes specifically designed for
evaluating long-term memory in agents. Unlike existing benchmarks, Memory Maze
measures long-term memory separate from confounding agent abilities and
requires the agent to localize itself by integrating information over time.
With Memory Maze, we propose an online reinforcement learning benchmark, a
diverse offline dataset, and an offline probing evaluation. Recording a human
player establishes a strong baseline and verifies the need to build up and
retain memories, which is reflected in their gradually increasing rewards
within each episode. We find that current algorithms benefit from training with
truncated backpropagation through time and succeed on small mazes, but fall
short of human performance on the large mazes, leaving room for future
algorithmic designs to be evaluated on the Memory Maze. |
---|---|
DOI: | 10.48550/arxiv.2210.13383 |