Deep Transformer Q-Networks for Partially Observable Reinforcement Learning
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Real-world reinforcement learning tasks often involve some form of partial
observability where the observations only give a partial or noisy view of the
true state of the world. Such tasks typically require some form of memory,
where the agent has access to multiple past observations, in order to perform
well. One popular way to incorporate memory is by using a recurrent neural
network to access the agent's history. However, recurrent neural networks in
reinforcement learning are often fragile and difficult to train, are susceptible
to catastrophic forgetting, and sometimes fail completely as a result. In this
work, we propose Deep Transformer Q-Networks (DTQN), a novel architecture
utilizing transformers and self-attention to encode an agent's history. DTQN is
designed modularly, and we compare results against several modifications to our
base model. Our experiments demonstrate the transformer can solve partially
observable tasks faster and more stably than previous recurrent approaches. |
---|---|
DOI: | 10.48550/arxiv.2206.01078 |