TEMPORAL DIFFERENCE SCALING WHEN CONTROLLING AGENTS USING REINFORCEMENT LEARNING

Bibliographic details
Main author: SCHAUL, Tom
Format: Patent
Language: eng; fre; ger
Description
Summary: A reinforcement learning neural network system configured to manage rewards on scales that can vary significantly. The system determines the value of a scale factor that is applied to a temporal difference error used for reinforcement learning. The scale factor depends at least upon a variance of the rewards received during the reinforcement learning.
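
The summary states only that a scale factor depending on the reward variance is applied to the temporal difference error. As a rough illustration, the Python sketch below assumes one possible form: the TD error is divided by the square root of a running reward variance. The class name `RewardVarianceScaler`, the `epsilon` term, the use of Welford's running-variance update, and the division itself are all assumptions made for this example and are not taken from the patent text.

```python
import math


class RewardVarianceScaler:
    """Hypothetical helper: tracks reward variance and scales TD errors by it."""

    def __init__(self, epsilon=1e-8):
        self.count = 0        # number of rewards seen so far
        self.mean = 0.0       # running mean of rewards
        self.m2 = 0.0         # running sum of squared deviations from the mean
        self.epsilon = epsilon  # assumed small constant to avoid division by zero

    def update(self, reward):
        """Fold one reward into the running statistics (Welford's algorithm)."""
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    @property
    def variance(self):
        """Population variance of the rewards seen so far (0.0 if none)."""
        return self.m2 / self.count if self.count > 0 else 0.0

    def scale_factor(self):
        """Assumed form of the scale factor: sqrt(reward variance + epsilon)."""
        return math.sqrt(self.variance + self.epsilon)

    def scaled_td_error(self, reward, value, next_value, discount=0.99):
        """One-step TD error divided by the assumed scale factor."""
        td_error = reward + discount * next_value - value
        return td_error / self.scale_factor()


if __name__ == "__main__":
    scaler = RewardVarianceScaler()
    for r in [0.0, 10.0, 20.0]:
        scaler.update(r)
    print(scaler.scaled_td_error(reward=10.0, value=0.0, next_value=0.0))
```

Under this assumed form, multiplying every reward (and the value estimates) by a constant leaves the scaled TD error essentially unchanged, which is one way a learner could cope with reward scales that vary significantly.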