TRAINING ACTION SELECTION NEURAL NETWORKS USING Q-LEARNING COMBINED WITH LOOK AHEAD SEARCH

A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a look ahead search: Reinforcement learning Q-values are used to guide the look ahead search and the search is used...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Weber, Theophane Guillaume, Battaglia, Peter William, Buesing, Lars, Pfaff, Tobias, Hamrick, Jessica Blake Chandler, Sanchez, Alvaro, Bapst, Victor Constant
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING PHYSICS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a look ahead search: Reinforcement learning Q-values are used to guide the look ahead search and the search is used in turn to improve the Q-values. The system learns from a combination of real experience and simulated, model-based experience.