Using the Q-learning algorithm in the constructive phase of the GRASP and reactive GRASP metaheuristics



Bibliographic Details
Main Authors: de Lima, Francisco Chagas; de Melo, Jorge Dantas; Doria Neto, Adriao Duarte
Format: Conference Proceedings
Language: English
Description
Summary: Currently, many problems considered intractable have been solved satisfactorily through approximate optimization methods called metaheuristics. These methods use non-deterministic approaches that find good solutions but do not guarantee finding the global optimum. The success of a metaheuristic is conditioned by its capacity to alternate adequately between exploration and exploitation of the solution space. One way to guide such algorithms toward better solutions is to supply them with more knowledge of the solution space (the environment of the problem). This can be done by mapping that environment into states and actions using reinforcement learning. This paper proposes the use of a reinforcement learning technique, the Q-learning algorithm, in the constructive phase of the GRASP and reactive GRASP metaheuristics. The proposed methods are applied to the symmetric traveling salesman problem.
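The record does not include the paper's code, but the idea the abstract describes can be sketched: in the GRASP constructive phase, candidate moves are ranked by learned Q-values rather than by a static greedy criterion, and each chosen edge updates its Q-value with the standard Q-learning rule. The function names, parameters (`alpha`, `gamma`, `rcl_size`), and reward definition below are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def dist(a, b):
    # Euclidean distance between two city coordinates
    return math.hypot(a[0] - b[0], a[1] - b[1])


def q_guided_tour(cities, Q, alpha=0.1, gamma=0.9, rcl_size=3):
    """One Q-learning-guided GRASP construction for the symmetric TSP.

    Candidates are ranked by Q[i][j] (learned desirability of moving
    from city i to city j), a restricted candidate list (RCL) keeps the
    top entries, one is picked at random (greedy-randomized step), and
    the chosen transition's Q-value is updated in place.
    """
    n = len(cities)
    start = random.randrange(n)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        i = tour[-1]
        # Rank remaining cities by current Q-value (higher = more promising)
        ranked = sorted(unvisited, key=lambda j: Q[i][j], reverse=True)
        rcl = ranked[:rcl_size]
        j = random.choice(rcl)                # greedy-randomized choice
        reward = -dist(cities[i], cities[j])  # shorter edges give higher reward
        # Best Q-value reachable from j among the cities still to visit
        rest = unvisited - {j}
        future = max(Q[j][k] for k in rest) if rest else 0.0
        # Standard Q-learning update: Q += alpha * (r + gamma * max Q' - Q)
        Q[i][j] += alpha * (reward + gamma * future - Q[i][j])
        tour.append(j)
        unvisited.discard(j)
    return tour


def tour_length(cities, tour):
    # Total length of the closed tour, returning to the start city
    return sum(dist(cities[tour[k]], cities[tour[(k + 1) % len(tour)]])
               for k in range(len(tour)))
```

Because the Q-table persists across constructions, repeated calls bias later tours toward edges that proved short earlier, which is the knowledge-transfer effect the abstract attributes to reinforcement learning.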
ISSN: 2161-4393; 1522-4899; 2161-4407
DOI: 10.1109/IJCNN.2008.4634399