Evolutionary Policy Iteration Under a Sampling Regime for Stochastic Combinatorial Optimization

Bibliographic Details
Published in: IEEE Transactions on Automatic Control, May 2010, Vol. 55 (5), pp. 1254-1257
Authors: Hannah, Lauren A.; Powell, Warren B.
Format: Article
Language: English
Description
Abstract: This article adapts the evolutionary policy selection algorithm of Chang et al., originally designed for infinite horizon Markov decision processes (MDPs) with large action spaces, to discrete stochastic optimization, yielding an algorithm called Evolutionary Policy Iteration-Monte Carlo (EPI-MC). EPI-MC allows EPI to be used in a stochastic combinatorial optimization setting with a finite action space and a noisy cost (value) function by introducing a sampling schedule. Convergence of EPI-MC to the optimal action is proven, and experimental results are given.
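The core idea in the abstract, evolutionary search over a finite action set where a noisy cost is averaged over a growing number of Monte Carlo replications per generation, can be illustrated with a short sketch. This is not the authors' exact EPI-MC algorithm; the function name, population scheme, and linear sampling schedule below are illustrative assumptions only.

```python
import random

def epi_mc_sketch(actions, noisy_cost, generations=30, population=10, seed=0):
    """Illustrative sketch (not the published EPI-MC): evolutionary search
    over a finite action set with a Monte Carlo sampling schedule, so that
    noisy cost estimates sharpen as the search concentrates on good actions."""
    rng = random.Random(seed)
    pool = [rng.choice(actions) for _ in range(population)]
    best = pool[0]
    for g in range(1, generations + 1):
        # Sampling schedule (assumed linear here): more replications each generation.
        n_samples = 5 * g
        # Monte Carlo estimate of each candidate's expected cost.
        est = {a: sum(noisy_cost(a) for _ in range(n_samples)) / n_samples
               for a in set(pool + [best])}
        # Elitism: retain the apparently best action found so far.
        best = min(est, key=est.get)
        # Next generation: keep the elite, refill the rest by random exploration.
        pool = [best] + [rng.choice(actions) for _ in range(population - 1)]
    return best
```

For example, minimizing the noisy quadratic `(a - 3)**2 + gauss(0, 0.1)` over the actions `0..9` recovers the optimal action 3, since the averaged estimates eventually separate the true costs from the noise.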
ISSN: 0018-9286, 1558-2523
DOI: 10.1109/TAC.2010.2042766