Evolutionary Policy Iteration Under a Sampling Regime for Stochastic Combinatorial Optimization
Published in: IEEE Transactions on Automatic Control, May 2010, Vol. 55 (5), pp. 1254-1257
Format: Article
Language: English
Abstract: This article modifies the evolutionary policy selection algorithm of Chang et al., which was designed for infinite-horizon Markov decision processes (MDPs) with large action spaces, so that it applies to a discrete stochastic optimization problem; the resulting algorithm is called Evolutionary Policy Iteration-Monte Carlo (EPI-MC). EPI-MC allows EPI to be used in a stochastic combinatorial optimization setting with a finite action space and a noisy cost (value) function by introducing a sampling schedule. Convergence of EPI-MC to the optimal action is proven and experimental results are given.
ISSN: 0018-9286, 1558-2523
DOI: 10.1109/TAC.2010.2042766
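
The abstract above only sketches the idea at a high level: a population-based search over a finite set of candidate actions, where each noisy cost is averaged over an increasing number of Monte Carlo samples per iteration. The following is a minimal, hypothetical Python sketch of that general idea only; the toy objective, the linear sampling schedule, and the selection rule are illustrative assumptions and are not the EPI-MC algorithm (policy switching, mutation, and the specific sampling schedule used in the convergence proof) described in the article.

```python
import random

def noisy_cost(action, rng):
    """Hypothetical noisy cost: a stand-in true cost plus zero-mean Gaussian noise."""
    true_cost = (action - 7) ** 2  # unknown to the search; used only to generate samples
    return true_cost + rng.gauss(0.0, 5.0)

def estimate_cost(action, n_samples, rng):
    """Monte Carlo estimate of the expected cost from n_samples noisy evaluations."""
    return sum(noisy_cost(action, rng) for _ in range(n_samples)) / n_samples

def population_search(actions, n_iters=50, pop_size=10, seed=0):
    """Population-based search over a finite action set with a growing sampling schedule.

    At iteration k each candidate's cost is averaged over more samples than at
    iteration k-1, so estimates become increasingly reliable as the search narrows.
    """
    rng = random.Random(seed)
    population = rng.sample(actions, min(pop_size, len(actions)))
    best = population[0]
    for k in range(1, n_iters + 1):
        n_samples = 2 * k  # illustrative increasing sampling schedule
        scored = sorted((estimate_cost(a, n_samples, rng), a) for a in population)
        best = scored[0][1]
        # keep the better half of the population, refill with random candidates
        survivors = [a for _, a in scored[: pop_size // 2]]
        newcomers = [rng.choice(actions) for _ in range(pop_size - len(survivors))]
        population = survivors + newcomers
    return best

if __name__ == "__main__":
    print(population_search(list(range(20))))
```

Under these assumptions the increasing schedule plays the role the abstract attributes to it: early iterations are cheap but noisy, while later iterations spend more samples so the estimated costs are accurate enough to distinguish the best action.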