Learning by doing and the value of optimal experimentation

Bibliographic details
Published in:	Journal of economic dynamics & control 2000-04, Vol.24 (4), p.501-534
Main author:	Wieland, Volker
Format:	Article
Language:	English
Subjects:	Bayesian analysis; Bayesian learning; Bayesian method; Decision theory; Dynamic programming; Economic models; Economic theory; Estimation; Experimental economics; Experimentation; Experiments; Learning; Learning by doing; Optimal control with unknown parameters; Programming; Studies
Online access:	Full text

Description
Summary:	Recent research on learning by doing has provided the limit properties of beliefs and actions for a class of learning problems in which experimentation is an important aspect of optimal decision making. However, under these conditions the optimal policy cannot be derived analytically, because Bayesian learning about unknown parameters introduces a nonlinearity in the dynamic optimization problem. This paper utilizes numerical methods to characterize the optimal policy function for a learning by doing problem that is general enough for practical economic applications. The optimal policy is found to incorporate a substantial degree of experimentation under a wide range of initial beliefs about the unknown parameters. Dynamic simulations indicate that optimal experimentation dramatically improves the speed of learning and the stream of future payoffs. Furthermore, these simulations reveal that a policy that separates control and estimation and does not incorporate experimentation frequently induces a long-lasting bias in the control and target variables. While these sequences tend to converge steadily under the optimal policy, they frequently exhibit non-stationary behavior when estimation and control are treated separately.
ISSN:	0165-1889 1879-1743
DOI:	10.1016/S0165-1889(99)00015-9
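
Illustrative note: the summary says that Bayesian learning about unknown parameters makes the dynamic optimization problem nonlinear. The sketch below is one way to see this in a deliberately simplified setting. It is not the model or the numerical method of the paper. It assumes a hypothetical scalar relationship y = beta * x + eps with known noise variance and a normal prior on beta. The function names and parameters are invented for illustration.

```python
def update_belief(mean, var, x, y, noise_var=1.0):
    """Normal-normal Bayesian update for the slope beta in y = beta * x + eps.

    Assumption (not from the record): eps ~ N(0, noise_var) with noise_var known,
    and the prior on beta is N(mean, var).
    """
    precision = 1.0 / var + x * x / noise_var   # posterior precision grows with x**2
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + x * y / noise_var)
    return new_mean, new_var


def certainty_equivalent_control(mean, target):
    """Control that treats the current point estimate as if it were the true slope.

    This is one simple example of separating estimation from control:
    it ignores how the choice of x affects future beliefs.
    """
    return target / mean


if __name__ == "__main__":
    prior_mean, prior_var = 1.0, 1.0
    # The same prior, two different control settings: the larger |x| is more informative.
    for x in (0.0, 1.0, 2.0):
        y = 2.0 * x  # hypothetical noise-free observation with true slope 2
        m, v = update_belief(prior_mean, prior_var, x, y)
        print(f"x={x}: posterior mean={m:.3f}, posterior variance={v:.3f}")
```

In this sketch the posterior variance depends on the square of the control x, so today's control changes what is known tomorrow. A control rule based only on the current point estimate, such as `certainty_equivalent_control`, does not take that effect into account.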