| Literature DB >> 11607366 |
Abstract
We combine the formulation of Mandelbaum [Mandelbaum, A. (1986) Probab. Theory Rel. Fields 71, 129-147] with ideas from Whittle [Whittle, P. (1980) J. R. Stat. Soc. B 42, 143-149] to obtain a simple and constructive proof for the optimality of Gittins index processes in the general, nonmarkovian dynamic allocation (or "multi-armed bandit") problem. Our approach also provides an explicit expression for the value of this problem.Entities:
Year: 1993 PMID: 11607366 PMCID: PMC45846 DOI: 10.1073/pnas.90.4.1232
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205