Branching Bandit Processes

Published online by Cambridge University Press:  27 July 2009

Gideon Weiss
Affiliation:
Industrial and Systems Engineering Georgia Institute of Technology Atlanta, Georgia 30332-0205 and Department of Statistics Tel Aviv University

Abstract

A set of n_i arms of type i, i = 1,…, L, is available. A pull of an arm of type i occupies a duration V_i, at the end of which a reward C_i and N_i1,…, N_iL new arms of types 1,…, L are obtained, while all other arms are frozen. A Gittins priority order of types is obtained and shown to yield the maximal discounted reward from this branching process of arms.
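
To make the model in the abstract concrete, the following is a minimal simulation sketch, not the paper's method. It evaluates the discounted reward earned by a fixed priority order over types; it does not compute the Gittins order itself. Several details are assumptions that the abstract does not state: the reward is discounted continuously at a rate `alpha` and credited at the end of the pull, the pulled arm is consumed and replaced by its offspring, and the names `discounted_reward`, `example_pull`, `alpha`, and `max_pulls` are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

# One pull of a type-i arm yields (duration, reward, offspring counts per type).
PullOutcome = Tuple[float, float, Sequence[int]]


def discounted_reward(
    counts: Sequence[int],
    priority: Sequence[int],
    pull: Callable[[int], PullOutcome],
    alpha: float,
    max_pulls: int = 10_000,
) -> float:
    """Total discounted reward when the highest-priority available type is always pulled.

    counts[i] is the initial number of arms of type i; priority lists the type
    indices from highest to lowest priority. Assumption: continuous discounting
    exp(-alpha * t), applied at the time the pull ends.
    """
    arms = list(counts)
    t = 0.0
    total = 0.0
    for _ in range(max_pulls):  # guard against processes that never die out
        i = next((k for k in priority if arms[k] > 0), None)
        if i is None:
            break  # no arms left
        duration, reward, offspring = pull(i)
        arms[i] -= 1  # assumption: the pulled arm is consumed
        t += duration  # all other arms stay frozen during the pull
        total += math.exp(-alpha * t) * reward
        for j, n in enumerate(offspring):
            arms[j] += n  # new arms obtained at the end of the pull
    return total


def example_pull(i: int) -> PullOutcome:
    """Hypothetical deterministic two-type example (illustrative numbers only)."""
    if i == 0:
        return 1.0, 1.0, [0, 1]  # short pull, small reward, spawns one type-1 arm
    return 2.0, 5.0, [0, 0]      # longer pull, larger reward, no offspring


if __name__ == "__main__":
    for order in ([0, 1], [1, 0]):
        value = discounted_reward([1, 1], order, example_pull, alpha=0.1)
        print(order, value)
```

Comparing the values printed for the two orders shows how the choice of priority order changes the discounted reward; in a stochastic setting, `pull` would draw the duration, reward, and offspring counts at random, and the result would be averaged over many runs.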

Type
Articles
Copyright
Copyright © Cambridge University Press 1988
