A conjecture on the Feldman bandit problem

  • Maher Nouiehed and Sheldon M. Ross
Abstract

We consider the Bernoulli bandit problem where one of the arms has win probability α and the others have win probability β, with the identity of the α arm specified by initial probabilities. With u = max(α, β) and v = min(α, β), call an arm with win probability u a good arm. It is known that the strategy of always playing the arm with the largest probability of being a good arm maximizes the expected number of wins in the first n games for all n. We conjecture that it also stochastically maximizes the number of wins; that is, this strategy maximizes the probability of at least k wins in the first n games for all k, n. The conjecture is proven when k = 1, when k = n, and when there are only two arms and k = n - 1.
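The strategy described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a prior vector p, where p[i] is the probability that arm i is the α arm, updates p by Bayes' rule after each play, and computes by exhaustive recursion the probability of at least k wins in n games, both for the strategy of playing the arm most likely to be good and for the best possible strategy. All function names and the parameter values (α = 0.7, β = 0.4, three arms) are hypothetical choices made for illustration.

```python
def posterior(p, arm, win, alpha, beta):
    """Bayes update of p[i] = P(arm i is the alpha arm) after playing `arm`."""
    weights = []
    for i, p_i in enumerate(p):
        rate = alpha if i == arm else beta      # win prob of `arm` if i is the alpha arm
        weights.append(p_i * (rate if win else 1.0 - rate))
    total = sum(weights)
    return tuple(w / total for w in weights)


def win_prob(p, arm, alpha, beta):
    """Probability that the next play of `arm` is a win under beliefs p."""
    return p[arm] * alpha + (1.0 - p[arm]) * beta


def greedy_arm(p, alpha, beta):
    """Arm with the largest probability of being good (win prob max(alpha, beta))."""
    if alpha >= beta:
        return max(range(len(p)), key=lambda i: p[i])   # good arm is the alpha arm
    return min(range(len(p)), key=lambda i: p[i])       # good arms are the beta arms


def prob_at_least(p, n, k, alpha, beta, greedy):
    """P(at least k wins in n games): greedy strategy, or the optimum if greedy=False."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    arms = [greedy_arm(p, alpha, beta)] if greedy else range(len(p))
    best = 0.0
    for a in arms:
        q = win_prob(p, a, alpha, beta)
        value = 0.0
        if q > 0.0:
            value += q * prob_at_least(
                posterior(p, a, True, alpha, beta), n - 1, k - 1, alpha, beta, greedy)
        if q < 1.0:
            value += (1.0 - q) * prob_at_least(
                posterior(p, a, False, alpha, beta), n - 1, k, alpha, beta, greedy)
        best = max(best, value)
    return best


# Illustrative (assumed) instance: three arms, alpha > beta, n = 4 games.
ALPHA, BETA, PRIOR, N = 0.7, 0.4, (0.5, 0.3, 0.2), 4
for k in range(1, N + 1):
    g = prob_at_least(PRIOR, N, k, ALPHA, BETA, greedy=True)
    o = prob_at_least(PRIOR, N, k, ALPHA, BETA, greedy=False)
    print(f"k={k}: greedy={g:.6f} optimal={o:.6f}")
```

For k = 1 and k = n the two columns should agree, since those cases are proven; agreement for intermediate k on a small instance like this is only consistent with the conjecture and does not establish it.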

Corresponding author
* Postal address: Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, CA 90089, USA.
** Email address: nouiehed@usc.edu
*** Email address: smross@usc.edu
References
[1] Feldman, D. (1962). Contributions to the "two-armed bandit" problem. Ann. Math. Statist. 33, 847–856.
[2] Rodman, L. (1978). On the many-armed bandit problem. Ann. Prob. 6, 491–498.
[3] Presman, É. L. and Sonin, I. N. (1990). Sequential Control With Incomplete Information: The Bayesian Approach to Multi-Armed Bandit Problems. Academic Press, San Diego, CA.
Journal of Applied Probability
  • ISSN: 0021-9002
  • EISSN: 1475-6072
  • URL: /core/journals/journal-of-applied-probability