Twenty Questions with Noise: Bayes Optimal Policies for Entropy Loss

Bruno Jedynak; Peter I. Frazier; Raphael Sznitman

doi:10.1239/jap/1331216837

Part of: Operations research and management science; Decision theory; Markov processes

Published online by Cambridge University Press: 04 February 2016

Bruno Jedynak∗, Johns Hopkins University
Peter I. Frazier∗∗, Cornell University
Raphael Sznitman∗∗∗, Johns Hopkins University

∗ Postal address: Department of Applied Mathematics and Statistics, Johns Hopkins University, Whitehead 208-B, 3400 North Charles Street, Baltimore, MD 21218, USA. Email address: bruno.jedynak@jhu.edu
∗∗ Postal address: School of Operations Research and Industrial Engineering, Cornell University, 232 Rhodes Hall, Ithaca, NY 14850, USA.
∗∗∗ Postal address: Johns Hopkins University, Hackerman Hall, 3400 North Charles Street, Baltimore, MD 21218, USA.

Abstract

We consider the problem of twenty questions with noisy answers, in which we seek to find a target by repeatedly choosing a set, asking an oracle whether the target lies in this set, and obtaining an answer corrupted by noise. Starting with a prior distribution on the target's location, we seek to minimize the expected entropy of the posterior distribution. We formulate this problem as a dynamic program and show that any policy optimizing the one-step expected reduction in entropy is also optimal over the full horizon. Two such Bayes optimal policies are presented: one generalizes the probabilistic bisection policy due to Horstein and the other asks a deterministic set of questions. We study the structural properties of the latter, and illustrate its use in a computer vision application.
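The one-step entropy-reduction rule described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a discretized target on `n_cells` cells, questions restricted to prefix sets ("is the target in cells 0..k-1?"), and a binary symmetric noise model that flips each answer with a known probability `eps`; all function and parameter names are hypothetical. Under these assumptions the greedy rule picks the cut whose prior mass is closest to 1/2, as in probabilistic bisection, and each question removes at most 1 - h(eps) bits of expected entropy, where h is the binary entropy function.

```python
import math
import random


def entropy(p):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def posterior(prior, k, answer, eps):
    """Bayes update after asking 'is the target in cells 0..k-1?'.

    `answer` is the oracle's (possibly flipped) reply and `eps` is the
    assumed flip probability of the binary symmetric noise model.
    """
    weights = []
    for i, q in enumerate(prior):
        truth = i < k
        likelihood = 1 - eps if answer == truth else eps
        weights.append(q * likelihood)
    total = sum(weights)
    return [w / total for w in weights]


def expected_entropy(prior, k, eps):
    """Expected posterior entropy after asking the prefix question with cut k."""
    mass = sum(prior[:k])
    p_yes = mass * (1 - eps) + (1 - mass) * eps
    result = 0.0
    for answer, p_answer in ((True, p_yes), (False, 1 - p_yes)):
        if p_answer > 0:  # skip answers that cannot occur
            result += p_answer * entropy(posterior(prior, k, answer, eps))
    return result


def greedy_cut(prior, eps):
    """Cut that minimizes the one-step expected posterior entropy."""
    return min(range(1, len(prior)),
               key=lambda k: expected_entropy(prior, k, eps))


def search(n_cells, target, eps, n_questions, rng):
    """Ask n_questions greedy prefix questions; return the final belief."""
    belief = [1 / n_cells] * n_cells
    for _ in range(n_questions):
        k = greedy_cut(belief, eps)
        truth = target < k
        # The simulated oracle flips the true answer with probability eps.
        answer = truth if rng.random() >= eps else not truth
        belief = posterior(belief, k, answer, eps)
    return belief


if __name__ == "__main__":
    final = search(n_cells=64, target=37, eps=0.1, n_questions=40,
                   rng=random.Random(0))
    print("most likely cell:", max(range(64), key=lambda i: final[i]))
    print("posterior entropy (bits):", round(entropy(final), 3))
```

With `eps = 0` the sketch reduces to ordinary bisection, which locates one of 8 cells in exactly 3 questions. The paper itself treats more general targets, question sets, and noise models than this discrete illustration.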

Keywords

Twenty questions; dynamic programming; bisection; search; object detection; entropy loss; sequential experimental design; Bayesian experimental design

MSC classification

Primary: 60J20: Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.)

Secondary: 62C10: Bayesian problems; characterization of Bayes procedures
90B40: Search theory
90C39: Dynamic programming

Type: Research Article
Information: Journal of Applied Probability, Volume 49, Issue 1, March 2012, pp. 114–136

DOI: https://doi.org/10.1239/jap/1331216837
Copyright: © Applied Probability Trust

Footnotes

Research supported in part by AFOSR YIP FA9550-11-1-0083.

Research supported in part by NIH grant R01 EB 007969-01.

References

Ben-Or, M. and Hassidim, A. (2008). The Bayesian learner is optimal for noisy binary search (and pretty good for quantum as well). In 2008 49th Ann. IEEE Symp. Foundations of Computer Science. IEEE Computer Society Press, Washington, DC, pp. 221–230.

Berry, D. A. and Fristedt, B. (1985). Bandit Problems. Chapman & Hall, London.

Blum, J. R. (1954). Multidimensional stochastic approximation methods. Ann. Math. Statist. 25, 737–744.

Burnašhev, M. V. and Zigangirov, K. Š. (1974). A certain problem of interval estimation in observation control. Problemy Peredachi Informatsii 10, 51–61.

Castro, R. and Nowak, R. (2008). Active learning and sampling. In Foundations and Applications of Sensor Management, Springer, pp. 177–200.

Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory. John Wiley, New York.

DeGroot, M. H. (1970). Optimal Statistical Decisions. McGraw Hill, New York.

Dynkin, E. B. and Yushkevich, A. A. (1979). Controlled Markov Processes. Springer, New York.

Frazier, P. I., Powell, W. B. and Dayanik, S. (2008). A knowledge-gradient policy for sequential information collection. SIAM J. Control Optimization 47, 2410–2439.

Geman, D. and Jedynak, B. (1996). An active testing model for tracking roads in satellite images. IEEE Trans. Pattern Anal. Machine Intelligence 18, 1–14.

Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, Chichester.

Horstein, M. (1963). Sequential decoding using noiseless feedback. IEEE Trans. Inf. Theory 9, 136–143.

Horstein, M. (2002). Sequential transmission using noiseless feedback. IEEE Trans. Inf. Theory 9, 136–143.

Karp, R. M. and Kleinberg, R. (2007). Noisy binary search and its applications. In Proc. 18th Ann. ACM-SIAM Symp. Discrete Algorithms, ACM, New York, pp. 881–890.

Kushner, H. J. and Yin, G. G. (2003). Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Springer, New York.

Lai, T. L. and Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6, 4–22.

Lampert, C. H., Blaschko, M. B. and Hofmann, T. (2009). Efficient subwindow search: a branch and bound framework for object localization. IEEE Trans. Pattern Anal. Machine Intelligence 31, 2129–2142.

Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. Internat. J. Comput. Vision 60, 91–110.

Nowak, R. (2008). Generalized binary search. In 2008 46th Ann. Allerton Conf. Commun., Control, and Computing, pp. 568–574.

Nowak, R. (2009). Noisy generalized binary search. Adv. Neural Inf. Processing Systems 22, 1366–1374.

Pelc, A. (2002). Searching games with errors—fifty years of coping with liars. Theoret. Comput. Sci. 270, 71–109.

Polyak, B. T. (1990). A new method of stochastic approximation type. Automat. Remote Control 51, 937–946.

Robbins, H. (1952). Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. 58, 527–535.

Robbins, H. and Monro, S. (1951). A stochastic approximation method. Ann. Math. Statist. 22, 400–407.

Ruppert, D. (1988). Efficient estimators from a slowly convergent Robbins-Monro procedure. Tech. Rep. 781, School of Operations Research and Industrial Engineering, Cornell University.

Schapire, R. E. (1990). The strength of weak learnability. Machine Learning 5, 197–227.

Sznitman, R. and Jedynak, B. (2010). Active testing for face detection and localization. IEEE Trans. Pattern Anal. Machine Intelligence 32, 1914–1914.

Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer, New York.

Vedaldi, A., Gulshan, V., Varma, M. and Zisserman, A. (2009). Multiple kernels for object detection. In Proc. Internat. Conf. Computer Vision, pp. 606–613.

Viola, P. and Jones, M. J. (2004). Robust real-time face detection. Internat. J. Comput. Vision 57, 137–154.

Waeber, R., Frazier, P. I. and Henderson, S. G. (2011). A Bayesian approach to stochastic root finding. In Proc. 2011 Winter Simulation Conference, eds Jain, S. et al., IEEE.

Whittle, P. (1981). Arm-acquiring bandits. Ann. Prob. 9, 284–292.

Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), ed. Gani, J., Applied Probability Trust, Sheffield, pp. 287–298.
