

Published online by Cambridge University Press:  25 January 2023

Roman Garnett
Affiliation: Washington University in St Louis

Type: Chapter
Information: Bayesian Optimization, pp. 331–352
Publisher: Cambridge University Press
Print publication year: 2023


References

Abbasi-Yadkori, Yasin (2012). Online Learning for Linearly Parameterized Control Problems. Ph.D. thesis. University of Alberta.
Acerbi, Luigi (2018). Variational Bayesian Monte Carlo. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 8213–8223.
Acerbi, Luigi and Ma, Wei Ji (2017). Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search. Advances in Neural Information Processing Systems 30 (NeurIPS 2017), pp. 1836–1846.
Adams, Ryan Prescott, Murray, Iain, and MacKay, David J. C. (2009). Tractable Nonparametric Bayesian Inference in Poisson Processes with Gaussian Process Intensities. Proceedings of the 26th International Conference on Machine Learning (ICML 2009), pp. 9–16.
Adler, Robert J. and Taylor, Jonathan E. (2007). Random Fields and Geometry. Springer Monographs in Mathematics. Springer–Verlag.
Agrawal, Rajeev (1995). The Continuum-Armed Bandit Problem. SIAM Journal on Control and Optimization 33(6):1926–1951.
Agrawal, Shipra and Goyal, Navin (2012). Analysis of Thompson Sampling for the Multi-Armed Bandit Problem. Proceedings of the 25th Annual Conference on Learning Theory (COLT 2012). Vol. 23. Proceedings of Machine Learning Research, pp. 39.1–39.26.
Álvarez, Mauricio A., Rosasco, Lorenzo, and Lawrence, Neil D. (2012). Kernels for Vector-Valued Functions: A Review. Foundations and Trends in Machine Learning 4(3):195–266.
Arcones, Miguel A. (1992). On the arg max of a Gaussian Process. Statistics & Probability Letters 15(5):373–374.
Auer, Peter, Cesa-Bianchi, Nicolò, and Fischer, Paul (2002). Finite-Time Analysis of the Multiarmed Bandit Problem. Machine Learning 47(2–3):235–256.
Balandat, Maximilian, Karrer, Brian, Jiang, Daniel R., Daulton, Samuel, Letham, Benjamin, Wilson, Andrew Gordon, et al. (2020). BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 21524–21538.
Baptista, Ricardo and Poloczek, Matthias (2018). Bayesian Optimization of Combinatorial Structures. Proceedings of the 35th International Conference on Machine Learning (ICML 2018). Vol. 80. Proceedings of Machine Learning Research, pp. 462–471.
Bather, John (1996). A Conversation with Herman Chernoff. Statistical Science 11(4):335–350.
Belakaria, Syrine, Deshwal, Aryan, and Doppa, Janardhan Rao (2019). Max-value Entropy Search for Multi-Objective Bayesian Optimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 7825–7835.
Bellman, Richard (1952). On the Theory of Dynamic Programming. Proceedings of the National Academy of Sciences 38(8):716–719.
Bellman, Richard (1957). Dynamic Programming. Princeton University Press.
Berger, James O. (1985). Statistical Decision Theory and Bayesian Analysis. 2nd ed. Springer Series in Statistics. Springer–Verlag.
Bergstra, James, Bardenet, Rémi, Bengio, Yoshua, and Kégl, Balázs (2011). Algorithms for Hyper-Parameter Optimization. Advances in Neural Information Processing Systems 24 (NeurIPS 2011), pp. 2546–2554.
Bergstra, James and Bengio, Yoshua (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research 13:281–305.
Berkenkamp, Felix, Schoellig, Angela P., and Krause, Andreas (2019). No-Regret Bayesian Optimization with Unknown Hyperparameters. Journal of Machine Learning Research 20(50):1–24.
Berry, Donald A. and Fristedt, Bert (1985). Bandit Problems: Sequential Allocation of Experiments. Monographs on Statistics and Applied Probability. Chapman & Hall.
Bertsekas, Dimitri P. (2017). Dynamic Programming and Optimal Control. 4th ed. Vol. 1. Athena Scientific.
Bochner, S. (1933). Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Mathematische Annalen 108:378–410.
Bogunovic, Ilija, Krause, Andreas, and Scarlett, Jonathan (2020). Corruption-Tolerant Gaussian Process Bandit Optimization. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 1071–1081.
Bogunovic, Ilija, Scarlett, Jonathan, Jegelka, Stefanie, and Cevher, Volkan (2018). Adversarially Robust Optimization with Gaussian Processes. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 5760–5770.
Box, G. E. P. (1954). The Exploration and Exploitation of Response Surfaces: Some General Considerations and Examples. Biometrics 10(1):16–60.
Box, George E. P., Hunter, J. Stuart, and Hunter, William G. (2005). Statistics for Experimenters: Design, Innovation, and Discovery. 2nd ed. Wiley Series in Probability and Statistics. John Wiley & Sons.
Box, G. E. P. and Wilson, K. B. (1951). On the Experimental Attainment of Optimum Conditions. Journal of the Royal Statistical Society Series B (Methodological) 13(1):1–45.
Box, G. E. P. and Youle, P. V. (1954). The Exploration and Exploitation of Response Surfaces: An Example of the Link between the Fitted Surface and the Basic Mechanism of the System. Biometrics 11(3):287–323.
Breiman, Leo (2001). Random Forests. Machine Learning 45(1):5–32.
Brent, Richard P. (1973). Algorithms for Minimization without Derivatives. Prentice–Hall Series in Automatic Computation. Prentice–Hall.
Brochu, Eric, Cora, Vlad M., and de Freitas, Nando (2010). A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. arXiv: 1012.2599 [cs.LG].
Brooks, Steve, Gelman, Andrew, Jones, Galin L., and Meng, Xiao-Li, eds. (2011). Handbook of Markov Chain Monte Carlo. Handbooks of Modern Statistical Methods. Chapman & Hall.
Bubeck, Sébastien and Cesa-Bianchi, Nicolò (2012). Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems. Foundations and Trends in Machine Learning 5(1):1–122.
Bubeck, Sébastien, Munos, Rémi, and Stoltz, Gilles (2009). Pure Exploration in Multi-Armed Bandits Problems. Proceedings of the 20th International Conference on Algorithmic Learning Theory (ALT 2009). Vol. 5809. Lecture Notes in Computer Science. Springer–Verlag, pp. 23–37.
Bull, Adam D. (2011). Convergence Rates of Efficient Global Optimization Algorithms. Journal of Machine Learning Research 12(88):2879–2904.
Caflisch, Russel E. (1998). Monte Carlo and Quasi-Monte Carlo Methods. Acta Numerica 7:1–49.
Cai, Xu, Gomes, Selwyn, and Scarlett, Jonathan (2021). Lenient Regret and Good-Action Identification in Gaussian Process Bandits. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 1183–1192.
Cakmak, Sait, Astudillo, Raul, Frazier, Peter, and Zhou, Enlu (2020). Bayesian Optimization of Risk Measures. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 18039–18049.
Calandra, Roberto, Peters, Jan, Rasmussen, Carl Edward, and Deisenroth, Marc Peter (2016). Manifold Gaussian Processes for Regression. Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN 2016), pp. 3338–3345.
Calvin, J. and Žilinskas, A. (1999). On the Convergence of the P-Algorithm for One-Dimensional Global Optimization of Smooth Functions. Journal of Optimization Theory and Applications 102(3):479–495.
Calvin, James M. (1993). Consistency of a Myopic Bayesian Algorithm for One-Dimensional Global Optimization. Journal of Global Optimization 3(2):223–232.
Calvin, James M. (2000). Convergence Rate of the P-Algorithm for Optimization of Continuous Functions. In: Approximation and Complexity in Numerical Optimization: Continuous and Discrete Problems. Ed. by Pardalos, Panos M. Vol. 42. Nonconvex Optimization and Its Applications. Springer–Verlag, pp. 116–129.
Calvin, James M. and Žilinskas, Antanas (2001). On Convergence of a P-Algorithm Based on a Statistical Model of Continuously Differentiable Functions. Journal of Global Optimization 19(3):229–245.
Camilleri, Romain, Katz-Samuels, Julian, and Jamieson, Kevin (2021). High-Dimensional Experimental Design and Kernel Bandits. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 1227–1237.
Cartinhour, Jack (1989). One-Dimensional Marginal Density Functions of a Truncated Multivariate Normal Density Function. Communications in Statistics – Theory and Methods 19(1):197–203.
Chapelle, Olivier and Li, Lihong (2011). An Empirical Evaluation of Thompson Sampling. Advances in Neural Information Processing Systems 24 (NeurIPS 2011), pp. 2249–2257.
Chernoff, Herman (1959). Sequential Design of Experiments. The Annals of Mathematical Statistics 30(3):755–770.
Chernoff, Herman (1972). Sequential Analysis and Optimal Design. CBMS–NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics.
Chevalier, Clément and Ginsbourger, David (2013). Fast Computation of the Multi-Points Expected Improvement with Applications in Batch Selection. Proceedings of the 7th Learning and Intelligent Optimization Conference (LION 7). Vol. 7997. Lecture Notes in Computer Science. Springer–Verlag, pp. 59–69.
Chowdhury, Sayak Ray and Gopalan, Aditya (2017). On Kernelized Multi-Armed Bandits. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 844–853.
Clark, Charles E. (1961). The Greatest of a Finite Set of Random Variables. Operations Research 9(2):145–162.
Contal, Emile, Buffoni, David, Robicquet, Alexandre, and Vayatis, Nicolas (2013). Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration. Proceedings of the 2013 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2013). Vol. 8188. Lecture Notes in Computer Science. Springer–Verlag, pp. 225–240.
Cover, Thomas M. and Thomas, Joy A. (2006). Elements of Information Theory. 2nd ed. John Wiley & Sons.
Cox, Dennis D. and John, Susan (1992). A Statistical Method for Global Optimization. Proceedings of the 1992 IEEE International Conference on Systems, Man, and Cybernetics (SMC 1992), pp. 1241–1246.
Cunningham, John P., Hennig, Philipp, and Lacoste-Julien, Simon (2011). Gaussian Probabilities and Expectation Propagation. arXiv: 1111.6832 [stat.ML].
Cutajar, Kurt, Osborne, Michael A., Cunningham, John P., and Filippone, Maurizio (2016). Preconditioning Kernel Matrices. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016). Vol. 48. Proceedings of Machine Learning Research, pp. 2529–2538.
Dai, Zhenwen, Damianou, Andreas, González, Javier, and Lawrence, Neil (2016). Variational Auto-encoded Deep Gaussian Processes. Proceedings of the 4th International Conference on Learning Representations (ICLR 2016). arXiv: 1511.06455 [cs.LG].
Dai, Zhongxiang, Yu, Haibin, Low, Bryan Kian Hsiang, and Jaillet, Patrick (2019). Bayesian Optimization Meets Bayesian Optimal Stopping. Proceedings of the 36th International Conference on Machine Learning (ICML 2019). Vol. 97. Proceedings of Machine Learning Research, pp. 1496–1506.
Dalibard, Valentin, Schaarschmidt, Michael, and Yoneki, Eiko (2017). BOAT: Building Auto-Tuners with Structured Bayesian Optimization. Proceedings of the 26th International Conference on World Wide Web (WWW 2017), pp. 479–488.
Dani, Varsha, Hayes, Thomas P., and Kakade, Sham M. (2008). Stochastic Linear Optimization under Bandit Feedback. Proceedings of the 21st Conference on Learning Theory (COLT 2008), pp. 355–366.
Davis, Philip J. and Rabinowitz, Philip (1984). Methods of Numerical Integration. 2nd ed. Computer Science and Applied Mathematics. Academic Press.
De Ath, George, Everson, Richard M., Rahat, Alma A., and Fieldsend, Jonathan E. (2021). Greed Is Good: Exploration and Exploitation Trade-offs in Bayesian Optimisation. ACM Transactions on Evolutionary Learning and Optimization 1(1):1–22.
De Ath, George, Fieldsend, Jonathan E., and Everson, Richard M. (2020). What Do You Mean? The Role of the Mean Function in Bayesian Optimization. Proceedings of the 2020 Genetic and Evolutionary Computation Conference (GECCO 2020), pp. 1623–1631.
de Freitas, Nando, Smola, Alex J., and Zoghi, Masrour (2012a). Regret Bounds for Deterministic Gaussian Process Bandits. arXiv: 1203.2177 [cs.LG].
de Freitas, Nando, Smola, Alex J., and Zoghi, Masrour (2012b). Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations. Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 955–962.
DeGroot, Morris H. (1970). Optimal Statistical Decisions. McGraw–Hill.
Desautels, Thomas, Krause, Andreas, and Burdick, Joel W. (2014). Parallelizing Exploration–Exploitation Tradeoffs in Gaussian Process Bandit Optimization. Journal of Machine Learning Research 15(119):4053–4103.
Diaconis, Persi (1988). Bayesian Numerical Analysis. In: Statistical Decision Theory and Related Topics IV. Ed. by Gupta, Shanti S. and Berger, James O. Vol. 1. Springer–Verlag, pp. 163–175.
Djolonga, Josip, Krause, Andreas, and Cevher, Volkan (2013). High-Dimensional Gaussian Process Bandits. Advances in Neural Information Processing Systems 26 (NeurIPS 2013), pp. 1025–1033.
Domhan, Tobias, Springenberg, Jost Tobias, and Hutter, Frank (2015). Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves. Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), pp. 3460–3468.
Duvenaud, David, Lloyd, James Robert, Grosse, Roger, Tenenbaum, Joshua B., and Ghahramani, Zoubin (2013). Structure Discovery in Nonparametric Regression through Compositional Kernel Search. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 1166–1174.
Emmerich, Michael T. M., Giannakoglou, Kyriakos C., and Naujoks, Boris (2006). Single- and Multiobjective Evolutionary Optimization Assisted by Gaussian Random Field Metamodels. IEEE Transactions on Evolutionary Computation 10(4):421–439.
Emmerich, Michael and Naujoks, Boris (2004). Metamodel Assisted Multiobjective Optimisation Strategies and Their Application in Airfoil Design. In: Adaptive Computing in Design and Manufacture VI. Ed. by Parmee, I. C. Springer–Verlag, pp. 249–260.
Eriksson, David, Pearce, Michael, Gardner, Jacob R., Turner, Ryan, and Poloczek, Matthias (2019). Scalable Global Optimization via Local Bayesian Optimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 5496–5507.
Fernández-Delgado, Manuel, Cernadas, Eva, Barro, Senén, and Amorim, Dinani (2014). Do We Need Hundreds of Classifiers to Solve Real World Classification Problems? Journal of Machine Learning Research 15(90):3133–3181.
Feurer, Matthias, Springenberg, Jost Tobias, and Hutter, Frank (2015). Initializing Bayesian Hyperparameter Optimization via Meta-Learning. Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI 2015), pp. 1128–1135.
Fisher, Ronald A. (1935). The Design of Experiments. Oliver and Boyd.
Fleischer, M. (2003). The Measure of Pareto Optima: Applications to Multi-Objective Metaheuristics. Proceedings of the 2nd International Conference on Evolutionary Multi-Criterion Optimization (EMO 2003). Vol. 2632. Lecture Notes in Computer Science. Springer–Verlag, pp. 519–533.
Forrester, Alexander I. J., Keane, Andy J., and Bressloff, Neil W. (2006). Design and Analysis of “Noisy” Computer Experiments. AIAA Journal 44(10):2331–2339.
Frazier, Peter and Powell, Warren (2007). The Knowledge Gradient Policy for Offline Learning with Independent Normal Rewards. Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007), pp. 143–150.
Frazier, Peter, Powell, Warren, and Dayanik, Savas (2009). The Knowledge-Gradient Policy for Correlated Normal Beliefs. INFORMS Journal on Computing 21(4):599–613.
Friedman, Milton and Savage, L. J. (1947). Planning Experiments Seeking Maxima. In: Selected Techniques of Statistical Analysis for Scientific and Industrial Research, and Production and Management Engineering. Ed. by Eisenhart, Churchill, Hastay, Millard W., and Wallis, W. Allen. McGraw–Hill, pp. 363–372.
Fröhlich, Lukas P., Klenske, Edgar D., Vinogradska, Julia, Daniel, Christian, and Zeilinger, Melanie N. (2020). Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 2262–2272.
García-Barcos, Javier and Martinez-Cantin, Ruben (2021). Robust Policy Search for Robot Navigation. IEEE Robotics and Automation Letters 6(2):2389–2396.
Gardner, Jacob R., Guo, Chuan, Weinberger, Kilian Q., Garnett, Roman, and Grosse, Roger (2017). Discovering and Exploiting Additive Structure for Bayesian Optimization. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). Vol. 54. Proceedings of Machine Learning Research, pp. 1311–1319.
Gardner, Jacob R., Kusner, Matt J., Xu, Zhixiang (Eddie), Weinberger, Kilian Q., and Cunningham, John P. (2014). Bayesian Optimization with Inequality Constraints. Proceedings of the 31st International Conference on Machine Learning (ICML 2014). Vol. 32. Proceedings of Machine Learning Research, pp. 937–945.
Gardner, Jacob R., Pleiss, Geoff, Bindel, David, Weinberger, Kilian Q., and Wilson, Andrew Gordon (2018). GPyTorch: Blackbox Matrix–Matrix Gaussian Process Inference with GPU Acceleration. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 7576–7586.
Garnett, Roman, Gärtner, Thomas, Vogt, Martin, and Bajorath, Jürgen (2015). Introducing the ‘Active Search’ Method for Iterative Virtual Screening. Journal of Computer-Aided Molecular Design 29(4):305–314.
Garnett, Roman, Krishnamurthy, Yamuna, Xiong, Xuehan, Schneider, Jeff, and Mann, Richard (2012). Bayesian Optimal Active Search and Surveying. Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 1239–1246.
Garnett, Roman, Osborne, Michael A., and Hennig, Philipp (2014). Active Learning of Linear Embeddings for Gaussian Processes. Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 230–239.
Garnett, R., Osborne, M. A., and Roberts, S. J. (2010). Bayesian Optimization for Sensor Set Selection. Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN 2010), pp. 209–219.
Gelbart, Michael A., Snoek, Jasper, and Adams, Ryan P. (2014). Bayesian Optimization with Unknown Constraints. Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 250–259.
Gelman, Andrew and Vehtari, Aki (2021). What Are the Most Important Statistical Ideas of the Past 50 Years? Journal of the American Statistical Association 116(536):2087–2097.
Gergonne, Joseph Diez (1815). Application de la méthode des moindres quarrés à l’interpolation des suites. Annales de Mathématiques pures et appliquées 6:242–252.
Ghosal, Subhashis and Roy, Anindya (2006). Posterior Consistency of Gaussian Process Prior for Nonparametric Binary Regression. The Annals of Statistics 34(5):2413–2429.
Gibbs, Mark N. (1997). Bayesian Gaussian Processes for Regression and Classification. Ph.D. thesis. University of Cambridge.
Gilboa, Elad, Saatçi, Yunus, and Cunningham, John P. (2013). Scaling Multidimensional Gaussian Processes Using Projected Additive Approximations. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 454–461.
Ginsbourger, David and Le Riche, Rodolphe (2010). Towards Gaussian Process-Based Optimization with Finite Time Horizon. Proceedings of the 9th International Workshop in Model-Oriented Design and Analysis (mODa 9). Contributions to Statistics. Springer–Verlag, pp. 89–96.
Ginsbourger, David, Le Riche, Rodolphe, and Carraro, Laurent (2010). Kriging Is Well-Suited to Parallelize Optimization. In: Computational Intelligence in Expensive Optimization Problems. Ed. by Tenne, Yoel and Goh, Chi-Keong. Adaptation, Learning, and Optimization. Springer–Verlag, pp. 131–162.
Golovin, Daniel and Zhang, Qiuyi (Richard) (2020). Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 11096–11105.
Golub, Gene H. and Van Loan, Charles F. (2013). Matrix Computations. 4th ed. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press.
Gómez-Bombarelli, Rafael, Wei, Jennifer N., Duvenaud, David, Hernández-Lobato, José Miguel, Sánchez-Lengeling, Benjamín, Sheberla, Dennis, et al. (2018). Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Central Science 4(2):268–276.
González, Javier, Dai, Zhenwen, Hennig, Philipp, and Lawrence, Neil (2016a). Batch Bayesian Optimization via Local Penalization. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 648–657.
González, Javier, Osborne, Michael, and Lawrence, Neil D. (2016b). GLASSES: Relieving the Myopia of Bayesian Optimisation. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 790–799.
Gramacy, Robert B. and Lee, Herbert K. H. (2011). Optimization under Unknown Constraints. In: Bayesian Statistics 9. Ed. by Bernardo, J. M., Bayarri, M. J., Berger, J. O., Dawid, A. P., Heckerman, D., Smith, A. F. M., et al. Oxford University Press, pp. 229–256.
Granmo, Ole-Christoffer (2010). Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton. International Journal of Intelligent Computing and Cybernetics 3(2):207–234.
Grünewälder, Steffen, Audibert, Jean-Yves, Opper, Manfred, and Shawe-Taylor, John (2010). Regret Bounds for Gaussian Process Bandit Problems. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). Vol. 9. Proceedings of Machine Learning Research, pp. 273–280.
Hansen, Nikolaus (2016). The CMA Evolution Strategy: A Tutorial. arXiv: 1604.00772 [cs.LG].
Hastie, Trevor and Tibshirani, Robert (1986). Generalized Additive Models. Statistical Science 1(3):297–318.
Hennig, Philipp, Osborne, Michael A., and Girolami, Mark (2015). Probabilistic Numerics and Uncertainty in Computations. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 471(2179):20150142.
Hennig, Philipp, Osborne, Michael A., and Kersting, Hans (2022). Probabilistic Numerics: Computation as Machine Learning. Cambridge University Press.
Hennig, Philipp and Schuler, Christian J. (2012). Entropy Search for Information-Efficient Global Optimization. Journal of Machine Learning Research 13(Jun):1809–1837.
Hensman, James, Matthews, Alexander G. de G., Filippone, Maurizio, and Ghahramani, Zoubin (2015). MCMC for Variationally Sparse Gaussian Processes. Advances in Neural Information Processing Systems 28 (NeurIPS 2015), pp. 1648–1656.
Hernández-Lobato, Daniel, Hernández-Lobato, José Miguel, Shah, Amar, and Adams, Ryan P. (2016a). Predictive Entropy Search for Multi-Objective Bayesian Optimization. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016). Vol. 48. Proceedings of Machine Learning Research, pp. 1492–1501.
Hernández-Lobato, José Miguel, Gelbart, Michael A., Adams, Ryan P., Hoffman, Matthew W., and Ghahramani, Zoubin (2016b). A General Framework for Constrained Bayesian Optimization Using Information-Based Search. Journal of Machine Learning Research 17:1–53.
Hernández-Lobato, José Miguel, Hoffman, Matthew W., and Ghahramani, Zoubin (2014). Predictive Entropy Search for Efficient Global Optimization of Black-Box Functions. Advances in Neural Information Processing Systems 27 (NeurIPS 2014), pp. 918–926.
Hernández-Lobato, José Miguel, Requeima, James, Pyzer-Knapp, Edward O., and Aspuru-Guzik, Alán (2017). Parallel and Distributed Thompson Sampling for Large-Scale Accelerated Exploration of Chemical Space. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 1470–1479.
Hestenes, Magnus R. and Stiefel, Eduard (1952). Methods of Conjugate Gradients for Solving Linear Systems. Journal of Research of the National Bureau of Standards 49(6):409–436.
Hie, Brian L. and Yang, Kevin K. (2021). Adaptive Machine Learning for Protein Engineering. arXiv: 2106.05466 [q-bio.QM].
Hoang, Trong Nghia, Hoang, Quang Minh, Ouyang, Ruofei, and Low, Kian Hsiang (2018). Decentralized High-Dimensional Bayesian Optimization with Factor Graphs. Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), pp. 3231–3238.
Hoffman, Matthew D. and Gelman, Andrew (2014). The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research 15(4):1593–1623.
Hoffman, Matthew W. and Ghahramani, Zoubin (2015). Output-Space Predictive Entropy Search for Flexible Global Optimization. Bayesian Optimization: Scalability and Flexibility Workshop (BayesOpt 2015), Conference on Neural Information Processing Systems (NeurIPS 2015).
Hoffman, Matthew W. and Shahriari, Bobak (2014). Modular Mechanisms for Bayesian Optimization. Bayesian Optimization in Academia and Industry (BayesOpt 2014), Conference on Neural Information Processing Systems (NeurIPS 2014).
Hotelling, Harold (1941). Experimental Determination of the Maximum of a Function. The Annals of Mathematical Statistics 12(1):20–45.
Houlsby, Neil, Hernández-Lobato, José Miguel, Huszár, Ferenc, and Ghahramani, Zoubin (2012). Collaborative Gaussian Processes for Preference Learning. Advances in Neural Information Processing Systems 25 (NeurIPS 2012), pp. 2096–2104.
Huang, D., Allen, T. T., Notz, W. I., and Miller, R. A. (2006a). Sequential Kriging Optimization Using Multiple-Fidelity Evaluations. Structural and Multidisciplinary Optimization 32(5):369–382.
Huang, D., Allen, T. T., Notz, W. I., and Zeng, N. (2006b). Global Optimization of Stochastic Black-Box Systems via Sequential Kriging Meta-Models. Journal of Global Optimization 34(3):441–466.
Hutter, Frank, Hoos, Holger H., and Leyton-Brown, Kevin (2011). Sequential Model-Based Optimization for General Algorithm Configuration. Proceedings of the 5th Learning and Intelligent Optimization Conference (LION 5). Vol. 6683. Lecture Notes in Computer Science. Springer–Verlag, pp. 507–523.
Hutter, Frank, Xu, Lin, Hoos, Holger H., and Leyton-Brown, Kevin (2014). Algorithm Runtime Prediction: Methods & Evaluation. Artificial Intelligence 206:79–111.
Ingersoll Jr., Jonathan E. (1987). Theory of Financial Decision Making. Rowman & Littlefield Studies in Financial Economics. Rowman & Littlefield.
Irwin, John J., Tang, Khanh G., Young, Jennifer, Dandarchuluun, Chinzorig, Wong, Benjamin R., Khurelbaatar, Munkhzul, et al. (2020). ZINC20 – A Free Ultralarge-Scale Chemical Database for Ligand Discovery. Journal of Chemical Information and Modeling 60(12):6065–6073.
Janz, David, Burt, David R., and González, Javier (2020). Bandit Optimisation of Functions in the Matérn Kernel RKHS. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 2486–2495.
Jiang, Shali, Chai, Henry, González, Javier, and Garnett, Roman (2020a). BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 4794–4803.
Jiang, Shali, Jiang, Daniel R., Balandat, Maximilian, Karrer, Brian, Gardner, Jacob R., and Garnett, Roman (2020b). Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 18039–18049.
Jiang, Shali, Malkomes, Gustavo, Converse, Geoff, Shofner, Alyssa, Moseley, Benjamin, and Garnett, Roman (2017). Efficient Nonmyopic Active Search. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 1714–1723.
Jones, D. R., Perttunen, C. D., and Stuckman, B. E. (1993). Lipschitzian Optimization without the Lipschitz Constant. Journal of Optimization Theory and Applications 79(1):157–181.
Jones, Donald R. (2001). A Taxonomy of Global Optimization Methods Based on Response Surfaces. Journal of Global Optimization 21(4):345–383.
Jones, Donald R., Schonlau, Matthias, and Welch, William J. (1998). Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization 13(4):455–492.
Jylänki, Pasi, Vanhatalo, Jarno, and Vehtari, Aki (2011). Robust Gaussian Process Regression with a Student-t Likelihood. Journal of Machine Learning Research 12(99):3227–3257.
Kanagawa, Motonobu, Hennig, Philipp, Sejdinovic, Dino, and Sriperum-Budur, Bharath K. (2018). Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences. arXiv: 1807.02582 [stat.Ml].Google Scholar
Kandasamy, Kirthevasan, Dasarathy, Gautam, Oliva, Junier, Schneider, Jeff, and Póczos, Barnabás (2016). Gaussian Process Bandit Optimisation with Multi-Fidelity Evaluations. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 992–1000.
Kandasamy, Kirthevasan, Dasarathy, Gautam, Schneider, Jeff, and Póczos, Barnabás (2017). Multi-Fidelity Bayesian Optimisation with Continuous Approximations. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 1799–1808.
Kandasamy, Kirthevasan, Krishnamurthy, Akshay, Schneider, Jeff, and Póczos, Barnabás (2018). Parallelised Bayesian Optimisation via Thompson Sampling. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). Vol. 84. Proceedings of Machine Learning Research, pp. 133–142.
Kandasamy, Kirthevasan, Schneider, Jeff, and Póczos, Barnabás (2015). High Dimensional Bayesian Optimisation and Bandits via Additive Models. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015). Vol. 37. Proceedings of Machine Learning Research, pp. 295–304.
Kathuria, Tarun, Deshpande, Amit, and Kohli, Pushmeet (2016). Batched Gaussian Process Bandit Optimization via Determinantal Point Processes. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 4206–4214.
Kim, Jeankyung and Pollard, David (1990). Cube Root Asymptotics. The Annals of Statistics 18(1):191–219.
Kim, Jungtaek and Choi, Seungjin (2020). On Local Optimizers of Acquisition Functions in Bayesian Optimization. Lecture Notes in Computer Science 12458:675–690.
Kim, Jungtaek, McCourt, Michael, You, Tackgeun, Kim, Saehoon, and Choi, Seungjin (2021). Bayesian Optimization with Approximate Set Kernels. Machine Learning 110(5):857–879.
Klein, Aaron, Bartels, Simon, Falkner, Stefan, Hennig, Philipp, and Hutter, Frank (2015). Towards Efficient Bayesian Optimization for Big Data. Bayesian Optimization: Scalability and Flexibility Workshop (BayesOpt 2015), Conference on Neural Information Processing Systems (NeurIPS 2015).
Klein, Aaron, Falkner, Stefan, Springenberg, Jost Tobias, and Hutter, Frank (2017). Learning Curve Prediction with Bayesian Neural Networks. Proceedings of the 5th International Conference on Learning Representations (ICLR 2017).
Knowles, Joshua (2005). ParEGO: A Hybrid Algorithm with On-Line Landscape Approximation for Expensive Multiobjective Optimization Problems. IEEE Transactions on Evolutionary Computation 10(1):50–66.
Ko, Chun-Wa, Lee, Jon, and Queyranne, Maurice (1995). An Exact Algorithm for Maximum Entropy Sampling. Operations Research 43(4):684–691.
Konishi, Sadanori and Kitagawa, Genshiro (2008). Information Criteria and Statistical Modeling. Springer Series in Statistics. Springer–Verlag.
Kschischang, Frank R., Frey, Brendan J., and Loeliger, Hans-Andrea (2001). Factor Graphs and the Sum–Product Algorithm. IEEE Transactions on Information Theory 47(2):498–519.
Kulesza, Alex and Taskar, Ben (2012). Determinantal Point Processes for Machine Learning. Foundations and Trends in Machine Learning 5(2–3):123–286.
Kushner, Harold J. (1962). A Versatile Stochastic Model of a Function of Unknown and Time Varying Form. Journal of Mathematical Analysis and Applications 5(1):150–167.
Kushner, H. J. (1964). A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise. Journal of Basic Engineering 86(1):97–106.
Kuss, Malte (2006). Gaussian Process Models for Robust Regression, Classification, and Reinforcement Learning. Ph.D. thesis. Technische Universität Darmstadt.
Lai, T. L. and Robbins, Herbert (1985). Asymptotically Efficient Adaptive Allocation Rules. Advances in Applied Mathematics 6(1):4–22.
Lam, Remi R., Willcox, Karen E., and Wolpert, David H. (2016). Bayesian Optimization with a Finite Budget: An Approximate Dynamic Programming Approach. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 883–891.
Lange, Kenneth L., Little, Roderick J. A., and Taylor, Jeremy M. G. (1989). Robust Statistical Modeling Using the t Distribution. Journal of the American Statistical Association 84(408):881–896.
Lattimore, Tor and Szepesvári, Csaba (2020). Bandit Algorithms. Cambridge University Press.
Lázaro-Gredilla, Miguel, Quiñonero-Candela, Joaquin, Rasmussen, Carl Edward, and Figueiras-Vidal, Aníbal R. (2010). Sparse Spectrum Gaussian Process Regression. Journal of Machine Learning Research 11(Jun):1865–1881.
Letham, Benjamin, Karrer, Brian, Ottoni, Guilherme, and Bakshy, Eytan (2019). Constrained Bayesian Optimization with Noisy Experiments. Bayesian Analysis 14(2):495–519.
Levina, Elizaveta and Bickel, Peter J. (2004). Maximum Likelihood Estimation of Intrinsic Dimension. Advances in Neural Information Processing Systems 17 (NeurIPS 2004), pp. 777–784.
Lévy, Paul (1948). Processus stochastiques et mouvement brownien. Gauthier–Villars.
Li, Chun-Liang, Kandasamy, Kirthevasan, Póczos, Barnabás, and Schneider, Jeff (2016). High Dimensional Bayesian Optimization via Restricted Projection Pursuit Models. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 884–892.
Li, Chunyuan, Farkhoor, Heerad, Liu, Rosanne, and Yosinski, Jason (2018a). Measuring the Intrinsic Dimension of Objective Landscapes. Proceedings of the 6th International Conference on Learning Representations (ICLR 2018). arXiv: 1804.08838 [cs.LG].
Li, Lisha, Jamieson, Kevin, DeSalvo, Giulia, Rostamizadeh, Afshin, and Talwalkar, Ameet (2018b). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research 18(185):1–52.
Li, Zihan and Scarlett, Jonathan (2021). Gaussian Process Bandit Optimization with Few Batches. arXiv: 2110.07788 [stat.ML].
Lindley, D. V. (1956). On a Measure of the Information Provided by an Experiment. The Annals of Mathematical Statistics 27(4):986–1005.
Lindley, D. V. (1972). Bayesian Statistics, A Review. CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics.
Locatelli, M. (1997). Bayesian Algorithms for One-Dimensional Global Optimization. Journal of Global Optimization 10(1):57–76.
Löwner, Karl (1934). Über monotone Matrixfunktionen. Mathematische Zeitschrift 38:177–216.
Lukic, Milan N. and Beder, Jay H. (2001). Stochastic Processes with Sample Paths in Reproducing Kernel Hilbert Spaces. Transactions of the American Mathematical Society 353(10):3945–3969.
Lyu, Yueming, Yuan, Yuan, and Tsang, Ivor W. (2019). Efficient Batch Black-Box Optimization with Deterministic Regret Bounds. arXiv: 1905.10041 [cs.LG].
MacKay, David J. C. (1998). Introduction to Gaussian Processes. In: Neural Networks and Machine Learning. Ed. by Bishop, Christopher M. Vol. 168. NATO ASI Series F: Computer and Systems Sciences. Springer–Verlag, pp. 133–165.
MacKay, David J. C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press.
Mahsereci, Maren and Hennig, Philipp (2015). Probabilistic Line Searches for Stochastic Optimization. Advances in Neural Information Processing Systems 28 (NeurIPS 2015), pp. 181–189.
Malkomes, Gustavo and Garnett, Roman (2018). Automating Bayesian Optimization with Bayesian Optimization. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 5984–5994.
Malkomes, Gustavo, Schaff, Chip, and Garnett, Roman (2016). Bayesian Optimization for Automated Model Selection. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 2900–2908.
Maraval, Alexandre, Zimmer, Matthieu, Grosnit, Antoine, Tutunov, Rasul, Wang, Jun, and Ammar, Haitham Bou (2022). Sample-Efficient Optimisation with Probabilistic Transformer Surrogates. arXiv: 2205.13902 [cs.LG].
Marmin, Sébastien, Chevalier, Clément, and Ginsbourger, David (2015). Differentiating the Multipoint Expected Improvement for Optimal Batch Design. Proceedings of the 1st International Workshop on Machine Learning, Optimization, and Big Data (MOD 2015). Vol. 9432. Lecture Notes in Computer Science. Springer–Verlag, pp. 37–48.
Marschak, Jacob and Radner, Roy (1972). Economic Theory of Teams. Yale University Press.
Martinez-Cantin, Ruben, Tee, Kevin, and McCourt, Michael (2018). Practical Bayesian Optimization in the Presence of Outliers. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). Vol. 84. Proceedings of Machine Learning Research, pp. 1722–1731.
Massart, Pascal (2007). Concentration Inequalities and Model Selection: École d'Été de Probabilités de Saint-Flour XXXIII – 2003. Vol. 1896. Lecture Notes in Mathematics. Springer–Verlag.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. 2nd ed. Monographs on Statistics and Applied Probability. Chapman & Hall.
Meinshausen, Nicolai (2006). Quantile Regression Forests. Journal of Machine Learning Research 7(35):983–999.
Miettinen, Kaisa M. (1998). Nonlinear Multiobjective Optimization. International Series in Operations Research & Management Science. Kluwer Academic Publishers.
Milgrom, Paul and Segal, Ilya (2002). Envelope Theorems for Arbitrary Choice Sets. Econometrica 70(2):583–601.
Minka, Thomas P. (2001). A Family of Algorithms for Approximate Bayesian Inference. Ph.D. thesis. Massachusetts Institute of Technology.
Minka, Thomas (2008). EP: A Quick Reference. URL: https://tminka.github.io/papers/ep/minka-ep-quickref.pdf.
Mockus, Jonas (1972). Bayesian Methods of Search for an Extremum. Avtomatika i Vychislitel'naya Tekhnika (Automatic Control and Computer Sciences) 6(3):53–62.
Mockus, Jonas (1974). On Bayesian Methods for Seeking the Extremum. Optimization Techniques: IFIP Technical Conference. Vol. 27. Lecture Notes in Computer Science. Springer–Verlag, pp. 400–404.
Mockus, Jonas (1989). Bayesian Approach to Global Optimization: Theory and Applications. Mathematics and Its Applications. Kluwer Academic Publishers.
Mockus, Jonas, Eddy, William, Mockus, Audris, Mockus, Linas, and Reklaitis, Gintaras (2010). Bayesian Heuristic Approach to Discrete and Global Optimization: Algorithms, Visualization, Software, and Applications. Nonconvex Optimization and Its Applications. Kluwer Academic Publishers.
Mockus, J., Tiesis, V., and Žilinskas, A. (1978). The Application of Bayesian Methods for Seeking the Extremum. In: Towards Global Optimization 2. Ed. by Dixon, L. C. W. and Szegő, G. P. North–Holland, pp. 117–129.
Møller, Jesper, Syversveen, Anne Randi, and Waagepetersen, Rasmus Plenge (1998). Log Gaussian Cox Processes. Scandinavian Journal of Statistics 25(3):451–482.
Montgomery, Douglas C. (2019). Design and Analysis of Experiments. 10th ed. John Wiley & Sons.
Moore, Andrew W. and Atkeson, Christopher G. (1993). Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping. Advances in Neural Information Processing Systems 5 (NeurIPS 1992), pp. 263–270.
Moriconi, Riccardo, Deisenroth, Marc Peter, and Kumar, K. S. Sesh (2020). High-Dimensional Bayesian Optimization Using Low-Dimensional Feature Spaces. Machine Learning 109(9–10):1925–1943.
Moss, Henry B., Beck, Daniel, González, Javier, Leslie, David S., and Rayson, Paul (2020). BOSS: Bayesian Optimization over String Spaces. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 15476–15486.
Müller, Sarah, von Rohr, Alexander, and Trimpe, Sebastian (2021). Local Policy Search with Bayesian Optimization. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), pp. 20708–20720.
Murray, Iain (2016). Differentiation of the Cholesky decomposition. arXiv: 1602.07527 [stat.CO].
Murray, Iain, Adams, Ryan Prescott, and MacKay, David J. C. (2010). Elliptical Slice Sampling. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). Vol. 9. Proceedings of Machine Learning Research, pp. 541–548.
Mutný, Mojmír and Krause, Andreas (2018). Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 9005–9016.
Neal, Radford M. (1997). Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification. Technical report (9702). Department of Statistics, University of Toronto.
Neal, Radford M. (1998). Regression and Classification Using Gaussian Process Priors. In: Bayesian Statistics 6. Ed. by Bernardo, J. M., Berger, J. O., Dawid, A. P., and Smith, A. F. M. Oxford University Press, pp. 475–490.
Nguyen, Quan, Wu, Kaiwen, Gardner, Jacob R., and Garnett, Roman (2022). Local Bayesian Optimization via Maximizing Probability of Descent. arXiv: 2210.11662 [cs.LG].
Nguyen, Quoc Phong, Dai, Zhongxiang, Low, Bryan Kian Hsiang, and Jaillet, Patrick (2021a). Optimizing Conditional Value-At-Risk of Black-Box Functions. Advances in Neural Information Processing Systems 34 (NeurIPS 2021).
Nguyen, Quoc Phong, Dai, Zhongxiang, Low, Bryan Kian Hsiang, and Jaillet, Patrick (2021b). Value-at-Risk Optimization with Gaussian Processes. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 8063–8072.
Nguyen, Vu, Gupta, Sunil, Rana, Santu, Li, Cheng, and Venkatesh, Svetha (2017). Regret for Expected Improvement over the Best-Observed Value and Stopping Condition. Proceedings of the 9th Asian Conference on Machine Learning (ACML 2017). Vol. 77. Proceedings of Machine Learning Research, pp. 279–294.
Nickisch, Hannes and Rasmussen, Carl Edward (2008). Approximations for Binary Gaussian Process Classification. Journal of Machine Learning Research 9(Oct):2035–2078.
O'Hagan, A. (1978). Curve Fitting and Optimal Design for Prediction. Journal of the Royal Statistical Society Series B (Methodological) 40(1):1–42.
O'Hagan, A. (1991). Bayes–Hermite Quadrature. Journal of Statistical Planning and Inference 29(3):245–260.
O'Hagan, Anthony and Forster, Jonathan (2004). Kendall's Advanced Theory of Statistics. 2nd ed. Vol. 2B: Bayesian Inference. Arnold.
Oh, Changyong, Tomczak, Jakub M., Gavves, Efstratios, and Welling, Max (2019). Combinatorial Bayesian Optimization Using the Graph Cartesian Product. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 2914–2924.
Øksendal, Bernt (2013). Stochastic Differential Equations: An Introduction with Applications. 6th ed. Universitext. Springer–Verlag.
Oliveira, Rafael, Ott, Lionel, and Ramos, Fabio (2019). Bayesian Optimisation Under Uncertain Inputs. Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019). Vol. 89. Proceedings of Machine Learning Research, pp. 1177–1184.
Osborne, Michael A., Duvenaud, David, Garnett, Roman, Rasmussen, Carl E., Roberts, Stephen J., and Ghahramani, Zoubin (2012). Active Learning of Model Evidence Using Bayesian Quadrature. Advances in Neural Information Processing Systems 25 (NeurIPS 2012), pp. 46–54.
Osborne, Michael A., Garnett, Roman, and Roberts, Stephen J. (2009). Gaussian Processes for Global Optimization. Proceedings of the 3rd Learning and Intelligent Optimization Conference (LION 3).
Paria, Biswajit, Kandasamy, Kirthevasan, and Póczos, Barnabás (2019). A Flexible Framework for Multi-Objective Bayesian Optimization Using Random Scalarizations. Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence (UAI 2019). Vol. 115. Proceedings of Machine Learning Research, pp. 766–776.
Peirce, C. S. (1876). Note on the Theory of the Economy of Research. In: Report of the Superintendent of the United States Coast Survey Showing the Progress of the Work for the Fiscal Year Ending with June, 1876. Government Printing Office, pp. 197–201.
Picheny, Victor (2014). A Stepwise Uncertainty Reduction Approach to Constrained Global Optimization. Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). Vol. 33. Proceedings of Machine Learning Research, pp. 787–795.
Picheny, Victor (2015). Multiobjective Optimization Using Gaussian Process Emulators via Stepwise Uncertainty Reduction. Statistics and Computing 25(6):1265–1280.
Picheny, Victor, Ginsbourger, David, Richet, Yann, and Caplin, Gregory (2013a). Quantile-Based Optimization of Noisy Computer Experiments with Tunable Precision. Technometrics 55(1):2–13.
Picheny, Victor, Wagner, Tobias, and Ginsbourger, David (2013b). A Benchmark of Kriging-Based Infill Criteria for Noisy Optimization. Structural and Multidisciplinary Optimization 48(3):607–626.
Pleiss, Geoff, Jankowiak, Martin, Eriksson, David, Damle, Anil, and Gardner, Jacob R. (2020). Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 22268–22281.
Poincaré, Henri (1912). Calcul des probabilités. 2nd ed. Gauthier–Villars.
Ponweiser, Wolfgang, Wagner, Tobias, Biermann, Dirk, and Vincze, Markus (2008). Multiobjective Optimization on a Limited Budget of Evaluations Using Model-Assisted S-Metric Selection. Proceedings of the 10th International Conference on Parallel Problem Solving from Nature (PPSN X). Vol. 5199. Lecture Notes in Computer Science. Springer–Verlag, pp. 784–794.
Powell, Warren B. (2011). Approximate Dynamic Programming: Solving the Curses of Dimensionality. 2nd ed. Wiley Series in Probability and Statistics. John Wiley & Sons.
Pronzato, Luc (2017). Minimax and Maximin Space-Filling Designs: Some Properties and Methods for Construction. Journal de la Société Française de Statistique 158(1):7–36.
Raiffa, Howard and Schlaifer, Robert (1961). Applied Statistical Decision Theory. Division of Research, Graduate School of Business Administration, Harvard University.
Rasmussen, Carl Edward and Ghahramani, Zoubin (2002). Bayesian Monte Carlo. Advances in Neural Information Processing Systems 15 (NeurIPS 2002), pp. 505–512.
Rasmussen, Carl Edward and Williams, Christopher K. I. (2006). Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press.
Riquelme, Carlos, Tucker, George, and Snoek, Jasper (2018). Deep Bayesian Bandits Showdown. Proceedings of the 6th International Conference on Learning Representations (ICLR 2018). arXiv: 1802.09127 [cs.LG].
Robbins, Herbert (1952). Some Aspects of the Sequential Design of Experiments. Bulletin of the American Mathematical Society 58(5):527–535.
Rolland, Paul, Scarlett, Jonathan, Bogunovic, Ilija, and Cevher, Volkan (2018). High-Dimensional Bayesian Optimization via Additive Models with Overlapping Groups. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). Vol. 84. Proceedings of Machine Learning Research, pp. 298–307.
Ross, Andrew M. (2010). Computing Bounds on the Expected Maximum of Correlated Normal Variables. Methodology and Computing in Applied Probability 12(1):111–138.
Rudin, Walter (1976). Principles of Mathematical Analysis. 3rd ed. International Series in Pure and Applied Mathematics. McGraw–Hill.
Rue, Håvard, Martino, Sara, and Chopin, Nicolas (2009). Approximate Bayesian Inference for Latent Gaussian Models by Using Integrated Nested Laplace Approximations. Journal of the Royal Statistical Society Series B (Methodological) 71(2):319–392.
Russo, Daniel and Van Roy, Benjamin (2014). Learning to Optimize via Posterior Sampling. Mathematics of Operations Research 39(4):1221–1243.
Russo, Daniel and Van Roy, Benjamin (2016). An Information-Theoretic Analysis of Thompson Sampling. Journal of Machine Learning Research 17(68):1–30.
Sacks, Jerome, Welch, William J., Mitchell, Toby J., and Wynn, Henry P. (1989). Design and Analysis of Computer Experiments. Statistical Science 4(4):409–435.
Salgia, Sudeep, Vakili, Sattar, and Zhao, Qing (2020). A Computationally Efficient Approach to Black-Box Optimization Using Gaussian Process Models. arXiv: 2010.13997 [stat.ML].
Šaltenis, Vydūnas R. (1971). One Method of Multiextremum Optimization. Avtomatika i Vychislitel'naya Tekhnika (Automatic Control and Computer Sciences) 5(3):33–38.
Sanchez, Susan M. and Sanchez, Paul J. (2005). Very Large Fractional Factorial and Central Composite Designs. ACM Transactions on Modeling and Computer Simulation 15(4):362–377.
Scarlett, Jonathan (2018). Tight Regret Bounds for Bayesian Optimization in One Dimension. Proceedings of the 35th International Conference on Machine Learning (ICML 2018). Vol. 80. Proceedings of Machine Learning Research, pp. 4500–4508.
Scarlett, Jonathan, Bogunovic, Ilija, and Cevher, Volkan (2017). Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization. Proceedings of the 2017 Conference on Learning Theory (COLT 2017). Vol. 65. Proceedings of Machine Learning Research, pp. 1723–1742.
Scarlett, Jonathan and Cevher, Volkan (2021). An Introductory Guide to Fano's Inequality with Applications in Statistical Estimation. In: Information-Theoretic Methods in Data Science. Ed. by Rodrigues, Miguel R. D. and Eldar, Yonina C. Cambridge University Press, pp. 487–528.
Schonlau, Matthias (1997). Computer Experiments and Global Optimization. Ph.D. thesis. University of Waterloo.
Schonlau, Matthias, Welch, William J., and Jones, Donald R. (1998). Global versus Local Search in Constrained Optimization of Computer Models. In: New Developments and Applications in Experimental Design. Vol. 34. Lecture Notes – Monograph Series. Institute of Mathematical Statistics, pp. 11–25.
Scott, Warren, Frazier, Peter, and Powell, Warren (2011). The Correlated Knowledge Gradient for Simulation Optimization of Continuous Parameters Using Gaussian Process Regression. SIAM Journal on Optimization 21(3):996–1026.
Seeger, Matthias (2008). Expectation Propagation for Exponential Families. Technical report. University of California, Berkeley.
Settles, Burr (2012). Active Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool.
Shah, Amar and Ghahramani, Zoubin (2015). Parallel Predictive Entropy Search for Batch Global Optimization of Expensive Objective Functions. Advances in Neural Information Processing Systems 28 (NeurIPS 2015), pp. 3330–3338.
Shah, Amar, Wilson, Andrew Gordon, and Ghahramani, Zoubin (2014). Student-t Processes as Alternatives to Gaussian Processes. Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). Vol. 33. Proceedings of Machine Learning Research, pp. 877–885.
Shahriari, Bobak, Swersky, Kevin, Wang, Ziyu, Adams, Ryan P., and de Freitas, Nando (2016). Taking the Human out of the Loop: A Review of Bayesian Optimization. Proceedings of the IEEE 104(1):148–175.
Shahriari, Bobak, Wang, Ziyu, Hoffman, Matthew W., Bouchard-Côté, Alexandre, and de Freitas, Nando (2014). An Entropy Search Portfolio for Bayesian Optimization. arXiv: 1406.4625 [stat.ML].
Shamir, Ohad (2013). On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization. Proceedings of the 26th Annual Conference on Learning Theory (COLT 2013). Vol. 30. Proceedings of Machine Learning Research, pp. 3–24.
Shannon, C. E. (1948). A Mathematical Theory of Communication. The Bell System Technical Journal 27(3):379–423.
Shao, T. S., Chen, T. C., and Frank, R. M. (1964). Tables of Zeros and Gaussian Weights of Certain Associated Laguerre Polynomials and the Related Generalized Hermite Polynomials. Mathematics of Computation 18(88):598–616.
Shepp, L. A. (1979). The Joint Density of the Maximum and Its Location for a Wiener Process with Drift. Journal of Applied Probability 16(2):423–427.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Monographs on Statistics and Applied Probability. Chapman & Hall.
Slepian, David (1962). The One-Sided Barrier Problem for Gaussian Noise. The Bell System Technical Journal 41(2):463–501.
Smith, Kirstine (1918). On the Standard Deviations of Adjusted and Interpolated Values of an Observed Polynomial Function and Its Constants and the Guidance They Give towards a Proper Choice of the Distribution of Observations. Biometrika 12(1–2):1–85.
Smola, Alex J. and Schölkopf, Bernhard (2000). Sparse Greedy Matrix Approximation for Machine Learning. Proceedings of the 17th International Conference on Machine Learning (ICML 2000), pp. 911–918.
Snelson, Edward and Ghahramani, Zoubin (2005). Sparse Gaussian Processes Using Pseudo-inputs. Advances in Neural Information Processing Systems 18 (NeurIPS 2005), pp. 1257–1264.
Snoek, Jasper, Larochelle, Hugo, and Adams, Ryan P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. Advances in Neural Information Processing Systems 25 (NeurIPS 2012), pp. 2951–2959.
Snoek, Jasper, Rippel, Oren, Swersky, Kevin, Kiros, Ryan, Satish, Nadathur, Sundaram, Narayanan, et al. (2015). Scalable Bayesian Optimization Using Deep Neural Networks. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015). Vol. 37. Proceedings of Machine Learning Research, pp. 2171–2180.
Snoek, Jasper, Swersky, Kevin, Zemel, Richard, and Adams, Ryan P. (2014). Input Warping for Bayesian Optimization of Non-Stationary Functions. Proceedings of the 31st International Conference on Machine Learning (ICML 2014). Vol. 32. Proceedings of Machine Learning Research, pp. 1674–1682.
Song, Jialin, Chen, Yuxin, and Yue, Yisong (2019). A General Framework for Multi-Fidelity Bayesian Optimization with Gaussian Processes. Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019). Vol. 89. Proceedings of Machine Learning Research, pp. 3158–3167.
Springenberg, Jost Tobias, Klein, Aaron, Falkner, Stefan, and Hutter, Frank (2016). Bayesian Optimization with Robust Bayesian Neural Networks. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 4134–4142.
Srinivas, Niranjan, Krause, Andreas, Kakade, Sham, and Seeger, Matthias (2010). Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design. Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 1015–1022.
Stein, Michael L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer–Verlag.
Streltsov, Simon and Vakili, Pirooz (1999). A Non-Myopic Utility Function for Statistical Global Optimization Algorithms. Journal of Global Optimization 14(3):283–298.
Sutton, Richard S. (1990). Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. Proceedings of the 7th International Conference on Machine Learning (ICML 1990), pp. 216–224.
Svenson, Joshua and Santner, Thomas (2016). Multiobjective Optimization of Expensive-to-evaluate Deterministic Computer Simulator Models. Computational Statistics and Data Analysis 94:250–264.
Swersky, Kevin, Snoek, Jasper, and Adams, Ryan P. (2013). Multi-Task Bayesian Optimization. Advances in Neural Information Processing Systems 26 (NeurIPS 2013), pp. 2004–2012.
Swersky, Kevin, Snoek, Jasper, and Adams, Ryan P. (2014). Freeze–Thaw Bayesian Optimization. arXiv: 1406.3896 [stat.ML].
Takeno, Shion, Fukuoka, Hitoshi, Tsukada, Yuhki, Koyama, Toshiyuki, Shiga, Motoki, Takeuchi, Ichiro, et al. (2020). Multi-Fidelity Bayesian Optimization with Max-value Entropy Search and Its Parallelization. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 9334–9345.
Tallis, G. M. (1961). The Moment Generating Function of the Truncated Multi-Normal Distribution. Journal of the Royal Statistical Society Series B (Methodological) 23(1):223–229.
Tesch, Matthew, Schneider, Jeff, and Choset, Howie (2013). Expensive Function Optimization with Stochastic Binary Outcomes. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 1283–1291.
The UniProt Consortium (2021). UniProt: The Universal Protein Knowledgebase in 2021. Nucleic Acids Research 49(D1):D480–D489.
Thompson, William R. (1933). On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples. Biometrika 25(3–4):285–294.
Thompson, William R. (1935). On the Theory of Apportionment. American Journal of Mathematics 57(2):450–456.
Tiao, Louis C., Klein, Aaron, Seeger, Matthias, Bonilla, Edwin V., Archambeau, Ctdric, and Ramos, Fabio (2021). Bore: Bayesian Optimization by Density-Ratio Estimation. Proceedings of the 38th International Conference on Machine Learning (Icml 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 1028910300.Google Scholar
Titsias, Michalis (2009). Variational Learning of Inducing Variables in Sparse Gaussian Processes. Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (Aistats 2009). Vol. 5. Proceedings of Machine Learning Research, pp. 567574.Google Scholar
Toscano-Palmerin, Saul and Frazier, Peter I. (2018). Bayesian Optimization with Expensive Integrands. arXiv: 1803.08661 [stat.Ml].Google Scholar
Turban, Sebastien (2010). Convolution of a Truncated Normal and a Centered Normal Variable. Technical report. Columbia University.Google Scholar
Turner, Ryan, Eriksson, David, Mccourt, Michael, Kiili, Juha, Laaksonen, Eero, XU, Zhen, et al. (2021). Bayesian Optimization Is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020. Proceedings of the Neuritis 2020 Competition and Demonstration Track. Vol. 133. Proceedings of Machine Learning Research, pp. 326.Google Scholar
Ulrich, Kyle, Carlson, David E., Dzirasa, Kafui, and Carin, Lawrence (2015). Gp Kernels for Cross-Spectrum Analysis. Advances in Neural Information Processing Systems 28 (Neuritis 2015), pp. 19992007.Google Scholar
Vakili, Sattar, Bouziani, Nacime, Jalali, Sepehr, Bernacchia, Alberto, and Shiu, Da-Shan (2021a). Optimal Order Simple Regret for Gaussian Process Bandits. Advances in Neural Information Processing Systems 34 (Neuritis 2021).Google Scholar
Vakili, Sattar, Khezeli, Kia, and Picheny, Victor (2021b). On Information Gain and Regret Bounds in Gaussian Process Bandits. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (aistats 2021). Vol. 130. Proceedings of Machine Learning Research, pp. 8290.Google Scholar
Vakili, Sattar, Scarlett, Jonathan, and Javidi, Tara (2021c). Open Problem: Tight Online Confidence Intervals for Rkhs Elements. Proceedings of the 34th Annual Conference on Learning Theory (Colt 2021). Vol. 134. Proceedings of Machine Learning Research, pp. 46474652.Google Scholar
Valko, Michal, Korda, Nathan, Munos, Rami, Flaounas, Ilias, and Cristianini, Nello (2013). Finite-Time Analysis of Kernelised Contextual Bandits. Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (uai 2013), pp. 654663.Google Scholar
Van De Corput, J. G. (1935). Verteilungsfunktionen: Erste Mitteilung. Proceedings of the Koninklijke Nederlandse Akademie van Wetenschappen 38:813821.Google Scholar
Van der Vaart, Aad W. and Wellner, Jon A. (1996). Weak Convergence and Empirical Processes with Applications to Statistics. Springer Series in Statistics. Springer–Verlag.
Vanchinathan, Hastagiri P., Marfurt, Andreas, Robelin, Charles-Antoine, Kossmann, Donald, and Krause, Andreas (2015). Discovering Valuable Items from Massive Data. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015), pp. 1195–1204.
Vazquez, Emmanuel, Villemonteix, Julien, Sidorkiewicz, Maryan, and Walter, Éric (2008). Global Optimization Based on Noisy Evaluations: An Empirical Study of Two Statistical Approaches. Proceedings of the 6th International Conference on Inverse Problems in Engineering: Theory and Practice (ICIPE 2008). Vol. 135. Journal of Physics: Conference Series, paper number 012100.
Vehtari, Aki and Ojanen, Janne (2012). A Survey of Bayesian Predictive Methods for Model Assessment, Selection and Comparison. Statistics Surveys 6:142–228.
Villemonteix, Julien, Vazquez, Emmanuel, and Walter, Éric (2009). An Informational Approach to the Global Optimization of Expensive-to-Evaluate Functions. Journal of Global Optimization 44(4):509–534.
Vivarelli, Francesco and Williams, Christopher K. I. (1998). Discovering Hidden Features with Gaussian Process Regression. Advances in Neural Information Processing Systems 11 (NeurIPS 1998), pp. 613–619.
Von Neumann, John and Morgenstern, Oskar (1944). Theory of Games and Economic Behavior. Princeton University Press.
Vondrák, Jan (2005). Probabilistic Methods in Combinatorial and Stochastic Optimization. Ph.D. thesis. Massachusetts Institute of Technology.
Wald, Abraham (1945). Sequential Tests of Statistical Hypotheses. The Annals of Mathematical Statistics 16(2):117–186.
Wald, Abraham (1947). Sequential Analysis. Wiley Mathematical Statistics Series. John Wiley & Sons.
Wang, Jialei, Clark, Scott C., Liu, Eric, and Frazier, Peter I. (2020a). Parallel Bayesian Global Optimization of Expensive Functions. Operations Research 68(6):1850–1865.
Wang, Zexin, Tan, Vincent Y. F., and Scarlett, Jonathan (2020b). Tight Regret Bounds for Noisy Optimization of a Brownian Motion. arXiv: 2001.09327 [cs.LG].
Wang, Zi and Jegelka, Stefanie (2017). Max-Value Entropy Search for Efficient Bayesian Optimization. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 3627–3635.
Wang, Zi, Zhou, Bolei, and Jegelka, Stefanie (2016a). Optimization as Estimation with Gaussian Processes in Bandit Settings. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 1022–1031.
Wang, Ziyu and de Freitas, Nando (2014). Theoretical Analysis of Bayesian Optimization with Unknown Gaussian Process Hyper-Parameters. arXiv: 1406.7758 [stat.ML].
Wang, Ziyu, Hutter, Frank, Zoghi, Masrour, Matheson, David, and de Freitas, Nando (2016b). Bayesian Optimization in a Billion Dimensions via Random Embeddings. Journal of Artificial Intelligence Research 55:361–387.
Wendland, Holger (2004). Scattered Data Approximation. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press.
Whittle, Peter (1982). Optimization over Time: Dynamic Programming and Stochastic Control. Vol. 1. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons.
Whittle, Peter (1988). Restless Bandits: Activity Allocation in a Changing World. Journal of Applied Probability 25(A):287–298.
Williams, Brian J., Santner, Thomas J., and Notz, William I. (2000). Sequential Design of Computer Experiments to Minimize Integrated Response Functions. Statistica Sinica 10(4):1133–1152.
Williams, Christopher K. I. and Seeger, Matthias (2000). Using the Nyström Method to Speed up Kernel Machines. Advances in Neural Information Processing Systems 13 (NeurIPS 2000), pp. 682–688.
Wilson, Andrew Gordon and Adams, Ryan Prescott (2013). Gaussian Process Kernels for Pattern Discovery and Extrapolation. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 1067–1075.
Wilson, Andrew Gordon, Hu, Zhiting, Salakhutdinov, Ruslan, and Xing, Eric P. (2016). Deep Kernel Learning. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 370–378.
Wilson, James T., Hutter, Frank, and Deisenroth, Marc Peter (2018). Maximizing Acquisition Functions for Bayesian Optimization. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 9884–9895.
Wu, Jian and Frazier, Peter I. (2016). The Parallel Knowledge Gradient Method for Batch Bayesian Optimization. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 3126–3134.
Wu, Jian and Frazier, Peter I. (2019). Practical Two-Step Look-Ahead Bayesian Optimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 9813–9823.
Wu, Jian, Poloczek, Matthias, Wilson, Andrew Gordon, and Frazier, Peter I. (2017). Bayesian Optimization with Gradients. Advances in Neural Information Processing Systems 30 (NeurIPS 2017), pp. 5267–5278.
Wu, Jian, Toscano-Palmerin, Saul, Frazier, Peter I., and Wilson, Andrew Gordon (2019). Practical Multi-Fidelity Bayesian Optimization for Hyperparameter Tuning. Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence (UAI 2019). Vol. 115. Proceedings of Machine Learning Research, pp. 788–798.
Yang, Kaifeng, Emmerich, Michael, Deutz, André, and Bäck, Thomas (2019a). Efficient Computation of Expected Hypervolume Improvement Using Box Decomposition Algorithms. Journal of Global Optimization 75(1):3–34.
Yang, Kaifeng, Emmerich, Michael, Deutz, André, and Bäck, Thomas (2019b). Multi-Objective Bayesian Global Optimization Using Expected Hypervolume Improvement Gradient. Swarm and Evolutionary Computation 44:945–956.
Yue, Xubo and Kontar, Raed Al (2020). Why Non-Myopic Bayesian Optimization Is Promising and How Far Should We Look-Ahead? A Study via Rollout. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 2808–2818.
Zhang, Weitong, Zhou, Dongruo, Li, Lihong, and Gu, Quanquan (2021). Neural Thompson Sampling. Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). arXiv: 2010.00827 [cs.LG].
Zhang, Yehong, Hoang, Trong Nghia, Low, Bryan Kian Hsiang, and Kankanhalli, Mohan (2017). Information-Based Multi-Fidelity Bayesian Optimization. Bayesian Optimization for Science and Engineering Workshop (BayesOpt 2017), Conference on Neural Information Processing Systems (NeurIPS 2017).
Zhou, Dongruo, Li, Lihong, and Gu, Quanquan (2020). Neural Contextual Bandits with UCB-Based Exploration. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 11492–11502.
Ziatdinov, Maxim A., Ghosh, Ayana, and Kalinin, Sergei V. (2021). Physics Makes the Difference: Bayesian Optimization and Active Learning via Augmented Gaussian Process. arXiv: 2108.10280 [physics.comp-ph].
Zilberstein, Shlomo (1996). Using Anytime Algorithms in Intelligent Systems. AI Magazine 17(3):73–83.
Žilinskas, Antanas (1975). Single-Step Bayesian Search Method for an Extremum of Functions of a Single Variable. Kibernetika (Cybernetics) 11(1):160–166.
Zitzler, Eckart (1999). Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications. Ph.D. thesis. Eidgenössische Technische Hochschule Zürich.
Zuluaga, Marcela, Krause, Andreas, and Püschel, Markus (2016). ε-PAL: An Active Learning Approach to the Multi-Objective Optimization Problem. Journal of Machine Learning Research 17(104):1–32.

  • references
  • Roman Garnett, Washington University in St Louis
  • Book: Bayesian Optimization
  • Online publication: 25 January 2023