

Published online by Cambridge University Press:  25 January 2023

Roman Garnett
Affiliation: Washington University in St Louis

Type: Chapter
Information: Bayesian Optimization, pp. 331–352
Publisher: Cambridge University Press
Print publication year: 2023


References

Abbasi-Yadkori, Yasin (2012). Online Learning for Linearly Parameterized Control Problems. Ph.D. thesis. University of Alberta.
Acerbi, Luigi (2018). Variational Bayesian Monte Carlo. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 8213–8223.
Acerbi, Luigi and Ma, Wei Ji (2017). Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search. Advances in Neural Information Processing Systems 30 (NeurIPS 2017), pp. 1836–1846.
Adams, Ryan Prescott, Murray, Iain, and MacKay, David J. C. (2009). Tractable Nonparametric Bayesian Inference in Poisson Processes with Gaussian Process Intensities. Proceedings of the 26th International Conference on Machine Learning (ICML 2009), pp. 9–16.
Adler, Robert J. and Taylor, Jonathan E. (2007). Random Fields and Geometry. Springer Monographs in Mathematics. Springer–Verlag.
Agrawal, Rajeev (1995). The Continuum-Armed Bandit Problem. SIAM Journal on Control and Optimization 33(6):1926–1951.
Agrawal, Shipra and Goyal, Navin (2012). Analysis of Thompson Sampling for the Multi-Armed Bandit Problem. Proceedings of the 25th Annual Conference on Learning Theory (COLT 2012). Vol. 23. Proceedings of Machine Learning Research, pp. 39.1–39.26.
Álvarez, Mauricio A., Rosasco, Lorenzo, and Lawrence, Neil D. (2012). Kernels for Vector-Valued Functions: A Review. Foundations and Trends in Machine Learning 4(3):195–266.
Arcones, Miguel A. (1992). On the arg max of a Gaussian Process. Statistics & Probability Letters 15(5):373–374.
Auer, Peter, Cesa-Bianchi, Nicolò, and Fischer, Paul (2002). Finite-Time Analysis of the Multiarmed Bandit Problem. Machine Learning 47(2–3):235–256.
Balandat, Maximilian, Karrer, Brian, Jiang, Daniel R., Daulton, Samuel, Letham, Benjamin, Wilson, Andrew Gordon, et al. (2020). BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 21524–21538.
Baptista, Ricardo and Poloczek, Matthias (2018). Bayesian Optimization of Combinatorial Structures. Proceedings of the 35th International Conference on Machine Learning (ICML 2018). Vol. 80. Proceedings of Machine Learning Research, pp. 462–471.
Bather, John (1996). A Conversation with Herman Chernoff. Statistical Science 11(4):335–350.
Belakaria, Syrine, Deshwal, Aryan, and Doppa, Janardhan Rao (2019). Max-value Entropy Search for Multi-Objective Bayesian Optimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 7825–7835.
Bellman, Richard (1952). On the Theory of Dynamic Programming. Proceedings of the National Academy of Sciences 38(8):716–719.
Bellman, Richard (1957). Dynamic Programming. Princeton University Press.
Berger, James O. (1985). Statistical Decision Theory and Bayesian Analysis. 2nd ed. Springer Series in Statistics. Springer–Verlag.
Bergstra, James, Bardenet, Rémi, Bengio, Yoshua, and Kégl, Balázs (2011). Algorithms for Hyper-Parameter Optimization. Advances in Neural Information Processing Systems 24 (NeurIPS 2011), pp. 2546–2554.
Bergstra, James and Bengio, Yoshua (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research 13:281–305.
Berkenkamp, Felix, Schoellig, Angela P., and Krause, Andreas (2019). No-Regret Bayesian Optimization with Unknown Hyperparameters. Journal of Machine Learning Research 20(50):1–24.
Berry, Donald A. and Fristedt, Bert (1985). Bandit Problems: Sequential Allocation of Experiments. Monographs on Statistics and Applied Probability. Chapman & Hall.
Bertsekas, Dimitri P. (2017). Dynamic Programming and Optimal Control. 4th ed. Vol. 1. Athena Scientific.
Bochner, S. (1933). Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Mathematische Annalen 108:378–410.
Bogunovic, Ilija, Krause, Andreas, and Scarlett, Jonathan (2020). Corruption-Tolerant Gaussian Process Bandit Optimization. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 1071–1081.
Bogunovic, Ilija, Scarlett, Jonathan, Jegelka, Stefanie, and Cevher, Volkan (2018). Adversarially Robust Optimization with Gaussian Processes. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 5760–5770.
Box, G. E. P. (1954). The Exploration and Exploitation of Response Surfaces: Some General Considerations and Examples. Biometrics 10(1):16–60.
Box, George E. P., Hunter, J. Stuart, and Hunter, William G. (2005). Statistics for Experimenters: Design, Innovation, and Discovery. 2nd ed. Wiley Series in Probability and Statistics. John Wiley & Sons.
Box, G. E. P. and Wilson, K. B. (1951). On the Experimental Attainment of Optimum Conditions. Journal of the Royal Statistical Society Series B (Methodological) 13(1):1–45.
Box, G. E. P. and Youle, P. V. (1954). The Exploration and Exploitation of Response Surfaces: An Example of the Link between the Fitted Surface and the Basic Mechanism of the System. Biometrics 11(3):287–323.
Breiman, Leo (2001). Random Forests. Machine Learning 45(1):5–32.
Brent, Richard P. (1973). Algorithms for Minimization without Derivatives. Prentice–Hall Series in Automatic Computation. Prentice–Hall.
Brochu, Eric, Cora, Vlad M., and de Freitas, Nando (2010). A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. arXiv: 1012.2599 [cs.LG].
Brooks, Steve, Gelman, Andrew, Jones, Galin L., and Meng, Xiao-Li, eds. (2011). Handbook of Markov Chain Monte Carlo. Handbooks of Modern Statistical Methods. Chapman & Hall.
Bubeck, Sébastien and Cesa-Bianchi, Nicolò (2012). Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems. Foundations and Trends in Machine Learning 5(1):1–122.
Bubeck, Sébastien, Munos, Rémi, and Stoltz, Gilles (2009). Pure Exploration in Multi-Armed Bandits Problems. Proceedings of the 20th International Conference on Algorithmic Learning Theory (ALT 2009). Vol. 5809. Lecture Notes in Computer Science. Springer–Verlag, pp. 23–37.
Bull, Adam D. (2011). Convergence Rates of Efficient Global Optimization Algorithms. Journal of Machine Learning Research 12(88):2879–2904.
Caflisch, Russel E. (1998). Monte Carlo and Quasi-Monte Carlo Methods. Acta Numerica 7:1–49.
Cai, Xu, Gomes, Selwyn, and Scarlett, Jonathan (2021). Lenient Regret and Good-Action Identification in Gaussian Process Bandits. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 1183–1192.
Cakmak, Sait, Astudillo, Raul, Frazier, Peter, and Zhou, Enlu (2020). Bayesian Optimization of Risk Measures. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 18039–18049.
Calandra, Roberto, Peters, Jan, Rasmussen, Carl Edward, and Deisenroth, Marc Peter (2016). Manifold Gaussian Processes for Regression. Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN 2016), pp. 3338–3345.
Calvin, J. and Žilinskas, A. (1999). On the Convergence of the P-Algorithm for One-Dimensional Global Optimization of Smooth Functions. Journal of Optimization Theory and Applications 102(3):479–495.
Calvin, James M. (1993). Consistency of a Myopic Bayesian Algorithm for One-Dimensional Global Optimization. Journal of Global Optimization 3(2):223–232.
Calvin, James M. (2000). Convergence Rate of the P-Algorithm for Optimization of Continuous Functions. In: Approximation and Complexity in Numerical Optimization: Continuous and Discrete Problems. Ed. by Pardalos, Panos M. Vol. 42. Nonconvex Optimization and Its Applications. Springer–Verlag, pp. 116–129.
Calvin, James M. and Žilinskas, Antanas (2001). On Convergence of a P-Algorithm Based on a Statistical Model of Continuously Differentiable Functions. Journal of Global Optimization 19(3):229–245.
Camilleri, Romain, Katz-Samuels, Julian, and Jamieson, Kevin (2021). High-Dimensional Experimental Design and Kernel Bandits. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 1227–1237.
Cartinhour, Jack (1989). One-Dimensional Marginal Density Functions of a Truncated Multivariate Normal Density Function. Communications in Statistics – Theory and Methods 19(1):197–203.
Chapelle, Olivier and Li, Lihong (2011). An Empirical Evaluation of Thompson Sampling. Advances in Neural Information Processing Systems 24 (NeurIPS 2011), pp. 2249–2257.
Chernoff, Herman (1959). Sequential Design of Experiments. The Annals of Mathematical Statistics 30(3):755–770.
Chernoff, Herman (1972). Sequential Analysis and Optimal Design. CBMS–NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics.
Chevalier, Clément and Ginsbourger, David (2013). Fast Computation of the Multi-Points Expected Improvement with Applications in Batch Selection. Proceedings of the 7th Learning and Intelligent Optimization Conference (LION 7). Vol. 7997. Lecture Notes in Computer Science. Springer–Verlag, pp. 59–69.
Chowdhury, Sayak Ray and Gopalan, Aditya (2017). On Kernelized Multi-Armed Bandits. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 844–853.
Clark, Charles E. (1961). The Greatest of a Finite Set of Random Variables. Operations Research 9(2):145–162.
Contal, Emile, Buffoni, David, Robicquet, Alexandre, and Vayatis, Nicolas (2013). Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration. Proceedings of the 2013 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2013). Vol. 8188. Lecture Notes in Computer Science. Springer–Verlag, pp. 225–240.
Cover, Thomas M. and Thomas, Joy A. (2006). Elements of Information Theory. 2nd ed. John Wiley & Sons.
Cox, Dennis D. and John, Susan (1992). A Statistical Method for Global Optimization. Proceedings of the 1992 IEEE International Conference on Systems, Man, and Cybernetics (SMC 1992), pp. 1241–1246.
Cunningham, John P., Hennig, Philipp, and Lacoste-Julien, Simon (2011). Gaussian Probabilities and Expectation Propagation. arXiv: 1111.6832 [stat.ML].
Cutajar, Kurt, Osborne, Michael A., Cunningham, John P., and Filippone, Maurizio (2016). Preconditioning Kernel Matrices. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016). Vol. 48. Proceedings of Machine Learning Research, pp. 2529–2538.
Dai, Zhenwen, Damianou, Andreas, González, Javier, and Lawrence, Neil (2016). Variational Auto-encoded Deep Gaussian Processes. Proceedings of the 4th International Conference on Learning Representations (ICLR 2016). arXiv: 1511.06455 [cs.LG].
Dai, Zhongxiang, Yu, Haibin, Low, Bryan Kian Hsiang, and Jaillet, Patrick (2019). Bayesian Optimization Meets Bayesian Optimal Stopping. Proceedings of the 36th International Conference on Machine Learning (ICML 2019). Vol. 97. Proceedings of Machine Learning Research, pp. 1496–1506.
Dalibard, Valentin, Schaarschmidt, Michael, and Yoneki, Eiko (2017). BOAT: Building Auto-Tuners with Structured Bayesian Optimization. Proceedings of the 26th International Conference on World Wide Web (WWW 2017), pp. 479–488.
Dani, Varsha, Hayes, Thomas P., and Kakade, Sham M. (2008). Stochastic Linear Optimization under Bandit Feedback. Proceedings of the 21st Conference on Learning Theory (COLT 2008), pp. 355–366.
Davis, Philip J. and Rabinowitz, Philip (1984). Methods of Numerical Integration. 2nd ed. Computer Science and Applied Mathematics. Academic Press.
De Ath, George, Everson, Richard M., Rahat, Alma A., and Fieldsend, Jonathan E. (2021). Greed Is Good: Exploration and Exploitation Trade-offs in Bayesian Optimisation. ACM Transactions on Evolutionary Learning and Optimization 1(1):1–22.
De Ath, George, Fieldsend, Jonathan E., and Everson, Richard M. (2020). What Do You Mean? The Role of the Mean Function in Bayesian Optimization. Proceedings of the 2020 Genetic and Evolutionary Computation Conference (GECCO 2020), pp. 1623–1631.
de Freitas, Nando, Smola, Alex J., and Zoghi, Masrour (2012a). Regret Bounds for Deterministic Gaussian Process Bandits. arXiv: 1203.2177 [cs.LG].
de Freitas, Nando, Smola, Alex J., and Zoghi, Masrour (2012b). Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations. Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 955–962.
DeGroot, Morris H. (1970). Optimal Statistical Decisions. McGraw–Hill.
Desautels, Thomas, Krause, Andreas, and Burdick, Joel W. (2014). Parallelizing Exploration–Exploitation Tradeoffs in Gaussian Process Bandit Optimization. Journal of Machine Learning Research 15(119):4053–4103.
Diaconis, Persi (1988). Bayesian Numerical Analysis. In: Statistical Decision Theory and Related Topics IV. Ed. by Gupta, Shanti S. and Berger, James O. Vol. 1. Springer–Verlag, pp. 163–175.
Djolonga, Josip, Krause, Andreas, and Cevher, Volkan (2013). High-Dimensional Gaussian Process Bandits. Advances in Neural Information Processing Systems 26 (NeurIPS 2013), pp. 1025–1033.
Domhan, Tobias, Springenberg, Jost Tobias, and Hutter, Frank (2015). Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves. Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), pp. 3460–3468.
Duvenaud, David, Lloyd, James Robert, Grosse, Roger, Tenenbaum, Joshua B., and Ghahramani, Zoubin (2013). Structure Discovery in Nonparametric Regression through Compositional Kernel Search. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 1166–1174.
Emmerich, Michael T. M., Giannakoglou, Kyriakos C., and Naujoks, Boris (2006). Single- and Multiobjective Evolutionary Optimization Assisted by Gaussian Random Field Metamodels. IEEE Transactions on Evolutionary Computation 10(4):421–439.
Emmerich, Michael and Naujoks, Boris (2004). Metamodel Assisted Multiobjective Optimisation Strategies and Their Application in Airfoil Design. In: Adaptive Computing in Design and Manufacture VI. Ed. by Parmee, I. C. Springer–Verlag, pp. 249–260.
Eriksson, David, Pearce, Michael, Gardner, Jacob R., Turner, Ryan, and Poloczek, Matthias (2019). Scalable Global Optimization via Local Bayesian Optimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 5496–5507.
Fernández-Delgado, Manuel, Cernadas, Eva, Barro, Senén, and Amorim, Dinani (2014). Do We Need Hundreds of Classifiers to Solve Real World Classification Problems? Journal of Machine Learning Research 15(90):3133–3181.
Feurer, Matthias, Springenberg, Jost Tobias, and Hutter, Frank (2015). Initializing Bayesian Hyperparameter Optimization via Meta-Learning. Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI 2015), pp. 1128–1135.
Fisher, Ronald A. (1935). The Design of Experiments. Oliver and Boyd.
Fleischer, M. (2003). The Measure of Pareto Optima: Applications to Multi-Objective Metaheuristics. Proceedings of the 2nd International Conference on Evolutionary Multi-Criterion Optimization (EMO 2003). Vol. 2632. Lecture Notes in Computer Science. Springer–Verlag, pp. 519–533.
Forrester, Alexander I. J., Keane, Andy J., and Bressloff, Neil W. (2006). Design and Analysis of “Noisy” Computer Experiments. AIAA Journal 44(10):2331–2339.
Frazier, Peter and Powell, Warren (2007). The Knowledge Gradient Policy for Offline Learning with Independent Normal Rewards. Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007), pp. 143–150.
Frazier, Peter, Powell, Warren, and Dayanik, Savas (2009). The Knowledge-Gradient Policy for Correlated Normal Beliefs. INFORMS Journal on Computing 21(4):599–613.
Friedman, Milton and Savage, L. J. (1947). Planning Experiments Seeking Maxima. In: Selected Techniques of Statistical Analysis for Scientific and Industrial Research, and Production and Management Engineering. Ed. by Eisenhart, Churchill, Hastay, Millard W., and Wallis, W. Allen. McGraw–Hill, pp. 363–372.
Fröhlich, Lukas P., Klenske, Edgar D., Vinogradska, Julia, Daniel, Christian, and Zeilinger, Melanie N. (2020). Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 2262–2272.
García-Barcos, Javier and Martinez-Cantin, Ruben (2021). Robust Policy Search for Robot Navigation. IEEE Robotics and Automation Letters 6(2):2389–2396.
Gardner, Jacob R., Guo, Chuan, Weinberger, Kilian Q., Garnett, Roman, and Grosse, Roger (2017). Discovering and Exploiting Additive Structure for Bayesian Optimization. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). Vol. 54. Proceedings of Machine Learning Research, pp. 1311–1319.
Gardner, Jacob R., Kusner, Matt J., Xu, Zhixiang (Eddie), Weinberger, Kilian Q., and Cunningham, John P. (2014). Bayesian Optimization with Inequality Constraints. Proceedings of the 31st International Conference on Machine Learning (ICML 2014). Vol. 32. Proceedings of Machine Learning Research, pp. 937–945.
Gardner, Jacob R., Pleiss, Geoff, Bindel, David, Weinberger, Kilian Q., and Wilson, Andrew Gordon (2018). GPyTorch: Blackbox Matrix–Matrix Gaussian Process Inference with GPU Acceleration. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 7576–7586.
Garnett, Roman, Gärtner, Thomas, Vogt, Martin, and Bajorath, Jürgen (2015). Introducing the ‘Active Search’ Method for Iterative Virtual Screening. Journal of Computer-Aided Molecular Design 29(4):305–314.
Garnett, Roman, Krishnamurthy, Yamuna, Xiong, Xuehan, Schneider, Jeff, and Mann, Richard (2012). Bayesian Optimal Active Search and Surveying. Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 1239–1246.
Garnett, Roman, Osborne, Michael A., and Hennig, Philipp (2014). Active Learning of Linear Embeddings for Gaussian Processes. Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 230–239.
Garnett, R., Osborne, M. A., and Roberts, S. J. (2010). Bayesian Optimization for Sensor Set Selection. Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN 2010), pp. 209–219.
Gelbart, Michael A., Snoek, Jasper, and Adams, Ryan P. (2014). Bayesian Optimization with Unknown Constraints. Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 250–259.
Gelman, Andrew and Vehtari, Aki (2021). What Are the Most Important Statistical Ideas of the Past 50 Years? Journal of the American Statistical Association 116(536):2087–2097.
Gergonne, Joseph Diez (1815). Application de la méthode des moindres quarrés à l’interpolation des suites. Annales de Mathématiques pures et appliquées 6:242–252.
Ghosal, Subhashis and Roy, Anindya (2006). Posterior Consistency of Gaussian Process Prior for Nonparametric Binary Regression. The Annals of Statistics 34(5):2413–2429.
Gibbs, Mark N. (1997). Bayesian Gaussian Processes for Regression and Classification. Ph.D. thesis. University of Cambridge.
Gilboa, Elad, Saatçi, Yunus, and Cunningham, John P. (2013). Scaling Multidimensional Gaussian Processes Using Projected Additive Approximations. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 454–461.
Ginsbourger, David and Le Riche, Rodolphe (2010). Towards Gaussian Process-Based Optimization with Finite Time Horizon. Proceedings of the 9th International Workshop in Model-Oriented Design and Analysis (mODa 9). Contributions to Statistics. Springer–Verlag, pp. 89–96.
Ginsbourger, David, Le Riche, Rodolphe, and Carraro, Laurent (2010). Kriging Is Well-Suited to Parallelize Optimization. In: Computational Intelligence in Expensive Optimization Problems. Ed. by Tenne, Yoel and Goh, Chi-Keong. Adaptation, Learning, and Optimization. Springer–Verlag, pp. 131–162.
Golovin, Daniel and Zhang, Qiuyi (Richard) (2020). Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 11096–11105.
Golub, Gene H. and Van Loan, Charles F. (2013). Matrix Computations. 4th ed. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press.
Gómez-Bombarelli, Rafael, Wei, Jennifer N., Duvenaud, David, Hernández-Lobato, José Miguel, Sánchez-Lengeling, Benjamín, Sheberla, Dennis, et al. (2018). Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Central Science 4(2):268–276.
González, Javier, Dai, Zhenwen, Hennig, Philipp, and Lawrence, Neil (2016a). Batch Bayesian Optimization via Local Penalization. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 648–657.
González, Javier, Osborne, Michael, and Lawrence, Neil D. (2016b). GLASSES: Relieving the Myopia of Bayesian Optimisation. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 790–799.
Gramacy, Robert B. and Lee, Herbert K. H. (2011). Optimization under Unknown Constraints. In: Bayesian Statistics 9. Ed. by Bernardo, J. M., Bayarri, M. J., Berger, J. O., Dawid, A. P., Heckerman, D., Smith, A. F. M., et al. Oxford University Press, pp. 229–256.
Granmo, Ole-Christoffer (2010). Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton. International Journal of Intelligent Computing and Cybernetics 3(2):207–234.
Grünewälder, Steffen, Audibert, Jean-Yves, Opper, Manfred, and Shawe-Taylor, John (2010). Regret Bounds for Gaussian Process Bandit Problems. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). Vol. 9. Proceedings of Machine Learning Research, pp. 273–280.
Hansen, Nikolaus (2016). The CMA Evolution Strategy: A Tutorial. arXiv: 1604.00772 [cs.LG].
Hastie, Trevor and Tibshirani, Robert (1986). Generalized Additive Models. Statistical Science 1(3):297–318.
Hennig, Philipp, Osborne, Michael A., and Girolami, Mark (2015). Probabilistic Numerics and Uncertainty in Computations. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 471(2179):20150142.
Hennig, Philipp, Osborne, Michael A., and Kersting, Hans (2022). Probabilistic Numerics: Computation as Machine Learning. Cambridge University Press.
Hennig, Philipp and Schuler, Christian J. (2012). Entropy Search for Information-Efficient Global Optimization. Journal of Machine Learning Research 13(Jun):1809–1837.
Hensman, James, Matthews, Alexander G. de G., Filippone, Maurizio, and Ghahramani, Zoubin (2015). MCMC for Variationally Sparse Gaussian Processes. Advances in Neural Information Processing Systems 28 (NeurIPS 2015), pp. 1648–1656.
Hernández-Lobato, Daniel, Hernández-Lobato, José Miguel, Shah, Amar, and Adams, Ryan P. (2016a). Predictive Entropy Search for Multi-Objective Bayesian Optimization. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016). Vol. 48. Proceedings of Machine Learning Research, pp. 1492–1501.
Hernández-Lobato, José Miguel, Gelbart, Michael A., Adams, Ryan P., Hoffman, Matthew W., and Ghahramani, Zoubin (2016b). A General Framework for Constrained Bayesian Optimization Using Information-Based Search. Journal of Machine Learning Research 17:1–53.
Hernández-Lobato, José Miguel, Hoffman, Matthew W., and Ghahramani, Zoubin (2014). Predictive Entropy Search for Efficient Global Optimization of Black-Box Functions. Advances in Neural Information Processing Systems 27 (NeurIPS 2014), pp. 918–926.
Hernández-Lobato, José Miguel, Requeima, James, Pyzer-Knapp, Edward O., and Aspuru-Guzik, Alán (2017). Parallel and Distributed Thompson Sampling for Large-Scale Accelerated Exploration of Chemical Space. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 1470–1479.
Hestenes, Magnus R. and Stiefel, Eduard (1952). Methods of Conjugate Gradients for Solving Linear Systems. Journal of Research of the National Bureau of Standards 49(6):409–436.
Hie, Brian L. and Yang, Kevin K. (2021). Adaptive Machine Learning for Protein Engineering. arXiv: 2106.05466 [q-bio.QM].
Hoang, Trong Nghia, Hoang, Quang Minh, Ouyang, Ruofei, and Low, Kian Hsiang (2018). Decentralized High-Dimensional Bayesian Optimization with Factor Graphs. Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), pp. 3231–3238.
Hoffman, Matthew D. and Gelman, Andrew (2014). The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research 15(4):1593–1623.
Hoffman, Matthew W. and Ghahramani, Zoubin (2015). Output-Space Predictive Entropy Search for Flexible Global Optimization. Bayesian Optimization: Scalability and Flexibility Workshop (BayesOpt 2015), Conference on Neural Information Processing Systems (NeurIPS 2015).
Hoffman, Matthew W. and Shahriari, Bobak (2014). Modular Mechanisms for Bayesian Optimization. Bayesian Optimization in Academia and Industry (BayesOpt 2014), Conference on Neural Information Processing Systems (NeurIPS 2014).
Hotelling, Harold (1941). Experimental Determination of the Maximum of a Function. The Annals of Mathematical Statistics 12(1):20–45.
Houlsby, Neil, Hernández-Lobato, José Miguel, Huszár, Ferenc, and Ghahramani, Zoubin (2012). Collaborative Gaussian Processes for Preference Learning. Advances in Neural Information Processing Systems 25 (NeurIPS 2012), pp. 2096–2104.
Huang, D., Allen, T. T., Notz, W. I., and Miller, R. A. (2006a). Sequential Kriging Optimization Using Multiple-Fidelity Evaluations. Structural and Multidisciplinary Optimization 32(5):369–382.
Huang, D., Allen, T. T., Notz, W. I., and Zeng, N. (2006b). Global Optimization of Stochastic Black-Box Systems via Sequential Kriging Meta-Models. Journal of Global Optimization 34(3):441–466.
Hutter, Frank, Hoos, Holger H., and Leyton-Brown, Kevin (2011). Sequential Model-Based Optimization for General Algorithm Configuration. Proceedings of the 5th Learning and Intelligent Optimization Conference (LION 5). Vol. 6683. Lecture Notes in Computer Science. Springer–Verlag, pp. 507–523.
Hutter, Frank, Xu, Lin, Hoos, Holger H., and Leyton-Brown, Kevin (2014). Algorithm Runtime Prediction: Methods & Evaluation. Artificial Intelligence 206:79–111.
Ingersoll Jr., Jonathan E. (1987). Theory of Financial Decision Making. Rowman & Littlefield Studies in Financial Economics. Rowman & Littlefield.
Irwin, John J., Tang, Khanh G., Young, Jennifer, Dandarchuluun, Chinzorig, Wong, Benjamin R., Khurelbaatar, Munkhzul, et al. (2020). ZINC20 – A Free Ultralarge-Scale Chemical Database for Ligand Discovery. Journal of Chemical Information and Modeling 60(12):6065–6073.
Janz, David, Burt, David R., and González, Javier (2020). Bandit Optimisation of Functions in the Matérn Kernel RKHS. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 2486–2495.
Jiang, Shali, Chai, Henry, González, Javier, and Garnett, Roman (2020a). BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 4794–4803.
Jiang, Shali, Jiang, Daniel R., Balandat, Maximilian, Karrer, Brian, Gardner, Jacob R., and Garnett, Roman (2020b). Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 18039–18049.
Jiang, Shali, Malkomes, Gustavo, Converse, Geoff, Shofner, Alyssa, Moseley, Benjamin, and Garnett, Roman (2017). Efficient Nonmyopic Active Search. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 1714–1723.
Jones, D. R., Perttunen, C. D., and Stuckman, B. E. (1993). Lipschitzian Optimization without the Lipschitz Constant. Journal of Optimization Theory and Applications 79(1):157–181.
Jones, Donald R. (2001). A Taxonomy of Global Optimization Methods Based on Response Surfaces. Journal of Global Optimization 21(4):345–383.
Jones, Donald R., Schonlau, Matthias, and Welch, William J. (1998). Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization 13(4):455–492.
Jylänki, Pasi, Vanhatalo, Jarno, and Vehtari, Aki (2011). Robust Gaussian Process Regression with a Student-t Likelihood. Journal of Machine Learning Research 12(99):3227–3257.
Kanagawa, Motonobu, Hennig, Philipp, Sejdinovic, Dino, and Sriperum-Budur, Bharath K. (2018). Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences. arXiv: 1807.02582 [stat.Ml].Google Scholar
Kandasamy, Kirthevasan, Dasarathy, Gautam, Oliva, Junier, Schneider, Jeff, and Póczos, Barnabás (2016). Gaussian Process Bandit Optimisation with Multi-Fidelity Evaluations. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 992–1000.
Kandasamy, Kirthevasan, Dasarathy, Gautam, Schneider, Jeff, and Póczos, Barnabás (2017). Multi-Fidelity Bayesian Optimisation with Continuous Approximations. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 1799–1808.
Kandasamy, Kirthevasan, Krishnamurthy, Akshay, Schneider, Jeff, and Póczos, Barnabás (2018). Parallelised Bayesian Optimisation via Thompson Sampling. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). Vol. 84. Proceedings of Machine Learning Research, pp. 133–142.
Kandasamy, Kirthevasan, Schneider, Jeff, and Póczos, Barnabás (2015). High Dimensional Bayesian Optimisation and Bandits via Additive Models. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015). Vol. 37. Proceedings of Machine Learning Research, pp. 295–304.
Kathuria, Tarun, Deshpande, Amit, and Kohli, Pushmeet (2016). Batched Gaussian Process Bandit Optimization via Determinantal Point Processes. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 4206–4214.
Kim, Jeankyung and Pollard, David (1990). Cube Root Asymptotics. The Annals of Statistics 18(1):191–219.
Kim, Jungtaek and Choi, Seungjin (2020). On Local Optimizers of Acquisition Functions in Bayesian Optimization. Lecture Notes in Computer Science 12458:675–690.
Kim, Jungtaek, McCourt, Michael, You, Tackgeun, Kim, Saehoon, and Choi, Seungjin (2021). Bayesian Optimization with Approximate Set Kernels. Machine Learning 110(5):857–879.
Klein, Aaron, Bartels, Simon, Falkner, Stefan, Hennig, Philipp, and Hutter, Frank (2015). Towards Efficient Bayesian Optimization for Big Data. Bayesian Optimization: Scalability and Flexibility Workshop (BayesOpt 2015), Conference on Neural Information Processing Systems (NeurIPS 2015).
Klein, Aaron, Falkner, Stefan, Springenberg, Jost Tobias, and Hutter, Frank (2017). Learning Curve Prediction with Bayesian Neural Networks. Proceedings of the 5th International Conference on Learning Representations (ICLR 2017).
Knowles, Joshua (2005). ParEGO: A Hybrid Algorithm with On-Line Landscape Approximation for Expensive Multiobjective Optimization Problems. IEEE Transactions on Evolutionary Computation 10(1):50–66.
Ko, Chun-Wa, Lee, Jon, and Queyranne, Maurice (1995). An Exact Algorithm for Maximum Entropy Sampling. Operations Research 43(4):684–691.
Konishi, Sadanori and Kitagawa, Genshiro (2008). Information Criteria and Statistical Modeling. Springer Series in Statistics. Springer–Verlag.
Kschischang, Frank R., Frey, Brendan J., and Loeliger, Hans-Andrea (2001). Factor Graphs and the Sum–Product Algorithm. IEEE Transactions on Information Theory 47(2):498–519.
Kulesza, Alex and Taskar, Ben (2012). Determinantal Point Processes for Machine Learning. Foundations and Trends in Machine Learning 5(2–3):123–286.
Kushner, Harold J. (1962). A Versatile Stochastic Model of a Function of Unknown and Time Varying Form. Journal of Mathematical Analysis and Applications 5(1):150–167.
Kushner, H. J. (1964). A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise. Journal of Basic Engineering 86(1):97–106.
Kuss, Malte (2006). Gaussian Process Models for Robust Regression, Classification, and Reinforcement Learning. Ph.D. thesis. Technische Universität Darmstadt.
Lai, T. L. and Robbins, Herbert (1985). Asymptotically Efficient Adaptive Allocation Rules. Advances in Applied Mathematics 6(1):4–22.
Lam, Remi R., Willcox, Karen E., and Wolpert, David H. (2016). Bayesian Optimization with a Finite Budget: An Approximate Dynamic Programming Approach. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 883–891.
Lange, Kenneth L., Little, Roderick J. A., and Taylor, Jeremy M. G. (1989). Robust Statistical Modeling Using the t Distribution. Journal of the American Statistical Association 84(408):881–896.
Lattimore, Tor and Szepesvári, Csaba (2020). Bandit Algorithms. Cambridge University Press.
Lázaro-Gredilla, Miguel, Quiñonero-Candela, Joaquin, Rasmussen, Carl Edward, and Figueiras-Vidal, Aníbal R. (2010). Sparse Spectrum Gaussian Process Regression. Journal of Machine Learning Research 11(Jun):1865–1881.
Letham, Benjamin, Karrer, Brian, Ottoni, Guilherme, and Bakshy, Eytan (2019). Constrained Bayesian Optimization with Noisy Experiments. Bayesian Analysis 14(2):495–519.
Levina, Elizaveta and Bickel, Peter J. (2004). Maximum Likelihood Estimation of Intrinsic Dimension. Advances in Neural Information Processing Systems 17 (NeurIPS 2004), pp. 777–784.
Lévy, Paul (1948). Processus stochastiques et mouvement brownien. Gauthier–Villars.
Li, Chun-Liang, Kandasamy, Kirthevasan, Póczos, Barnabás, and Schneider, Jeff (2016). High Dimensional Bayesian Optimization via Restricted Projection Pursuit Models. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 884–892.
Li, Chunyuan, Farkhoor, Heerad, Liu, Rosanne, and Yosinski, Jason (2018a). Measuring the Intrinsic Dimension of Objective Landscapes. Proceedings of the 6th International Conference on Learning Representations (ICLR 2018). arXiv: 1804.08838 [cs.LG].
Li, Lisha, Jamieson, Kevin, DeSalvo, Giulia, Rostamizadeh, Afshin, and Talwalkar, Ameet (2018b). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research 18(185):1–52.
Li, Zihan and Scarlett, Jonathan (2021). Gaussian Process Bandit Optimization with Few Batches. arXiv: 2110.07788 [stat.ML].
Lindley, D. V. (1956). On a Measure of the Information Provided by an Experiment. The Annals of Mathematical Statistics 27(4):986–1005.
Lindley, D. V. (1972). Bayesian Statistics, A Review. CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics.
Locatelli, M. (1997). Bayesian Algorithms for One-Dimensional Global Optimization. Journal of Global Optimization 10(1):57–76.
Löwner, Karl (1934). Über monotone Matrixfunktionen. Mathematische Zeitschrift 38:177–216.
Lukic, Milan N. and Beder, Jay H. (2001). Stochastic Processes with Sample Paths in Reproducing Kernel Hilbert Spaces. Transactions of the American Mathematical Society 353(10):3945–3969.
Lyu, Yueming, Yuan, Yuan, and Tsang, Ivor W. (2019). Efficient Batch Black-Box Optimization with Deterministic Regret Bounds. arXiv: 1905.10041 [cs.LG].
MacKay, David J. C. (1998). Introduction to Gaussian Processes. In: Neural Networks and Machine Learning. Ed. by Bishop, Christopher M. Vol. 168. NATO ASI Series F: Computer and Systems Sciences. Springer–Verlag, pp. 133–165.
MacKay, David J. C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press.
Mahsereci, Maren and Hennig, Philipp (2015). Probabilistic Line Searches for Stochastic Optimization. Advances in Neural Information Processing Systems 28 (NeurIPS 2015), pp. 181–189.
Malkomes, Gustavo and Garnett, Roman (2018). Automating Bayesian Optimization with Bayesian Optimization. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 5984–5994.
Malkomes, Gustavo, Schaff, Chip, and Garnett, Roman (2016). Bayesian Optimization for Automated Model Selection. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 2900–2908.
Maraval, Alexandre, Zimmer, Matthieu, Grosnit, Antoine, Tutunov, Rasul, Wang, Jun, and Ammar, Haitham Bou (2022). Sample-Efficient Optimisation with Probabilistic Transformer Surrogates. arXiv: 2205.13902 [cs.LG].
Marmin, Sébastien, Chevalier, Clément, and Ginsbourger, David (2015). Differentiating the Multipoint Expected Improvement for Optimal Batch Design. Proceedings of the 1st International Workshop on Machine Learning, Optimization, and Big Data (MOD 2015). Vol. 9432. Lecture Notes in Computer Science. Springer–Verlag, pp. 37–48.
Marschak, Jacob and Radner, Roy (1972). Economic Theory of Teams. Yale University Press.
Martinez-Cantin, Ruben, Tee, Kevin, and McCourt, Michael (2018). Practical Bayesian Optimization in the Presence of Outliers. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). Vol. 84. Proceedings of Machine Learning Research, pp. 1722–1731.
Massart, Pascal (2007). Concentration Inequalities and Model Selection: École d'Été de Probabilités de Saint-Flour XXXIII – 2003. Vol. 1896. Lecture Notes in Mathematics. Springer–Verlag.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. 2nd ed. Monographs on Statistics and Applied Probability. Chapman & Hall.
Meinshausen, Nicolai (2006). Quantile Regression Forests. Journal of Machine Learning Research 7(35):983–999.
Miettinen, Kaisa M. (1998). Nonlinear Multiobjective Optimization. International Series in Operations Research & Management Science. Kluwer Academic Publishers.
Milgrom, Paul and Segal, Ilya (2002). Envelope Theorems for Arbitrary Choice Sets. Econometrica 70(2):583–601.
Minka, Thomas P. (2001). A Family of Algorithms for Approximate Bayesian Inference. Ph.D. thesis. Massachusetts Institute of Technology.
Minka, Thomas (2008). EP: A Quick Reference. URL: https://tminka.github.io/papers/ep/minka-ep-quickref.pdf.
Mockus, Jonas (1972). Bayesian Methods of Search for an Extremum. Avtomatika i Vychislitel'naya Tekhnika (Automatic Control and Computer Sciences) 6(3):53–62.
Mockus, Jonas (1974). On Bayesian Methods for Seeking the Extremum. Optimization Techniques: IFIP Technical Conference. Vol. 27. Lecture Notes in Computer Science. Springer–Verlag, pp. 400–404.
Mockus, Jonas (1989). Bayesian Approach to Global Optimization: Theory and Applications. Mathematics and Its Applications. Kluwer Academic Publishers.
Mockus, Jonas, Eddy, William, Mockus, Audris, Mockus, Linas, and Reklaitis, Gintaras (2010). Bayesian Heuristic Approach to Discrete and Global Optimization: Algorithms, Visualization, Software, and Applications. Nonconvex Optimization and Its Applications. Kluwer Academic Publishers.
Mockus, J., Tiesis, V., and Žilinskas, A. (1978). The Application of Bayesian Methods for Seeking the Extremum. In: Towards Global Optimization 2. Ed. by Dixon, L. C. W. and Szegő, G. P. North–Holland, pp. 117–129.
Møller, Jesper, Syversveen, Anne Randi, and Waagepetersen, Rasmus Plenge (1998). Log Gaussian Cox Processes. Scandinavian Journal of Statistics 25(3):451–482.
Montgomery, Douglas C. (2019). Design and Analysis of Experiments. 10th ed. John Wiley & Sons.
Moore, Andrew W. and Atkeson, Christopher G. (1993). Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping. Advances in Neural Information Processing Systems 5 (NeurIPS 1992), pp. 263–270.
Moriconi, Riccardo, Deisenroth, Marc Peter, and Kumar, K. S. Sesh (2020). High-Dimensional Bayesian Optimization Using Low-Dimensional Feature Spaces. Machine Learning 109(9–10):1925–1943.
Moss, Henry B., Beck, Daniel, González, Javier, Leslie, David S., and Rayson, Paul (2020). BOSS: Bayesian Optimization over String Spaces. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 15476–15486.
Müller, Sarah, von Rohr, Alexander, and Trimpe, Sebastian (2021). Local Policy Search with Bayesian Optimization. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), pp. 20708–20720.
Murray, Iain (2016). Differentiation of the Cholesky decomposition. arXiv: 1602.07527 [stat.CO].
Murray, Iain, Adams, Ryan Prescott, and MacKay, David J. C. (2010). Elliptical Slice Sampling. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). Vol. 9. Proceedings of Machine Learning Research, pp. 541–548.
Mutný, Mojmír and Krause, Andreas (2018). Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 9005–9016.
Neal, Radford M. (1997). Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification. Technical report (9702). Department of Statistics, University of Toronto.
Neal, Radford M. (1998). Regression and Classification Using Gaussian Process Priors. In: Bayesian Statistics 6. Ed. by Bernardo, J. M., Berger, J. O., Dawid, A. P., and Smith, A. F. M. Oxford University Press, pp. 475–490.
Nguyen, Quan, Wu, Kaiwen, Gardner, Jacob R., and Garnett, Roman (2022). Local Bayesian Optimization via Maximizing Probability of Descent. arXiv: 2210.11662 [cs.LG].
Nguyen, Quoc Phong, Dai, Zhongxiang, Low, Bryan Kian Hsiang, and Jaillet, Patrick (2021a). Optimizing Conditional Value-At-Risk of Black-Box Functions. Advances in Neural Information Processing Systems 34 (NeurIPS 2021).
Nguyen, Quoc Phong, Dai, Zhongxiang, Low, Bryan Kian Hsiang, and Jaillet, Patrick (2021b). Value-at-Risk Optimization with Gaussian Processes. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 8063–8072.
Nguyen, Vu, Gupta, Sunil, Rana, Santu, Li, Cheng, and Venkatesh, Svetha (2017). Regret for Expected Improvement over the Best-Observed Value and Stopping Condition. Proceedings of the 9th Asian Conference on Machine Learning (ACML 2017). Vol. 77. Proceedings of Machine Learning Research, pp. 279–294.
Nickisch, Hannes and Rasmussen, Carl Edward (2008). Approximations for Binary Gaussian Process Classification. Journal of Machine Learning Research 9(Oct):2035–2078.
O'Hagan, A. (1978). Curve Fitting and Optimal Design for Prediction. Journal of the Royal Statistical Society Series B (Methodological) 40(1):1–42.
O'Hagan, A. (1991). Bayes–Hermite Quadrature. Journal of Statistical Planning and Inference 29(3):245–260.
O'Hagan, Anthony and Forster, Jonathan (2004). Kendall's Advanced Theory of Statistics. 2nd ed. Vol. 2B: Bayesian Inference. Arnold.
Oh, Changyong, Tomczak, Jakub M., Gavves, Efstratios, and Welling, Max (2019). Combinatorial Bayesian Optimization Using the Graph Cartesian Product. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 2914–2924.
Øksendal, Bernt (2013). Stochastic Differential Equations: An Introduction with Applications. 6th ed. Universitext. Springer–Verlag.
Oliveira, Rafael, Ott, Lionel, and Ramos, Fabio (2019). Bayesian Optimisation Under Uncertain Inputs. Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019). Vol. 89. Proceedings of Machine Learning Research, pp. 1177–1184.
Osborne, Michael A., Duvenaud, David, Garnett, Roman, Rasmussen, Carl E., Roberts, Stephen J., and Ghahramani, Zoubin (2012). Active Learning of Model Evidence Using Bayesian Quadrature. Advances in Neural Information Processing Systems 25 (NeurIPS 2012), pp. 46–54.
Osborne, Michael A., Garnett, Roman, and Roberts, Stephen J. (2009). Gaussian Processes for Global Optimization. Proceedings of the 3rd Learning and Intelligent Optimization Conference (LION 3).
Paria, Biswajit, Kandasamy, Kirthevasan, and Póczos, Barnabás (2019). A Flexible Framework for Multi-Objective Bayesian Optimization Using Random Scalarizations. Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence (UAI 2019). Vol. 115. Proceedings of Machine Learning Research, pp. 766–776.
Peirce, C. S. (1876). Note on the Theory of the Economy of Research. In: Report of the Superintendent of the United States Coast Survey Showing the Progress of the Work for the Fiscal Year Ending with June, 1876. Government Printing Office, pp. 197–201.
Picheny, Victor (2014). A Stepwise Uncertainty Reduction Approach to Constrained Global Optimization. Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). Vol. 33. Proceedings of Machine Learning Research, pp. 787–795.
Picheny, Victor (2015). Multiobjective Optimization Using Gaussian Process Emulators via Stepwise Uncertainty Reduction. Statistics and Computing 25(6):1265–1280.
Picheny, Victor, Ginsbourger, David, Richet, Yann, and Caplin, Gregory (2013a). Quantile-Based Optimization of Noisy Computer Experiments with Tunable Precision. Technometrics 55(1):2–13.
Picheny, Victor, Wagner, Tobias, and Ginsbourger, David (2013b). A Benchmark of Kriging-Based Infill Criteria for Noisy Optimization. Structural and Multidisciplinary Optimization 48(3):607–626.
Pleiss, Geoff, Jankowiak, Martin, Eriksson, David, Damle, Anil, and Gardner, Jacob R. (2020). Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp. 22268–22281.
Poincaré, Henri (1912). Calcul des probabilités. 2nd ed. Gauthier–Villars.
Ponweiser, Wolfgang, Wagner, Tobias, Biermann, Dirk, and Vincze, Markus (2008). Multiobjective Optimization on a Limited Budget of Evaluations Using Model-Assisted S-Metric Selection. Proceedings of the 10th International Conference on Parallel Problem Solving from Nature (PPSN X). Vol. 5199. Lecture Notes in Computer Science. Springer–Verlag, pp. 784–794.
Powell, Warren B. (2011). Approximate Dynamic Programming: Solving the Curses of Dimensionality. 2nd ed. Wiley Series in Probability and Statistics. John Wiley & Sons.
Pronzato, Luc (2017). Minimax and Maximin Space-Filling Designs: Some Properties and Methods for Construction. Journal de la Société Française de Statistique 158(1):7–36.
Raiffa, Howard and Schlaifer, Robert (1961). Applied Statistical Decision Theory. Division of Research, Graduate School of Business Administration, Harvard University.
Rasmussen, Carl Edward and Ghahramani, Zoubin (2002). Bayesian Monte Carlo. Advances in Neural Information Processing Systems 15 (NeurIPS 2002), pp. 505–512.
Rasmussen, Carl Edward and Williams, Christopher K. I. (2006). Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press.
Riquelme, Carlos, Tucker, George, and Snoek, Jasper (2018). Deep Bayesian Bandits Showdown. Proceedings of the 6th International Conference on Learning Representations (ICLR 2018). arXiv: 1802.09127 [cs.LG].
Robbins, Herbert (1952). Some Aspects of the Sequential Design of Experiments. Bulletin of the American Mathematical Society 58(5):527–535.
Rolland, Paul, Scarlett, Jonathan, Bogunovic, Ilija, and Cevher, Volkan (2018). High-Dimensional Bayesian Optimization via Additive Models with Overlapping Groups. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). Vol. 84. Proceedings of Machine Learning Research, pp. 298–307.
Ross, Andrew M. (2010). Computing Bounds on the Expected Maximum of Correlated Normal Variables. Methodology and Computing in Applied Probability 12(1):111–138.
Rudin, Walter (1976). Principles of Mathematical Analysis. 3rd ed. International Series in Pure and Applied Mathematics. McGraw–Hill.
Rue, Håvard, Martino, Sara, and Chopin, Nicolas (2009). Approximate Bayesian Inference for Latent Gaussian Models by Using Integrated Nested Laplace Approximations. Journal of the Royal Statistical Society Series B (Methodological) 71(2):319–392.
Russo, Daniel and Van Roy, Benjamin (2014). Learning to Optimize via Posterior Sampling. Mathematics of Operations Research 39(4):1221–1243.
Russo, Daniel and Van Roy, Benjamin (2016). An Information-Theoretic Analysis of Thompson Sampling. Journal of Machine Learning Research 17(68):1–30.
Sacks, Jerome, Welch, William J., Mitchell, Toby J., and Wynn, Henry P. (1989). Design and Analysis of Computer Experiments. Statistical Science 4(4):409–435.
Salgia, Sudeep, Vakili, Sattar, and Zhao, Qing (2020). A Computationally Efficient Approach to Black-Box Optimization Using Gaussian Process Models. arXiv: 2010.13997 [stat.ML].
Šaltenis, Vydūnas R. (1971). One Method of Multiextremum Optimization. Avtomatika i Vychislitel'naya Tekhnika (Automatic Control and Computer Sciences) 5(3):33–38.
Sanchez, Susan M. and Sanchez, Paul J. (2005). Very Large Fractional Factorial and Central Composite Designs. ACM Transactions on Modeling and Computer Simulation 15(4):362–377.
Scarlett, Jonathan (2018). Tight Regret Bounds for Bayesian Optimization in One Dimension. Proceedings of the 35th International Conference on Machine Learning (ICML 2018). Vol. 80. Proceedings of Machine Learning Research, pp. 4500–4508.
Scarlett, Jonathan, Bogunovic, Ilija, and Cevher, Volkan (2017). Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization. Proceedings of the 2017 Conference on Learning Theory (COLT 2017). Vol. 65. Proceedings of Machine Learning Research, pp. 1723–1742.
Scarlett, Jonathan and Cevher, Volkan (2021). An Introductory Guide to Fano's Inequality with Applications in Statistical Estimation. In: Information-Theoretic Methods in Data Science. Ed. by Rodrigues, Miguel R. D. and Eldar, Yonina C. Cambridge University Press, pp. 487–528.
Schonlau, Matthias (1997). Computer Experiments and Global Optimization. Ph.D. thesis. University of Waterloo.
Schonlau, Matthias, Welch, William J., and Jones, Donald R. (1998). Global versus Local Search in Constrained Optimization of Computer Models. In: New Developments and Applications in Experimental Design. Vol. 34. Lecture Notes – Monograph Series. Institute of Mathematical Statistics, pp. 11–25.
Scott, Warren, Frazier, Peter, and Powell, Warren (2011). The Correlated Knowledge Gradient for Simulation Optimization of Continuous Parameters Using Gaussian Process Regression. SIAM Journal on Optimization 21(3):996–1026.
Seeger, Matthias (2008). Expectation Propagation for Exponential Families. Technical report. University of California, Berkeley.
Settles, Burr (2012). Active Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool.
Shah, Amar and Ghahramani, Zoubin (2015). Parallel Predictive Entropy Search for Batch Global Optimization of Expensive Objective Functions. Advances in Neural Information Processing Systems 28 (NeurIPS 2015), pp. 3330–3338.
Shah, Amar, Wilson, Andrew Gordon, and Ghahramani, Zoubin (2014). Student-t Processes as Alternatives to Gaussian Processes. Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). Vol. 33. Proceedings of Machine Learning Research, pp. 877–885.
Shahriari, Bobak, Swersky, Kevin, Wang, Ziyu, Adams, Ryan P., and de Freitas, Nando (2016). Taking the Human out of the Loop: A Review of Bayesian Optimization. Proceedings of the IEEE 104(1):148–175.
Shahriari, Bobak, Wang, Ziyu, Hoffman, Matthew W., Bouchard-Côté, Alexandre, and de Freitas, Nando (2014). An Entropy Search Portfolio for Bayesian Optimization. arXiv: 1406.4625 [stat.ML].
Shamir, Ohad (2013). On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization. Proceedings of the 26th Annual Conference on Learning Theory (COLT 2013). Vol. 30. Proceedings of Machine Learning Research, pp. 3–24.
Shannon, C. E. (1948). A Mathematical Theory of Communication. The Bell System Technical Journal 27(3):379–423.
Shao, T. S., Chen, T. C., and Frank, R. M. (1964). Tables of Zeros and Gaussian Weights of Certain Associated Laguerre Polynomials and the Related Generalized Hermite Polynomials. Mathematics of Computation 18(88):598–616.
Shepp, L. A. (1979). The Joint Density of the Maximum and Its Location for a Wiener Process with Drift. Journal of Applied Probability 16(2):423–427.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Monographs on Statistics and Applied Probability. Chapman & Hall.
Slepian, David (1962). The One-Sided Barrier Problem for Gaussian Noise. The Bell System Technical Journal 41(2):463–501.
Smith, Kirstine (1918). On the Standard Deviations of Adjusted and Interpolated Values of an Observed Polynomial Function and Its Constants and the Guidance They Give towards a Proper Choice of the Distribution of Observations. Biometrika 12(1–2):1–85.
Smola, Alex J. and Schölkopf, Bernhard (2000). Sparse Greedy Matrix Approximation for Machine Learning. Proceedings of the 17th International Conference on Machine Learning (ICML 2000), pp. 911–918.
Snelson, Edward and Ghahramani, Zoubin (2005). Sparse Gaussian Processes Using Pseudo-inputs. Advances in Neural Information Processing Systems 18 (NeurIPS 2005), pp. 1257–1264.
Snoek, Jasper, Larochelle, Hugo, and Adams, Ryan P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. Advances in Neural Information Processing Systems 25 (NeurIPS 2012), pp. 2951–2959.
Snoek, Jasper, Rippel, Oren, Swersky, Kevin, Kiros, Ryan, Satish, Nadathur, Sundaram, Narayanan, et al. (2015). Scalable Bayesian Optimization Using Deep Neural Networks. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015). Vol. 37. Proceedings of Machine Learning Research, pp. 2171–2180.
Snoek, Jasper, Swersky, Kevin, Zemel, Richard, and Adams, Ryan P. (2014). Input Warping for Bayesian Optimization of Non-Stationary Functions. Proceedings of the 31st International Conference on Machine Learning (ICML 2014). Vol. 32. Proceedings of Machine Learning Research, pp. 1674–1682.
Song, Jialin, Chen, Yuxin, and Yue, Yisong (2019). A General Framework for Multi-Fidelity Bayesian Optimization with Gaussian Processes. Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019). Vol. 89. Proceedings of Machine Learning Research, pp. 3158–3167.
Springenberg, Jost Tobias, Klein, Aaron, Falkner, Stefan, and Hutter, Frank (2016). Bayesian Optimization with Robust Bayesian Neural Networks. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 4134–4142.
Srinivas, Niranjan, Krause, Andreas, Kakade, Sham, and Seeger, Matthias (2010). Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design. Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 1015–1022.
Stein, Michael L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer–Verlag.
Streltsov, Simon and Vakili, Pirooz (1999). A Non-Myopic Utility Function for Statistical Global Optimization Algorithms. Journal of Global Optimization 14(3):283–298.
Sutton, Richard S. (1990). Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. Proceedings of the 7th International Conference on Machine Learning (ICML 1990), pp. 216–224.
Svenson, Joshua and Santner, Thomas (2016). Multiobjective Optimization of Expensive-to-evaluate Deterministic Computer Simulator Models. Computational Statistics and Data Analysis 94:250–264.
Swersky, Kevin, Snoek, Jasper, and Adams, Ryan P. (2013). Multi-Task Bayesian Optimization. Advances in Neural Information Processing Systems 26 (NeurIPS 2013), pp. 2004–2012.
Swersky, Kevin, Snoek, Jasper, and Adams, Ryan P. (2014). Freeze–Thaw Bayesian Optimization. arXiv: 1406.3896 [stat.ML].
Takeno, Shion, Fukuoka, Hitoshi, Tsukada, Yuhki, Koyama, Toshiyuki, Shiga, Motoki, Takeuchi, Ichiro, et al. (2020). Multi-Fidelity Bayesian Optimization with Max-value Entropy Search and Its Parallelization. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 9334–9345.
Tallis, G. M. (1961). The Moment Generating Function of the Truncated Multi-Normal Distribution. Journal of the Royal Statistical Society Series B (Methodological) 23(1):223–229.
Tesch, Matthew, Schneider, Jeff, and Choset, Howie (2013). Expensive Function Optimization with Stochastic Binary Outcomes. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 1283–1291.
The UniProt Consortium (2021). UniProt: The Universal Protein Knowledgebase in 2021. Nucleic Acids Research 49(D1):D480–D489.
Thompson, William R. (1933). On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples. Biometrika 25(3–4):285–294.
Thompson, William R. (1935). On the Theory of Apportionment. American Journal of Mathematics 57(2):450–456.
Tiao, Louis C., Klein, Aaron, Seeger, Matthias, Bonilla, Edwin V., Archambeau, Ctdric, and Ramos, Fabio (2021). Bore: Bayesian Optimization by Density-Ratio Estimation. Proceedings of the 38th International Conference on Machine Learning (Icml 2021). Vol. 139. Proceedings of Machine Learning Research, pp. 1028910300.Google Scholar
Titsias, Michalis (2009). Variational Learning of Inducing Variables in Sparse Gaussian Processes. Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (Aistats 2009). Vol. 5. Proceedings of Machine Learning Research, pp. 567574.Google Scholar
Toscano-Palmerin, Saul and Frazier, Peter I. (2018). Bayesian Optimization with Expensive Integrands. arXiv: 1803.08661 [stat.Ml].Google Scholar
Turban, Sebastien (2010). Convolution of a Truncated Normal and a Centered Normal Variable. Technical report. Columbia University.Google Scholar
Turner, Ryan, Eriksson, David, Mccourt, Michael, Kiili, Juha, Laaksonen, Eero, XU, Zhen, et al. (2021). Bayesian Optimization Is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020. Proceedings of the Neuritis 2020 Competition and Demonstration Track. Vol. 133. Proceedings of Machine Learning Research, pp. 326.Google Scholar
Ulrich, Kyle, Carlson, David E., Dzirasa, Kafui, and Carin, Lawrence (2015). Gp Kernels for Cross-Spectrum Analysis. Advances in Neural Information Processing Systems 28 (Neuritis 2015), pp. 19992007.Google Scholar
Vakili, Sattar, Bouziani, Nacime, Jalali, Sepehr, Bernacchia, Alberto, and Shiu, Da-Shan (2021a). Optimal Order Simple Regret for Gaussian Process Bandits. Advances in Neural Information Processing Systems 34 (Neuritis 2021).Google Scholar
Vakili, Sattar, Khezeli, Kia, and Picheny, Victor (2021b). On Information Gain and Regret Bounds in Gaussian Process Bandits. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (aistats 2021). Vol. 130. Proceedings of Machine Learning Research, pp. 8290.Google Scholar
Vakili, Sattar, Scarlett, Jonathan, and Javidi, Tara (2021c). Open Problem: Tight Online Confidence Intervals for Rkhs Elements. Proceedings of the 34th Annual Conference on Learning Theory (Colt 2021). Vol. 134. Proceedings of Machine Learning Research, pp. 46474652.Google Scholar
Valko, Michal, Korda, Nathan, Munos, Rami, Flaounas, Ilias, and Cristianini, Nello (2013). Finite-Time Analysis of Kernelised Contextual Bandits. Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (uai 2013), pp. 654663.Google Scholar
Van De Corput, J. G. (1935). Verteilungsfunktionen: Erste Mitteilung. Proceedings of the Koninklijke Nederlandse Akademie van Wetenschappen 38:813821.Google Scholar
Van der Vaart, Aad W. and Wellner, Jon A. (1996). Weak Convergence and Empirical Processes with Applications to Statistics. Springer Series in Statistics. Springer–Verlag.
Vanchinathan, Hastagiri P., Marfurt, Andreas, Robelin, Charles-Antoine, Kossmann, Donald, and Krause, Andreas (2015). Discovering Valuable Items from Massive Data. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015), pp. 1195–1204.
Vazquez, Emmanuel, Villemonteix, Julien, Sidorkiewicz, Maryan, and Walter, Éric (2008). Global Optimization Based on Noisy Evaluations: An Empirical Study of Two Statistical Approaches. Proceedings of the 6th International Conference on Inverse Problems in Engineering: Theory and Practice (ICIPE 2008). Vol. 135. Journal of Physics: Conference Series, paper number 012100.
Vehtari, Aki and Ojanen, Janne (2012). A Survey of Bayesian Predictive Methods for Model Assessment, Selection and Comparison. Statistics Surveys 6:142–228.
Villemonteix, Julien, Vazquez, Emmanuel, and Walter, Éric (2009). An Informational Approach to the Global Optimization of Expensive-to-Evaluate Functions. Journal of Global Optimization 44(4):509–534.
Vivarelli, Francesco and Williams, Christopher K. I. (1998). Discovering Hidden Features with Gaussian Process Regression. Advances in Neural Information Processing Systems 11 (NeurIPS 1998), pp. 613–619.
Von Neumann, John and Morgenstern, Oskar (1944). Theory of Games and Economic Behavior. Princeton University Press.
Vondrák, Jan (2005). Probabilistic Methods in Combinatorial and Stochastic Optimization. Ph.D. thesis. Massachusetts Institute of Technology.
Wald, Abraham (1945). Sequential Tests of Statistical Hypotheses. The Annals of Mathematical Statistics 16(2):117–186.
Wald, Abraham (1947). Sequential Analysis. Wiley Mathematical Statistics Series. John Wiley & Sons.
Wang, Jialei, Clark, Scott C., Liu, Eric, and Frazier, Peter I. (2020a). Parallel Bayesian Global Optimization of Expensive Functions. Operations Research 68(6):1850–1865.
Wang, Zexin, Tan, Vincent Y. F., and Scarlett, Jonathan (2020b). Tight Regret Bounds for Noisy Optimization of a Brownian Motion. arXiv: 2001.09327 [cs.LG].
Wang, Zi and Jegelka, Stefanie (2017). Max-Value Entropy Search for Efficient Bayesian Optimization. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). Vol. 70. Proceedings of Machine Learning Research, pp. 3627–3635.
Wang, Zi, Zhou, Bolei, and Jegelka, Stefanie (2016a). Optimization as Estimation with Gaussian Processes in Bandit Settings. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 1022–1031.
Wang, Ziyu and de Freitas, Nando (2014). Theoretical Analysis of Bayesian Optimization with Unknown Gaussian Process Hyper-Parameters. arXiv: 1406.7758 [stat.ML].
Wang, Ziyu, Hutter, Frank, Zoghi, Masrour, Matheson, David, and de Freitas, Nando (2016b). Bayesian Optimization in a Billion Dimensions via Random Embeddings. Journal of Artificial Intelligence Research 55:361–387.
Wendland, Holger (2004). Scattered Data Approximation. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press.
Whittle, Peter (1982). Optimization over Time: Dynamic Programming and Stochastic Control. Vol. 1. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons.
Whittle, Peter (1988). Restless Bandits: Activity Allocation in a Changing World. Journal of Applied Probability 25(A):287–298.
Williams, Brian J., Santner, Thomas J., and Notz, William I. (2000). Sequential Design of Computer Experiments to Minimize Integrated Response Functions. Statistica Sinica 10(4):1133–1152.
Williams, Christopher K. I. and Seeger, Matthias (2000). Using the Nyström Method to Speed up Kernel Machines. Advances in Neural Information Processing Systems 13 (NeurIPS 2000), pp. 682–688.
Wilson, Andrew Gordon and Adams, Ryan Prescott (2013). Gaussian Process Kernels for Pattern Discovery and Extrapolation. Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Vol. 28. Proceedings of Machine Learning Research, pp. 1067–1075.
Wilson, Andrew Gordon, Hu, Zhiting, Salakhutdinov, Ruslan, and Xing, Eric P. (2016). Deep Kernel Learning. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Vol. 51. Proceedings of Machine Learning Research, pp. 370–378.
Wilson, James T., Hutter, Frank, and Deisenroth, Marc Peter (2018). Maximizing Acquisition Functions for Bayesian Optimization. Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 9884–9895.
Wu, Jian and Frazier, Peter I. (2016). The Parallel Knowledge Gradient Method for Batch Bayesian Optimization. Advances in Neural Information Processing Systems 29 (NeurIPS 2016), pp. 3126–3134.
Wu, Jian and Frazier, Peter I. (2019). Practical Two-Step Look-Ahead Bayesian Optimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 9813–9823.
Wu, Jian, Poloczek, Matthias, Wilson, Andrew Gordon, and Frazier, Peter I. (2017). Bayesian Optimization with Gradients. Advances in Neural Information Processing Systems 30 (NeurIPS 2017), pp. 5267–5278.
Wu, Jian, Toscano-Palmerin, Saul, Frazier, Peter I., and Wilson, Andrew Gordon (2019). Practical Multi-Fidelity Bayesian Optimization for Hyperparameter Tuning. Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence (UAI 2019). Vol. 115. Proceedings of Machine Learning Research, pp. 788–798.
Yang, Kaifeng, Emmerich, Michael, Deutz, André, and Bäck, Thomas (2019a). Efficient Computation of Expected Hypervolume Improvement Using Box Decomposition Algorithms. Journal of Global Optimization 75(1):3–34.
Yang, Kaifeng, Emmerich, Michael, Deutz, André, and Bäck, Thomas (2019b). Multi-Objective Bayesian Global Optimization Using Expected Hypervolume Improvement Gradient. Swarm and Evolutionary Computation 44:945–956.
Yue, Xubo and Kontar, Raed Al (2020). Why Non-Myopic Bayesian Optimization Is Promising and How Far Should We Look-Ahead? A Study via Rollout. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Vol. 108. Proceedings of Machine Learning Research, pp. 2808–2818.
Zhang, Weitong, Zhou, Dongruo, Li, Lihong, and Gu, Quanquan (2021). Neural Thompson Sampling. Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). arXiv: 2010.00827 [cs.LG].
Zhang, Yehong, Hoang, Trong Nghia, Low, Bryan Kian Hsiang, and Kankanhalli, Mohan (2017). Information-Based Multi-Fidelity Bayesian Optimization. Bayesian Optimization for Science and Engineering Workshop (BayesOpt 2017), Conference on Neural Information Processing Systems (NeurIPS 2017).
Zhou, Dongruo, Li, Lihong, and Gu, Quanquan (2020). Neural Contextual Bandits with UCB-Based Exploration. Proceedings of the 37th International Conference on Machine Learning (ICML 2020). Vol. 119. Proceedings of Machine Learning Research, pp. 11492–11502.
Ziatdinov, Maxim A., Ghosh, Ayana, and Kalinin, Sergei V. (2021). Physics Makes the Difference: Bayesian Optimization and Active Learning via Augmented Gaussian Process. arXiv: 2108.10280 [physics.comp-ph].
Zilberstein, Shlomo (1996). Using Anytime Algorithms in Intelligent Systems. AI Magazine 17(3):73–83.
Žilinskas, Antanas (1975). Single-Step Bayesian Search Method for an Extremum of Functions of a Single Variable. Kibernetika (Cybernetics) 11(1):160–166.
Zitzler, Eckart (1999). Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications. Ph.D. thesis. Eidgenössische Technische Hochschule Zürich.
Zuluaga, Marcela, Krause, Andreas, and Püschel, Markus (2016). ε-PAL: An Active Learning Approach to the Multi-Objective Optimization Problem. Journal of Machine Learning Research 17(104):1–32.

  • references
  • Roman Garnett, Washington University in St Louis
  • Book: Bayesian Optimization
  • Online publication: 25 January 2023