
References

Published online by Cambridge University Press:  07 March 2026

Tor Lattimore
Affiliation:
Google DeepMind, London
Information

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2026
References

Abernethy, J. D., Hazan, E., and Rakhlin, A. 2008. Competing in the dark: An efficient algorithm for bandit linear optimization. In: Proceedings of the 21st Conference on Learning Theory.
Agarwal, A., Dekel, O., and Xiao, L. 2010. Optimal algorithms for online convex optimization with multi-point bandit feedback. In: Proceedings of the 23rd Conference on Learning Theory.
Agarwal, A., Foster, D. P., Hsu, D. J., Kakade, S. M., and Rakhlin, A. 2011. Stochastic convex optimization with bandit feedback. In: Advances in Neural Information Processing Systems.
Agarwal, A., Foster, D. P., Hsu, D., Kakade, S. M., and Rakhlin, A. 2013. Stochastic convex optimization with bandit feedback. SIAM Journal on Optimization, 23(1), 213–240.
Akhavan, A., Pontil, M., and Tsybakov, A. 2020. Exploiting higher order smoothness in derivative-free optimization and continuous bandits. In: Advances in Neural Information Processing Systems.
Akhavan, A., Lounici, K., Pontil, M., and Tsybakov, A. 2024a. Contextual continuum bandits: Static versus dynamic regret. arXiv:2406.05714.
Akhavan, A., Chzhen, E., Pontil, M., and Tsybakov, A. 2024b. Gradient-free optimization of highly smooth functions: Improved analysis and a new algorithm. Journal of Machine Learning Research, 25(370), 1–50.
Ao, R., Hu, H., and Simchi-Levi, D. 2025. Riemannian online convex optimization with self-concordant barrier. Available at SSRN 5250625.
Artstein-Avidan, S., Giannopoulos, A., and Milman, V. D. 2015. Asymptotic geometric analysis, Part I. American Mathematical Society.
Atkinson, D. S., and Vaidya, P. M. 1995. A cutting plane algorithm for convex programming that uses analytic centers. Mathematical Programming, 69(1), 1–43.
Bach, F. 2013. Learning with submodular functions: A convex optimization perspective. Foundations and Trends® in Machine Learning, 6(2–3), 145–373.
Bach, F., and Perchet, V. 2016. Highly-smooth zero-th order online optimization. In: Proceedings of the 29th Conference on Learning Theory.
Bachoc, F., Cesari, T., Colomboni, R., and Paudice, A. 2022. Regret analysis of dyadic search. arXiv:2209.00885.
Bachoc, F., Cesari, T., Colomboni, R., and Paudice, A. 2024. A theoretical framework for zeroth-order budget convex optimization. Transactions on Machine Learning Research.
Bakhtiari, A., Lattimore, T., and Szepesvári, Cs. 2025. Thompson sampling for bandit convex optimisation. In: Proceedings of the 38th Conference on Learning Theory.
Balasubramanian, K., and Ghadimi, S. 2022. Zeroth-order nonconvex stochastic optimization: Handling constraints, high dimensionality, and saddle points. Foundations of Computational Mathematics, 22(2), 1–42.
Barthe, F. 1998. An extremal property of the mean width of the simplex. Mathematische Annalen, 310, 685–693.
Bauschke, H., and Borwein, J. 1997. Legendre functions and the method of random Bregman projections. Journal of Convex Analysis, 4(1), 27–67.
Belloni, A., Liang, T., Narayanan, H., and Rakhlin, A. 2015. Escaping the local minima via simulated annealing: Optimization of approximately convex functions. In: Proceedings of the 28th Conference on Learning Theory.
Berry, D., Chen, R., Zame, A., Heath, D., and Shepp, L. 1997. Bandit problems with infinitely many arms. The Annals of Statistics, 25(5), 2103–2116.
Bertsimas, D., and Tsitsiklis, J. N. 1997. Introduction to linear optimization. Athena Scientific.
Bertsimas, D., and Vempala, S. 2004. Solving convex programs by random walks. Journal of the ACM, 51(4), 540–556.
Besson, L., and Kaufmann, E. 2018. What doubling tricks can and can't do for multi-armed bandits. arXiv:1803.06971.
Bhatnagar, S., Prasad, H. L., and Prashanth, L. A. 2012. Stochastic recursive algorithms for optimization: Simultaneous perturbation methods. Lecture Notes in Control and Information Sciences. Springer.
Bilmes, J. 2022. Submodularity in machine learning and artificial intelligence. arXiv:2202.00132.
Blum, J. R. 1954. Multidimensional stochastic approximation methods. The Annals of Mathematical Statistics, 25(4), 737–744.
Boucheron, S., Lugosi, G., and Massart, P. 2013. Concentration inequalities: A nonasymptotic theory of independence. Oxford University Press.
Boyd, S., and Vandenberghe, L. 2004. Convex optimization. Cambridge University Press.
Bubeck, S. 2015. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning, 8(3–4), 231–357.
Bubeck, S., and Cesa-Bianchi, N. 2012. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning, 5(1), 1–122.
Bubeck, S., and Eldan, R. 2015. The entropic barrier: A simple and optimal universal self-concordant barrier. In: Proceedings of the 28th Conference on Learning Theory.
Bubeck, S., and Eldan, R. 2018. Exploratory distributions for convex functions. Mathematical Statistics and Learning, 1(1), 73–100.
Bubeck, S., Cesa-Bianchi, N., and Kakade, S. 2012. Towards minimax policies for online linear optimization with bandit feedback. In: Proceedings of the 25th Conference on Learning Theory.
Bubeck, S., Cesa-Bianchi, N., and Lugosi, G. 2013. Bandits with heavy tail. IEEE Transactions on Information Theory, 59(11), 7711–7717.
Bubeck, S., Dekel, O., Koren, T., and Peres, Y. 2015. Bandit convex optimization: Regret in one dimension. In: Proceedings of the 28th Conference on Learning Theory.
Bubeck, S., Lee, Y.-T., and Eldan, R. 2017. Kernel-based methods for bandit convex optimization. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing.
Bubeck, S., Eldan, R., and Lehec, J. 2018. Sampling from a log-concave distribution with projected Langevin Monte Carlo. Discrete & Computational Geometry, 59, 757–783.
Carpentier, A. 2025. A simple and improved algorithm for noisy, convex, zeroth-order optimisation. Mathematical Statistics and Learning, 8(3–4), 165–192.
Cesa-Bianchi, N., and Lugosi, G. 2006. Prediction, learning, and games. Cambridge University Press.
Chatterji, N., Pacchiano, A., and Bartlett, P. 2019. Online learning with kernel losses. Pages 971–980 of: Proceedings of the 36th International Conference on Machine Learning.
Chen, L., Krause, A., and Karbasi, A. 2017. Interactive submodular bandit. In: Advances in Neural Information Processing Systems.
Chewi, S. 2023. The entropic barrier is n-self-concordant. In: Geometric Aspects of Functional Analysis: Israel Seminar.
Chewi, S. 2024. Log-concave sampling.
Conn, A., Scheinberg, K., and Vicente, L. 2009. Introduction to derivative-free optimization. Society for Industrial and Applied Mathematics.
Cover, T. M., and Thomas, J. A. 2012. Elements of information theory. John Wiley & Sons.
Dani, V., Hayes, T. P., and Kakade, S. M. 2008. Stochastic linear optimization under bandit feedback. In: Proceedings of the 21st Conference on Learning Theory.
Drori, Y. 2018. On the properties of convex functions over open sets. arXiv:1812.02419.
Duchi, J., Jordan, M., Wainwright, M., and Wibisono, A. 2015. Optimal rates for zero-order convex optimization: The power of two function evaluations. IEEE Transactions on Information Theory, 61(5), 2788–2806.
Duembgen, L. 2010. Bounding standard Gaussian tail probabilities. arXiv:1012.2063.
Evans, L. 2018. Measure theory and fine properties of functions. Routledge.
Even-Dar, E., Mannor, S., and Mansour, Y. 2006. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7, 1079–1105.
Finch, S. 2011. Mean width of a regular simplex. arXiv:1111.4976.
Flaxman, A., Kalai, A., and McMahan, H. B. 2005. Online convex optimization in the bandit setting: Gradient descent without a gradient. In: Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms.
Fokkema, H., van der Hoeven, D., Lattimore, T., and Mayo, J. 2024. Online Newton method for bandit convex optimisation. In: Proceedings of the 37th Conference on Learning Theory.
Foster, D. J., Kakade, S., Qian, J., and Rakhlin, A. 2021. The statistical complexity of interactive decision making. arXiv:2112.13487.
Foster, D. J., Rakhlin, A., Sekhari, A., and Sridharan, K. 2022. On the complexity of adversarial decision making. In: Advances in Neural Information Processing Systems.
Foster, D. P., and Rakhlin, A. 2021. On submodular contextual bandits. arXiv:2112.02165.
Gabillon, V., Kveton, B., Wen, Z., Eriksson, B., and Muthukrishnan, S. 2013. Adaptive submodular maximization in bandit setting. In: Advances in Neural Information Processing Systems.
Galicer, D., Merzbacher, M., and Pinasco, D. 2019. The minimal volume of simplices containing a convex body. Journal of Geometric Analysis, 29, 717–732.
Garber, D., and Kretzu, B. 2022. New projection-free algorithms for online convex optimization with adaptive regret guarantees. In: Proceedings of the 35th Conference on Learning Theory.
Ghadimi, S., and Lan, G. 2013. Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM Journal on Optimization, 23(4), 2341–2368.
Giannopoulos, A., and Milman, E. 2014. M-estimates for isotropic convex bodies and their Lq-centroid bodies. In: Geometric Aspects of Functional Analysis: Israel Seminar.
Grötschel, M., Lovász, L., and Schrijver, A. 2012. Geometric algorithms and combinatorial optimization. Springer Science & Business Media.
Grünbaum, B. 1960. Partitions of mass-distributions and of convex bodies by hyperplanes. Pacific Journal of Mathematics, 10(4), 1257–1261.
Hale, N., Higham, N., and Trefethen, L. 2008. Computing A^α, log(A), and related matrix functions by contour integrals. SIAM Journal on Numerical Analysis, 46(5), 2505–2523.
Hazan, E. 2016. Introduction to online convex optimization. Foundations and Trends® in Optimization, 2(3–4), 157–325.
Hazan, E., and Kale, S. 2012. Online submodular minimization. Journal of Machine Learning Research, 13(10).
Hazan, E., and Levy, K. 2014. Bandit convex optimization: Towards tight bounds. In: Advances in Neural Information Processing Systems.
Hazan, E., and Li, Y. 2016. An optimal algorithm for bandit convex optimization. arXiv:1603.04350.
Hazan, E., Agarwal, A., and Kale, S. 2007. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69, 169–192.
Hazan, E., Karnin, Z., and Meka, R. 2016. Volumetric spanners: An efficient exploration basis for learning. Journal of Machine Learning Research, 17(119), 1–34.
Hu, X., Prashanth, L. A., György, A., and Szepesvári, Cs. 2016. (Bandit) convex optimization with biased noisy gradient oracles. In: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics.
Ito, S. 2020. An optimal algorithm for bandit convex optimization with strongly-convex and smooth loss. In: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics.
Jamieson, K., Nowak, R., and Recht, B. 2012. Query complexity of derivative-free optimization. In: Advances in Neural Information Processing Systems.
Jegelka, S., and Bilmes, J. 2011. Online submodular minimization for combinatorial structures. In: Proceedings of the 28th International Conference on Machine Learning.
Jiang, H. 2022. Minimizing convex functions with rational minimizers. Journal of the ACM, 70(1), 1–27.
Jiang, S., Song, Z., Weinstein, O., and Zhang, H. 2021. A faster algorithm for solving general LPs. In: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing.
Jingxin, Z., Yuchen, X., Kaicheng, J., and Zhihua, Z. 2025. A regularized online Newton method for stochastic convex bandits with linear vanishing noise. arXiv:2501.11127.
Kaiser, M. J. 1993. The Santaló point of a planar convex set. Applied Mathematics Letters, 6(2), 47–53.
Kannan, R., Lovász, L., and Simonovits, M. 1995. Isoperimetric problems for convex bodies and a localization lemma. Discrete & Computational Geometry, 13, 541–559.
Karimi, H., Nutini, J., and Schmidt, M. 2016. Linear convergence of gradient and proximal-gradient methods under the Polyak–Łojasiewicz condition. In: Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases.
Karp, R. 1972. Reducibility among combinatorial problems. In: Proceedings of a Symposium on the Complexity of Computer Computations.
Khachiyan, L. 1990. An inequality for the volume of inscribed ellipsoids. Discrete & Computational Geometry, 5, 219–222.
Khachiyan, L., and Todd, M. 1993. On the complexity of approximating the maximal inscribed ellipsoid for a polytope. Mathematical Programming, 61, 137–159.
Kiefer, J. 1953. Sequential minimax search for a maximum. Proceedings of the American Mathematical Society, 4(3), 502–506.
Kiefer, J., and Wolfowitz, J. 1952. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics, 23(3), 462–466.
Kiefer, J., and Wolfowitz, J. 1960. The equivalence of two extremum problems. Canadian Journal of Mathematics, 12(5), 363–365.
Kirschner, J. 2021. Information-directed sampling – frequentist analysis and applications. Ph.D. thesis, ETH Zurich.
Klartag, B., and Lehec, J. 2024. Affirmative resolution of Bourgain's slicing problem using Guan's bound. arXiv:2412.15044.
Kleinberg, R. 2005. Nearly tight bounds for the continuum-armed bandit problem. In: Advances in Neural Information Processing Systems.
Larson, J., Menickelly, M., and Wild, S. 2019. Derivative-free optimization methods. Acta Numerica, 28, 287–404.
Lattimore, T. 2020. Improved regret for zeroth-order adversarial bandit convex optimisation. Mathematical Statistics and Learning, 2(3–4), 311–334.
Lattimore, T., and György, A. 2021a. Improved regret for zeroth-order stochastic convex bandits. In: Proceedings of the 34th Conference on Learning Theory.
Lattimore, T., and György, A. 2021b. Mirror descent and the information ratio. In: Proceedings of the 34th Conference on Learning Theory.
Lattimore, T., and György, A. 2023. A second-order method for stochastic bandit convex optimisation. In: Proceedings of the 36th Conference on Learning Theory.
Lattimore, T., and Hao, B. 2021. Bandit phase retrieval. In: Advances in Neural Information Processing Systems.
Lattimore, T., and Szepesvári, Cs. 2019. An information-theoretic approach to minimax regret in partial monitoring. In: Proceedings of the 32nd Conference on Learning Theory.
Lattimore, T., and Szepesvári, Cs. 2020. Bandit algorithms. Cambridge University Press.
Lee, Y.-T., and Yue, M.-C. 2021. Universal barrier is n-self-concordant. Mathematics of Operations Research, 46(3), 1129–1148.
Lee, Y.-T., Sidford, A., and Wong, S. 2015. A faster cutting plane method and its implications for combinatorial and convex optimization. In: IEEE 56th Annual Symposium on Foundations of Computer Science.
Lee, Y.-T., Sidford, A., and Vempala, S. 2018. Efficient convex optimization with membership oracles. In: Proceedings of the 31st Conference on Learning Theory.
Levin, A. 1965. An algorithm for minimizing convex functions. Pages 1244–1247 of: Doklady Akademii Nauk, vol. 160. Russian Academy of Sciences.
Liang, T., Narayanan, H., and Rakhlin, A. 2014. On zeroth-order stochastic convex optimization via random walks. arXiv:1402.2667.
Liu, S., Chen, P.-Y., Kailkhura, B., Zhang, G., Hero, A. III, and Varshney, P. 2020. A primer on zeroth-order optimization in signal processing and machine learning: Principles, recent advances, and applications. IEEE Signal Processing Magazine, 37(5), 43–54.
Liu, X., Baudry, D., Zimmert, J., Rebeschini, P., and Akhavan, A. 2025. Non-stationary bandit convex optimization: A comprehensive study. arXiv:2506.02980.
Lovász, L. 1983. Submodular functions and convexity. Pages 235–257 of: Mathematical Programming: The State of the Art: Bonn 1982.
Lovász, L., and Vempala, S. 2006. Simulated annealing in convex bodies and an O*(n^4) volume algorithm. Journal of Computer and System Sciences, 72(2), 392–417.
Luo, H., Zhang, M., and Zhao, P. 2022. Adaptive bandit convex optimization with heterogeneous curvature. In: Proceedings of the 25th Conference on Learning Theory.
Marchal, O., and Arbel, J. 2017. On the sub-Gaussianity of the Beta and Dirichlet distributions. Electronic Communications in Probability, 22(54), 1–14.
McCormick, S. 2005. Submodular function minimization. Handbooks in Operations Research and Management Science, 12, 321–391.
Meyer, M., and Werner, E. 1998. The Santaló-regions of a convex body. Transactions of the American Mathematical Society, 350(11), 4569–4591.
Mhammedi, Z. 2022. Efficient projection-free online convex optimization with membership oracle. In: Proceedings of the 35th Conference on Learning Theory.
Milman, E. 2015. On the mean-width of isotropic convex bodies and their associated Lp-centroid bodies. International Mathematics Research Notices, 2015(11), 3408–3423.
Motzkin, T., and Straus, G. 1965. Maxima for graphs and a new proof of a theorem of Turán. Canadian Journal of Mathematics, 17, 533–540.
Nemirovski, A. 1996. Lecture notes: Interior-point polynomial time methods for convex programming. Georgia Institute of Technology.
Nemirovski, A., Juditsky, A., Lan, G., and Shapiro, A. 2009. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4), 1574–1609.
Nemirovsky, A., and Yudin, D. 1983. Problem complexity and method efficiency in optimization. John Wiley & Sons.
Nesterov, Y. 1988. Polynomial time methods in linear and quadratic programming. Izvestija AN SSSR, Tekhnitcheskaya kibernetika.
Nesterov, Y. 1995. Cutting plane algorithms from analytic centers: Efficiency estimates. Mathematical Programming, 69(1), 149–176.
Nesterov, Y. 2018. Lectures on convex optimization. Springer.
Nesterov, Y., and Nemirovski, A. 1994. Interior-point polynomial algorithms in convex programming. Society for Industrial and Applied Mathematics.
Nesterov, Y., and Nemirovsky, A. 1989. Self-concordant functions and polynomial time methods in convex programming. USSR Academy of Sciences Central Economic & Mathematical Institute, Moscow.
Nesterov, Y., and Spokoiny, V. 2017. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17, 527–566.
Newman, D. J. 1965. Location of the maximum on unimodal surfaces. Journal of the ACM, 12(3), 395–398.
Niazadeh, R., Golrezaei, N., Wang, J., Susan, F., and Badanidiyuru, A. 2021. Online learning via offline greedy algorithms: Applications in market design and optimization. In: Proceedings of the 22nd ACM Conference on Economics and Computation.
Novitskii, V., and Gasnikov, A. 2021. Improved exploiting higher order smoothness in derivative-free optimization and continuous bandit. arXiv:2101.03821.
Orabona, F. 2019. A modern introduction to online learning. arXiv:1912.13213.
Orseau, L., and Hutter, M. 2023. Line search for convex minimization. arXiv:2307.16560.
Pinelis, I. 2022. Improved concentration bounds for sums of independent sub-exponential random variables. Statistics & Probability Letters, 191, 109666.
Pivovarov, P. 2010. On the volume of caps and bounding the mean-width of an isotropic convex body. Mathematical Proceedings of the Cambridge Philosophical Society, 149(2), 317–331.
Polyak, B. 1963. Gradient methods for minimizing functionals. Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 3(4), 643–653.
Polyak, B. T., and Tsybakov, A. B. 1990. Optimal accuracy orders of stochastic approximation algorithms. Problemy Peredachi Informatsii, 26(2), 45–53.
Prashanth, L. A., and Bhatnagar, S. 2025. Gradient-based algorithms for zeroth-order optimization. Foundations and Trends® in Optimization, 8(1–3), 1–332.
Protasov, V. 1996. Algorithms for approximate calculation of the minimum of a convex function from its values. Mathematical Notes, 59(1), 69–74.
Robbins, H., and Monro, S. 1951. A stochastic approximation method. The Annals of Mathematical Statistics, 22(3), 400–407.
Rockafellar, R. T. 1970. Convex analysis. Princeton University Press.
Russo, D., and Van Roy, B. 2014. Learning to optimize via information-directed sampling. In: Advances in Neural Information Processing Systems.
Russo, D., and Van Roy, B. 2016. An information-theoretic analysis of Thompson sampling. Journal of Machine Learning Research, 17(1), 2442–2471.
Saha, A., and Tewari, A. 2011. Improved regret guarantees for online smooth convex optimization with bandit feedback. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics.
Santaló, L. A. 1949. Un invariante afin para los cuerpos convexos del espacio de n dimensiones. Portugaliae Mathematica, 8, 155–161.
Schneider, R. 2013. Convex bodies: The Brunn–Minkowski theory. Cambridge University Press.
Shamir, O. 2013. On the complexity of bandit and derivative-free stochastic convex optimization. In: Proceedings of the 26th Conference on Learning Theory.
Shamir, O. 2015. On the complexity of bandit linear optimization. In: Proceedings of the 28th Conference on Learning Theory.
Shor, N. 1977. Cut-off method with space extension in convex programming problems. Cybernetics, 13(1), 94–96.
Slivkins, A. 2019. Introduction to multi-armed bandits. Foundations and Trends® in Machine Learning, 12(1–2), 1–286.
Spall, J. C. 1992. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Transactions on Automatic Control, 37(3), 332–341.
Spall, J. C. 1994. Developments in stochastic optimization algorithms with gradient approximations based on function measurements. In: Proceedings of the Winter Simulation Conference.
Suggala, A., Ravikumar, P., and Netrapalli, P. 2021. Efficient bandit convex optimization: Beyond linear losses. In: Proceedings of the 34th Conference on Learning Theory.
Suggala, A., Sun, J., Netrapalli, P., and Hazan, E. 2024. Second order methods for bandit optimization and control. In: Proceedings of the 37th Conference on Learning Theory.
Tajdini, A., Jain, L., and Jamieson, K. 2024. Nearly minimax optimal submodular maximization with bandit feedback. In: Advances in Neural Information Processing Systems.
Takemori, S., Sato, M., Sonoda, T., Singh, J., and Ohkuma, T. 2020. Submodular bandit problem under multiple constraints. In: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence.
Tarasov, S., Khachiyan, L. G., and Erlich, I. I. 1988. The method of inscribed ellipsoids. Pages 226–230 of: Soviet Mathematics–Doklady, vol. 37.
Thompson, W. 1933. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3–4), 285–294.
Vaidya, P. M. 1996. A new algorithm for minimizing convex functions over convex sets. Mathematical Programming, 73(3), 291–341.
van der Hoeven, D., van Erven, T., and Kotlowski, W. 2018. The many faces of exponential weights in online learning. In: Proceedings of the 31st Conference on Learning Theory.
Vershynin, R. 2018. High-dimensional probability: An introduction with applications in data science. Cambridge University Press.
Wang, Y. 2023. On adaptivity in nonstationary stochastic optimization with bandit feedback. Operations Research, 73(2), 819–828.
Yudin, D., and Nemirovskii, A. 1976. Informational complexity and efficient methods for the solution of convex extremal problems. Matekon, 13(2), 22–45.
Yudin, D., and Nemirovskii, A. 1977. Evaluation of the informational complexity of mathematical programming problems. Matekon, 13(2), 3–25.
Zhang, H., and Chen, S. 2021. Concentration inequalities for statistical inference. Communications in Mathematical Research, 37(1), 1–85.
Zhang, M., Chen, L., Hassani, H., and Karbasi, A. 2019. Online continuous submodular maximization: From full-information to bandit feedback. In: Advances in Neural Information Processing Systems.
Zhao, P., Wang, G., Zhang, L., and Zhou, Z.-H. 2021. Bandit convex optimization in non-stationary environments. Journal of Machine Learning Research, 22(1), 5562–5606.
Zimmert, J., and Lattimore, T. 2019. Connections between mirror descent, Thompson sampling and the information ratio. In: Advances in Neural Information Processing Systems.
Zimmert, J., and Lattimore, T. 2022. Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits. In: Proceedings of the 35th Conference on Learning Theory.
Zinkevich, M. 2003. Online convex programming and generalized infinitesimal gradient ascent. In: Proceedings of the 20th International Conference on Machine Learning.
Zwillinger, D. 2018. CRC standard mathematical tables and formulae. 33rd edn. Chapman and Hall/CRC.

  • References
  • Tor Lattimore, Google DeepMind, London
  • Book: Bandit Convex Optimisation
  • Online publication: 07 March 2026
  • Chapter DOI: https://doi.org/10.1017/9781009607551.019