
References

Published online by Cambridge University Press:  07 March 2026

Tor Lattimore
Affiliation:
Google DeepMind, London
Information

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2026
References

Abernethy, J. D., Hazan, E., and Rakhlin, A. 2008. Competing in the dark: An efficient algorithm for bandit linear optimization. In: Proceedings of the 21st Conference on Learning Theory.
Agarwal, A., Dekel, O., and Xiao, L. 2010. Optimal algorithms for online convex optimization with multi-point bandit feedback. In: Proceedings of the 23rd Conference on Learning Theory.
Agarwal, A., Foster, D. P., Hsu, D. J., Kakade, S. M., and Rakhlin, A. 2011. Stochastic convex optimization with bandit feedback. In: Advances in Neural Information Processing Systems.
Agarwal, A., Foster, D. P., Hsu, D., Kakade, S. M., and Rakhlin, A. 2013. Stochastic convex optimization with bandit feedback. SIAM Journal on Optimization, 23(1), 213–240.
Akhavan, A., Pontil, M., and Tsybakov, A. 2020. Exploiting higher order smoothness in derivative-free optimization and continuous bandits. In: Advances in Neural Information Processing Systems.
Akhavan, A., Lounici, K., Pontil, M., and Tsybakov, A. 2024a. Contextual continuum bandits: Static versus dynamic regret. arXiv:2406.05714.
Akhavan, A., Chzhen, E., Pontil, M., and Tsybakov, A. 2024b. Gradient-free optimization of highly smooth functions: Improved analysis and a new algorithm. Journal of Machine Learning Research, 25(370), 1–50.
Ao, R., Hu, H., and Simchi-Levi, D. 2025. Riemannian online convex optimization with self-concordant barrier. Available at SSRN 5250625.
Artstein-Avidan, S., Giannopoulos, A., and Milman, V. D. 2015. Asymptotic geometric analysis, Part I. American Mathematical Society.
Atkinson, D. S., and Vaidya, P. M. 1995. A cutting plane algorithm for convex programming that uses analytic centers. Mathematical Programming, 69(1), 1–43.
Bach, F. 2013. Learning with submodular functions: A convex optimization perspective. Foundations and Trends® in Machine Learning, 6(2–3), 145–373.
Bach, F., and Perchet, V. 2016. Highly-smooth zero-th order online optimization. In: Proceedings of the 29th Conference on Learning Theory.
Bachoc, F., Cesari, T., Colomboni, R., and Paudice, A. 2022. Regret analysis of dyadic search. arXiv:2209.00885.
Bachoc, F., Cesari, T., Colomboni, R., and Paudice, A. 2024. A theoretical framework for zeroth-order budget convex optimization. Transactions on Machine Learning Research.
Bakhtiari, A., Lattimore, T., and Szepesvári, Cs. 2025. Thompson sampling for bandit convex optimisation. In: Proceedings of the 38th Conference on Learning Theory.
Balasubramanian, K., and Ghadimi, S. 2022. Zeroth-order nonconvex stochastic optimization: Handling constraints, high dimensionality, and saddle points. Foundations of Computational Mathematics, 22(2), 1–42.
Barthe, F. 1998. An extremal property of the mean width of the simplex. Mathematische Annalen, 310, 685–693.
Bauschke, H., and Borwein, J. 1997. Legendre functions and the method of random Bregman projections. Journal of Convex Analysis, 4(1), 27–67.
Belloni, A., Liang, T., Narayanan, H., and Rakhlin, A. 2015. Escaping the local minima via simulated annealing: Optimization of approximately convex functions. In: Proceedings of the 28th Conference on Learning Theory.
Berry, D., Chen, R., Zame, A., Heath, D., and Shepp, L. 1997. Bandit problems with infinitely many arms. The Annals of Statistics, 25(5), 2103–2116.
Bertsimas, D., and Tsitsiklis, J. N. 1997. Introduction to linear optimization. Athena Scientific.
Bertsimas, D., and Vempala, S. 2004. Solving convex programs by random walks. Journal of the ACM, 51(4), 540–556.
Besson, L., and Kaufmann, E. 2018. What doubling tricks can and can't do for multi-armed bandits. arXiv:1803.06971.
Bhatnagar, S., Prasad, H. L., and Prashanth, L. A. 2012. Stochastic recursive algorithms for optimization: Simultaneous perturbation methods. Lecture Notes in Control and Information Sciences. Springer.
Bilmes, J. 2022. Submodularity in machine learning and artificial intelligence. arXiv:2202.00132.
Blum, J. R. 1954. Multidimensional stochastic approximation methods. The Annals of Mathematical Statistics, 25(4), 737–744.
Boucheron, S., Lugosi, G., and Massart, P. 2013. Concentration inequalities: A nonasymptotic theory of independence. Oxford University Press.
Boyd, S., and Vandenberghe, L. 2004. Convex optimization. Cambridge University Press.
Bubeck, S. 2015. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning, 8(3–4), 231–357.
Bubeck, S., and Cesa-Bianchi, N. 2012. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning, 5(1), 1–122.
Bubeck, S., and Eldan, R. 2015. The entropic barrier: A simple and optimal universal self-concordant barrier. In: Proceedings of the 28th Conference on Learning Theory.
Bubeck, S., and Eldan, R. 2018. Exploratory distributions for convex functions. Mathematical Statistics and Learning, 1(1), 73–100.
Bubeck, S., Cesa-Bianchi, N., and Kakade, S. 2012. Towards minimax policies for online linear optimization with bandit feedback. In: Proceedings of the 25th Conference on Learning Theory.
Bubeck, S., Cesa-Bianchi, N., and Lugosi, G. 2013. Bandits with heavy tail. IEEE Transactions on Information Theory, 59(11), 7711–7717.
Bubeck, S., Dekel, O., Koren, T., and Peres, Y. 2015. Bandit convex optimization: Regret in one dimension. In: Proceedings of the 28th Conference on Learning Theory.
Bubeck, S., Lee, Y.-T., and Eldan, R. 2017. Kernel-based methods for bandit convex optimization. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing.
Bubeck, S., Eldan, R., and Lehec, J. 2018. Sampling from a log-concave distribution with projected Langevin Monte Carlo. Discrete & Computational Geometry, 59, 757–783.
Carpentier, A. 2025. A simple and improved algorithm for noisy, convex, zeroth-order optimisation. Mathematical Statistics and Learning, 8(3–4), 165–192.
Cesa-Bianchi, N., and Lugosi, G. 2006. Prediction, learning, and games. Cambridge University Press.
Chatterji, N., Pacchiano, A., and Bartlett, P. 2019. Online learning with kernel losses. Pages 971–980 of: Proceedings of the 36th International Conference on Machine Learning.
Chen, L., Krause, A., and Karbasi, A. 2017. Interactive submodular bandit. In: Advances in Neural Information Processing Systems.
Chewi, S. 2023. The entropic barrier is n-self-concordant. In: Geometric Aspects of Functional Analysis: Israel Seminar.
Chewi, S. 2024. Log-concave sampling.
Conn, A., Scheinberg, K., and Vicente, L. 2009. Introduction to derivative-free optimization. Society for Industrial and Applied Mathematics.
Cover, T. M., and Thomas, J. A. 2012. Elements of information theory. John Wiley & Sons.
Dani, V., Hayes, T. P., and Kakade, S. M. 2008. Stochastic linear optimization under bandit feedback. In: Proceedings of the 21st Conference on Learning Theory.
Drori, Y. 2018. On the properties of convex functions over open sets. arXiv:1812.02419.
Duchi, J., Jordan, M., Wainwright, M., and Wibisono, A. 2015. Optimal rates for zero-order convex optimization: The power of two function evaluations. IEEE Transactions on Information Theory, 61(5), 2788–2806.
Duembgen, L. 2010. Bounding standard Gaussian tail probabilities. arXiv:1012.2063.
Evans, L. 2018. Measure theory and fine properties of functions. Routledge.
Even-Dar, E., Mannor, S., and Mansour, Y. 2006. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7, 1079–1105.
Finch, S. 2011. Mean width of a regular simplex. arXiv:1111.4976.
Flaxman, A., Kalai, A., and McMahan, H. B. 2005. Online convex optimization in the bandit setting: Gradient descent without a gradient. In: Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms.
Fokkema, H., van der Hoeven, D., Lattimore, T., and Mayo, J. 2024. Online Newton method for bandit convex optimisation. In: Proceedings of the 37th Conference on Learning Theory.
Foster, D. J., Kakade, S., Qian, J., and Rakhlin, A. 2021. The statistical complexity of interactive decision making. arXiv:2112.13487.
Foster, D. J., Rakhlin, A., Sekhari, A., and Sridharan, K. 2022. On the complexity of adversarial decision making. In: Advances in Neural Information Processing Systems.
Foster, D. P., and Rakhlin, A. 2021. On submodular contextual bandits. arXiv:2112.02165.
Gabillon, V., Kveton, B., Wen, Z., Eriksson, B., and Muthukrishnan, S. 2013. Adaptive submodular maximization in bandit setting. In: Advances in Neural Information Processing Systems.
Galicer, D., Merzbacher, M., and Pinasco, D. 2019. The minimal volume of simplices containing a convex body. Journal of Geometric Analysis, 29, 717–732.
Garber, D., and Kretzu, B. 2022. New projection-free algorithms for online convex optimization with adaptive regret guarantees. In: Proceedings of the 35th Conference on Learning Theory.
Ghadimi, S., and Lan, G. 2013. Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM Journal on Optimization, 23(4), 2341–2368.
Giannopoulos, A., and Milman, E. 2014. M-estimates for isotropic convex bodies and their Lq-centroid bodies. In: Geometric Aspects of Functional Analysis: Israel Seminar.
Grötschel, M., Lovász, L., and Schrijver, A. 2012. Geometric algorithms and combinatorial optimization. Springer Science & Business Media.
Grünbaum, B. 1960. Partitions of mass-distributions and of convex bodies by hyperplanes. Pacific Journal of Mathematics, 10(4), 1257–1261.
Hale, N., Higham, N., and Trefethen, L. 2008. Computing A^α, log(A), and related matrix functions by contour integrals. SIAM Journal on Numerical Analysis, 46(5), 2505–2523.
Hazan, E. 2016. Introduction to online convex optimization. Foundations and Trends® in Optimization, 2(3–4), 157–325.
Hazan, E., and Kale, S. 2012. Online submodular minimization. Journal of Machine Learning Research, 13(10).
Hazan, E., and Levy, K. 2014. Bandit convex optimization: Towards tight bounds. In: Advances in Neural Information Processing Systems.
Hazan, E., and Li, Y. 2016. An optimal algorithm for bandit convex optimization. arXiv:1603.04350.
Hazan, E., Agarwal, A., and Kale, S. 2007. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69, 169–192.
Hazan, E., Karnin, Z., and Meka, R. 2016. Volumetric spanners: An efficient exploration basis for learning. Journal of Machine Learning Research, 17(119), 1–34.
Hu, X., Prashanth, L. A., György, A., and Szepesvári, Cs. 2016. (Bandit) convex optimization with biased noisy gradient oracles. In: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics.
Ito, S. 2020. An optimal algorithm for bandit convex optimization with strongly-convex and smooth loss. In: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics.
Jamieson, K., Nowak, R., and Recht, B. 2012. Query complexity of derivative-free optimization. In: Advances in Neural Information Processing Systems.
Jegelka, S., and Bilmes, J. 2011. Online submodular minimization for combinatorial structures. In: Proceedings of the 28th International Conference on Machine Learning.
Jiang, H. 2022. Minimizing convex functions with rational minimizers. Journal of the ACM, 70(1), 1–27.
Jiang, S., Song, Z., Weinstein, O., and Zhang, H. 2021. A faster algorithm for solving general LPs. In: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing.
Jingxin, Z., Yuchen, X., Kaicheng, J., and Zhihua, Z. 2025. A regularized online Newton method for stochastic convex bandits with linear vanishing noise. arXiv:2501.11127.
Kaiser, M. J. 1993. The Santaló point of a planar convex set. Applied Mathematics Letters, 6(2), 47–53.
Kannan, R., Lovász, L., and Simonovits, M. 1995. Isoperimetric problems for convex bodies and a localization lemma. Discrete & Computational Geometry, 13, 541–559.
Karimi, H., Nutini, J., and Schmidt, M. 2016. Linear convergence of gradient and proximal-gradient methods under the Polyak–Łojasiewicz condition. In: Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases.
Karp, R. 1972. Reducibility among combinatorial problems. In: Proceedings of a Symposium on the Complexity of Computer Computations.
Khachiyan, L. 1990. An inequality for the volume of inscribed ellipsoids. Discrete & Computational Geometry, 5, 219–222.
Khachiyan, L., and Todd, M. 1993. On the complexity of approximating the maximal inscribed ellipsoid for a polytope. Mathematical Programming, 61, 137–159.
Kiefer, J. 1953. Sequential minimax search for a maximum. Proceedings of the American Mathematical Society, 4(3), 502–506.
Kiefer, J., and Wolfowitz, J. 1952. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics, 23(3), 462–466.
Kiefer, J., and Wolfowitz, J. 1960. The equivalence of two extremum problems. Canadian Journal of Mathematics, 12(5), 363–365.
Kirschner, J. 2021. Information-directed sampling – frequentist analysis and applications. Ph.D. thesis, ETH Zurich.
Klartag, B., and Lehec, J. 2024. Affirmative resolution of Bourgain's slicing problem using Guan's bound. arXiv:2412.15044.
Kleinberg, R. 2005. Nearly tight bounds for the continuum-armed bandit problem. In: Advances in Neural Information Processing Systems.
Larson, J., Menickelly, M., and Wild, S. 2019. Derivative-free optimization methods. Acta Numerica, 28, 287–404.
Lattimore, T. 2020. Improved regret for zeroth-order adversarial bandit convex optimisation. Mathematical Statistics and Learning, 2(3–4), 311–334.
Lattimore, T., and György, A. 2021a. Improved regret for zeroth-order stochastic convex bandits. In: Proceedings of the 34th Conference on Learning Theory.
Lattimore, T., and György, A. 2021b. Mirror descent and the information ratio. In: Proceedings of the 34th Conference on Learning Theory.
Lattimore, T., and György, A. 2023. A second-order method for stochastic bandit convex optimisation. In: Proceedings of the 36th Conference on Learning Theory.
Lattimore, T., and Hao, B. 2021. Bandit phase retrieval. In: Advances in Neural Information Processing Systems.
Lattimore, T., and Szepesvári, Cs. 2019. An information-theoretic approach to minimax regret in partial monitoring. In: Proceedings of the 32nd Conference on Learning Theory.
Lattimore, T., and Szepesvári, Cs. 2020. Bandit algorithms. Cambridge University Press.
Lee, Y.-T., and Yue, M.-C. 2021. Universal barrier is n-self-concordant. Mathematics of Operations Research, 46(3), 1129–1148.
Lee, Y.-T., Sidford, A., and Wong, S. 2015. A faster cutting plane method and its implications for combinatorial and convex optimization. In: IEEE 56th Annual Symposium on Foundations of Computer Science.
Lee, Y.-T., Sidford, A., and Vempala, S. 2018. Efficient convex optimization with membership oracles. In: Proceedings of the 31st Conference on Learning Theory.
Levin, A. 1965. An algorithm for minimizing convex functions. Pages 1244–1247 of: Doklady Akademii Nauk, vol. 160. Russian Academy of Sciences.
Liang, T., Narayanan, H., and Rakhlin, A. 2014. On zeroth-order stochastic convex optimization via random walks. arXiv:1402.2667.
Liu, S., Chen, P.-Y., Kailkhura, B., Zhang, G., Hero, A. III, and Varshney, P. 2020. A primer on zeroth-order optimization in signal processing and machine learning: Principles, recent advances, and applications. IEEE Signal Processing Magazine, 37(5), 43–54.
Liu, X., Baudry, D., Zimmert, J., Rebeschini, P., and Akhavan, A. 2025. Non-stationary bandit convex optimization: A comprehensive study. arXiv:2506.02980.
Lovász, L. 1983. Submodular functions and convexity. Pages 235–257 of: Mathematical Programming: The State of the Art: Bonn 1982.
Lovász, L., and Vempala, S. 2006. Simulated annealing in convex bodies and an O*(n^4) volume algorithm. Journal of Computer and System Sciences, 72(2), 392–417.
Luo, H., Zhang, M., and Zhao, P. 2022. Adaptive bandit convex optimization with heterogeneous curvature. In: Proceedings of the 25th Conference on Learning Theory.
Marchal, O., and Arbel, J. 2017. On the sub-Gaussianity of the Beta and Dirichlet distributions. Electronic Communications in Probability, 22(54), 1–14.
McCormick, S. 2005. Submodular function minimization. Handbooks in Operations Research and Management Science, 12, 321–391.
Meyer, M., and Werner, E. 1998. The Santaló-regions of a convex body. Transactions of the American Mathematical Society, 350(11), 4569–4591.
Mhammedi, Z. 2022. Efficient projection-free online convex optimization with membership oracle. In: Proceedings of the 35th Conference on Learning Theory.
Milman, E. 2015. On the mean-width of isotropic convex bodies and their associated Lp-centroid bodies. International Mathematics Research Notices, 2015(11), 3408–3423.
Motzkin, T., and Straus, G. 1965. Maxima for graphs and a new proof of a theorem of Turán. Canadian Journal of Mathematics, 17, 533–540.
Nemirovski, A. 1996. Lecture notes: Interior-point polynomial time methods for convex programming. Georgia Institute of Technology.
Nemirovski, A., Juditsky, A., Lan, G., and Shapiro, A. 2009. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4), 1574–1609.
Nemirovsky, A., and Yudin, D. 1983. Problem complexity and method efficiency in optimization. John Wiley & Sons.
Nesterov, Y. 1988. Polynomial time methods in linear and quadratic programming. Izvestija AN SSSR, Tekhnitcheskaya kibernetika.
Nesterov, Y. 1995. Cutting plane algorithms from analytic centers: Efficiency estimates. Mathematical Programming, 69(1), 149–176.
Nesterov, Y. 2018. Lectures on convex optimization. Springer.
Nesterov, Y., and Nemirovski, A. 1994. Interior-point polynomial algorithms in convex programming. Society for Industrial and Applied Mathematics.
Nesterov, Y., and Nemirovsky, A. 1989. Self-concordant functions and polynomial time methods in convex programming. USSR Academy of Sciences Central Economic & Mathematical Institute, Moscow.
Nesterov, Y., and Spokoiny, V. 2017. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17, 527–566.
Newman, D. J. 1965. Location of the maximum on unimodal surfaces. Journal of the ACM, 12(3), 395–398.
Niazadeh, R., Golrezaei, N., Wang, J., Susan, F., and Badanidiyuru, A. 2021. Online learning via offline greedy algorithms: Applications in market design and optimization. In: Proceedings of the 22nd ACM Conference on Economics and Computation.
Novitskii, V., and Gasnikov, A. 2021. Improved exploiting higher order smoothness in derivative-free optimization and continuous bandit. arXiv:2101.03821.
Orabona, F. 2019. A modern introduction to online learning. arXiv:1912.13213.
Orseau, L., and Hutter, M. 2023. Line search for convex minimization. arXiv:2307.16560.
Pinelis, I. 2022. Improved concentration bounds for sums of independent sub-exponential random variables. Statistics & Probability Letters, 191, 109666.
Pivovarov, P. 2010. On the volume of caps and bounding the mean-width of an isotropic convex body. Mathematical Proceedings of the Cambridge Philosophical Society, 149(2), 317–331.
Polyak, B. 1963. Gradient methods for minimizing functionals. Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 3(4), 643–653.
Polyak, B. T., and Tsybakov, A. B. 1990. Optimal accuracy orders of stochastic approximation algorithms. Problemy Peredachi Informatsii, 26(2), 45–53.
Prashanth, L. A., and Bhatnagar, S. 2025. Gradient-based algorithms for zeroth-order optimization. Foundations and Trends® in Optimization, 8(1–3), 1–332.
Protasov, V. 1996. Algorithms for approximate calculation of the minimum of a convex function from its values. Mathematical Notes, 59(1), 69–74.
Robbins, H., and Monro, S. 1951. A stochastic approximation method. The Annals of Mathematical Statistics, 22(3), 400–407.
Rockafellar, R. T. 1970. Convex analysis. Princeton University Press.
Russo, D., and Van Roy, B. 2014. Learning to optimize via information-directed sampling. In: Advances in Neural Information Processing Systems.
Russo, D., and Van Roy, B. 2016. An information-theoretic analysis of Thompson sampling. Journal of Machine Learning Research, 17(1), 2442–2471.
Saha, A., and Tewari, A. 2011. Improved regret guarantees for online smooth convex optimization with bandit feedback. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics.
Santaló, L. A. 1949. Un invariante afin para los cuerpos convexos del espacio de n dimensiones. Portugaliae Mathematica, 8, 155–161.
Schneider, R. 2013. Convex bodies: The Brunn–Minkowski theory. Cambridge University Press.
Shamir, O. 2013. On the complexity of bandit and derivative-free stochastic convex optimization. In: Proceedings of the 26th Conference on Learning Theory.
Shamir, O. 2015. On the complexity of bandit linear optimization. In: Proceedings of the 28th Conference on Learning Theory.
Shor, N. 1977. Cut-off method with space extension in convex programming problems. Cybernetics, 13(1), 94–96.
Slivkins, A. 2019. Introduction to multi-armed bandits. Foundations and Trends® in Machine Learning, 12(1–2), 1–286.
Spall, J. C. 1992. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Transactions on Automatic Control, 37(3), 332–341.
Spall, J. C. 1994. Developments in stochastic optimization algorithms with gradient approximations based on function measurements. In: Proceedings of the Winter Simulation Conference.
Suggala, A., Ravikumar, P., and Netrapalli, P. 2021. Efficient bandit convex optimization: Beyond linear losses. In: Proceedings of the 34th Conference on Learning Theory.
Suggala, A., Sun, J., Netrapalli, P., and Hazan, E. 2024. Second order methods for bandit optimization and control. In: Proceedings of the 37th Conference on Learning Theory.
Tajdini, A., Jain, L., and Jamieson, K. 2024. Nearly minimax optimal submodular maximization with bandit feedback. In: Advances in Neural Information Processing Systems.
Takemori, S., Sato, M., Sonoda, T., Singh, J., and Ohkuma, T. 2020. Submodular bandit problem under multiple constraints. In: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence.
Tarasov, S., Khachiyan, L. G., and Erlich, I. I. 1988. The method of inscribed ellipsoids. Pages 226–230 of: Soviet Mathematics–Doklady, vol. 37.
Thompson, W. 1933. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3–4), 285–294.
Vaidya, P. M. 1996. A new algorithm for minimizing convex functions over convex sets. Mathematical Programming, 73(3), 291–341.
van der Hoeven, D., van Erven, T., and Kotlowski, W. 2018. The many faces of exponential weights in online learning. In: Proceedings of the 31st Conference on Learning Theory.
Vershynin, R. 2018. High-dimensional probability: An introduction with applications in data science. Cambridge University Press.
Wang, Y. 2023. On adaptivity in nonstationary stochastic optimization with bandit feedback. Operations Research, 73(2), 819–828.
Yudin, D., and Nemirovskii, A. 1976. Informational complexity and efficient methods for the solution of convex extremal problems. Matekon, 13(2), 22–45.
Yudin, D., and Nemirovskii, A. 1977. Evaluation of the informational complexity of mathematical programming problems. Matekon, 13(2), 3–25.
Zhang, H., and Chen, S. 2021. Concentration inequalities for statistical inference. Communications in Mathematical Research, 37(1), 1–85.
Zhang, M., Chen, L., Hassani, H., and Karbasi, A. 2019. Online continuous submodular maximization: From full-information to bandit feedback. In: Advances in Neural Information Processing Systems.
Zhao, P., Wang, G., Zhang, L., and Zhou, Z.-H. 2021. Bandit convex optimization in non-stationary environments. Journal of Machine Learning Research, 22(1), 5562–5606.
Zimmert, J., and Lattimore, T. 2019. Connections between mirror descent, Thompson sampling and the information ratio. In: Advances in Neural Information Processing Systems.
Zimmert, J., and Lattimore, T. 2022. Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits. In: Proceedings of the 35th Conference on Learning Theory.
Zinkevich, M. 2003. Online convex programming and generalized infinitesimal gradient ascent. In: Proceedings of the 20th International Conference on Machine Learning.
Zwillinger, D. 2018. CRC standard mathematical tables and formulae. 33rd edn. Chapman and Hall/CRC.

  • References
  • Tor Lattimore, Google DeepMind, London
  • Book: Bandit Convex Optimisation
  • Online publication: 07 March 2026
  • Chapter DOI: https://doi.org/10.1017/9781009607551.019