Improved Metropolis–Hastings algorithms via landscape modification with applications to simulated annealing and the Curie–Weiss model

Published online by Cambridge University Press:  30 August 2023

Michael C. H. Choi*
Affiliation:
National University of Singapore
*Postal address: Department of Statistics and Data Science, National University of Singapore, Singapore. Email address: mchchoi@nus.edu.sg

Abstract

In this paper, we propose new Metropolis–Hastings and simulated annealing algorithms on a finite state space via modifying the energy landscape. The core idea of landscape modification rests on introducing a parameter c, such that the landscape is modified once the algorithm is above this threshold parameter to encourage exploration, while the original landscape is utilized when the algorithm is below the threshold for exploitation purposes. We illustrate the power and benefits of landscape modification by investigating its effect on the classical Curie–Weiss model with Glauber dynamics and external magnetic field in the subcritical regime. This leads to a landscape-modified mean-field equation, and with appropriate choice of c the free energy landscape can be transformed from a double-well into a single-well landscape, while the location of the global minimum is preserved on the modified landscape. Consequently, running algorithms on the modified landscape can improve the convergence to the ground state in the Curie–Weiss model. In the setting of simulated annealing, we demonstrate that landscape modification can yield improved or even subexponential mean tunnelling time between global minima in the low-temperature regime by appropriate choice of c, and we give a convergence guarantee using an improved logarithmic cooling schedule with reduced critical height. We also discuss connections between landscape modification and other acceleration techniques, such as Catoni’s energy transformation algorithm, preconditioning, importance sampling, and quantum annealing. The technique developed in this paper is not limited to simulated annealing, but is broadly applicable to any difference-based discrete optimization algorithm by a change of landscape.
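The threshold idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the text above: the transformation `modified_energy` (identity at or below the threshold c, logarithmic compression above it, with a hypothetical strength parameter `alpha`), the toy double-well energy on a ring, and the symmetric nearest-neighbour proposal. The paper's actual construction may differ.

```python
import math
import random


def modified_energy(h, c, alpha=1.0):
    """Illustrative landscape modification (assumed form, not the paper's).

    Energies at or below the threshold c are left unchanged; energies above c
    are compressed. The map is continuous and strictly increasing, so the
    ordering of states, and hence the global minimum, is preserved.
    """
    if h <= c:
        return h
    # Equals c plus the integral of 1 / (alpha * (u - c) + 1) du from c to h.
    return c + math.log1p(alpha * (h - c)) / alpha


def metropolis_step(x, energy, neighbours, beta, c, alpha, rng):
    """One Metropolis step on the modified landscape (symmetric proposal assumed)."""
    y = rng.choice(neighbours(x))
    delta = modified_energy(energy(y), c, alpha) - modified_energy(energy(x), c, alpha)
    if delta <= 0 or rng.random() < math.exp(-beta * delta):
        return y
    return x


# Hypothetical toy landscape on a ring of 8 states: global minimum at state 0,
# local minimum at state 4, separated by barriers of height 6 at states 2 and 6.
ENERGY = [0.0, 3.0, 6.0, 3.0, 1.0, 3.0, 6.0, 3.0]


def ring_neighbours(x):
    n = len(ENERGY)
    return [(x - 1) % n, (x + 1) % n]


rng = random.Random(0)
state = 4  # start in the local minimum
for _ in range(1000):
    state = metropolis_step(state, lambda s: ENERGY[s], ring_neighbours,
                            beta=2.0, c=1.5, alpha=1.0, rng=rng)
```

In this toy example the barrier seen from the local minimum shrinks on the modified landscape while state 0 remains the global minimum, which is the qualitative effect the abstract describes; the numerical choices of `c`, `alpha`, and `beta` are arbitrary.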

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

References

Catoni, O. (1996). Metropolis, simulated annealing, and iterated energy transformation algorithms: theory and experiments. J. Complexity 12, 595–623.
Catoni, O. (1998). The energy transformation method for the Metropolis algorithm compared with simulated annealing. Prob. Theory Relat. Fields 110, 69–89.
Wang, Y., Wu, S. and Zou, J. (2016). Quantum annealing with Markov chain Monte Carlo simulations and D-wave quantum computers. Statist. Sci. 31, 362–398.
Choi, M. C. (2020). On the convergence of an improved and adaptive kinetic simulated annealing. Preprint. Available at https://arxiv.org/abs/2009.00195.
Fang, H., Qian, M. and Gong, G. (1997). An improved annealing method and its large-time behavior. Stoch. Process. Appl. 71, 55–74.
Monmarché, P. (2018). Hypocoercivity in metastable settings and kinetic simulated annealing. Prob. Theory Relat. Fields 172, 1215–1248.
Zhang, J. and Choi, M. C. (2021). Improved simulated annealing for sampling from multimodal distributions. Working paper.
Deuschel, J.-D. and Mazza, C. (1994). $L^2$ convergence of time nonhomogeneous Markov processes: I. Spectral estimates. Ann. Appl. Prob. 4, 1012–1056.
Nardi, F. R. and Zocca, A. (2019). Tunneling behavior of Ising and Potts models in the low-temperature regime. Stoch. Process. Appl. 129, 4556–4575.
Zocca, A. (2018). Low-temperature behavior of the multicomponent Widom–Rowlinson model on finite square lattices. J. Statist. Phys. 171, 1–37.
Del Moral, P. and Miclo, L. (1999). On the convergence and applications of generalized simulated annealing. SIAM J. Control Optimization 37, 1222–1250.
Löwe, M. (1996). Simulated annealing with time-dependent energy function via Sobolev inequalities. Stoch. Process. Appl. 63, 221–233.
Frigerio, A. and Grillo, G. (1993). Simulated annealing with time-dependent energy function. Math. Z. 213, 97–116.
Bovier, A. and den Hollander, F. (2015). Metastability: A Potential-Theoretic Approach. Springer, Cham.
Aldous, D. and Fill, J. A. (2002). Reversible Markov Chains and Random Walks on Graphs. Unfinished monograph, recompiled 2014. Available at http://www.stat.berkeley.edu/aldous/RWG/book.html.
Mathieu, P. and Picco, P. (1998). Metastability and convergence to equilibrium for the random field Curie–Weiss model. J. Statist. Phys. 91, 679–732.
Menz, G., Schlichting, A., Tang, W. and Wu, T. (2022). Ergodicity of the infinite swapping algorithm at low temperature. Stoch. Process. Appl. 151, 519–552.
Croes, G. A. (1958). A method for solving traveling-salesman problems. Operat. Res. 6, 791–812.
Holley, R. and Stroock, D. (1988). Simulated annealing via Sobolev inequalities. Commun. Math. Phys. 115, 553–569.
Levin, D. A. and Peres, Y. (2017). Markov Chains and Mixing Times. American Mathematical Society, Providence, RI.