Variational inference as an alternative to MCMC for parameter estimation and model selection

Published online by Cambridge University Press:  25 January 2022

Geetakrishnasai Gunapati
Affiliation:
Department of Computer Science and Engineering, IIT Hyderabad, Kandi, Telangana 502285, India
Anirudh Jain
Affiliation:
Department of Computer Science, Aalto University, Espoo 02150, Finland
P. K. Srijith
Affiliation:
Department of Computer Science and Engineering, IIT Hyderabad, Kandi, Telangana 502285, India
Shantanu Desai*
Affiliation:
Department of Physics, IIT Hyderabad, Kandi, Telangana 502285, India
*Author for correspondence: Shantanu Desai, e-mail: shntn05@gmail.com

Abstract

Most applications of Bayesian inference for parameter estimation and model selection in astrophysics involve the use of Monte Carlo techniques such as Markov Chain Monte Carlo (MCMC) and nested sampling. However, these techniques are time-consuming and their convergence to the posterior can be difficult to determine. In this study, we advocate variational inference as an alternative that addresses these problems, and demonstrate its usefulness for parameter estimation and model selection in astrophysics. Variational inference converts the inference problem into an optimisation problem by approximating the posterior with a member of a known family of distributions and using the Kullback–Leibler divergence to characterise the difference between the two. It takes advantage of fast optimisation techniques, which make it well suited to large datasets and trivial to parallelise on a multicore platform. We also derive a new approximate estimate of the evidence, based on the variational posterior and an importance sampling technique, called posterior-weighted importance sampling for the calculation of evidence (PWISE), which is useful for Bayesian model selection. As a proof of principle, we apply variational inference to five different problems in astrophysics where Monte Carlo techniques were previously used. These include assessing the significance of annual modulation in the COSINE-100 dark matter experiment, measuring exoplanet orbital parameters from radial velocity data, testing for periodicities in measurements of Newton’s constant G, assessing the significance of a turnover in the spectral lag data of GRB 160625B, and estimating the mass of a galaxy cluster using weak gravitational lensing. We find that variational inference is much faster than MCMC and nested sampling techniques for most of these problems while providing competitive results. All our analysis codes have been made publicly available.
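To make the idea concrete, the following is a minimal sketch of variational inference on a toy problem, not the authors' code. It assumes Gaussian data with known noise `sigma` and a Gaussian prior of width `tau` on the mean (all values are illustrative), fits a Gaussian approximation by gradient ascent on a reparameterised Monte Carlo estimate of the evidence lower bound, and then uses that approximation as an importance sampling proposal to estimate the log evidence. The evidence step is plain importance sampling and should not be read as the PWISE estimator itself. The toy model is conjugate, so the exact posterior and evidence are available for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative assumption): n draws from N(1.5, sigma^2), sigma known.
sigma, tau, n = 1.0, 10.0, 50
y = rng.normal(1.5, sigma, size=n)

def log_joint(mu):
    """log p(y, mu) = log-likelihood + log-prior, vectorised over mu."""
    mu = np.atleast_1d(mu)
    resid = y[None, :] - mu[:, None]
    log_like = (-0.5 * np.sum(resid**2, axis=1) / sigma**2
                - 0.5 * n * np.log(2 * np.pi * sigma**2))
    log_prior = -0.5 * mu**2 / tau**2 - 0.5 * np.log(2 * np.pi * tau**2)
    return log_like + log_prior

def grad_log_joint(mu):
    """Derivative of log p(y, mu) with respect to mu."""
    return (y.sum() - n * mu) / sigma**2 - mu / tau**2

# Variational family: q(mu) = N(m, s^2) with s = exp(w).
# Reparameterise mu = m + s * eps and ascend the Monte Carlo ELBO
#   mean_k log p(y, mu_k) + entropy(q),  entropy(q) = w + const.
eps = rng.standard_normal(500)   # fixed base draws keep the objective deterministic
m, w, lr = 0.0, 0.0, 0.01
for _ in range(3000):
    s = np.exp(w)
    g = grad_log_joint(m + s * eps)
    m += lr * g.mean()                       # d ELBO / d m
    w += lr * ((g * s * eps).mean() + 1.0)   # d ELBO / d w (the +1 is the entropy term)
s = np.exp(w)

# Exact posterior and evidence for this conjugate model, for comparison.
A = n / sigma**2 + 1 / tau**2                # posterior precision
mu_post, sd_post = y.sum() / sigma**2 / A, A ** -0.5
log_z_exact = log_joint(mu_post)[0] + 0.5 * np.log(2 * np.pi / A)

# Importance sampling estimate of the evidence with q as the proposal:
#   Z ~ mean_k p(y, mu_k) / q(mu_k),  mu_k ~ q.
mu_k = m + s * rng.standard_normal(5000)
log_q = -0.5 * ((mu_k - m) / s) ** 2 - np.log(s) - 0.5 * np.log(2 * np.pi)
log_w = log_joint(mu_k) - log_q
log_z_is = log_w.max() + np.log(np.mean(np.exp(log_w - log_w.max())))

print(f"variational mean/sd: {m:.3f} / {s:.3f}")
print(f"exact mean/sd:       {mu_post:.3f} / {sd_post:.3f}")
print(f"log evidence (importance sampling vs exact): {log_z_is:.3f} vs {log_z_exact:.3f}")
```

Because the Gaussian family contains the true posterior here, the fit is limited only by Monte Carlo noise in the fixed base draws. In realistic problems the family is usually misspecified, and the quality of both the posterior approximation and the importance-sampled evidence depends on how well the approximation covers the posterior tails.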

Information

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of the Astronomical Society of Australia

Table 1. Log evidence values and Bayes factor for the two hypotheses, computed using PWISE and the dynesty package. This result favours $H_2$, the hypothesis that there is no annual modulation in the COSINE-100 data.

Table 2. The assumed prior distributions of the various parameters and their boundaries. They are similar to the choice of priors given by Balan & Lahav (2009). For the parameters marked as Jeffreys prior, the prior used is equal to the reciprocal of the parameter. We note that modified Jeffreys refers to a slight modification of the standard Jeffreys prior, in which additive constants are included, since the lower limits are zero (Gregory 2005).

Table 3. The parameter values from both MCMC (computed using PyMC3) and ADVI for the determination of exoplanet parameters from radial velocity data. Both are comparable to the actual values from Sharma (2017), which were used to generate the synthetic data for this analysis.

Figure 1. Left: Radial velocity as a function of time for a star in a binary system. The orange line is the best fit obtained using ADVI and the green line is obtained from NUTS MCMC. Right: 68%, 90% and 95% credible intervals of parameters obtained using ADVI. The corresponding plots for the same data using MCMC can be found in Figure 8 of Sharma (2017).

Figure 2. Left: ADVI-based marginalised credible intervals of the linear $(n=1)$ LIV fit for the spectral lag energy data. Right: ADVI-based marginalised parameter constraints of the quadratic $(n=2)$ LIV fit for the spectral lag energy data. Both plots were generated using the corner.py module (Foreman-Mackey 2016). The corresponding parameter constraints obtained using MCMC can be found in Figures 3 and 4 of Wei et al. (2017), and they agree with these contours.

Table 4. Log evidence values for the four hypotheses and Bayes factors computed with respect to $H_1$, calculated using both PWISE and the nestle package. The log evidence values for all hypotheses are comparable, except for $H_3$. However, even for $H_3$, the Bayes factors from both methods lead qualitatively to the same conclusion on the Jeffreys scale, with $H_3$ being decisively favoured over $H_1$.

Table 5. Log evidence values computed using PWISE and the nestle package, and Bayes factors for the $n=1$ and $n=2$ LIV hypotheses compared to the null hypothesis.

Figure 3. Left: Credible intervals for parameter estimates using ADVI. Right: Credible intervals for parameter estimates using the emcee MCMC sampler. The credible intervals were plotted using the corner Python module. Note that $M_{200}$ is expressed in units of ${\rm M}_{\odot}$.