Bayes factor hypothesis testing in meta-analyses: Practical advantages and methodological considerations

Published online by Cambridge University Press: 04 December 2025

Joris Mulder*
Affiliation:
Department of Methodology and Statistics, Tilburg University, Tilburg School of Social and Behavioral Sciences, Netherlands
Robbie C. M. van Aert
Affiliation:
Methodology and Statistics, Tilburg University, Netherlands
*Corresponding author: Joris Mulder; Email: jomulder@gmail.com

Abstract

Bayesian hypothesis testing via Bayes factors offers a principled alternative to classical p-value methods in meta-analysis, particularly suited to its cumulative and sequential nature. Unlike p-values, Bayes factors allow for quantifying support both for and against the existence of an effect, facilitate ongoing evidence monitoring, and maintain coherent long-run behavior as additional studies are incorporated. Recent theoretical developments further show how Bayes factors can flexibly control Type I error rates through connections to e-value theory. Despite these advantages, their use remains limited in the meta-analytic literature. This article provides a critical overview of their theoretical properties, methodological considerations—such as prior sensitivity—and practical advantages for evidence synthesis. Two illustrative applications are provided: one on statistical learning in individuals with language impairments, and another on seroma incidence following post-operative exercise in breast cancer patients. New tools supporting these methods are available in the open-source R package BFpack.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NoDerivatives licence (https://creativecommons.org/licenses/by-nd/4.0), which permits re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology

Table 1 Summary of differences between classical p-value and Bayes factor testing.

Figure 1 Interpreting the evidence on a continuous (log) scale. The qualitative categories can be found in Kass and Raftery.$^{12}$ The colored-bar visualization is taken from Mulder et al.$^{28}$

Figure 2 Forest plot for the meta-analysis of McNeely et al. (2010). LOR is the log odds ratio, and RE and CE refer to the random effects and the common effect, respectively.

Table 2 An overview of available default priors when testing the mean effect using existing R packages: BFpack,$^{36}$ RoBMA,$^{60}$ and metaBMA.$^{61}$

Figure 3 Left panel: The Bayes factor $B_{10}$ as a function of the standard deviation of a zero-mean normal prior for the average effect, for the meta-analysis of McNeely et al.$^{37}$ Right panel: Normal priors with mean 0 and a standard deviation of 1 (dashed line) or 5 (dotted line), and the rescaled likelihood evaluated at $\hat {\tau }=0.243$. The likelihood has its mode at $\hat {\mu }=0.416$.

Table 3 Three possible noninformative priors for the between-study heterogeneity $\tau ^2$ and the number of required studies to obtain a finite Bayes factor.

Figure 4 Evidence for $\mathcal {H}_0$ based on a Savage–Dickey density ratio. In case of posterior 1, there is evidence for $\mathcal {H}_0$, and in case of posterior 2, there would be evidence for $\mathcal {H}_1$.
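
The Savage–Dickey density ratio in this caption can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from the article: a single pooled estimate with a known standard error and a zero-mean normal prior for the effect under $\mathcal {H}_1$, so that $B_{01}$ is the posterior density at zero divided by the prior density at zero. The function names and all numeric inputs are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def savage_dickey_b01(estimate, se, prior_sd):
    """B01 = posterior density at 0 / prior density at 0.

    Assumes estimate ~ N(mu, se^2) and mu ~ N(0, prior_sd^2) under H1
    (an illustrative conjugate model, not the article's full model).
    """
    post_var = 1.0 / (1.0 / se**2 + 1.0 / prior_sd**2)
    post_mean = post_var * estimate / se**2
    posterior_at_zero = normal_pdf(0.0, post_mean, math.sqrt(post_var))
    prior_at_zero = normal_pdf(0.0, 0.0, prior_sd)
    return posterior_at_zero / prior_at_zero


def marginal_ratio_b01(estimate, se, prior_sd):
    """Same Bayes factor computed as a ratio of marginal likelihoods."""
    under_h0 = normal_pdf(estimate, 0.0, se)
    under_h1 = normal_pdf(estimate, 0.0, math.sqrt(se**2 + prior_sd**2))
    return under_h0 / under_h1


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    print(savage_dickey_b01(estimate=0.3, se=0.15, prior_sd=1.0))
    print(marginal_ratio_b01(estimate=0.3, se=0.15, prior_sd=1.0))
```

In this conjugate setting the two functions agree, which is the identity the Savage–Dickey ratio relies on. A posterior that piles up mass at zero gives $B_{01}>1$ (the "posterior 1" case), and one that moves away from zero gives $B_{01}<1$ (the "posterior 2" case).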

Table 4 Linking thresholds for Bayes factors to common significance levels via $\alpha =1/B_{10}=B_{01}$.
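
The relation $\alpha =1/B_{10}$ in this caption can be made concrete with a small sketch. It rests on two assumptions that are not from the article: the significance levels listed are common conventions rather than the rows of Table 4, and the simulation uses a toy model (one normal estimate, a point null, and a proper zero-mean normal prior under $\mathcal {H}_1$). In that toy model the expected value of $B_{10}$ under $\mathcal {H}_0$ is 1, so by Markov's inequality rejecting when $B_{10}\geq 1/\alpha$ keeps the Type I error rate at or below $\alpha$. All function names are hypothetical.

```python
import math
import random


def bf_threshold(alpha):
    """Bayes factor threshold on B10 that corresponds to level alpha."""
    return 1.0 / alpha


def b10_normal(estimate, se, prior_sd):
    """B10 for estimate ~ N(mu, se^2), H0: mu = 0, H1: mu ~ N(0, prior_sd^2)."""
    var0 = se**2
    var1 = se**2 + prior_sd**2
    log_b10 = (0.5 * math.log(var0 / var1)
               + 0.5 * estimate**2 * (1.0 / var0 - 1.0 / var1))
    return math.exp(log_b10)


def type_one_error_rate(alpha, se=1.0, prior_sd=1.0, n_sim=20000, seed=1):
    """Share of data sets simulated under H0 with B10 >= 1 / alpha."""
    rng = random.Random(seed)
    threshold = bf_threshold(alpha)
    rejections = sum(
        b10_normal(rng.gauss(0.0, se), se, prior_sd) >= threshold
        for _ in range(n_sim)
    )
    return rejections / n_sim


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.005, 0.001):  # conventional levels, assumed
        print(alpha, bf_threshold(alpha), type_one_error_rate(alpha))
```

The bound is typically conservative: the simulated rejection rate can sit far below $\alpha$. Models with nuisance parameters such as the between-study heterogeneity need extra care before the same guarantee applies.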

Figure 5 The evidence for $\mathcal {H}_0$ against $\mathcal {H}_1$ (left panel) and the posterior support for an RE model under the two hybrid models (right panel) as a function of the between-study variation.

Figure 6 Forest plot and the results of Bayesian updating for the meta-analysis of Lammertink et al. (2017). Estimates of the mean effect size of the Bayesian methods were obtained using a normal prior with mean zero and standard deviation of 1,000 for the mean effect.

Table 5 Bayes factors (BF$_{10}$) and posterior probabilities ($PHP(\mathcal {H}_1)$) for testing $\mathcal {H}_1$ versus $\mathcal {H}_0$ based on all available studies.

Figure 7 Forest plot and the results of Bayesian updating for the meta-analysis of McNeely et al. (2010). Estimates of the mean effect size of the Bayesian methods were obtained using a normal prior with mean zero and standard deviation of 1,000 for the mean effect.

Table 6 Overview of models and applications for Bayesian evidence synthesis.

Figure B1 Density estimate of the default log odds based on uniform priors for the success probabilities (black line) and Student t approximation with scale 2.35 and 13 degrees of freedom.

Figure C1 5%, 50% (solid lines), and 95% quantiles of the sampling distribution of the Bayes factor using different priors for the nuisance parameter $\tau ^2$ based on 2,000 randomly generated data sets.

Figure F1 Average Bayes factor, $B_{10}$, for two-sided test of the global effect under the random effects model based on 10,000 randomly generated data sets under $\mathcal {H}_0$ (where the global effect is zero).

Table H1 Results of using different priors for the (average) effect for testing $\mathcal {H}_1$ versus $\mathcal {H}_0$ for the meta-analysis by Lammertink et al. (2017). The first two columns show the results for the used default prior $N(0,1)$ and the last two columns show the results using the prior $N(0,0.5)$ as sensitivity analysis. Bayes factors (BF$_{10}$) and posterior probabilities ($PHP(\mathcal {H}_1)$) are presented.

Table I1 Results of using different priors for the between-study heterogeneity for testing $\mathcal {H}_1$ versus $\mathcal {H}_0$ for the meta-analysis by Lammertink et al. (2017). The first four rows show the results of the different priors when the prior of the average effect is $N(0,1)$. The last four rows show the results of the different priors when the prior of the average effect is $N(0,0.5)$. Bayes factors (BF$_{10}$) and posterior probabilities ($PHP(\mathcal {H}_1)$) are presented.

Table I2 Results of using different priors for the (average) effect for testing $\mathcal {H}_1$ versus $\mathcal {H}_0$ for the meta-analysis by McNeely et al. (2010). The first two columns show the results for the used default prior $N(0,1)$ and the last two columns show the results using the prior $N(0,0.5)$ as sensitivity analysis. Bayes factors (BF$_{10}$) and posterior probabilities ($PHP(\mathcal {H}_1)$) are presented.

Table I3 Results of using different priors for the between-study heterogeneity for testing $\mathcal {H}_1$ versus $\mathcal {H}_0$ for the meta-analysis by McNeely et al. (2010). The first four rows show the results of the different priors when the prior of the average effect is $t_{13}(0,2.35)$. The last four rows show the results of the different priors when the prior of the average effect is $t_{41}(0,1.067)$. Bayes factors (BF$_{10}$) and posterior probabilities ($PHP(\mathcal {H}_1)$) are presented.