Meta-analysis with Jeffreys priors: Empirical frequentist properties

Published online by Cambridge University Press: 12 March 2025

Maya B. Mathur*
Affiliation:
Quantitative Sciences Unit and Department of Pediatrics, Stanford University, Palo Alto, CA, USA

Abstract

In small meta-analyses (e.g., up to 20 studies), the best-performing frequentist methods can yield very wide confidence intervals for the meta-analytic mean, as well as biased and imprecise estimates of the heterogeneity. We investigate the frequentist performance of alternative Bayesian methods that use the invariant Jeffreys prior. This prior has the usual Bayesian motivation, but also has a purely frequentist motivation: the resulting posterior modes correspond to the established Firth bias correction of the maximum likelihood estimator. We consider two forms of the Jeffreys prior for random-effects meta-analysis: the previously established “Jeffreys1” prior treats the heterogeneity as a nuisance parameter, whereas the “Jeffreys2” prior treats both the mean and the heterogeneity as estimands of interest. In a large simulation study, we assess the performance of both Jeffreys priors, considering different types of Bayesian estimates and intervals. We assess point and interval estimation for both the mean and the heterogeneity parameters, comparing to the best-performing frequentist methods. For small meta-analyses of binary outcomes, the Jeffreys2 prior may offer advantages over standard frequentist methods for point and interval estimation of the mean parameter. In these cases, Jeffreys2 can substantially improve efficiency while more often showing nominal frequentist coverage. However, for small meta-analyses of continuous outcomes, standard frequentist methods seem to remain the best choices. The best-performing method for estimating the heterogeneity varied according to the heterogeneity itself. Röver & Friede’s R package bayesmeta implements both Jeffreys priors. We also generalize the Jeffreys2 prior to the case of meta-regression.
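
As a sketch of what the two priors can look like, assume the standard normal-normal random-effects model, $y_i \sim N(\mu, \sigma_i^2 + \tau^2)$, with known within-study standard errors $\sigma_i$. The model and notation are assumptions made here for illustration and are not reproduced from the article's own equations. Under that model the expected Fisher information in $(\mu, \tau)$ is diagonal:

$$ I_{\mu\mu} = \sum_{i=1}^{k} \frac{1}{\sigma_i^2 + \tau^2}, \qquad I_{\tau\tau} = \sum_{i=1}^{k} \frac{2\tau^2}{(\sigma_i^2 + \tau^2)^2}, \qquad I_{\mu\tau} = 0. $$

A prior built from $I_{\tau\tau}$ alone (flat in $\mu$, heterogeneity handled as a nuisance parameter) and a prior built from the full determinant would then be

$$ p_{J1}(\mu, \tau) \propto \sqrt{I_{\tau\tau}} \propto \tau \sqrt{\sum_{i=1}^{k} (\sigma_i^2 + \tau^2)^{-2}}, \qquad p_{J2}(\mu, \tau) \propto \sqrt{I_{\mu\mu} \, I_{\tau\tau}} \propto \tau \sqrt{\sum_{i=1}^{k} (\sigma_i^2 + \tau^2)^{-1} \sum_{i=1}^{k} (\sigma_i^2 + \tau^2)^{-2}}. $$

In this sketch both priors depend on $\tau$ only, vanish at $\tau = 0$, and differ by the factor $\sqrt{I_{\mu\mu}}$, which decreases in $\tau$.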

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology

Figure 1 Priors for four simulated meta-analyses of standardized mean differences ($k=10$), in which the within-study sample sizes (N) were generated from four possible distributions. Studies’ standard errors were estimated using Eq. (5) and, given the data-generation parameters, were approximately equal to $2/\sqrt {N}$. Points are the maxima. The priors have been scaled to have the same maximum height.
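
One reading consistent with the stated approximation is sketched below. It assumes that N is the total sample size split equally between two arms ($n_1 = n_2 = N/2$) and that the usual large-sample variance of a standardized mean difference $d$ applies. Neither assumption is confirmed by the caption, and Eq. (5) is not shown on this page.

$$ \widehat{\operatorname{Var}}(d) \approx \frac{n_1 + n_2}{n_1 n_2} + \frac{d^2}{2 (n_1 + n_2)} = \frac{4}{N} + \frac{d^2}{2N}, \qquad \text{so} \quad \widehat{\operatorname{SE}}(d) \approx \frac{2}{\sqrt{N}} \quad \text{when } d^2 \ll 8. $$

For example, under these assumptions $N = 100$ gives a standard error of about $0.2$.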

Figure 2 Priors on $\tau $ for the meta-analysis on all-cause death ($k=3, \{ \sigma _i \} = \{ 1.15, 1.63, 0.19 \}$). Points are maxima. The priors have been scaled to have the same maximum height.
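
A minimal sketch of how prior curves of this kind could be evaluated and rescaled, using the three standard errors given in the caption. The prior formulas assume the standard normal-normal model with Fisher information taken in $(\mu, \tau)$; the function names and the grid are hypothetical, and this is not the article's code or the bayesmeta implementation.

```python
import numpy as np

# Within-study standard errors from the Figure 2 caption (k = 3).
sigma = np.array([1.15, 1.63, 0.19])


def jeffreys1(tau, sigma):
    """Unnormalized prior built from the tau entry of the Fisher information.

    Assumed form: tau * sqrt(sum_i (sigma_i^2 + tau^2)^-2).
    """
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    v = sigma[None, :] ** 2 + tau[:, None] ** 2  # marginal variances, shape (n_tau, k)
    return tau * np.sqrt(np.sum(v ** -2.0, axis=1))


def jeffreys2(tau, sigma):
    """Unnormalized prior built from the determinant of the Fisher information.

    Assumed form: tau * sqrt(sum_i 1/v_i * sum_i 1/v_i^2), v_i = sigma_i^2 + tau^2.
    """
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    v = sigma[None, :] ** 2 + tau[:, None] ** 2
    return tau * np.sqrt(np.sum(1.0 / v, axis=1) * np.sum(v ** -2.0, axis=1))


# Evaluate both priors on a grid of heterogeneity values (grid is arbitrary).
tau_grid = np.linspace(1e-4, 5.0, 50001)
p1 = jeffreys1(tau_grid, sigma)
p2 = jeffreys2(tau_grid, sigma)

# Rescale so both curves share the same maximum height, as in the figure.
p1_scaled = p1 / p1.max()
p2_scaled = p2 / p2.max()

# Grid locations of the maxima (the "points" in the figure).
mode1 = tau_grid[np.argmax(p1)]
mode2 = tau_grid[np.argmax(p2)]
print(f"Jeffreys1 maximum near tau = {mode1:.3f}")
print(f"Jeffreys2 maximum near tau = {mode2:.3f}")
```

Under these assumptions the Jeffreys2 curve equals the Jeffreys1 curve multiplied by a factor that decreases in $\tau$, so its maximum cannot lie to the right of the Jeffreys1 maximum.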

Figure 3 Joint posterior under the Jeffreys2 prior for the meta-analysis on all-cause death ($k=3, \{ \sigma _i \} = \{ 1.15, 1.63, 0.19 \}$). Horizontal red line: marginal posterior mode of $\mu $. Vertical blue line: marginal posterior mode of $\tau $.

Figure 4 Marginal posteriors under the Jeffreys2 prior for the meta-analysis on all-cause death. Solid vertical lines: marginal posterior modes. Dashed vertical lines: limits of 95% intervals.

Figure 5 Interval limits greater than $RR=10$ are truncated. The exact method does not yield point estimates. CI: credible interval.

Table 1 Methods assessed in simulation study

Table 2 Possible values of simulation parameters

Table 3 Scenarios with continuous outcomes; $\widehat {\mu }$ point and interval estimation

Table 4 Scenarios with continuous outcomes; $\widehat {\tau }$ point and interval estimation

Table 5 Scenarios with binary outcomes; $\widehat {\mu }$ point and interval estimation

Table 6 Scenarios with binary outcomes; $\widehat {\tau }$ point and interval estimation

Table 7 Scenarios with continuous outcomes, $k \le 5$; $\widehat {\mu }$ point and interval estimation

Table 8 Scenarios with continuous outcomes, $k \le 5$; $\widehat {\tau }$ point and interval estimation

Table 9 Scenarios with binary outcomes, $k \le 5$; $\widehat {\mu }$ point and interval estimation

Table 10 Scenarios with binary outcomes, $k \le 5$; $\widehat {\tau }$ point and interval estimation

Figure 6 Bias of $\widehat {\mu }$; all scenarios. Hinges of each boxplot are the 25th, 50th, and 75th percentiles. The upper and lower whiskers extend from the hinge to the minimum or maximum value that is no more than $1.5 \times (\text {interquartile range})$ from the nearest hinge.
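
The hinge and whisker rule described in this caption can be sketched as follows. The function name and the toy values are hypothetical, and NumPy's default percentile interpolation is an assumption that may differ slightly from the plotting software used for the figure.

```python
import numpy as np


def boxplot_stats(values):
    """Return hinges and whisker ends using the 1.5 x IQR rule."""
    x = np.asarray(values, dtype=float)
    # Hinges: 25th, 50th, and 75th percentiles.
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers end at the most extreme observed values that lie
    # within 1.5 * IQR of the nearest hinge.
    lower_whisker = x[x >= q1 - 1.5 * iqr].min()
    upper_whisker = x[x <= q3 + 1.5 * iqr].max()
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


# Hypothetical per-scenario bias values; 0.50 lies beyond the upper whisker.
toy_bias = [-0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.50]
stats = boxplot_stats(toy_bias)
print(stats)
```

Values beyond the whiskers, such as 0.50 in the toy input, would be drawn as individual outlying points.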

Figure 7 Coverage of CI for $\widehat {\mu }$. Lines are slightly staggered horizontally for visibility. Lines are mean performances across scenarios, conditional on k, $\tau $, the distribution of population effects, and the outcome type. All HKSJ methods performed very similarly, so their overlapping lines look like a single grey line.

Figure 8 Width of CI for $\widehat {\mu }$. Lines are slightly staggered horizontally for visibility. Lines are mean performances across scenarios, conditional on k, $\tau $, the distribution of population effects, and the outcome type. Y-axis is on log scale.

Figure 9 Bias of $\widehat {\tau }$; all scenarios. Hinges of each boxplot are the 25th, 50th, and 75th percentiles. The upper and lower whiskers extend from the hinge to the minimum or maximum value that is no more than $1.5 \times (\text {interquartile range})$ from the nearest hinge.

Figure 10 MAE of $\widehat {\tau }$. Lines are slightly staggered horizontally for visibility. Lines are mean performances across scenarios, conditional on k, $\tau $, the distribution of population effects, and the outcome type.

Figure 11 RMSE of $\widehat {\tau }$. Lines are slightly staggered horizontally for visibility. Lines are mean performances across scenarios, conditional on k, $\tau $, the distribution of population effects, and the outcome type.

Figure 12 Coverage of CI for $\widehat {\tau }$. Lines are slightly staggered horizontally for visibility. Lines are mean performances across scenarios, conditional on k, $\tau $, the distribution of population effects, and the outcome type.

Figure 13 Width of CI for $\widehat {\tau }$. Lines are slightly staggered horizontally for visibility. Lines are mean performances across scenarios, conditional on k, $\tau $, the distribution of population effects, and the outcome type. Y-axis is on log scale.
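
The performance metrics named in the captions for Figures 6 to 13 (bias, MAE, RMSE, interval coverage, and interval width) can be sketched for a single simulation scenario as below. The definitions are the conventional ones and are assumed here, including the reading of MAE as mean absolute error; the function name and the toy inputs are hypothetical.

```python
import numpy as np


def performance(estimates, ci_lower, ci_upper, truth):
    """Summarize point and interval estimates from repeated simulations."""
    est = np.asarray(estimates, dtype=float)
    lo = np.asarray(ci_lower, dtype=float)
    hi = np.asarray(ci_upper, dtype=float)
    err = est - truth
    return {
        "bias": float(np.mean(err)),                   # mean signed error
        "mae": float(np.mean(np.abs(err))),            # mean absolute error (assumed meaning)
        "rmse": float(np.sqrt(np.mean(err ** 2))),     # root mean squared error
        "coverage": float(np.mean((lo <= truth) & (truth <= hi))),  # share of intervals containing truth
        "width": float(np.mean(hi - lo)),              # mean interval width
    }


# Hypothetical estimates and 95% intervals from four simulated meta-analyses.
result = performance(
    estimates=[0.1, 0.3, 0.2, 0.4],
    ci_lower=[0.0, 0.25, 0.1, 0.1],
    ci_upper=[0.3, 0.5, 0.4, 0.6],
    truth=0.2,
)
print(result)
```

Averaging such per-scenario summaries within levels of k, $\tau $, the effect distribution, and the outcome type would give lines of the kind plotted in these figures.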

Supplementary material: File

Mathur supplementary material
