Synthesis of depression outcomes reported on different scales: A comparison of methods for modelling mean differences

Published online by Cambridge University Press:  17 March 2025

Beatrice C. Downing*
Affiliation:
Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
Nicky J. Welton
Affiliation:
Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
Hugo Pedder
Affiliation:
Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
Ifigeneia Mavranezouli
Affiliation:
Centre for Outcomes Research and Effectiveness, Research Department of Clinical, Educational & Health Psychology, University College London, London, UK
Odette Megnin-Viggars
Affiliation:
Centre for Outcomes Research and Effectiveness, Research Department of Clinical, Educational & Health Psychology, University College London, London, UK
A.E. Ades
Affiliation:
Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
*Corresponding author: Beatrice C. Downing; Email: beatrice.downing@bristol.ac.uk

Abstract

Several methods have been proposed for the synthesis of continuous outcomes reported on different scales, including the Standardised Mean Difference (SMD) and the Ratio of Means (RoM). SMDs can be formed by dividing the study mean treatment effect either by a study-specific (Study-SMD) or a scale-specific (Scale-SMD) standard deviation (SD). We compared the performance of RoM to the different standardisation methods with and without meta-regression (MR) on baseline severity, in a Bayesian network meta-analysis (NMA) of 14 treatments for depression, reported on five different scales. There was substantial between-study variation in the SDs reported on the same scale. Based on the Deviance Information Criterion, RoM was preferred as having better model fit than the SMD models. Model fit for SMD models was not improved with meta-regression. Percentage shrinkage was used as a scale-independent measure with higher % shrinkage indicating lower heterogeneity. Heterogeneity was lowest for RoM (20.5% shrinkage), then Scale-SMD (18.2% shrinkage), and highest for Study-SMD (16.7% shrinkage). Model choice impacted which treatment was estimated to be most effective. However, all models picked out the same three highest-ranked treatments using the GRADE criteria. Alongside other indicators, higher shrinkage of RoM models suggests that treatments for depression act multiplicatively rather than additively. Further research is needed to determine whether these findings extend to Patient- and Clinician-Reported Outcomes used in other application areas. Where treatment effects are additive, we recommend using Scale-SMD for standardisation to avoid the additional heterogeneity introduced by Study-SMD.
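The three effect measures compared in the abstract can be illustrated with a minimal sketch. It rests on assumptions not confirmed by the text above: the study-specific SD is taken to be the usual pooled SD of the two arms, the scale-specific SD is a single hypothetical reference value shared by all studies on that scale, and RoM is the treatment mean divided by the control mean. All numbers are invented for illustration and are not taken from the article.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled SD of two arms (assumed definition of the study-specific SD)."""
    return math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))


def study_smd(mean_t, mean_c, sd_t, n_t, sd_c, n_c):
    """Study-SMD: mean difference divided by that study's own pooled SD."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def scale_smd(mean_t, mean_c, scale_sd):
    """Scale-SMD: mean difference divided by one SD fixed for the whole scale."""
    return (mean_t - mean_c) / scale_sd


def log_rom(mean_t, mean_c):
    """Log Ratio of Means: a multiplicative effect, free of any SD."""
    return math.log(mean_t / mean_c)


# Two hypothetical studies on the same scale with the same mean difference (-3)
# but different reported SDs (6 vs 8). HYPOTHETICAL_SCALE_SD is an assumed value.
HYPOTHETICAL_SCALE_SD = 7.0

study_a = study_smd(12.0, 15.0, 6.0, 50, 6.0, 50)
study_b = study_smd(12.0, 15.0, 8.0, 50, 8.0, 50)
scale_a = scale_smd(12.0, 15.0, HYPOTHETICAL_SCALE_SD)
scale_b = scale_smd(12.0, 15.0, HYPOTHETICAL_SCALE_SD)
rom = log_rom(12.0, 15.0)

print(f"Study-SMD: A={study_a:.3f}, B={study_b:.3f}")
print(f"Scale-SMD: A={scale_a:.3f}, B={scale_b:.3f}")
print(f"log RoM:   {rom:.3f}")
```

In this toy case the two Study-SMD values differ even though the underlying mean difference is identical, while the Scale-SMD values agree. That is the kind of additional between-study variation the abstract attributes to study-specific standardisation.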

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology
Figure 1 Network of evidence on 14 pharmacological treatments of more severe depression, no treatment and pill placebo. The thickness of edges is proportional to the number of studies and the size of nodes is proportional to the number of participants receiving the treatment.

Table 1 Variation in baseline SD within each scale and regression of baseline SD against baseline severity

Figure 2 The relationship between the baseline pooled standard deviation and mean depression score at baseline. Results are shown by scale for HAMD-17, HAMD-21, HAMD-24 and MADRS scales. BDI-I could not be plotted because it was reported in a single study.

Table 2 Model fit statistics and heterogeneity estimated within each of the five models

Table 3 The mean difference of treatments relative to pill placebo, presented as units on the HAMD-17 scale, with their 95% CrI

Figure 3 Treatment effect vs placebo as change in depression score on the HAMD-17 scale by model. In each case, circular points indicate the median estimate, thick bars indicate the 95% credible interval (CrI) and thin bars indicate the 95% prediction interval, for each treatment vs placebo. Treatments are ordered by median treatment effect under the RoM model. The vertical grey line indicates one unit on the HAMD-17 scale.

Table 4 Treatment recommendations based on each model, ranked by efficacy, according to five decision rules

Table 5 Criteria supporting the assumption of multiplicative treatment effects

Supplementary material: File

Downing et al. supplementary material
File 365.7 KB