The effectiveness of peer support for individuals with mental illness: systematic review and meta-analysis

Background. The benefits of peer support interventions (PSIs) for individuals with mental illness are not well known. The aim of this systematic review and meta-analysis was to assess the effectiveness of PSIs for individuals with mental illness for clinical, personal, and functional recovery outcomes.
Methods. Searches were conducted in PubMed, Embase, and PsycINFO (December 18, 2020). Included were randomized controlled trials (RCTs) comparing peer-delivered PSIs to control conditions. The quality of records was assessed using the Cochrane Collaboration Risk of Bias tool. Data were pooled for each outcome, using random-effects models.
Results. After screening 3455 records, 30 RCTs were included in the systematic review and 28 were meta-analyzed (4152 individuals). Compared to control conditions, peer support was associated with small but significant post-test effect sizes for clinical recovery, g = 0.19, 95% CI (0.11–0.27), I2 = 10%, 95% CI (0–44), and personal recovery, g = 0.15, 95% CI (0.04–0.27), I2 = 43%, 95% CI (1–67), but not for functional recovery, g = 0.08, 95% CI (−0.02 to 0.18), I2 = 36%, 95% CI (0–61). Our findings should be considered with caution due to the modest quality of the included studies.
Conclusions. PSIs may be effective for the clinical and personal recovery of mental illness. Effects are modest, though consistent, suggesting potential efficacy for PSI across a wide range of mental disorders and intervention types.


Introduction
In recent years, mental health care services and social organizations have increased their focus on implementing peer support initiatives to promote recovery and expand the availability of support for individuals coping with mental illness (Stratford et al., 2017). This growing interest in peer support is stimulated by the World Health Organization (WHO), which considers it a feasible tool that adds a person-centered, recovery, and rights-based approach to biomedical practices in mental health services (WHO, 2021). In addition, the coronavirus disease 2019 (COVID-19) pandemic has increased the need for community-based interventions such as peer support (Suresh, Alam, & Karkossa, 2021), since mental health problems may have been exacerbated and mental health services may be less accessible (Salari et al., 2020).
Peer support involves a mutual exchange of practical and emotional support, based on 'shared understanding, respect, and mutual empowerment between people in similar situations' (Mead, Hilton, & Curtis, 2001) with critical ingredients such as shared responsibility (Mead, 2003; Mead & MacNeil, 2006), hope, self-determination over one's life, and the use of lived experience knowledge (Repper & Carter, 2011; Slade et al., 2014; Solomon, 2004). These aspects are embedded within the varying peer support programs implementing different structures, content, duration, and delivery formats, targeting different populations, and evaluating a wide range of outcomes (Chien, Clifton, Zhao, & Lui, 2019; Lloyd-Evans et al., 2014).
To the best of our knowledge, no previous meta-analysis has examined the effects of peer support across all patient groups and intervention types. We conducted a comprehensive systematic review and meta-analysis of randomized controlled trials (RCTs) comparing the effects of any peer support intervention with control conditions. We focused on three pre-specified main outcomes (clinical, personal, and functional recovery) and, when possible, we also examined specific outcomes within these main categories (e.g. depressive symptoms, empowerment, and quality of life).

Protocol registration
This study adheres to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline (Moher, Liberati, Tetzlaff, Altman, & The PRISMA Group, 2009), and focuses on the effect of peer support for individuals with mental health disorders, corresponding to the main part of our protocol (https://osf.io/58urb). This protocol also includes our search for RCTs on peer support for relatives and caregivers of individuals with mental illness, which will not be reported here.

Search strategy
We searched PubMed, Embase, and PsycINFO up to December 18, 2020, without language restriction. We used index terms from database-specific thesauruses as well as free text words indicative of mental illness and peer support (search strings are available in Appendix A). References of included trials and previous systematic reviews were reviewed for eligibility.

Identification and selection of studies
Two authors (DS and CM) independently screened titles and abstracts to identify eligible papers for inclusion. To determine final inclusions, full texts of the selected papers were examined. We included studies: (a) that were RCTs; (b) comparing any PSI format; (c) for adults with a clinical or self-reported mental disorder diagnosis, or a score above a cut-off on a standardized mental disorder symptom measure; (d) with care-as-usual (CAU), waiting list (WL), or other active (e.g. clinician-led therapies) or inactive comparators (e.g. an attention control website) (Griffiths et al., 2012); and (e) with outcomes focusing on at least one of three categories: clinical (i.e. symptomatic) recovery (Slade et al., 2014; van Eck, Burger, Vellinga, Schirmbeck, & de Haan, 2018); personal recovery (e.g. empowerment; Mueser et al., 2006; van Weeghel, van Zelst, Boertien, & Hasson-Ohayon, 2019); and functional recovery (e.g. quality of life; Mueser et al., 2006). For a definition of the categories, see Appendix B. Peers are defined as individuals recovered or in recovery from a mental illness. We excluded trials in which the intervention was partially or co-delivered by a non-peer (e.g. a lay health worker), targeted substance use or somatic disorder self-management, or included (ex-)employees with mental illness due to their job (e.g. veterans). Any disagreement was resolved with a third author (PC), and central issues were discussed in meetings with all authors.

Data extraction and risk of bias assessment
A standardized form was used by 2 authors (DS and CM) to extract data regarding study context, participants' and intervention characteristics, including diagnoses, intervention format, control condition, and outcome data. When multiple measurements or control groups were available, we followed our developed decision tool (see Appendix C).
Study authors DS and CM independently assessed included trials using the Cochrane Collaboration Risk of Bias (RoB) tool 2.0 (Higgins et al., 2011), resolving any discrepancy with a third researcher (PC). Each of the following RoB-domains was rated as high risk, some concerns, or low risk: (a) the randomization process; (b) deviations from the intended interventions; (c) missing outcome data (up to 10% drop out was rated as low risk); (d) inappropriate measurement of the outcome; (e) selection of the reported result. An overall RoB score was calculated for each study, following our approach as presented in Appendix C.
Also, we examined subcategories within the main categories of outcomes: clinical recovery (depressive symptoms), personal recovery (empowerment, RAS, hope), and functional recovery (quality of life, social support, and loneliness). These subcategories of specific outcomes were pooled when a minimum of five trials were available. In Appendix B, a comprehensive definition for each outcome category is provided, with details on data extraction per category described in Appendix C, and corresponding instruments in Appendix D.

Statistical analysis
We conducted separate meta-analyses comparing PSIs and control conditions for each main group of outcomes (clinical, functional, and personal recovery) as well as subcategories of outcomes within the main groups (e.g. hope, quality of life). Effects were estimated at post-test, and when possible, at long-term follow-ups (⩾6 months after randomization).
We calculated between-group effect sizes (Hedges' g) by using means, standard deviations, and sample sizes (N). When these were not reported, we used dichotomous outcomes or other statistics (e.g. p value, t value) for calculating effect sizes. Intention-to-treat data were used. Effect sizes were pooled with a random-effects model, using the Hartung-Knapp-Sidik-Jonkman method (IntHout, Ioannidis, & Borm, 2014). Heterogeneity was estimated with the I2 statistic and its 95% confidence interval (CI). In addition, we included prediction intervals (PI), which represent the 95% interval of the predictive distribution of effects in future comparable trials.
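The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the between-study variance estimator (DerSimonian-Laird), the use of the conventional standard error in the prediction interval, and all function names are assumptions made for illustration, and the Student's t quantile is approximated numerically to keep the example free of external dependencies.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, se) for two groups from means, SDs, and sample sizes."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled                      # Cohen's d
    j = 1 - 3 / (4 * df - 1)                       # small-sample correction
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j * math.sqrt(var_d)

def t_cdf(x, df, steps=2000):
    """Student's t CDF for x >= 0, by Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile of Student's t for 0.5 < p < 1, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def pool_random_effects(effects, ses):
    """Random-effects pooling with a Hartung-Knapp-Sidik-Jonkman CI (needs k >= 3)."""
    k = len(effects)
    v = [s**2 for s in ses]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)             # DerSimonian-Laird (assumed)
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    ws = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(ws, effects)) / sum(ws)
    # HKSJ variance of the pooled effect, used with a t distribution on k - 1 df
    var_hk = sum(wi * (yi - mu) ** 2 for wi, yi in zip(ws, effects)) / ((k - 1) * sum(ws))
    half_ci = t_quantile(0.975, k - 1) * math.sqrt(var_hk)
    # Prediction interval: t on k - 2 df, conventional SE of the pooled effect (assumed)
    half_pi = t_quantile(0.975, k - 2) * math.sqrt(tau2 + 1 / sum(ws))
    return {"g": mu, "tau2": tau2, "i2": i2,
            "ci": (mu - half_ci, mu + half_ci),
            "pi": (mu - half_pi, mu + half_pi)}
```

For example, `hedges_g(10, 2, 20, 9, 2, 20)` converts hypothetical group summaries into an effect size and standard error, and a list of such pairs can then be passed to `pool_random_effects`. Whether a positive g favors the intervention depends on the scoring direction of each instrument, which the sketch does not handle.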
Categorical moderators of effects were explored in subgroup analyses by using a mixed-effects model. We conducted subgroup analyses when a minimum of three studies were available per subgroup.
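As a rough illustration of the subgroup rule, the Python sketch below keeps only moderator levels with at least three studies and computes a between-subgroup heterogeneity statistic from already pooled subgroup estimates. The record layout, the function names, and the example values are hypothetical; the Q statistic would be compared against a chi-square distribution with the returned degrees of freedom, a step left out here.

```python
from collections import defaultdict

def eligible_subgroups(studies, moderator, min_k=3):
    """Group study records by a categorical moderator; keep levels with >= min_k studies."""
    groups = defaultdict(list)
    for study in studies:
        groups[study[moderator]].append(study)
    return {level: rows for level, rows in groups.items() if len(rows) >= min_k}

def q_between(subgroup_estimates):
    """Between-subgroup Q from pooled (g, se) pairs, one pair per subgroup.

    Returns (Q, df), where df is the number of subgroups minus one.
    """
    weights = [1 / se**2 for _, se in subgroup_estimates]
    mean = sum(w * g for w, (g, _) in zip(weights, subgroup_estimates)) / sum(weights)
    q = sum(w * (g - mean) ** 2 for w, (g, _) in zip(weights, subgroup_estimates))
    return q, len(subgroup_estimates) - 1

# Hypothetical records: only the moderator field matters for the eligibility check.
studies = (
    [{"id": i, "format": "group"} for i in range(3)]
    + [{"id": i, "format": "one-to-one"} for i in range(3, 6)]
    + [{"id": 6, "format": "mixed"}]
)
kept = eligible_subgroups(studies, "format")   # "mixed" is dropped (one study)
```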

Psychological Medicine
We estimated publication bias through visual funnel plot inspection, Egger's test (Egger, Smith, Schneider, & Minder, 1997), and the Duval and Tweedie trim-and-fill procedure (Duval & Tweedie, 2000). We conducted sensitivity analyses by: (a) excluding outliers (defined as studies whose effect size 95% CI did not overlap with the 95% CI of the pooled effect), and (b) exploring the influence of RoB on the results.
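The outlier definition and Egger's regression can be sketched in Python as follows. This is an illustrative sketch only: study-level CIs are built with a normal approximation (1.96 × SE), which is an assumption, and the function names and example numbers are hypothetical. Egger's test is shown as an ordinary least-squares regression of the standardized effect on precision; the p value (from a t distribution with n − 2 degrees of freedom) is not computed.

```python
import math

def find_outliers(studies, pooled_ci):
    """Return labels of studies whose 95% CI does not overlap the pooled 95% CI.

    studies: list of (label, g, se); pooled_ci: (lower, upper).
    """
    pooled_lo, pooled_hi = pooled_ci
    outliers = []
    for label, g, se in studies:
        lo, hi = g - 1.96 * se, g + 1.96 * se      # normal approximation (assumed)
        if hi < pooled_lo or lo > pooled_hi:
            outliers.append(label)
    return outliers

def egger_intercept(effects, ses):
    """Egger regression of g/se on 1/se. Returns (intercept, se_intercept, t, df).

    Needs at least three studies with standard errors that are not all equal.
    """
    y = [g / s for g, s in zip(effects, ses)]
    x = [1 / s for s in ses]
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_int = math.sqrt(rss / (n - 2) * (1 / n + x_bar**2 / sxx))
    t = intercept / se_int if se_int > 0 else math.inf
    return intercept, se_int, t, n - 2

# Hypothetical example: study "B" lies far above the pooled interval.
flagged = find_outliers(
    [("A", 0.2, 0.1), ("B", 1.5, 0.2), ("C", 0.1, 0.15)], (0.05, 0.35)
)
```

An intercept far from zero suggests funnel plot asymmetry, such as smaller studies reporting larger effects.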

Inclusion of studies
The PRISMA flowchart is presented in Fig. 1. We screened 3455 hits and examined the full texts of 133 studies. A total of 30 studies (for references, see Appendix E) were included, of which 28 trials, comprising 4152 participants, were included in the meta-analysis. Three studies (Field, Diego, Delgado, & Medina, 2013; Ludman et al., 2007; Mathews et al., 2018) included a clinician-led group as comparator [e.g. Interpersonal Psychotherapy (IPT) or Cognitive Behavioral Therapy (CBT)], including one overlapping trial (Ludman et al., 2007) which examined a control condition and a clinician-led comparator. Due to the limited number of studies, we did not pool trials with clinician-led comparators. A narrative description of these studies is presented in Appendix F.

Study characteristics
Selected characteristics of the 30 included studies are presented in Appendix D. Two main subgroups were identified across the included trials: patients with SMI (20 trials) and individuals with depression (7 trials). SMI studies included a heterogeneous group of patients including but not limited to psychosis, depressive disorders, anxiety disorders, or bipolar disorders. The majority of depression studies (5 trials) focused on perinatal depression (Dennis, 2003; Dennis et al., 2009; Gjerdingen, McGovern, Pratt, Johnson, & Crow, 2013; Letourneau et al., 2011; Shorey et al., 2019), with participants scoring above a cut-off on a questionnaire. One study focused on women with eating disorders (Ranzenhofer et al., 2020). Most studies had CAU (16 trials) or WL (9 trials) as a control condition.
In 12 trials the PSI consisted of group meetings, 17 evaluated one-to-one peer support, and one trial implemented a mixed format. Face-to-face delivery was most common (16 trials), three trials evaluated telephone-based support, two trials examined internet support groups, and nine trials examined a mixed intervention, bringing together the latter formats. Intervention duration and frequency were heterogeneous and reported inconsistently, ranging from three weeks to six months with weekly meetings or a more flexible frequency.

Risk of bias
Overall, RoB was high in the majority of included studies: 21 trials were rated at high risk (21/30, 70%), six studies were judged as having some concerns for risk of bias (6/30, 20%), and only three studies met criteria for low risk of bias (3/30, 10%). Focusing on the separate RoB domains, twelve studies (12/30, 40%) were rated at low risk of bias for domain 1, due to reporting an adequate randomization process. Due to the unstructured naturalistic approach of peer support, 23 studies (23/30, 77%) were rated at low risk in domain 2 (deviations from the intended interventions). Ten trials (10/30, 33%) were rated at low risk in domain 3 (missing outcome data). Thirteen trials (13/30, 43%) were judged at low risk in domain 4 due to measurement of the outcome, using self-report measures only. For domain 5, only five studies (5/30, 17%) were prospectively registered and were rated at low risk (see Figures G1 and G2 in Appendix G, and Appendix H for RoB rating per domain and study).
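The reported proportions can be reproduced with a few lines of Python. This is only an arithmetic check on the counts stated above; the function name is hypothetical, and the rule used to derive each study's overall rating (Appendix C) is not modeled.

```python
def rob_summary(counts):
    """Convert counts per risk-of-bias rating into whole-number percentages."""
    total = sum(counts.values())
    return {rating: round(100 * n / total) for rating, n in counts.items()}

# Overall ratings across the 30 included trials, as reported in the text.
overall = rob_summary({"high": 21, "some concerns": 6, "low": 3})

# Number of trials rated at low risk in domains 1 to 5, out of 30.
low_risk_per_domain = [round(100 * n / 30) for n in (12, 23, 10, 13, 5)]
```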
Long-term effects for all clinical recovery outcomes indicated that the effect remained significant at six to nine months follow-up, g = 0.17, 95% CI (0.08-0.26), but not at 12 to 18 months follow-up, g = 0.10, 95% CI (−0.21 to 0.40).

Personal recovery
The pooled effect size at post-test across 19 PSI studies measuring personal recovery was significant, g = 0.15, 95% CI (0.04-0.27) (see Table 2 and Figure K1 in Appendix K). Heterogeneity was moderate, I2 = 43%, 95% CI (1-67), although the PI (−0.16 to 0.47) was wide and contained the null effect. (Dennis, 2003; Griffiths et al., 2012) was too small to reliably detect effects. Pooling specific outcomes within personal recovery resulted in significant effects for hope outcomes, g = 0.13, 95% CI (0.03-0.22), but not for empowerment or the Recovery Assessment Scale. In subgroup analyses, we found no differences in the effect of PSIs among potential moderators (see Appendix I).
No indications of publication bias were observed, Egger's test, p = 0.66, see Figure J2 in Appendix J. The effect size did not substantially change when excluding one outlier (Salzer et al., 2016), g = 0.13, 95% CI (0.05-0.21). Subgroup analyses did not detect differences in effects between RoB levels, although only one trial was rated at low risk and the impact of RoB is uncertain due to lack of power.
No indications of publication bias were observed, Egger's test, p = 0.74, see Figure J3 in Appendix J. When one outlier was removed (Salzer et al., 2016), the effect size remained nonsignificant, g = 0.06, 95% CI (−0.01 to 0.13). Subgroup analyses showed no differences in effects between RoB levels. Pooling the three trials rated at low risk resulted in a nonsignificant effect of g = 0.19, 95% CI (−0.37 to 0.76).

Discussion
In this comprehensive meta-analysis of 28 RCTs (n = 4152), PSIs for patients covering a broad spectrum of mental illnesses were associated with superior outcomes compared with control conditions regarding: (a) clinical recovery at post-test, and six to nine months follow-up; (b) personal recovery at post-test; and (c) functional recovery limited to six to nine months follow-up. When examining specific groups, we saw that specifically in SMI patients (individuals with serious mental disorders), peer support was associated with significant superiority to control conditions at post-intervention across all three recovery categories. For the subgroup of individuals with elevated depressive symptoms, most of them being perinatal women, no significant effects were found in any of the recovery categories. Nonetheless, the number of trials targeting this group was small and nonsignificant results could be due to a lack of power. Also, the analyses for more category-specific outcomes within each main outcome category were exploratory due to the small number of studies.
Table note. CAU, care-as-usual; CI, confidence interval; NA, not applicable; PI, prediction interval; WL, waiting list. a According to the random-effects model. b Both studies (k = 2) included individuals with perinatal depressive symptoms scoring above a cut-off on a standardized mental disorder symptom measure. c Egger's test was not significant (p = 0.66) and the number of imputed studies using Duval and Tweedie trim-and-fill procedure was 24. d The p value for the between-group effect sizes is not significant (p = 0.79). e Of the k = 7 studies, only one study included 18 months follow-up data, the remaining studies reported 12 months follow-up data.
Only the effect size for hope, considered part of personal recovery, was significant. We found no significant differences in the effect of PSIs among potential moderators (e.g. intervention delivery) for any of the outcomes, which could suggest that common values of peer support exceed disorder-specific needs and the intervention type. However, subgroup analyses should be considered with caution, since the number of trials for some categories was small and these analyses are likely underpowered. Accordingly, we could not analyze differences in effects between internet-based PSIs (2 trials) and traditional face-to-face interventions (16 trials; see Appendix I).
Since the evidence base for eHealth is increasing (Chan et al., 2022; Deady et al., 2017; Massoudi, Holvast, Bockting, Burger, & Blanker, 2019) and digital PSIs for individuals with SMI seem to be associated with positive changes for both clinical and psychosocial outcomes (Fortuna et al., 2020), the effectiveness of technology-based PSIs should be further investigated.
The pooled effect sizes, which were confirmed in sensitivity analyses, were small, ranging from g = 0.15 for overall personal recovery to g = 0.19 for overall clinical recovery at post-test. A surprising finding was the low to moderate heterogeneity, suggesting that the effects were consistent across widely varying studies. However, due to the relatively large width of the 95% CIs, caution must be applied. Moreover, although the effect size for clinical recovery appeared to be more robust, the prediction intervals for personal and functional recovery suggested that the effects are considerably uncertain. In addition, the risk of bias was high for the majority of included studies and we could not reliably estimate its impact on the results of the meta-analysis.
Operating with a broad scope, including the largest number of trials on peer support to date, we found a significant though small effect size for clinical recovery. This was not detected in previous meta-analyses (Burke et al., 2019; Chien et al., 2019; Fuhr et al., 2014; Huang et al., 2020; Lloyd-Evans et al., 2014; Lyons et al., 2021; White et al., 2020), possibly due to a lack of power. Considering the efficacy of peer support for personal recovery, we confirmed and extended the results of previous meta-analyses (Bryan & Arkowitz, 2015; Burke et al., 2019; Fuhr et al., 2014; Lloyd-Evans et al., 2014; Lyons et al., 2021; White et al., 2020). So far, outcomes for functional recovery are scarcely addressed in peer support meta-analyses (Fuhr et al., 2014; Lyons et al., 2021). Whilst only valid for the subgroup SMI and long-term analysis, we found significant effect sizes for functional recovery, with quality of life as the most important outcome parameter. Overall, results indicate that peer support is of clinical relevance for individuals with mental illness, and not limited to reinforcing personal recovery following the generally accepted recovery-oriented approach (Leamy, Bird, Le Boutillier, Williams, & Slade, 2011; van Weeghel et al., 2019).
Table note. CAU, care-as-usual; CI, confidence interval; NA, not applicable; PI, prediction interval; WL, waiting list. a According to the random-effects model. b k = 6 studies included individuals with depressive symptoms scoring above a cut-off on a standardized mental disorder symptom measure (of which k = 5 are on perinatal depression), and k = 1 study included adults with a clinical diagnosis. c Egger's test was not significant (p = 0.74) and the number of imputed studies using Duval and Tweedie trim-and-fill procedure was 26. d The p value for the between-group effect sizes is not significant (p = 0.45). e Of the k = 10 studies, only one study included 18 months follow-up data, the remaining studies reported 12 months follow-up data.

Limitations
The results of this study should be considered with caution because of several important limitations. First, measures for clinical, personal, and functional recovery differed considerably across studies. Second, long-term effects were limited to smaller samples of trials up to 12 months follow-up. Third, a major limitation of this study is the high risk of bias for the majority of trials, with limited reporting for many of the risk of bias items. Since peer support has an informal nature, it is difficult to quantitatively analyze these interventions. An established protocol would help to quantify variables that could be evaluated in trials, but this would restrict the open nature of PSIs. Still, since peer support has been increasingly considered an essential element for recovery, there have been attempts to structure and professionalize PSIs (Chinman et al., 2016; SAMHSA, 2015). However, doubts remain because the core of peer support is its naturalistic approach (Fortuna, Solomon, & Rivera, 2022). The feasibility, acceptability, and benefits of structuring and professionalizing PSIs need further investigation. To improve the quality of studies, future research should implement clinician-rated instruments and prospective registration in clinical trial registries. Finally, though comparing the efficacy of PSIs with clinical psychotherapies seems relevant for implementing or referring to PSIs in mental health care, the number of trials was too small to conduct a meta-analysis for RCTs with a clinician-led comparator.

Conclusions
Engaging in a peer support intervention may be effective for reducing clinical mental illness symptoms, improving overall personal recovery, and more specifically hope. In particular for individuals with SMI, peer support demonstrated probable efficacy across the three recovery categories. Although the effects were small, peer support is a potentially cost-effective and relatively easy-to-implement intervention, and may complement professional treatment. Therapists, general practitioners, and employees of recovery-oriented services may refer their clients to peer support initiatives to expand the individuals' context to work on recovery when coping with mental illness.
Supplementary material. The supplementary material for this article can be found at https://doi.org/10.1017/S0033291722002422.
Author contributions. Dr Groeneweg, Dr Cuijpers, and Dr Spijker conceptualized and developed the study design. Smit (MSc) and Miguel (MSc) analyzed and interpreted the data and drafted the manuscript. Smit (MSc) and Miguel (MSc) had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Dr Cuijpers, Dr Spijker, Dr Vrijsen, and Dr Groeneweg supervised the study by providing intellectual content, reviewing data analysis and interpretation and critical revision of the manuscript.
Financial support. Smit, MSc is funded by a PhD Studentship of Pro Persona mental health care, which is partly funded by ZonMw, the Dutch organization for health research and health innovation. Research reported in this publication was supported by the Dutch Depression Association, who received funding from Janssen-Cilag to conduct this project. The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review or approval of the manuscript and decision to submit the manuscript for publication. No other disclosures were reported.