Although psychological treatments are effective in reducing depressive symptoms, many individuals with depression do not receive adequate mental healthcare. Reference Thornicroft, Chatterji, Evans-Lacko, Gruber, Sampson and Aguilar-Gaxiola1,Reference Kohn, Saxena, Levav and Saraceno2 Approximately two-thirds of individuals with depression remain undertreated, Reference Mekonen, Chan, Connor, Hides and Leung3 owing to barriers such as shortage of mental health providers, long waiting lists, stigma and financial constraints. Reference Mohr, Ho, Duffecy, Baron, Lehman, Jin and Reifler4–Reference Packness, Halling, Simonsen, Waldorff and Hastrup6 These challenges can be addressed through digital mental health treatments, in which structured psychological modules are delivered via web browsers, mobile applications or chatbots. Reference Andersson, Titov, Dear, Rozental and Carlbring7–Reference Karyotaki, Efthimiou, Miguel, Maas genannt Bermpohl, Cuijpers and Furukawa9 These interventions allow individuals to access treatments at their own pace or with human support, reaching those who might not otherwise seek mental health treatments. Reference Andersson, Titov, Dear, Rozental and Carlbring7
Despite the growing use of internet-based interventions, fine-grained classification of the diverse levels of support is lacking. Reference Koelen, Vonk, Klein, de Koning, Vonk, de Vet and Wiers10–Reference Fairburn and Patel13 Most programmes include some form of human guidance to help participants to understand treatment strategies and provide feedback on their homework. This guidance may be delivered by licensed therapists or individuals with less clinical experience, such as psychology students in training, nurses and laypersons. Reference Baumeister, Reichler, Munzinger and Lin12 Alternatively, some interventions provide minimal human support that is not intended to offer clinical advice but focuses on resolving technical problems, offering motivational encouragement, or providing on-demand assistance for individuals experiencing elevated distress or suicidal risk. With technological advances, interventions can function independently of human support, delivering automated encouragement or personalised feedback generated by artificial intelligence algorithms. Reference Carlbring, Hadjistavropoulos, Kleiboer and Andersson14
However, it remains unclear which levels of support are more effective than others. This is an issue many clinicians and policy makers frequently face. Previous systematic reviews and meta-analyses have primarily classified internet-based interventions into two broad categories, guided versus self-guided, with guided formats showing better treatment outcomes compared with self-guided ones. Reference Karyotaki, Efthimiou, Miguel, Maas genannt Bermpohl, Cuijpers and Furukawa9,Reference Baumeister, Reichler, Munzinger and Lin12,Reference Cuijpers, Noma, Karyotaki, Cipriani and Furukawa15,Reference Krieger, Bur, Weber, Wolf, Berger and Watzke16 However, these classifications do not have a uniform meaning across studies, as they may be based on either the involvement of human support or the nature of that support. Reference Smoktunowicz, Barak, Andersson, Banos, Berger and Botella17 Moreover, they do not capture the nuanced roles of human involvement, such as technical assistance, motivational encouragement, on-demand guidance or clinical feedback. Reference Andersson11,Reference Smoktunowicz, Barak, Andersson, Banos, Berger and Botella17
In addition to support levels, pre-intervention human contact can significantly influence treatment effectiveness and drop-out risk (i.e. acceptability). Reference Krieger, Bur, Weber, Wolf, Berger and Watzke16,Reference Johansson and Andersson18 Such contact may include depression assessments, clinical interviews, or contact with or referrals from general practitioners, delivered either face-to-face or by phone. Reference Furukawa, Suganuma, Ostinelli, Andersson, Beevers and Shumake19 For example, our previous meta-analysis on self-guided interventions found higher treatment effects and acceptability when initial human contact was present. Reference Tong, Panagiotopoulou, Cuijpers and Karyotaki20 However, evidence for the impact of such contact across different support levels is lacking.
To address these gaps, this network meta-analysis (NMA) aimed to (a) examine the relative effectiveness and acceptability of internet-based interventions for adult depression, comparing seven supports and inactive controls; and (b) assess the impact of initial human contact on treatment outcomes across support levels. By identifying the most effective and acceptable support levels, our goal was to provide researchers, clinicians and policy makers with actionable strategies that balance treatment outcomes with human resource demands in the management of depression. This research question is particularly relevant given the growing mental health burden worldwide and the limited availability of specialist care.
Method
Search strategy and selection process
We identified eligible trials from an existing meta-analytic database of individual participant data for digital treatments for depression and anxiety disorders (https://osf.io/p5emr/). This repository was developed through systematic searches of PubMed, PsycINFO and Embase, with the latest search completed on 6 February 2024. Additional studies were identified by contacting authors in the field and reviewing references from relevant published meta-analyses. We also included potentially eligible trials from a living meta-analytic database on psychotherapy for depression. Reference Cuijpers, Miguel, Harrer, Plessen, Marketa and Ebert21 This database was established through systematic searches of PubMed, PsycINFO, Embase and the Cochrane Library, with the latest search completed on 1 January 2025. Detailed search strings for the two databases are provided in Supplementary Appendix A available at https://doi.org/10.1192/bjp.2026.10653. This study forms part of the first author’s doctoral thesis. Reference Tong22
We included randomised controlled trials that (a) targeted adults (≥18 years old) with elevated levels of depressive symptoms, based on clinical diagnosis or scoring above a validated cut-off on self-reported scales; (b) examined internet-based cognitive–behavioural therapy (iCBT), Reference Carlbring, Hadjistavropoulos, Kleiboer and Andersson14 which was defined as an umbrella term encompassing cognitive restructuring, behavioural activation, problem-solving therapy, mindfulness-based CBT, acceptance and commitment therapy, metacognitive therapy, or a combination of these components; (c) studied interventions delivered via the internet, including web platforms, mobile applications, chatbots or a combination of these formats; and (d) compared asynchronous, stand-alone iCBT programmes with each other or with an inactive control condition (i.e. waiting list, care-as-usual [footnote a] and other controls such as attention placebo).
We excluded interventions delivered synchronously (e.g. video conferencing), in blended formats (i.e. face-to-face sessions in addition to digital delivery) or fully in-person, as the aim of this NMA was to compare levels of support within the same treatment modality (stand-alone internet-based interventions). No language restrictions were applied.
Different pairs of researchers conducted title and abstract screening independently, with studies that potentially met eligibility criteria being selected for full-text review. They resolved any disagreements through discussion or by consulting a third researcher. This study followed PRISMA 2020 reporting guidelines, and the completed PRISMA checklist is presented in Supplementary Appendix B.
Risk of bias and data extraction
We assessed the study validity using the Metapsy risk of bias assessment tool, Reference Miguel, Harrer, Karyotaki, Sakher, Sakata and Furukawa23 an adaptation of the Cochrane risk of bias 2.0 tool for psychological intervention trials. Reference Sterne, Savović, Page, Elbers, Blencowe and Boutron24 This tool evaluates five domains: (a) the randomisation process, (b) deviations from the intended intervention, (c) missing outcome data, (d) outcome measurement and (e) selective reporting of results. Studies were rated as having a low risk of bias, some concerns or a high risk of bias.
We extracted data on participant characteristics (e.g. age group, mean age, percentage of females), intervention characteristics (e.g. therapy type, number of planned sessions) and study characteristics (e.g. comparison type, risk of bias, country). In addition, we extracted data on support formats and categorised these formats into seven levels based on previous systematic reviews and expert consensus. Reference Furukawa, Suganuma, Ostinelli, Andersson, Beevers and Shumake19,Reference Tong, Miguel, Panagiotopoulou, Karyotaki and Cuijpers25 A working group comprising clinicians and researchers attended the discussion and agreed on the following support format categories: (a) no support, (b) technical support only, (c) program-generated automated support, (d) support on demand, (e) minimal coaching (i.e. human support on encouragement and adherence only), (f) full coaching (i.e. guided support providing brief and (semi)standardised feedback without in-depth discussion on therapeutic strategies) and (g) therapeutic support (i.e. guided support providing more structured and individualised clinical guidance, primarily delivered by licensed therapists). When an intervention included multiple levels of support, we coded the support level based on the most intensive component. A summarised description is provided in Table 1.
Descriptions of support levels

In addition, we extracted information on the presence of human contact before the intervention (yes or no). Such contact could include pre-intervention symptom screening, clinical diagnostic interviews, assistance in study onboarding, or referrals from general practitioners, nurses or other health workers, provided through telephone, email or face-to-face interactions. Reference Furukawa, Suganuma, Ostinelli, Andersson, Beevers and Shumake19 Detailed documentation of support levels and initial human contact is provided in Supplementary Appendix C.
Pairs of researchers conducted data extraction and risk of bias assessments independently, resolving disagreements through discussion. If consensus was not reached, a third reviewer was consulted. We did not quantify interrater agreement for study selection, data extraction or quality assessments (e.g. Cohen’s kappa) in this review. Reference Orwin26 According to the Cochrane Handbook for Systematic Reviews of Interventions, these metrics are not routinely collected. Reference Higgins, Thomas, Chandler, Cumpston, Li and Page27 Moreover, support formats were categorised using predefined and explicit criteria, with two independent trained raters performing extraction and resolving any discrepancies with the involvement of a third reviewer. Data extraction for the current study was conducted on 1 April 2025, following registration of the protocol (at https://osf.io/amw4r) on 16 March 2025.
Outcomes
The primary outcome was relative treatment effectiveness post-assessment, comparing different support levels and control conditions. Each study contributed one outcome, and this was selected on the basis of the most frequently used depressive symptom scale across studies (see Supplementary Appendix D for the scale hierarchy). Standardised mean differences were calculated as Hedges’ g to account for bias in small samples.
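As an illustration of the effect size described above, the following is a minimal sketch of Hedges' g computed from arm-level summary statistics. The function name, argument names and example numbers are hypothetical, and the sign convention (positive g means lower depression scores in the intervention arm) is an assumption chosen to match the direction used in the league table; it is not taken from the authors' R scripts.

```python
import math


def hedges_g(mean_ctrl, sd_ctrl, n_ctrl, mean_trt, sd_trt, n_trt):
    """Return (g, standard error) for a two-arm comparison.

    Assumed convention: positive g = lower symptom score in the
    treatment arm than in the control arm.
    """
    df = n_trt + n_ctrl - 2
    # Pooled standard deviation across the two arms.
    sd_pooled = math.sqrt(
        ((n_trt - 1) * sd_trt ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    d = (mean_ctrl - mean_trt) / sd_pooled
    # Small-sample correction factor that turns Cohen's d into Hedges' g.
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample variance of d, rescaled by the correction factor.
    var_d = (n_trt + n_ctrl) / (n_trt * n_ctrl) + d ** 2 / (2 * (n_trt + n_ctrl))
    se = math.sqrt(j ** 2 * var_d)
    return g, se


# Hypothetical trial: control mean 15 (SD 5, n = 50), treatment mean 12 (SD 5, n = 50).
g, se = hedges_g(15.0, 5.0, 50, 12.0, 5.0, 50)
print(f"g = {g:.3f}, SE = {se:.3f}")
```

With these hypothetical inputs the uncorrected difference is 0.6 pooled standard deviations, and the correction shrinks it slightly, which is the small-sample adjustment the text refers to.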
The secondary outcome was treatment acceptability, defined as study drop-out for any reason post-intervention. Participants missing post-treatment depressive assessments were considered to have dropped out. Relative risks with 95% confidence intervals were calculated. In addition, we measured the long-term effectiveness and acceptability of the intervention, defining follow-up measurements from 3 to 12 months post randomisation.
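The acceptability measure can be sketched in the same way. The snippet below computes a risk ratio for drop-out with a 95% confidence interval on the log scale, which is the standard large-sample approach; the function name and the counts are hypothetical, and the sketch assumes at least one drop-out in each arm (no continuity correction is applied).

```python
import math


def risk_ratio(dropouts_trt, n_trt, dropouts_ctrl, n_ctrl, z=1.96):
    """Return (RR, lower, upper) for drop-out in treatment v. control.

    Assumes non-zero drop-out counts in both arms.
    """
    rr = (dropouts_trt / n_trt) / (dropouts_ctrl / n_ctrl)
    # Standard error of log(RR) from the four cell counts.
    se_log = math.sqrt(
        1 / dropouts_trt - 1 / n_trt + 1 / dropouts_ctrl - 1 / n_ctrl
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical trial: 30 of 100 drop out under iCBT, 20 of 100 under control.
rr, lower, upper = risk_ratio(30, 100, 20, 100)
print(f"RR = {rr:.2f} [95% CI: {lower:.2f} to {upper:.2f}]")
```

A risk ratio above 1 here means more drop-out in the intervention arm, matching the direction in which acceptability is reported in the Results.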
Data analysis
First, we generated a network plot to visually present the distributions of the seven support levels and three control conditions as nodes. Next, we conducted pairwise meta-analyses of all direct comparisons using a random-effects model, applying Knapp–Hartung adjustment for more robust inferences. Reference Knapp and Hartung28 We then performed a frequentist NMA to combine direct and indirect evidence, estimating comparative effectiveness (Hedges’ g with 95% CI for depressive symptoms) and acceptability (risk ratios with 95% CI for study drop-out [footnote b]) across support formats and control conditions. Reference Rücker and Schwarzer29,Reference Rücker30 Treatment rankings were determined by the P-score, a frequentist analogue of surface under the cumulative ranking curve (SUCRA). Reference Salanti, Ades and Ioannidis31 A random-effects model was used to account for between-study heterogeneity. Reference Higgins and Thompson32 We estimated τ² using the DerSimonian–Laird method and quantified heterogeneity using I² with 95% CI. Reference Higgins and Thompson32,Reference Viechtbauer33
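To make the heterogeneity quantities concrete, the following sketch applies the DerSimonian–Laird estimator to a single pairwise comparison and derives I² and the random-effects pooled estimate. It is a simplified illustration only: the function name and inputs are hypothetical, the Knapp–Hartung adjustment is not applied, and the generalised estimator that netmeta uses across a whole network is not reproduced here.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling for one pairwise comparison (DL estimator)."""
    k = len(effects)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # I²: share of total variability attributed to between-study heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"tau2": tau2, "I2": i2, "pooled": pooled, "se": se, "Q": q}


# Three hypothetical studies with equal sampling variance.
result = dersimonian_laird([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print({name: round(value, 3) for name, value in result.items()})
```

In this toy example the three studies are equally precise, so the pooled estimate stays at the unweighted mean, while τ² and I² capture how far the study effects spread beyond what sampling error alone would explain.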
To examine the impact of initial human contact, we repeated the primary analysis separately for studies with and without pre-intervention contact. Considering both clinical and non-clinical pre-intervention human contact, we also performed separate NMAs for each contact category. For studies with multiple CBT arms involving the same support levels, we added each comparison to the network and performed sensitivity analyses by aggregating arms within studies, assuming a moderate within-study correlation (r = 0.50). Reference Bukumiric, Starcevic, Stanisavljevic, Marinkovic, Milic and Djukic-Dejanovic34 In the primary analysis, we excluded extreme effect sizes (g ≥ 2.0) to reduce heterogeneity and improve the network consistency.
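The aggregation step for multiple arms with the same support level can be illustrated as follows. The text does not state the exact formula used, so this sketch assumes the common composite approach: the arm-level effects are averaged and the variance of that average is computed under a shared correlation r between arms. The function name and example values are hypothetical.

```python
import math


def aggregate_arms(effects, variances, r=0.5):
    """Combine effect sizes from arms of one study into a single estimate.

    Assumption: every pair of arms shares the same correlation r.
    Returns (mean effect, variance of the mean).
    """
    m = len(effects)
    mean_effect = sum(effects) / m
    total = sum(variances)
    # Add the covariance terms r * sqrt(v_i * v_j) for every ordered pair i != j.
    for i in range(m):
        for j in range(m):
            if i != j:
                total += r * math.sqrt(variances[i] * variances[j])
    return mean_effect, total / m ** 2


# Two hypothetical iCBT arms compared against the same control group.
g_combined, var_combined = aggregate_arms([0.4, 0.6], [0.04, 0.04], r=0.5)
print(f"g = {g_combined:.2f}, variance = {var_combined:.3f}")
```

Under this assumption, r = 0.50 gives a combined variance between the value for fully independent arms (r = 0) and the value for perfectly correlated arms (r = 1), which is why a moderate correlation is a conventional middle-ground choice.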
We assessed transitivity by comparing distributions of potential effect modifiers (i.e. age, proportion of women, and baseline depressive severity) across comparisons. Reference Salanti35,Reference Cipriani, Higgins, Geddes and Salanti36 These factors were selected on the basis of a previous NMA methodological paper Reference Tonin, Rotta, Mendes and Pontarolo37 and the availability of reported data. In addition, we examined network consistency using the node-splitting method and a net-heat plot for local assessments Reference Dias, Welton, Caldwell and Ades38,Reference Krahn, Binder and König39 and the design-by-treatment interaction model for global inconsistency. Reference Higgins, Jackson, Barrett, Lu, Ades and White40
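The idea behind local inconsistency checks can be sketched with a simple comparison of direct and indirect evidence. The snippet below forms an indirect estimate of A v. B through a common comparator C and tests whether it differs from the direct estimate. This is a simplified illustration of the principle, not the node-splitting procedure as implemented in netmeta, which separates direct and indirect evidence within the full network model. All names and numbers are hypothetical.

```python
import math


def indirect_estimate(g_ac, se_ac, g_bc, se_bc):
    """Indirect A v. B estimate via common comparator C (adjusted indirect comparison)."""
    return g_ac - g_bc, math.sqrt(se_ac ** 2 + se_bc ** 2)


def inconsistency_test(direct, se_direct, indirect, se_indirect):
    """z-test for the difference between direct and indirect estimates."""
    diff = direct - indirect
    se_diff = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p


# Hypothetical inputs: A and B each compared with waiting list (C),
# plus one set of head-to-head trials of A v. B.
g_ind, se_ind = indirect_estimate(0.8, 0.06, 0.6, 0.08)
diff, z, p = inconsistency_test(0.4, 0.1, g_ind, se_ind)
print(f"indirect g = {g_ind:.2f} (SE {se_ind:.2f}); difference = {diff:.2f}, p = {p:.3f}")
```

A small p-value would suggest that direct and indirect evidence for that comparison disagree, which in turn would cast doubt on the transitivity assumption for that part of the network.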
Sensitivity analyses were conducted to examine the robustness of findings, including (a) addition of outlier studies (i.e. g ≥ 2.0); (b) repetition of the NMA in low-risk-of-bias studies; and (c) aggregation of multiple CBT arms with r = 0.50. All analyses were performed in R version 4.5.0 (R Foundation for Statistical Computing, Vienna, Austria; https://www.r-project.org/) and RStudio (Posit Software, Boston, MA, USA; https://posit.co/) on macOS, using the meta (Guido Schwarzer, Freiburg, Germany; https://cran.r-project.org/package=meta), dmetar (Mathias Harrer, Munich, Germany; https://github.com/MathiasHarrer/dmetar), netmeta (Guido Schwarzer, Freiburg, Germany; https://cran.r-project.org/package=netmeta) and metapsyTools (Metapsy Project, Amsterdam, the Netherlands; https://github.com/MathiasHarrer/metapsyTools) packages. R scripts are available with the pre-registration (https://osf.io/krjfa/).
Certainty of evidence was assessed using CINeMA (https://cinema.ispm.unibe.ch/), Reference Papakonstantinou, Nikolakopoulou, Higgins, Egger and Salanti41,Reference Nikolakopoulou, Higgins, Papakonstantinou, Chaimani, Del Giovane and Egger42 covering five GRADE domains: study limitations, indirectness, inconsistency, imprecision and publication bias. Risk of bias due to missing evidence was evaluated using ROB-MEN (https://cinema.ispm.unibe.ch/rob-men). Reference Chiocchia, Holloway and Salanti43,Reference Chiocchia, Nikolakopoulou, Higgins, Page, Papakonstantinou and Cipriani44
Results
Selection, inclusion and characteristics of studies
After removing 9833 duplicates, we screened 12 757 titles and abstracts and reviewed 1414 full-text articles. In total, 141 randomised controlled trials (140 records, 1 paper included 2 trials) met the inclusion criteria, comprising 169 pairwise comparisons and 32 197 participants (14 837 in inactive control conditions). Eleven studies included multiple iCBT arms with the same support format. Figure 1 shows the PRISMA flowchart. Included studies are listed in Supplementary Appendix E.
PRISMA flow diagram. Flowchart of study selection process for the network meta-analysis, following PRISMA 2020 guidelines. CBT, cognitive–behavioural therapy.

Characteristics of selected studies and detailed support formats are presented in Supplementary Appendix F. Of the 141 trials, 48 arms (27.0%) offered full coaching, 37 (20.8%) therapeutic support, 37 (20.8%) no support, 22 (12.4%) minimal coaching, 15 (8.4%) technical support, 12 (6.7%) automated support and seven (3.9%) support on demand. Nearly half of the control groups were waiting list controls (n = 79, 48.8%), followed by care-as-usual (n = 46, 28.4%) and other controls (n = 37, 22.8%). Most studies were conducted in Europe (n = 64, 45.4%), used depressive symptoms cut-offs for inclusion (n = 92, 65.2%), were carried out in community settings (n = 87, 61.7%) and targeted general adults (n = 69, 48.9%). Forty-five studies (31.9%) had a low risk of bias, 57 (40.4%) had a high risk and 39 (27.7%) had some concerns. Risk of bias plots are provided in Supplementary Appendix G.
Network graph
Figure 2 presents the network graph for the effectiveness of iCBT support formats in reducing depressive symptoms. This network was well-connected, with all support levels linked to an inactive control condition. Full coaching was the most frequently examined level and was connected to all nodes except for therapeutic support, technical support and no support. By contrast, support on demand was the least examined, with direct comparisons limited to therapeutic support, full coaching and control conditions. A contribution plot showing the percentage of each direct comparison contributing to each network estimate is presented in Supplementary Appendix H.
Network plot. Network of all comparisons. Each node represents an intervention type; the size of the node reflects the total number of participants receiving that intervention, and the width of connecting lines corresponds to the number of direct comparisons between intervention pairs.

Pairwise meta-analyses
The results of pairwise meta-analyses are presented in Supplementary Appendix I. All support formats were significantly more effective than the waiting list control (g = 0.56 [95% CI: 0.24–0.89] to g = 0.9 [95% CI: 0.82–1.12]). Minimal coaching (i.e. human encouragement only) and technical support did not significantly outperform care-as-usual controls. Technical support did not outperform other controls. No significant differences were found between support levels. For acceptability, full coaching and no support had significantly higher drop-out rates than care-as-usual and waiting list controls (risk ratio = 1.46 [95% CI: 1.04–2.05] and risk ratio = 1.58 [95% CI: 1.10–2.27] v. care as usual). No other significant differences were observed. Heterogeneity in comparisons of ≥5 studies ranged from moderate to high (I² = 44.08% to I² = 77.97%).
Network meta-analysis
Table 2 presents the league table for the NMA. All support levels were significantly more effective than the waiting list control (g = 0.54 [95% CI: 0.36–0.72] to 0.81 [95% CI: 0.69–0.92]) and other controls (g = 0.22 [95% CI: 0.04–0.39] to 0.48 [95% CI: 0.35–0.62]). Technical support was not significantly more effective than care as usual (g = 0.16 [95% CI: −0.02 to 0.33]). Most support formats showed comparable effectiveness, except that therapeutic support outperformed minimal coaching (g = 0.19 [95% CI: 0.03–0.35]) and technical support (g = 0.27 [95% CI: 0.08–0.45]). Regarding acceptability, all support formats except minimal coaching showed significantly higher drop-out risks than control conditions (risk ratio = 1.33 [95% CI: 1.07–1.67] to risk ratio = 1.61 [95% CI: 1.21–2.15] v. care as usual). Drop-out risk was generally similar across support levels, although full coaching showed slightly higher drop-out than minimal coaching (risk ratio = 1.32 [95% CI: 1.02–1.72]).
League table of network meta-analyses of effectiveness and acceptability a

a. The outcomes assessed were effectiveness (reduction in depression symptoms) and acceptability (study drop-out for any reason) post-intervention. Estimates below the diagonal are from network meta-analyses focusing on effectiveness and are presented as Hedges’ g (95% confidence intervals), where g > 0 indicates that the treatment in the column is more effective than the treatment in the row. The estimates above the diagonal are treatment acceptability, presented as risk ratios (95% confidence intervals), where risk ratio > 1 indicates a higher drop-out risk for the treatment in the column compared with the one in the row.
Figure 3 presents a forest plot comparing iCBT with care as usual. On the basis of SUCRA rankings (Supplementary Appendix J), therapeutic support was most effective (g = 0.42 [95% CI: 0.30–0.55]), followed by full coaching (g = 0.31 [95% CI: 0.20–0.43]) and automated support (g = 0.30 [95% CI: 0.12–0.48]), whereas technical support ranked lowest (g = 0.15 [95% CI: −0.02 to 0.33]). For acceptability, minimal coaching and therapeutic support had the lowest drop-out rates (risk ratio = 1.13 [95% CI: 0.88–1.46] and risk ratio = 1.20 [95% CI: 0.98–1.48], respectively), whereas technical support had the highest (risk ratio = 1.61 [95% CI: 1.21–2.15]).
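The ranking metric named in the Method, the P-score, can be illustrated with a short sketch. For each treatment, the P-score is the mean, over all competitors, of the one-sided probability that the treatment is better, computed from the estimated pairwise difference and its standard error under a normal approximation. The function names, the input layout and the three-treatment example are hypothetical; in practice the pairwise differences and standard errors come from the fitted network model.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def p_scores(treatments, diffs):
    """Compute P-scores, assuming larger effects are better.

    diffs[(a, b)] = (estimate of a minus b, standard error of that difference).
    Each unordered pair needs to appear once, in either order.
    """
    scores = {}
    for a in treatments:
        probs = []
        for b in treatments:
            if a == b:
                continue
            if (a, b) in diffs:
                est, se = diffs[(a, b)]
            else:
                est, se = diffs[(b, a)]
                est = -est  # flip the sign when the pair is stored as b minus a
            probs.append(normal_cdf(est / se))
        scores[a] = sum(probs) / len(probs)
    return scores


# Hypothetical network of three formats with made-up pairwise differences.
example = {("A", "B"): (0.2, 0.1), ("A", "C"): (0.4, 0.1), ("B", "C"): (0.2, 0.1)}
for name, score in p_scores(["A", "B", "C"], example).items():
    print(f"{name}: {score:.3f}")
```

P-scores lie between 0 and 1 and average 0.5 across treatments, so they express relative standing rather than the size of the effect; two formats with similar effect estimates can still receive noticeably different ranks.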
Forest plot of treatment effectiveness and acceptability. (a) Effectiveness: forest plot of standardised mean difference (SMD; Hedges’ g) for depressive symptom reduction post-intervention. (b) Acceptability: forest plot of study drop-out for any reason post-intervention. Risk ratios greater than 1 indicate higher drop-out relative to care as usual (lower acceptability). Risk ratios less than 1 indicate lower drop-out (higher acceptability). Favours support levels: lower drop-out than care as usual; favours care as usual: lower drop-out than the support level.

Between-study heterogeneity was substantial (τ² = 0.05; I² = 68.1% [95% CI: 62.3% to 73.0%]). Effect modifiers appeared to be similarly distributed across comparisons (Supplementary Appendix, Supplementary Fig. J1). Although the design-by-treatment interaction model indicated significant global inconsistency (Q = 110.41; d.f. = 32, p < 0.001), node-splitting suggested no local inconsistencies (Supplementary Fig. J2). The net-heat plot and evidence certainty assessments are provided in Supplementary Appendix I (Supplementary Fig. J3 and Table J2).
Impact of initial human contact
There were 104 studies (121 comparisons) involving human contact before the intervention, with 7 studies providing technical non-clinical contact. In the 104 studies, no significant differences in treatment effectiveness or acceptability were observed across support levels (Supplementary Fig. K1 and Table K1). By contrast, among the 38 studies (48 comparisons) without such contact, therapeutic support was significantly more effective than other formats (g = 0.32 [95% CI: 0.05–0.59] to g = 0.68 [95% CI: 0.35–1.02]; Supplementary Fig. K2 and Table K2). Full coaching and no support also showed larger treatment effects than technical support (g = 0.37 [95% CI: 0.11–0.62] and g = 0.32 [95% CI: 0.05–0.59]). Last, similar results were observed in the network including the 97 studies (114 comparisons) with clinical pre-intervention human contact (Supplementary Fig. K3 and Table K3), consistent with the findings from the 104 studies with any human contact.
In the two networks comparing the presence of human contact, therapeutic support ranked highest in effectiveness (with and without human contact: g = 0.38 [95% CI: 0.24 to 0.53] and g = 0.71 [95% CI: 0.40 to 1.01], respectively), whereas technical support ranked lowest (g = 0.19 [95% CI: −0.04 to 0.43] and g = 0.02 [95% CI: −0.27 to 0.31]). Drop-out risk was generally lower in studies with initial human contact (risk ratio = 1.04 [95% CI: 0.66–1.65] to 1.82 [95% CI: 1.09–3.04]) than in those without (risk ratio = 1.13 [95% CI: 0.50–2.55] to 3.10 [95% CI: 1.56–6.16]; Supplementary Appendix K).
Long-term outcomes
The long-term NMA results are presented in Supplementary Appendix L. Across 58 trials (83 comparisons), all support formats remained more effective than the waiting list control (g = 0.40 [95% CI: 0.16–0.63] to 0.65 [95% CI: 0.50–0.80]). However, on-demand support was no longer significantly more effective than care as usual. All support levels were similarly effective, except that therapeutic support had a larger effect than support on demand (g = 0.26 [95% CI: 0.04–0.47]). In the SUCRA rankings, therapeutic support remained the most effective (g = 0.33 [95% CI: 0.19–0.46]), whereas technical support emerged as the second most effective (g = 0.23 [95% CI: 0.01–0.45]). Regarding acceptability, no support showed the lowest drop-out risk (risk ratio = 1.02 [95% CI: 0.84–1.23]), followed by therapeutic support (risk ratio = 1.13 [95% CI: 0.93–1.38]). Support on demand was the least acceptable, with the highest drop-out risk (risk ratio = 1.58 [95% CI: 1.11–2.26]).
Sensitivity analyses
The results of the sensitivity analyses are provided in Supplementary Appendix M. Overall, therapeutic support remained the most effective intervention format when the study with outlier effect sizes was included (g ≥ 2.0; Supplementary Fig. M1 and Table M1) or when multiple CBT arms were combined using an assumed correlation coefficient (Supplementary Fig. M2 and Table M2). However, when the outlier study was included, ‘no support’ intervention appeared to be the second most effective format (g = 0.33 [95% CI: 0.20–0.47]) after therapeutic support (g = 0.41 [95% CI: 0.28–0.54]), and the global inconsistency increased (Q = 115.32, d.f. = 32, p < 0.001). In studies with a low risk of bias (n = 45, 56 comparisons), all support levels showed comparable effectiveness in reducing depressive symptoms. However, automated support emerged as the most effective format (g = 0.66 [95% CI: 0.40–0.92]; Supplementary Fig. M3 and Table M3).
Discussion
This NMA compared the effectiveness and acceptability of iCBT for depression across various support levels and control conditions. Compared with care as usual, therapeutic support was most effective, whereas technical support was least effective and had the highest post-treatment drop-out rates. Minimal coaching had the lowest drop-out rates, although it was less effective than therapeutic support. When pre-intervention human contact was present, support formats showed comparable effectiveness and acceptability. Without such contact, therapeutic support significantly outperformed other formats, and drop-out risks increased. Long-term efficacy decreased slightly but remained significant, with therapeutic support remaining the most effective iCBT support format, whereas on-demand support became the least effective and acceptable.
These findings are consistent with those of previous research indicating that therapist-guided interventions are generally more effective than self-guided ones. Reference Cuijpers, Noma, Karyotaki, Cipriani and Furukawa15,Reference Krieger, Bur, Weber, Wolf, Berger and Watzke16 Furthermore, our results suggest that although technical support appeared least effective post-intervention, its long-term outcomes may match those of therapist-guided formats. This novel insight suggests that users receiving only technical assistance may initially struggle but later adapt or develop independent coping strategies, with such support potentially sustaining long-term effects. However, this finding should be interpreted cautiously, given the relatively small number of studies (n = 58) and the small effect sizes (g = 0.23 and g = 0.33 for technical support and therapeutic support, respectively) at follow-up.
Moreover, we found that lower-intensity support (e.g. full coaching, automated support, on-demand support, or no support) could be as effective as well-structured therapist-guided feedback. This suggests that digital interventions with automated support or coaching without in-depth therapeutic discussion could be used as alternatives to therapist-guided interventions, which could in turn help to address the growing burden of, and demand for, mental healthcare. Notably, although the no support format showed promising treatment effects, application of purely unguided interventions in clinical practice requires caution. This is particularly relevant for individuals with severe depression, comorbid conditions or elevated suicide risk, for whom more supervision or monitoring is typically required. Further research, such as randomised controlled trials comparing unguided and therapist-guided formats in terms of negative effects, remains essential.
Last, consistent with previous findings, our study highlights the value of pre-treatment human contact. Reference Krieger, Bur, Weber, Wolf, Berger and Watzke16,Reference Johansson and Andersson18,Reference Tong, Panagiotopoulou, Cuijpers and Karyotaki20 When such contact was present, differences between support levels diminished, suggesting that brief initial interactions such as screening calls or referrals may reduce the need for intensive support during treatment. Moreover, in our current analysis, such contact was mostly clinical, including eligibility screening based on questionnaires or clinical diagnostic interviews, either by telephone or in person. Only a few studies (n = 7) included technical contact, such as assistance with installing the program, introducing the study procedures and treatment content, or general practitioner referral only. It was therefore not feasible to explore whether the impact of support levels on depressive symptoms differed according to the intensity of pre-intervention human contact. Nevertheless, the results remained consistent for studies with clinical contact and those with any type of contact. These contacts are relatively easy to implement and can enhance participants’ accountability and engagement, Reference Mohr, Cuijpers and Lehman45 ultimately improving outcomes. Therefore, our findings suggest the importance of including pre-intervention human contact in clinical practice to improve both treatment effectiveness and acceptability.
The current study had several limitations. First, the findings of the primary analysis should be interpreted cautiously owing to global inconsistency and a high risk of bias in many included studies (40.4%). Second, the intensity of support probably varied within categories. For example, minimal coaching ranged from weekly 10 min calls to two brief check-in calls at the beginning and end of the intervention. Similarly, the uptake of on-demand support was often unclear, with only one of seven studies reporting usage (i.e. 7% of participants contacted helplines). Third, we measured treatment acceptability as the relative risk of not completing post-intervention assessments. However, assessment completion may not necessarily reflect treatment disengagement or drop-out and may be confounded by other factors, such as participants’ engagement with research staff rather than their engagement with the intervention itself. Nevertheless, post-treatment assessment completion was the most consistent measure across trials. Alternative indicators of acceptability, such as treatment drop-out rates or the average number of modules completed, are often inconsistently measured and poorly reported at the aggregated data level in primary trials. Reference van Ballegooijen, Cuijpers, van Straten, Karyotaki, Andersson and Smit46 Future research using individual-level data on treatment engagement could help to address this limitation and improve clinical interpretation of the findings. Fourth, guided and unguided interventions may differ systematically in aspects of their study design, such as control condition intensity, which has been found to be associated with treatment effectiveness. Reference Munder, Geisshüsler, Krieger, Zimmermann, Wolf and Berger47 Although observable participant characteristics appeared to be balanced across comparisons, unmeasured factors may have introduced bias into the indirect and mixed comparisons in the NMA. Therefore, the findings of this NMA should be interpreted with caution. Fifth, we did not calculate a formal reliability measure such as interrater agreement (e.g. Cohen’s kappa) for the classification of support levels. Therefore, agreement between raters beyond chance was not quantified and may have influenced the results. Nevertheless, support formats were categorised using predefined and explicit criteria to enhance consistency, and future studies may consider reporting such metrics. Last, relatively few studies examined conditions without initial human contact, reported long-term outcomes or met low-risk-of-bias criteria. The results of the analyses based on these small samples need to be interpreted cautiously.
Despite the limitations, this study was strengthened by its large sample size and the fine-grained classification of distinct support levels within iCBT. For two decades, research on internet-based interventions has primarily used a binary classification: guided versus self-guided formats. However, this binary classification did not have a unified definition and no longer reflects the complexity of modern digital mental health treatments. Our current NMA moves beyond that dichotomy, offering a nuanced comparison across multiple levels of human and automated support and comparing the relative treatment effects and drop-out risks among individuals receiving iCBT.
The findings have key clinical implications. For example, technical support alone appears to be suboptimal, given its low effectiveness and high drop-out rates post-treatment. By contrast, interventions incorporating fully automated feedback or on-demand support show promise, especially in low-resource settings where specialist care is scarce and stigma remains a barrier. In addition, human contact before the treatment, not only during it, could mitigate the influence of support level on outcomes, suggesting a cost-effective strategy to enhance engagement.
Supplementary material
The supplementary material is available online at https://doi.org/10.1192/bjp.2026.10653
Data availability
Data used in the study will be made publicly available at the Metapsy website (https://www.metapsy.org/) on publication. R scripts used in this study are available at https://osf.io/krjfa/.
Author contributions
L.T., E.K. and P.C. initiated the research question, with input from G.A. and H.R. L.T., O.-M.P. and M.R. extracted the data and assessed the risk of bias, with help from C.M. L.T. performed the data analysis and wrote the first draft of the paper, and O.-M.P., C.M., M.H., G.A., H.R., P.C. and E.K. provided feedback. P.C. and E.K. supervised this project.
Declaration of interest
P.C. and E.K. are members of the editorial board of the British Journal of Psychiatry. They were not involved in the review or decision-making process for this paper.

