Is self-guided internet-based cognitive behavioural therapy (iCBT) harmful? An individual participant data meta-analysis

Background Little is known about potential harmful effects as a consequence of self-guided internet-based cognitive behaviour therapy (iCBT), such as symptom deterioration rates. Thus, safety concerns remain and hamper the implementation of self-guided iCBT into clinical practice. We aimed to conduct an individual participant data (IPD) meta-analysis to determine the prevalence of clinically significant deterioration (symptom worsening) in adults with depressive symptoms who received self-guided iCBT compared with control conditions. Several socio-demographic, clinical and study-level variables were tested as potential moderators of deterioration. Methods Randomised controlled trials that reported results of self-guided iCBT compared with control conditions in adults with symptoms of depression were selected. Mixed effects models with participants nested within studies were used to examine possible clinically significant deterioration rates. Results Thirteen out of 16 eligible trials were included in the present IPD meta-analysis. Of the 3805 participants analysed, 7.2% showed clinically significant deterioration (5.8% and 9.1% of participants in the intervention and control groups, respectively). Participants in self-guided iCBT were less likely to deteriorate (OR 0.62, p < 0.001) compared with control conditions. None of the examined participant- and study-level moderators were significantly associated with deterioration rates. Conclusions Self-guided iCBT has a lower rate of negative outcomes on symptoms than control conditions and could be a first step treatment approach for adult depression as well as an alternative to watchful waiting in general practice.


Introduction
Depression is a common and major health issue that is associated with a considerable personal and societal burden (Reddy, 2010;Lepine & Briley, 2011). Self-guided forms of internet-based cognitive behaviour therapy (iCBT) can increase accessibility and reduce the costs of depression treatment (Hedman et al. 2012;Muñoz et al. 2015). Over the past decade, free or commercially provided self-guided iCBT programmes have been made available to individuals with depression (Kaltenthaler et al. 2006). However, there is an ongoing discussion about the advantages and disadvantages of such programmes (Andersson & Titov, 2014). Many healthcare systems remain hesitant to implement self-guided iCBT. Among the barriers to implementation are concerns regarding safety and effectiveness (Waller & Gilbody, 2009). Offered as first stage treatment, self-guided iCBT has been criticised as potentially delaying individuals suffering from depression receiving effective clinical services (Robinson et al. 1998;Ybarra & Eaton, 2005).
Our recent individual participant data (IPD) meta-analysis demonstrated that self-guided iCBT produces encouraging effects although average effects are small in magnitude (Karyotaki et al. 2017). Small effects can be clinically useful and relevant, for instance in countries where mental healthcare services are scarce or inaccessible for other reasons (Muñoz et al. 2015;Karyotaki et al. 2017). Despite the ample evidence regarding the overall benefits of iCBT, possible effects harmful to individuals have been infrequently monitored. As with any other potentially effective treatment, iCBT could also result in undesirable outcomes or fail to arrest substantial ongoing deterioration (Dimidjian & Hollon, 2010). The issue of negative effects is crucial for clinical decision-making, yet limited systematic examination of this issue exists for iCBT.
In contrast to the vast majority of pharmacological trials, which rigorously measure and report adverse outcomes, only half of psychotherapeutic trials report undesirable outcomes, in particular substantial symptom deterioration (Jonsson et al. 2014;Vaughan et al. 2014). Previous research has shown that some forms of psychotherapy can be hazardous for some patients e.g. prolonged imaginal exposure may result in worsening of symptoms in some patients with posttraumatic stress disorder (Foa et al. 2005). With regard to self-guided psychological treatment, it has been argued that it may not be appropriate for all individuals (Newman, 2000). For instance, self-guided interventions may not be intensive enough for individuals with severe symptoms (Mohr et al. 1990). Moreover, lack of therapeutic support may jeopardise therapy outcomes as the progress of patients is not monitored (Newman et al. 2003). Most self-guided interventions are not tailored to the current depressive status of the individual and, accordingly, do not respond to symptom deterioration (Andersson & Titov, 2014).
To our knowledge symptom deterioration has not been examined in self-guided iCBT to date. It is therefore clearly important to determine the prevalence and extent of symptom deterioration in self-guided iCBT. Furthermore, given that not everyone receiving self-guided iCBT experiences worsening of symptoms, research examining for whom this particular form of therapy may be harmful is sorely needed: individuals likely to deteriorate could be diverted to other more appropriate treatment options. However, study-level meta-analyses and randomised controlled trials (RCTs) cannot thoroughly examine moderators of deterioration due to inadequate power (Bower et al. 2013). Novel methodological approaches, such as IPD meta-analysis, are needed to identify who may experience adverse effects from self-guided iCBT. IPD meta-analysis allows us to move beyond the 'grand mean' to explore change at the individual level as well as to examine study variability (e.g. level of adherence, settings, etc.) The present IPD meta-analysis examined rates of clinically significant symptom deterioration under self-guided iCBT compared with control conditions in adults with depression. In addition, we examined several socio-demographic and clinical variables as potential moderators of symptom deterioration. In the context of the present paper, the term 'self-guided iCBT' is defined as CBT treatment delivered via the Internet without any professional support related to the therapeutic content. The present analysis focuses on the prevalence of deterioration (numbers of persons affected) in contrast to our previous IPD meta-analysis of the present dataset in which we focused on population average outcomes due to selfguided iCBT on depression severity (Karyotaki et al. 2017)

Studies selection process
We included IPD from randomised controlled trials comparing self-guided iCBT to a control condition (waiting list, treatment as usual, attention placebo or other non-active controls) for adults (⩾18 years old) with symptoms of depression based either on a diagnostic interview or validated self-report scales. We excluded studies that did not primarily target depression. No limits were applied for language and publication status.
To identify eligible studies, we used an existing database of trials focusing on psychological therapies for adults with depression (Cuijpers et al. 2008). This database was developed in 2006 and is updated annually by a systematic literature search in four major bibliographic databases (PubMed, Embase, PsycINFO and Cochrane Library). The database was updated up to 1 January 2016 for the current search. In this search, various index and free terms of psychotherapy were used in combination with index and free terms of depression. Appendix A. presents the full electronic search string for PubMed. Two reviewers (PC and EK) examined 13 384 titles and abstracts independently. All full texts of papers that possibly met inclusion criteria according to one of the two reviewers were retrieved and checked for eligibility. Disagreement on the inclusion was solved through discussion. Additionally, references of recently published meta-analyses on this field were examined to ensure the inclusion of all eligible published trials. Finally, key researchers who are actively involved in this field were contacted to ask whether they were aware of unpublished trials on this topic or trials missed through the searches.

Data collection process and data items
We contacted the first or the senior author of the RCTs to request access to their datasets. In the case of no response, a reminder email was sent after 2 weeks and after 1 month. If no response was received 1 month after the first email, the study was excluded as unavailable. Authors provided IPD including socio-demographic variables (age, gender, educational level, employment status and relationship status), pre-and post-treatment depression scores, anxiety scores at baseline and the number of iCBT modules completed by each participant. All IPD were merged into one dataset. We also extracted study level variables available from the published reports of the included RCTs (type of control condition, recruitment method and level of support provided).

Risk of bias assessment
Two independent reviewers (EK and PC) assessed the risk of bias in the included studies according to the Cochrane risk of bias assessment tool Higgins & Green, Psychological Medicine 2011). We examined whether the included studies were at low or high risk of selection, performance, detection, attrition, reporting and other sources of bias. In cases of uncertainty, clarification was sought from the authors of the RCT.

Measures
The included studies used either the Beck Depression Inventory [BDI-I (Beck et al. 1961) or BDI-II (Beck et al. 1996)], the Centre for Epidemiological Studies Depression Scale [CES-D (Radloff, 1977)] or the Patient Health Questionnaire [PHQ-9 (Kroenke et al. 2001)] as outcome measures of depression severity.
We classified 'clinically significant deterioration' according to each participant's reliable change index (RCI) (Jacobson & Truax, 1991), which is the most commonly used method for calculating clinically significant negative effects. This method aims to ensure that each individual's deterioration could not attributable to measurement error (Lambert et al. 1993) and thus might warrant clinical intervention in the context of an unsupervised intervention such as iCBT. An RCI of ±1.96 indicates that the difference in scores is likely to be due to a real change in symptoms of an individual (95% confidence level). Participants showing a clinically significant change with an increase in their score (clinically significant negative change of more than −1.96) were classified as clinically significant deteriorated (Jacobson & Truax, 1991). A RCI was calculated separately for each of the studies, using their pre-treatment standard deviation, and the test-retest reliability coefficient of the outcome measure (BDI = 0.93, CES-D = 0.87, PHQ-9 = 0.84).
Finally, in a sensitivity analysis we calculated 'any deterioration' as any increase of equal or more than one point on depressive symptom scales (increase ⩾1 BDI or CES-D or PHQ-9 point) from pre-to post-treatments. Any deterioration includes all deterioration rates (clinically significant or not).

Missing data
Missing outcome data at the post-treatment were multiply imputed. We generated one hundred imputed datasets based on the missing-at-random assumption ('mi impute mvn' in Stata version 13.1; StataCorp LP). These datasets, which consisted of the observed and the imputed deterioration rates, were analysed separately using the selected model and the results were combined using Rubin's rules (Rubin, 2004). We ran sensitivity analyses including only observed values to ensure the robustness of the results ('the complete case analysis').

One-stage IPD
We performed a one-stage IPD meta-analysis in which the IPD from all included studies were merged, with participants nested within studies. A one-stage IPD approach is preferred over a twostage approach because it allows for a more exact likelihood specification (Stewart & Parmar, 1993;Debray et al. 2013;Burke et al. 2016). We used the statistical software Stata (version 14.2) for conducting all analyses. A multilevel-mixed effects logistic regression was used to analyse the effect of the iCBT intervention on clinically significant deterioration and any deterioration. We used a random intercepts model with a random effect for each trial and fixed effect for condition (intervention v. control) using the 'melogit' command in Stata. The binary variables 'clinically significant deterioration' and 'any deterioration' were used as fixed effect.
To explore the variation in outcomes between participants, we examined dichotomous and continuous baseline characteristics of the participants as potential moderators of clinically significant deterioration and of any deterioration. Moderation was tested by adding the interaction between each moderator and deterioration rates to the multilevel mixed-effects logistic regression model. Similarly, we examined whether adherence (defined as number of modules completed divided by the total number of treatment modules) predicted lower deterioration rates within the intervention group.

Two-stage IPD
In addition to the one-stage IPD, we performed a two-stage IPD to examine the effects of study-level variables. We calculated the odds ratio (OR) of deterioration for each study and we pooled the outcomes using a random effects model, chosen because considerable heterogeneity across studies was expected. The OR shows the probability that an event (deterioration) will occur in the intervention group (self-guided iCBT) compared with the probability that the same event occurring in the control group. An OR of more than 1 indicates increased probability that an event will occur in the intervention group while and an OR <1 shows increased probability that an event will occur in the control group (Deeks et al. 2008).
To examine heterogeneity, the I 2 -statistic was calculated. This is an indicator of heterogeneity expressed as a percentage: an I 2 value of 0% is interpreted as no heterogeneity, 25% as low, 50% as moderate and 75% as high heterogeneity (Cohen, 1988). We calculated the 95% confidence intervals (CI) around I 2 using the non-central chi-squared-based approach within the heterogeneity module for Stata (Orsini et al. 2006;Evangelou et al. 2007). Publication bias was examined by visually inspecting funnel plots and by using the Duval and Tweedie's trim and fill method, which yields an estimate of the effect size after adjusting for publication bias (Egger et al. 1997;Duval & Tweedie, 2000). We also conducted the Egger's test of the intercept to quantify the bias captured by the funnel plot and test whether it was significant 30 .
Finally, we examined study-level moderators by conducting subgroup analyses using the mixed effects models in which studies were treated as random effects and subgroups were tested as fixed effects.

Sample representativeness
In our previous analysis (Karyotaki et al. 2017), we tested differences between 13 studies that provided IPD and three eligible studies that did not. Results indicated that there are no statistically significant differences in depression severity outcomes between the included and the unavailable trials (Karyotaki et al. 2017). This suggests that the present sample of studies is representative. However, we could not test differences between included and unavailable trials in the current analysis since deterioration rates are not reported by the published reports of the unavailable trials.

Study selection
The systematic literature search resulted in 16 eligible studies for inclusion. The flowchart of the study selection process is presented in Fig. 1. Data were available from thirteen trials including 3876 participants (Christensen et al. 2004;de Graaf et al. 2009;Berger et al. 2011;Farrer et al. 2011;Gilbody et al. 2015;Karyotaki et al. 2017accepted, Spek et al. 2007Meyer et al. 2009;Moritz et al. 2012;Phillips et al. 2014;Kleiboer et al. 2015;Meyer et al. 2015;Klein et al. 2016;Mira et al. 2017). Three eligible datasets were excluded from the present IPD meta-analysis as data were unavailable (Clarke et al. 2002;Clarke et al. 2005;Clarke et al. 2009).

Studies and participants characteristics
Studies characteristics are presented in Table 1. Across the 13 included RCTs participants were recruited mainly within   Klein et al. 2016 trial provided therapeutic support to participants with moderate depression (PHQ-9 > 9). Participants with mild depressive symptoms received no support throughout the trial. Klein et al. 2016 stratified participants based on depression severity during randomisation. Therefore, we decided to exclude all participants who received therapeutic support (PHQ-9 > 9; n = 634) from the present IPD meta-analysis.
communities. The studies were conducted in six countries (Australia, Germany, Spain, Switzerland, the Netherlands and the UK). The included RCTs examined the effects of self-guided iCBT over control conditions (attention placebo, no treatment, treatment as usual or waiting list). The interventions consisted of five to 11 online self-guided modules. The majority of the trials provided fully self-guided iCBT while four studies provided technical support related to technical aspects of the web-platform. Patient characteristics of the intervention and control groups are presented in Table 2. There were no significant between-group differences in gender, age, relationship status, employment, education level, comorbid anxiety or pre-treatment scores on depression measures. Two-thirds of the participants were female (n = 2531/3832, 66%), and the mean age was 42 years (S.D. = 11.7). Most of the participants were employed (n = 2233/3146, 71%) and had a high school degree or further education (n = 2314/ 2574, 90%). Participants had mean baseline scores of BDI-II = 28.3 (S.D. = 14.4), CES-D = 25.7 (S.D. = 10.8) and PHQ-9 = 14.2 (S.D. = 5.4). A small number of individual cases (n = 71) did not start the treatment or did not have baseline data, leaving a total of n = 3805. Finally, 997 participants dropped out from the included RCTs and did not provide post-treatment scores of depressive symptoms (26% of the total sample). More specifically, 630/2220 (28.4%) and 347/1585 (22%) participants dropped out of the intervention and controls, respectively. The intervention had significantly higher study dropout rates compared with controls (OR 1.51, 95% CI 1.11-2.08; p = 0.01).

Assessment of risk of bias
The overall risk of bias was low across the included RCTs. Random sequence was adequately generated and allocation was concealed in all included studies. Blinding of participants and personnel is not possible in psychotherapy research due to the nature of the treatment. The included RCTs used self-report outcome measure, thus blinding of outcome assessment is not applicable. As an attempt to minimise attrition bias, we multiply imputed the missing data in the present IPD. Finally, there was no indication of selective reporting and other sources of bias. For a justification of risk of bias assessment for each study, the reader is referred to our previous publication (Karyotaki et al. 2017).

Prevalence of clinically significant deterioration
Overall, 7.2% (276/3805) of participants met criteria for clinically significant deterioration, while 30.1% (1143/3805) showed deterioration of any size. Divided by treatment group, 5.8% of participants who followed self-guided iCBT and 9.1% of participants in control groups showed clinically significant deterioration. Similarly, 26.2% of participants in self-guided iCBT and 35.3% of participants in control groups experienced any deterioration. Complete case analysis showed that 6.1 and 9.1% of participants showed clinically significant deterioration in self-guided iCBT and control groups, respectively. Any deterioration rates were 26.5% for self-guided iCBT and 36.6% for the control groups.

One-stage IPD: deterioration rates
The outcomes of one-stage IPD meta-analysis on deterioration rates are summarised in Table 3. Self-guided iCBT resulted in significantly lower clinically significant deterioration (OR 0.62; 95% CI 0.46-0.83, p < 0.001) compared with controls at post-treatment assessment (6-16 weeks post-randomisation). Sensitivity analysis on any deterioration rates and complete case analyses yielded similar outcomes, suggesting that these findings are robust. No significant associations were found between participant characteristics (age, gender, educational level, relationship status, employment status, comorbid anxiety and baseline severity of depression) or for deterioration rates in both full sample and complete case analyses. Finally, treatment adherence was not significantly associated with clinically significant (β = −0.01; S.E. = 0.37, p = 0.97) or any deterioration (β = −0.10; S.E. = 0.19, p = 0.59) within the intervention group.

Two-stage IPD deterioration rates
Results of the two-stage IPD meta-analysis replicated the findings of one-stage IPD on clinically significant deterioration (OR 0.62; 95% CI 0.48-0.81, p < 0.001). Heterogeneity was zero. These results were replicated in sensitivity analysis on any deterioration rates showed similar outcomes and complete case analyses. There was no indication for publication bias across all the analyses. Finally, the examined study-level variables were not significantly associated with clinically significant and any deterioration. Results of the twostage IPD are presented in Table 4 in the online supplement.

Discussion
The present IPD meta-analysis aimed to examine the clinically significant symptom deterioration rates of self-guided iCBT compared with controls and to evaluate the moderating effects of deterioration. Of the 3805 participants included in the analysis, 7.2% showed clinically significant deterioration. Self-guided iCBT had low clinically significant deterioration rates (5.8%) and resulted in lower risk of clinically significant deterioration compared with controls at the post-treatment assessment. Similar results were observed in sensitivity analyses. None of the examined participant-and study-level variables moderated clinically significant deterioration. This is the first meta-analysis to have systematically examined deterioration rates in self-guided iCBT for adults with depressive symptoms. The finding that self-guided iCBT shows lower deterioration rates compared with controls is consistent with previous IPD meta-analysis of 2079 participants on guided internet based interventions (Ebert et al. 2016). According to this work, guided web-based interventions resulted in lower risk of clinically significant deterioration compared with controls. Furthermore, guided internet-based interventions showed 3.6% clinically significant deterioration, which is slightly lower but comparable with the present clinically significant deterioration rates of self-guided iCBT (5.8%) (Ebert et al. 2016). The deterioration rates found in the present IPD meta-analysis are in line with the deterioration rates observed in face-to-face treatment (5-10%) (Rozental et al. 2014) and considerably lower compared with the 12% rate of deterioration reported by a recent meta-analysis examining change under treatment as usual (Kolovos et al. 2017).
The present study has several notable strengths; we were able to obtain IPD from 3876 participants from the majority (81%) of the eligible trials. This large number of participants offered us adequate power to detect statistically significant differences between self-guided iCBT and controls on symptom deterioration rates and to examine moderators of clinically significant deterioration. In addition, the current sample of studies appears to be free from critical bias, since the included trials had high methodological quality. Other strengths of the current work are that heterogeneity between studies was small to moderate and there was no indication of publication bias.
Nevertheless, the current findings should be interpreted with caution due to a number of limitations. The present sample of studies is at (relatively low) risk of availability bias since three eligible studies were not available for inclusion. The published reports of these unavailable trials did not include deterioration rates. Thus, traditional meta-analysis of direct comparison between available and unavailable studies was not possible. However, results of a previous analysis showed no difference between the effectiveness of the available and unavailable studies, indicating that the present sample is representative (Karyotaki et al. 2017). Moreover, the intervention had significantly higher study dropout rates compared with control conditions. It has been suggested that study dropout could be either related to symptom deterioration or improvement (Rozental et al. 2014). On the other hand, higher study adherence rates in controls, such as waiting list, may be associated to the expectation that participants will eventually receive treatment. Unfortunately, it was not possible to examine reasons for study dropout as this information is largely missing from the included studies. In light of this limitation, it should be noted that in the present analysis missing outcome data were multiply imputed based on missing at random assumption. The missing at random assumption underlying multiple imputation is critical as it assumes that those participants for whom post-treatment data were imputed behaved similarly to those with post-treatment data with comparable profiles. Therefore, if individuals dropped out due to deterioration (missing not at random), this would not be captured and a conservative estimate of deterioration is likely. Nevertheless, we performed a complete case analysis to test the robustness of our main full sample analysis and we found no differences between the full sample and complete cases in deterioration rates. Another limitation is that we could not examine several variables that might influence deterioration, such as previous episodes of depression and symptom duration. Individuals who have experienced several episodes of depression in the past and those who have chronic symptoms might be at greater risk of deterioration. Moreover, the majority of the trials included recruited their participants through the community. Thus, the current findings cannot be generalised to patients treated in clinical settings since active care seekers may differ from individuals responding to advertisements.
From a clinical point of view, the present study has important implications. Although many patients prefer psychotherapy to antidepressants (van Schaik et al. 2004), pharmacotherapy dominates primary and secondary mental healthcare while psychotherapy is offered to a lesser degree. Self-guided iCBT has the potential to increase the availability and accessibility of psychotherapy and minimise the costs. However, the implementation of self-guided iCBT in clinical practice is hampered by barriers, such as concerns regarding its safety and effectiveness. The present study has demonstrated that self-guided iCBT results in lower risk for symptom deterioration compared with controls (including regular care services) and our previous study showed that is effective in treating depressive symptoms. This suggests that self-guided iCBT could be an alternative to watchful waiting in general practice and it can be disseminated at a large scale in countries where resources of mental healthcare are limited or inaccessible due to other reasons.
Symptom deterioration occurs in self-guided iCBT and study dropout is higher compared with controls; thus, future studies should carefully examine and report deterioration rates and reasons for study dropout. Future research on self-guided iCBT interventions should examine deterioration rates over longer time periods. Moreover, moderators, such as duration of symptoms and previous episodes, should be addressed by future studies to indicate for whom self-guided iCBT might be harmful. More studies from primary care are needed to replicate the current findings with clinical samples.