Introduction
Obsessive compulsive disorder (OCD) is a debilitating condition that commonly runs a chronic course (Melkonian et al., 2022). Cognitive behaviour therapy (CBT) has been established as an effective treatment for OCD, particularly over the short term (Fisher and Wells, 2005; McKay et al., 2015; Olatunji et al., 2013; Öst et al., 2015; Skapinakis et al., 2016). CBT has been delivered in a wide range of formats, including formats more concentrated than weekly sessions (Abramowitz et al., 2003; Cottraux et al., 2001; Jónsson et al., 2015; Oldfield et al., 2011; Remmerswaal et al., 2021; Storch et al., 2008; Veale et al., 2016; Whiteside et al., 2008; Whiteside and Jacobsen, 2010).
A limited number of studies have reported on the long-term outcome of CBT for OCD, and those that have are often encumbered by small sample sizes and high attrition rates. Nevertheless, these studies offer critical insights into long-term outcomes. In their meta-analyses, Öst et al. (2015, 2022) found nine studies that had at least 12-month follow-up, with a total of 17 treatment conditions. At post-treatment the remission rate was 47.5% (95% CI 37.6–57.6%) and at follow-up 45.7% (95% CI 39.7–51.8%). Among the studies that had longer follow-up, five conditions had 2-year follow-up, and four conditions 5-year follow-up. The difference between studies that had 1-year follow-up (n=8; 44.7%) and those that had longer follow-up (n=9; 46.6%) was not significant. This suggests that the remission rate observed at 1-year follow-up is likely sustained over a longer period. However, roughly half of patients remain significantly impaired despite evidence-based treatment, underscoring the persistent nature of OCD (Öst et al., 2015; Sharma et al., 2021). Understanding long-term outcomes is important to advancing treatment strategies and addressing the persistent challenges of people with OCD. It appears that some of the participants who do not respond at post-treatment do so at follow-up. In a study by de Haan et al. (1997), 17 of 45 non-responders at post-treatment assessment had become responders at 6-month follow-up. In Hansen et al. (2019), 57% of patients who were not remitted at post-treatment had recovered at 4-year follow-up, but 17.9% of post-treatment remitters had relapsed. These results further stress the importance of follow-up data, as results at post-treatment do not tell the whole story.
Different treatment formats of CBT – namely, cognitive therapy (CT), exposure and response prevention (ERP), and a combination of both – have been evaluated for their efficacy in treating OCD. Fewer studies have examined CT than ERP, as the latter is the most common psychological treatment for OCD. An analysis of six studies comparing CT with ERP indicated that the mean percent change from pre-treatment was 51% at post-treatment and 55% at follow-up for CT, and 49% and 47%, respectively, for ERP (Öst et al., 2015). These differences between CT and ERP were not significant, which is in accordance with other systematic reviews and meta-analyses (Rosa-Alcázar et al., 2008; Skapinakis et al., 2016).
The Bergen 4-day treatment (B4DT) is a highly concentrated format of ERP delivered over four consecutive days in groups of 3–6 patients with the same number of therapists. The treatment format was originally developed for OCD but has later been adapted for panic disorder (Eide et al., 2023; Eide et al., 2025; Iversen et al., 2022), social anxiety disorder (Hansen et al., 2024) and emetophobia (Davidsdottir et al., 2025a). Studies on B4DT have yielded good treatment outcomes (Kvale et al., 2018; Launes et al., 2019a; Launes et al., 2019b; Launes et al., 2020) with a remission rate of approximately 70% four years post-treatment (Hansen et al., 2019). The ‘drop-out rate’ from the treatment has been minimal, equivalent to 0.7% in the preceding studies compared with 19.1% for ERP in general (Öst et al., 2015).
The B4DT has been disseminated to several other countries (Björgvinsson et al., 2025; Jelinek et al., 2024; Silver et al., 2023) and two studies have been published on its effectiveness for OCD in Iceland (Davidsdottir et al., 2019; Davidsdottir et al., 2025b). A recent study indicated that of 86 participants receiving the treatment from 2018 to 2023, 68% were in remission post-treatment and 68% at 3-month follow-up, and there was no drop-out from treatment (Davidsdottir et al., 2025b). These results are promising, but it remains to be seen whether treatment gains are maintained over time.
The aim of the present study was to evaluate the effectiveness of B4DT for the sample of Icelandic patients reported in Davidsdottir et al. (2025b) at 12-month follow-up. It is hypothesised that treatment gains will be maintained at 12-month follow-up as in the Norwegian studies on B4DT (Hansen et al., 2018; Hansen et al., 2019).
Method
Participants
The study was part of a standard quality control at an out-patient clinic in Iceland, The Icelandic Anxiety Centre, Kvíðameðferðarstöðin. Ethical approvals were obtained from the Icelandic ethical board (VSN-19-052) and the Norwegian (REK-Vest) regional ethical committee (417836) and the research conformed to the Declaration of Helsinki. Data were collected between 2018 and 2024.
Patients either contacted the centre directly or were referred by mental health professionals. Patients presenting with OCD symptoms were referred to the OCD team. Treatment was offered if a principal diagnosis of OCD on the Mini International Neuropsychiatric Interview (M.I.N.I.; Sheehan et al., 1998) was confirmed and the OCD symptoms were considered severe enough to warrant B4DT, the criterion being ≥21 points on Y-BOCS. The Icelandic patients largely paid for the treatment themselves as psychological treatment is not widely subsidised in Iceland. In this study the B4DT was not initiated if patients were suicidal, psychotic, actively abusing substances or unable to refrain from the use of anxiolytics during B4DT. In total, 157 patients received a confirmed diagnosis of OCD. Of these, 115 met the inclusion criteria for the B4DT and were offered participation. Eight declined treatment, primarily due to fear of exposure exercises or lack of motivation; 12 were unable to afford the treatment; and seven preferred individual treatment because they were reluctant to discuss their obsessions in a group setting. Of the 88 patients who commenced the B4DT, 86 consented to participate in the study.
Procedure
Referred patients met with a clinician for an initial screening session and, if an OCD diagnosis was suspected, they were provided with one to two additional assessment sessions in which background information was recorded, the M.I.N.I. was administered and the severity of the OCD symptoms was assessed with the Y-BOCS (Goodman et al., 1989a; Goodman et al., 1989b). The participants were given detailed information on the study and, if they were willing to participate, signed the informed consent form.
The patients were then given an introduction to the B4DT and watched two short videos in Norwegian with Icelandic subtitles presenting the B4DT along with the outline and content of the treatment (Footnote 1). The patients’ expectations of treatment outcome as well as their evaluation of treatment credibility were assessed with an adapted version of the Borkovec and Nau (1972) Reaction to Treatment Scale, in which four aspects of credibility and expectancy were evaluated on a 0–100% scale, with higher values indicating more positive evaluations. If a patient reported an expectancy or credibility score below 70%, this was taken as an opportunity to clarify possible misunderstandings regarding the treatment.
When admitted to a group, the patients watched another video explaining the treatment in more detail (Footnote 2) and were asked to make a list of exposure tasks. The week ahead of the B4DT, the group leader called each participant on the telephone to ensure that they were ready to start treatment and had made a list of exposure tasks. The treatment was delivered in groups of 3–6 patients with an equal number of therapists. The treatment content is summarised below; for a more detailed description, see Davidsdottir et al. (2025b).
The first day involved a 4-hour psychoeducation session aimed at providing participants with a basic understanding of their condition and the therapeutic process. Days 2 and 3 were dedicated to intensive 8-hour daily sessions focused on ERP addressing the specific OCD concerns of each participant. These were followed by self-administered ERP in the evenings. Daily group meetings, convened in the morning, at lunchtime and in the afternoon, facilitated the sharing of progress and the discussion of challenges related to the exposure therapy process. On the third day, a psychoeducational session was extended to the relatives and significant others of the participants. On the fourth day, participants received a 4-hour session with instructions on strategies for maintaining the positive changes achieved during the treatment programme, and they collaboratively planned how to maintain these changes. Each participant completed an activity schedule for the subsequent three weeks, in which they outlined approximately what they planned to do – or deliberately refrain from doing – during that period. Most tasks had already been practised during treatment, such as going to bed without engaging in usual compulsions, touching doorknobs without sanitising hands afterwards, cooking chicken for the family, and using sharp objects while preparing food. Participants were encouraged to continue with their everyday lives during these weeks, but to actively embrace naturally occurring opportunities to practise the technique of ‘going all in’, rather than engaging in tasks reluctantly or avoiding them altogether. Throughout this post-treatment period, participants monitored their progress by completing a brief daily online survey, which assessed, for example, the extent to which they had abstained from rituals and taken advantage of daily opportunities to practise this new way of approaching feared situations.
Participants did not receive feedback on these registrations, for example regarding the extent to which they adhered to their plans, and there was no therapist contact during this phase.
Therapists
To ensure dissemination of the B4DT without reducing quality, procedures for training and certification of therapists, as well as online registration of treatment results, were established by the developers of the treatment at Haukeland University Hospital in Bergen, Norway. The groups were always led by one of four experienced psychologists from the OCD team. The psychologists had between 10 and 20 years of experience treating anxiety disorders with cognitive behavioural therapy. The co-therapists all had extensive training and practice in cognitive behavioural therapy of anxiety disorders and were part of the OCD team.
Assessment
Prior to the first assessment session, patients were sent several self-report questionnaires to be completed before the first visit to the clinic. These questionnaires were also completed after the treatment, and at 3 and 12 months post-treatment. The M.I.N.I. interviews, as well as the first Y-BOCS administration, were conducted by one of the psychologists in the OCD team, who were all trained in the administration of these semi-structured clinical interviews. The Y-BOCS was administered by telephone post-treatment and at 3- and 12-month follow-up by an independent psychologist.
Instruments
The Mini International Neuropsychiatric Interview (M.I.N.I.) is a semi-structured and standardised diagnostic interview used to determine the most common psychiatric disorders according to DSM-IV and ICD-10 (Lecrubier et al., 1997; Sheehan et al., 1998). The Icelandic version, for which adequate validity has been demonstrated (Sigurðsson, 2008), was administered.
The Yale-Brown Obsessive-Compulsive Scale (Y-BOCS; Goodman et al., 1989a; Goodman et al., 1989b), in interview format, was the primary outcome measure. It consists of 10 items covering the severity of both obsessions and compulsions and is frequently used to assess treatment response. It has good psychometric properties (Goodman et al., 1989a). Although the Y-BOCS interview is widely used in clinical practice and in Icelandic OCD research, published evaluations of the diagnostic or psychometric properties of an Icelandic translation of the clinician-administered Y-BOCS appear to be lacking. Published Icelandic validation work has instead focused on the self-report version, which has demonstrated good validity and reliability for assessing the severity of OCD symptoms (Ólafsson et al., 2010).
The Patient Health Questionnaire (PHQ-9; Kroenke et al., 2010), here a secondary outcome measure, is a 9-item screening measure for depression and severity of depressive symptoms with scores ranging from 0 to 27, and cut-off points of 5, 10, 15 and 20 representing mild, moderate, moderately severe, and severe levels of depression. The psychometric properties of PHQ-9 are well-established (Carroll et al., 2020; Titov et al., 2011), including for the Icelandic translation of the scale (Ágústsdóttir and Daníelsdóttir, 2018; Kristófersdóttir et al., 2026).
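The PHQ-9 cut-off points described above can be expressed as a small lookup. This is only an illustrative sketch: the function name `phq9_severity` is hypothetical, and the label "minimal" for scores below 5 is an assumption, since the text does not name that band.

```python
from bisect import bisect_right

# Cut-off points and labels as stated in the text for the PHQ-9.
PHQ9_CUTOFFS = [5, 10, 15, 20]
# "minimal" (scores below 5) is an assumed label; the text does not name this band.
PHQ9_LABELS = ["minimal", "mild", "moderate", "moderately severe", "severe"]


def phq9_severity(total_score: int) -> str:
    """Map a PHQ-9 total score (0-27) to a severity band."""
    if not 0 <= total_score <= 27:
        raise ValueError("PHQ-9 total score must be between 0 and 27")
    # bisect_right counts how many cut-offs the score has reached.
    return PHQ9_LABELS[bisect_right(PHQ9_CUTOFFS, total_score)]
```

For example, `phq9_severity(12)` falls in the moderate band and `phq9_severity(8)` in the mild band.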
The Generalized Anxiety Disorder Scale-7 (GAD-7; Spitzer et al., 2006), here a secondary outcome measure, is a 7-item scale measuring symptoms of generalised anxiety with scores ranging from 0 to 21, and cut-off points of 5, 10 and 15 representing mild, moderate, and severe levels of anxiety. The psychometric properties are well-established (Beard and Björgvinsson, 2014; Johnson et al., 2019; Rutter and Brown, 2017), including for the Icelandic translation of the scale (Harðardóttir et al., 2022; Ólafsson, 2018).
The Work and Social Adjustment Scale (WSAS; Mundt et al., 2002) was here a secondary outcome measure of functional impairment. It assesses the impact of a person’s mental health difficulties on their ability to function in terms of work, home management, social leisure, private leisure, and personal or family relationships. It consists of five items scored from 0 (not at all) to 8 (very severely). Scores above 20 suggest severe impairment, whereas scores between 10 and 20 indicate significant functional impairment (Mundt et al., 2002). The psychometric properties of the Icelandic translation have not been established.
Criteria for clinical improvement
Changes in total score and the percentage improved and remitted on Y-BOCS were the primary outcome measures, and changes in total scores on GAD-7, PHQ-9 and WSAS were secondary outcomes. The modified international consensus criteria were applied to determine the percentage of patients that showed clinical improvement (Mataix-Cols et al., 2016). The criteria require a ≥35% reduction of the pre-treatment Y-BOCS score to be classified as a clinically relevant response. Remission requires both a response and a post-treatment/follow-up Y-BOCS score of ≤12, and recovery requires that the criteria for remission are met for at least a year. No change was defined as not having improved (<35%) or worsened compared with pre-treatment, and deterioration as having a post-treatment/follow-up score ≥35% higher than the pre-treatment score.
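The criteria above can be sketched as a simple classification rule. This is a minimal illustration, not the authors' scoring code. The function name `classify_status` is hypothetical, and recovery (remission lasting at least a year) is not modelled separately here: a remission label at the 12-month assessment would correspond to recovery.

```python
def classify_status(pre: float, current: float) -> str:
    """Classify a Y-BOCS outcome relative to the pre-treatment score.

    pre:     pre-treatment Y-BOCS total score (must be positive)
    current: post-treatment or follow-up Y-BOCS total score
    """
    if pre <= 0:
        raise ValueError("pre-treatment score must be positive")
    # Proportional reduction from pre-treatment; negative if the score increased.
    reduction = (pre - current) / pre
    if reduction >= 0.35:
        # Response; remission additionally requires a score of 12 or lower.
        return "remission" if current <= 12 else "response"
    if reduction <= -0.35:
        # Score at least 35% higher than at pre-treatment.
        return "deterioration"
    return "no change"
```

For instance, a patient moving from 30 to 10 points would be classified as remitted, whereas a move from 30 to 15 points would count as a response without remission.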
Benchmarking
We compared our results with two sets of studies that all had at least 12-month follow-up data. The first set was the Norwegian B4DT studies (Hansen et al., 2018; Havnen et al., 2014; Havnen et al., 2017; the last two reported in Hansen et al., 2019). The second set was other (non-concentrated) studies of ERP included in the meta-analysis (Öst et al., 2022) of effectiveness studies in OCD (Belloch et al., 2008; Belloch et al., 2010; Håland et al., 2010; Van Noppen et al., 1998; Vogel et al., 2004). Two of three treatment groups in the first set and five of seven in the second set used completer analysis. Thus, we used completer analysis in the benchmarking analysis of Y-BOCS score and remission rate.
Statistical analyses
At post-treatment, 78 out of 86 participants were available for Y-BOCS evaluation, and 81 and 68 for 3- and 12-month follow-ups, respectively. To examine missing data, we used Little’s MCAR test and logistic regression analysis. Little’s MCAR test was not significant (p=.070), which is consistent with the data being missing completely at random. Logistic regression was also conducted to predict missing data at 3- and 12-month follow-up from the pre-treatment Y-BOCS score, WSAS, PHQ-9, GAD-7, gender, age, and co-morbidity. None of these predictors was significant, and no differences were found in treatment outcomes. Therefore, we assumed that the data were missing at random in subsequent analyses.
Fisher’s exact test was used to determine if there was a significant association between categorical variables, for example when participants moved between categories of no change, improvement and remission post-treatment.
The statistical method we applied for the Y-BOCS, GAD-7, PHQ-9, and WSAS as continuous measures was piecewise regression, sometimes referred to as a longitudinal discontinuity model (Singer and Willett, 2003). This approach evaluates whether a shift in the outcome trajectory occurs after a known event. In this study, the known event was the post-treatment evaluation. Piecewise regression was used to investigate the decrease in total scores during treatment (first two weeks) compared with the 12-month follow-up.
Linear mixed effects (LME) models were employed to account for individual variability and to address the relationship between treatment and follow-up trajectories. The model included two random effects, namely intercept and slope (weeks since pre-treatment). Unlike previous analyses (Davidsdottir et al., 2025b) where time was treated as a single categorical variable, this study used two continuous time variables. The first time variable, WEEK, tracked the progression of time starting at pre-treatment. The second variable, POSTWEEK, began tracking time after the post-treatment evaluation to capture changes specific to the follow-up period.
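The two-variable time coding described above can be illustrated with a short sketch. It covers only the fixed-effects part of a piecewise model. The function names, the assumed assessment schedule (post-treatment at week 2, 12-month follow-up 52 weeks later) and the coefficients in the usage note are hypothetical and are not taken from the study's SAS code or results.

```python
def time_variables(weeks_since_pre: float, post_assessment_week: float) -> tuple:
    """Return (WEEK, POSTWEEK) for one assessment occasion.

    WEEK counts weeks from pre-treatment; POSTWEEK stays at 0 until the
    post-treatment assessment and then counts weeks elapsed since it.
    """
    week = weeks_since_pre
    postweek = max(0.0, weeks_since_pre - post_assessment_week)
    return week, postweek


def predicted_score(week, postweek, intercept, slope_week, slope_postweek):
    """Fixed-effects prediction of a piecewise (two-slope) linear model."""
    # During follow-up the effective slope is slope_week + slope_postweek.
    return intercept + slope_week * week + slope_postweek * postweek
```

With made-up coefficients `intercept=30`, `slope_week=-10` and `slope_postweek=10`, the predicted score drops to 10 at the assumed post-treatment assessment (week 2) and stays at 10 through week 54, because the two slopes cancel during follow-up. Under this coding, any change between two follow-up occasions equals the follow-up slope multiplied by the weeks elapsed between them.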
The models were fitted using restricted maximum likelihood estimation to accommodate missing data under the intention-to-treat principle. An unstructured covariance matrix was applied to the residuals to model within-subject correlations, and the residual variance was used to derive the residual standard deviation for effect size calculations.
Effect sizes, expressed as standardised mean differences or Cohen’s d, were calculated to provide a magnitude of change between time points. These were derived directly from model estimates, with the mean difference between two time points divided by the residual standard deviation obtained from the LME models. The primary focus of the analyses was on comparisons of the 12-month follow-up with pre-treatment, post-treatment, and 3-month follow-up scores.
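The effect size calculation described above amounts to dividing a model-estimated mean difference by the residual standard deviation. The sketch below illustrates that arithmetic only. The function names are hypothetical, and the residual variances themselves are not reported in the text, so the usage note works backwards from the WSAS differences and effect sizes given in the Results.

```python
import math


def cohens_d(mean_difference: float, residual_variance: float) -> float:
    """Standardised mean difference using the model's residual SD."""
    return mean_difference / math.sqrt(residual_variance)


def implied_residual_sd(mean_difference: float, d: float) -> float:
    """Residual SD implied by a reported mean difference and effect size."""
    return mean_difference / d


# WSAS mean reductions and effect sizes as reported in the Results.
wsas_pairs = [(12.70, 2.27), (3.99, 0.71), (3.07, 0.55)]
implied = [implied_residual_sd(diff, d) for diff, d in wsas_pairs]
```

All three WSAS comparisons imply a residual standard deviation of roughly 5.6, which is what one would expect if a single residual SD from the same model was used as the denominator throughout.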
Tests were 2-tailed, and a p-value less than .05 was considered to indicate statistical significance. Statistical analyses were performed with SAS 9.4.
For categorical outcomes, missing values were imputed in R by means of predictive mean matching, which is a semi-parametric imputation approach. It is similar to the regression method except that, for each missing value, it fills in a value drawn at random from the observed values of donor cases whose regression-predicted values are closest to the regression-predicted value for the missing case from the simulated regression model (Heitjan and Little, 1991; Schenker and Taylor, 1996; van Buuren, 2018).
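The donor-matching idea behind predictive mean matching can be sketched in a few lines. This is a deliberately simplified illustration in Python rather than the R procedure used in the study: it uses a single predictor and point estimates of the regression line, whereas a full implementation also draws the regression parameters at random for each imputation. The function names, the number of donors `k` and the data in the usage note are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y, k=3, seed=0):
    """Fill missing y values (None) with observed values from nearby donors."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    intercept, slope = fit_line([o[0] for o in observed], [o[1] for o in observed])
    # Each donor is stored as (predicted value, observed value).
    donors = [(intercept + slope * xi, yi) for xi, yi in observed]
    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = intercept + slope * xi
        # Keep the k donors whose predictions are closest to the target prediction.
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        # Impute the observed value of one randomly chosen donor.
        completed.append(rng.choice(nearest)[1])
    return completed
```

Calling `pmm_impute([20, 22, 25, 28, 30, 33], [8, 9, None, 12, 14, None])` on these made-up scores returns a completed list in which every imputed entry is one of the actually observed values, which is the property that distinguishes this approach from plain regression imputation.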
Results
Eighty-eight patients commenced the treatment; two patients did not wish to participate in the study, so the sample consisted of 86 participants: 66 women and 20 men, aged 16–71 years (mean age 29.7). Of these, 61.6% were classified with severe symptoms of OCD, and 38.4% with moderate symptoms; 67.4% were working or studying, 14% were unemployed and 18.6% were on sick leave. Of the sample, 72.1% had previously received psychological treatment for OCD, and 16.3% were currently being treated with medication (SSRI) for their OCD symptoms. They were instructed not to make any changes to their medication during the time of the treatment. Altogether 86.0% had at least one co-morbid disorder, depression being the most common (50.0%). All participants initiating treatment completed it, so the attrition rate was 0%.
Primary outcomes
Mean estimates, standard errors and effect sizes for changes in Y-BOCS are presented in Table 1.
Table 1. Mean estimates, standard errors and effect sizes (Cohen’s d) for the Y-BOCS total score

SE, standard error; CI, confidence interval; FU, follow-up; ES, effect size (Cohen’s d), compared with the 12-month FU mean estimate.
The LME model identified a significant effect of time on Y-BOCS scores. The model included two time variables to reflect reductions during the treatment and follow-up (FU) periods. The Y-BOCS scores at the 12-month FU (M=11.49) were significantly reduced compared with pre-treatment (M=30.49), with a mean reduction of 19.00 (95% CI [–20.83, –17.16], t(141)=–20.42, p<.001). The effect size was large (Cohen’s d=4.70). Comparisons between post-treatment (M=10.68) and the 12-month FU showed no significant change in Y-BOCS scores (difference M=0.81, 95% CI [–1.02, 2.64], t(141)=0.88, p=.38). This suggests that treatment gains achieved at post-treatment were maintained at 12-month follow-up. Similarly, Y-BOCS scores at 12-month follow-up were not significantly different from those at the 3-month follow-up (M=10.87; difference M=0.62, 95% CI [–0.78, 2.03], t(141)=0.88, p=.38), which suggests that effects were maintained from 3-month to 12-month follow-up.
According to the severity benchmarks for Y-BOCS (Cervin et al., 2022), 61.6% of participants were classified with severe symptoms of OCD pre-treatment and 38.4% moderate symptoms. These proportions were 15.2% and 2.3%, respectively, at 12-month follow-up.
Treatment response
As can be seen in Table 2, at post-treatment 95.4% had responded and 69.8% were in remission according to the modified international consensus criteria. At 12-month follow-up, 83.7% had responded and 67.4% were recovered.
Table 2. Clinical status at post-treatment and 12-month follow-up according to the modified international consensus criteria

Response, ≥35% reduction of the pre-treatment Y-BOCS score; Remission/recovery, ≥35% reduction of the pre-treatment Y-BOCS and a score of ≤12; No change, not having improved or worsened (≥35%) compared with pre-treatment.
Table 2 displays the clinical improvement at post-treatment and 12-month follow-up for individual patients, using the modified international consensus criteria. The remission rate at post-assessment (69.8%) and the recovery rate at follow-up (67.4%) were very similar, but there were some movements between the categories. Of the 60 remitted patients at post-assessment, 46 (76.7%) remained in that category and were recovered at follow-up, whereas 14 had a worse status. Of the 22 in the response category, half (n=11) had moved to recovery, seven remained and four had worsened. Finally, of the four in the no change category at post-assessment, one had moved to recovery, two to response, and one remained in no change. When combining the remission and response categories, 18 out of 82 (22.0%) participants were in a worse category at follow-up than at post. However, 14 out of 26 (53.8%) in the response and no change categories were in a better category at follow-up. Fisher’s exact test of these proportions yielded a significant 2-tailed p-value of 0.0031.
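The comparison of proportions above can be reproduced in outline with a small, dependency-free implementation of the two-sided Fisher's exact test. The sketch makes two assumptions that should be read as such: the 2×2 layout (18 worse and 64 not worse among 82; 14 better and 12 not better among 26) is one plausible reading of how the proportions were compared, and the count of 14 follows from the category movements described in the text (11 from response plus 3 from no change). The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Assumed layout: rows are the two groups, columns are "changed" vs "did not change".
p_value = fisher_exact_two_sided(18, 64, 14, 12)
```

The resulting `p_value` is well below .05, in line with the significant result reported in the text; the exact figure depends on the assumed table layout.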
Secondary outcomes
The LME model showed significant changes in the PHQ-9 across time, as can be seen in Table 3. Direct comparisons showed a significant reduction on PHQ-9 from pre-treatment (M=12.58) to 12-month follow-up (M=8.36), with a mean reduction of 4.22, t(99)=–4.65, p<.001, and a large effect size of 1.17. We did not observe a significant difference between the post-treatment score and the 12-month follow-up (difference M=0.82, t(99)=0.89, p=.38). Similarly, we did not find any difference between the 3-month follow-up and the 12-month follow-up (difference M=0.63, t(99)=0.89, p=.38).
Table 3. Mean estimates, standard errors and effect sizes (Cohen’s d) on PHQ-9, GAD-7 and WSAS

SE, standard error; CI, confidence interval; ES, effect sizes; PHQ-9, Patient Health Questionnaire; GAD-7, Generalized Anxiety Disorder Scale; WSAS, Work and Social Adjustment Scale; FU, follow-up.
The LME model demonstrated significant changes in GAD-7 scores over time. Direct comparisons showed a significant reduction in scores from pre-treatment (M=13.15, 95% CI [12.14, 14.16]) to 12-month follow-up (M=7.20), with a mean reduction of 5.95 (95% CI [–7.51, –4.39], t(99)=–7.57, p<.001), and a large effect size of 1.72. However, the changes from post-treatment to 12-month follow-up (M=0.24, t(99)=0.31, p=.76) and from the 3-month follow-up to 12-month follow-up (M=0.19, t(99)=0.31, p=.76) were not significant.
The LME model for WSAS scores showed significant changes over time. Pairwise comparisons revealed a significant reduction in WSAS scores from pre-treatment (M=20.27) to 12-month follow-up (M=7.57), with a mean reduction of 12.70 (t(65)=–7.57, p<.001) and a large effect size (d=2.27). Comparisons between post-treatment and the 12-month follow-up also showed a significant reduction of 3.99 (t(65)=–2.34, p=.022), which corresponds to a medium effect size (d=0.71). Similarly, the comparison between the 3-month follow-up and the 12-month follow-up showed a significant reduction of 3.07 (t(65)=–2.34, p=.022), with a medium effect size (d=0.55).
Benchmarking analyses
As a precursor to the comparison, we tested whether the subgroup (n=18) without follow-up data differed from the main group (n=68) with 12-month follow-up data on Y-BOCS. There was no significant difference between the subgroups at any time point. Interestingly, the subgroup lacking 12-month data had nominally lower means at post-treatment (9.88 compared with 10.76; t(84)=0.83, p=0.41) and at the 3-month assessment (10.06 compared with 11.09; t(84)=0.69, p=0.49). Thus, the participants who lacked 12-month data did not fail the treatment.
Comparison between categories on Y-BOCS
The comparisons of the Icelandic and the Norwegian B4DT and the Icelandic and other ERP from the meta-analysis on Y-BOCS (Öst et al., 2015) are displayed in Table 4. At pre-treatment the Icelandic sample had a significantly higher mean than both the Norwegian and the MA other ERP categories. At both post-treatment and follow-up, the Icelandic mean did not differ from the Norwegian but was significantly lower than the MA other ERP mean score.
Table 4. Means (SD) on Y-BOCS for the Icelandic, Norwegian, and MA other ERP studies at pre-, post-, and 12-month follow-up assessment

M, mean; SD, standard deviation; FU, follow-up; MA, meta-analysis; ERP, exposure and response prevention.
a, p<0.05; b, p<0.01; c, p<0.0001.
Comparison between categories on remission rate
The comparison of remission rates was done in two steps. First, the Icelandic and Norwegian rates were compared, as they used the same modified international consensus criteria with a Y-BOCS score of ≤12 as cut-off. Table 5 (left half) shows the results: there was no significant difference between Iceland and Norway, either at post-treatment or at follow-up. Second, the MA other ERP studies used different cut-off scores on Y-BOCS when applying the Jacobson and Truax (Reference Jacobson and Truax1991) criteria for clinically significant change. For three treatment groups it was 12, for one 14, and for three 16, with a median of 14. This median was used in the comparison between the Icelandic and MA other ERP samples. As can be seen in the right half of Table 5, the Icelandic sample had significantly higher remission rates at both post-treatment and follow-up.
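The choice of benchmark cut-off can be illustrated with a minimal sketch. It takes the seven cut-offs described above, computes their median, and applies a cut-off to a list of scores. The function covers only the cut-off component of the criteria; the full remission criteria may include further conditions. Treating a score at or below the cut-off as remitted is an assumption for the MA studies, and the example scores are hypothetical, not study data.

```python
from statistics import median

# Y-BOCS cut-offs used by the seven MA treatment groups, as described in the text.
ma_cutoffs = [12, 12, 12, 14, 16, 16, 16]
benchmark_cutoff = median(ma_cutoffs)


def remission_rate(scores, cutoff):
    """Percentage of participants with a Y-BOCS score at or below the cut-off."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100 * sum(score <= cutoff for score in scores) / len(scores)


# Hypothetical scores, for illustration only (not study data).
example_scores = [8, 11, 13, 15, 20]
rate_at_12 = remission_rate(example_scores, 12)
rate_at_median = remission_rate(example_scores, benchmark_cutoff)
print(benchmark_cutoff, rate_at_12, rate_at_median)
```

The example shows why the cut-off matters for benchmarking: the same scores yield a higher remission rate under the more lenient median cut-off than under the cut-off of 12.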
Table 5. Remission rate (%) for the Icelandic, Norwegian, and MA other ERP studies at post- and 12-month follow-up assessment

FU, follow-up; MA, meta-analyses.
Comparison on background variables
The comparison between the categories of studies on background variables is presented in Table 6. The Icelandic participants had a significantly lower mean age than the other two categories, but there was no difference in the proportion of females. Icelandic participants had a significantly higher proportion with co-morbid disorders and a lower proportion being medicated for their OCD than both the other categories. There was no difference in attrition rate between the Icelandic and Norwegian studies, but the MA other ERP studies had a significantly higher drop-out during the treatment. Finally, treatment time in hours was very similar, with 22 for the two B4DT categories and 23.4 for the MA other ERP category.
Table 6. Comparison of background variables and attrition rate

Discussion
The aim of the study was to evaluate the 12-month outcome of B4DT for OCD. The results indicate that treatment gains were sustained at the 12-month follow-up. The lasting benefits from treatment are in line with other studies on B4DT for OCD, where symptomatic changes from post-treatment to follow-up assessments have been non-significant (Hansen et al., Reference Hansen, Hagen, Öst, Solem and Kvale2018; Hansen et al., Reference Hansen, Kvale, Hagen, Havnen and Öst2019). The same goes for panic disorder (Eide et al., Reference Eide, Olsen, Hansen, Hansen, Solem and Hagen2025), emetophobia (Davidsdottir et al., Reference Davidsdottir, Hjartarson, Ludvigsdottir, Gunnarsson, Vidar, Kvale, Hansen, Hagen and Öst2025a), and social anxiety disorder (Hansen et al., Reference Hansen, Eide, Reiråskag, Tjelle, Solem and Hagen2024).
In line with previous studies (de Haan et al., Reference de Haan, Van Oppen, Van Balkom, Spinhoven, Hoogduin and Van Dyck1997; Hansen et al., Reference Hansen, Hagen, Öst, Solem and Kvale2018; Hansen et al., Reference Hansen, Kvale, Hagen, Havnen and Öst2019) there was movement between categories from post-treatment to follow-up. Roughly half of those who had not fully benefited from the treatment at post-treatment had done so by 12-month follow-up.
Symptoms of depression (PHQ-9) and generalised anxiety (GAD-7) were reduced during treatment, as was the case in Hansen et al. (Reference Hansen, Hagen, Öst, Solem and Kvale2018), with no significant difference between post-treatment and follow-up mean scores. Work and social interference (WSAS) decreased significantly at all time points, and the participants’ mean was below the cut-off (Mundt et al., Reference Mundt, Marks, Shear and Greist2002) for impairment on WSAS at 12-month follow-up. This resonates with Hansen et al. (Reference Hansen, Kvale, Hagen, Havnen and Öst2019), where 20 of the 27 participants who were not working or studying at pre-treatment were doing so at 4-year follow-up. The results suggest that significant effects can be obtained on measures of psychiatric and work/social problems that were not dealt with in the treatment, which focused exclusively on OCD. This corroborates the results reported in the meta-analysis of CBT in routine clinical care for OCD (Öst et al., Reference Öst, Enebrink, Finnes, Ghaderi, Havnen, Kvale, Salomonsson and Wergeland2022).
At pre-treatment the Icelandic sample had a significantly higher mean on Y-BOCS than the Norwegian and other ERP studies according to the meta-analyses (MA) on OCD (Öst et al., Reference Öst, Enebrink, Finnes, Ghaderi, Havnen, Kvale, Salomonsson and Wergeland2022). At post-treatment and follow-up, the Icelandic mean did not differ from the Norwegian one but was, like the Norwegian mean, significantly lower than the mean of the MA other ERP studies. The remission rates did not differ from those of the Norwegian studies but were significantly higher than those of the MA other ERP studies at both post-treatment and follow-up. This suggests that the B4DT may be more effective than other ERP, and with lower attrition rates; in this study there was no attrition during treatment. This is remarkable, as a high attrition rate is considered one of the main factors hindering outcome in the treatment of OCD (McKay et al., Reference McKay, Sookman, Neziroglu, Wilhelm, Stein, Kyrios, Matthews and Veale2015).
The study has some limitations. Of 86 participants, 68 were available at 12-month follow-up so the missing values were imputed by means of predictive mean matching. As there was no difference between patients with missing and non-missing data with respect to pre-treatment scores on Y-BOCS, gender, age, co-morbidity or treatment outcome, it was assumed that the data were missing at random, which was also shown in Little’s test.
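For readers unfamiliar with predictive mean matching, the idea can be sketched as follows. This is a simplified single-predictor illustration under stated assumptions, not the procedure used in the study: full implementations typically also draw the regression parameters at random and produce multiple imputed data sets. The function name, the choice of three donors and the example scores are all hypothetical.

```python
import random


def pmm_impute(x, y, k=3, seed=0):
    """Impute missing y values (None) by predictive mean matching on x.

    1. Fit a least-squares line of y on x using the complete cases.
    2. Predict y for every case from that line.
    3. For each missing case, take the k complete cases with the closest
       predicted values and copy the observed y of one random donor.
    """
    rng = random.Random(seed)
    observed = [i for i, value in enumerate(y) if value is not None]
    missing = [i for i, value in enumerate(y) if value is None]

    # Simple linear regression fitted to the complete cases only.
    n = len(observed)
    mean_x = sum(x[i] for i in observed) / n
    mean_y = sum(y[i] for i in observed) / n
    sxx = sum((x[i] - mean_x) ** 2 for i in observed)
    sxy = sum((x[i] - mean_x) * (y[i] - mean_y) for i in observed)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x

    def predict(value):
        return intercept + slope * value

    imputed = list(y)
    for i in missing:
        # Rank complete cases by distance between predicted values.
        donors = sorted(
            observed, key=lambda j: abs(predict(x[j]) - predict(x[i]))
        )[:k]
        imputed[i] = y[rng.choice(donors)]
    return imputed


# Hypothetical pre-treatment and 12-month scores, not study data.
pre_scores = [30, 28, 25, 32, 27, 29, 31, 26]
follow_up_scores = [12, 10, None, 14, 9, None, 13, 8]
completed = pmm_impute(pre_scores, follow_up_scores)
print(completed)
```

Because every imputed value is copied from an observed case, the method cannot produce scores outside the range actually seen in the data, which is one reason it is often preferred over imputing the regression prediction directly.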
It should also be pointed out that the Icelandic participants paid for the treatment, as it is not reimbursed as it is in Norway. This may have impacted the Icelandic patients’ motivation and they may have had different characteristics. It should, however, be stressed that the Icelandic sample scored higher on Y-BOCS pre-treatment and had a higher number of participants with co-morbidity compared with the Norwegian participants (Hansen et al., Reference Hansen, Hagen, Öst, Solem and Kvale2018; Hansen et al., Reference Hansen, Kvale, Hagen, Havnen and Öst2019).
Another limitation of the study is the lack of a control group. It is impossible to rule out, although quite unlikely, that participants might have recovered spontaneously during the follow-up period. The meta-analysis by Öst et al. (Reference Öst, Havnen, Hansen and Kvale2015) showed that untreated waitlist conditions had a non-significant within-group effect size of 0.10 at post-assessment, compared with an effect size of 4.70 in this study. This is a large difference. A randomised controlled trial comparing the B4DT with standard weekly-session CBT is currently being carried out in Stockholm, Sweden (Ivanova et al., Reference Ivanova, Fondberg, Flygare, Sannemalm, Asplund, Dahlén, Sampaio, Andersson, Mataix-Cols, Ivanov and Rück2023).
In addition, there was a slight, although non-significant, increase in Y-BOCS scores at 12-month follow-up. A longer follow-up period would have been optimal to investigate whether treatment gains were maintained. Lastly, as this was a naturalistic follow-up, we did not control for any treatments the patients may have received during the follow-up period.
Despite the limitations, the results of this study are promising and consistent with former studies on B4DT. This treatment appears to be more effective than other forms of ERP, boasting a negligible drop-out rate and delivering results in just 4 days of treatment. It can significantly accelerate remission from this debilitating condition and is well-suited for individuals in rural areas who struggle to attend weekly sessions in the city. A key next step is to explore why some participants experience greater benefits from the treatment than others. This insight would allow for further refinement, making the treatment even more personalised and effective for each participant.
Key practice points
(1) Providing a thorough introduction to treatment, and evaluating the patient’s reaction to it, may help reduce the risk of drop-out.
(2) Individualised treatment in a group setting may provide the optimal support for patients severely affected by OCD.
(3) The B4DT is rewarding for both patients and therapists, with significant results achieved through collaborative efforts within only 4 days.
Data availability statement
Data can be made available on request.
Acknowledgements
The authors would like to express their gratitude to the staff at the Icelandic Anxiety Centre for their support during the study’s implementation. We also extend our appreciation to the participants for their time and commitment to this research. Additionally, we thank the Kavli Trust for their funding of the training of the Icelandic therapists. Finally, we acknowledge the anonymous reviewers for their valuable feedback, which helped improve the quality of this manuscript.
Author contributions
Sóley Dröfn Davidsdottir: Conceptualization (lead), Data curation (lead), Formal analysis (lead), Investigation (lead), Methodology (lead), Project administration (lead), Validation (lead), Visualization (lead), Writing - original draft (lead), Writing - review & editing (lead); Sigurbjörg Jóna Ludvigsdóttir: Data curation (supporting), Writing - review & editing (supporting); Gerd Kvale: Conceptualization (lead), Data curation (supporting), Formal analysis (equal), Funding acquisition (lead), Investigation (equal), Methodology (lead), Project administration (supporting), Resources (supporting), Supervision (lead), Writing - review & editing (supporting); Bjarne Hansen: Funding acquisition (lead), Methodology (equal), Project administration (supporting), Supervision (supporting), Writing - review & editing (supporting); Kristen Hagen: Supervision (supporting), Writing - review & editing (supporting); Kristján Helgi Hjartarson: Methodology (supporting), Visualization (supporting), Writing - review & editing (supporting); Gudmundur Skarphedinsson: Formal analysis (equal), Methodology (equal), Supervision (supporting), Visualization (equal); Emanúel Geir Gudmundsson: Data curation (equal), Writing - review & editing (equal); Hrefna Gudmundsdottir: Data curation (equal), Writing - review & editing (equal); Lars-Göran Öst: Formal analysis (equal), Methodology (equal), Project administration (equal), Supervision (lead), Validation (supporting), Visualization (equal), Writing - review & editing (equal).
Financial support
The study was funded by the Kavli Trust.
Competing interests
The authors declare no conflicts of interest related to this study.
Ethical standards
Ethical approval was obtained from the Icelandic ethical board (VSN-19-052) and the Norwegian (REK-Vest) regional ethical committee (417836), and the research conformed to the Declaration of Helsinki. Informed consent was obtained from all participants.





