Binge eating disorder, defined in the DSM-IV as a provisional diagnosis in need of further study, is characterised by recurrent binge eating that occurs in the absence of regular compensatory behaviours. 1 Ample evidence indicates that binge eating disorder is a clinically significant disorder, associated with eating disorder and general psychopathology, psychiatric comorbidity, overweight and obesity and impaired quality of life. Reference Striegel–Moore and Franko2,Reference Wonderlich, Gordon, Mitchell, Crosby and Engel3 According to current reviews, Reference Brownley, Berkman, Sedway, Lohr and Bulik4,Reference Wilson, Grilo and Vitousek5 meta-analyses Reference Hay, Bacaltchuk, Stefano and Kashyap6,Reference Vocks, Tuschen–Caffier, Pietrowsky, Rustenbach, Kersting and Herpertz7 and clinical treatment guidelines, 8 cognitive–behavioural therapy (CBT) is considered the first-line specialty treatment for binge eating disorder. Cognitive–behavioural therapy produces substantial improvements in binge eating, associated psychopathology and psychosocial functioning, and modest weight loss has been documented in those who achieve abstinence from binge eating. Reference Wilfley, Welch, Stein, Spurrell, Cohen and Saelens9,Reference Wilson, Wilfley, Agras and Bryson10 Virtually identical gains have been found for interpersonal psychotherapy (IPT), a theoretically and procedurally distinct therapeutic approach. Reference Wilfley, Welch, Stein, Spurrell, Cohen and Saelens9,Reference Wilson, Wilfley, Agras and Bryson10 For both CBT and IPT, stability of treatment effects has been documented in randomised controlled trials (RCTs) over a period of up to 2 years following treatment cessation, Reference Wilson, Wilfley, Agras and Bryson10,Reference Devlin, Goldfein, Petkova, Liu and Walsh11 yet longer-term effects remain unknown. 
The current study sought to examine the long-term effects of out-patient group CBT and IPT in a randomised controlled binge eating disorder psychotherapy trial (trial registration: NCT01208272). Reference Wilfley, Welch, Stein, Spurrell, Cohen and Saelens9
Participants and procedure
Participants were the first 90 patients (55.6%; the first five cohorts) out of a larger (n = 162) treatment trial for overweight people with binge eating disorder, recruited in New Haven, Connecticut, USA (for methodological detail see the main report Reference Wilfley, Welch, Stein, Spurrell, Cohen and Saelens9 ). As this long-term follow-up study was not part of the initial two-site clinical trial, funded by the National Institutes of Health (NIH), participants were re-consented with a separate informed consent approved by the San Diego State University Institutional Review Board. Owing to the unfunded nature of this follow-up study, we opted to only enrol patients from the New Haven site. All participants met DSM-IV-TR diagnostic criteria for binge eating disorder 1 and were, after stratification by gender, randomised to either CBT or IPT (n = 45 each). Both treatments were manual-based and consisted of 20 weekly 90-min group sessions and three additional individual sessions.
For the present study, patients were contacted for a long-term follow-up assessment approximately 4 years after treatment cessation (i.e. a mean of 46.0 months). As illustrated in Fig. 1, of the 90 patients included in this study, 77 (85.6%) could be contacted; the other 13 could not (no current contact information: 10; incapacitated or deceased: 3). Of the 77 individuals contacted, 58 (75.3%) completed the long-term follow-up assessment. Completion rates were 89/90 (98.9%) for the post-treatment assessment and 77/90 (85.6%) for the 1-year follow-up assessment. Eighty-three (92.2%) of the 90 patients had completed treatment. Treatment and assessment completion rates did not differ by group (all P>0.05).
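The participant flow can be re-derived from the reported counts. The short sketch below is only an arithmetic check; the variable names and the one-decimal rounding convention are assumptions made for illustration.

```python
# Counts as reported in the text; percentages are rounded to one decimal place.
enrolled = 90
not_contactable = 10 + 3  # no current contact information + incapacitated or deceased
contacted = enrolled - not_contactable
completed_long_term = 58


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


flow = {
    "contacted / enrolled": pct(contacted, enrolled),
    "long-term completers / contacted": pct(completed_long_term, contacted),
    "post-treatment completers / enrolled": pct(89, enrolled),
    "1-year completers / enrolled": pct(77, enrolled),
    "treatment completers / enrolled": pct(83, enrolled),
}

for label, value in flow.items():
    print(f"{label}: {value}%")
```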
Assessments and procedures
Outcome analyses included pre-treatment, post-treatment, 1-year follow-up and long-term follow-up assessments. The long-term follow-up assessment included a telephone interview and self-report questionnaires, whereas all other assessments involved in-person diagnostic visits and self-report questionnaires. All structured clinical interviews were conducted by trained assessors (Bachelor level or higher) who received ongoing supervision to ensure standardised administration.
Eating disorder psychopathology
The diagnostic version of the Eating Disorder Examination (EDE 12.0D), Reference Fairburn, Cooper, Fairburn and Wilson12 a semi-structured interview with good reliability and validity, was used to assess days with objective bulimic episodes (i.e. eating an unusually large amount of food, accompanied by a sense of loss of control over eating). 1 The primary outcome variables of recovered and remitted were based on the EDE assessment of days with objective bulimic episodes over the previous 28 days (see Data analysis). Secondary outcome variables included the number of days with objective bulimic episodes over the previous 28 days and a composite shape/weight concern score, both derived from the EDE. For descriptive purposes, DSM-IV-TR diagnoses of binge eating disorder, bulimia nervosa, anorexia nervosa and eating disorder not otherwise specified were determined with the abbreviated EDE. Objective bulimic episode days and episodes were comprehensively assessed over the previous 6 months, whereas purging behaviour (i.e. self-induced vomiting, laxative or diuretic misuse), fasting, intense exercising, importance of shape or weight, fear of weight gain, feelings of fatness, maintained low weight, and menstruation were assessed over the previous 28 days and, if positive, over the previous 3 months.
In addition, the self-report version of the EDE, the Eating Disorder Examination Questionnaire (EDE-Q), Reference Fairburn and Beglin13 was used to assess the associated eating disorder psychopathology on the four subscales of restraint, eating concern, shape concern and weight concern, and on global eating disorder psychopathology (an average of the four subscales). The EDE-Q indicators have demonstrated adequate internal consistency, convergent validity and sensitivity to change. Reference Sysko, Walsh and Fairburn14 The EDE weight/shape composite and the EDE-Q indicators were based on the previous 28 days and ranged from 0 to 6, with higher scores indicating greater psychopathology.
The depression and anxiety subscales of the Brief Symptom Inventory (BSI) Reference Derogatis15 were used for assessment of general psychopathology. Subscale scores were used as secondary outcomes; they ranged from 0 to 4, with higher scores indicating more severe psychiatric symptoms. Both subscales have adequate internal consistency, convergent validity and sensitivity to change. Reference Derogatis and Melisaratos16
Body mass index
Body mass index (BMI, kg/m²), a proxy for body fat, was used as a secondary outcome. Body mass index was calculated from weight (self-reported at long-term follow-up and measured at all other time points) and height (measured at pre-treatment). For diagnosis of anorexia nervosa (see above), underweight was determined as BMI<18.5 kg/m². Prior to treatment, measured and self-reported body weight were highly associated (r = 0.99, P<0.001).
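As a minimal sketch of the calculation described above, BMI is weight in kilograms divided by the square of height in metres, with underweight defined as a value below 18.5. The function names and the example participant are hypothetical and not taken from the study data.

```python
UNDERWEIGHT_CUTOFF = 18.5  # kg/m^2, threshold stated in the text


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_underweight(weight_kg: float, height_m: float) -> bool:
    """True if BMI falls below the underweight threshold used in the text."""
    return bmi(weight_kg, height_m) < UNDERWEIGHT_CUTOFF


# Hypothetical participant: 95.0 kg and 1.68 m.
print(round(bmi(95.0, 1.68), 1))
print(is_underweight(95.0, 1.68))
```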
At long-term follow-up, healthcare utilisation for eating or weight problems was retrospectively assessed for the time after 1-year follow-up. Use of psychotherapy, pharmacotherapy, consultation (for example with a dietician) and/or alternative treatment (for example hypnosis) and any of these treatments was determined in a dichotomous format (0, no; 1, yes).
Preliminary analyses served (a) to compare patients randomised to CBT v. IPT (n = 45 each) on sociodemographic characteristics (age, gender), drop-out from treatment, adherence (i.e. number of sessions attended), and pre-treatment level of any outcome variable (univariate general linear model (GLM) analyses or χ² analyses by treatment (CBT, IPT)); (b) to compare participants who were included in the long-term follow-up assessment v. those who were not included (n = 90 v. 72) by treatment on the same variables and outcome variables at all time points (univariate GLM or generalised linear model analyses of sample (included, not-included)×treatment (CBT, IPT) for continuous or binary variables, respectively); and (c) to compare participants with completed v. non-completed long-term follow-up assessment (n = 58 v. 32) by treatment on these same variables and outcome variables at pre-treatment, post-treatment, and 1-year follow-up (univariate GLM or generalised linear model analyses of assessment (completed, not-completed)×treatment (CBT, IPT) for continuous or binary variables, respectively).
Statistical analyses were based on the generalised estimating equations (GEE) approach for the primary (categorical) outcome variables and on hierarchical linear modelling (HLM) for the secondary (continuous) outcome variables. Both intention-to-treat approaches allow data from participants with missing data at some, but not all, time points to remain in the analyses. In addition, both approaches correct for the dependency of observations within participants, in GEE analysis by assuming an exchangeable working correlation structure, and in HLM by allowing the regression coefficients to vary between participants. For the efficacy analyses, the assessment completer sample size at long-term follow-up provided 80% power to detect a medium-to-large effect size of treatment difference (d = 0.76; n = 25 for CBT, n = 33 for IPT).
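The stated power can be roughly checked with a normal approximation to a two-sided, two-group comparison. This is a sketch under that assumption, not the authors' power calculation; the function name is illustrative, and because the approximation ignores the t distribution it comes out slightly above the reported 80%.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n1, n2, alpha=0.05):
    """Normal-approximation power for a two-sided comparison of two group means."""
    z = NormalDist()
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


# Effect size and completer group sizes as reported in the text.
power = approx_power_two_groups(d=0.76, n1=25, n2=33)
print(f"approximate power: {power:.2f}")
```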
The primary categorical outcome variables were analysed using GEE logistic regression models (i.e. logit link function and binomial error distribution) that included treatment (CBT, IPT)×time (post-treatment, 1-year follow-up, long-term follow-up) and the respective main effects as predictors. Least significant difference tests were used for post hoc analyses in case of significant higher-order effects. Consistent with the main outcome report, Reference Wilfley, Welch, Stein, Spurrell, Cohen and Saelens9 the primary outcome included three variables, all determined at post-treatment, 1-year follow-up and long-term follow-up: recovered (i.e. no objective bulimic episodes in the previous month), improved to subclinical binge eating (i.e. <4 objective bulimic episode days in the previous month) and being at or below a comparative level of eating disorder attitudes and behaviours. The latter rating was made based on whether the global EDE-Q score was at or below the global EDE-Q score of overweight treatment-seeking individuals without binge eating disorder who had a similar sociodemographic profile as the patients in the current study (global EDE-Q score, 2.47). Reference Castellini, Lapi, Ravaldi, Vannacci, Rotella and Faravelli17
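The three primary outcome definitions can be expressed as a small classification rule. The sketch below assumes the inputs are the number of objective bulimic episode days in the previous 28 days and the global EDE-Q score; the class and function names are hypothetical.

```python
from dataclasses import dataclass

EDEQ_COMPARATIVE_CUTOFF = 2.47  # comparative global EDE-Q score cited in the text


@dataclass
class PrimaryOutcomes:
    recovered: bool  # no objective bulimic episode days in the previous month
    remitted: bool   # fewer than 4 objective bulimic episode days
    improved: bool   # global EDE-Q score at or below the comparative level


def classify(obe_days_28: int, edeq_global: float) -> PrimaryOutcomes:
    """Apply the three primary outcome definitions to one assessment."""
    if not 0 <= obe_days_28 <= 28:
        raise ValueError("episode days must lie between 0 and 28")
    if not 0 <= edeq_global <= 6:
        raise ValueError("EDE-Q scores range from 0 to 6")
    return PrimaryOutcomes(
        recovered=obe_days_28 == 0,
        remitted=obe_days_28 < 4,
        improved=edeq_global <= EDEQ_COMPARATIVE_CUTOFF,
    )


print(classify(obe_days_28=3, edeq_global=3.0))
```

Under these definitions every recovered patient is also counted as remitted, because zero episode days is below the subclinical threshold.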
The secondary continuous outcome variables were analysed using HLM of treatment (CBT, IPT)×time (pre-treatment, post-treatment, 1-year follow-up, long-term follow-up), with patients nested within time. In this analysis, time and treatment were treated as fixed factors and patients as a random factor. Least significant difference tests were used for post hoc analyses.
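One way to write the model described above is as a random-intercept mixed model. The notation is an assumption introduced here for illustration: Y_ij is the outcome of patient i at time point j (j = 1 for pre-treatment to j = 4 for long-term follow-up), T_i is a treatment indicator (coded 0 for CBT and 1 for IPT), D_jk equals 1 when j = k and 0 otherwise, and pre-treatment serves as the reference category. The report does not state whether random slopes were also fitted, so only a random intercept u_i is shown.

```latex
\begin{aligned}
Y_{ij} &= \beta_0 + \beta_1 T_i
        + \sum_{k=2}^{4} \gamma_k D_{jk}
        + \sum_{k=2}^{4} \delta_k \, T_i D_{jk}
        + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{aligned}
```

In this notation the treatment×time interaction corresponds to the coefficients δ_k, the time effect to γ_k and the treatment effect to β_1.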
To ensure that the results from the primary and secondary intention-to-treat analyses were robust, three sensitivity analyses Reference Sterne, White, Carlin, Spratt, Royston and Kenward18 were conducted in addition, handling missing data as follows: (a) with missing data multiply imputed, creating five completed data-sets via an iterative Markov Chain Monte Carlo method; (b) with last observations carried forward; and (c) with no replacement values for missing data (i.e. completer analysis only including participants who had completed all assessments). Outcome analyses were performed on all sensitivity data-sets for inspection of significance. Results were reported only if different from those of the primary and secondary intention-to-treat analyses described above. Completer effect sizes were computed as Pearson's r (small, r⩾0.10; medium, ⩾0.30; large, ⩾0.50) or Cohen's d (small, d⩾0.20; medium, ⩾0.50; large, ⩾0.80). Reference Cohen19
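The effect size conventions above can be applied mechanically. The following sketch encodes the stated thresholds; the function name and the label "negligible" for values below the small threshold are assumptions, since the text does not name that range.

```python
# Lower bounds for small, medium and large effects, as stated in the text.
THRESHOLDS = {
    "r": (0.10, 0.30, 0.50),
    "d": (0.20, 0.50, 0.80),
}


def classify_effect(value: float, kind: str) -> str:
    """Label an effect size ('r' or 'd') using the conventions above."""
    small, medium, large = THRESHOLDS[kind]
    size = abs(value)  # the sign only indicates direction
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "negligible"  # label assumed here, not used in the text


print(classify_effect(0.26, "r"))
print(classify_effect(0.53, "d"))
print(classify_effect(0.97, "d"))
```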
|Characteristic||Cognitive–behavioural therapy (n = 45)||Interpersonal psychotherapy (n = 45)||F (d.f.) or χ² (n = 90)||P|
|Age, years: mean (s.d.)||45.73 (9.86)||44.02 (10.49)||0.64 (1,88)||0.427|
|Age at onset of disorder, years: mean (s.d.)||17.50 (11.76)||18.50 (10.20)||0.17 (1,78)||0.686|
|Gender, female: n (%)||36 (80.0)||35 (77.8)||0.07||0.796|
|Ethnicity, n (%)|| || ||0.21||0.899|
|White||41 (91.1)||42 (93.3)|| || |
|African American||3 (6.7)||2 (4.4)|| || |
|Hispanic||1 (2.2)||1 (2.2)|| || |
|Comorbid psychiatric diagnoses, n (%)|| || || || |
|Any Axis I disorder, current||19 (42.2)||18 (40.0)||0.05||0.830|
|Any Axis II disorder||4 (8.9)||5 (11.1)||0.12||0.725|
All analyses were performed with PASW 18.0 for Windows. A two-tailed significance level of α<0.05 was applied to all statistical tests. In order to avoid α inflation by multiple testing, the significance level was adjusted to a two-tailed α<0.01 for post hoc tests.
Randomisation, attrition and sampling
Patients randomised to CBT v. IPT did not differ with regard to sociodemographic characteristics (Table 1) or any pre-treatment primary or secondary outcome variable (all P>0.05). There were no significant differences in drop-out (CBT: 3 (6.7%); IPT: 4 (8.9%)) or adherence to treatment (CBT: 16.8 (s.d. = 3.0); IPT: 17.7 (s.d. = 3.9) sessions; all P>0.05).
The 90 patients who were included in the long-term follow-up assessment did not significantly differ from those who were not included in the follow-up assessment (n = 72) on pre-treatment characteristics, drop-out and adherence, and no interaction effects with treatment condition were observed (sample (included, not-included)×treatment (CBT, IPT); all P>0.05). The 58 patients who completed the long-term follow-up assessment did not significantly differ from the 32 assessment non-completers on sociodemographic characteristics, drop-out, adherence, missing data, or primary and secondary outcomes at pre-treatment, post-treatment, or 1-year follow-up, and no interaction effects with treatment condition occurred (assessment (completed, not-completed)× treatment (CBT, IPT); all P>0.05).
Long-term recovery rates were 52.0% for the CBT group and 76.7% for the IPT group (Table 2, Fig. 2). The GEE logistic regression analysis showed a significant treatment×time effect on abstinence from binge eating (P<0.001). Post hoc comparisons by time point did not reveal any significant between-treatment differences (all P>0.01). However, for CBT, a significant decline in recovery rates from post-treatment and 1-year follow-up to long-term follow-up was observed (both P⩽0.002), whereas for IPT, abstinence rates did not change over the follow-up period (both P>0.01).
Long-term rates of remission to subclinical binge eating were 72.0% for the CBT group and 83.9% for the IPT group. The GEE logistic regression analysis of treatment×time did not show any significant effects on remission (all P>0.05). Long-term rates of improvement to a comparative level of eating disorder attitudes and behaviours (i.e. global EDE-Q score ⩽2.47) were 54.5% for the CBT group and 61.5% for the IPT group. The GEE logistic regression analysis showed a significant time effect on improvement (P = 0.049), but post hoc analyses did not reveal any change over the follow-up period (all P>0.01).
For all primary outcome variables, the sensitivity analysis revealed the same results with no replacement of missing data and with last observation carried forward. Multiple imputation results were largely consistent, with two exceptions: there was a significant time effect for remission and a significant treatment effect for clinically significant improvement (both P<0.05). Between-treatment effect sizes at long-term follow-up were small (recovery, r = 0.26; remission, r = 0.14; improvement, r = 0.07).
Descriptive completer analyses revealed that 12 (27.3%) CBT patients and 10 (22.2%) IPT patients showed persistent recovery across all three follow-up assessments. From 1-year follow-up to long-term follow-up: 13 (52.0%) patients in the CBT and 13 (43.3%) in the IPT group maintained abstinence; no one (0.0%) in the CBT and 10 patients (33.3%) in the IPT group achieved abstinence; 6 (24.0%) patients in the CBT and 3 (10.0%) in the IPT group relapsed; and 6 (24.0%) patients in the CBT and 4 (13.3%) in the IPT group remained non-abstinent from objective bulimic episodes.
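The transition counts above can be tallied to recover the abstinence rates at both time points. This is a descriptive sketch based only on the counts reported for patients assessed at 1-year and long-term follow-up; the dictionary keys and function name are illustrative.

```python
# Transitions from 1-year to long-term follow-up, as reported in the text.
transitions = {
    "CBT": {"maintained": 13, "newly_abstinent": 0,
            "relapsed": 6, "remained_non_abstinent": 6},
    "IPT": {"maintained": 13, "newly_abstinent": 10,
            "relapsed": 3, "remained_non_abstinent": 4},
}


def summarise(counts):
    """Return group size and abstinence rates (%) implied by the transitions."""
    n = sum(counts.values())
    # Abstinent at 1 year: those who stayed abstinent plus those who later relapsed.
    abstinent_1y = counts["maintained"] + counts["relapsed"]
    # Abstinent at long-term follow-up: maintained plus newly abstinent.
    abstinent_lt = counts["maintained"] + counts["newly_abstinent"]
    return {
        "n": n,
        "abstinent_1y_pct": round(100 * abstinent_1y / n, 1),
        "abstinent_lt_pct": round(100 * abstinent_lt / n, 1),
    }


summary = {group: summarise(counts) for group, counts in transitions.items()}
for group, stats in summary.items():
    print(group, stats)
```

The long-term rates obtained this way match the recovery rates reported for the CBT and IPT groups.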
At long-term follow-up, 3 (12.0%) individuals in the CBT group and 3 (9.4%) in the IPT group were diagnosed with binge eating disorder. None of the participants were diagnosed with anorexia nervosa, bulimia nervosa or eating disorder not otherwise specified, and none reported purging behaviour, fasting or intense exercising.
The HLM analyses of treatment×time revealed significant time effects for all secondary outcome variables (all P<0.001), except BMI (P = 0.433; online Table DS1). Post hoc analyses showed significant improvements in the number of days with objective bulimic episodes and EDE shape/weight concern, all EDE-Q indicators (except restraint) and depression at post-treatment, 1-year follow-up and long-term follow-up when compared with pre-treatment (all P<0.01). In contrast, the significant improvements of EDE-Q restraint and BSI anxiety present at post-treatment and 1-year follow-up when compared with pre-treatment (both P<0.01) were no longer present at long-term follow-up (P>0.01). Concerning the course over the follow-up period, the number of days with objective bulimic episodes and anxiety increased from post-treatment or 1-year follow-up to long-term follow-up (both P<0.01), whereas the reduced levels of EDE shape/weight concern, all EDE-Q indicators and depression were maintained from post-treatment or 1-year follow-up to long-term follow-up (all P>0.01).
The course of EDE-Q eating concern, shape concern and the global eating disorder psychopathology, and of EDE shape/weight concern showed significant interactions with treatment (all P⩽0.03): for the CBT group, EDE-Q eating concern and EDE shape/weight concern worsened from 1-year follow-up to long-term follow-up (all post hoc P<0.01), whereas for IPT, there was an improvement in EDE-Q eating concern, shape concern and global eating disorder psychopathology from post-treatment to long-term follow-up (P<0.01).
The sensitivity analysis revealed largely consistent results with no replacement of missing data, last observations carried forward and multiple imputation. The only difference was that for EDE-Q shape concern, the interaction effect was no longer significant with multiple imputation (P>0.05). Most effect sizes between pre-treatment and long-term follow-up for eating disorder psychopathology were large (0.97⩽d⩽2.10). Depression yielded a medium effect size (d = 0.53), and EDE-Q restraint, anxiety and BMI yielded small effect sizes (d⩽0.37). Between-treatment effect sizes for all secondary outcomes at long-term follow-up were small (0.18⩽d⩽0.49).
|Outcome||Treatment: χ² (d.f.)||Treatment: P||Time: χ² (d.f.)||Time: P||Treatment×time: χ² (d.f.)||Treatment×time: P|
|Recovered: no objective bulimic episodes in the previous month||0.68 (1)||0.411||2.38 (2)||0.304||15.85 (2)||<0.001|
|Remitted: fewer than 4 objective bulimic episode days in the previous month||0.64 (1)||0.424||5.14 (2)||0.077||3.67 (2)||0.160|
|Improved: being at or lower than comparative EDE-Q global score||3.05 (1)||0.081||6.03 (2)||0.049||3.52 (2)||0.172|
EDE-Q, Eating Disorder Examination Questionnaire.
a. Numbers presented are intention-to-treat rates of primary outcomes and results of generalised estimating equations logistic regression of treatment (cognitive–behavioural therapy, interpersonal psychotherapy)×time (post-treatment, 1-year follow-up, long-term follow-up). Objective bulimic episode assessed through the Eating Disorder Examination.
Descriptive completer analyses on healthcare utilisation showed that between 1-year follow-up and long-term follow-up, 20 (80.0%) patients in the CBT group and 27 (84.4%) in the IPT group had received treatment for eating or weight problems (psychotherapy: CBT group 11 (44.0%), IPT group 12 (37.5%); pharmacotherapy: CBT group 13 (52.0%), IPT group 15 (46.9%); consultation, for example with a dietician: CBT group 8 (32.0%), IPT group 16 (50.0%); alternative treatment, for example hypnosis: CBT group 7 (28.0%), IPT group 9 (28.1%)). Healthcare utilisation was only weakly associated with treatment condition (all |r|⩽0.18).
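The size of these associations can be approximated from the reported counts. The sketch below assumes that the reported r corresponds to a phi coefficient for a 2×2 table of treatment group by service use, and that the group sizes are 25 (CBT) and 32 (IPT), as inferred from the reported percentages; neither assumption is stated in the text.

```python
from math import sqrt

N_CBT, N_IPT = 25, 32  # group sizes inferred from the reported percentages

# Number of patients using each service: (CBT, IPT), as reported in the text.
users = {
    "any treatment": (20, 27),
    "psychotherapy": (11, 12),
    "pharmacotherapy": (13, 15),
    "consultation": (8, 16),
    "alternative treatment": (7, 9),
}


def phi(yes_a, n_a, yes_b, n_b):
    """Phi coefficient for a 2x2 table of group (a, b) by use (yes, no)."""
    a, b = yes_a, n_a - yes_a
    c, d = yes_b, n_b - yes_b
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


phis = {name: phi(cbt, N_CBT, ipt, N_IPT) for name, (cbt, ipt) in users.items()}
for name, value in phis.items():
    # With this layout, negative values mean more frequent use in the IPT group.
    print(f"{name}: phi = {value:.2f}")
```

Under these assumptions the largest absolute value arises for consultation and rounds to 0.18, in line with the reported bound.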
The current long-term follow-up study documented a substantial and long-lasting efficacy of both CBT and IPT for binge eating disorder, with full recovery from binge eating in 64.4% of patients. These data are consistent with 2-year follow-up recovery rates found in other clinical trials for binge eating disorder. Reference Wilson, Wilfley, Agras and Bryson10,Reference Devlin, Goldfein, Petkova, Liu and Walsh11 Both CBT and IPT yielded comparable long-term rates of remission to a subclinical level of binge eating in 78.0% of patients and of clinically significant improvement of the associated eating disorder psychopathology in 58.0%. The persistence of improvements was also reflected in low rates of binge eating disorder and in the absence of compensatory behaviours or of any other eating disorder, as determined by the abbreviated EDE. In addition, most secondary outcomes of the associated eating disorder and general psychopathology showed significant and large improvements when compared with pre-treatment levels, although single outcomes, including objective bulimic episode days (the key symptom of binge eating disorder), restraint, and anxiety, showed tendencies toward relapse. Despite the overall favourable long-term outcome and in light of high rates of additional treatment-seeking following CBT and IPT, careful monitoring of these symptoms appears essential for early identification of the subset of patients showing recurrence of eating disorder symptoms over time.
There was some indication that CBT and IPT involved a differential time course over the follow-up period, although treatments did not differ in recovery rates at any time point. Abstinence from binge eating was stable over the follow-up period in the IPT group, whereas there was a significant tendency to relapse among patients in the CBT group. Concomitantly, reduction of eating disorder psychopathology in the IPT group was better maintained or further improved over the follow-up period, whereas for the CBT group psychopathology worsened from 1-year follow-up to long-term follow-up. This differential time course is similar to the ‘catching up’ effect of IPT that has previously been found in bulimia nervosa treatment. Reference Agras, Crow, Halmi, Mitchell, Wilson and Kraemer20,Reference Fairburn, Jones, Peveler and Carr21 In two studies, IPT emerged as inferior to CBT at the end of treatment, but IPT patients showed continued improvement at 1-year follow-up, levelling off between-treatment differences. Based on the underlying theories of CBT and IPT, one might speculate that the focus on improving interpersonal relationships prepares individuals more comprehensively for the social challenges of daily life than the more rapidly acting, more eating disorder-focused CBT treatment, which reaches its efficacy earlier. Further research is warranted to clarify mechanisms of action of CBT versus IPT for binge eating disorder.
Body mass index
Body mass index was stable throughout the follow-up period, suggesting that the course of weight gain that is characteristic of treatment-seeking individuals with binge eating disorder Reference Barnes, Blomquist and Grilo22 has sustainably been interrupted. Given a tendency among adults to gain 400 g of weight per year, Reference Rosell, Appleby, Spencer and Key23 a stabilisation of body weight as documented in the current study may have led to a small amount of decreased weight gain of approximately 1600 g over the assessment period. Stabilisation of body weight as documented in the current study is a positive finding, in and of itself, as it is one of the priorities of obesity prevention. Reference Rosell, Appleby, Spencer and Key23
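The estimate of avoided weight gain is simple arithmetic. The sketch below reproduces it under the assumption of a constant gain of 400 g per year; the function name is illustrative. Using the mean follow-up interval of 46.0 months instead of a rounded 4 years gives a slightly smaller figure.

```python
EXPECTED_GAIN_G_PER_YEAR = 400  # typical adult weight gain cited in the text


def expected_gain_g(months: float) -> float:
    """Expected weight gain in grams over a period, assuming a constant yearly rate."""
    return EXPECTED_GAIN_G_PER_YEAR * months / 12


print(expected_gain_g(48))           # rounded 4-year period used in the text
print(round(expected_gain_g(46.0)))  # mean follow-up interval of 46.0 months
```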
Strengths of this study include the conduct of an additional long-term follow-up addressing a fairly large subsample of patients (55.6%) of a well-controlled clinical trial on binge eating disorder. Although the long-term assessment was not pre-planned and was unfunded, the overall participation rate of 64.4% was adequate for this study's sample (corresponding to 35.8% of the initial trial sample, part of which, however, was not considered for participation in this study).
Regarding generalisability of findings, several aspects support the robustness of the results. No measurable biases existed in sample selection and assessment completion regarding sociodemographic characteristics, treatment drop-out, adherence and any outcome from pre-treatment to 1-year follow-up. In order to account for missing data, intention-to-treat analyses allowing data from assessment non-completers to remain in the analyses were conducted. These intention-to-treat analyses showed the same pattern of results as the analyses that handled missing data differently. Reference Sterne, White, Carlin, Spratt, Royston and Kenward18 The current completer sample size generated sufficient power to detect a medium-to-large effect size of treatment difference at long-term follow-up; effect sizes were reported in order to convey smaller differences. Of the patients who could be contacted for this study, 75.3% completed the long-term assessment. Likely related to the longer time interval, assessment completion rates declined gradually from post-treatment to 1-year follow-up to long-term follow-up.
A further limitation of this study is that although we collected data on interim treatments, their effects could not be separated from the long-term effects of CBT and IPT themselves. Other intervening factors (for example life events) were not controlled for. Finally, the study sample was mostly female and White, which limits generalisation of results to men and to other ethnic groups.
This report on the long-term efficacy of two major out-patient treatments for binge eating disorder suggests that IPT is a viable treatment alternative to standard CBT. Both treatments yielded high rates of treatment response and long-term maintenance of therapeutic gains. To bolster this study's findings, replication of long-term efficacy in a larger sample and over a longer time period is warranted. Alternative evidence-based treatment options, such as more individualised or comprehensive treatment, or extended or additional treatment, Reference Mitchell, Agras, Crow, Halmi, Fairburn and Bryson24 should be considered for individuals with poor initial treatment response to further optimise long-term treatment effects of CBT and IPT.
This research was supported by grants , and from the National Institutes of Health (NIH). A.H. was supported by grant from the German Federal Ministry of Education and Research; R.I.S. was partly supported by a KL2 Career Development Award and the Clinical and Translational Science Award at Washington University School of Medicine (NIH grants and ).