Introduction
Hearing voices occurs when an individual perceives a voice or voices in the absence of an external stimulus (Beck and Rector, 2003). This experience has traditionally been associated with psychosis, with a lifetime prevalence rate of about 64–80% (Landmark et al., 1990; McCarthy-Jones et al., 2017). However, there is increasing recognition that voice hearing occurs on a continuum, having been reported across clinical and non-clinical populations, with prevalence estimates of about 5–15% across the general population internationally (Beavan et al., 2011; Maijer et al., 2018; Waters and Fernyhough, 2017). Although these experiences can be benign and even helpful for some individuals (Miller et al., 1993; Sanjuan et al., 2004), they can be associated with significant distress for some hearers (Sorrell et al., 2010).
Cognitive behavioural therapy for psychosis (CBTp) has an established evidence base as a psychological intervention for psychotic experiences (including distressing voices), and a minimum of 16 sessions is currently recommended by the National Institute for Health and Care Excellence (2014). However, implementation rates of CBTp are low (Royal College of Psychiatrists, 2018), and its multi-component nature has led to some criticisms surrounding difficulties in identifying what works for whom (Peters, 2014; Thomas et al., 2014). To this end, the application of brief CBT-informed interventions targeted specifically at distressing voices has been proffered as a viable way to increase access to CBT where resources are constrained (Hayward et al., 2021; Hazell et al., 2016). These targeted interventions have generated greater benefits relative to generic CBT (i.e. effect sizes above d=0.4 reported for at least one primary outcome at post-therapy; Lincoln and Peters, 2019), albeit within a small number of trials, and offer increased opportunities to identify causal components of change (Birchwood and Trower, 2006; Lincoln and Peters, 2019; Loizou et al., 2022).
However, there still exists heterogeneity in outcomes for individuals receiving targeted CBT for voices (CBTv), and rates of drop-out require investigation (Paulik et al., 2020). To promote the directed provision of interventions to those most likely to benefit, it is necessary to identify the factors driving individual differences in engagement with, and response to, these interventions. Drop-out from therapy can have a negative impact at service-user, service-provider and healthcare-system levels. At higher levels, the allocation of limited resources to those who drop out negatively impacts on the efficiency and cost-effectiveness of service providers and healthcare systems (Cooper et al., 2018; Hunsley, 2003). Service users who drop out experience poorer outcomes than those who complete therapy, irrespective of therapy modality (Saxon et al., 2017), and may be less likely to return for future therapeutic interventions (Buchholz et al., 2017).
There have been few investigations into predictors of engagement and outcome with CBTv interventions specifically, and these have yielded variable results. Jones et al. (2021) assessed the predictors of engagement with a group mindfulness-based intervention for distressing voices and found that those with higher levels of depression at baseline were less likely to commence or complete the intervention, while those with greater recovery and lower levels of depression at baseline experienced greater improvement in voice-related distress. Paulik et al. (2018) used similar measures to assess the predictors of engagement with an individually delivered brief intervention focused on the enhancement of coping strategies. They found that individuals either particularly high or low in voice-related distress were less likely to commence or complete the intervention, respectively. With regard to outcomes, Thomas et al. (2011) found that better outcomes from individually delivered interventions were associated with shorter duration of symptoms, fewer negative symptoms, and younger age. Paulik et al. (2018) found that higher levels of anxiety, stress, and depression at baseline were associated with poorer outcomes, particularly if higher voice-related distress was also reported at baseline. They referred to individuals as ‘high need’ if they reported high levels of voice-related distress and high levels of anxiety and/or stress and/or depression at baseline.
This study explored the possible predictors of engagement and outcome for a transdiagnostic cohort of service users receiving Guided self-help cognitive behaviour intervention for VoicEs (GiVE), a brief, manualised CBTv intervention developed to address factors which are involved in the maintenance of voice-related distress. Recent randomised controlled trials (RCTs) observed medium-to-large between-group effects of GiVE on voice-related distress, anxiety, wellbeing, recovery and self-esteem for service users when delivered by therapists with varying levels of training (Hayward et al., 2021; Hazell et al., 2018a). However, this is the first study to investigate the outcomes of the GiVE intervention in a naturalistic setting and to explore predictors of engagement and outcome. Specifically, this study investigated the following questions:
(1) What outcomes do service users experience from the GiVE intervention, and what are engagement levels with the GiVE intervention in naturalistic settings?
(2) Do levels of anxiety, depression, voice severity, voice-related distress and recovery pre-intervention, and service user demographic characteristics predict engagement with the intervention?
(3) Do levels of anxiety, depression, voice severity and recovery pre-intervention, and service user demographic characteristics predict levels of voice-related distress post-intervention?
(4) Do levels of anxiety, depression, voice severity, and voice-related distress pre-intervention, and service user demographic characteristics predict levels of recovery post-intervention?
Method
Design and setting
This study utilised a quasi-experimental approach to explore and analyse potential predictors of engagement and outcome for service users offered a course of GiVE. It utilised secondary data of pre–post intervention measures for service users who attended the Sussex Voices Clinic (SVC) and were offered the intervention. The SVC is a secondary care service within the Sussex Partnership NHS Foundation Trust (SPFT) which delivers CBT-informed interventions to service users distressed by hearing voices, irrespective of diagnosis. Participants continued to receive their usual treatment throughout the intervention. Assessments were conducted upon referral to the SVC and after the intervention by clinic assistants who were not otherwise involved in the delivery of the intervention. The intervention was delivered by therapists who worked clinically within SPFT and had completed a one-day training course for the delivery of the GiVE intervention within the SVC.
Ethical considerations
NHS Research Ethics Committee approval was not required for the study because it was completed as a service evaluation of routine practice within a clinical service (Department of Health, 2017). This study received ethical approval from the School of Psychology at the University of Sussex (ER/SB2064/1), and the Quality Improvement Support Team (QIST) and Research Governance Team at SPFT. All participants provided informed consent for data collection, storage and use for research purposes. All data were anonymised by clinic assistants at the SVC prior to receipt of the data by the first author for analysis. All authors abided by the Ethical Principles of Psychologists and Code of Conduct as set out by the British Association of Counselling and Psychotherapy (BACP) and British Psychological Society (BPS).
Participants
The study population included a transdiagnostic cohort of 142 service users who were referred to the SVC, attended a baseline assessment and were offered GiVE. Inclusion criteria were: aged 18 years or over; in receipt of care within secondary care mental health services; and distressed by hearing voices (indicated by a score of 8 or more on the Hamilton Program for Schizophrenia Voices Questionnaire-Emotional Subscale (HPSVQ-ES; Van Lieshout and Goldberg, 2007)). There were no formal exclusion criteria; however, the ability of service users to engage with the baseline assessment (and subsequent discussions with the referrer) informed clinical decisions. Diagnoses were categorised as Psychosis; Emotionally Unstable Personality Disorder (EUPD); or Other. Participants with any psychosis diagnosis (e.g. schizophrenia, psychosis, or psychotic depression) were grouped under Psychosis. The Other category included a range of conditions, participants with no formal diagnosis, or individuals with multiple co-morbid diagnoses. A full list of diagnoses is not presented here for brevity.
Measures
Primary Measures
Hamilton Program for Schizophrenia Voices Questionnaire (HPSVQ; Van Lieshout and Goldberg, 2007). The HPSVQ is a 9-item self-report measure of voice hearing. Items are scored on a 5-point rating scale, with higher scores indicating increased severity. The HPSVQ has excellent test–retest reliability, sensitivity to change, good internal consistency (Cronbach’s alpha=0.74–0.85), and convergent validity with the Psychotic Symptom Rating Scales-Auditory Hallucination Subscale (PSYRATS-AH) (Berry et al., 2021; Kim et al., 2010). It consists of two subscales – the Physical Subscale (HPSVQ-PS) examines the physical voice characteristics, and the Emotional Subscale (HPSVQ-ES) examines voice-related distress.
Choice of Outcome in Cbt for psychoses – Short form (CHOICE-SF; Webb et al., 2021). The CHOICE-SF is a shortened version of the service-user defined self-report measure of recovery developed by Greenwood et al. (2010). The measure was co-created with experts-by-experience to comprise a measure of patient-reported ‘psychological recovery’ and is aligned with treatment targets of CBTp. It consists of 11 items which are scored on an 11-point Likert scale (0=worst, 10=best), and one additional personal goal item. The measure has good test–retest reliability (Cronbach’s alpha=0.93), high sensitivity to change, and good construct validity (Webb et al., 2021).
Secondary measures
General Anxiety Disorder Scale (GAD-7; Spitzer et al., 2006). The GAD-7 is a 7-item self-report measure of anxiety symptoms and associated severity over the previous 2 weeks. The measure has been indicated to have good reliability, excellent internal consistency (Cronbach’s alpha=0.92), as well as good construct, criterion and factorial validity (Spitzer et al., 2006). It has been validated in both clinical and general populations (Löwe et al., 2008).
Patient Health Questionnaire (PHQ-9; Kroenke et al., 2001). The PHQ-9 is a 9-item self-report questionnaire measuring depression symptom severity over the previous 2 weeks. The measure displays good psychometric properties, excellent internal reliability (Cronbach’s alpha=0.89), test–retest reliability (intraclass correlation=0.84), good construct and criterion validity and sensitivity to change (Kroenke et al., 2001). The measure has displayed validity in both clinical (Beard et al., 2016) and non-clinical samples (Martin et al., 2006).
Intervention
Guided self-help cognitive behaviour intervention for VoicEs (GiVE)
GiVE is a brief, targeted form of CBT for distressing voices. The intervention focuses on five modules implemented over eight 1-hour sessions, held weekly where possible. It was developed from the ‘Overcoming Distressing Voices’ CBT self-help book (Hayward et al., 2018). Intervention delivery within session was supported through a published workbook (Hazell et al., 2018b), and a mobile app. The five modules comprising the intervention are listed below:
(1) Coping – an exploration of triggers and responses to voices (1 session);
(2) Me – identifying and evaluating negative beliefs about the self (2 sessions);
(3) My Voices – identifying and addressing unhelpful beliefs about voices (2 sessions);
(4) My Relationships – promoting assertiveness in difficult relationships with voices and others (2 sessions);
(5) Looking to the Future – supporting the continued application of the skills learned in previous modules (1 session).
Service users were invited to express a preference for the method of intervention delivery – face-to-face, videocall or telephone. Intervention completion was defined as attending at least six of the eight sessions. Service users were encouraged to engage with the self-help materials between sessions as the intervention proceeded.
Intervention delivery
Therapists who delivered the intervention had a variety of roles (e.g. support workers, graduate mental health workers, nurses and clinical psychologists) and had varying levels of therapy training/experience (39% had formal training in CBT; 58% had no formal therapy training; data were missing for 3% of therapists). The intervention was delivered to service users either face-to-face (56%), via telephone call (28%), videocall (12%), or through a mixture of modalities (4%).
Statistical analysis
Clinical and demographic characteristics
Descriptive demographic and clinical characteristic statistics were calculated, including age, gender, employment status, relationship status, ethnic group, level of education, length of voice hearing, and diagnosis, with count (n), and percentage of total provided for categorical variables, and mean and standard deviation (SD) provided for continuous variables. Diagnoses were clinician-reported and confirmed by a review of clinical records, where necessary.
Pre–post outcome measures
Mean pre- and post-intervention scores on primary and secondary measures were calculated, including standard deviations, minimum and maximum values, and 95% confidence intervals (CIs) for each. A comparison of pre- and post-intervention scores on primary and secondary measures was carried out using paired sample t-tests to determine whether statistically significant changes (i.e. p-values <0.05) occurred following the intervention (William, 1908).
Potential predictors and classification of need
All clinical assessment measures and demographic characteristics were utilised as potential predictors due to evidence indicating potential impact on treatment outcomes and engagement (Ajnakina et al., 2021; Freeman and Garety, 2003; Hayward et al., 2016; O’Keeffe et al., 2017; Waters and Fernyhough, 2017). Consistent with the approach employed in Paulik et al. (2018), participants who scored highly for voice-related distress (13.14 or above on HPSVQ-ES), and also depression (19.33 or above on PHQ-9), or anxiety (16.01 or above on GAD-7) were classed as high need, and all other participants were categorised as low need. This distinction was defined a posteriori based on mean scores on each measure within the sample, where scores above the mean indicated high scores and those below indicated lower scores. For all analyses, to increase the predictive power of the test, categorical variables were re-categorised into binary variables (e.g. employment status: employed/not-employed).
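The high-need rule described above can be sketched in a few lines. This is a minimal illustration only: the cut-offs are the sample means reported in the text, while the function and argument names are hypothetical. The text does not say how missing scores were handled, so treating a missing score as not meeting the cut-off is an assumption made here.

```python
from typing import Optional

# Sample-mean cut-offs as reported in the text.
HPSVQ_ES_CUTOFF = 13.14
PHQ9_CUTOFF = 19.33
GAD7_CUTOFF = 16.01


def classify_need(hpsvq_es: Optional[float],
                  phq9: Optional[float],
                  gad7: Optional[float]) -> str:
    """Return 'high' or 'low' need (hypothetical helper, not the authors' code)."""

    def at_or_above(score: Optional[float], cutoff: float) -> bool:
        # Assumption: a missing score is treated as not meeting the cut-off.
        return score is not None and score >= cutoff

    high_distress = at_or_above(hpsvq_es, HPSVQ_ES_CUTOFF)
    high_depression_or_anxiety = (at_or_above(phq9, PHQ9_CUTOFF)
                                  or at_or_above(gad7, GAD7_CUTOFF))
    # High need requires high voice-related distress AND
    # high depression and/or high anxiety.
    return "high" if high_distress and high_depression_or_anxiety else "low"
```

For example, a service user scoring 14 on the HPSVQ-ES and 20 on the PHQ-9 would be classed as high need under this sketch, whereas the same HPSVQ-ES score with PHQ-9 and GAD-7 scores of 10 would be classed as low need.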
Prediction of engagement
Service users were classified as non-commencers (offered GiVE but did not attend any sessions), non-completers (attended 1–5 sessions), and completers (attended 6–8 sessions). Consistent with the approach applied in Paulik et al. (2018) and Jones et al. (2021), models were built to compare completers, non-completers, and non-commencers via ANOVA for continuous variables, as application of Levene’s (1960) test revealed that the assumption of homogeneity of variances was not met for some measures. Chi-square tests were applied to compare levels of engagement across categorical variables.
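The three engagement levels can be expressed as a simple mapping from the number of sessions attended. The sketch below uses the session thresholds given in the text; the function name and the range check are illustrative assumptions.

```python
def classify_engagement(sessions_attended: int) -> str:
    """Map sessions attended (0-8) to an engagement level (hypothetical helper)."""
    if not 0 <= sessions_attended <= 8:
        # GiVE comprises eight sessions; other values are treated as data errors here.
        raise ValueError("sessions_attended must be between 0 and 8")
    if sessions_attended == 0:
        return "non-commencer"
    if sessions_attended <= 5:
        return "non-completer"
    return "completer"
```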
Where results were statistically significant, post hoc analyses were applied to help interpret the findings, exploring the pattern of association between each level of engagement and the predictor variable. This was carried out to identify whether engagement could be predicted for various levels of the predictor variable, and thus inform clinical decision making. Multinomial regression models were applied here. Continuous variables were converted to categorical variables based on an initially agreed grouping. The total number of participants in each category was calculated to ensure categories were relatively balanced in size. This was implemented to help ensure that results from measures where the sample was skewed would lead to more meaningful and clinically informative comparisons across levels, increased reliability of the model, and reduced bias. In this model, engagement served as the dependent variable, while the statistically significant predictor was the independent variable. Odds ratios were calculated to illustrate results.
Prediction of outcome
Outcome was predicted using data from completers only, as follow-up data were not available for those who disengaged from the service. Separate linear regression models for each primary outcome measure (with post-intervention scores for voice-related distress or recovery as dependent variables) were built for each potential predictor. Independent variables were the corresponding primary outcome baseline score, the potential predictor (either a baseline clinical measure or demographic characteristic) and covariates. All regression coefficients are based on raw (unstandardised) scores to preserve interpretability. As such, regression coefficients represent the expected unit change for the outcome variable per one-unit increase in the predictor. For example, in models predicting HPSVQ-ES post-intervention (maximum score 16), a coefficient of 1 indicates that a 1-point increase in the predictor is associated with an expected 1-point increase on the HPSVQ-ES, reflecting an increase in voice-related distress.
Effect size, treatment response rate, and minimal clinically important change
Consistent with the approach employed in Jones et al. (2021), effect sizes were reported as Cohen’s d (Cohen, 1988), and treatment response rate (TRR) was calculated for each completer for each primary and secondary measure.
TRR was calculated through categorising service users as unchanged, deteriorated, minimally improved, or much improved, where minimal improvement was at least 20% change in a favourable direction over baseline, and much improvement corresponded to a 50% or greater change in a favourable direction from baseline scores (Bighelli et al., 2018). Those who improved or deteriorated less than 20% from their original score were counted as unchanged. Favourable direction involved decreases on all measures with the exception of the CHOICE-SF, where increased scores indicate favourable change.
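The TRR categories can be illustrated with a short function that computes percentage change relative to baseline. This is a sketch under stated assumptions rather than the authors' procedure: the function name is hypothetical, and the handling of a zero baseline (where percentage change is undefined) is not described in the text, so the sketch simply raises an error in that case.

```python
def treatment_response(baseline: float, post: float,
                       higher_is_better: bool = False) -> str:
    """Categorise pre-post change using the 20% and 50% thresholds (hypothetical helper)."""
    if baseline == 0:
        # Assumption: percentage change cannot be computed from a zero baseline.
        raise ValueError("percentage change is undefined for a zero baseline")
    change = (post - baseline) / baseline
    if not higher_is_better:
        # On all measures except the CHOICE-SF, a decrease is the favourable direction.
        change = -change
    if change >= 0.5:
        return "much improved"
    if change >= 0.2:
        return "minimally improved"
    if change <= -0.2:
        return "deteriorated"
    return "unchanged"
```

For the CHOICE-SF, the function would be called with higher_is_better=True, so that a rise from 4 to 6 (a 50% increase) counts as much improved.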
To serve as a further indicator of clinical change, minimal clinically important difference (MCID) was calculated for the two primary outcome measures (HPSVQ-ES and CHOICE-SF). MCID was defined a priori, for voice-related distress as 2 points below baseline assessment scores on the HPSVQ-ES (Hazell et al., 2018a), and for recovery as 1.45 increase from baseline on the CHOICE-SF (Jolley et al., 2015).
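The two MCID thresholds can be checked as follows. The sketch assumes that a change exactly equal to the threshold counts as meeting it (the text does not state how ties were treated), and the function names are hypothetical.

```python
HPSVQ_ES_MCID = 2.0    # points of reduction from baseline, as given in the text
CHOICE_SF_MCID = 1.45  # points of increase from baseline, as given in the text


def meets_mcid_hpsvq_es(baseline: float, post: float) -> bool:
    """True if voice-related distress fell by at least the MCID (assumed inclusive)."""
    return baseline - post >= HPSVQ_ES_MCID


def meets_mcid_choice_sf(baseline: float, post: float) -> bool:
    """True if recovery rose by at least the MCID (assumed inclusive)."""
    return post - baseline >= CHOICE_SF_MCID
```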
Missing data
Missing data were quantified and analysed, and those assumed to be missing at random were treated using multiple imputation by chained equations (MICE; Van Buuren and Groothuis-Oudshoorn, 2011; Van Buuren, 2018). If missing data amounted to 5–40% of the total dataset, MICE was deemed appropriate (Newman, 2014; Schafer, 1999). For the imputation stage, consistent with recommendations (Collins et al., 2001; Jakobsen et al., 2017; White et al., 2011), all variables which reasonably predicted missingness (auxiliary variables) or the outcome, and all covariates used in the analysis model which had a minimum correlation of r=0.1 with missingness were used for imputation (Van Buuren and Groothuis-Oudshoorn, 2011). Auxiliary variables were identified using logistic regression models with missingness as the outcome variable and potential continuous auxiliary variables set as predictors. Potential categorical auxiliary variables were identified using a chi-square test. The number of imputed datasets, m, was selected according to the total percentage of missingness in the dataset. Once the imputed datasets were obtained, these were used to fit the same models as with the complete case dataset. Model imputations were then pooled into a single entity, i.e. m parameter estimates were pooled into one estimate for each variable and the variance estimated (Rubin, 1987; Van Buuren, 2018).
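The pooling step can be illustrated with the standard form of Rubin's rules: the pooled estimate is the mean of the m estimates, and the total variance combines the average within-imputation variance with the between-imputation variance inflated by (1 + 1/m). The sketch below is a generic illustration with a hypothetical function name; it is not taken from the software used in the study, which the text does not specify.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool m estimates and their squared standard errors using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates, each with a variance")
    pooled_estimate = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance of the estimates (m - 1 denominator)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance
```

The pooled standard error is the square root of the returned total variance.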
Results
Service user characteristics
The sample consisted of 142 service users who were assessed within SVC between January 2017 and September 2019 and offered the GiVE intervention: 34 (24%) did not attend any sessions (non-commencers); 34 (24%) began the intervention but did not complete it (non-completers); and 74 (52%) completed the intervention (completers) (see Fig. 1 for the flow of participants through the study). Service users had a mean age of 38.2 years, had been hearing voices for an average of 16.9 years, and experienced voice onset at a mean of 20.8 years. Approximately half of the participants had a diagnosis of psychosis. A majority (53%) of the sample identified as female, were unemployed or unable to work (51%), single (59%), and had left education before or at the age of 18 (57%). A large majority (91%) identified as White British, consistent with the ethnicity profile of the local population in Sussex. This demographic profile is typical for a trans-diagnostic cohort of service users distressed by hearing voices (see Table 1 for full details of the demographic and clinical characteristics of the sample).

Figure 1. CONSORT flow diagram of service user engagement with GiVE intervention.
Table 1. Service user demographic and clinical characteristics

All categorical variables are provided as n (%), with all percentages provided as total valid cases within that variable. All continuous variables presented as mean (SD). Missing data for each variable include: age (n=1), diagnosis (n=4), gender (n=2), employment status (n=3), marital status (n=2), ethnic group (n=2), education (n=4), age of onset (n=7), duration of voice hearing (n=7).
Missing data
Missing data information for clinical characteristics are provided in the footnote of Table 1. For each clinical assessment measure at baseline, data were missing for (n and (%)) HPSVQ-ES: 1 (1.4%), HPSVQ-PS: 4 (5.4%), CHOICE-SF: 10 (13.5%), CHOICE-SF Goal: 4 (13.7%), GAD-7: 10 (13.5%), PHQ-9: 10 (13.5%). At outcome, missing cases on each measure were HPSVQ-ES: 23 (31.1%), HPSVQ-PS: 23 (31.1%), CHOICE-SF: 25 (33.8%), CHOICE-SF Goal: 23 (31.5%), GAD-7: 24 (32.4%), PHQ-9: 26 (35.1%). For the adjusted analysis, implemented with MICE, 12 participants were removed as the extent of missing data in these cases did not allow for the imputation of missing values. Thirteen imputations were computed from the dataset for 351 missing data points from a total of 1820 (13% missing). Auxiliary variables which predicted missingness were gender, age of onset of voices, marital status, employment status, and ethnic group.
Pre–post outcome measures and effect sizes
Descriptive statistics of scores on all clinical assessment measures are displayed in Table 2. Baseline scores are reported by participant engagement level, while post-intervention scores are shown for those who completed the intervention. There were some differences between the means of each group on each measure, with completers presenting with more favourable scores on each measure, with the exception of voice-related distress, which was highest in non-commencers and lowest in non-completers at baseline. Among completers, mean scores moved in a favourable direction for all measures.
Table 2. Pre–post GiVE clinical assessment measures

n on each measure refers to the complete cases for that measure at that time point. M, mean; CI, 95% confidence interval (provided as Lower, Higher); SD, standard deviation; HPSVQ-ES, Hamilton Program for Schizophrenia Voices Questionnaire-Emotional Subscale; HPSVQ-PS, Hamilton Program for Schizophrenia Voices Questionnaire-Physical Subscale; CHOICE-SF, Choice of Outcome in Cbt for psychoses-Short form; GAD-7, General Anxiety Disorder Scale; PHQ-9, Patient Health Questionnaire.
Table 3 illustrates differences and effect sizes on each primary and secondary measure for individuals who completed the intervention. To achieve this, paired sample t-tests for both complete case and adjusted (i.e. following MICE) analyses were implemented. Significant differences were seen on each measure between pre- and post-intervention in both complete case and adjusted analyses. A large effect was found for voice-related distress: on average, HPSVQ-ES scores changed by –2.69 points (95% CI [–3.49, –1.89]) from baseline to outcome, p<0.001, Cohen’s d=–0.82 (95% CI [–1.10, –0.54]). A large effect was also found for the personal goal rating: on average, CHOICE-SF Goal scores changed by 3.22 points (95% CI [2.54, 3.90]) from baseline to outcome, p<0.001, Cohen’s d=1.44 (95% CI [1.01, 1.88]). Medium effect sizes were seen for all other measures, i.e. voice severity, recovery, anxiety, and depression.
Table 3. Pre–post effect sizes on clinical assessment measures

Data are provided for completers only. HPSVQ-ES, Hamilton Program for Schizophrenia Voices Questionnaire- Emotional Subscale; HPSVQ-PS, Hamilton Program for Schizophrenia Voices Questionnaire-Physical Subscale; CHOICE-SF, Choice of Outcome in CBT for psychoses-Short form; GAD-7, General Anxiety Disorder Scale; PHQ-9, Patient Health Questionnaire; Adjusted, analysis after multiple imputation by chained equations (MICE).
Effect estimates were mostly consistent between complete case and adjusted analyses with a few notable differences. For the HPSVQ-PS, the reduction in physical attributes of voices increased in the adjusted analysis by around 1 point. Reductions in anxiety scores post-intervention also increased in the adjusted analysis. In contrast, reductions in depression scores post-intervention were slightly higher in the complete case analysis compared with the adjusted analysis. Importantly, both analytical approaches led to the same conclusions regarding the statistical significance of the findings.
Question 1 – predictors of engagement
Current age (t(69.82)=4.20, p=0.019) significantly predicted engagement with the intervention. No clinical assessment measure or any other clinical characteristic variables were found to significantly predict individuals’ engagement with the intervention.
A post hoc multinomial regression model was applied to determine if differences between age groups were indicative of engagement with the intervention. Completers were utilised as the reference group, with comparisons made against non-commencers and non-completers. For readability, the odds ratios (ORs) are presented as their inverses (1/OR), while raw OR values reflect comparisons between completers and the comparative group. For age, it was found that those aged 45–54 were 4.54 times more likely to be completers than non-commencers (OR=0.22, 95% CI [0.06, 0.83], p=0.025). Those aged 55–64 were 5.88 times more likely to be completers than non-commencers (OR=0.17, 95% CI [0.04, 0.67], p=0.011), and 9.09 times more likely to be completers than non-completers (OR=0.11, 95% CI [0.02, 0.63], p=0.013). No significant relationship was found between level of engagement and age for those aged 18–24, 25–34, and 35–44.
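The inversion of the odds ratios is simple arithmetic and can be reproduced as below. The function name is hypothetical. Note that inverting an OR also inverts and swaps its confidence limits, and that the inverses here are computed from the rounded ORs reported in the text, so small discrepancies are expected: 1/0.22 is approximately 4.545, which the text reports as 4.54.

```python
from typing import Tuple


def invert_odds_ratio(odds_ratio: float, ci_low: float,
                      ci_high: float) -> Tuple[float, float, float]:
    """Return the inverse OR with its confidence limits inverted and swapped."""
    if min(odds_ratio, ci_low, ci_high) <= 0:
        raise ValueError("odds ratios and confidence limits must be positive")
    return 1 / odds_ratio, 1 / ci_high, 1 / ci_low


# Ages 55-64, completers versus non-commencers (OR=0.17, 95% CI [0.04, 0.67]).
inverse_or, inverse_low, inverse_high = invert_odds_ratio(0.17, 0.04, 0.67)
print(round(inverse_or, 2))  # 5.88, matching the value reported in the text
```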
Question 2 – predictors of outcome (voice-related distress)
Anxiety at baseline significantly predicted voice-related distress post-intervention in the complete case analysis, with a regression coefficient of 0.23 (95% CI 0.00 to 0.46, standard error (SE)=0.11, p=0.049, correlation r=0.30). This effect was not retained in the adjusted analysis at 0.16 (95% CI –0.03 to 0.35, SE=0.10, p=0.097, r=0.23). Baseline physical voice characteristics, recovery, personal goal, depression, diagnosis, gender, employment status, marital status, ethnic group, educational attainment, duration of voice hearing, age of voice hearing onset, and need at baseline were not significantly predictive of voice-related distress post-intervention.
Question 3 – predictors of outcome (recovery)
No measure was found to significantly predict recovery at outcome.
Treatment response rate and minimal clinically important difference
Table 4 illustrates treatment response rates for each clinical measure among individuals who completed the intervention. The proportion of service users who reported improvement varied across the measures, with the greatest proportion reporting improvements on the personal goal rating (88%) and recovery (60%). With regard to changes meeting the MCID on the primary measures, 54% of individuals met this threshold for voice-related distress, and 48% for recovery.
Table 4. Treatment response rate on clinical assessment measures

n on each measure refers to the complete cases for that measure at both time points. Data are provided for completers only. HPSVQ-ES, Hamilton Program for Schizophrenia Voices Questionnaire-Emotional Subscale; HPSVQ-PS, Hamilton Program for Schizophrenia Voices Questionnaire-Physical Subscale; CHOICE-SF, Choice of Outcome in CBT for psychoses-Short form; GAD-7, Generalised Anxiety Disorder Scale; PHQ-9, Patient Health Questionnaire.
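To make the two summaries in this section concrete, the sketch below computes a response rate (any improvement) and the proportion of completers whose change meets an MCID. All scores and the MCID value are invented for illustration and are not the study's data. The direction of improvement (lower scores are better) and the helper name `response_summary` are likewise assumptions.

```python
def response_summary(pre, post, mcid):
    """Return (proportion improved, proportion meeting the MCID).

    Assumes lower scores indicate improvement; `mcid` is the minimum
    reduction counted as clinically important. Illustrative only.
    """
    if len(pre) != len(post) or not pre:
        raise ValueError("pre and post must be non-empty and the same length")
    changes = [before - after for before, after in zip(pre, post)]
    improved = sum(change > 0 for change in changes) / len(changes)
    met_mcid = sum(change >= mcid for change in changes) / len(changes)
    return improved, met_mcid


# Invented scores for five hypothetical completers (not study data).
pre_scores = [12, 10, 14, 9, 11]
post_scores = [9, 10, 8, 10, 10]
improved, met_mcid = response_summary(pre_scores, post_scores, mcid=2)
print(f"improved: {improved:.0%}, met MCID: {met_mcid:.0%}")
```

The two proportions can diverge, as they do for recovery in the text (60% improved, 48% met the MCID), because a small improvement counts towards the response rate without reaching the MCID threshold.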
Discussion
This study sought to identify the predictors of engagement and outcome when the GiVE intervention was delivered within routine clinical practice to a transdiagnostic cohort of 142 service users. This is the first study to evaluate the delivery of the GiVE intervention within routine clinical practice. The offer of the intervention was accepted by 108 (76%) service users and completed by 74 (52%). Clinically meaningful benefits on the primary outcomes of voice-related distress and recovery were reported by approximately half of the service users who completed the intervention.
Main findings
For the prediction of engagement, only age was found to be significantly associated with level of engagement. Further analyses suggested that engagement was greater among those in the older age brackets, whereas no statistically significant differences between completers, non-completers and non-commencers were found in the younger age groups. Those aged 45–54 were about 4.5 times more likely to be completers than non-commencers, and those aged 55–64 were about six times more likely to be completers than non-commencers and about nine times more likely to be completers than non-completers. Increased engagement of older adults in CBT interventions is supported by findings from studies of CBTp in routine clinical practice (Richardson et al., 2019), group therapy for psychosis (Sedgwick et al., 2021), and CBT for treatment-resistant depression (Button et al., 2015).
For the prediction of changes in voice-related distress at outcome, the only clinical measure found to be associated with poorer outcome was an increased anxiety score at baseline. However, this effect was no longer significant within the adjusted analysis, and this finding should be treated with caution. The potential influence of anxiety corroborates the finding of Paulik et al. (2018). However, in contrast to the studies of Paulik et al. (2018) and Jones et al. (2021), this study did not find that higher levels of baseline depression predicted less improvement in voice-related distress at outcome. No demographic or clinical characteristics were found to predict voice-related distress at outcome within this study.
For the prediction of changes in recovery, none of the clinical measures or demographic/clinical characteristics were found to be associated with outcome. This is consistent with findings from Paulik et al. (2018) and Jones et al. (2021), who included the same variables for prediction and found no significant relationships.
The inconsistency of the findings across three similar studies (the current study, Paulik et al., 2018, and Jones et al., 2021) is noteworthy. Each of the studies draws upon lessons learnt from transdiagnostic groups of service users within the same clinical environment, suggesting there are no clinical or demographic variables that consistently predict either engagement or outcome within that context. However, each of the studies evaluated the delivery of a different form of CBTv, and this may have influenced the findings.
There are broader lessons to be learnt from this first evaluation of the GiVE intervention within routine clinical practice. The offer of the intervention was accepted by most service users (76% attended at least one session), a proportion that is similar to the acceptance rates of Paulik et al. (2018) and Jones et al. (2021) – 64% and 84%, respectively. These rates contrast with the acceptance rates for CBTp of 52% within a national survey (Royal College of Psychiatrists, 2018). These differences may be attributable to brief and targeted CBTv interventions being favourably perceived as offering greater transparency about their content and purpose. Completion rates are similar across the three studies: 52% in the current study, 52% in Paulik et al. (2018), and 59% in Jones et al. (2021), suggesting that progress still needs to be made with understanding and addressing the barriers to sustained engagement, and these barriers may not vary across differing forms of CBTv. The preferences for how service users want CBTv to be delivered may play a role in this respect, and have recently been reported in relation to the location of sessions (highest rated preference – clinical environment), timing of sessions (highest rated preference – weekday afternoon), delivery of therapy (highest rated preference – meeting face-to-face with a therapist) and therapy approach (highest rated preference – me and therapist working together to develop new solutions; Berry et al., 2021).
Approximately half of the service users who completed the GiVE intervention reported improvements in voice-related distress and/or recovery. This level of improvement is modest and is slightly lower than the improvements in psychotic symptoms reported for CBTp (Bighelli et al., 2018). However, it is noteworthy that the improvements following the GiVE intervention were generated after only eight sessions, relative to a median number of 13 sessions of CBTp reported by Bighelli et al. (2018). Furthermore, most of the service users who completed the intervention showed improvement in the rating of their personal goal, highlighting the potential value of making greater use of idiosyncratically defined measures of outcome (Morrison et al., 2023).
Clinical implications
The clinical implications of this study’s findings are at least threefold. Firstly, as anxiety may be related to voice-related distress at outcome, corroborating the findings in Paulik et al. (2018), this variable should be assessed as part of the screening process for CBTv. If anxiety is found to be severe, it may be a barrier to learning within CBTv and require a targeted intervention prior to the commencement of CBTv. Brief and targeted CBT-informed interventions are available for the treatment of anxiety and depression (Coull and Morris, 2011), but their efficacy in the context of severe mental health problems has yet to be demonstrated (Waller et al., 2018). Secondly, as greater engagement is associated with the older age of service users, the engagement of younger service users (i.e. below 45 years) may benefit from innovative approaches to promote engagement, e.g. a pre-CBTv psychoeducational approach may support informed decision-making (Greenwood et al., 2018). Finally, this study has generated data to support the emerging evidence that briefly trained therapists may be able to deliver CBT-informed interventions for distressing voices (Clarke et al., 2021; Hayward et al., 2021), thereby creating the potential for treatment access to be increased through the delivery of CBTv by a wider workforce of therapists. However, the modest improvements on the primary measures suggest that the GiVE intervention may need to be located within a pathway of CBT-informed interventions for distressing voices, an approach that has recently been explored within a preliminary evaluation of the Feeling Heard intervention pathway and offered encouraging findings (Hayward et al., 2025).
Limitations
This study has limitations in several respects. Firstly, this study utilised an uncontrolled pre–post design and benefits may have been attributable to variables beyond the intervention. Secondly, whilst the naturalistic setting promoted the ecological validity of the findings, this setting is associated with less experimental control, meaning that inferences drawn from the dataset are more likely to be subject to the influence of confounding variables than those in an experimental setting. As there was no adjustment for multiple testing in the original analysis, there is a potential for inflation of Type I error rates, and findings should be interpreted with caution in this light. Rather than confirming specific effects, these findings helped to identify patterns in engagement and outcomes with the GiVE intervention. The lessons learnt can inform the generation of potential hypotheses for future research. Furthermore, given the frequentist approach taken in this analysis, the non-significant results mean that judgement on whether these variables serve as indicators of outcome should be withheld, rather than being taken as evidence of a lack of relation between these variables (Dienes, 2008). Thirdly, the predictive power of the study may have been limited by the sample size, and a replication with a larger sample would need to be appropriately powered to detect effects, should they exist. Additionally, there are limitations regarding the availability of data, as information on medication usage and on the service users excluded from the intervention (e.g. due to cognitive ability) was not available for analysis. A further limitation concerns the dichotomisation of some categorical measures for analysis. This was carried out to increase the predictive power of the models applied, but there is also evidence that this approach can produce misleading findings (MacCallum et al., 2002), and this should be taken into account in future studies. Finally, this study is service and context specific, which may limit the generalisability of the findings.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S1352465825101161
Data availability statement
The data that support the findings of this study are available on request from the corresponding author, M.H. The data are not publicly available due to their containing information that could compromise the privacy of the service users.
Acknowledgements
We are grateful to the service users who gave permission for their data to be used to facilitate the learning of others. Thanks are also due to the Research Assistants at the Sussex Voices Clinic who supported the collection of data.
Author contributions
Seafra Barrett: Conceptualization (equal), Data curation (lead), Formal analysis (lead), Investigation (lead), Methodology (lead), Writing - original draft (lead); Anna-Marie Bibby-Jones: Formal analysis (supporting), Methodology (supporting), Writing - review & editing (supporting); Mark Hayward: Conceptualization (equal), Data curation (supporting), Formal analysis (supporting), Investigation (supporting), Methodology (supporting), Supervision (lead), Writing - review & editing (equal).
Financial support
No financial support was received to support the delivery of this study.
Competing interests
M.H. is an author of the self-help book and the companion workbook that were used to assist the delivery of the GiVE intervention within this study. He receives a small royalty for each copy of the book and companion workbook that is sold.
Ethical standards
NHS Research Ethics Committee approval was not required for the study because it was completed as a service evaluation of routine practice within a clinical service (Department of Health, 2017). This study received ethical approval from the School of Psychology at the University of Sussex (ER/SB2064/1), and the Quality Improvement Support Team (QIST) and Research Governance Team at SPFT. All participants provided informed consent for data collection, storage and use for research purposes. All data were anonymised by clinic assistants at the SVC prior to receipt of the data by the first author for analysis. All authors abided by the Ethical Principles of Psychologists and Code of Conduct as set out by the BABCP and BPS.



