The Individual Recovery Outcomes Counter: preliminary validation of a personal recovery measure

The recovery concept has become embedded in local and national mental health policy across the UK. Its advance into the mainstream can be gauged not just by the volume of academic literature devoted to it, but also by the scope and range of this work. In a recent literature search, we identified over 300 recovery-focused publications in the past 12 years. These ranged from personal accounts of the recovery journey to attempts to define or measure the concept, along with histories, empirical studies and signs of an emerging critique. Despite the large amount of research and the diversity within it, there are some emergent themes. Recovery is both an outcome and a process, whereby ‘recovery from’ and ‘recovery in’ mental illness are fluid concepts that are not mutually exclusive. Recovery is also subjective, which explains some of the difficulty in defining it accurately.


Measuring recovery
To measure recovery, there needs to be agreement on the most common themes emerging from research. Davidson and colleagues 11 neatly summarise these as involving: (1) recovery as a journey; (2) being supported by others; (3) renewing hope and commitment; (4) engaging in meaningful activities; (5) redefining self; (6) incorporating illness; (7) overcoming stigma; (8) assuming control; (9) managing symptoms; (10) becoming empowered and exercising citizenship. This view is also supported by literature reviews examining recovery in Britain. [12][13][14] These concur with the findings of our literature search as they show a similarity in the breadth of work and the key themes emerging from it.
Of particular significance to the present study is work that deals with the measurement of recovery. Effective measurement can provide valuable feedback to service users and mental health workers in terms of individual progress. It can also be used to help shape care planning and can be drawn on as evidence of outcomes by commissioners and managers within services. 15 To date a range of tools have been developed to measure different aspects of recovery. These were reviewed by Burgess and colleagues. 16 Where measures have focused on individual recovery and/or outcomes, as opposed to service orientation or practitioner attitudes, they can be criticised for an over-reliance on purely clinician-generated items (e.g. the Milestones of Recovery Scale, MORS); 17 for their length (e.g. the Recovery Measurement Tool: 91 questions), 18 which it is felt makes them inappropriate for routine use; 16 and for their focus on symptom reduction as opposed to attributes of personal recovery 4 (e.g. the Illness Management and Recovery (IMR) Scales). 19 Moreover, given the subjectivity of the recovery concept, it has been argued that many of the tools currently available lack sensitivity to the needs of local populations. 20 Specifically, the majority have been developed in North America and Australia and their relevance to the local, in this case UK, population is therefore unclear.

Recovery Star
One of the widely used tools in assessing recovery in the UK is the Recovery Star. 21 It is a measure based on a 10-stage model of recovery where service users are asked to identify their current stage of recovery against ten indicators. Its psychometric properties have recently been analysed and high internal consistency and good test-retest reliability were reported. 22,23 However, although the tool has demonstrated good convergent validity with a measure of social functioning, it did not correlate significantly with the Mental Health Recovery Measure. 23 The authors conclude that it may therefore not be accurate to describe the Recovery Star as a measure of personal recovery. 23 These data were published after the current study was undertaken, and although the initial results are promising, there is still a relative lack of detailed information on this tool.

Process of Recovery Questionnaire
Another popular UK recovery measurement tool is the Process of Recovery Questionnaire (QPR), a 22-item questionnaire measuring personal recovery on a 5-point Likert scale ranging from 'disagree strongly' to 'agree strongly'. Good internal consistency (for the two subscales identified), test-retest reliability and convergent validity have been reported with a small number of measures of aspects of recovery within a population with a history of psychosis. 4 This has not yet been tested against any other measures of recovery or within a wider population of people with mental health problems, so full validity is yet to be established.
In summary, although two measures of recovery have been developed for use with UK populations, as yet neither of these has been subject to full, standardised psychometric testing. Also, their application may be limited as they are not necessarily focused towards personal recovery across all client groups (unlike I.ROC which was specifically developed to fulfil this role).

Development of I.ROC
The Individual Recovery Outcomes Counter (I.ROC) was developed by Scottish mental health charity Penumbra in 2007 to measure service users' 'distance travelled'. A working group of senior managers was established to investigate potential indicators felt to be pertinent to well-being and recovery. Drawing on experience and on output from UK health and social care agencies (e.g. Health Scotland), and examining existing tools (including the Outcomes Star 24), 21 the group identified 12 indicators relevant to Penumbra's work. An initial version of the I.ROC questionnaire was composed and subsequently refined based on feedback from a pilot group of 40 service users. After the refinement of the scoring and wording of questions, I.ROC was rolled out across Penumbra's services. Since 2011, Penumbra has been working with the University of Abertay Dundee to explore the psychometric properties of this tool. Focus groups with service users and staff identified more areas for improvement, resulting in changes to the wording and layout of the questionnaire. These changes were then confirmed with more focus groups and staff working groups to establish good content validity. 25

Measures
Comparative validity of I.ROC was assessed by asking participants to complete it along with two well-established outcome/recovery measurement questionnaires, the Recovery Assessment Scale (RAS) 26,27 and the Behaviour and Symptom Identification Scale (BASIS-32). 28,29 These tools were chosen because of their robust and widespread use within recovery and outcome measurement, both in practice and in the validation of other measures. Like I.ROC, both tools use a Likert scale, making answers easily comparable.
The revised I.ROC is a facilitated self-assessment, which is administered on a quarterly basis as part of service users' ongoing support. It consists of 12 questions, based around 12 indicators of recovery (Table 1).
The RAS is a 41-item questionnaire, scored on a 5-step scale of 'strongly disagree' to 'strongly agree'. It has been tested against other measures of recovery and has been shown to be both valid and reliable. 30 It has demonstrated acceptable test-retest reliability (r = 0.88), good internal consistency (α = 0.93) and convergent validity with measures of empowerment, self-esteem, social support and quality of life and hope. 27,31 Convergent validity has also been established with other recovery measures including the Mental Health Recovery Measure, 32 Stages of Recovery Instrument (STORI) 33,34 and QPR. 4 This makes it appropriate to use as a benchmark for personal recovery.
The second comparative tool, BASIS-32 28,29 is a 32-item routine outcome measurement self-report questionnaire designed to measure clinical outcomes of interventions from the service user's perspective. It is widely applied in Australia and New Zealand, where national and state funders require services to collect and use outcome data. 35 The tool has good test-retest reliability and internal consistency both overall (0.89) and for the identified subscales (0.65-0.81). 36 It shows sensitivity to changes in functioning and symptoms 37 and has been used as a comparative measure in the validation of recovery and outcomes measures, 38,39 for example, in validating the Japanese version of the RAS, where it significantly negatively correlated with the RAS. 40 It has also been used in the assessment of recovery and rehabilitation-based interventions, treatments and programmes. [41][42][43] As a clinical outcomes measure, BASIS-32 can be used to establish the validity of I.ROC as an outcome measure more broadly in line with routine outcome measurement.

Participants
Participants were all those currently receiving support in the community from Penumbra. There were no exclusion criteria, and data collection was carried out by Penumbra staff (n = 17), all of whom received training from the research team prior to the commencement of the study. Ethical approval was granted by the University of Abertay Ethics Committee. Participants were 79 women and 92 men ranging in age from 15 to 79 years, with a mean age of 46 years. One participant was excluded from analysis as they did not complete all three questionnaires. Mental illness diagnoses were largely self-reported and ranged from anxiety through to multiple, complex diagnoses. Among the 170 participants included in the analysis, there were 320 reported diagnoses, with the most common being depression, reported in over 50% of participants; 94 participants reported 2 or more diagnoses, with anxiety and depression the most common dual diagnosis. Participants largely lived alone (66%) in rented or supported accommodation; 65% were unemployed, with only 10 participants in paid employment and 7 in education. Participants had been receiving support from Penumbra for varying lengths of time, ranging from 49 days to 20 years, with 70% receiving support for between 6 months and 2 years and 32% in their first year of service. This support ranged from occasional respite care through to 24-hour supported accommodation. Participants varied in the number of previous I.ROCs they had completed; an I.ROC is completed as soon as possible following intake and is then repeated every 3 months. The 55 individuals in their first year (32%) had completed fewer than three I.ROCs, whereas 70% (120 individuals) had been in service less than 2 years, completing seven or fewer I.ROCs.

Procedure
From November 2011 to April 2012, participants were asked to complete I.ROC, RAS and BASIS-32 with a support worker. Testing took place at a location of the participant's choosing, under Penumbra's lone working policy. After filling in a demographics questionnaire, participants completed the questionnaires (in a standardised format and counterbalanced order, with a third of participants filling in I.ROC as the first, as the second and as the third questionnaire) with the testers who read out each question to the participant before recording the answer. After finishing the final questionnaire, participants were asked to fill in a feedback form, briefly describing how they found the questionnaires and the testing experience.
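The counterbalancing described above can be sketched in a few lines of Python. This is a minimal illustration only: the text says that a third of participants completed I.ROC first, second and third, but does not state how orders were allocated, so the cyclic rotation used here (a 3 x 3 Latin square) and the function names `rotated_orders` and `assign_orders` are assumptions.

```python
from itertools import cycle

QUESTIONNAIRES = ["I.ROC", "RAS", "BASIS-32"]


def rotated_orders(items):
    """Return every cyclic rotation of items (a Latin square for three items)."""
    return [items[i:] + items[:i] for i in range(len(items))]


def assign_orders(participant_ids, items=QUESTIONNAIRES):
    """Hand out the rotations in turn so each questionnaire takes each position equally often."""
    orders = cycle(rotated_orders(items))
    return {pid: next(orders) for pid in participant_ids}


if __name__ == "__main__":
    # Hypothetical participant identifiers 1..6, for illustration only.
    for pid, order in assign_orders(range(1, 7)).items():
        print(pid, order)
```

With 171 recruited participants, this scheme would place I.ROC in each of the three positions 57 times.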

Analysis
Quantitative methods were used to analyse the comparative validity and internal consistency of the questionnaire. Data were analysed with SPSS-19 for Windows. Analysis methods were similar to those used in other measurement tool validations. 31,33,44

Results

Score distributions
All three questionnaires were tested using the Kolmogorov-Smirnov test for normality, which showed that both I.ROC and BASIS-32 were normally distributed (D(170) = 0.129, P = 0.200), but RAS was significantly non-normal (D(170) = 0.073, P < 0.05). Therefore comparisons between measures were carried out with non-parametric tests.
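For readers unfamiliar with the D statistic reported above, the following Python sketch shows how it can be computed: D is the largest vertical gap between the sample's empirical distribution and a normal distribution fitted to the sample mean and standard deviation. The analysis itself was run in SPSS; this sketch does not reproduce the SPSS significance calculation, and the function names and example scores are illustrative assumptions, not study data.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_normal(scores):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    xs = sorted(scores)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF steps from i/n to (i + 1)/n at x, so check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


if __name__ == "__main__":
    symmetric = [-2, -1, -1, 0, 0, 0, 1, 1, 2]  # made-up, roughly bell-shaped
    skewed = [1] * 9 + [100]                     # made-up, heavily skewed
    print(ks_statistic_normal(symmetric), ks_statistic_normal(skewed))
```

A larger D means a poorer fit to normality; whether a given D is significant depends on the sample size and on the correction applied by the statistics package.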

Demographic/confounding variables
A Kruskal-Wallis test (the non-parametric equivalent of a one-way ANOVA) showed that age was not a significant confounding variable for any measure (Table 2).  38 all of which reported no significant differences dependent on gender. A different pattern of results was found on the other two scales, however. Unlike the original validation, males were found to score significantly higher on RAS and significantly lower on BASIS-32 than females. This suggests that men in the current sample were more likely to report higher recovery scores when using these questionnaires.

Concurrent validity
Scoring on I.ROC and RAS is similar, with higher scores indicating better well-being, whereas BASIS-32 uses an inverse scoring system with lower scores indicating better mental health. To measure the strength of the relationships between the questionnaires, Spearman's correlations (two-tailed) were calculated for the total scores on all three measures. I.ROC scores were significantly positively correlated with RAS scores (r_s = 0.723, P < 0.001) and significantly negatively correlated with BASIS-32 scores (r_s = −0.602, P < 0.001). These results meet the minimum criterion for correlation between similar psychometric instruments (0.55). 45 We contend that this indicates positive initial support for the concurrent validity of I.ROC, demonstrating an ability to measure recovery-focused outcomes in a way that is similar to the current leading measures.
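As an illustration of the statistic used here, Spearman's correlation is Pearson's correlation applied to the ranks of the scores, with tied scores sharing the mean of the ranks they occupy. The sketch below is not the SPSS procedure used in the study, and the function names and scores are hypothetical; it simply shows why an inversely scored measure such as BASIS-32 yields a negative coefficient.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical totals for four respondents, not study data.
    iroc = [30, 45, 52, 60]
    basis32 = [2.1, 1.5, 1.0, 0.4]  # lower is better, so the sign flips
    print(spearman(iroc, basis32))
```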
Pearson's correlations were also calculated for subscales within BASIS-32 and RAS, along with the two I.ROC factors found during factor analysis. Both I.ROC factors were found to correlate significantly with the subscales on the other two measures. The only subscale correlation not to produce a significant result was between I.ROC factor 2 and the BASIS-32 psychosis subscale. This suggests that I.ROC compares favourably with the other two measures at a structural as well as general level.

Internal consistency
Internal consistency correlations were calculated using Cronbach's alpha. This is a measure of the relatedness of all questions within a questionnaire to each other and to a single overarching construct (in this case recovery). High scores indicate a strong relationship between all questions, thus determining the level of homogeneity of the tool. 46 All three questionnaires have good internal consistency, although that of RAS was highest (0.96), closely followed by BASIS-32 (0.95). However, I.ROC produced a sufficiently high score (0.86) to indicate good internal consistency. 47 It should be noted, however, that both BASIS-32 and RAS consist of far more questions than I.ROC, which may have positively affected their score. The literature suggests that 0.8 is a good goal to aim for, and clearly I.ROC exceeds this. 48
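The calculation behind Cronbach's alpha, and the effect of questionnaire length noted above, can be sketched as follows. This is an illustrative sketch rather than the SPSS routine used in the study. `cronbach_alpha` applies the usual formula (k / (k − 1)) × (1 − sum of item variances / variance of total scores), and `standardized_alpha` uses the standardised form based on the mean inter-item correlation. The mean inter-item correlation of 0.3 is a hypothetical value chosen for illustration, not an estimate from the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standardized_alpha(k, mean_r):
    """Standardised alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


if __name__ == "__main__":
    # Same hypothetical inter-item correlation, different questionnaire lengths.
    for name, k in [("I.ROC", 12), ("BASIS-32", 32), ("RAS", 41)]:
        print(name, k, round(standardized_alpha(k, 0.3), 3))
```

Holding the inter-item correlation fixed, alpha rises with the number of items, which is why a 12-item tool can be expected to score somewhat lower than a 32- or 41-item tool of comparable homogeneity.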

Factor analysis
Exploratory factor analysis was performed to investigate I.ROC's underlying structure (Table 3). By investigating the correlations between each item on a questionnaire, it is possible to identify existing question groupings, which can be compared with the theories underpinning the measure. In I.ROC, two significant factors were found using a varimax rotation. The KMO statistic 49 was 'meritorious' (0.859). 50 Bartlett's test of sphericity was highly significant (P < 0.001), indicating that the correlation matrix was not an identity matrix, and the determinant of the correlation matrix (d = 2.01, E = 0.007) indicated that principal components analysis was appropriate.

Monger et al Individual Recovery Outcomes Counter validation
Exploratory factor analysis indicated that two underlying factors comprising eight and four items accounted for 51.8% of the variance in scores. Two items ('hope for the future' and 'physical health') loaded on both factors, so were assigned to the factor with a slightly higher loading (physical health: factor 1, 0.48; hope: factor 2, 0.62). Eigenvalues showed that factor 1 explained 40.7% of the variance in the data and factor 2 explained 11.1%. Item loadings in the two extracted factors (highest factor loading) all exceeded 0.45. Internal consistency was good for factor 1 (Cronbach's α = 0.83) and acceptable for factor 2 (Cronbach's α = 0.74). 32 The authors (academic researchers and mental health professionals) agreed that the factors identified by the analyses should be labelled as 'intrapersonal' and 'interpersonal'. These closely resemble factors in the QPR suggested by Neil and colleagues, 4 where intrapersonal elements are described as 'tasks that the individual is responsible for conducting, and which help them to rebuild their lives'. These could include taking responsibility for the management of their physical and mental health and day-to-day life skills (e.g. cooking, cleaning) and for participating in choices that affect their lives. These types of behaviours have been described by Andresen et al 51 as including 'self-determination and resilience'. Interpersonal items relate to reflection on the individual's value to external processes and relationships. These could include their ability to participate in social activities and to feel that they play a meaningful part in their own lives and their wider community. As in the QPR, of the two factors identified in I.ROC, the majority of items loaded on the intrapersonal subscale (17 items v. 5 in QPR).

Questionnaire preference
After completing the three questionnaires, participants were asked to identify which questionnaire was their favourite and which one they liked least. Of the 124 service users who answered that question, 52% (n = 64) chose I.ROC as their favourite. Significantly more participants selected I.ROC than either RAS (n = 35, t = 5.996, P < 0.001) or BASIS-32 (n = 26, t = 7.245, P < 0.001) as their favourite.
Conversely, BASIS-32 (n = 45) was found to be the least popular questionnaire, with significantly more participants selecting it as their least favourite than either I.ROC

Discussion
The evidence presented here supports the hypothesis that I.ROC is a valid and reliable recovery outcomes measure which can be used in routine clinical practice as a means of tracking service user progress, as an aid to care planning and as a means of assessing the impact of service inputs. It is significantly correlated with BASIS-32 and RAS, and the correlation between I.ROC and BASIS-32 increases the breadth of possible applications of I.ROC in the future, for example as an outcomes tool within public health and social care services. The internal consistency of I.ROC was high, with no item redundancy, suggesting that all questions contribute to a single recovery-related construct. This supports the content validity of the measure. Exploratory factor analysis revealed that this construct comprised two factors, labelled intrapersonal and interpersonal recovery, although further work is needed to understand their wider relationship to both the overall construct and other aspects of recovery.
On the whole, participants preferred I.ROC to the other measures. They described it as easy to complete and agreed that it helped them think about their recovery. It thus seems fair to conclude that I.ROC measures similar recovery and personal outcomes constructs to RAS and BASIS-32, but in a way that most service users found preferable. This was a consistent finding irrespective of demographic variables and test scores. Trends indicated that I.ROC total score was not a significant preference moderator, suggesting that participants feel comfortable with the measure no matter the current state of their mental health. Controlled empirical testing has supported the internal structure and validity of the tool. Qualitative analysis of service user focus groups and the feedback surveys used in the validity testing have evidenced the face and external validity of the tool. The use of I.ROC as a valid measure of recovery within a Scottish mental health population is therefore supported.

Limitations
Although 171 participants were recruited for the current study, the robustness of the results would be improved by increasing sample size. For factor analysis, for example, it has been argued that the bigger the sample size, the better the results, with a minimum sample size of 300 recommended by some. 52 Although others have argued that a sample size of between 100 and 200 may be sufficient, 53 the general consensus remains 'the bigger the better'. Participants were drawn from Penumbra's existing service user base with testers made up of Penumbra staff. This presented a readily available sample group and minimised the likelihood of breaches of client confidentiality, but it should also be recognised that these choices may have positively influenced aspects of the results. Specifically, participants may have been more likely to favour the I.ROC over other instruments and staff may have, albeit inadvertently, reinforced this. The collection of data from individuals not connected with Penumbra using data collectors from outside the organisation would be useful in terms of determining the extent to which pre-existing familiarity with the tool may have affected the results.
The I.ROC has yet to be benchmarked against the most widely used measures of recovery in the UK, the Recovery Star and the QPR. As noted earlier, at the time of the present study there were no published data on the psychometric properties of the Recovery Star. This has now changed. 22,23 Although it is clear that further work needs to be done on the Recovery Star before it can be recommended as a routine clinical measure, as the current 'market leader' it may now be appropriate to examine similarities and differences between this tool and I.ROC. As QPR has also been used as a recovery outcomes measure within research in the UK, 54 comparisons with this tool would also prove useful.
Although I.ROC has been used within Penumbra with the majority of their clients for a number of years, changes were made to the questions in response to service user feedback at the beginning of this study. The earlier data were collected by staff with minimal training in research methods and without standardised instructions, and they have not been used in the current analysis. As a result, neither test-retest nor interrater reliability was explored in the current study. Both are clearly vital in terms of further establishing the usefulness of I.ROC. Future work might usefully consider these as part of an ongoing project to examine the reliability, validity and usability of this tool.
Notwithstanding, this study provides strong preliminary data to support the use of I.ROC as a measure of recovery in mental health. Its brevity and clarity support its routine use by a broad spectrum of service users with mental health problems and by busy front-line mental health workers. Undoubtedly, further testing is required, yet I.ROC compares very well with existing measures.
Copies of the tool are available from Penumbra on request. Training is required for use of I.ROC.

Funding
The work was funded by a Knowledge Transfer Partnership, from the Technology Strategy Board and the Scottish Funding Council.