A systematic review and synthesis of outcome domains for use within forensic services for people with intellectual disabilities

Background There is limited empirical information on service-level outcome domains and indicators for the large number of people with intellectual disabilities being treated in forensic psychiatric hospitals. Aims This study identified and developed the domains that should be used to measure treatment outcomes for this population. Method A systematic review of the literature highlighted 60 studies which met eligibility criteria; they were synthesised using content analysis. The findings were refined within consultation and consensus exercises with carers, patients and experts. Results The final framework encompassed three a priori superordinate domains: (a) effectiveness, (b) patient safety and (c) patient and carer experience. Within each of these, further sub-domains emerged from our systematic review and consultation exercises. These included severity of clinical symptoms, offending behaviours, reactive and restrictive interventions, quality of life and patient satisfaction. Conclusions To index recovery, services need to measure treatment outcomes using this framework. Declaration of interest None. Copyright and usage © The Royal College of Psychiatrists 2017. This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY) licence.

Following de-institutionalisation, most people with intellectual disabilities live fairly independent lives in the community. There are 900 000 adults with intellectual disabilities in England, and estimates suggest that only around 3035 (0.3%) receive treatment in psychiatric hospital settings, with about half of them being in forensic hospitals. 1-3 The health expenditure in this sector belies the low numbers, and it is estimated at over 300 million pounds sterling per annum. 4,5 However, there is limited empirical information on service-level outcome domains and indicators, which in turn limits the ability to measure the effectiveness of these services. This is of concern in a health climate focused on outcomes 6 and 'payment by results', but is even more relevant because of the recent government initiative to fundamentally transform care for people with intellectual disabilities. 7 Although Fitzpatrick et al 8 have conducted a systematic review of outcome measures used in generic forensic mental health services, and Gilbody et al 9 completed a similar review of outcome studies in mental health, there has been no such work for forensic services providing care to people with intellectual disabilities. Further, although there has been a marked focus on recovery from mental health services, there has been comparatively little focus on this construct and its measurement within psychiatric hospital settings for people with intellectual disabilities, including forensic services. Recovery is often construed as 'getting better' or 'reducing symptoms', and within the context of in-patient services for people with intellectual disabilities, where there is often a focus on person-centred support and normalisation, including living as independently as possible within the community, the concept remains unclear, but the issues are not dissimilar from the 'recovery' debates within wider mental health services.
However, recovery in the context of forensic services for people with intellectual disabilities, while subjective, should nevertheless incorporate the connectedness; hope and optimism about the future; identity; meaning in life; and empowerment (CHIME) framework, 10 bearing in mind that some associated factors may be more proximal for this population (e.g. offending behaviours and stigma associated with disabilities).
In order to address these shortcomings, this study had the single aim of identifying the domains that should be used to measure outcome from forensic services for people with intellectual disabilities. Within the context of this project, outcome was defined as occurring at the level of the service as a whole, rather than individual outcomes associated with a specific treatment or intervention. In other words, we were primarily interested in outcomes that could index change over time across the entire range of interventions offered by a service, rather than outcome from a specific intervention (e.g. medication or psychological treatment), as this represents the real world of service delivery. Our aim was achieved within the context of two interrelated and iterative work streams: (a) undertaking a systematic review of studies that focused directly or indirectly on measuring outcomes from forensic services for people with intellectual disabilities and synthesising the findings into an initial framework of outcome domains, and (b) taking our initial framework and refining further within the context of a consultation exercise with patients and carers, as well as a two-round Delphi exercise with experts.

Systematic review
An initial outcome framework was developed following a systematic review of the literature that focused on outcomes from forensic services for people with intellectual disabilities. (The full report of this project is published in Health Services and Delivery Research 2017; 5(3), available at: https://www.journalslibrary.nihr.ac.uk/hsdr/hsdr05030/#/abstract) As a starting point, and following discussion within the research team, we initially envisaged outcomes as falling into one of the three areas that were defined by the Department of Health 11 as representative of quality. They are as follows: (a) effectiveness (e.g. the impact of generic treatment on health), (b) patient safety (e.g. untoward events as a result of treatment) and (c) patient experience of care (e.g. satisfaction).

Search strategy
The search strategy aimed to identify studies from a range of sources. Electronic databases searched on 1 June 2015 included Medline, PsycINFO, Embase, AMED, HMIC, BNI and CINAHL. Search terms employed were based on those used for previous Cochrane reviews, for intellectual disability 12 and forensic/offenders. 13 The full search terms, including 'explode' terms, keywords and text words are included within our supplementary material. The systematic review was registered in advance with PROSPERO (registration number: CRD42015016941).
In order to ensure that no relevant publications were missed, the grey literature (opengrey.eu) was also searched using the keywords. The ancestry method was used to find suitable studies within the references of eligible papers. The ancestry method means searching the reference lists of papers that met our eligibility criteria for any further papers that may not have been previously included. In addition, expert members of the project team were consulted in order to identify any key references not retrieved by the search strategy as well as in press or unpublished articles.

Study selection and eligibility criteria
Duplicate studies were removed, and titles and abstracts of articles were screened against the eligibility criteria independently by two members of the research team (C.M. and N.G.). Any disagreements were resolved by a third reviewer (M.F.). Studies were included that (a) were published after 1980, as our initial searches revealed there was little relevant literature available before 1980; we opted to use this cut-off date to reduce the number of returned ineligible papers; (b) were in any language, as translations were obtained; (c) made use of any type of quantitative method; (d) involved adults with intellectual disabilities or autism spectrum disorders; (e) who were older than 18 years of age; and (f) who had current or past use of forensic services for people with intellectual disabilities, including community-based forensic services. Forensic services were defined according to the bed categories defined by the Royal College of Psychiatrists. 3 This means that we included papers where the participants were either living within a high, medium or low secure in-patient forensic health service, or a forensic rehabilitation service, or they were living in the community, but receiving a service from a community-based forensic service. Studies were excluded if they only evaluated the effects of a specific intervention or treatment programme (e.g. randomised controlled trial of a medication or psychological treatment group), rather than examining outcomes at a service level.
Sixty studies met the inclusion criteria. None of the included studies were randomised controlled trials or meta-analyses. Twenty-eight studies were cohort outcome studies with follow-up from 1 to 20 years, and a further 32 were cross-sectional studies which reported service-level outcome data at one point in time.
Several of these studies made use of the same or overlapping samples of participants, but as we did not make use of meta-analytic methods, this did not erroneously affect precision. The large majority of the studies included were from the UK, with only two studies originating elsewhere. Most studies made use of samples of men, as only two studies included women. Figure 1 depicts a flowchart outlining the study selection process and the number of studies identified at each stage.
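The dual-screening step described above (two independent reviewers, with a third reviewer resolving disagreements) can be sketched in Python as follows. This is an illustrative sketch only: the function name `resolve_screening`, the record identifiers and the decisions are hypothetical and are not data or tooling from the review.

```python
def resolve_screening(reviewer_1, reviewer_2, third_reviewer):
    """Combine two independent include/exclude decisions per record.

    reviewer_1, reviewer_2: dicts mapping record id -> True (include) / False.
    third_reviewer: callable giving the deciding vote for a disputed record.
    """
    final = {}
    for record_id, first in reviewer_1.items():
        second = reviewer_2[record_id]
        # Agreement stands; disagreement is passed to the third reviewer.
        final[record_id] = first if first == second else third_reviewer(record_id)
    return final


# Illustrative decisions only; not data from the review.
r1 = {"rec1": True, "rec2": False, "rec3": True}
r2 = {"rec1": True, "rec2": False, "rec3": False}
decisions = resolve_screening(r1, r2, third_reviewer=lambda record_id: True)
included = sorted(rid for rid, keep in decisions.items() if keep)
print(included)
```

Only the disputed record ("rec3" in this example) reaches the third reviewer; records on which both reviewers agree are never re-examined.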

Data extraction and analysis
Using a structured form, data were extracted from the included articles. Specifically, details regarding the sample, design, service type, methods, outcome domains and specific measures were obtained and coded. Content analysis was used to synthesise the outcome domains with reference to the three areas of quality as defined by the Department of Health, 11 namely: (a) effectiveness, (b) patient safety and (c) patient experience. These three areas were used as an a priori superordinate framework. A process of refining and grouping similar outcome sub-domains together was then undertaken by two researchers. This led to the construction of a 'framework' to describe the outcome domains extracted from the eligible studies. This process is best described as both directed and summative content analysis because the process started with an a priori theoretical stance pertaining to service quality, followed by both counting and coding the extracted data, which was then interpreted within the context of our a priori theoretical stance. 14 This methodology was advantageous because it allowed us to identify key concepts, consider their context and underlying meaning within and across studies, and develop coding variables, which were then refined into sub-domains.
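The counting element of summative content analysis described above can be illustrated with a minimal Python sketch that tallies how many distinct studies report each sub-domain within each a priori superordinate domain. The study identifiers and coded rows are hypothetical, and the function `studies_per_subdomain` is an assumption made for illustration rather than the procedure actually used by the research team.

```python
from collections import Counter

# Hypothetical coded extractions: (study id, superordinate domain, sub-domain).
coded = [
    ("study_a", "effectiveness", "length of stay"),
    ("study_a", "effectiveness", "reoffending"),
    ("study_b", "effectiveness", "length of stay"),
    ("study_b", "patient safety", "seclusion"),
    ("study_c", "patient experience", "quality of life"),
    ("study_c", "effectiveness", "length of stay"),
    ("study_c", "effectiveness", "length of stay"),  # repeated code within one study
]


def studies_per_subdomain(rows):
    """Count distinct studies per (domain, sub-domain) pair."""
    unique = set(rows)  # a study counts once per sub-domain
    return Counter((domain, sub) for _, domain, sub in unique)


counts = studies_per_subdomain(coded)
for (domain, sub), n in sorted(counts.items()):
    print(f"{domain} / {sub}: {n}")
```

Counting distinct studies, rather than raw codes, prevents a study that mentions the same sub-domain several times from inflating the tally.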

Consultation groups
Following the completion of our systematic review, and the development of our initial outcome framework, we undertook three consultation groups with patients and one consultation group with carers to further consider and refine our outcome framework. Two of our patient groups took place within a high secure hospital in England, whereas the remaining groups took place within both a low secure hospital and a medium secure hospital, also in England. Participants were approached by the researchers, and the purpose of the group was explained using information sheets. Participants who agreed to take part provided informed consent. However, we were advised by our associated Research Governance office, within our National Health Service (NHS) Trust, that NHS Research Ethics opinion was not required for this project. Our groups included 3 women, 1 transgendered person and 11 men. For our consultation group involving carers, we recruited four participants from an existing carer group within a secure hospital, whereas two participants were recruited who were not part of this group. The carer participants had family members detained within three different secure hospitals.

Analysis.
A semi-structured topic guide was used which was based around our initial outcome framework as a method to structure the conversations within our groups. Participants were encouraged to consider and discuss our initial framework, make modifications and choose outcomes they considered most important. The discussions were recorded and fully transcribed. The transcriptions were analysed using both directed and summative content analysis. 14 As with our systematic review, this methodology allowed us to identify key concepts, context and meaning within transcripts, which were interpreted within our proposed framework. Any changes, or newly identified outcome sub-domains, were incorporated within our superordinate framework.

Delphi exercise
The Delphi method 15,16 is an iterative and multi-staged structured process that can be used to develop group consensus. We made use of a two-round online Delphi exercise with expert clinicians, researchers and commissioners with experience of working within forensic services for people with intellectual disabilities.
Information about the study was advertised within the communication networks of existing stakeholder organisations within the UK (e.g. British Psychological Society). All participants were provided with information to help them make a decision as to whether they wished to take part in the study. Participants were presented with the revised outcome framework developed following our patient and carer consultation exercise. They were then invited to rate the importance of each sub-domain within each of the three superordinate domains along a 5-point Likert scale, where 1 was 'not important' and 5 was 'extremely important'. Participants were also asked for their expert opinion about each sub-domain and whether they thought any additional outcome measures needed to be added. Finally, participants were asked to indicate the five sub-domains they considered to be the most important measures of outcome.
Following the completion of the first round, participants were invited to consider the responses of the group and re-consider their previous ratings. Those sub-domains with a mean rating of four or more were taken through to the second round, and participants re-rated their importance along the same 5-point Likert scale. Participants were invited to select up to five sub-domains they perceived to be the most important. All participants were reminded that they did not have to change their original responses.
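The round-one carry-forward rule (a mean rating of four or more on the 5-point scale) can be expressed as a short Python sketch. The ratings below are invented for illustration and are not data from the Delphi exercise; the names `round_one`, `CONSENSUS_THRESHOLD` and `carried_forward` are likewise hypothetical.

```python
from statistics import mean

# Hypothetical round-one ratings: sub-domain -> list of 1-5 Likert ratings.
# Names and values are illustrative only, not data from the study.
round_one = {
    "reoffending": [5, 5, 4, 5],
    "clinical symptom severity": [4, 5, 4, 4],
    "closeness to home area": [3, 4, 3, 3],
}

CONSENSUS_THRESHOLD = 4  # mean rating of four or more, as described in the text


def carried_forward(ratings, threshold=CONSENSUS_THRESHOLD):
    """Return sub-domains whose mean rating meets the threshold, sorted by name."""
    return sorted(
        name for name, scores in ratings.items() if mean(scores) >= threshold
    )


print(carried_forward(round_one))
```

The sketch covers only the mean-rating rule; any additional retention criterion, such as a sub-domain being chosen among participants' five most important outcomes, would need to be applied separately.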
Participants. Seventeen participants took part in the first Delphi round, with 15 taking part in the second round. Nine participants were psychologists, seven were psychiatrists and one was a nurse. Participants were eligible to take part in the Delphi exercise if they were a clinician, researcher or commissioner with experience of working with forensic services for people with intellectual disabilities. Two participants identified themselves as having responsibility for commissioning, whereas a further two identified themselves as having both clinical and academic responsibilities.

Systematic review
Using content analysis, data from eligible studies were extracted and categorised within the overarching superordinate domains: (a) effectiveness, (b) patient safety and (c) patient and carer experience. The complete list of identified sub-domains that emerged following our analysis, along with the associated studies, is found in Tables 1-6. For simplicity, studies have been divided into cohort, retrospective cohort, cross-sectional or case study designs. These findings were synthesised into our initial framework of outcomes which was taken forward and used within our consensus exercises (Table 7).

Effectiveness
Fifty-three studies were categorised as presenting data that involved at least a single outcome that attempted to measure effectiveness (Tables 1 and 2). Our analysis led to 12 sub-domains within the effectiveness superordinate domain (Table 7). These included sub-domains such as length of stay, discharge outcome, clinical symptoms, treatment responsiveness, reoffending behaviours and risk assessment. As a sub-domain, length of stay was considered within 22 studies (Tables 1 and 2), and varied between 1 and 9 years across included studies. However, it was recognised that as a measure of outcome, length of stay is problematic because (a) it tended to be reported for only those who had actually been discharged, rather than the entire in-patient population, and (b) it is complicated because some patients move from one hospital to another, and data may not capture their entire length of stay across all hospitals. As another sub-domain, discharge outcome was considered within 16 studies (Tables 1 and 2) and was defined as moving to an increasing or decreasing level of security within or across forensic hospitals, or discharge to a community-based setting. Several of the included studies focused on delayed discharge 54,61,64,65 and highlighted the difficulties with finding appropriate accommodation that mitigated risk.
Sixteen studies were judged to have included sub-domains that were classified as falling within the clinical symptom sub-domain and these are detailed in Tables 1 and 2. These included measures that made use of clinician or patient ratings of clinical symptomatology. However, only two studies reported change in clinical symptoms over time for a cohort of patients. A variety of tools were used to index change over time within this sub-domain and included such measures as the Brief Symptom Inventory, 73 Emotional Problem Scales, 74 Mini Psychiatric Assessment Schedules for Adults with Developmental Disabilities (mini PAS-ADD), 75 Health of the Nation Outcome Scale (HoNOS)-Secure 26,76 and Clinical Global Impressions Scale. 77 Treatment responsiveness was also coded as a sub-domain, but it was recognised that this is intertwined with the clinical symptom sub-domain; this was included as a separate sub-domain because it focused on whether a patient was likely to be responsive to treatment efforts, rather than the actual response.
Reoffending and risk were classed as separate sub-domains, with 18 and 12 studies considering variables within these sub-domains, respectively (Tables 1 and 2). Most commonly, studies tended to focus on reoffending using data derived from police or Ministry of Justice records. One study followed up reoffending at 1, 2 and 5 years post-discharge, 35 whereas another set of studies using the same data-set reported on whether there was reoffending behaviour within the 2 years following discharge from hospital. 78,79 Another, based in Australia, considered arrest data and 'any criminal justice involvement' following discharge. 80 A series of studies, based in the community, considered whether treatment within the context of a community-based forensic service led to a reduction in offending behaviours. 23,30-33 It is important to note that many people with intellectual disabilities may not be formally dealt with by criminal justice agencies, and as a consequence, 'formal' arrest and conviction data may not be a valid index of reoffending. Hence, the category of 'reoffending-like behaviour' described in these studies 17,22,29,30 is one which is important because it is likely to have increased validity.
Risk also emerged as a likely sub-domain which could be used to index the effectiveness of forensic services for people with intellectual disabilities. 58,59,80 There was only a single study that considered how changes in scores on a risk assessment tool may relate to treatment outcome from forensic services for people with intellectual disabilities. 37

Patient safety
Eleven studies were categorised as presenting data that were considered to index outcome within the patient safety domain. Premature death was considered by one study, which differentiated between suicide and death associated with natural causes, whereas another study incorporated physical health, an important and relevant sub-domain considering the high rates of morbidity amongst forensic populations, including people with intellectual disabilities.
The five sub-domains that emerged following our analysis also included those related to 'reactive' or 'restrictive' interventions such as the use of physical interventions and seclusion, pro re nata (PRN) medication or a change in observation levels. 'Reactive' or 'restrictive' interventions fall within the safety domain defined by the Department of Health. 6,81 We have adopted the same approach here.
However, there is an overlap with the previously discussed effectiveness domain. 'Reactive' or 'restrictive' interventions can be construed as proxy variables for behaviour, and their use may correlate with increasing behaviour difficulties. They are not treatments in themselves; instead, they are reactive strategies used to manage behaviour difficulties in the short term to ensure safety. In contrast, medical, psychological and social care interventions that are developed from a formulation and aim to rehabilitate and/or habilitate are not 'reactive', and as such fall within the effectiveness domain. This includes psychological and social interventions, as well as medication prescribed to treat a diagnosed mental illness or distressing symptoms.

Patient experience
Within this superordinate domain, 11 studies were categorised as capturing outcomes related to patient experience, which were grouped into 4 sub-domains. These were quality of life, therapeutic milieu, patient involvement and patient satisfaction. Four studies included in the review measured quality of life using a number of rating scales, such as the Quality of Life Questionnaire 36 or Life Experience Checklist, 71 whereas three other studies focused on therapeutic milieu or ward atmosphere using either the Correctional Institutions Environment Scale 70 or the EssenCES Climate Evaluation Scale. 69,72 Three studies focused on patient satisfaction in response to service development, and only a single study considered patient involvement as an indicator of outcome.

Consultation exercise
Following the completion of our systematic review, and the development of our initial outcome framework (Table 7), this was presented to our consultation groups with patients and carers. Revisions were made and the revised outcome framework was used within our Delphi exercise with experts. As with the systematic review, we made use of the three superordinate domains (a) effectiveness, (b) patient safety and (c) patient and carer experience as a framework for our analysis of the data generated from our consultation exercises.

Consultation groups
Effectiveness. Several patients expressed the view that length of stay should be an important index of outcome; several said they were frustrated because they thought that length of stay was excessive for many patients. However, some carers expressed an alternative view, stating that a shorter length of stay may be problematic and lead to premature discharge. Patients from high security settings were of the opinion that discharge to medium security was indicative of positive progress, whereas for those in medium and low security, discharge to a community-based service was seen as positive. Several carers further discussed how frequent moves between hospitals and wards can be particularly destabilising and may actually be associated with a negative outcome.
The appropriateness of a placement with respect to meeting treatment needs was discussed and considered important by many carers and patients. One stated, 'I would much rather be further away for eight to nine months [and get the right treatment] than be nearer for 18 months'. Another commented, 'it is very important for people to go to a place where they are happy, not just because it is closer to family'.
Many commented further about the importance of much needed clinical interventions being available within each service, focusing specifically on psychological treatments and appropriate levels of meaningful activity. Carers spoke about wanting and needing individually tailored care pathways focusing on patient need rather than rigidly designed care pathways that were not based upon a formulation of treatment needs. One said, 'it has got to be individually led', and another commented, 'he needed an individualised package of support which was right for him'. Alongside this, carers also expressed the view that a similar individualised package of support needed to be made available to patients when discharged into the community, with one stating, 'I worry about the fact that the service wasn't there in the community…there is so little support in the community'. Improvements in clinical symptoms and behaviour were recognised by both patients and carers as indicative of positive change. This included quantifiable changes in the frequency of incidents, including improvements in communication and a reduction in angry feelings. Different patients stated, 'before I wouldn't engage in conversation and now I've learnt different strategies so I don't kick off so often', and 'a reduction in incidents, reduction in restraints, using diversion more frequently, preempting incidents'. Carers broadened this by commenting that some patients may not fully understand what they need to achieve to move forward, as there is often too much focus on measuring incidents within services. Some patients also shared this view and stated, 'there is too much focus on incidents and not on understanding them … taking back to the beginning of the process as opposed to just dealing with what the consequences are'. Carers considered that a family member may be able to make a more nuanced judgement about changes to clinical symptoms because of their long-standing knowledge of the patient. 
Both patients and carers commented on the importance of engagement with services and therapies as positive indicators of progress within this sub-domain.
Both patients and carers agreed that 'staying safe' once discharged was a positive outcome, recognising that a reduction in risk was associated with a positive outcome, and several carers adopted the position of both carer and potential victim, expressing concern that their own safety could be compromised. One carer said, 'if they said take him home I would be too scared'.
Finally, within the effectiveness superordinate domain, and as an addition to our initial framework, adaptive functioning was considered by patients to be an important indicator of outcome. They talked positively about how they hoped that staying in hospital would bring about improvements in adaptive functioning, such as budgeting, occupational skills and broader life skills. One said, 'I have been given skills like cooking and cleaning…'. However, several commented that staying in hospital may be associated with a loss of adaptive functioning, and several said they thought they have lost skills. For example, one said, 'other hospitals let patients get real jobs. I want this to happen in this hospital', while another commented, 'since we've been locked up here we don't get a chance to do that sort of thing [budgeting] so you don't know what to do when you get your money'. One of our carer participants commented, 'he used to be able to do things. He's lost those skills since he's been here'.
Patient safety. Patients spoke about how a reduction in aggression and the use of seclusion was a relevant outcome measure, as did a number of carers, whereas patients also spoke about being victimised by other patients in hospital. Some carers further considered that taking medication regularly was an indicator of positive outcome, and they also spoke about how a planned reduction in medication could also be a positive outcome. For example, one commented, 'if he could come off olanzapine, that would be progress', whereas another stated, 'a reduction in PRN medication and other medication is a goal'.
Several carers and patients expressed concern about polypharmacy and side-effects, indicating that they felt this was a restrictive practice and alluding to the possibility that medication may be used to sedate in order to control behaviour; one commented, 'he's never been on this amount of medication … he's so heavily dosed up … if he's been medicated to manage his behaviour, he's not learned how to manage his behaviour'. One of the patients strengthened this view by commenting, 'can you be careful about medication and patients being overdosed'.
Patient and carer experience. This superordinate domain was modified as a consequence of our consultation groups in order to incorporate carer experience, alongside the experiences of service users. Carers spoke about whether they were satisfied with the level of care being afforded to their family member and indicated that this was an important measure of outcome. Several spoke about being satisfied with the care being offered by the hospital. One stated, 'it is a dream come true; the place where he is now, it's lovely…it's a dream for places like that to be about', whereas another commented, 'the hospital are [sic] fantastic; the staff are fantastic and at long last somebody is realising the amount of problems he has got and that is one of the problems I had before'. Others considered the importance of having a sense of security as a consequence of the quality and responsiveness of care being given to their relative; one stated, 'not having to be worried about him; if we died tomorrow, services would be there for him and do what was best for him without thinking of the cost'. However, several carers spoke about having to fight or battle for service provision and felt that sometimes services did not listen or involve them appropriately in the care pathway. This was illustrated by the following, 'there was nothing we could say which would be taken on board … it was very much 'no' this is what we think', and 'I was asking for help in the community for years before my son was admitted to hospital'.
Patients and carers considered the importance of quality of life as an indicator of positive outcome, and many spoke about having hopes for a job, relationships and involvement in their local communities, with well-integrated high-quality support. Several carers emphasised the importance of high-quality accommodation once a patient was discharged, and one commented, 'he would be in accommodation that was specifically designed for people with autism, but he had sufficient support with people who actually understood his condition and were able to spot the warning signs so I didn't have to keep flagging them up'. Another stated, 'he needs an individual planned package with sufficient staff and appropriate training'. Carers also commented that leaving hospital was not the end of patients' journeys and spoke about the importance of continuing to monitor outcome and progress over the longer term, rather than view the 'case as closed'. Others spoke about valuing having in-patient services which could be used in times of crisis; this was illustrated by the following comment, '… for him to go back to a secure unit because he's a danger when he does deteriorate'.
The availability of and engagement with meaningful activity was seen as a potential indicator of positive outcome by both patients and carers. They spoke about having employment, and how increasing engagement in activities could be indicative of improvement. Carers spoke further about the importance of having meaningful activities available within hospital settings and went on to further consider how developing and maintaining social networks are further evidence of a positive outcome. This included developing and maintaining positive relationships with family, friends, and pets, and further included romantic relationships.
Changes to our initial outcome framework. A variety of changes to our initial outcome framework were made following the analysis of the data from our consultation groups. These included the incorporation of additional sub-domains and the modification of existing sub-domains. Specifically, we changed the label of the superordinate domain 'patient experience' to 'patient and carer experience'. Considering the effectiveness superordinate domain, we made changes as follows: (a) treatment response and recovery and clinical symptom severity were modified to include carer ratings of clinical improvement, (b) acquiring adaptive skills was added as a new sub-domain, as was (c) engagement with therapies and services. Within the patient safety superordinate domain, we incorporated (a) safeguarding and victimisation as a new sub-domain, whereas (b) overuse of medication was strengthened by making reference to unacceptable side-effects and patient satisfaction with prescribed medication. Finally, within the patient and carer experience superordinate domain, we incorporated sub-domains focusing on (a) the carer experience, incorporating both communication and involvement, (b) closeness to home area and (c) the level of support and involvement within the community, as well as access to occupational activities. We also specified that quality of life could be indexed by either clinicians or the patient.

Delphi exercise
Following our revisions to the outcome framework, we completed a two-round Delphi exercise with experts in order to create consensus about the most important outcomes for forensic services for people with intellectual disabilities. None of the sub-domains were rated as 'not important' or 'slightly important' by the participants. Five sub-domains did not reach consensus at the end of round one, and these were (a) length of stay, (b) security needs, (c) adaptive functioning, (d) clinician-rated quality of life and (e) closeness to home area. Participants were also asked to identify the five outcomes that they thought were the most important; length of stay was included within these top five and was therefore retained and taken through to round two.
Six sub-domains received the highest average ratings by experts at the end of round two where a clear consensus emerged. These were (a) discharge outcome, (b) treatment response/engagement, (c) premature death and suicide, (d) therapeutic milieu, (e) meaningful activity and (f) reoffending/offending-like behaviour. However, when asked to indicate their top five sub-domains, participants chose sub-domains exclusively within the effectiveness superordinate domain, and these were (a) clinical symptom severity/treatment needs, (b) reoffending/offending-like behaviour, (c) treatment response/engagement/insight, (d) risk assessment measures and (e) recovery measures/direction of care pathway. Perhaps this is not surprising, considering that all of the participants had current or past clinical responsibility for patients within services.
Integrating the findings, the final most important sub-domains were (a) discharge outcome, (b) recovery measures/direction of care pathway, (c) treatment response/engagement/insight, (d) clinical symptom severity, (e) reoffending/offending-like behaviour, (f) risk assessment, (g) premature death and suicide, (h) therapeutic milieu and (i) access to work and meaningful activity. The findings from the Delphi exercise were considered and synthesised with our findings from the consensus exercises and our systematic review. This led to the emergence of a final outcome framework, and we have identified which aspect of the current project led to the generation of each sub-domain in Table 8. During this process, sub-domains were not removed, but additional sub-domains were added or combined into existing sub-domains that had emerged from our analysis.

Discussion
The aim of this project was to identify the domains that should be used to measure outcome for people with intellectual disabilities and forensic needs. This is a topic relevant to all psychiatrists, particularly following the abuse scandal at a specialist intellectual disability hospital in England, Winterbourne View, and the resulting agenda to care for people with intellectual disabilities within 'mainstream' psychiatric services. 11 Similar issues are of concern around the world as many work toward the social inclusion of people with intellectual disabilities within mainstream services within the health and social care sectors. Our aim was achieved by undertaking a systematic review coupled with a consultation exercise involving patients, carers and experts. The findings revealed a series of important sub-domains spread across three superordinate domains indicative of quality. 12 These captured a range of clinical and patient safety variables, along with factors measuring both the patient and carer experience of care.
The largest outcome domain was effectiveness, which is not surprising. The sub-domains included were those that captured aspects of the care pathway, along with a focus on clinical symptoms, recovery and a reduction in reoffending. Related variables, such as length of stay, discharge and need for security, were included, but these may not always directly correlate with clinical need. For example, it would be possible for someone who has received successful treatment to remain in hospital due to delayed discharge because of difficulties with the provision of community-based services to manage risk. Further, length of stay as an indicator of outcome should be neither too short nor too long; it should be 'just right', that is, appropriate to the needs of the individual patient. This adds substantial complexity, especially when length of stay is considered as a sole indicator of outcome. As such, focusing on multiple sub-domains allows for a richer and more thorough picture of the circumstances surrounding the care being offered to patients within forensic services.
However, what 'effective treatment' in this context actually looks like requires further exploration, both on an individual patient level and on a wider service level. Only one study 5 described the nature of the treatment programme that was delivered within the service. Effective treatment is likely to comprise a combination of appropriate medical, psychological and social intervention, informed by individual clinical formulations, but the availability is likely to vary across services, depending on patient needs. At present, in deciding whether a service is effective, regulatory and commissioning bodies rely on easily measurable process variables (e.g. the existence or otherwise of various policies, and the availability or otherwise of various treatments) rather than paying attention to the more important question of whether any of this is making a difference to the outcome. This is clearly unsatisfactory. Likewise, there are no studies which have looked at the economic evaluation of treatments, a rather surprising finding considering the abundance of anecdote and opinion in this field about costs. 5 Considering the future, the structure and form of 'effective treatment' within forensic services should be clarified and drawn from a robust evidence base, bearing in mind that there are very few clinical trials to identify the most effective intervention across the range of those that are available. As such, greater investment in research investigating the clinical effectiveness of forensic services for people with intellectual disabilities is needed.
There are other sub-domains clustered around safety and the patient and carer experience which we incorporated into our final framework. These are important indicators of the quality of forensic services, but may not always directly relate to clinical effectiveness. Nevertheless, helping to ensure that patients with intellectual disabilities detained in forensic services have a good quality of life, including access to meaningful activity, is core business for such services. Measuring sub-domains within the broader safety and patient and carer experience domains is clearly important for all stakeholders, especially patients and their carers. The measurement strategy for these domains could be standardised nationally, bearing in mind that they may not correlate directly with clinical effectiveness. Related to this, the addition of a series of consultation exercises, alongside our systematic review, adds a particular strength to our project. This helped to ensure that we adequately captured the views of all stakeholders and appropriately synthesised them into our final framework. It is important to note that although experts tended to focus on clinical outcomes, patients and carers tended to focus more upon the quality of service provision, and the experience of receiving a service, alongside clinical outcomes. As such, it became important to ensure that these findings formed part of our final outcome framework.
Contrasting our framework with that developed by Fitzpatrick et al, 8 there are both similarities and differences. Fitzpatrick et al 8 grouped outcome measures across a variety of similar domains, such as recidivism, service outcomes, mental state, compliance, satisfaction and substance misuse, among others. They were able to successfully review a variety of specific outcome measures that would enable measurement across these domains, whereas in our study there are relatively fewer instruments that have been standardised for use with people with intellectual disabilities across the sub-domains we have included within our framework. At the same time, there were some notable differences between our framework and that reported by Fitzpatrick et al. 8 For example, substance misuse did not feature explicitly in our framework, but nevertheless is an issue for many with intellectual disabilities, and would fall easily within our Incidents sub-domain. Conversely, there were specific sub-domains that we included which did not appear within the framework reported by Fitzpatrick et al, 8 such as adaptive functioning, access to meaningful activity and the use of restrictive practices, which no doubt are all issues for those with forensic mental health problems and are likely to be more salient within services for people with intellectual disabilities.

Clinical implications
The findings from the current project have direct relevance to recent government initiatives, including Building the Right Support 7 and the new National Service Model 83 that were developed and published in response to the institutional abuse that took place at Winterbourne View in England. 82 For many years, there has been a focus on ensuring that people with intellectual disabilities are afforded good quality care within their own communities, rather than in hospital. The abuse that occurred at Winterbourne View has reignited the drive to ensure that people with intellectual disabilities are not unnecessarily kept in hospital and other restrictive environments, recognising at the same time that some people with intellectual disabilities do need appropriate hospital care from time to time, depending upon their needs. The new National Service Model incorporated hospital admission, which should be integrated within community-based teams, alongside active, clear and robust discharge planning. In order to achieve these aims, services need to be able to measure outcomes, and for those who are admitted to in-patient forensic services, including forensic rehabilitation services, our framework of outcomes should be used by hospitals to index change, as well as service quality. Further, our work has the potential to strengthen current initiatives, such as the Quality Network for Forensic Services (http://www.rcpsych.ac.uk/quality/quality,accreditationaudit/forensicmentalhealth/templatehomepage.aspx), when used within forensic services for people with intellectual disabilities where there is a focus on ensuring practice standards are agreed and met.
Care and treatment reviews, a further initiative created by NHS England following Winterbourne View, involve reviewing the care within a hospital in order to make a judgement about whether an individual is receiving the right care within the right environment. Each review involves a service commissioner and at least two expert advisors, one being a carer or patient. For patients who are within in-patient forensic services, it would be valuable for care and treatment reviews to be structured around our outcome framework. This would help ensure that decisions about care are based on the research evidence and on indicators that are considered to measure change appropriately, helping to ensure the process is robust. A further important contribution of our work is that we integrated the findings from the evidence base, which was used as the springboard to develop our framework. Although there were difficulties with many of the included studies, what was most apparent was the absence of a focus on recovery and of any exploration of the subjective meaning of recovery in this context. Alongside this, many of the studies were small, and very few longitudinal studies drawing on a well-developed outcomes framework have been completed; such studies are clearly and sorely needed.

Limitations
All of these recommendations need to be balanced against several weaknesses associated with this study. First, our findings from the systematic review are only as robust as the quality of the research that was reviewed. The predominant issue with many of the included studies was that few were longitudinal studies that measured outcomes and demonstrated that these outcomes had validity and reliability as outcome indicators. Related to this, because of the marked variation across studies in terms of methodology, it became impossible to find a suitably reliable and valid tool that would index quality in this context. Moreover, if we had been able to measure study quality, this would not have altered the weight put on one study as opposed to another, because the focus was on the domains and how those domains were measured.
Although this is problematic, it is attenuated by the consultation exercises with patients, carers and experts. The patients included within our focus groups are vulnerable, detained under the Mental Health Act, and are often not given a voice. They directly contributed to the development of our outcomes, telling us what was important to them, as users of the services. Incorporating the consultation exercises into our methods helped to ensure that our findings were shaped carefully by those they affect, which in turn increased validity. The second weakness is that the findings from our consensus exercises are based upon the views of a group of individuals, and only four carers were included; our findings may have been enhanced with a larger number of carers, but the content had become repetitive, suggesting we had reached content saturation. Although we attempted to capture the views of a variety of patients, carers and experts, it is certainly possible that had we asked a different group of patients, carers and experts, different issues might have emerged from our analysis. Third, as our study included participants from the UK, there is a question as to whether the findings are generalisable to healthcare systems in other countries. However, we would anticipate that the findings have implications within other countries offering similar services and could be used to inform further research within similar hospitals and services in other parts of the world.
Finally, looking to the future, further work is needed to investigate the reliability and validity of our outcomes framework. This may lead to a reduction in the number of sub-domains found within our current outcomes framework, which would increase the probability that services would adopt the framework. Related to this, it is important to consider that individual-level outcomes are likely to be very important when indexing recovery, and further work is needed as to how these are measured across services, because they are likely to be associated with local clinical practices which may be idiographic and vary from service to service. However, a degree of standardisation would be valuable when monitoring and improving service outcomes, and specifying the method of measurement across our sub-domains is an important next step. Our framework should have a beneficial impact on both service quality and patient outcomes, and it would also allow for the creation of a national minimum data-set, specific to these services, which could be used to track patient outcomes and help develop and refine care pathways. Considering future research, it is now appropriate to consider the likely instruments that could be used to measure outcomes, allowing us to trial this framework within existing hospital care pathways.