Suicide and self-harm are major public health concerns, both in the UK and internationally. 1–4 Self-harm is one of the most common reasons for hospital admission, and accounts for over 200 000 hospital attendances every year in England and Wales. Reference Hawton, Bergen, Casey, Simkin, Palmer and Cooper5 People who have self-harmed are at much greater risk of future episodes of self-harm and suicide than the general population. Reference Hawton, Zahl and Weatherall6 It has been estimated that one in six people will repeat self-harm in the year after a hospital attendance. Reference Owens, Horrocks and House7 The risk of suicide is elevated by between 30- and 100-fold in the year following self-harm, Reference Hawton, Zahl and Weatherall6,Reference Cooper, Kapur, Webb, Lawlor, Guthrie and Mackway-Jones8 and the risk persists: 1 in 15 people die by suicide within 9 years of the index episode. Reference Owens, Horrocks and House7 It has been suggested that multiple repeat episodes of self-harm are associated with an even greater suicide risk. Reference Zahl and Hawton9 A key priority for health service providers as well as national governments, therefore, is to better identify those individuals who are at high risk of suicide. 10 Investigating the utility of risk factors and risk scales in the prediction of suicide is central to this endeavour.
Much of our understanding of the risk factors for repeated self-harm and suicide is derived from individual studies of variable quality and size. Moreover, reviews of the literature to date have been largely narrative, have been retrospective in nature, Reference Fliege, Lee, Grimm and Klapp11 or have looked at non-fatal outcomes. Reference Larkin, Di Blasi and Arensman12 This raises concerns because prospective cohort studies are more appropriate than retrospective studies for identifying risk factors, and are less prone to bias. Reference Mann13 A refinement of a simple ‘risk factor’ approach to assessment is to incorporate individual factors into composite risk scales. These scales are specifically designed to quantify the risk of later suicide and are commonly used in clinical practice, leading clinicians to classify people as being at low, medium or high risk. A wide variety of risk assessment scales are currently used in different health settings. For example, a recent study in 32 English hospitals found that risk assessment scales were in widespread use, with many services using locally developed instruments. Reference Quinlivan, Cooper, Steeg, Davies, Hawton and Gunnell14 The utility of scales has seldom been investigated in a systematic manner. A recent paper Reference Randall, Colman and Rowe15 reviewed a number of risk scales, but the researchers did not perform a meta-analysis because of the studies' heterogeneity; they considered only a restricted number of scales used in an emergency department and did not focus on suicide as an outcome.
Drawing on the international research literature, this is the first systematic review and meta-analysis of (a) prospective studies examining the factors associated with suicide following self-harm and (b) risk assessment scales predicting suicide in people who have self-harmed or were under specialist mental healthcare. We were keen to examine individual risk factors as well as combinations of risk factors (in the form of scales) in this paper. Both contribute to clinical assessments of risk in health service settings. The current analyses were initially undertaken as part of the development of the guideline on the longer-term management of self-harm for the National Institute for Health and Care Excellence (NICE). 16
Types of studies and search method
A search was conducted in Embase, MEDLINE, PsycINFO and CINAHL, from their inception up to February 2014, for English-language prospective cohort studies for inclusion in the review of risk factors and risk scales. The use of prospective studies provides some reassurance that the factors identified here are those most robustly linked to later suicide. The searches formed part of a wider search that was undertaken for the NICE guideline on the longer-term management of self-harm 16 and included research articles published up to February 2014. Additional articles were identified through discussion with the NICE Guideline Development Group and from reference lists of relevant studies, including grey literature. We also consulted experts in the field during the consultation period of the guideline by emailing them with a list of papers that had already been identified and asking for any additional studies that had been omitted. Citations from the searches were downloaded to the Reference Manager software tool and duplicates were removed. Records were then screened against the eligibility criteria of the review before being appraised. Full details of the search strategies used for MEDLINE are provided in online Table DS1. The PRISMA statement for this study can be found in online supplement DS1.
Population: risk factors and risk scales
We included studies of people who presented to hospital following self-harm. Consistent with current research and clinical practice in the UK (NICE clinical guideline 133), 16 we included all types of self-harm irrespective of motive.
For the risk scales review, we also included studies examining the risk of suicide in people under specialist mental healthcare. This was to broaden the scope of the review and increase the number of studies considered. Differences in scale performance between populations were examined where applicable.
Outcomes: risk factors and risk scales
Studies that reported an effect estimate (adjusted or unadjusted odds ratios, risk ratios or hazard ratios (HRs) with their 95% confidence interval) for the association between the examined risk factor and suicide following self-harm were included for meta-analysis. First, one of the authors (M.K.Y.C.) listed all of the risk factors and the reported effect estimates from each study in a table. Then, M.K.Y.C. grouped the risk factors with the reported hazard ratios from different studies. For example, three studies reported the adjusted hazard ratio for the risk factor ‘history of previous self-harm’ in relation to suicide following self-harm, and these were grouped together then meta-analysed.
Risk assessment scales required previous validation by at least one study to be included in the review. The psychometric properties of the scales that were examined included sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV), using predefined cut-off scores. For further details on the calculation of PPV and NPV, see Altman & Bland. Reference Altman and Bland17 The main outcome was suicide. For studies that did not report PPV or NPV, these were calculated and authors H.B. and N.M. cross-checked each other's calculations.
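As an illustration of how PPV and NPV can be derived when a study reports only sensitivity, specificity and the number of suicides in the sample, the following is a minimal sketch. The function name `predictive_values` is hypothetical and the approach (rebuilding the expected 2×2 table from the reported proportions) is an assumption about a reasonable calculation, not necessarily the exact procedure the reviewers followed. The example input uses the figures reported for the BHS (cut-off ⩾10) in Beck et al (1985): sensitivity 91%, specificity 50.6%, 11 suicides among 165 participants.

```python
def predictive_values(sensitivity, specificity, n_events, n_total):
    """Return (PPV, NPV) from sensitivity, specificity and outcome counts.

    Expected cell counts of the 2x2 table are rebuilt from the reported
    proportions; no rounding to whole people is applied.
    """
    n_non_events = n_total - n_events
    true_pos = sensitivity * n_events           # suicides scoring above the cut-off
    false_neg = n_events - true_pos             # suicides scoring below the cut-off
    true_neg = specificity * n_non_events       # non-suicides scoring below the cut-off
    false_pos = n_non_events - true_neg         # non-suicides scoring above the cut-off
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# BHS (cut-off >= 10), Beck et al (1985): values taken from Table 2
ppv, npv = predictive_values(0.91, 0.506, n_events=11, n_total=165)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these inputs the sketch gives values that agree, after rounding, with the calculated PPV and NPV shown for that study in Table 2.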
Assessment of bias in included studies
The risk factor review adopted the NICE methodology assessment checklist for cohort studies. 18 It consisted of six questions covering the representativeness of the sample, the effect of loss to follow-up, the measurement of prognostic factors and outcomes, the use of confounders and the appropriateness of the statistical analysis for the design of the study.
The quality assessment for the risk scales studies was conducted using the NICE methodology checklist: the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool for diagnostic test accuracy. 18 The checklist covered the clarity of the selection criteria, the appropriateness of the reference standard in identifying the target condition, the clarity of the execution of the index test and reference standard to allow replication, and an explanation of drop-out.
There were insufficient studies in the meta-analysis to assess publication bias through standard techniques such as Egger's test. Reference Egger, Smith, Schneider and Minder19 In addition, there are currently no widely accepted techniques for assessing the risk of publication bias in diagnostic accuracy/screening studies; Reference Deeks, Bossuyt and Gatsonis20 therefore, we did not use any of these techniques.
Two reviewers (M.K.Y.C., H.B.) assessed the quality of each paper: one (H.B.) rated study quality and the other (M.K.Y.C.) checked the individual items on the score sheets. Disagreements that could not be resolved through discussion between the reviewers were brought before the full Guideline Development Group (15 members, including experienced psychiatrists, psychologists, academic researchers, practitioners in the field of social care and service user representatives). Discrepancies were discussed until consensus was reached in the group.
Data were extracted and entered into a spreadsheet independently by two reviewers (M.K.Y.C., H.B.) who then checked each other's data extraction and entry. Despite the limited number of studies, meta-analysis was conducted for both reviews because suicide is a rare outcome and meta-analyses may help to highlight the limitations of primary data more clearly. Reference Higgins, Thompson, Deeks and Altman21 ‘K’ represented the number of populations studied, and there was no duplication of samples in the meta-analyses. Risk factors robustly reported across multiple distinct samples may have greater validity than those reported in fewer samples. For the risk factor review, we calculated the natural log of each hazard ratio and derived its standard error from the upper and lower limits of the reported confidence interval. The natural logs of the ratios and their standard errors were entered into Review Manager 5 software according to the grouping of risk factors. A generic inverse variance method was used to calculate the pooled effect estimates of the hazard ratios. The random-effects model was used to ensure relatively conservative results. The I² statistic was used to quantify heterogeneity, that is, the proportion of total variation in the effect estimates that is due to differences between studies rather than chance. Reference Higgins and Thompson22
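The pooling steps described above can be sketched as follows. This is a minimal illustration, not the Review Manager implementation: it assumes 95% confidence intervals, and it assumes the DerSimonian-Laird estimator for the between-study variance, which the text does not name. The function names and the three example hazard ratios are hypothetical and are not data from the review.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, lower, upper):
    """Natural log of a hazard ratio and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(hr), se


def pool_random_effects(estimates):
    """Generic inverse variance pooling with a random-effects model.

    estimates: list of (log_hr, se) tuples, one per study population.
    Between-study variance uses DerSimonian-Laird (an assumption here).
    """
    y = [e[0] for e in estimates]
    w = [1.0 / e[1] ** 2 for e in estimates]           # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(estimates) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0    # between-study variance
    w_re = [1.0 / (e[1] ** 2 + tau2) for e in estimates]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I squared, in percent
    return {
        "hr": math.exp(pooled),
        "ci_lower": math.exp(pooled - Z_95 * se_pooled),
        "ci_upper": math.exp(pooled + Z_95 * se_pooled),
        "i2": i2,
    }


# Hypothetical hazard ratios with 95% CIs (illustrative only)
studies = [(1.5, 1.1, 2.05), (2.0, 1.2, 3.33), (1.7, 1.0, 2.89)]
result = pool_random_effects([log_hr_and_se(*s) for s in studies])
print(result)
```

Working on the log scale keeps the confidence interval symmetric around the estimate, which is why the standard error is recovered from the difference of the logged limits rather than from the limits themselves.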
For the review of risk scales, data were required from a minimum of four separate samples to conduct bivariate meta-analysis – a limitation imposed by the software that was used. This reflects difficulties in model convergence that are commonly experienced when a smaller number of studies are included in a complex meta-analytic model. The ‘metandi’ command for Stata 12 was used to obtain pooled estimates of sensitivity and specificity. Review Manager 5 was also used for producing forest plots. Heterogeneity was assessed by visual examination of the forest plots and the 95% prediction regions of the hierarchical summary receiver operator characteristic (ROC) curve plots. Reference Rutter and Gatsonis23
In total, 18 590 records were identified from the electronic search. Of these, 18 364 citations were excluded because they were not relevant, and 226 full-text articles were included in the review. There were 12 prospective cohort studies included in the meta-analysis for risk factors associated with suicide following self-harm. Reference Cooper, Kapur, Webb, Lawlor, Guthrie and Mackway-Jones8,Reference Bergen, Hawton, Waters, Ness, Cooper and Steeg24–Reference Suokas, Suominen, Isometsä, Ostamo and Lönnqvist34 For the full-text articles, studies were excluded if they were retrospective in their design, if the outcomes were not repeated self-harm or not extractable, and if the population did not meet our criteria. Reference Brown, Beck, Steer and Grisham35,Reference Fawcett, Scheftner, Fogg and Clark36 More details can be found in online Fig. DS1(a). All participants had experienced at least one episode of self-harm and all were recruited in the hospital setting. They were followed up for variable time periods, with suicide most commonly determined from national registers.
Seven prospective cohort studies were included in the review of risk scales. Reference Beck, Steer, Kovacs and Garrison37–Reference Suominen, Isometsä, Ostamo and Lönnqvist43 Studies were excluded when relevant data were unavailable or the reference standard did not meet the criteria. For example, studies that reported the development of a new measure Reference Steeg, Kapur, Webb, Applegate, Stewart and Hawton44 or did not provide usable data on the prediction of suicide Reference Cooper, Kapur, Dunning, Guthrie, Appleby and Mackway-Jones45,Reference Bolton, Spiwak and Sareen46 were excluded. More details can be found in online Fig. DS1(b). Participants who had self-harmed or were under mental healthcare had all been assessed using a risk assessment scale. They had then been followed up, during which time the number of deaths by suicide was determined in order to provide data for the predictive validity of the scales used.
A risk of bias assessment was conducted for the review of risk factors and risk scales. The two reviewers followed the guideline methodology for assessment, and they reached consensus in their ratings (see the Method section for details). A majority of studies (89.5%) met the criteria and overall they were of acceptable quality, with the exception that the majority of studies (95%) were unclear about the reasons for loss to follow-up. For a full list of included studies and their characteristics, see online Tables DS2 and DS3.
Several factors had robust evidence (the adjusted hazard ratio was statistically significant with low heterogeneity) to support their association with suicide following an index episode of self-harm. They included previous episodes of self-harm, suicidal intent, physical health problems and male gender. These factors emerged from the meta-analysis with robust effect sizes that changed little when adjusted for important confounders, and they appeared to be independent of each other.
There was insufficient evidence for other factors included in the meta-analysis to identify or discount an association with the risk of suicide following self-harm. For instance, alcohol misuse was of marginal significance with moderate heterogeneity; however, definitions varied between studies, making interpretation difficult. Psychiatric history and unemployment were also of marginal significance after pooling the effects.
Strong evidence for an association with suicide following self-harm
Previous episodes of self-harm. People with a history of self-harm prior to an index episode were at higher risk of completing suicide compared with those who did not have such a history (adjusted HR = 1.68, 95% CI 1.38–2.05, K = 4, I² = 19%, Table 1). All four studies were adjusted for confounders, and the heterogeneity observed was not significant.
Table 1 notes: results in bold are significant.
a. The ratios (adjusted or unadjusted) are based on what has been reported in the studies. See online Table DS4 for the confounders adjusted for.
b. Past history, treatments, admissions from records, psychiatric out-patient.
Suicidal intent. People with suicidal intent were more likely to complete suicide following their index episode of self-harm (adjusted HR = 2.70, 95% CI 1.91–3.81, K = 3, Table 1). The three studies had slightly different definitions of ‘suicidal intent’, although no heterogeneity was observed in our analysis. Aside from a binary classification of ‘yes’ or ‘no’, Reference Suokas, Suominen, Isometsä, Ostamo and Lönnqvist34 one study used ‘avoided discovery at the time of self-harm’ Reference Cooper, Kapur, Webb, Lawlor, Guthrie and Mackway-Jones8 and another used ‘suicidal motive’. Reference Bjornaas, Jacobsen, Haldorsen and Ekeberg25
Gender. Compared with females, males were at higher risk of completing suicide following an episode of self-harm. Data were pooled to report an adjusted hazard ratio of 2.05 (95% CI 1.70–2.46, K = 5, Table 1). No heterogeneity was observed.
Poor physical health. People with poor physical health/chronic illness were at higher risk of suicide following self-harm. The adjusted hazard ratio for the association between poor physical health and completed suicide was statistically significant (adjusted HR = 1.99, 95% CI 1.16–3.43, K = 3, I² = 29%, Table 1).
Marginal evidence for an association with suicide following self-harm
History of psychiatric contact. People with a history of contact with psychiatric services were found to be at a slightly higher risk of suicide following self-harm than those without such a history, although the association was not statistically significant (adjusted HR = 1.27, 95% CI 0.94–1.73, K = 4, I² = 55%; see Table 1 for the unadjusted hazard ratio). The heterogeneity might be explained by the inconsistency in the definition of psychiatric contact.
Alcohol misuse. The association between alcohol misuse and completed suicide following self-harm was found to be marginally significant (adjusted HR = 1.63, 95% CI 1.00–2.65, K = 3). However, high heterogeneity was observed (I² = 53%; heterogeneity over 50% was regarded as high). Unadjusted data from two studies were also pooled, yet resulted in considerable heterogeneity (I² = 64%) (Table 1). Participants in the studies had a psychiatric diagnosis of alcohol misuse, but it was unclear whether alcohol was consumed shortly before they died by suicide.
Economic status. The pooled and adjusted hazard ratio for this association was not statistically significant (adjusted HR = 1.08, 95% CI 0.65–1.8, K = 3) and high heterogeneity was observed (I² = 71%). The wide confidence interval suggested no clear evidence of an association in the context of high heterogeneity. For the list of adjusted confounding factors, please refer to online Table DS4.
Three scales were included in this review: the Beck Hopelessness Scale (BHS), Reference Beck, Steer, Kovacs and Garrison37 the Suicide Intent Scale (SIS) Reference Harriss and Hawton39 and the Scale for Suicide Ideation (SSI). Reference Beck, Brown, Steer, Dahlsgaard and Grisham38 Brief descriptions of what these tools were designed to measure/assess are listed in online Table DS5. Table 2 shows the results of the predictive validity of the scales reviewed.
|Study||Risk of bias assessment a: selection criteria clearly described||Reference standard likely to classify the target condition correctly||Index test described in sufficient detail to permit its replication||Reference standard described in sufficient detail to permit its replication||Withdrawals explained|
|Scale (cut-off)||Sensitivity, %||Specificity, %||PPV, %||NPV, %||Suicides/sample, n/N (%)|
|Beck et al (1985) Reference Beck, Steer, Kovacs and Garrison37||Yes||Yes||Unclear||Unclear||Unclear|
|BHS (⩾10)||91||50.6||11.6 b||98.7 b||11/165 (6.67)|
|Beck et al (1999) Reference Beck, Brown, Steer, Dahlsgaard and Grisham38||No||Yes||Yes||Yes||Unclear|
|BHS (⩾8)||90||42||1.3||99.7 b||30/3701 (0.81)|
|SSI-W (>16)||80||78||2.8||99.7 b||30/3701 (0.81)|
|SSI-C (⩾2)||53||83||2.4||99.5 b||30/3701 (0.81)|
|Nimeus et al (1997) Reference Nimeus, Träskman-Bendz and Alsén40||No||Yes||No||Yes||Unclear|
|BHS (9)||77||42||8||96.5 b||13/212 (6.13)|
|BHS (13)||77||61.3||13||97.6 b||13/212 (6.13)|
|Nimeus et al (2002) Reference Nimeus, En and Traskman-Bendz41||Yes||Yes||Yes||Yes||Unclear|
|SIS (19)||59||77||9.7||97.8 b||22/555 (3.96)|
|Suominen et al (2004) Reference Suominen, Isometsä, Ostamo and Lönnqvist43,c||Yes||Yes||Yes||Yes||Unclear|
|BHS (⩾9)||60||52||9.2||93.9 b||17/224 (7.6)|
|Harriss & Hawton (2005) Reference Harriss and Hawton39|
|SIS (10, male)||76.7||48.8||4.2||98.6 b||30/1049 (2.86)|
|SIS (14, female)||66.7||75.3||4||99.2 b||24/1440 (1.67)|
|Stefansson et al (2012) Reference Stefansson, Nordström and Jokinen42|
|SIS (16)||100||52||16.7||100 b||7/80 (8.75)|
a. Criteria for the risk of bias assessment: were the selection criteria clearly described?; was the reference standard likely to classify the target condition correctly?; was the execution of the index test described in sufficient detail to permit its replication?; was the execution of the reference standard described in sufficient detail to permit its replication?; were withdrawals from the study explained?
b. Calculated score (not reported in original paper).
c. Not reported in original paper, but obtained by McMillan et al Reference McMillan, Gilbody, Beresford and Neilly47 for their review by writing to the authors.
Scales that predict suicide in clinical populations
Of the three included scales, meta-analysis was conducted for studies that used the BHS and SIS, whereas the SSI did not have enough data points. The analysis of the BHS for predicting suicide in high-risk groups comprised four studies: two with patients receiving mental healthcare (60 and 180 months' follow-up) Reference Beck, Steer, Kovacs and Garrison37,Reference Beck, Brown, Steer, Dahlsgaard and Grisham38 and two with people who had self-harmed (4 and 144 months' follow-up) Reference Nimeus, Träskman-Bendz and Alsén40,Reference Suominen, Isometsä, Ostamo and Lönnqvist43 with a total sample size of 4302. When meta-analysed, the results showed moderate sensitivity (0.80; 95% CI 0.64–0.90) and low specificity (0.46, 95% CI 0.41–0.51). There was moderate to high heterogeneity for both sensitivity and specificity (see Fig. 1(a) for the summary ROC plot and Fig. 2(a) for forest plots). Although comparisons are limited by the small number of studies in the meta-analysis, the BHS appeared to be more sensitive for patients receiving mental healthcare than for people who had self-harmed, but in both groups it was similar in terms of specificity.
The highest sensitivity (100%) reported in any study was for the SIS (54 to 120 months' follow-up). Reference Stefansson, Nordström and Jokinen42 However, the sensitivity of the SIS was much lower in other studies that investigated this instrument. The meta-analysis of the SIS as a whole found relatively low sensitivity (0.73, 95% CI 0.58–0.84) and specificity (0.64, 95% CI 0.50–0.76) based on four populations from three studies and 3124 participants (see Figs. 1(b) and 2(b)).
This is the first meta-analysis of prospective studies investigating risk factors associated with suicide following an episode of self-harm. There is robust pooled evidence from 12 studies to show that four factors (previous episodes of self-harm, suicidal intent, poor physical health and male gender) are associated with a higher risk of dying by suicide following the index episode. In these studies, at least 32% of people had a prior history of self-harm before the index episode.
This is also the first systematic review and meta-analysis of a range of risk scales investigating their potential to improve the prediction of suicide in high-risk groups. However, despite using broad inclusion criteria, only seven studies providing data on three scales (BHS, SSI, SIS) met the criteria for our review. Of these three scales, it was only possible to conduct meta-analysis on two (BHS, SIS). From this review, there is no robust evidence to support the use of one risk scale over another, and because all the scales reviewed had a low PPV with significant numbers of false positives these scales should not be used in clinical practice alone to assess the future risk of suicide. Taken together, our findings cast doubt on the current approach to ‘risk assessment’ in which risk tools and scales have become the norm.
Although this review employed a systematic approach, the overlap of risk factors and the fact that very few studies adjust for the same confounders limits our confidence in the meta-analysis. In addition, comprehensive data on the factors associated with suicide following self-harm are not always available. Clearly, these problems limit the interpretation of our findings and leave some uncertainty about which factors should be regarded as the most important markers of risk. Moreover, studies measure risk factors in different ways, which may contribute to the heterogeneity and/or uncertainty of some of the results.
With regard to the risk scales review, a paucity of studies meant that there were limited options for conducting a meta-analysis. In addition, where meta-analyses were possible they were based on sparse data with high heterogeneity. Therefore, only limited conclusions can be drawn. An important drawback is the low PPV (between 1.3 and 16.7%) found for all scales. This suggests that such scales are identifying many false positives, thereby limiting their utility. It could be argued that the low PPV is simply a reflection of the low incidence of fatal outcomes. However, these studies had very long follow-up periods (up to 15 years), which would increase the incidence of such outcomes. Over shorter follow-up periods, the PPV of these scales is likely to be even lower. For example, Nimeus and colleagues Reference Nimeus, Träskman-Bendz and Alsén40 used the shortest follow-up period (4 months) compared with the other studies and found a PPV of 8%. Moreover, the clinical implications drawn from studies using long follow-up periods may be of limited use because clinicians' primary concern is to predict suicide in the immediate period following an act of self-harm, rather than in the subsequent months or years. It is also important to recognise that different studies used different risk scales, and some used different cut-off scores for the same risk scales (BHS and SIS). This is probably because reported cut-off scores were determined post hoc based on optimal performance derived from the ROC curve. Such approaches are likely to overestimate the screening accuracy of the test, which further raises concerns regarding the performance of all risk scales. Taking these limitations into account, we can conclude that there is insufficient evidence to support the use of risk scales and tools in clinical practice.
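The dependence of PPV on the incidence of the outcome can be made concrete with a short sketch. It applies the standard Bayes relationship between PPV, sensitivity, specificity and prevalence, using the pooled BHS estimates reported in this review (sensitivity 0.80, specificity 0.46) purely as an example. The function name `ppv_at_prevalence` and the list of prevalences are hypothetical choices for illustration, and the calculation assumes that sensitivity and specificity stay constant as the follow-up period changes.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value for a given outcome prevalence (Bayes' rule)."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Pooled BHS estimates from the meta-analysis; prevalences are illustrative only
BHS_SENSITIVITY = 0.80
BHS_SPECIFICITY = 0.46

for prevalence in (0.005, 0.01, 0.05, 0.10):
    ppv = ppv_at_prevalence(BHS_SENSITIVITY, BHS_SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

Under these assumptions the PPV falls as the outcome becomes rarer, which is the pattern expected when the same scale is evaluated over a shorter follow-up period with fewer deaths by suicide.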
Nevertheless, given the complexity in this area, the utility of novel risk factors, groups of risk factors and interactions between risk factors in assessment might be helpfully explored in future studies.
Self-harm is a major health problem in many countries. People who self-harm have poorer physical health and a lower life expectancy than the general population. Reference Bergen, Hawton, Waters, Ness, Cooper and Steeg24 What do the results of our review tell us about how we should manage self-harm? Clearly, some factors indicate an increased risk of suicide in this population. We found the strongest evidence for long-recognised risk factors – previous episodes of self-harm, suicidal intent, poor physical health and male gender. The major advantage of our study over previous work was the ability to specifically investigate predictors of suicide risk following self-harm, and to pool findings across studies to produce robust estimates of the magnitude of any increased risk. However, when assessing people following an act of self-harm, being able to identify these associated factors is still unlikely to help us to predict the risk of later suicide, Reference Large, Sharma, Cannon, Ryan and Nielssen48 because these characteristics are common in clinical populations.
All of the scales and tools reviewed here had poor predictive value. The use of these scales, or an over-reliance on the identification of risk factors in clinical practice, is, in our view, potentially dangerous and may provide false reassurance for clinicians and managers. The idea of risk assessment as risk prediction is a fallacy and should be recognised as such. We are simply unable to say with any certainty who will and will not go on to have poor outcomes. People who self-harm often have complex and difficult life circumstances, and clearly need to be assessed, but we need to move away from assessment models that prioritise risks at the expense of needs.
An alternative approach to the assessment of people who have self-harmed might be to characterise the prior act of self-harm, determine the specific factors that precipitated that episode for that individual and identify those personal factors that could increase the likelihood of later suicide. This may include recognition of the more robust factors identified by this review, including male gender, suicidal intent, having poor physical health and having self-harmed before. It would also include other factors not necessarily common to other people who have self-harmed. To do this would involve: first, understanding the meaning of the act of self-harm for that individual, taking into account their current relationships, context and past experiences; and, second, understanding how the act of self-harm, the person's intent and their affective state interrelate. No doubt, many of the factors identified in the previous or current reviews will be relevant at assessment. But many will not be. Importantly, there is some evidence that thorough assessments after self-harm may on their own improve outcomes. Reference Kapur, Murphy, Cooper, Bergen, Hawton and Simkin49,Reference Bergen, Hawton, Waters, Cooper and Kapur50 The opportunity for service users to discuss their concerns and formulate action plans may drive the improvements, or it may be that thorough assessments facilitate access to aftercare.
In our collective quest to reduce the risk of suicide following self-harm by building highly structured assessment tools from risk factors, rather than encouraging a real engagement with the individual, we may well be putting our own professional anxieties above the needs of service users and, paradoxically, increasing the risks of suicide following self-harm.
This study was funded in part by NICE during the development of the NICE guideline for self-harm: longer term management.
The authors would like to acknowledge and thank all members of the Guideline Development Group for the NICE Self-harm (longer term management) Guidelines. We would also like to thank Dr Clare Taylor and Ms Nuala Ernest for their help with editing the final version of the manuscript.