Appraisal of patient-level health economic models of severe mental illness: systematic review

Background Healthcare decision makers require accurate long-term economic models to evaluate the cost-effectiveness of new mental health interventions. Aims To assess the suitability of current patient-level economic models to estimate long-term economic outcomes in severe mental illness. Method We undertook pre-specified systematic searches in MEDLINE, Embase and PsycINFO to identify reviews and stand-alone publications of economic models of interventions for schizophrenia, bipolar disorder and major depressive disorder (PROSPERO: CRD42020158243). We screened paper titles and abstracts to identify unique patient-level economic models. We conducted a structured extraction of identified models, recording the presence of key predefined model features. Model quality and validation were appraised using the 2014 ISPOR and 2016 AdViSHE model checklists. Results We identified 15 unique patient-level models for psychosis and major depressive disorder from 1481 non-duplicate records. Models addressed schizophrenia (n = 6), bipolar disorder (n = 2) and major depressive disorder (n = 7). The predominant model type was discrete event simulation (n = 9). Model complexity and incorporation of patient heterogeneity varied considerably, and only five models extrapolated costs and outcomes over a lifetime horizon. Key model parameters were often based on low-quality evidence, and checklist quality assessment revealed weak model verification procedures. Conclusions Existing patient-level economic models of interventions for severe mental illness have considerable limitations. New modelling efforts must be supplemented by the generation of good-quality, contemporary evidence suitable for model building. Combined effort across the research community is required to build and validate economic extrapolation models suitable for accurately assessing the long-term value of new interventions from short-term clinical trial data.

Severe mental illness typically causes significant functional impairment and consequent poor physical health. 1,2 Excess mortality among people with severe mental illness is as much as 2-3 times higher than in the general population, with multiple interacting causes. 3 These include much higher rates of preventable chronic disease, such as diabetes and cardiovascular disease. 3-5 Premature death from non-communicable disease is up to 60% more likely in people with severe mental illness. 6 Life expectancy with severe mental illness is 10-20 years shorter in high-income countries and 30 years shorter in low-income countries. [7][8][9] Recent discussions have suggested that people with severe mental illness should be prioritised for COVID-19 vaccination, given their increased risk of severe infection and COVID-19-related morbidity and mortality. 10 Schizophrenia, bipolar disorder and major depressive disorder are all major contributors to the global burden of disease. 11 These three conditions are identified as severe mental illness in this review. All three conditions are associated with the pattern of excess mortality described above. [12][13][14] Quality of life is also severely diminished in individuals affected by each of these conditions 15 and each is associated with substantial functional impairment. [16][17][18][19] From an economic perspective, each of these mental disorders also carries substantial lifetime costs, borne by both individuals and health systems. 20 Although major depressive disorder is not always classified as a severe mental illness, 15 we include it in this review to capture severe depression that leads to psychiatric hospital admission.
Clinical trials of new interventions for these conditions are generally short term, and therefore do not measure the full scale of lifetime patient outcomes. Long-term evidence is necessary to inform decisions about which interventions should be implemented within healthcare systems such as the National Health Service (NHS). 21 Economic models that estimate lifetime health and cost outcomes for individual patients are vital to understanding the long-term value of new interventions for severe mental illness. 22 We therefore examine patient-level economic models for the three conditions described.
Challenges in economic modelling in mental health are well described, 23 in particular those due to the short time horizon of clinical trials in mental health 24 and the wide scope of potential economic effects of mental disorders, including productivity losses and greater lifetime use of healthcare resources for the individual directly affected, as well as spill-over effects on economic outcomes for a patient's family and their wider community. 22 In severe mental illness, a recent systematic review of economic models assessing antipsychotic medication for schizophrenia found 90% of models to have 'very serious' quality limitations based on National Institute for Health and Care Excellence (NICE) checklist appraisal. 25,26 There is concern that poor-quality economic evidence may be similarly widespread in bipolar disorder and major depressive disorder. 27,28 Poor-quality economic modelling may lead to inefficient allocation of healthcare resources, misestimating the health and cost effects of alternative interventions. 29 With increasing financial commitment and focus on improving outcomes in mental health, 30 clinical commissioners urgently need accurate evidence on cost-effective care in severe mental illness.
The British Journal of Psychiatry (2022) 220, 86-97. doi: 10.1192/bjp.2021.121

Informing national treatment guidelines
In the UK, NICE has separate guidelines for psychosis, encompassing schizophrenia and bipolar disorder, 31 and for major depressive disorder. 32 In psychosis, guidelines recommend the use of psychological therapies, such as cognitive-behavioural therapy (CBT) and family intervention, alongside pharmacological treatment. 31 Further treatments are under development, such as the use of virtual reality therapy to help patients overcome anxious avoidance of everyday social situations. 33 However, the long-term effectiveness and cost-effectiveness of new and existing treatments remain poorly understood, especially in the context of varying real-world adherence to treatment. 34,35 Studies available to inform NICE guidelines for psychosis are largely characterised by short follow-up periods (up to 6 months) and small samples (an average of 79 participants per study). 36 NICE explicitly identifies this as a key limitation in their psychosis guidance, as data for several parameters, including relapse and treatment discontinuation probabilities, require extrapolation to a lifetime horizon to capture the long-term impact of treatment on patient outcomes and costs. 31 Psychosis is a severe and often enduring mental health problem, with most patients experiencing multiple episodes or persistent symptoms. 31,37,38 However, in the absence of robust long-term evidence, it is not possible to confirm whether any extrapolation of short-term data either over- or underestimates the cost-effectiveness of different psychosis treatments. 31 Short follow-up in clinical studies similarly affects NICE guidelines for major depressive disorder. The bespoke economic model constructed to inform current guidelines for pharmacological interventions in depression had a time horizon of only 14 months, limited by short study follow-up. 32 The NICE guidance explicitly noted the variable methodological quality of the economic evaluation studies that were available to inform its policy-making. 32
As in psychosis, economic models of major depressive disorder must have capacity to estimate the impact of new interventions over the expected duration of the long-term disease course. Major depressive disorder can be a chronic condition with a high risk of recurrence over a patient's lifetime. 39 In a large prospective observational study in The Netherlands, nearly 20% of patients had a single major depressive episode lasting longer than 24 months. 40 Economic modelling over a short time horizon does not give a fair reflection of a new intervention's value to the health system during each patient's lifetime.

The present review
The aim of this systematic review is to summarise health economic models of schizophrenia, bipolar disorder and major depressive disorder and their potential to extrapolate short-term studies informing the long-term value of interventions for severe mental illness. We undertook this review to inform the extrapolation of the gameChange trial, 33 to provide recommendations for the broader research community to help identify patient-level models suitable for extrapolating short-term trials in severe mental illness, and to help improve the quality of future patient-level models in this area.
We focus on models designed to simulate individual patients (patient-level models) as they capture variation in presentation that leads to highly individualised patient experiences and outcomes in severe mental illness, which cannot readily be subgrouped. 41,42 Patient-level models are distinct from cohort models in explicitly calculating the expected costs and benefits for each individual patient, rather than estimating average outcomes across a patient group. 43 Compared with cohort model approaches, patient-level model structures are better able to represent complex interactions between patient characteristics and an evolving disease history 44 and to capture non-linear relationships between individual risk factors and modelled outcomes. 43 By more closely representing variation in disease course driven by individual patient histories and characteristics as seen in severe mental illness, patient-level models can generate more accurate estimates of health and cost outcomes in the overall population. 45
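The point about non-linear relationships can be illustrated with a small numerical sketch. Everything in it is hypothetical and chosen only for illustration: the risk function `annual_relapse_risk`, its coefficient and the ten-patient population are assumptions, not values from any reviewed model. The sketch compares a cohort-style shortcut (risk evaluated once at the average patient) with a patient-level calculation (risk evaluated per patient, then averaged).

```python
def annual_relapse_risk(prior_relapses: float) -> float:
    """Hypothetical convex risk function: risk grows with the square of relapse history."""
    return min(1.0, 0.03 * (1 + prior_relapses) ** 2)

# Hypothetical population: most patients have no prior relapse, a few have several.
prior_relapses = [0, 0, 0, 0, 0, 0, 1, 1, 3, 4]

# Cohort-style shortcut: evaluate the risk function once, at the average history.
mean_history = sum(prior_relapses) / len(prior_relapses)
risk_at_mean = annual_relapse_risk(mean_history)

# Patient-level approach: evaluate the risk for each patient, then average.
patient_level_mean = sum(annual_relapse_risk(n) for n in prior_relapses) / len(prior_relapses)

print(f"risk at average patient:   {risk_at_mean:.4f}")
print(f"average of patient risks:  {patient_level_mean:.4f}")
```

Because the assumed risk function is convex, the average of the individual risks exceeds the risk of the average patient, so the shortcut would understate population-level relapse in this toy example. With a different (for example concave) risk function the bias could run the other way.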

Method
The protocol for the literature review was registered in the PROSPERO international prospective register of systematic reviews (registration number CRD42020158243). Using OVID, we searched MEDLINE, Embase and PsycINFO for health economic models of psychosis, schizophrenia, bipolar disorder and major depressive disorder published between 1986 and 26 August 2020 (the date of extraction). Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2009) guidelines 46 were followed. We used a two-stage approach to identify patient-level models in the review. First, we identified previous reviews of economic models for psychosis, schizophrenia, bipolar disorder and major depressive disorder. To achieve this, two reviewers screened titles and abstracts of identified records for reviews of economic models (both patient-level and non-patient-level models). Full-text records were requested for the reviews identified. Two reviewers extracted details of patient-level models reported in each review, alongside the databases searched and time periods covered by each review. Second, we updated the identified reviews by searching for all economic models published since the last date covered by the reviews.
The inclusion criteria used to identify relevant studies were as follows: (a) studies with decision models of disease progression (models estimating risk factor progression) that reported health economic outcomes such as costs, (quality-adjusted) life expectancy and disease-related complications (such as psychotic or depressive episodes and treatment side-effects); (b) studies with a model-based economic evaluation of intervention(s) in severe mental illness, such as cost-consequence, cost-utility and cost-effectiveness studies.
Searches were restricted to English language studies owing to challenges in locating and assessing non-English studies, given limited resources available to the research team, but no geographical restrictions were applied. Reference lists of identified economic models were also searched to identify any additional patient-level models missed by systematic searching. Abstracts and conference presentations reporting decision models were not included, as these did not provide sufficient information to allow critical appraisal of the models. For economic models identified across all conditions, patient-level economic models were extracted by reviewing titles and abstracts using keywords such as: 'microsimulation', 'first-order Monte Carlo simulation', '(Markov) patient-level', 'individual-level' and 'discrete-event simulation'. 45 References were managed using EndNote X9.
There are several types of patient-level model. 45 A patient-level decision tree estimates each patient's expected health outcomes and costs without accounting for the timing of each modelled event (such as an in-patient stay or medication switch), other than the sequence in which each event occurs. However, most patient-level models do account for the timing as well as the sequence of modelled events. Patient-level Markov models simulate individual patients flowing between several health states, with transitions between states at fixed time cycles (such as each day, month or year). In discrete-event simulation models, the timing of each event is predicted precisely for each patient, so the timing of changes in each patient's health status is completely flexible, rather than occurring at fixed intervals. We include all types of patient-level model in our review.
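The difference between the fixed time cycles of a patient-level Markov model and the flexible event timing of a discrete-event simulation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the structure of any model in the review: the two health states, the function names `markov_patient` and `des_patient`, and all probabilities and rates are hypothetical, and exponential times to event are assumed purely for simplicity.

```python
import random

def markov_patient(rng, p_relapse=0.1, p_recover=0.5, cycles=12):
    """Patient-level Markov sketch: the state can change only once per fixed cycle."""
    state, history = "stable", []
    for _ in range(cycles):
        if state == "stable" and rng.random() < p_relapse:
            state = "relapse"
        elif state == "relapse" and rng.random() < p_recover:
            state = "stable"
        history.append(state)  # state occupied at the end of this cycle
    return history

def des_patient(rng, relapse_rate=0.1, recovery_rate=0.7, horizon=12.0):
    """Discrete-event sketch: sample the continuous time to each next event."""
    clock, state, events = 0.0, "stable", []
    while True:
        # Assumed exponential waiting time; the rate depends on the current state.
        rate = relapse_rate if state == "stable" else recovery_rate
        clock += rng.expovariate(rate)
        if clock >= horizon:
            return events  # stop once the next event falls outside the horizon
        state = "relapse" if state == "stable" else "stable"
        events.append((clock, state))

rng = random.Random(2020)  # fixed seed so the sketch is reproducible
print(markov_patient(rng))
print(des_patient(rng))
```

The Markov sketch returns one state per cycle, whereas the discrete-event sketch returns a list of (time, new state) pairs whose times are not tied to any cycle length.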
A detailed extraction form was completed for each unique model to assess the suitability of current patient-level economic models to estimate long-term economic outcomes in severe mental illness, which is reported in supplementary Appendix 3. Clinical and health economic experts within the authorship group tailored the questions in the structured extraction to capture the key economic drivers within the disease course. If a decision model was found to be associated with multiple publications, data were extracted from the study that described the model in greatest detail, supported by other publications and relevant online documentation. Two reviewers each extracted all identified studies, with disagreements resolved by consensus.
The main outcomes analysed were: (a) the objective of the model; (b) the model structure (modelling method, modelled states and links between states); (c) the model's inputs and corresponding assumptions for costs and quality/length of life; and (d) whether internal or external model validation/calibration was undertaken and documented.
A standardised checklist ranking a hierarchy of evidence quality was completed for each model, in which the data source used to inform a certain aspect of the model is awarded a score of 1 (highest quality) to 6 (lowest quality). 47 This provided a structured assessment of the quality of input data for key model parameters. Full ranking criteria for the grading of evidence sources are presented in supplementary Appendix 4.
Two reviewers completed quality checklists for each patient-level model identified. To assess model quality they used the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) checklist, as published in a 2014 Good Practice Task Force Report by ISPOR, the Academy of Managed Care Pharmacy and the National Pharmaceutical Council (ISPOR-AMCP-NPC). 48 This checklist aims to establish a model's credibility and relevance for decision-making, indicating any 'fatal flaws' that could render the model's results inaccurate or incomplete. To assess model validation processes and reporting they used the 2016 Assessment of the Validation Status of Health-Economic Decision Models (AdViSHE) checklist. 49 The AdViSHE checklist supports structured reporting of model validation and aims to increase model transparency.
Findings from the review were synthesised narratively. This systematic review was exempt from ethics approval and consent of participants, as this study was based on previously published work.

Literature search
In total, 2479 papers were identified from the three databases, of which 1481 were non-duplicates (Fig. 1); 39 review papers were identified, with the 3 most recent reviews in each condition covering patient-level models published up to December 2015. 25,28,50 Inclusion criteria for these reviews closely match this study and are detailed in full in supplementary Appendix 5. To update previous systematic reviews, 572 papers published between 1 January 2016 and 26 August 2020 were identified from the 1481 non-duplicate papers. An additional 5 patient-level economic models not covered by previous systematic reviews were identified from the 572 papers. Hence, we identified 15 unique patient-level economic models from a total of 28 studies. [51][52][53][54][55][56][57][58][59][60][61][62][63][64][65] Full detail of records assessed is provided in supplementary Appendix 6.
Three models (20%) extended their modelling of medication-related side-effects, 53,55,56 explicitly simulating a pathway from short-term transient side-effects into long-term comorbidities: diabetes 53,55,56 and cardiovascular disease. 53,55 The risk of developing long-term comorbidities was conditional on side-effects such as weight gain, via mediating pathways such as hyperlipidaemia and impaired glucose tolerance. In contrast, one model accounted for the long-term impact of comorbidities implicitly, 60 with its authors applying a utility adjustment to the whole patient cohort, taking into account the incidence rate of medication-related comorbidities and their average health effect across all patients.
The medication-related side-effects incorporated by each model, and further detail of the precise modelling approach for long-term comorbidities, are presented in supplementary Appendix 8.
In eight models (53%), including five models that varied relapse risk on the basis of patient characteristics, the risk of subsequent relapse was conditional on the number of previous relapses modelled. [52][53][54][55]59,62,63,65 The modelling of subsequent relapse risk varied in complexity. The simplest approach applied a single hazard ratio adjustment if a patient had any previous relapse 59 and the most complex approach modelled future relapse risk as a continuous function driven by the duration of and time between previous relapses. 52 In the remaining seven models (47%), the relapse risk was independent of the number of previous relapses modelled. 51,[56][57][58]60,61,64

Hierarchy of evidence informing the models
The hierarchy of evidence used in the models is summarised in Fig. 2, ranging from high-quality evidence (ranked 1) to the lowest rank of evidence (rank 6). Full ranking criteria for evidence sources across all categories are presented in supplementary Appendix 4.
Quality of evidence informing the models was mixed, and few models used high-quality evidence across all model elements. Evidence was particularly poor for treatment effect extrapolation, with seven models (47%) relying on expert opinion to inform treatment effect extrapolation beyond observed data to model efficacy over the whole simulated time. [52][53][54]56,58,60,64 Often, there was no report of external expert consultation, so the opinion on the durability of the treatment effect over the time horizon was tacitly assumed by the model authors. 54,58,60,64 Two models (13%) did not attempt to extrapolate beyond their observed evidence sources 51,57 and two models (13%) evaluated hypothetical treatments, 62,63 meaning that this evidence category was not applicable to these four studies. Side-effect data were also often poorly evidenced: studies frequently cited several sources of input data without explaining how each source was used or how sources were combined to inform model parameters.

Model quality and validity
Models generally performed reasonably against ISPOR 2014 checklist criteria, which provide a broad overview of model quality and relevance (Fig. 3). Model design and analysis were generally adequate, with model credibility weakest in terms of data and reporting. Although the ISPOR checklist provides a high-level perspective over many diverse model attributes, only one checklist item directly scrutinises the model structure. Individual checklist items encompass multiple diverse topic areas. For example, short model time horizons are only assessed within a broad checklist item assessing the applicability of the model 'context'. In terms of model credibility, the models performed poorly against ISPOR validation criteria, as corroborated by poor performance on the AdViSHE validation checklist (Fig. 3).
Fig. 2 Hierarchy of evidence quality, scored from 1 (highest) to 6 (lowest). NA, not applicable.
Altunkaya et al
Of the 15 models, 13 (87%) performed at least 1 of the 12 validation checks listed on the AdViSHE checklist. [51][52][53][54][55][56]58,59,[61][62][63][64][65] However, model validation was generally restricted to face validity checks of model structure and the suitability of model inputs, and checks of model structure and inputs against published literature. Importantly, data and output validation was reported by only a handful of models. Two models (13%) reported validation of regression model fit, 59,64 one model (7%) reported testing with alternative input data 53 and three models (20%) reported validation checks against empirical data. 51,56,58 Validation of the model as implemented in software (i.e. the computerised model) was particularly infrequent. Model authors rarely sought external expert model appraisal and rarely reported that basic model checks had been undertaken, such as extreme value testing or patient tracking through the computerised model. Jin et al (2020) 56 was the only model to document most of the validation procedures on the AdViSHE checklist. Full data for both quality checklists are presented in supplementary Appendix 9.
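As an illustration of what a basic check such as extreme value testing can look like in a computerised model, the sketch below runs a toy cost model at the edges of its input range and compares the results with values that are known by construction. The model `expected_total_cost`, its default costs and the helper `extreme_value_checks` are all hypothetical and are not taken from the AdViSHE checklist or from any reviewed model.

```python
def expected_total_cost(p_relapse, cost_stable=100.0, cost_relapse=1000.0, cycles=12):
    """Toy model: in each cycle the patient is in relapse with probability p_relapse."""
    if not 0.0 <= p_relapse <= 1.0:
        raise ValueError("p_relapse must be a probability between 0 and 1")
    per_cycle = (1 - p_relapse) * cost_stable + p_relapse * cost_relapse
    return cycles * per_cycle

def extreme_value_checks(model):
    """Run the model at extreme inputs; expected values assume the toy defaults above."""
    return {
        "no relapse gives stable-only cost": model(0.0) == 12 * 100.0,
        "certain relapse gives relapse-only cost": model(1.0) == 12 * 1000.0,
        "cost rises with relapse risk": model(0.2) < model(0.8),
    }

for name, passed in extreme_value_checks(expected_total_cost).items():
    print(f"{'PASS' if passed else 'FAIL'}: {name}")
```

A check of this kind does not show that a model is accurate, only that the software behaves as its structure implies at inputs where the correct answer is obvious.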

Discussion
Our review identified 15 unique patient-level economic models that simulated the natural history of schizophrenia, bipolar disorder and major depressive disorder. Our broad definition of severe mental illness allowed us to capture additional models that included severe depression leading to hospital admission. Antipsychotic medications can be used to treat all the conditions considered, 66,67 with common long-term comorbidities arising from specific medication-related side-effects. This is the first systematic review of economic models comparing patient-level model structures across schizophrenia, bipolar disorder and major depressive disorder, thus examining common economic modelling considerations across these patient groups.
We found considerable limitations in the quality and validity of current models. Outdated input data, lack of structural complexity and limited incorporation of patient heterogeneity are major concerns limiting models' applicability to contemporary populations with severe mental illness. Only five models adopted the lifetime time horizon required by health technology reimbursement agencies. 21 Current models therefore have limited potential to reliably extrapolate the results of short-term studies and thus inadequately inform decision makers' assessment of the long-term value of existing and future interventions for severe mental illness.

Data quality
The data used to inform the models were generally of poor quality or were published more than 10 years ago. In psychosis, NICE treatment guidelines in England were last published in 2014 31 and more recent changes to the service pathway, including new waiting-time targets, have also substantially altered baseline care. [68][69][70] Outdated model data are a significant concern, as projections derived from models are unlikely to be relevant to current decision-making owing to shifts in best practice care over time. In several models, related model parameters such as baseline relapse/remission rates and treatment effectiveness were obtained from different patient populations, with little or no adjustment to account for varying patient characteristics between evidence sources. Several models were informed by population data from regions and countries outside the setting of the model's decision problem, raising concerns about data transferability. Furthermore, despite the majority of models estimating quality-adjusted life-years (QALYs) or disability-adjusted life-years (DALYs), the data informing quality-of-life weights were generally of medium/poor quality.
Decision models should be based on high-quality and contemporary evidence to ensure that estimates of the scale and severity of disease burden and economic benefit of new interventions are sufficiently reliable. If good economic evidence is not available to support further investment in severe mental illness, healthcare decision makers will choose to prioritise the allocation of scarce healthcare resources in other disease areas where there is better evidence to support potential economic benefit from any additional investment. The poor-quality and outdated data used in many of the identified decision models suggest that new large high-quality studies in severe mental illness are needed to construct economic models. New evidence is needed in almost all model aspects, both clinical and economic, with particular need to obtain contemporary quality-of-life data in severe mental illness using modern methods.

Model complexity
Short-term side-effects and long-term comorbidities associated with treatment
There was considerable variation in model complexity. Although most models allowed for treatment switching, there was substantial divergence in the simulation of treatment-related side-effects. Side-effects are a primary driver of antipsychotic treatment switching. 71 In clinical practice, a systematic approach to medication selection may be complemented by an element of 'trial and error' of several medications, in response to variation in the occurrence and acceptability of specific side-effects in different patients. 72,73 Models therefore should incorporate all relevant side-effects resulting from multiple possible treatment choices. This is particularly important early in the disease course, when a patient's individual presentation is emerging and treatment switches are more frequent. 74 Given evidence of side-effects that are common to both first-generation 'conventional' and second-generation 'atypical' antipsychotic drug classes, 75 it is advisable for models to consider a wide range of potential side-effects.
Most models failed to incorporate the link between short-term side-effects and long-term medication-related comorbidities. Diabetes and cardiovascular disease are common comorbidities in populations with severe mental illness, and both have distinct well-studied effects on length and quality of life, and related healthcare costs. 76,77 Omitting comorbidities arising from medication side-effects may bias a comparative assessment of the long-term value of different treatments for severe mental illness. Bias is particularly likely when an intervention and comparator induce side-effects of differing magnitude. By modelling distinct pathways for patients who develop comorbidities, the health effects of medication-related comorbidities can be distinguished from health effects and costs arising directly from a mental health condition. This means that prevention and treatment strategies for comorbidities can be easily scrutinised and updated with changes to current practice, without affecting the calculation of health and cost outcomes in patients without medication-related comorbidities. However, only two models 55,56 identified in our review considered the impact of these long-term comorbidities over a patient's lifetime.

Patient heterogeneity
Few models incorporated all relevant aspects of patient heterogeneity and comprehensively estimated the impact of patient heterogeneity on health outcomes and costs. Only three models estimated individualised severity and frequency of relapse, based on each patient's baseline characteristics and modelled relapse history. 52,59,65 Failure to model interactions between patient characteristics and evolving disease history will produce inaccurate estimates of population-level health outcomes and costs. 44 For example, in psychosis a significant proportion of health service costs are driven by a subset of patients who have both a significant history of relapse and high degree of dependency on statutory social services. 78 By incorporating all relevant aspects of patient heterogeneity, including patient heterogeneity in excess mortality risk, models can more accurately estimate population-level outcomes. Quantification of expected outcomes for individual patients is also necessary to establish the overall value of stratified patient care.
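One simple way of making relapse risk depend on a patient's modelled history, in the spirit of the single hazard ratio adjustment described in the results, can be sketched as follows. This is an assumption-laden illustration and not the method of any reviewed model: the baseline hazard, the hazard ratio of 1.5, the function names and the conversion from a constant hazard to a per-cycle probability are all hypothetical choices.

```python
import math
import random

def relapse_probability(baseline_hazard, prior_relapses, hazard_ratio=1.5, cycle_length=1.0):
    """Per-cycle relapse probability, raised by a hazard ratio after any previous relapse."""
    hazard = baseline_hazard * (hazard_ratio if prior_relapses > 0 else 1.0)
    # Assuming a constant hazard within the cycle: p = 1 - exp(-hazard * time).
    return 1.0 - math.exp(-hazard * cycle_length)

def simulate_relapse_count(rng, baseline_hazard=0.2, cycles=10):
    """Track one patient; each relapse updates the history used for the next cycle."""
    relapses = 0
    for _ in range(cycles):
        if rng.random() < relapse_probability(baseline_hazard, relapses):
            relapses += 1
    return relapses

rng = random.Random(7)  # fixed seed so the sketch is reproducible
counts = [simulate_relapse_count(rng) for _ in range(1000)]
print("mean relapses per simulated patient:", sum(counts) / len(counts))
```

Because each simulated patient carries their own relapse count forward, patients who relapse early face a higher risk in later cycles, which is the kind of interaction between characteristics and disease history that an average-based calculation would smooth over.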

Model validity
Beyond structural inadequacies, most models did not report rigorous validation checks to establish confidence in their results. Through the appraisal checklists, we found that authors were most likely to report cross-comparability or face validity of their model structure or results, based either on existing literature or on consultation with a clinical expert. However, few models documented more extensive validation efforts. Few models compared model predictions against empirical data, tested the robustness of model outputs using alternative input data or subjected the software underlying the model to thorough checking procedures or an external expert review. Only one model thoroughly documented most validation procedures from the 2016 AdViSHE checklist. 56 Nevertheless, decision makers need validated economic models to provide a foundation for credible long-term policy-making and treatment reimbursement decisions. Researchers also need evidence that economic models are sufficiently valid to extrapolate data from clinical trials. Validation efforts provide insight into model accuracy and establish the credibility of evidence generated to inform healthcare investment decisions. 79 Economic models generate predictions regarding the stream of future costs and benefits from changes to healthcare policy. Without internal and external validation, decision makers have limited knowledge of whether an economic model produces sufficiently accurate predictions for their own setting. In this regard, additional external validation of economic models is essential to enable decision makers to be confident in models' ability to discriminate between cost-effective and cost-ineffective health policies. Without validated models, healthcare decision makers may judge the uncertainty involved with a new investment in severe mental illness to be too great. 
The lack of high-quality and validated models may result in patients being denied good-value care because, depending on the model used, a range of long-term cost-effectiveness estimates could be obtained, indicating different policy decisions. 80

Limitations
This review has some limitations. First, we included only studies published in English, and therefore models developed for non-Anglophone decision contexts, where decision makers may have different evaluation requirements, are probably underrepresented. Second, we did not include cohort-level models that could be used to extrapolate short-term studies. However, given that cohort models cannot fully capture the effect of patient heterogeneity in severe mental illness, these models were of limited interest. Third, our assessment of model quality is not context specific: we did not address model suitability for each particular decision problem being addressed, instead providing an overview of each model's performance against a general benchmark standard. Finally, our assessment of model quality is contingent on the level of detail provided in publications about the model, with publication detail potentially constrained by word-count limits.

Implications for future research and policy makers
The deficiencies of current models documented in this review across multiple dimensions can be used to inform the design of future models. In bipolar disorder and major depression, no structurally complex lifetime models with well-documented validation were identified. In schizophrenia, although Jin et al's model 56 was shown to be of a higher quality and more structurally complex than its peers with well-documented internal validity, additional research is needed to demonstrate its external validity and accuracy to extrapolate clinical trial data for informing healthcare decision-making.
Poor-quality economic models hinder policy makers' ability to allocate healthcare budgets appropriately. In turn, this reduces the ability of clinicians to improve performance against NHS mental health targets while simultaneously providing cost-effective care. There is a clear need for further development of contemporary and comprehensive patient-level decision models that capture the full structural complexity of severe mental illness, in particular its relation to long-term comorbidities. High-quality contemporary evidence is needed on health-related quality of life and costs (collected in line with current best practice), as well as on long-term disease progression to inform the development of robust economic models. Following extensive internal and external validation exercises, new economic models could significantly reduce the time needed to make health policy decisions by reliably extrapolating short-term clinical trials to inform the cost-effectiveness of interventions for severe mental illness. Public research resources in severe mental illness should be coordinated to prioritise these objectives.