Increased mortality and reduced life expectancy are well documented among mental healthcare recipients. Whereas clinical research typically focuses on people with specific diagnoses, little is known about those who receive mental healthcare but have an unspecified or no diagnosis.
Aims
Using routinely collected mortality data, we aimed to explore how mortality and life expectancy differed between those with and without a specific mental health diagnosis.
Method
Using the South London and Maudsley NHS Foundation Trust clinical records interactive search system, we assembled annual cohorts of people who had past or current mental health service receipt between 2015 and 2024. Mortality rates and life expectancy were ascertained for those with mental health diagnoses (ICD-10 F-codes), those with unspecified diagnoses (Z-codes) and those without any diagnosis. Age- and gender-standardised mortality ratios (SMRs) and life expectancy were calculated in relation to the local catchment comparator population.
Results
Of the combined cohorts (n = 3 266 268) of people accessing mental health services, 57.7% had an F-code diagnosis, 13.0% a Z-code diagnosis and 29.3% no diagnosis. Annual SMRs (95% CI) for F-code diagnoses ranged from 2.25 (2.18–2.33) to 2.56 (2.46–2.65); for Z-code diagnoses from 1.88 (1.73–2.02) to 2.18 (2.00–2.36); and for no diagnosis from 1.59 (1.48–1.71) to 1.87 (1.72–2.01). Years of life lost were greatest for those with F-code diagnoses (females, 15.1 years; males, 16.7 years), followed by Z-codes (females, 11.8 years; males, 14.4 years) and no diagnosis (females, 9.4 years; males, 10.6 years). Raised SMRs were observed for both external- and natural-cause mortality for all groups.
Conclusions
People in contact with mental health services with unspecified or no mental health diagnosis have a substantially higher mortality and lower life expectancy compared with the general population. Further research is needed to characterise this group and study other outcomes, because they may fall outside care pathways.
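As a rough illustration of the indirect standardisation behind an SMR, the sketch below sums expected deaths over age and gender strata and attaches an approximate 95% CI using Byar's approximation. All counts and rates are invented, the function names are hypothetical, and the abstract does not state which interval method the authors used.

```python
import math

def expected_deaths(person_years, reference_rates):
    """Sum stratum-specific person-years multiplied by reference mortality rates."""
    return sum(person_years[s] * reference_rates[s] for s in person_years)

def smr_with_ci(observed, expected, z=1.96):
    """Return (SMR, lower, upper) using Byar's approximation to the Poisson interval."""
    smr = observed / expected
    lower = observed * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return smr, lower, upper

# Hypothetical strata: (age band, gender) -> person-years observed in the cohort
person_years = {("18-44", "F"): 4000.0, ("18-44", "M"): 3500.0,
                ("45-64", "F"): 2500.0, ("45-64", "M"): 2000.0}
# Hypothetical deaths per person-year in the comparator population
reference_rates = {("18-44", "F"): 0.0005, ("18-44", "M"): 0.0010,
                   ("45-64", "F"): 0.0030, ("45-64", "M"): 0.0050}

expected = expected_deaths(person_years, reference_rates)
smr, lower, upper = smr_with_ci(observed=50, expected=expected)
print(f"expected={expected:.1f}, SMR={smr:.2f} ({lower:.2f}-{upper:.2f})")
```

An SMR above 1 with a lower bound above 1 indicates more deaths than would be expected if the cohort experienced the comparator population's rates.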
Females are less likely than males to be diagnosed with attention-deficit hyperactivity disorder (ADHD). When diagnosed, females are older than males.
Aims
In this study, we examined the childhood antecedents of later ADHD diagnosis and its impact on adolescent/emerging adult outcomes, with a focus on females.
Method
In this cohort study, we used data from a Welsh nation-wide electronic cohort of 13 593 individuals (n = 2680 (19.7%) females) diagnosed with ADHD and 578 793 individuals (n = 286 734 (49.5%) females) without ADHD. We compared females with later diagnoses (ages 12–25) to those with earlier, timely diagnoses (ages 5–11) and no diagnosis, in terms of childhood (ages 5–11) antecedents and adolescent/adult (ages 12–25) outcomes. We also tested for sex differences.
Results
Although females with earlier ADHD diagnosis showed more health and educational difficulties in childhood than those with later diagnosed ADHD (odds ratios ranged from 0.18 to 0.92), there was clear evidence of these difficulties in females with later diagnosed ADHD, compared with females without ADHD (odds ratios: 1.07–9.02). In adolescence/early adulthood, females with later diagnosed ADHD used more healthcare services and had worse mental health, educational and socioeconomic outcomes than females diagnosed earlier (odds ratios: 1.39–4.96) and those without ADHD (odds ratios: 1.54–23.98). Many of these outcomes were exacerbated in females compared with males.
Conclusions
The results demonstrate that later ADHD diagnosis is associated with significant negative outcomes by adolescence and disproportionately disadvantages females. Despite later diagnosis, there was clear evidence of childhood mental health and educational difficulties when compared with females without ADHD. Therefore, timely childhood ADHD diagnosis may help to mitigate later risks, especially for females.
The COVID-19 pandemic and associated non-pharmaceutical interventions (NPIs) reduced transmission of other infections. We quantified changes in hospital admission rates for respiratory and gastrointestinal infections among young children in England during and after implementation of NPIs, compared to pre-pandemic, and variations by sociodemographic and clinical characteristics. Children aged <5 years at any time between 1 January 2017 and 31 January 2022 were followed from birth or 1 January 2017, until their 5th birthday, death, or 31 January 2022, within a birth cohort based on Hospital Episode Statistics data. Quarterly emergency admission rates for respiratory and gastrointestinal infections from April–June 2020 onwards were compared to corresponding quarters in 2017–2019 using Poisson regression, with and without interaction terms for time period and sociodemographic/clinical characteristics. Admission rates for respiratory and gastrointestinal infections were lower in April–June 2020 compared to this quarter pre-pandemic (incidence rate ratio (99% CI) 0.17 (0.17–0.18) for respiratory; 0.29 (0.28–0.31) for gastrointestinal). Rates remained below pre-pandemic levels until April–June 2021 (respiratory infections) and July–September 2021 (gastrointestinal infections), subsequently increasing above the corresponding pre-pandemic quarters. Changes in rates did not differ by sociodemographic/clinical characteristics. These results can inform planning for future pandemics and their aftermath.
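To make the incidence rate ratio concrete, the following sketch computes a crude ratio of two admission rates with a 99% Wald interval on the log scale. It is a simplification: the study fitted Poisson regression models, and the counts and person-time below are invented purely for illustration.

```python
import math
from statistics import NormalDist

def incidence_rate_ratio(events_a, time_a, events_b, time_b, level=0.99):
    """Crude rate ratio (period A vs period B) with a Wald interval on the log scale."""
    irr = (events_a / time_a) / (events_b / time_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return irr, irr * math.exp(-z * se_log), irr * math.exp(z * se_log)

# Hypothetical admissions and child-quarters at risk (not taken from the study)
irr, lo, hi = incidence_rate_ratio(events_a=1700, time_a=1_000_000,
                                   events_b=10_000, time_b=1_000_000)
print(f"IRR={irr:.2f} (99% CI {lo:.2f}-{hi:.2f})")
```

A ratio below 1 means the rate in the first period was lower than in the reference period.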
Electroconvulsive therapy (ECT) is an effective treatment of severe manifestations of mental illness. Since delay in initiation of ECT can have detrimental effects, prediction of the need for ECT could improve outcomes via more timely treatment initiation. Therefore, this study aimed to predict the need for ECT following admission to a psychiatric hospital.
Methods:
This study was based on electronic health record (EHR) data from routine clinical practice. Adult patients admitted to a hospital within the Psychiatric Services of the Central Denmark Region between January 2013 and November 2021 were included in the study. The outcome was initiation of ECT >7 days (to exclude patients admitted for planned ECT) and ≤67 days after admission. The data were randomly split into an 85% training set and a 15% test set. Machine learning models (extreme gradient boosting (XGBoost)) were trained to predict, on the 7th day of the inpatient stay, subsequent initiation of ECT, and were then evaluated on the test set.
Results:
The cohort consisted of 41,610 patients with 164,961 admissions. In the held-out test set, the trained model predicted ECT initiation with an area under the receiver operating characteristic curve of 0.94, 47% sensitivity, 98% specificity, positive predictive value (PPV) of 24% and negative predictive value (NPV) of 99%. The top predictors were the highest suicide assessment score and mean Brøset violence checklist score in the preceding three months.
Conclusions:
EHR data from routine clinical practice may be used to predict need for ECT. This may lead to more timely treatment initiation.
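A modest PPV alongside high specificity is typical when the outcome is rare. The sketch below shows how PPV and NPV follow from sensitivity, specificity and outcome prevalence via Bayes' rule. The sensitivity and specificity are the reported values; the prevalence values are assumptions chosen for illustration, since the abstract does not report prevalence in the test set.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# Reported sensitivity and specificity; prevalence values are hypothetical
for prevalence in (0.005, 0.013, 0.05):
    ppv, npv = predictive_values(0.47, 0.98, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.2f}  NPV={npv:.3f}")
```

Under these assumptions, a prevalence of roughly 1% gives a PPV close to the reported 24%, and PPV rises quickly as the outcome becomes more common.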
The impact of guideline-directed medical therapy (GDMT) has not fully translated to decreases in the disproportionate rates of hospitalization and lengths of stay in African Americans with congestive heart failure (CHF). GDMT is optimized by registered nurses (RNs) and their use of clinical information. Yet, there are no instruments for measuring the influence of clinical information use and nursing care. The study assessed an instrument’s ability to measure the influence of RN performance of social, technical, and socio-technical care tasks on length of stay in the CHF hospitalizations of African Americans.
Methods:
A sample of 200 RNs, who cared for 5060 African Americans with 14,123 heart failure hospitalizations, were surveyed. Descriptive statistics, Cronbach’s alpha, and a generalized linear regression assessed the instrument’s reliability and predictive validity.
Results:
The Cronbach’s alpha was 0.95 (95% CI: 0.94–0.96). The corrected item-total correlations for the 22 items ranged from 0.44 to 0.80. For an increase of one to four points per item in an RN’s performance, the estimated reductions in the patient’s length of stay were 3.34% (6.11, 0.5), 6.58% (11.84, 1), 9.70% (17.22, 1.49), and 12.72% (22.28, 1.99), respectively (P = 0.004).
Conclusions:
Increases in an RN’s performance of social, technical, and socio-technical care tasks were significantly associated with clinically meaningful decreases in their patients’ length of stay. The instrument has strong potential for addressing the disproportionate impact of CHF by measuring and tailoring interventions to optimize nursing care and the use of clinical information in the provision and receipt of GDMT.
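The four reported reductions in length of stay are consistent with a multiplicative (log-link) model, in which each additional point scales length of stay by the same factor. The sketch below is a check under that assumption, which the abstract does not state explicitly: it takes the reported one-point reduction of 3.34% and compounds it over two to four points.

```python
per_point_reduction = 0.0334          # reported reduction for a one-point increase
factor = 1 - per_point_reduction      # assumed multiplicative effect per point

def percent_reduction(points):
    """Percent reduction in length of stay after `points` one-point increases."""
    return 100 * (1 - factor ** points)

reported = {1: 3.34, 2: 6.58, 3: 9.70, 4: 12.72}  # values from the abstract
for points, published in reported.items():
    print(f"{points} point(s): computed {percent_reduction(points):.2f}%, reported {published:.2f}%")
```

The computed values agree with the published ones to within a few hundredths of a percentage point; the small gaps would be expected from rounding of the one-point estimate.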
This study sought to obtain the views of doctors associated with the Royal College of Psychiatrists on the use of outcome measures in mental health services. An online survey was developed by the College’s working group on outcome measures and widely disseminated to psychiatrists through College channels.
Results
In total, 339 completed responses were received. Respondents were mostly consultant psychiatrists; based in England; and working in the National Health Service with working-age adults. Almost half said they used outcome measures routinely, with almost half finding outcome measures clinically useful. Lack of time and inadequate information technology systems were identified as the top barriers to using outcome measures.
Clinical implications
Based on our results, psychiatrists are generally keen to use outcome measures, but are often prevented from doing so effectively by pressures on services and lack of appropriate support. The Royal College of Psychiatrists and other relevant organisations could enhance the use of outcome measures in mental health services through improved guidance, providing additional resources and integration of measures into electronic patient records.
This study aimed to develop and evaluate a predictive model using electronic health record (EHR) data from a large south London mental health service, in order to identify, 3 months after first referral, patients who are at risk of high-intensity service use over the subsequent 12 months. Early identification of such patients may support proactive and personalised care planning, reducing the need for high-cost episodes of care. Predictive models were developed using information from 18 869 patients newly referred between 2007 and 2011. High-intensity use was defined as the top 10% of estimated mental healthcare expenditure. The model was developed using demographic, clinical and service use variables, and was validated on data from the periods 2012–2017 and 2018–2023.
Results
A logistic regression model achieved an area under the receiver operating characteristic curve (AUROC) of 0.79 in development (sensitivity 0.82, specificity 0.54), with robust performance in the two validation sets (AUROC 0.81 and 0.83, respectively). Key predictors included service use in the first 3 months, schizophrenia or eating disorder diagnoses and living alone. Natural language processing-derived features did not improve performance.
Clinical implications
Routine EHR data performed well in predicting the risk of high-cost care, potentially enabling targeted interventions and more efficient resource allocation.
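The outcome definition (top 10% of estimated expenditure) can be sketched as a percentile threshold. The example below is a minimal illustration using the standard library; the cost values are invented, the function name is hypothetical, and the authors' exact costing and thresholding procedure is not described in the abstract.

```python
from statistics import quantiles

def high_intensity_flags(costs):
    """Flag patients whose cost exceeds the cohort's 90th percentile."""
    cutoff = quantiles(costs, n=10)[-1]  # last decile cut point = 90th percentile
    return cutoff, [c > cutoff for c in costs]

costs = [float(c) for c in range(1, 101)]  # hypothetical costs for 100 patients
cutoff, flags = high_intensity_flags(costs)
print(f"cutoff={cutoff}, flagged={sum(flags)} of {len(costs)}")
```

The resulting boolean flags would serve as the binary target for a classifier such as logistic regression.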
Identifying diagnoses from noncoded healthcare visit records presents logistical challenges when large numbers of records are screened. This study aimed to develop a screening process to identify otitis media (OM) diagnoses in free-text primary care visit records.
Methods:
The free-text primary care records of 200 children aged 0 to 4 years were reviewed independently by three clinicians to determine whether OM was a diagnosis considered during each visit. Terms (abbreviations, words, and phrases) identifying visits where OM was considered or excluded were documented. These terms were used to design a software algorithm subsequently used to detect OM diagnosis within these primary care records. The diagnostic performance of the software algorithm was determined against the gold standard clinicians’ review and described using sensitivity, specificity, predictive values (PVs), and likelihood ratios (LRs) with 95% confidence intervals (CIs).
Results:
The 200 children had 10,034 primary care visits. Clinician review identified 917 (9%) visits where OM was considered, and 9117 (91%) visits where OM was excluded. The software algorithm identified 801/917 visits where OM was considered and 8705/9117 visits where OM was excluded. The algorithm sensitivity was 87% (95% CI 85–89), specificity 96% (95% CI 95–96), positive PV 66% (95% CI 63–69), negative PV 99% (95% CI 98–99), positive LR 19.33 (95% CI 17.54–21.31), and negative LR 0.13 (95% CI 0.11–0.16).
Conclusion:
Software algorithms can assist in screening healthcare visit records. When combined with clinician review, they enable accurate identification of OM visits from non-coded records.
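The reported accuracy measures can be recomputed from the visit counts given in the Results. The sketch below derives the 2×2 table from those counts and calculates each measure; the function name is hypothetical, and recomputed figures may differ slightly from the published ones through rounding.

```python
def screening_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Counts from the abstract: 801/917 OM visits detected, 8705/9117 non-OM visits excluded
tp, fn = 801, 917 - 801
tn, fp = 8705, 9117 - 8705
metrics = screening_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```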
We describe the steps taken to assess and improve the research readiness of data within PCORnet®, specifically focusing on the results of the PCORnet data curation process between Cycle 7 (October 2019) and Cycle 16 (October 2024).
Material and methods:
We describe the process for extending the PCORnet® CDM and for creating data checks.
Results:
We highlight growth in the number of records available across PCORnet between data curation Cycles 7 and 16 (e.g., diagnoses increasing from ∼3.7B to ∼6.9B and laboratory results from ∼7.7B to ∼15.1B among legacy DataMarts), present the current list of data checks and describe performance of the network. We highlight examples of data checks with relatively stable performance (e.g., future dates), those where performance has improved (e.g., RxNorm mapping), and others where performance is more variable (e.g., persistence of records).
Conclusion:
Studies are a crucial source of information on the design of new data checks. The attention of PCORnet partners is focused primarily on those metrics that are generally modifiable. A transparent data curation process is an essential component of PCORnet, allowing network partners to learn from one another, while also informing the decisions of study investigators on which sites to include in their projects. The quality issues that exist within PCORnet stem from the way that data are captured within healthcare generally. We have been able to make great strides on improving data quality and research readiness. Many of the techniques piloted within PCORnet will be broadly applicable to other efforts.
Large language models (LLMs) like OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude can be useful tools in psychiatric practice, helping with tasks such as searching for information, managing administrative work and supporting education. This article demystifies how these tools work by explaining their core operational principles and noting their key limitations, including the risks of confabulation (fabricating information), sycophancy and knowledge cut-offs. It provides practical guidance on mitigating these risks through structured ‘prompt engineering’ and offers a safety framework for integrating LLMs into low-risk administrative and educational workflows. The article stresses the importance of approaching these technologies with caution by independently verifying information, adhering to UK data protection laws and upholding the principles of best practice in patient care. The goal is to help clinicians use these powerful but fallible technologies wisely, ensuring that patient safety and professional responsibility remain paramount as they explore these new tools.
The traditional case register involved assembling records of people with a given condition in order to support cohort studies to describe and investigate the course of their condition and other outcomes. This old design has been resurrected and revolutionised following the widespread implementation of fully electronic healthcare records over the past few decades, providing ‘big data’ resources that are both large and very detailed. These, in turn, are being further enhanced through linkages with complementary administrative data (both health and non-health) and through natural language processing generating structured meta-data from source text fields. This chapter provides an overview of this rapidly developing research infrastructure, considering and advising on some of the challenges faced by researchers planning studies using clinical data and by those considering future resource development.
Lesbian, gay, bisexual, transgender, queer and related community (LGBTQ+) individuals have significantly increased risk for mental health problems. However, research on inequalities in LGBTQ+ mental healthcare is limited because LGBTQ+ status is usually only contained in unstructured, free-text sections of electronic health records.
Aims
This study investigated whether natural language processing (NLP), specifically the large language model Bidirectional Encoder Representations from Transformers (BERT), can identify LGBTQ+ status from this unstructured text in mental health records.
Method
Using electronic health records from a large mental healthcare provider in south London, UK, relevant search terms were identified and a random sample of 10 000 strings extracted. Each string contained 100 characters either side of a search term. A BERT model was trained to classify LGBTQ+ status.
Results
Among 10 000 annotations, 14% (1449) confirmed LGBTQ+ status while 86% (8551) did not. These other categories included LGBTQ+ negative status, irrelevant annotations and unclear cases. The final BERT model, tested on 2000 annotations, achieved a precision of 0.95 (95% CI 0.93–0.98), a recall of 0.93 (95% CI 0.91–0.96) and an F1 score of 0.94 (95% CI 0.92–0.97).
Conclusion
LGBTQ+ status can be determined using this NLP application with a high success rate. The NLP application produced through this work has opened up mental health records to a variety of research questions involving LGBTQ+ status, and should be explored further. Additional work should aim to extend what has been done here by developing an application that can distinguish between different LGBTQ+ groups to examine inequalities between these groups.
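The string extraction described in the Method (100 characters either side of a search term) can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the search terms and sample text are hypothetical, and the abstract does not describe how matches near the start or end of a note were handled (here the window is simply truncated).

```python
import re

def extract_windows(text, terms, width=100):
    """Return each term match with up to `width` characters of context on each side."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    windows = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        windows.append(text[start:end])
    return windows

# Hypothetical search terms and note text, for illustration only
terms = ["lesbian", "gay", "bisexual"]
note = "x" * 150 + "lesbian" + "y" * 150
windows = extract_windows(note, terms)
print(len(windows), len(windows[0]))
```

Each extracted window would then be annotated and passed to the classifier as one training or test example.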
Despite growing healthcare coverage, disparities in access to and outcomes of psychiatric care persist, even in countries with universal healthcare. How socioeconomic status (SES), travel time, and social support individually and jointly affect psychiatric clinical trajectories remains largely unexplored.
Methods
We analyze electronic health records (EHRs) from patients diagnosed with bipolar disorder, major depressive disorder, or schizophrenia at Clínica San Juan de Dios Manizales. Using zero-inflated and standard negative binomial regression, we quantify the effects of SES, travel time, and family/social support on utilization, clinical outcomes, and symptoms of mania, psychosis, and suicidality. A mixed-effects model examines how care-seeking patterns affect visit-to-visit variability in outcomes.
Results
Among 21,095 patients, utilization is lower for those with low SES (rate ratio [RR] 0.92, 95% CI: 0.90–0.95, p = 1.27e−10) and longer travel times (RR 0.94, 95% CI: 0.93–0.95, p = 1.19e−53). Patients with low SES are more likely to have severe symptoms (e.g., delusions: RR 1.28, 95% CI: 1.20–1.37, p = 2.57e−15) and require hospitalization (RR 1.10, 95% CI: 1.05–1.15, p = 1.94e−04), suggesting they primarily seek care when critical. Longer travel differentially affects those with low SES. However, the relationship between SES and adverse outcomes is less pronounced when living with family (e.g., hospitalizations: LRT, χ2 = 47.08, df = 3, p = 3.35e−10). Frequent outpatient care is associated with lower odds of hospitalization, suicidality, and other symptoms.
Conclusions
Findings demonstrate use of EHRs to model patient outcomes, the important role of social support, and need for improved healthcare accessibility.
Insight assessment in psychosis remains challenging in practice-oriented research.
Aims
To develop and validate a proxy measure for insight based on information from electronic health records (EHR). For that purpose, we used data on the Scale to Assess Unawareness of Mental Disorder (SUMD) and data from EHR notes of patients in an early psychosis intervention programme (Programa de Atención a Fases Iniciales de Psicosis, Santander, Spain).
Method
Junior and senior clinicians examined 134 clinical notes from 106 patients to explore criterion and content validity between SUMD and a clinician-rated proxy measure, using three SUMD items.
Results
In terms of criterion validity, SUMD scores correlated with the proxy (r = 0.61, P < 0.001), even after adjusting for the following confounders: type of psychotic disorder, clinical remission status and rater experience (r = 0.58, P < 0.001); and the proxy predicted good insight status (odds ratio 20.95, 95% CI 7.32–59.91, P < 0.001). Regarding content validity, the three main SUMD subscores correlated with the proxy (r = 0.55–0.60, P < 0.005). There were no significant differences in age, gender or other clinical variables, i.e. discriminant validity, and the proxy significantly correlated with validated psychometric instruments, i.e. external validity. Intraclass correlation coefficient (i.e. interrater reliability) was 0.88 (95% CI 0.59–1.00, P < 0.05).
Conclusions
This SUMD-based proxy measure was shown to have good to excellent validity and reliability, which may offer a reliable and efficient alternative for assessing insight in real-world clinical practice, EHR-based research and management. Future studies should explore its applicability across different healthcare contexts and its potential for automation, using natural language-processing techniques.
The implementation of electronic health records (EHRs) in mental health contexts has been slow. Reasons for this include concerns from healthcare professionals regarding the collection of sensitive information and the stigma associated with mental health services. Despite the low uptake of EHRs, the benefits include patients feeling empowered and in control of their own treatment. However, ethnically diverse groups often access mental health services through crisis pathways and have been found to disengage with EHRs. The aim of this review was to explore ethnically diverse groups’ perceptions of the utility of mental health EHRs and establish perceived barriers and facilitators to access. MEDLINE, CINAHL, EMBASE, Scopus, PsycINFO, PubMed and Web of Science were searched. Included papers mentioned ethnically diverse groups from the 37 listed countries in the Organisation for Economic Co-operation and Development, and included service users, clients or patients accessing EHRs in mental healthcare settings. Papers were required to be published between 2009 and 2025. Eight papers met all criteria for inclusion, and three themes emerged: language barriers to EHR access, lack of access to technology and perceived impact of EHRs on access to care. Language barriers to EHR access, no access to technology and stigma were significant issues for ethnically diverse groups due to concerns about who has access to the electronic health data. Benefits of accessing EHRs included easier and more efficient access to records. EHRs are critical for modern health systems and further work is required to improve EHR usage in mental health systems for ethnically diverse groups.
Electronic Health Record (EHR) data are critical for advancing translational research and AI technologies. The ENACT network offers access to structured EHR data across 57 CTSA hubs. However, substantial information is contained in clinical narratives, requiring natural language processing (NLP) for research. The ENACT NLP Working Group was formed to make NLP-derived clinical information accessible and queryable across the network.
Methods:
We established the ENACT NLP Working Group with 13 sites selected based on criteria including clinical notes access, IT infrastructure, NLP expertise, and institutional support. We divided sites into five focus groups targeting clinical tasks within disease contexts. Each focus group consisted of two development sites and two validation sites. We extended the ENACT ontology to standardize NLP-derived data and conducted multisite evaluations using the Open Health Natural Language Processing (OHNLP) Toolkit.
Results:
The working group achieved 100% site retention and deployed NLP infrastructure across all sites. We developed and validated NLP algorithms for rare disease phenotyping, social determinants of health, opioid use disorder, sleep phenotyping, and delirium phenotyping. Performance varied across sites (F1 scores 0.53–0.96), highlighting data heterogeneity impacts. We extended the ENACT common data model and ontology to incorporate NLP-derived data while maintaining Shared Health Research Informatics NEtwork (SHRINE) compatibility.
Conclusion:
This demonstrates feasibility of deploying NLP infrastructure across large, federated networks. The focus group approach proved more practical than general-purpose approaches. Key lessons include the challenge of data heterogeneity and importance of collaborative governance. This work also provides a foundation that other networks can build on to implement NLP capabilities for translational research.
Initially prescribed for schizophrenia and psychosis, antipsychotics are increasingly prescribed for other indications. Since the late 1990s, prescribing shifted from first-generation to second-generation antipsychotics.
Aims
To examine overall initiation and prevalence of antipsychotic drug prescribing in UK primary care from 1995 to 2018, stratified by gender.
Method
Cohort studies using UK anonymised electronic primary care data from IQVIA Medical Research Data, including over 790 general practices and registered individuals aged 18–99 years.
Results
Antipsychotic drug initiation was stable in the late 1990s, at 6–7/1000 person-years at risk (PYAR) in men and 9–11/1000 PYAR in women. From 2001, initiation declined, stabilising from 2005 onward at 4/1000 PYAR in men and 4–5/1000 PYAR in women. Prevalence remained consistent from 1995 to 2018: 12/1000 in men and 14/1000 in women by 2018. Initiation and prevalence were higher in women than men, but increased with age in both genders (18–39 v. 80–99 years; incidence rate ratio (IRR) 4.85, 95% CI 4.75–4.95 in men; IRR 5.90, 95% CI 5.78–6.02 in women; prevalence rate ratio (PRR) 2.22, 95% CI 2.19–2.25 in men; PRR 4.28, 95% CI 4.24–4.33 in women). Initiation and prevalence were greater in individuals with greater socioeconomic deprivation (Townsend score of 5 v. 1; IRR 2.69, 95% CI 2.64–2.75 in men; IRR 2.19, 95% CI 2.15–2.24 in women; PRR 3.87, 95% CI 3.82–3.92 in men; PRR 2.80, 95% CI 2.77–2.83 in women).
Conclusions
Antipsychotic drug initiation decreased after 2001, stabilising from 2005 onward. Prevalence remained relatively consistent throughout the study period. Women had higher initiation and prevalence than men. However, both genders showed increased prescribing with age and socioeconomic deprivation.
Adults with mood and/or anxiety disorders have increased risks of comorbidities, chronic treatments and polypharmacy, increasing the risk of drug–drug interactions (DDIs) with antidepressants.
Aims
To use primary care records from the UK Biobank to assess DDIs with citalopram, the most widely prescribed antidepressant in UK primary care.
Method
We classified drugs with pharmacokinetic or pharmacodynamic DDIs with citalopram, then identified prescription windows for these drugs that overlapped with citalopram prescriptions in UK Biobank participants with primary care records. We tested for associations of DDI status (yes/no) with sociodemographic and clinical characteristics and with cytochrome 2C19 activity, using univariate tests, then fitted multivariable models for variables that reached Bonferroni-corrected significance.
Results
In UK Biobank primary care data, 25 508 participants received citalopram prescription(s), among whom 11 941 (46.8%) had at least one DDI, with an average of 1.96 interacting drugs. The drugs most commonly involved were proton pump inhibitors (40% of co-prescription instances). Individuals with DDIs were more often female and older, had more severe and less treatment-responsive depression, and had higher rates of psychiatric and physical disorders. In the multivariable models, treatment resistance and markers of severity (e.g. history of suicidal and self-harm behaviours) were strongly associated with DDIs, as well as comorbidity with cardiovascular disorders. Cytochrome 2C19 activity was not associated with the occurrence of DDIs.
Conclusions
The high frequency of DDIs with citalopram in fragile groups confirms the need for careful consideration before prescribing and periodic re-evaluation.
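The Method's step of finding prescription windows that overlap with citalopram prescriptions amounts to an interval-overlap test. The sketch below shows one way to do this; the dates, window lengths, drug list and function names are all hypothetical, and the abstract does not specify how the authors defined a prescription window.

```python
from datetime import date

def windows_overlap(a_start, a_end, b_start, b_end):
    """True when two inclusive date ranges share at least one day."""
    return a_start <= b_end and b_start <= a_end

def co_prescribed_drugs(index_windows, other_prescriptions):
    """Return the names of drugs with any window overlapping an index-drug window."""
    hits = set()
    for drug, start, end in other_prescriptions:
        if any(windows_overlap(start, end, s, e) for s, e in index_windows):
            hits.add(drug)
    return hits

# Hypothetical citalopram window and other prescriptions for one participant
citalopram_windows = [(date(2015, 1, 1), date(2015, 3, 31))]
others = [
    ("omeprazole", date(2015, 3, 15), date(2015, 4, 15)),  # overlaps the window
    ("tramadol", date(2016, 1, 1), date(2016, 1, 28)),     # falls outside it
]
print(co_prescribed_drugs(citalopram_windows, others))
```

A participant's DDI status (yes/no) would then be whether this set is non-empty for drugs on the interaction list.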
Physical health checks in primary care for people with severe mental illness (SMI; defined as schizophrenia, bipolar disorders and non-organic psychosis) aim to reduce health inequalities. Patients who decline or are deemed unsuitable for screening are removed from the denominator used to calculate incentivisation, a practice termed exception reporting.
Aims
To describe the prevalence of, and patient characteristics associated with, exception reporting in patients with SMI.
Method
We identified adult patients with SMI from the UK Clinical Practice Research Datalink (CPRD), registered with a general practice between 2004 and 2018. We calculated the annual prevalence of exception reporting and investigated patient characteristics associated with exception reporting, using logistic regression.
Results
Of 193 850 patients with SMI, 27.7% were exception reported from physical health checks at least once. Exception reporting owing to non-response or declining screening increased over the study period. Patients of Asian or Black ethnicity (Asian: odds ratio 0.72, 95% CI 0.65–0.80; Black: odds ratio 0.86, 95% CI 0.76–0.97; compared with White) and women (odds ratio 0.90, 95% CI 0.88–0.92) had reduced odds of being exception reported, whereas patients diagnosed with ‘other psychoses’ (odds ratio 1.19, 95% CI 1.15–1.23; compared with bipolar disorder) had increased odds. Younger patients and those diagnosed with schizophrenia were more likely to be exception reported owing to informed dissent.
Conclusions
Exception reporting was common in people with SMI. Interventions are required to improve accessibility and uptake of physical health checks to improve physical health in people with SMI.
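The effect of removing exception-reported patients from the denominator can be shown with simple arithmetic. The numbers below are invented and the function name is hypothetical; the sketch only illustrates why the reported achievement rate can exceed the proportion of all registered patients who were actually screened.

```python
def achievement_rates(registered, screened, excepted):
    """Return (rate with exceptions removed from the denominator, rate over all registered)."""
    reported = screened / (registered - excepted)
    population = screened / registered
    return reported, population

# Hypothetical practice: 100 patients with SMI, 60 screened, 20 exception reported
reported, population = achievement_rates(registered=100, screened=60, excepted=20)
print(f"reported achievement={reported:.0%}, population coverage={population:.0%}")
```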
Electronic Health Records (EHR) analysis is pivotal in advancing medical research. Numerous real-world EHR data providers offer data access through exported datasets. While enabling profound research possibilities, exported EHR data require quality control and restructuring for meaningful analysis. Challenges arise in sequence analysis of medical events (e.g., diagnoses or procedures), which provides critical insights into the progression of conditions, treatments, and outcomes. Identifying causal relationships, patterns, and trends requires a more complex approach to data mining and preparation.
Methods:
This paper introduces EHRchitect – an application written in Python that addresses the quality control challenges by automating dataset transformation, facilitating the creation of a clean, formatted, and optimized MySQL database (DB), and sequential data extraction according to the user’s configuration.
Results:
The tool creates a clean, formatted, and optimized DB, enabling medical event sequence data extraction according to users’ study configuration. Event sequences encompass patients’ medical events in specified orders and time intervals. The extracted data are presented as distributed Parquet files, incorporating events, event transitions, patient metadata, and events metadata. The concurrent approach allows effortless scaling for multi-processor systems.
Conclusion:
EHRchitect streamlines the processing of large EHR datasets for research purposes. It facilitates extracting sequential event-based data, offering a highly flexible framework for configuring event and timeline parameters. The tool delivers temporal characteristics, patient demographics, and event metadata to support comprehensive analysis. The developed tool significantly reduces the time required for dataset acquisition and preparation by automating data quality control and simplifying event extraction.
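To illustrate what extracting event sequences "in specified orders and time intervals" can mean in practice, the sketch below finds, for each patient, a first event followed by a second event within a maximum number of days. It is a generic illustration and not EHRchitect's API, schema or algorithm; the function name, record layout, patient IDs and dates are all assumptions, and the ICD-10 codes are used only as example labels.

```python
from datetime import date

def find_transitions(events, first_code, second_code, max_days):
    """Return (patient_id, first_date, second_date) where second_code follows
    first_code within max_days for the same patient."""
    by_patient = {}
    for patient_id, code, when in events:
        by_patient.setdefault(patient_id, []).append((when, code))
    results = []
    for patient_id, items in by_patient.items():
        items.sort()  # chronological order within each patient
        for i, (t1, c1) in enumerate(items):
            if c1 != first_code:
                continue
            for t2, c2 in items[i + 1:]:
                if c2 == second_code and (t2 - t1).days <= max_days:
                    results.append((patient_id, t1, t2))
                    break  # keep only the earliest qualifying follow-up
    return results

# Hypothetical event records: (patient ID, event code, event date)
events = [
    ("p1", "I21", date(2020, 1, 1)), ("p1", "I50", date(2020, 2, 15)),
    ("p2", "I21", date(2020, 1, 1)), ("p2", "I50", date(2021, 1, 1)),
]
print(find_transitions(events, "I21", "I50", max_days=90))
```

Only the first patient qualifies here, because the second patient's follow-up event falls outside the 90-day interval.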