Risk factors for severe disease in patients admitted with COVID-19 to a hospital in London, England: a retrospective cohort study

COVID-19 has caused a major global pandemic and necessitated unprecedented public health restrictions in almost every country. Understanding risk factors for severe disease in hospitalised patients is critical as the pandemic progresses. This observational cohort study aimed to characterise the independent associations between the clinical outcomes of hospitalised patients and their demographics, comorbidities, blood tests and bedside observations. All patients admitted to Northwick Park Hospital, London, UK between 12 March and 15 April 2020 with COVID-19 were retrospectively identified. The primary outcome was death. Associations were explored using Cox proportional hazards modelling. The study included 981 patients. The mortality rate was 36.0%. Age (adjusted hazard ratio (aHR) 1.53), respiratory disease (aHR 1.37), immunosuppression (aHR 2.23), respiratory rate (aHR 1.28), hypoxia (aHR 1.36), Glasgow Coma Scale <15 (aHR 1.92), urea (aHR 2.67), alkaline phosphatase (aHR 2.53), C-reactive protein (aHR 1.15), lactate (aHR 2.67), platelet count (aHR 0.77) and infiltrates on chest radiograph (aHR 1.89) were all associated with mortality. These important data will aid clinical risk stratification and provide direction for further research.


Introduction
COVID-19, the illness caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has caused a global pandemic and has placed an unprecedented strain on healthcare systems worldwide.
In the UK, two large cohort studies have corroborated the associations between older age, male sex and pre-existing comorbidities with poorer outcomes [13,14]. However, these studies included demographic and morbidity characteristics only, with no data to indicate disease severity at the time of presentation to hospital. The inclusion of investigations and observations that indicate clinical condition at presentation allows for increased acuity of predictive models. The aim of this large UK inpatient cohort study is to explore the association between results of admission laboratory tests and clinical observations, alongside demographic and morbidity characteristics, with clinical outcomes of patients hospitalised with COVID-19. All patients admitted with COVID-19 in the study period were enrolled thus avoiding the selection bias observed in other cohorts.

Study population
Northwick Park Hospital is a 658 bedded district general hospital serving an ethnically and socio-economically diverse population in North West London [15]. This observational cohort study retrospectively identified all adult patients (age ≥18 years) admitted to this hospital between 12 March and 15 April 2020 who were confirmed to be positive for SARS-CoV-2 by reverse transcription-polymerase chain reaction of pooled nasal and pharyngeal swabs. Patients were tested for COVID-19 in accordance with contemporaneous Public Health England guidelines [16]. Advice on which patients should be tested remained constant throughout the study period. Patients were admitted to hospital at the discretion of the admitting physician and managed using best supportive treatment, including high flow oxygen, antibiotics at the discretion of the treating physician, continuous positive airway pressure (CPAP), invasive ventilation and organ support as needed. This study had National Research Ethics approval (REC 20/NW/0218, IRAS 282630) and was carried out in accordance with the Declaration of Helsinki and the principles of Good Clinical Practice.

Variables
Data were collected from electronic patient records and entered into a securely stored database. This was pseudonymised for data analysis. Age, sex and ethnicity were recorded. Ethnicity was categorised as White, Black, Asian or Other (including mixed ethnicity). Comorbidities and medications which previous research had suggested might be associated with prognosis were extracted, including: hypertension, heart failure (HF), ischaemic heart disease (IHD), diabetes mellitus (DM), active malignancy and respiratory disease (including asthma, chronic obstructive pulmonary disease (COPD), pulmonary fibrosis and obstructive sleep apnoea). Use of angiotensin converting enzyme inhibitors (ACEi), angiotensinreceptor blockers (ARBs), metformin and immunosuppressive medications (including for autoimmune disease and anti-rejection drugs for solid organ transplant) were recorded. Information on comorbidities from admission documentation was crossreferenced with recent inpatient and specialist communications.
The following observations were collected on arrival to hospital: heart rate, blood pressure, temperature, respiratory rate, Glasgow Coma Scale (GCS) and hypoxia. Hypoxia was defined as oxygen saturations of <94% on air or the need for supplemental oxygen to maintain saturations ≥94%. Exceptions were made for two patients with established COPD who were known to retain carbon dioxide and had pre-COVID-19 saturations <94%.
The following blood test results were collected: haemoglobin, platelet count, lymphocyte count, neutrophil count, potassium, sodium, creatinine, urea, alkaline phosphatase (ALP), alanine aminotransferase (ALT), bilirubin, albumin, C-reactive protein (CRP) and lactate. As lactate levels may change rapidly in response to clinical interventions lactate was only recorded if taken within 4 h of admission; all other blood test results represent the first recorded on admission.
The presence or absence of formally reported pulmonary infiltrates on plain chest radiographs taken on admission was recorded.

Outcomes
Outcomes were ascertained for all patients on the census date of 13 May 2020, 28 days following the latest admission date. For all patients who were discharged from hospital alive, the end of follow-up date was considered to be the census date.
The primary outcome was death during the study period. Two secondary outcomes were considered. First, the composite of death or invasive mechanical ventilation (IMV). Second, the composite of death, IMV or CPAP.

Statistical analysis
Descriptive statistical analyses were performed on all of the population characteristics. Kaplan-Meier methods were used to calculate and graph cumulative survival.
Analyses examining factors associated with survival times were performed using a Cox proportional hazards model. Variable selection was executed by a multi-stage procedure. First, independent variables were examined in a series of univariable analyses. Where necessary, higher-order terms were used to better describe the relationship between continuous variables. Continuous variables found to have a heavily positively skewed distribution were analysed on the log 10 scale.
In the second stage, only factors found to show association with survival times were included. A backwards selection procedure was used to retain the significant variables in the final model.
To visually display the effect of the combinations of risk factors on outcomes, patients were allocated to three equal-sized groups based on their predicted risk from the multivariable regression model. These outcomes were displayed as tertiles on a Kaplan-Meier plot.
Forty-one patients did not have a positive swab result until >5 days after admission. We were unable to accurately establish the reason for this, but potential explanations include initial false-negative swabs and presentations that did not initially fit the contemporaneous case definition of COVID-19. To explore this further, sensitivity analyses were performed with these patients excluded.

Demographics and characteristics
Nine-hundred-and-eighty-one patients were included in the study. The median age was 69 (interquartile range 56-80) years, 64.3% were male and 776 (79.1%) had at least one documented comorbidity. The cohort's full demographic and comorbidity data are shown in Table 1.
Thirty patients received trial drugs for COVID-19: one received azithromycin alone; one received azithromycin with tocilizumab; 10 received dexamethasone; 10 received hydroxychloroquine; four received lopinavir-ritonavir and four received remdesivir.
Almost all patients (97.9%) had a chest radiograph taken on admission and of these 75.0% were reported as having pulmonary infiltrates. (The blood test results are available in the Supplementary material.)

Cohort outcomes
Median follow-up was 37 days (range 10-46). By the censoring date, 354 (36.0%) had died. A Kaplan-Meier survival curve is shown in Figure 1. Of the 627 patients alive at the end of the study period, 566 (90.3%) had been discharged from hospital, 13 (2.1%) had been discharged to rehabilitation facilities and 48 (7.7%) remained on acute wards.
For the 579 discharged alive the median duration of stay was 8 days (range 0-46).

Primary outcome
The association between baseline demographic, clinical, laboratory and radiological features with death, the primary outcome was examined. In multivariable analysis, 12 variables retained statistical significance. The final model was based on data from 741 of the 981 patients, due to missing values for some variables. These results are shown in Table 2.
Age was found to be associated with the risk of death, with a 53% higher chance for each 10-year increase in age (adjusted hazard ratio (aHR) 1.53 [1.37-1.71]). No association was found with either sex or ethnicity. The absence of association between ethnicity and death persisted when Black, Asian and minority ethnic (BAME) groups were combined (see Supplementary material).
Respiratory disease (aHR 1.37 [1.03-1.81]) was associated with a higher risk of death as was immunosuppression (aHR 2.23 The presence of pulmonary infiltrates on chest radiograph was associated with an increased risk of death (aHR 1.89 [1.38-2.60]).
In order to illustrate the difference in outcome between patients with different characteristics, patients were categorised into three tertiles based on the multivariable model. Survival at 60 days was 90% for the patients in the tertile with the 'best' combination of risk factors compared to 33% for the patients in the tertile with the 'worst' combination (Fig. 3).
A sensitivity analysis was performed excluding patients with a first positive swab >5 days after admission. The same variables were included as in the original model and yielded comparable results (see Supplementary material).

Secondary outcomes
Time to death or IMV was examined as a secondary composite endpoint using the same variables as for the primary outcome. Of the 981 patients in the cohort, 426 died or required IMV during the study period. The variables associated with this secondary outcome were similar to those for the primary endpoint, although ALP, respiratory disease and immunosuppressive medication were not found to be significantly associated. In addition, lower sodium and albumin levels were associated with higher risk of death or requiring IMV.
Four-hundred-and-sixty-one patients incurred the secondary endpoint of death, IMV or CPAP. The same variables were significantly associated with this endpoint as with the time to death or IMV, although the size of effects varied for some factors.
The full description of these secondary outcomes is shown in Table 3. Kaplan-Meier plots and sensitivity analyses are shown in the Supplementary material.

Discussion
This large cohort study of patients hospitalised with COVID-19 describes the association between a number of clinical factors and the risk of severe disease. The study was conducted in a major London hospital that serves an ethnically diverse population, with 69.6% of study patients from BAME groups. Overall, the mortality rates were comparable with those in other parts of the UK [14]. Although the results are from a single hospital, our catchment area covers a part of London with a very wide range of income, social status and ethnicity. We have no reason to believe that it is not representative of the city generally.
This study demonstrates a dramatic difference in outcomes between those who are in the tertile with the best combination of risk factors (90% survival at 60 days) vs. those in the tertile with the worse combination of risk factors (33% survival). This provides clear evidence that risk-stratification can be made early in hospital presentation.
The results of this study generally corroborate a number of previously published findings. However, some differences also exist. Increasing age was strongly associated with poorer outcomes, corroborating the findings of others [3, 6-8, 17, 18]. In line with previous research [19,20] a strong association between respiratory diseases and death was found. An association was also found between the use of immunosuppression and death which has been reported separately [21]. This population had a high incidence of hypertension and although this was a risk factor for severe disease in univariable analysis, it was not associated in multivariable analysis. This implies that although hypertension itself is not a risk factor for severe disease, it is associated with conditions which are. This is in line with other studies [7][8][9].
Unlike the recent UK cohort reported by the OpenSAFELY Collaborative [13], this study did not find an association between ethnicity and worse outcomes. Although BAME groups have been noted to have poor outcomes from COVID-19 at a population level, the explanation of which is likely to be a complex interaction between socio-economic and medical factors, the outcomes between different ethnic groups in hospitalised cohorts has  often been found to be similar [9,[22][23][24] and this was the case in our study. In this cohort, 64.3% were male, which is notably higher than the hospital's catchment area (50%) [15]. In line with a number of cohort studies of hospitalised patients, we failed to demonstrate an association with sex when adjusting for other factors [6,9,25]. One large UK cohort did find sex to be a significant factor when examined at the population level [14]. It may be that BAME and male patients are prone to more severe disease and therefore more of them appear in hospital-based cohorts but once in hospital they do not have higher mortality. Further research is needed to understand this.
Seventy-five percent of this cohort had pulmonary infiltrates reported on their admission chest radiograph which were associated with poorer outcomes as corroborated by other studies [26][27][28].
Several of the laboratory investigations taken on admission were associated with worse outcomes including CRP, urea and ALP.
Although several studies have found an association with raised hepatic enzymes [29][30][31], an independent association between death and ALP has not previously been reported. ALT was not associated with outcomes, suggesting that hepatocellular injury is not the principal explanation for this. Notably, the relationship between outcomes and ALP has been demonstrated in other infectious diseases and this finding merits further research [32].
Unlike some previous studies, reduced lymphocyte counts were not found to be associated with death [12]. An association was found between raised lactate and adverse outcomes. Although a few other studies have included this variable, it has been previously reported [23,33]. Hyperlactataemia may represent hypoperfusion in patients with a sepsis-like presentation or a high adrenergic state secondary to immune activation and it should be explored in future research.
The results of the analyses on secondary endpoints were broadly comparable with the results of the primary endpoints, although neither the use of immunosuppression nor comorbidity with respiratory disease was significantly associated with the secondary outcomes. The dilution of this association may be explained by selection bias from clinicians on the suitability for either mechanical ventilation or CPAP for patients with these comorbidities.
This study has a number of strengths. Crucially, it included all sequentially admitted patients during a predetermined time period thus reducing the selection bias which has been observed in many studies. This study includes admission observations and investigations, accounting for disease severity in the model. It also benefits from a high degree of data completeness, with patient characteristics cross-validated with discharge summaries and other information sources to ensure accuracy.
A limitation of this study is that for the purpose of the time to event analyses, it had to be assumed that those who had clinically improved and were discharged prior to censoring remained alive until the censoring date. Although it is possible that some of these patients died post-discharge, this risk was minimised by ensuring that patients were screened for readmission and by ensuring that none of the cohort were discharged for community end-of-life care.
A second limitation is that a significant minority of patients (157/981) did not have an ethnicity documented. The majority of these patients had opted not to have their ethnicity recorded.
In conclusion, this large cohort study included all patients who were hospitalised with COVID-19 during the study period. It was conducted in an area of London with high levels of ethnic diversity [15]. When considering the high mortality among those hospitalised with COVID-19, early identification of those most susceptible to a severe disease is of paramount importance. This study found that predominantly respiratory features such as respiratory rate, oxygen saturations, infiltrates on chest radiograph and a history of respiratory disease were associated with severe disease. Comparatively few non-respiratory features were also identified as being associated with severe disease (age, reduced GCS and immunosuppression) along with a number of ACEi/ARB, angiotensin converting enzyme inhibitor/angiotensin II receptor blockers. *Hazard ratios given for a 5-unit increase in variable. **Hazard ratios given for a 10-unit increase in variable. ***Hazard ratios given for a 50-unit increase in variable. ****Hazard ratios given for a 100-unit increase in variable. + Variable analysed on the log scale (base 10). Variables found to be significant in univariable analysis but did not retain significance in multivariable analysis were excluded from the final model.  CI, confidence interval. *Hazard ratios given for a 5-unit increase in variable. **Hazard ratios given for a 10-unit increase in variable. ***Hazard ratios given for a 50-unit increase in variable. ****Hazard ratios given for a 100-unit increase in variable. Variables found to be significant in univariable analysis but did not retain significance in multivariable analysis were excluded from the final model. Only variables found to be associated with outcomes are shown in this table. The variables considered are the same as shown in Table 2 and the full results are available in the Supplementary material.