Serotype-specific differences in short- and longer-term mortality following invasive pneumococcal disease

SUMMARY Invasive pneumococcal disease (IPD), caused by infection with Streptococcus pneumoniae, has a substantial global burden. There are over 90 known serotypes of S. pneumoniae with a considerable body of evidence supporting serotype-specific mortality rates immediately following IPD. This is the first study to consider the association between serotype and longer-term mortality following IPD. Using enhanced surveillance data from the North East of England we assessed both the short-term (30-day) and longer-term (⩽7 years) independent adjusted associations between individual serotypes and mortality following IPD diagnosis using logistic regression and extended Cox proportional hazards models. Of the 1316 cases included in the analysis, 243 [18·5%, 95% confidence interval (CI) 16·4–20·7] died within 30 days of diagnosis. Four serotypes (3, 6A, 9N, 19F) were significantly associated with overall increased 30-day mortality. Effects were observable only for older adults (⩾60 years). After extension of the window to 12 months and 36 months, one serotype was associated with significantly increased mortality at 12 months (19F), but no individual serotypes were associated with increased mortality at 36 months. Two serotypes had statistically significant hazard ratios (HR) for longer-term mortality: serotype 1 for reduced mortality (HR 0·51, 95% CI 0·30–0·86) and serotype 9N for increased mortality (HR 2·30, 95% CI 1·29–4·37). The association with serotype 9N was no longer observed after limiting survival analysis to an observation period starting 30 days after diagnosis. This study supports the evidence for associations between serotype and short-term (30-day) mortality following IPD and provides the first evidence for the existence of statistically significant associations between individual serotypes and longer-term variation in mortality following IPD.


INTRODUCTION
Invasive pneumococcal disease (IPD), caused by infection with Streptococcus pneumoniae, has a substantial global burden in young children [1] and adults [2].
There are over 90 known serotypes of S. pneumoniae, differentiation of which is based on the composition of the polysaccharide capsule [3]. Different serotypes induce different immune responses and, together with other bacterial virulence factors and host risk factors, contribute to pathogenicity and severity of infection [3]. Given that vaccines against infection with S. pneumoniae are limited to certain subsets of serotypes, consideration of differential pathogenicity (and the resultant severity of infection and associated mortality) is important for maximizing the societal benefits of a vaccination programme [4].
There is a growing body of evidence to support the existence of serotype-specific mortality rates immediately following IPD [4–9]. Studies which include infection of children are limited in statistical power due to low mortality rates for this age group [4,5], but there is clear evidence for adults that certain serotypes are associated with a relatively increased 30-day mortality compared to other serotypes [8]. In the longer-term, reduced life expectancy following pneumococcal pneumonia (⩽10 years) [10] and meningitis (⩽20 years) [11] has also been observed. Generally, septicaemia is associated with increased rates of long-term mortality (⩽6 years), although it is not clear whether this is an independent consequence of pathology, or due to comorbidities or sequelae [12]. Whatever the mechanism, such an effect is likely for pneumococcal septicaemia.
Given the likely impact of IPD on longer-term mortality, it is important to understand whether serotype-specific associations with longer-term mortality exist. Such information may help to inform cost-effectiveness studies for future vaccine formulations. Using enhanced surveillance data from the North East of England linked to registered deaths, we assessed both the short-term (30-day) and longer-term (⩽7 years) associations between individual serotypes of S. pneumoniae and mortality following diagnosis.

METHODS
Study population
The North East of England has a population of about 2·6 million persons [13], and has the least favourable self-reported general health in the country [14]. Life expectancy is among the lowest in the UK; 1·0 and 1·1 years below the national average for males and females, respectively [13]. On a small area level, the North East has a disproportionate level of social deprivation compared to other regions of the UK: 33% of small area populations of 1500 persons are within the most deprived national quintile [15].

Surveillance data and linkage to mortality data
Cases of IPD with a specimen date between 1 April 2006 and 31 March 2013 were obtained from the North East IPD enhanced surveillance system. Full details of the surveillance system are described elsewhere [16]. Briefly, all laboratories in the region notified local health protection staff of laboratory-confirmed cases of IPD. Following notification, telephone interviews with laboratory staff and primary-/secondary-care clinicians were conducted to obtain details of risk factors (indicators for pneumococcal vaccination agreed for England plus alcohol misuse) and vaccination status. Isolates were typed at the national reference laboratory. Cases were residents of the North East of England with S. pneumoniae detected from a normally sterile site. Age, sex, pneumococcal vaccination status, risk factors for IPD (alcohol misuse, chronic heart disease, chronic liver disease, chronic lung disease, chronic renal disease, diabetes, immunosuppression), and clinical presentation (bacteraemic pneumonia, meningitis, septicaemia) were routinely collected for each case. Other clinical presentations found at low frequency within the dataset (e.g. endocarditis, peritonitis) were grouped into a single category. Cases were attributed to quintiles of social deprivation as described elsewhere [17]. The social deprivation index used incorporates multiple domains (income, employment, health, education, housing, crime) to characterize small area living environments. All cases where the serotype was unknown were excluded. Serotypes which each accounted for >1% of the dataset were considered individually. We limited analysis to cases without missing data for any variables and those which could be linked to mortality data. For cases where the positive specimen was taken on or after the date of death, specimen dates were adjusted to 1 day before the date of death.
Cases were linked to Office for National Statistics registered deaths up to 30 September 2013 (data obtained March 2014) using unique National Health Service (NHS) numbers. Data were used in accordance with a data access agreement with Public Health England. The study was neither powered nor designed to test a specific hypothesis.
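The specimen-date adjustment and the 30-day outcome described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and treating "within 30 days" as a difference of at most 30 days is an assumption; the source does not state how the boundary day was handled.

```python
from datetime import date, timedelta
from typing import Optional


def adjusted_specimen_date(specimen: date, death: Optional[date]) -> date:
    """Return the specimen date used for analysis.

    If the positive specimen was taken on or after the date of death,
    the specimen date is moved to 1 day before the date of death.
    """
    if death is not None and specimen >= death:
        return death - timedelta(days=1)
    return specimen


def died_within(specimen: date, death: Optional[date], days: int) -> bool:
    """True if death occurred within `days` of the (adjusted) specimen date.

    Assumption: the boundary day itself counts as "within".
    """
    if death is None:
        return False
    start = adjusted_specimen_date(specimen, death)
    return (death - start).days <= days
```

Under this rule a case diagnosed at post-mortem contributes 1 day of follow-up rather than zero or a negative duration, which is what allows it to be retained in the survival analysis.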
Associations with mortality at 30 days, 12 months and 36 months post-diagnosis
A binary logistic regression model was used to assess serotype-specific differences in all-cause mortality at three time points following diagnosis with IPD (30 days, 12 months, 36 months). Only cases with follow-up to the time point were included in that analysis. For each model, variables with a single variable association with outcome of P < 0·2 (χ² for variables with two categories, Wald test for those with >2) were considered for inclusion in the multivariable model; age group, sex and serotype were retained irrespective of statistical significance. A backward selection model-building strategy was used, starting with a model containing all selected variables from the single variable analysis. Variables were then tested for exclusion if their coefficient in the model had an associated P value >0·05 (starting with the variable with the highest P value). Each reduced model was evaluated for fit using a likelihood ratio test. Goodness of fit for the final model (where all parameters had an associated P value <0·05) was assessed using the Hosmer-Lemeshow goodness-of-fit test (with ten groups) [18] and a case classification table (cut-off 0·5). Marginal predicted probabilities of death at 30 days were obtained from the final multivariable model for each age group.
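For readers unfamiliar with the Hosmer-Lemeshow test, the statistic can be sketched in a few lines of Python. This is an illustration of the standard statistic rather than the routine used in the analysis (which was run in Stata); the function name and the equal-sized grouping rule are assumptions, and software packages differ in how they handle ties at group boundaries.

```python
from typing import Sequence, Tuple


def hosmer_lemeshow(probs: Sequence[float],
                    outcomes: Sequence[int],
                    groups: int = 10) -> Tuple[float, int]:
    """Return (statistic, degrees of freedom) for the Hosmer-Lemeshow test.

    Cases are ordered by predicted probability and cut into `groups`
    bins of near-equal size. For each bin, the observed number of
    deaths O is compared with the expected number E (the sum of the
    predicted probabilities): sum of (O - E)^2 / (n * p * (1 - p)),
    where p is the mean predicted probability in the bin.
    """
    # Stable sort on the predicted probability only.
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / size
        if 0.0 < mean_p < 1.0:
            statistic += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    # The statistic is referred to a chi-squared distribution on groups - 2 D.F.
    return statistic, groups - 2
```

With ten groups the reference distribution has 8 degrees of freedom, and a large P value indicates no evidence of poor calibration.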

Survival analysis
A Cox proportional hazards (PH) model was used to assess serotype-specific differences in all-cause mortality following diagnosis with IPD (defined as specimen date of the first positive test). An event was the day of death, truncated to 30 September 2013. To ensure acceptable precision at the end of the observation period, the period was limited to 7 years (truncating a small number of cases with data beyond this time point). Crude associations with mortality were assessed using the log rank test and any variables with an association of P < 0·2 were considered for inclusion in the Cox PH model. A backward selection strategy was used, as for the logistic regression model. The assumption of PH of the final model was evaluated using a test of correlation of scaled Schoenfeld residuals with time. Where the PH assumption was not met (P < 0·05), an extended Cox PH model was specified to include time-dependent effects (as a function of the natural logarithm of time) for each variable. The improvement of the extended model over the main-effects model was assessed using a likelihood ratio test. Adjusted Kaplan-Meier survival curves were used to assess serotype-specific effects after adjusting for all variables included in the Cox PH model. We repeated the Cox PH model using an observation period starting 30 days after diagnosis.
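The construction of follow-up times and a survival curve can be sketched as follows. This is a simplified, unadjusted Kaplan-Meier estimator for illustration only; the curves described above were adjusted for the Cox model covariates. The function names are hypothetical, and the 2557-day cap is an assumption chosen to match the maximum follow-up reported in the results.

```python
from datetime import date
from typing import List, Optional, Tuple

STUDY_END = date(2013, 9, 30)   # last date of registered deaths used
MAX_FOLLOW_UP_DAYS = 2557       # assumption: 7 years expressed in days


def follow_up(specimen: date, death: Optional[date]) -> Tuple[int, bool]:
    """Return (days of follow-up, death observed) for one case."""
    observed = death is not None and death <= STUDY_END
    end = death if observed else STUDY_END
    days = (end - specimen).days
    if days > MAX_FOLLOW_UP_DAYS:
        # Follow-up beyond 7 years is censored at the cap.
        return MAX_FOLLOW_UP_DAYS, False
    return days, observed


def kaplan_meier(records: List[Tuple[int, bool]]) -> List[Tuple[int, float]]:
    """Unadjusted Kaplan-Meier curve as (time, survival) at each death time."""
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted({days for days, _ in records}):
        deaths = sum(1 for days, event in records if days == t and event)
        leaving = sum(1 for days, _ in records if days == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases leave the risk set
    return curve
```

Censored cases (alive at 30 September 2013 or at the 7-year cap) reduce the number at risk without changing the survival estimate at that time.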

Statistical software
All analysis was performed using Stata v. 13.1 (StataCorp., USA).

RESULTS
Surveillance data and linkage to mortality data
From the full surveillance dataset, 1779/1785 (99·7%) IPD cases had NHS numbers available. Of these cases, 210 (11·8%) had no serotype data available. A further 253 cases (16·1%) had data missing for at least one variable (postcode, n = 26; pneumococcal vaccination status, n = 84; clinical presentation, n = 22; ⩾1 risk factor, n = 162) and were excluded, leaving 1316 cases for analysis (74·0% of the original dataset, Table 1). Of this dataset, 24 serotypes each accounted for >1% of cases (⩾16 cases per serotype) and were included as individual categories (Table 2). Forty-seven cases from this dataset were diagnosed with IPD on the date of death or at post-mortem; to enable inclusion of these cases in survival analysis, the specimen date was adjusted to the day before the date of death.
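The exclusion cascade can be checked with simple arithmetic. The short Python sketch below uses only the counts given in the text; the variable names are illustrative. The denominators are inferred from the reported percentages: the 74·0% figure is reproduced with the 1779 linkable cases as the denominator.

```python
total_cases = 1785        # full surveillance dataset
with_nhs_number = 1779    # cases that could be linked to mortality data
no_serotype = 210         # excluded: serotype unknown
missing_covariate = 253   # excluded: at least one variable missing

typed = with_nhs_number - no_serotype    # cases with serotype data
analysed = typed - missing_covariate     # final analysis dataset


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


print(typed, analysed)                    # 1569 1316
print(pct(with_nhs_number, total_cases))  # 99.7
print(pct(no_serotype, with_nhs_number))  # 11.8
print(pct(missing_covariate, typed))      # 16.1
print(pct(analysed, with_nhs_number))     # 74.0
```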
A multivariable model for associations with mortality at 30 days contained age group, sex, serotype (retained irrespective of significance) plus deprivation, clinical presentation and number of risk factors (Supplementary Table S1, Table 3). The model has an overall acceptable goodness of fit (χ² = 8·46, D.F. = 8, P = 0·390), no reduction in fit compared to the starting model (χ² = 6·30, D.F. = 7, P = 0·505) and correct classification of 82% of cases. From this model, four serotypes (3, 6A, 9N, 19F) were significantly associated with an overall increased mortality compared to the group of other serotypes. There is little variation in predicted probabilities of mortality at 30 days by serotype in children and younger adults (Fig. 2). Those serotypes found to have an overall significant association with increased mortality at 30 days are only predicted to have a rate significantly above the age group average for older adults (⩾60 years).
After extending the all-cause mortality window to 12 months, one serotype remained associated with increased odds of mortality by this time point (19F) and one serotype (1) was associated with reduced odds of mortality, both compared to the group of other serotypes (Table 3). Nonetheless, there remained substantial variation in predicted 12-month all-cause mortality rates with different serotypes for older age groups (Fig. 3). When comparing significant predictors to the 30-day all-cause mortality model, deprivation and clinical presentation were no longer significant predictors of death, while immunosuppression was now associated with increased odds of death and chronic renal disease was associated with reduced odds of death (Supplementary Table S2, Table 3). The 12-month mortality model also showed an acceptable goodness of fit (χ² = 7·49, D.F. = 8, P = 0·485), no reduction in fit compared to the full starting model (χ² = 5·17, D.F. = 7, P = 0·640), and a good classification rate for cases (77%).
Extending the mortality window to 36 months further reduced the association of serotype with mortality (Table 3). Only one individual serotype (1) remained a significant predictor of (a reduced) all-cause mortality at 36 months compared to the group of other serotypes. For this model, the number of IPD risk factors was no longer a significant predictor, but three additional individual risk factors (chronic heart disease, chronic liver disease, chronic lung disease, plus immunosuppression also included in the 12-month model) were now associated with increased odds of death (Supplementary Table S3, Table 3). Similarly to the 30-day model (but not the 12-month model), deprivation was a significant predictor of all-cause mortality within 36 months. As with both other time-point models there was substantial variation in predicted all-cause mortality rates with different serotypes for older age groups (Fig. 4). The 36-month mortality model also showed an acceptable goodness of fit (χ² = 5·20, D.F. = 8, P = 0·736), no reduction in fit compared to the full starting model (χ² = 5·07, D.F. = 7, P = 0·651), and a good classification rate for cases (78%).
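The P values quoted for the goodness-of-fit and likelihood ratio tests follow directly from the χ² statistic and its degrees of freedom. A minimal Python sketch of the upper-tail χ² probability is given below, using the closed-form expressions that hold for integer degrees of freedom; the function name is illustrative and this is not the routine used in the original analysis.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Statistics and D.F. as reported in the text for the three time-point models.
for stat, df in [(8.46, 8), (6.30, 7), (7.49, 8), (5.17, 7), (5.20, 8), (5.07, 7)]:
    print(stat, df, round(chi2_sf(stat, df), 3))
```

For the Hosmer-Lemeshow test and the likelihood ratio comparison with the starting model, P > 0·05 is the desired result: it indicates no evidence of poor fit and no significant loss of fit from removing variables, respectively.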
Sensitivity analysis of all-cause mortality by time points showed no pattern or substantial change in crude odds ratios when using all available cases and not just cases without missing data: 30-day (median +18% change, range −119% to +26%); 12 months (median −8% change, range −80% to +99%); 36 months (median −10% change, range −62% to +9%). Given the small numbers of cases for certain serotypes, changes in the significance of associations were found, although there was no trend associated with the sensitivity analysis (Supplementary Table S4). Using all available data for final multivariable models did not lead to the loss of any significant associations for individual serotypes (Supplementary Table S5).

Analysis of longer-term survival
Of the 1316 cases included in the analysis, 542 (41·2%, 95% CI 38·5–43·9) died within the available follow-up period (from date of diagnosis to 30 September 2013; median 889 days, range 1–2557). For the all-cases model, the main-effects Cox PH model included age group, sex, deprivation, serotype, clinical presentation, chronic heart disease, chronic liver disease, chronic lung disease, and immunosuppression (Supplementary Table S6, Table 4). Sensitivity analysis using all available cases rather than just cases without any missing data suggested no substantial change in crude associations if all available data were included (Supplementary Table S4). The assumption of PH was not met for three age groups (40–59, 60–79, ⩾80 years), sex, one serotype (3), one clinical presentation (meningitis) and immunosuppression. Subsequent time-dependent effects were significant (P < 0·05) for all three age groups, sex, clinical presentation (meningitis) and immunosuppression (Supplementary Table S6). The extended Cox PH model represented an overall significantly improved fit compared to the main-effects model (χ² = 55·99, D.F. = 6, P < 0·001) and the results of this model were used for assessment of the hazard ratio (HR) (Table 4). Two serotypes (1, reduced mortality; 9N, increased mortality) had statistically significant HRs, demonstrating serotype-specific variation in survival following infection, after adjustment for other significant predictors. These average serotype-specific effects are restricted to older adults (Fig. 5). There was no change in the significance of main-effects predictors for serotypes using all available data for the final multivariable model (Supplementary Table S5).
The analysis using an observation period starting 30 days after diagnosis included 1073 cases, of which 299 (27·9%, 95% CI 25·2–30·7) died within the available follow-up period (median 1140 days, range 1–2527). The Cox PH model included main effects for age group, sex, serotype, clinical presentation, chronic heart disease, chronic liver disease, chronic lung disease and immunosuppression (Supplementary Table S5, Table 4). The assumption of PH was not met for two serotypes (1, 35F) and chronic liver disease. Subsequent time-dependent effects were significant (P < 0·05) only for serotype 35F (Supplementary Table S5). The extended Cox PH model represented an overall significantly improved fit compared to the main-effects model (χ² = 4·81, D.F. = 1, P = 0·028) and the results of this model were used for assessment of the HR (Table 4). Six serotypes (1, 7F, 19A, 22F, 23F, 35F) had statistically significant HRs, although only one of these (35F) was for an increased risk of all-cause mortality. The association between serotype 9N and increased all-cause mortality found in the model containing the whole observation period (Table 3) was not found when excluding the first 30 days following diagnosis from the observation period (and thus removing from analysis the 243 cases that died within that time period).
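For clarity, an extended Cox model with time-dependent effects specified as a function of the natural logarithm of time can be written in the general form below. The notation is illustrative and is not taken from the original analysis: x_j are the covariates, β_j the main effects, T the set of covariates given a time-dependent term, and γ_k the corresponding time-dependent coefficients.

```latex
h(t \mid x) = h_0(t)\,
  \exp\!\Big( \sum_{j} \beta_j x_j \;+\; \sum_{k \in T} \gamma_k\, x_k \ln t \Big),
\qquad
\mathrm{HR}_k(t) = \exp\!\big( \beta_k + \gamma_k \ln t \big) \quad (k \in T).
```

For a covariate without a time-dependent term the hazard ratio is the constant exp(β_j); for a covariate in T the hazard ratio changes with time since diagnosis, which is why a single main-effects estimate does not adequately summarize its association with mortality.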
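The confidence intervals for the proportions of deaths can be approximated with a short Python sketch. The source does not state which interval method was used, so the Wilson score interval below is an assumption; it may differ from the published limits in the last decimal place. The function name is illustrative.

```python
import math
from typing import Tuple


def wilson_interval(events: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default: 95%)."""
    p = events / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts reported in the text: (deaths, cases)
for deaths, cases in [(243, 1316), (542, 1316), (299, 1073)]:
    low, high = wilson_interval(deaths, cases)
    print(f"{deaths}/{cases}: {100 * deaths / cases:.1f}% "
          f"({100 * low:.1f} to {100 * high:.1f})")
```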

DISCUSSION
Given the requirement to demonstrate the cost-effectiveness of vaccination programmes, together with the need to select only certain serotypes for inclusion in vaccine formulations, the existence of serotype-specific mortality has implications for determining the societal benefits of pneumococcal vaccines. It is well-established that mortality rates immediately following IPD are associated with the serotype of S. pneumoniae causing the infection [8]. This is, to our knowledge, the first study to consider the potential association between serotype and longer-term mortality. Using a cohort of IPD cases from a region of England with a population of ∼2·5 million persons, we have estimated the relative associations between short-term (30-day) and long-term (⩽7 years) all-cause mortality following IPD and 24 different serotypes of S. pneumoniae.
The lack of association between three serotypes (3, 6A, 9N) and increased mortality rates using time-frames beyond 30 days suggests that the clinical effects of differential virulence due to these serotypes are restricted to pathogenesis (and/or response to treatment) during the acute stage of the infection. Serotype 19F was found to be associated with increased mortality at 12 months as well as 30 days, but not when the window was expanded to 36 months. Furthermore, mortality during the 30 days following infection was significantly associated with the cumulative number of IPD risk factors, rather than individual risk factors being themselves associated with increased mortality, as is seen with the 12 months (immunosuppression) and 36 months (chronic heart disease, chronic liver disease, chronic lung disease, immunosuppression) analyses. It would appear that there is a switch from the combined effect of multiple comorbidities (within 30 days and 12 months of infection) to an effect of specific comorbidities as the time window since infection is expanded, such that within the 36-month time period the number of risk factors is no longer itself significantly associated with mortality. The association between reduced mortality at 12 and 36 months and chronic renal disease may reflect increased contact rates with health services that minimize the probability of late diagnosis associated with poorer outcomes.
Through the use of a multivariable Cox PH model we have produced fully adjusted, independent associations between serotype and longer-term mortality following IPD. As was seen with the longer window time-point analyses, specific risk factors are predictive of increased mortality rates, rather than the cumulative number. Over a time period of ⩽7 years, we have shown that serotype is associated with significant variation in mortality, with one serotype associated with significantly reduced mortality (1) and one serotype associated with significantly increased mortality (9N). Due to limited statistical power, these effects are only observable in older adults, and further higher-powered studies are required to investigate these associations for children and younger adults. Serotypes 1 and 9N have previously been shown to be associated with decreased and increased mortality, respectively [7,8].
When survival analysis was limited to an observation period that excluded the first 30 days following diagnosis, an altered pattern of serotype-specific mortality rates was observed, with a greater number of serotypes associated with significantly lower mortality rates (serotypes 1, 7F, 19A, 22F, 23F) and a different serotype associated with significantly higher mortality rates (serotype 35F). However, the estimate for serotype 35F has quite low precision and this statistical uncertainty requires further investigation with a larger dataset. Although the association between serotype 9N and increased mortality is no longer statistically significant, the point estimate for the HR remains >1 and the lack of significance may be due to the reduced power of the analysis compared to the full survival analysis.
[Figure caption: (a) All serotypes by age group, (b) significantly associated serotypes for all ages, (c) significantly associated serotypes for ages 0–39 years, (d) significantly associated serotypes for ages ⩾40 years. Survival function is adjusted within each age group for sex, deprivation quintile, clinical presentation, chronic heart disease, chronic liver disease, chronic lung disease, and immunosuppression. Only the two serotypes significantly associated with survival from the multivariable model are shown individually.]
We have not been able to consider the influence of events which have occurred post-diagnosis with IPD on the mortality of cases. Of course, it is expected that factors which might contribute towards an increased risk of mortality will have occurred in the follow-up period. However, for these factors to confound relative associations between mortality and individual serotypes they would need to be both causes of increased mortality and be associated with individual serotypes, and it is difficult to foresee situations when this would occur. Furthermore, it should be considered that this study has only assessed relative associations between serotypes and mortality following IPD; no inferences can be made regarding mortality following infection with individual serotypes compared to mortality for similar individuals who have not had IPD.
This study provides the first evidence that the pneumococcal serotype causing IPD is associated with differential longer-term mortality. Due to limited statistical power for children and younger adults it is currently not clear whether these associations apply to these age groups as well as to older adults. Further studies are required to determine the influence of serotype-specific longer-term mortality on the burden of IPD and the consequent implications for the relative benefit of different vaccine compositions.