
Modelling socio-economic mortality at neighbourhood level

Published online by Cambridge University Press: 11 April 2023

Jie Wen
Affiliation:
Lloyds Banking Group, Edinburgh EH3 9PE, UK
Andrew J.G. Cairns
Affiliation:
The Maxwell Institute for Mathematical Sciences, Edinburgh EH9 3FD, UK; Department of Actuarial Mathematics and Statistics, School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK
Torsten Kleinow*
Affiliation:
Research Centre for Longevity Risk, Faculty of Economics and Business, University of Amsterdam, Amsterdam, Netherlands
*Corresponding author. E-mail: t.kleinow@uva.nl

Abstract

In this study, we quantify the relationship between socio-economic status and life expectancy and identify combinations of socio-economic variables that are particularly useful for explaining mortality differences between neighbourhoods in England. We achieve this by examining socio-economic variation in mortality experiences across small areas in England known as lower layer super output areas (LSOAs). We then consider 12 socio-economic variables that are known to have a strong association with mortality. We estimate the relationship between those variables and mortality rates using a random forest algorithm. Based on the resulting estimate, we then create a new socio-economic mortality index – the Longevity Index for England (LIFE). The index is constructed in a way that eliminates the impact of care homes, which might artificially increase mortality rates in LSOAs that contain a care home relative to LSOAs that do not. Using mortality data for different age groups, we make the index age-dependent and investigate the impact of specific socio-economic characteristics on the age-specific mortality risk. We compare the explanatory power of the LIFE index with that of the English Index of Multiple Deprivation (IMD) as a predictor of mortality. While we find that the IMD can explain regional mortality differences to some extent, the LIFE index has significantly greater explanatory power for mortality differences between regions. Our empirical results also indicate that income deprivation amongst the elderly and employment deprivation are the most significant socio-economic factors for explaining mortality variation across LSOAs in England.
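The care-home adjustment mentioned in the abstract can be illustrated with a small sketch. It assumes, as an illustration only, that the index is obtained by evaluating a fitted predictor with the two care-home covariates ($x_{11}$ and $x_{12}$ in Table 1) replaced by a neutral value of zero. The function names, the column positions and the toy predictor below are hypothetical and are not taken from the article.

```python
from typing import Callable, List, Sequence

# Assumed zero-based positions of the care-home covariates x11 and x12.
CARE_HOME_COLUMNS = (10, 11)


def care_home_neutral_index(
    predict: Callable[[Sequence[float]], float],
    covariates: Sequence[float],
    care_home_columns: Sequence[int] = CARE_HOME_COLUMNS,
    neutral_value: float = 0.0,
) -> float:
    """Evaluate a fitted predictor with the care-home covariates neutralised.

    The input covariates are copied, so the caller's data is left unchanged.
    """
    adjusted: List[float] = list(covariates)
    for column in care_home_columns:
        adjusted[column] = neutral_value
    return predict(adjusted)


def toy_predict(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a fitted model (not the article's estimate)."""
    return 1.0 + 0.5 * x[0] + 2.0 * x[10] + 1.0 * x[11]


# One hypothetical LSOA: 12 covariates, the last two describe care homes.
lsoa = [0.2] + [0.0] * 9 + [0.3, 0.1]
raw_risk = toy_predict(lsoa)
index_value = care_home_neutral_index(toy_predict, lsoa)
```

In this toy setting the raw prediction is inflated by the care-home terms, while the neutralised value depends only on the remaining covariates, which is the qualitative behaviour the abstract describes.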

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of The International Actuarial Association

Table 1. Predictive variables used in our study to model the relative mortality risk, ${R^0}$. Variables $x_1, \ldots, x_9$ are standardised using a N(0,1) distribution function, $x_{11}$ and $x_{12}$ take values in [0, 1] and $x_{10}$ is a categorical variable taking one of five values explained in Table 2.

Table 2. Five categories for the urban–rural class (predictive variable $x_{10}$).

Table 3. Correlations between the covariates; see Table 1 for details about the covariates. Empirical correlations have been calculated using all LSOAs in England and Wales regardless of their urban–rural classification. Note that $x_{10}$ (urban–rural classification) is not included in the table.

Figure 1. The values of the piecewise constant regression tree $\hat{f}^{(b)}_5(x)$ in (4.8) after five splits. Blue solid lines show the boundaries of the six nodes. Red numbers are the estimated relative risks for all LSOAs in each of those nodes. Each of the gray dots represents the observed values of old-age income deprivation and employment deprivation for a single LSOA in the training set for this example.

Figure 2. A graphical representation of our example regression tree $\hat{f}^{(b)}_5(x)$ in (4.8), and the residual sum of squares, $\mbox{RSS}_s^b$ (4.6) as a function of the number of splits s for this tree.
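The role of the residual sum of squares in growing a regression tree can be sketched as follows. This is a generic CART-style search for a single split on one covariate, written under the assumption that each node predicts the mean response of its observations. It is not a reproduction of equations (4.6) or (4.8), which are not shown here, and the function names and data are illustrative.

```python
from typing import List, Optional, Sequence, Tuple


def rss(values: Sequence[float]) -> float:
    """Residual sum of squares around the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[Optional[float], float]:
    """Return (threshold, total RSS) of the best binary split on x.

    If no split lowers the RSS, the threshold is None and the RSS is
    that of the unsplit node.
    """
    pairs: List[Tuple[float, float]] = sorted(zip(x, y))
    best: Tuple[Optional[float], float] = (None, rss(y))
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical covariate values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [value for _, value in pairs[:k]]
        right = [value for _, value in pairs[k:]]
        total = rss(left) + rss(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical covariate values and relative risks for six areas.
deprivation = [1, 2, 3, 7, 8, 9]
relative_risk = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
threshold, split_rss = best_split(deprivation, relative_risk)
```

Applying the same search repeatedly inside each resulting node gives a total RSS that cannot increase as the number of splits grows, which is the kind of curve the caption refers to.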

Figure 3. Validation MSE, calculated following (4.10) over the LSOAs in the validation set $\mathcal{S}^{va}$, of a random forest model trained using the relative risk of England males aged 70–79, for different numbers of trees B (left plot, with m set to 4) and different numbers of variables considered per split m (right plot, with B set to 2500). The 12 predictive variables outlined in Table 1 are used.

Table 4. Settings of the final random forest model we use for creating the mortality index for England males.

Table 5. Test set MSE of the proposed random forest model fitted to three randomly chosen training sets (rounds) for data from different age groups. The applied hyperparameters are listed in Table 4.

Figure 4. Estimated relative risk over 16,422 LSOAs in $\mathcal{S}^{te}$ by the random forest model trained using LSOAs in $\mathcal{S}^{tr}$ and relative risks of the two year groups, $R^{0,odd}$ and $R^{0,even}$. x-axis: model trained with odd years; y-axis: model trained with even years. Left to right: relative risk of age 60–69, 70–79 and 80–89.

Figure 5. Percentiles of LIFE index values $R_i$ compared between indices estimated from different age groups over all 32,844 LSOAs. Left: 60–69 versus 70–79; Middle: 60–69 versus 80–89; Right: 70–79 versus 80–89.

Figure 6. Cumulative distribution function of the LIFE scores for all 32,844 LSOAs fitted to mortality data for different age groups.

Figure 7. Scatterplot of LIFE index versus IMD. The LIFE index is based on an estimated relative mortality risk fitted to mortality data for ages 40–49 (top left), 60–69 (top right) and 80–89 (bottom). Colours indicate the urban–rural class of an LSOA: conurbations (black), cities/towns (red), villages (green), rural areas (dark blue) and London (light blue).

Table 6. Spearman’s rank correlation of IMD and LIFE index values. The LIFE index has been fitted to the mortality experience in different age groups.
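For readers unfamiliar with the statistic, Spearman's rank correlation can be computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a minimal pure-Python version. The score lists are made-up illustrative numbers, not values from Table 6, and the function names are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared_rank
        pos = end + 1
    return ranks


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    n = len(ranks_a)
    mean_a, mean_b = sum(ranks_a) / n, sum(ranks_b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(ranks_a, ranks_b))
    var_a = sum((p - mean_a) ** 2 for p in ranks_a)
    var_b = sum((q - mean_b) ** 2 for q in ranks_b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical scores for four areas under two different indices.
index_one = [12.0, 25.5, 31.0, 48.2]
index_two = [0.91, 0.97, 1.05, 1.20]
rho = spearman(index_one, index_two)
```

Because only the ordering of areas enters the calculation, the statistic is unaffected by the different scales on which two indices are measured.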

Table 7. Distribution of LSOAs across urban–rural classes for different subpopulations. The numbers in brackets refer to the proportion (in %) of all LSOAs in a group that fall within an urban–rural class. The subpopulation groups are defined in (7.1). See Table 2 for the definition of urban–rural classes.

Figure 8. Estimated relative risk as a function of income deprivation at old age (first row), employment deprivation (second row) and the proportion of the population born in the UK (third row). In each case, other covariates (except the urban–rural classification) are fixed to their median across all LSOAs. The index is fitted to mortality data for ages 60–69 (left column) and ages 80–89 (right column), and urban–rural classes are colour coded.

Figure 9. LIFE index scores as a function of income deprivation at old age (top left), employment deprivation (top right) and the proportion of the population born in the UK (bottom). In contrast to Figure 8, all covariates other than care home populations, $x_{11}$ and $x_{12}$, have been left at their observed values. The index is fitted to mortality data for ages 60–69, and urban–rural classes are colour coded.

Figure 10. Heatmaps of LIFE index values as a function of income deprivation at ages 65+ and employment deprivation. All other variables are fixed to the median. The left panel shows LSOAs in urban–rural class 1 (7921 LSOAs), the middle plot shows results for 2542 LSOAs in class 4 (isolated dwellings) and the right plot is for the 4810 LSOAs in London. The LIFE index is fitted to mortality data for ages 60–69.

Figure 11. Heatmaps of observed relative risk values as a function of income deprivation at ages 65+ and employment deprivation. All other variables are unchanged. The LIFE index is fitted to mortality data for ages 60–69. In the left plot, only LSOAs with $g(i) = 1$ (highest relative risk) are shown, and for the right plot, only LSOAs with $g(i)=10$ (lowest) are used.

Figure 12. ASMR by deprivation decile based on IMD scores (left) and LIFE scores (right) (log scale). The LIFE index and the ASMRs have been calculated for age groups 60–69 and 80–89. Similar figures for other age groups can be found in the supplementary material published online.

Table 8. Distribution of LSOAs across regions for different subpopulation classes. The LIFE index has been fitted to ages 60–69.

Figure 13. ASMRs by region for mortality data for ages 60–69 and 80–89 (log scale). Similar figures for other age groups can be found in the supplementary material published online.

Figure 14. ADSMRs on the basis of IMD deciles (left) and LIFE deciles (right) where mortality data for different age groups have been used (log scale). Similar figures for other age groups can be found in the supplementary material published online.

Supplementary material: PDF

Wen et al. supplementary material
