Robust mortality forecasting in the presence of outliers

Published online by Cambridge University Press: 04 December 2024

Stephen J. Richards*
Affiliation:
Longevitas Ltd., Edinburgh EH6 3AJ, UK

Abstract

Stochastic mortality models are important for a variety of actuarial tasks, from best-estimate forecasting to assessment of risk capital requirements. However, the mortality shock associated with the Covid-19 pandemic of 2020 distorts forecasts by (i) biasing parameter estimates, (ii) biasing starting points, and (iii) inflating variance. Stochastic mortality models therefore require outlier-robust methods for forecasting. Objective methods are required, as outliers are not always obvious on visual inspection. In this paper we look at the robustification of three broad classes of forecast: univariate time indices (such as in the Lee-Carter and APC models); multivariate time indices (such as in the Cairns-Blake-Dowd and newer Tang-Li-Tickle model families); and penalty projections (such as with the 2D P-spline model). In each case we identify outliers using quantitative methods, then co-estimate outlier effects along with other parameters. Doing so removes the bias and distortion to the forecast caused by a mortality shock, while providing a robust starting point for projections. Illustrations are given for various models in common use.

Type
Sessional Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons-Attribution-NoDerivatives licence (https://creativecommons.org/licenses/by-nd/4.0/), which permits re-use, distribution, and reproduction in any medium and for any purpose, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained in order to create a derivative work.
Copyright
© The Institute and Faculty of Actuaries, 2024. Published by Cambridge University Press on behalf of the Institute and Faculty of Actuaries
Figure 1. Marginal death counts in England and Wales, ages 50–105, 1971–2020. Source: HMD data

Figure 2. Contribution to (negative) log-likelihood for normal variable and robustified contribution using the Huber $\rho $-function of Equation (2) with $k = 1$
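The comparison in Figure 2 can be sketched numerically. Equation (2) is not reproduced in this section, so the sketch below assumes the standard Huber form, $\rho(x) = x^2/2$ for $|x| \le k$ and $\rho(x) = k|x| - k^2/2$ otherwise; the paper's own scaling may differ. The function names are illustrative, and Python is used for convenience only.

```python
def normal_contribution(x):
    """Negative log-likelihood contribution of a standardised normal
    residual x, with the additive constant dropped."""
    return 0.5 * x * x


def huber_rho(x, k=1.0):
    """Huber rho-function (assumed standard form): quadratic for
    |x| <= k, linear beyond, continuous at |x| = k."""
    a = abs(x)
    if a <= k:
        return 0.5 * x * x
    return k * a - 0.5 * k * k


if __name__ == "__main__":
    # Tabulate both contributions; they agree up to |x| = k and
    # diverge for larger residuals.
    for x in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"x={x:4.1f}  normal={normal_contribution(x):6.3f}  "
              f"huber={huber_rho(x):6.3f}")
```

Under this assumed form, a large residual contributes linearly rather than quadratically, which is what limits the influence of an outlier on the parameter estimates.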

Figure 3. Types of outliers in moving-average process ${Y_t} = {\varepsilon _t} - 0.8{\varepsilon _{t - 1}}$. Simulation using the R code in Appendix A
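The R code in Appendix A is not reproduced in this section. As a rough Python sketch of two of the outlier types in the Chen & Liu (1993) taxonomy, the following simulates the same moving-average process with either an additive outlier (which perturbs a single observation) or an innovational outlier (which perturbs a single innovation and so propagates through the moving-average term). The function name, arguments, seed and shock size are assumptions made for illustration.

```python
import random

THETA = 0.8  # MA(1) coefficient in Y_t = eps_t - THETA * eps_{t-1}


def simulate_ma1(n, seed=1, ao=None, io=None):
    """Simulate n values of the MA(1) process.

    ao, io: optional (time index, size) tuples for an additive or an
    innovational outlier respectively (hypothetical interface).
    """
    rng = random.Random(seed)
    # eps[0] is the pre-sample innovation; eps[t + 1] belongs to time t.
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    if io is not None:
        t, size = io
        eps[t + 1] += size  # shock enters the innovation at time t
    y = [eps[t + 1] - THETA * eps[t] for t in range(n)]
    if ao is not None:
        t, size = ao
        y[t] += size  # shock is added to the observation only
    return y


if __name__ == "__main__":
    clean = simulate_ma1(50, seed=7)
    additive = simulate_ma1(50, seed=7, ao=(20, 5.0))
    innovational = simulate_ma1(50, seed=7, io=(20, 5.0))
    for name, series in (("AO", additive), ("IO", innovational)):
        effect = {t: round(s - c, 6)
                  for t, (s, c) in enumerate(zip(series, clean)) if s != c}
        print(name, effect)
```

With the same seed, the additive outlier changes one value only, whereas the innovational outlier changes the value at the shock time and, with opposite sign and scaled by 0.8, the following value.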

Figure 4. Observed mortality rates at age 70 with Lee-Carter forecasts using ordinary regression ARIMA model (Appendix B.2) and robustified regression ARIMA model with critical value 3.5 (Appendix B.3). Data for males and females in England and Wales aged 50–105, over the period 1971–2020

Figure 5. Estimates for period effects under the minimally-constrained M9 model. These parameter estimates are not consistent with a multivariate random walk with drift. Contrast with the over-constrained equivalent in Figure 6. The plots are based on HMD data for females aged 50–105, 1971–2020

Figure 6. Estimates for forecasting parameters under the over-constrained model M9. Contrast with the minimally-constrained equivalent in Figure 5. The plots are based on HMD data for females aged 50–105, 1971–2020

Table 1. Selected results from fitting a Chen & Liu (1993) regression ARIMA model to the $\left\{ {{{\hat \kappa }_y}} \right\}$ values in a Lee-Carter model. Mortality data for females in England and Wales, ages 50–105, years 1971–2020

Figure 7. Scatterplot of ${\rm{\Delta }}\hat \kappa $ for M9. The plot is based on HMD data for females aged 50–105, 1971–2020

Table 2. Selected members from the CBD model family for ${\rm{log}}\ {m_{x,y}}$. The layout of the formulae emphasises the commonality and differences between adjacent models. $\overline x = \mathop \sum \nolimits_{x = {x_{{\rm{min}}}}}^{{x_{{\rm{max}}}}} x/{n_x} = 77.5$ is the unweighted mean age, while ${\hat \sigma ^2} = \mathop \sum \nolimits_{x = {x_{{\rm{min}}}}}^{{x_{{\rm{max}}}}} {(x - \bar x)^2}/{n_x} = 252.25$ is a normalising constant based on the average squared deviation around $\bar x$

Figure 8. Mahalanobis distance for ${\rm{\Delta }}\hat \kappa $ for M9. The plot is based on HMD data for females aged 50–105, 1971–2020
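As a minimal sketch of the kind of calculation behind such a plot, the following computes Mahalanobis distances for the first differences of a multivariate index. It is restricted to two components for brevity, uses the ordinary sample mean and covariance (which may differ from the estimator used in the paper), and runs on synthetic data; the function names, drift, volatility and shock size are all assumptions.

```python
import math
import random


def first_differences(series):
    """Differences between consecutive points of a multivariate series."""
    return [tuple(b - a for a, b in zip(prev, curr))
            for prev, curr in zip(series, series[1:])]


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2D point from the sample mean,
    using the sample covariance matrix (denominator n - 1)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance
    distances = []
    for px, py in points:
        dx, dy = px - mx, py - my
        # Quadratic form with the explicit inverse of the 2x2 matrix.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances


# Synthetic bivariate random walk with drift (illustrative values only).
rng = random.Random(0)
kappa = [(0.0, 0.0)]
for _ in range(19):
    k0, k1 = kappa[-1]
    kappa.append((k0 - 0.02 + rng.gauss(0.0, 0.01),
                  k1 + 0.001 + rng.gauss(0.0, 0.01)))

# Add a shock to the final year; only the last difference is affected.
k0, k1 = kappa[-1]
kappa[-1] = (k0 + 0.15, k1 + 0.05)

distances = mahalanobis_2d(first_differences(kappa))
worst = max(range(len(distances)), key=distances.__getitem__)
print(f"largest distance at difference index {worst}")
```

Because the shock here falls in the final year, it appears in a single difference; a shock in an interior year would affect two consecutive differences, with opposite signs.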

Table 3. TLT model family for ${\rm{log}}\ {m_{x,y}}$

Figure 9. Observed mortality rates at age 70 with forecasts using the ordinary M9 model (Dowd et al., 2020, Section 2) and robustified M9 model. The data are for males and females in England and Wales aged 50–105, over the period 1971–2020

Figure 10. Unscaled period shocks under the model in Equation (17). The data are for females in England and Wales aged 50–105, over the period 1971–2020

Figure 11. Inverse relative scaling factors (${\lambda _s}/{\lambda _i}$) for 2D period-shock model of Kirkby & Currie (2010) with $\alpha = 3.24$. The data are for females in England and Wales aged 50–105, over the period 1971–2020

Figure 12. Period shocks under optimised scaling using Equation (18). The data are for females in England and Wales aged 50–105, over the period 1971–2020

Figure 13. Observed mortality rates at age 70 with forecasts using ordinary 2DAP model (Currie et al., 2004) and 2DAP period-shock model (Kirkby & Currie, 2010). Data for males and females in England and Wales aged 50–105, over the period 1971–2020

Figure B.1. R command and output for fitting ARMA(1, 2) model with a mean using the data in Table B.1

Figure B.2. ${\hat \kappa _{1971 + t}}$ values from Table B.1

Table B.1. ${\hat \kappa _{1971 + t}},t = 0,1, \ldots, 48$ from Lee-Carter model with smoothed ${\hat \alpha _x}$ and ${\hat \beta _x}$ parameters. ONS data for males aged 50–105 in England and Wales, 1971–2019

Figure B.3. R command and output for fitting a linear regression with ARIMA(1, 1, 2) model for trend deviations using data in Table B.1

Figure B.4. ${\hat \kappa _{1971 + t}}$ values from Table B.2, showing the outlier in 2020

Table B.2. ${\hat \kappa _{1971 + t}},t = 0,1, \ldots, 49$ from the Lee-Carter model with smoothed ${\hat \alpha _x}$ and ${\hat \beta _x}$ parameters. ONS data for males aged 50–105 in England and Wales, 1971–2020

Figure B.5. R command and output for fitting linear regression with ARIMA(1, 1, 2) model for trend deviations using data in Table B.2

Figure B.6. R commands and output for fitting outlier-robustified linear regression with ARIMA(1, 1, 2) model for trend deviations using data in Table B.2

Table C.1. False-positive rates for the outliers.hdts() function in the SLBDD package, which has a hard-coded p-value of 5%, based on 10,000 simulated random walks with drift as specified in Equations (25) and (26). For mortality work with CBD and TLT models it is therefore better to robustify the differenced multivariate series