
Mind the gap – safely incorporating deep learning models into the actuarial toolkit

Published online by Cambridge University Press: 04 November 2022

Ronald Richman*
Affiliation:
Old Mutual Insure, University of the Witwatersrand, Johannesburg, South Africa

Abstract

Deep neural network models have substantial advantages over traditional and machine learning methods that make this class of models particularly promising for adoption by actuaries. Nonetheless, several important aspects of these models have not yet been studied in detail in the actuarial literature: the effect of hyperparameter choice on the accuracy and stability of network predictions, methods for producing uncertainty estimates and the design of deep learning models for explainability. To allow actuaries to incorporate deep learning safely into their toolkits, we review these areas in the context of a deep neural network for forecasting mortality rates.

Information

Type
Sessional Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© Institute and Faculty of Actuaries 2022

Table 1. Claims triangle from Taylor and Ashe (1983)


Listing 1: Code to fit a GLM to the triangle in Table 1.


Listing 2: Keras code to fit a GLM to the triangle in Table 1.
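The listing itself appears as an image in the published version. The underlying idea is that a cross-classified, log-link Poisson GLM for a run-off triangle can be fitted by gradient descent on its deviance, which is exactly what the Keras optimiser does. The following framework-free sketch illustrates this on a made-up two-by-two triangle (not the Taylor and Ashe data); for a saturated model of this kind, the GLM forecast reproduces the chain-ladder estimate.

```python
import math

# Illustrative incremental claims, indexed by (accident year i,
# development year j); cell (1, 1) is unobserved and must be forecast.
# These numbers are NOT the Taylor & Ashe (1983) triangle in Table 1.
observed = {(0, 0): 100.0, (0, 1): 50.0, (1, 0): 110.0}

# Linear predictor: log mu_ij = c + a1*[i == 1] + b1*[j == 1]
# (corner constraints a_0 = b_0 = 0, as in standard GLM dummy coding).
c, a1, b1 = math.log(100.0), 0.0, 0.0

lr = 1e-3
for _ in range(30000):
    grad_c = grad_a = grad_b = 0.0
    for (i, j), y in observed.items():
        mu = math.exp(c + a1 * (i == 1) + b1 * (j == 1))
        resid = mu - y  # gradient of the Poisson deviance w.r.t. log mu
        grad_c += resid
        grad_a += resid * (i == 1)
        grad_b += resid * (j == 1)
    c -= lr * grad_c
    a1 -= lr * grad_a
    b1 -= lr * grad_b

# Forecast for the unobserved cell. For this saturated cross-classified
# Poisson model the fit is exact, so the forecast equals the
# chain-ladder estimate 110 * 50 / 100 = 55.
forecast = math.exp(c + a1 + b1)
```

In the Keras version of the same model, the dummy coding is replaced by embedding layers and the optimiser replaces the hand-written gradient step, but the fitted quantities are the same.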


Listing 3: Keras code to fit a GLM to the triangle in Table 1 using 1-dimensional embedding layers.


Figure 1. Comparison of the convergence of the GLM models fit using the Keras package in Listings 2 and 3.


Listing 4: Keras code to fit a neural network to the triangle in Table 1 using 2-dimensional embedding layers and a single intermediate layer.


Figure 2. Comparison of the fits of the GLM and neural network models to the data shown in Table 1.


Listing 5: Keras code to fit a neural network to the national and sub-national mortality data.


Table 2. Test set mean squared error (multiplied by ${10^4}$) using the LC and neural network models, national and sub-national populations


Figure 3. Logarithm of out of sample MSE for each region for which a forecast has been made. Lower values (i.e. more negative) indicate better performance. Plots are colour-coded for national populations, and for the country in which sub-national forecasts have been made.


Table 3. Average and Standard Deviation of MSE over ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the width of the intermediate layers being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 4. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the width of the intermediate layers being varied as noted in the description column. Results sorted by MSE
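Tables of this kind report the MSE of the *averaged* prediction across ten runs, rather than the average of the ten individual MSEs. By Jensen's inequality the former can never exceed the latter, and when run-to-run errors are roughly independent it is much smaller. A minimal sketch with synthetic numbers (standing in for the ten trained networks, which are not reproduced here):

```python
import random

random.seed(0)

# A stand-in "true" signal, plus ten noisy predictions of it, where the
# noise represents run-to-run variability of network training.
truth = [0.1 * t for t in range(50)]
runs = [[y + random.gauss(0.0, 0.5) for y in truth] for _ in range(10)]

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Average of the individual MSEs versus MSE of the averaged prediction.
individual = [mse(r, truth) for r in runs]
mean_individual_mse = sum(individual) / len(individual)

averaged = [sum(col) / len(col) for col in zip(*runs)]
mse_of_average = mse(averaged, truth)

# mse_of_average <= mean_individual_mse always holds; with independent
# noise across runs it is close to mean_individual_mse / 10.
```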


Table 5. Average and Standard Deviation of MSE of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the dimension of the embedding layers being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 6. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the dimension of the embedding layers being varied as noted in the description column. Results sorted by MSE


Table 7. Average and Standard Deviation of MSE of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the activation function of the intermediate layers being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 8. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the activation function of the intermediate layers being varied as noted in the description column. Results sorted by MSE


Table 9. Average and Standard Deviation of MSE of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the application of batch normalization being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 10. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the application of batch normalization being varied as noted in the description column. Results sorted by MSE


Table 11. Average and Standard Deviation of MSE of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the depth of the network being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 12. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the depth of the network being varied as noted in the description column. Results sorted by MSE


Table 13. Average and Standard Deviation of MSE of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the application of dropout being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 14. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the application of dropout being varied as noted in the description column. Results sorted by MSE


Table 15. Average and Standard Deviation of MSE of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$), with the optimization parameters being varied as noted in the description column. Results sorted by Standard Deviation of MSE


Table 16. MSE of the average prediction of ten runs of the mortality forecasting model on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the optimization parameters being varied as noted in the description column. Results sorted by MSE


Figure 4. Comparison of the performance of the averaged neural network predictions on the test set, measured by MSE, compared to the standard deviation of the MSE produced by the individual predictions over different training runs.


Listing 6: Keras code for the pinball loss function to estimate the 0.95 quantile.
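The listing is reproduced as an image in the published version. The pinball (quantile) loss it implements can be sketched without any framework: under-prediction is penalised by the target quantile level $\tau$ and over-prediction by $1-\tau$, so minimising the expected loss yields the $\tau$-quantile. In Keras the same expression would be written with backend tensor operations inside a custom loss function.

```python
def pinball_loss(y_true, y_pred, tau=0.95):
    """Average pinball loss; minimised in expectation by the tau-quantile."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        e = y - q
        # Under-prediction (e > 0) costs tau per unit, over-prediction
        # costs (1 - tau), so tau = 0.95 pushes the fit towards high
        # quantiles of the observed distribution.
        total += max(tau * e, (tau - 1.0) * e)
    return total / len(y_true)
```

For example, with tau = 0.95 an under-prediction of one unit costs 0.95, while an over-prediction of one unit costs only 0.05.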


Listing 7: Keras code for the Deep Ensemble method.
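In the Deep Ensemble method (Lakshminarayanan et al., 2017), each network outputs a Gaussian mean and variance, and the ensemble is treated as a uniformly weighted mixture of those Gaussians. The aggregation step can be sketched as follows; the input numbers are illustrative and would in practice come from the trained networks in the listing.

```python
def ensemble_gaussian(means, variances):
    """Mean and variance of a uniformly weighted mixture of Gaussians."""
    m = len(means)
    mix_mean = sum(means) / m
    # Mixture variance via E[X^2] - E[X]^2: the average of the
    # per-network second moments, less the squared mixture mean. The
    # spread of the per-network means adds to the predicted uncertainty.
    mix_var = sum(v + mu * mu for mu, v in zip(means, variances)) / m
    mix_var -= mix_mean ** 2
    return mix_mean, mix_var

# Two networks that disagree (means 0 and 2, each with unit variance)
# produce a wider ensemble variance than either network alone.
mix_mean, mix_var = ensemble_gaussian([0.0, 2.0], [1.0, 1.0])
```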


Listing 8: Keras code to modify the mortality forecasting model to estimate uncertainty.


Listing 9: Keras code to fit a neural network to the national and sub-national mortality data that estimates uncertainty using the pinball loss and “branches”.


Table 17. Empirical coverage of the 2.5%–97.5% confidence interval derived using ten runs of the uncertainty prediction models on the test set. Results sorted by deviation from the targeted 5% coverage
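The empirical coverage reported in this table is the share of observed test-set mortality rates that fall inside the predicted 2.5%–97.5% band; a well-calibrated band should cover about 95% of outcomes. A minimal sketch of the computation, with illustrative inputs in place of the model's predicted bounds:

```python
def empirical_coverage(actuals, lowers, uppers):
    """Fraction of observed values falling inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(actuals, lowers, uppers)
                 if lo <= y <= hi)
    return inside / len(actuals)

# Illustrative: three of four observations fall inside a constant band.
cov = empirical_coverage([1.0, 2.0, 10.0, 3.0], [0.0] * 4, [5.0] * 4)
```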


Figure 5. Confidence bands for projected UK mortality rates using the pinball loss model with ReLU branches, 2016, males and females.


Table 18. MSE of the average prediction of ten runs of the uncertainty forecasting models on the test set (multiplied by ${10^4}$) and number of times that the LC model is beaten, with the particular model choice being varied as noted in the description column. Results sorted by number of times each forecasting model beat the LC model


Figure 6. Confidence bands for projected mortality rates in the USA and Iceland using the pinball loss model with ReLU branches, 2016, males and females.


Figure 7. Confidence bands for projected mortality rates in Australia and its sub-national territories using the pinball loss model with ReLU branches, 2014, males and females.


Figure 8. Confidence bands for projected mortality rates in the UK using the pinball loss model with ReLU branches, 2000 and 2016, females.


Figure 9. Size of the confidence bands for projected mortality rates in the UK using the pinball loss model with ReLU branches, 2000 and 2016, females. Size was estimated as described in the text.


Listing 10: Keras code to fit a CANN model to the national and sub-national mortality data.


Table 19. Test set mean squared error (multiplied by ${10^4}$) using the LC and CANN models, national and sub-national populations


Figure 10. Magnitude of weights in last layer of CANN model for HMD data. Left pane shows weights for the linear and non-linear parts of the network and right pane shows only weights for the non-linear component.


Figure 11. Effect sizes for each level of the Year (top lhs), Gender (top middle), Age (top rhs) and Country variables (bottom), derived using the CANN model.


Figure 12. Comparison of linear and non-linear effects from the CANN model, coloured according to age. Horizontal lines indicate a non-linear effect of 0.05 and −0.05, respectively. 5% sample of the test set predictions.


Figure 13. Comparison of linear and non-linear effects from the CANN model, coloured according to age. Horizontal lines indicate a non-linear effect of 0.05 and −0.05, respectively. 5% sample of the test set predictions.


Listing 11: Keras code to fit the linear component of a CAXNN model to the average predictions of the CANN networks.


Table 20. Test set mean squared error (multiplied by ${10^4}$) using the LC and linear component of the CAXNN model, national and sub-national populations


Table 21. Test set mean squared error (multiplied by ${10^4}$) using the LC and full CAXNN model, national and sub-national populations


Listing 12: Keras code to fit the full CAXNN model to the average predictions of the CANN networks: linear weights transferred from previously trained model.


Listing 13: Keras code to define the network used in the XNN.


Listing 14: Keras code to fit the full CAXNN model to the average predictions of the CANN networks.


Figure 14. Effect sizes for each level of the Year (top lhs), Gender (top middle), Age (top rhs) and Country variables (bottom), derived using the CAXNN model.


Figure 15. Effect sizes from the CAXNN model for a 15% sample of the training and testing data, all effects included.


Figure 16. Interaction effect between Year, Age and Gender in the CAXNN model.


Figure 17. Interaction effect between Year and Country in the CAXNN model.


Figure 18. Comparison of the CAXNN approximation to the average prediction from the CANN models for Females in the United States and for Males in the Tokyo region of Japan, in 2010.


Figure 19. Breakdown of the contributions to the mortality predictions for Males and Females aged 50 in the year 2015, in the United Kingdom, and Georgia, United States.


Table 22. Summary of abbreviations and definitions used in the text


Table 23. Summary of notation and definitions used in the text