
Privacy preserving neural network predictive modelling in insurance using horizontal federated learning

Published online by Cambridge University Press:  31 March 2026

Dylan Liew*
Affiliation:
Institute and Faculty of Actuaries, London, UK
Scott Hand
Affiliation:
Institute and Faculty of Actuaries, London, UK
Haoyuan Harry Loh
Affiliation:
Institute and Faculty of Actuaries, London, UK
Yung-Yu Michelle Chen
Affiliation:
Institute and Faculty of Actuaries, London, UK
*Corresponding author: Dylan Liew; Email: dylan.liew@gmail.com

Abstract

Federated Learning is a novel method of training machine learning models, pioneered by Google and intended for use on smartphones. In contrast to traditional machine learning, where data is centralised and brought to the model, Federated Learning brings the algorithm to the data, so that privacy is preserved. This paper demonstrates how insurance companies in a market could use this technique to collectively build a claims frequency neural network prediction model that draws on all of their customer data, without sharing any sensitive information with each other. A simulated car insurance market with 10 players was created using the freMTPL2freq dataset. If all insurers were permitted to share their confidential data with each other, they could collectively build a model that achieved 5.57% exposure weighted Poisson Deviance Explained (% PDE) on an unseen sample. If they were not permitted to share their customer data, none of them could achieve more than 3.82% exposure weighted PDE on the same unseen sample. With Federated Learning, they can keep all of their customer data private and still construct a model that reaches 5.34% exposure weighted PDE on the same unseen sample, close to the accuracy achieved by centralising all the data for model training.
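The headline metric in the abstract can be illustrated with a short sketch. It assumes that % PDE is one minus the ratio of the model's exposure weighted Poisson deviance to that of a null model predicting a single constant claim frequency; the exact definition used in the paper is not reproduced here, and the function names and toy data below are hypothetical.

```python
import numpy as np

def poisson_deviance(claims, exposure, pred_freq):
    """Exposure weighted mean Poisson deviance of predicted claim frequencies."""
    mu = exposure * pred_freq  # expected claim counts
    # The y * log(y / mu) term is taken as 0 when y == 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = np.where(claims > 0, claims * np.log(claims / mu), 0.0)
    return 2.0 * np.sum(log_term - (claims - mu)) / np.sum(exposure)

def deviance_explained(claims, exposure, pred_freq):
    """Share of null-model deviance removed by the model (assumed PDE definition)."""
    # Assumption: the null model is the portfolio-wide frequency of this sample.
    null_freq = np.full_like(exposure, claims.sum() / exposure.sum())
    d_model = poisson_deviance(claims, exposure, pred_freq)
    d_null = poisson_deviance(claims, exposure, null_freq)
    return 1.0 - d_model / d_null

# Toy data, purely illustrative (not taken from freMTPL2freq).
claims = np.array([0.0, 1.0, 0.0, 2.0])
exposure = np.array([1.0, 1.0, 0.5, 1.0])
null_pred = np.full(4, claims.sum() / exposure.sum())
good_pred = np.array([0.1, 1.0, 0.1, 2.0])

pde_null = deviance_explained(claims, exposure, null_pred)
pde_good = deviance_explained(claims, exposure, good_pred)
```

Under this definition a model that only predicts the average frequency scores 0, and any score is read relative to that baseline, which is why single digit percentages can still separate the three scenarios.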

Information

Type
Sessional Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Institute and Faculty of Actuaries, 2026. Published by Cambridge University Press on behalf of The Institute and Faculty of Actuaries

Figure 1. Traditional approach: collaborating parties centralise Training data when fitting a machine learning model.


Figure 2. Federated Model training: parties don’t centralise or move Training data when collaborating on an ML model.


Figure 3. How 3 parties might traditionally collate their data to build a machine learning model trained for 2 parameter update steps (such as epochs or boosting rounds).


Table 1. Model parameter for each insurer


Table 2. Particles and antiparticles for each insurer


Table 3. Sum of particles and antiparticles for each insurer


Table 4. Masked model parameters for each insurer


Figure 4. How 3 parties (for example, insurance companies) would build a federated machine learning model, with a central body such as a regulator, reinsurer or professional body aggregating encrypted model parameters. k denotes the training round number.
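The round structure described in this caption can be sketched in a few lines. This is a minimal illustration only: it assumes plain federated averaging over a Poisson log-link regression rather than the paper's neural network, it omits the encryption step, and the function names, exposure weighting of the average, learning rate, step counts and synthetic data are all hypothetical.

```python
import numpy as np

def local_update(beta, X, y, exposure, lr=0.1, steps=5):
    """A few gradient steps on one party's private data (Poisson log-link model)."""
    beta = beta.copy()
    for _ in range(steps):
        mu = exposure * np.exp(X @ beta)              # expected claim counts
        beta -= lr * (X.T @ (mu - y)) / exposure.sum()
    return beta

def federated_round(beta_global, parties):
    """Round k: each party trains locally, the central body averages the results."""
    local_betas = [local_update(beta_global, X, y, e) for X, y, e in parties]
    weights = [e.sum() for _, _, e in parties]        # assumed: weight by exposure
    return np.average(local_betas, axis=0, weights=weights)

def mean_poisson_loss(beta, parties):
    """Poisson negative log-likelihood (up to a constant) per unit of exposure."""
    total, exposure_sum = 0.0, 0.0
    for X, y, e in parties:
        mu = e * np.exp(X @ beta)
        total += np.sum(mu - y * np.log(mu))
        exposure_sum += e.sum()
    return total / exposure_sum

# Synthetic data for 3 parties; only parameters ever leave a party.
rng = np.random.default_rng(42)
true_beta = np.array([-2.0, 0.3])
parties = []
for _ in range(3):
    X = np.column_stack([np.ones(500), rng.normal(size=500)])
    e = rng.uniform(0.5, 1.0, size=500)
    y = rng.poisson(e * np.exp(X @ true_beta))
    parties.append((X, y, e))

beta = np.zeros(2)
loss_start = mean_poisson_loss(beta, parties)
for k in range(50):
    beta = federated_round(beta, parties)
loss_end = mean_poisson_loss(beta, parties)
```

The loss is computed over all parties here only to check the sketch; in a real deployment each party would evaluate on its own validation data.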


Figure 5. Example of how 3 insurers, labelled 0, 1, 2, could securely aggregate their private model parameters $\beta_0$, $\beta_1$, $\beta_2$. Only pairwise noise is exchanged between them, so no sensitive data leaves the insurers. The aggregating body receives only data with added noise, so it cannot infer anything about the parameters. However, upon aggregating the data, the noise cancels out, so the average can still be calculated without compromising the data.
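The cancellation described in this caption can be checked numerically. The sketch below borrows the "particle" and "antiparticle" wording of Tables 2 to 4 for each pair's noise and its negation; the Gaussian noise, its scale, the example parameter values and the function names are assumptions for illustration, and a production protocol would derive the masks cryptographically rather than from a shared random generator.

```python
import numpy as np

def pairwise_masks(n_parties, dim, rng):
    """masks[i, j] is the noise party i holds for its pairing with party j."""
    masks = np.zeros((n_parties, n_parties, dim))
    for i in range(n_parties):
        for j in range(i + 1, n_parties):
            noise = rng.normal(scale=100.0, size=dim)
            masks[i, j] = noise     # "particle" kept by party i
            masks[j, i] = -noise    # matching "antiparticle" kept by party j
    return masks

def mask_parameters(params, masks):
    """Each party adds the sum of its particles and antiparticles to its parameters."""
    return params + masks.sum(axis=1)

rng = np.random.default_rng(0)
# Hypothetical private parameters of insurers 0, 1, 2 (one row each).
betas = np.array([[0.10, -0.30],
                  [0.20,  0.10],
                  [0.60,  0.50]])

masks = pairwise_masks(n_parties=3, dim=2, rng=rng)
masked = mask_parameters(betas, masks)   # all the aggregating body ever sees
federated_mean = masked.mean(axis=0)     # pairwise noise cancels in the average
true_mean = betas.mean(axis=0)
```

Each masked row looks unrelated to the underlying parameters, yet the average of the masked rows agrees with the average of the private ones up to floating point error.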


Table 5. Description of data, fields, and preprocessing transformations used in experiment


Figure 6. Distribution of Number of Claims and Exposure.


Figure 7. Distribution of Vehicle Power and Age.


Figure 8. Distribution of Driver Age and Bonus Malus.


Figure 9. Distribution of Vehicle Gas and Vehicle Density.


Table 6. Neural Network Architecture used in all 3 Scenarios


Table 7. Hyperparameter Search Space Considered in all 3 Scenarios


Table 8. Top 5 hyperparameter sets for the Global Models


Table 9. Chosen hyperparameters of each insurer using just their own private, unique data


Table 10. Performance of each insurer’s model against the Test set using just their own private data


Figure 10. Box plot of each insurer’s performance on the test set using just their own private data. The Global Model performance is also shown here for reference and is not an outlier. We can see that none of the insurers acting individually can approach the performance of the Global Model where data was freely shared between them.


Table 11. Top 5 Results of the novel hyperparameter tuning method proposed in Section 7


Table 12. FL learning rate and Local Validation Loss


Figure 11. Comparison of the performance of the 3 modelling approaches on the Test set. As the boxplot shows, using only the data available to each insurer leads to very poor performance compared with either completely sharing the data with each other or using FL. We can see that FL achieves nearly the same model performance as if the insurers were to completely share their sensitive data.


Figure 12. Double lift chart comparing the performance of the Federated Model against the Global Model on the Test dataset, with the Global Model showing slightly better performance than the Federated Model. The “X” shape formed by the orange and green lines shows that the performance of the 2 models is fairly even.


Figure 13. Double lift chart comparing the performance of the Federated Model against agent 5 on the Test dataset. The Federated Model shows significantly higher prediction accuracy than the model built using just agent 5’s own data. Unlike the Global Model versus Federated Model double lift chart, the green and orange lines do not show a symmetrical “X” shape. The green line, showing the Federated Model’s predictions, lies significantly closer to the actual claims shown by the blue line.


Figure 14. Actual versus expected claims by Area for the Federated Model, the Global Model, and the best performing individual insurer (agent 5). Whilst the Federated and Global Models predict the actual claims fairly well (being close to the blue line), agent 5’s model, built using just their own data, produces predictions that are too high.


Figure 15. Gini index by model, demonstrating each model’s ability to rank policyholders correctly in terms of their relative risk to one another. As with the Poisson deviance, the Federated Model achieves similar performance to the Global Model, but the Partial Models do not perform as well on this metric either. The “Oracle” shows the theoretical perfect model that would rank policyholders in perfect order without any error and is included for benchmarking.


Table 13. FL learning rate and Average Local Validation Loss


Table 14. FL Local Learning Rate and Local Validation Loss


Table 15. FL Local Learning Rate and Average Local Validation Loss


Algorithm 1. Formalised expression of proposed federated hyperparameter tuning protocol


Figure 16. Average validation performance of the Federated Model for different numbers of training rounds. We can see that performance begins to decrease after 300 rounds, so we assume the insurers would select this as the optimal amount of training.


Figure 17. Comparison of Federated Model training times. (a) Training times of each model rebased to the Global Model training time. (b) Exposure Weighted Validation % PDE of the Global and Federated Models over different numbers of parameter update steps. Whilst the observed wall time for FL appears to be longer in (a), this may be due to Federated Learning requiring more update steps to reach the optimal set of parameters, as shown in (b).