Neural networks for quantile claim amount estimation: a quantile regression approach

Published online by Cambridge University Press:  17 May 2023

Alessandro G. Laporta*
Affiliation:
Department of Statistics, Sapienza University of Rome, Roma, Italy
Susanna Levantesi
Affiliation:
Department of Statistics, Sapienza University of Rome, Roma, Italy
Lea Petrella
Affiliation:
MEMOTEF Department, Sapienza University of Rome, Roma, Italy
*Corresponding author: Alessandro G. Laporta; Email: alelaporta93@gmail.com

Abstract

In this paper, we discuss the estimation of conditional quantiles of aggregate claim amounts for non-life insurance, embedding the problem in a quantile regression framework using a neural network approach. As a first step, we consider the quantile regression neural network (QRNN) procedure to compute quantiles in the insurance ratemaking framework. As a second step, we propose a new quantile regression combined actuarial neural network (Quantile-CANN), which combines the traditional quantile regression approach with a QRNN. In both cases, we adopt a two-part model scheme in which we fit a logistic regression to estimate the probability of positive claims and the QRNN or Quantile-CANN model for the positive outcomes. Through a case study based on a health insurance dataset, we highlight the overall better performance of the proposed models with respect to classical quantile regression. We then use the estimated quantiles to calculate a loaded premium following the quantile premium principle, showing that the proposed models provide better risk differentiation.
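As a rough illustration of the two ingredients named in the abstract, the sketch below shows the quantile (pinball) loss that quantile regression minimises, and the level adjustment implied by a two-part scheme. It assumes the usual decomposition $F_S(s) = (1-p) + p\,F_{S\mid S>0}(s)$, with $p$ the probability of a positive claim, so that the unconditional quantile at level $\tau$ equals the quantile of the positive claims at level $\tau^{\star} = (\tau - (1-p))/p$. The function names `pinball_loss` and `positive_part_level` are hypothetical and are not taken from the paper.

```python
import numpy as np


def pinball_loss(y, q, tau):
    """Average quantile (pinball) loss of predictions q for targets y at level tau."""
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    u = y - q
    # tau * u when the target lies above the prediction, (tau - 1) * u otherwise
    return float(np.mean(np.maximum(tau * u, (tau - 1.0) * u)))


def positive_part_level(tau, p_positive):
    """Level tau_star for the positive-claim quantile matching unconditional level tau.

    Assumes F_S(s) = (1 - p) + p * F_{S|S>0}(s). Returns None when
    tau <= 1 - p, in which case the unconditional quantile is zero.
    """
    p_zero = 1.0 - p_positive
    if tau <= p_zero:
        return None
    return (tau - p_zero) / p_positive
```

For instance, under this assumption a policyholder with a 50% probability of a positive claim and a target level of 0.95 would require the 0.9 quantile of the positive claims.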

Information

Type
Original Research Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Institute and Faculty of Actuaries
Figure 1 QRNN architecture.

Figure 2 Quantile-CANN architecture.

Table 1. Summary of the variables available in the dataset.

Figure 3 Histograms for the covariates and the total claim severity.

Table 2. Frequency tables for the covariates and the total claim severity.

Table 3. In-sample and out-of-sample quantile loss function at the different $\tau^{\star}$ levels.

Figure 4 Variable importance for QRNN (left), Quantile-CANN (middle) and QR (right), trained at level $\tau^{\star}=0.8$. The reported results are obtained on the first fold of the 5-fold cross-validation.

Figure 5 Partial dependence plots for the age of the insured (AG), years of permanence (PE), gender (GE), and region of the insured (RE). The models are trained on the first fold of the 5-fold cross-validation and fitted at level $\tau^{\star}=0.8$. The curves are obtained by averaging ICE profiles plotted using 200 randomly selected observations in the training set.

Figure 6 ICE profiles for the age of the insured (AG), years of permanence (PE), gender (GE), and region of the insured (RE). The models are trained on the first fold of the 5-fold cross-validation and fitted at level $\tau^{\star}=0.8$. The curves are plotted using 200 randomly selected observations in the training set.
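
The relation between ICE profiles and partial dependence curves described in the captions of Figures 5 and 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `predict` stands for any fitted model's prediction function, and `ice_profiles` and `partial_dependence` are hypothetical helper names.

```python
import numpy as np


def ice_profiles(predict, X, feature, grid):
    """Return an array of shape (n_rows, len(grid)) with one ICE curve per row of X."""
    X = np.asarray(X, dtype=float)
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # set the feature to the grid value for every row
        curves[:, j] = predict(X_mod)
    return curves


def partial_dependence(predict, X, feature, grid):
    """Partial dependence curve: the pointwise average of the ICE curves."""
    return ice_profiles(predict, X, feature, grid).mean(axis=0)
```

In the setting of the captions, `X` would hold the 200 randomly selected training observations and `predict` the quantile model fitted at $\tau^{\star}=0.8$.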

Figure 7 H-statistic of the possible two-way interactions for the different quantile models, fitted at level $\tau^{\star}=0.8$ and trained on the first fold of the 5-fold cross-validation.

Figure 8 Grouped partial dependence plots for the permanence variable with respect to gender.

Figure 9 Grouped partial dependence plots for the dimension variable with respect to gender.

Table 4. Values of the $LR_{uc}$ statistic and the corresponding p-values for the different models. The critical value of the $LR_{uc}$ statistic at the 5% significance level is 3.84; the null hypothesis is rejected when the statistic exceeds this value. An asterisk indicates that the model passes the test.
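
A minimal sketch of how such a statistic can be computed is given below. It assumes that $LR_{uc}$ denotes the standard unconditional coverage likelihood ratio test (Kupiec), which is consistent with the 3.84 critical value of a chi-squared distribution with one degree of freedom, and that an exceedance is an observation above its predicted quantile. The function name `lr_uc` and its arguments are hypothetical.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def lr_uc(n_obs, n_exceed, tau):
    """Unconditional coverage LR statistic and its chi-squared(1) p-value.

    n_exceed counts observations above their predicted tau-quantile, so the
    expected exceedance probability under the null hypothesis is 1 - tau.
    """
    alpha = 1.0 - tau
    pi_hat = n_exceed / n_obs  # observed exceedance frequency
    n_ok = n_obs - n_exceed
    log_null = _xlogy(n_ok, 1.0 - alpha) + _xlogy(n_exceed, alpha)
    log_alt = _xlogy(n_ok, 1.0 - pi_hat) + _xlogy(n_exceed, pi_hat)
    stat = max(-2.0 * (log_null - log_alt), 0.0)  # guard against rounding below zero
    # Survival function of a chi-squared variable with one degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Under these assumptions, a model passes the test at the 5% level when the returned statistic is below 3.84, or equivalently when the p-value exceeds 0.05.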

Table 5. Two-way comparison of Gini Indices for the models.