
A hybrid data mining framework for variable annuity portfolio valuation

Published online by Cambridge University Press: 28 July 2023

Hyukjun Gweon*
Affiliation:
Department of Statistical and Actuarial Sciences, Western University, London, ON, Canada
Shu Li
Affiliation:
Department of Statistical and Actuarial Sciences, Western University, London, ON, Canada
*Corresponding author: Hyukjun Gweon; Email: hgweon@uwo.ca

Abstract

A variable annuity is a modern life insurance product that offers its policyholders participation in investment with various guarantees. To address the computational challenge of valuing large portfolios of variable annuity contracts, several data mining frameworks based on statistical learning have been proposed in the past decade. Existing methods utilize regression modeling to predict the market value of most contracts. Despite the efficiency of those methods, a regression model fitted to a small amount of data produces substantial prediction errors, and thus, it is challenging to rely on existing frameworks when highly accurate valuation results are desired or required. In this paper, we propose a novel hybrid framework that effectively chooses and assesses easy-to-predict contracts using the random forest model while leaving hard-to-predict contracts for the Monte Carlo simulation. The effectiveness of the hybrid approach is illustrated with an experimental study.
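The routing idea in the abstract (predict easy contracts with an ensemble, simulate the hard ones) can be illustrated with a minimal sketch. This is not the authors' procedure: the excerpt does not state their selection criterion, so the sketch substitutes the variance of per-tree predictions as a rough difficulty proxy. The names `hybrid_valuation`, `tree_predictions`, `mc_value`, and `max_variance` are hypothetical, and `mc_value` is only a stand-in for a real Monte Carlo valuation engine.

```python
from __future__ import annotations

from statistics import mean, pvariance
from typing import Callable, Sequence


def hybrid_valuation(
    tree_predictions: Sequence[Sequence[float]],
    mc_value: Callable[[int], float],
    max_variance: float,
) -> tuple[list[float], list[int]]:
    """Value each contract by ensemble mean or by simulation.

    tree_predictions[i] holds the per-tree predictions for contract i.
    ASSUMPTION: across-tree variance is used as the difficulty measure;
    the paper's own criterion is not given in this excerpt.
    Contracts whose variance exceeds max_variance are sent to mc_value,
    a placeholder for Monte Carlo simulation of contract i.
    Returns the contract values and the indices that were simulated.
    """
    values: list[float] = []
    simulated: list[int] = []
    for i, preds in enumerate(tree_predictions):
        if pvariance(preds) > max_variance:
            # Hard to predict: the trees disagree, so simulate instead.
            values.append(mc_value(i))
            simulated.append(i)
        else:
            # Easy to predict: accept the ensemble average.
            values.append(mean(preds))
    return values, simulated


if __name__ == "__main__":
    # Toy data: contract 0 has tight tree agreement, contract 1 does not.
    toy_preds = [[100.0, 102.0, 98.0], [50.0, 250.0, 150.0]]
    vals, sim = hybrid_valuation(toy_preds, lambda i: 140.0, max_variance=25.0)
    print(vals, sim)
```

In practice the per-tree predictions might come from a fitted forest (for example, the individual trees exposed by scikit-learn's `RandomForestRegressor.estimators_`), and the threshold would be tuned against the accuracy target rather than fixed by hand.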

Information

Type
Research Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of The International Actuarial Association

Figure 1. An illustration of the hybrid data mining framework.

Table 1. Summary statistics of the continuous feature variables in the dataset.

Figure 2. Histogram of the FMVs of 190,000 VA contracts.

Figure 3. The boxplot (top) and density plot (bottom) of the estimated MSE of the unlabeled contracts.

Figure 4. The estimated and observed $R^2$ values obtained by the hybrid approach.

Figure 5. Scatter plots of the observed FMVs and the values predicted by random forest. The red dots represent RF-based predictions in the hybrid approach.

Figure 6. Performance of the hybrid approach with different sizes of representative data. For $R^2$ (left), higher is better. For mean absolute error (middle), lower is better. For percentage error (right), lower absolute value is better.

Table 2. Summary statistics for the hybrid approach ($n=2,000$) as a function of various thresholds $\left(\widehat{\underline{R}}^2_{{S^*_{RF}}}\right)$. The estimated times are in minutes.

Table 3. Model performance of each approach at different values of $\alpha$. Runtime represents minutes required for each approach to obtain a set of representative contracts (cLHS), train the RF model, and complete the RF-based prediction of the portfolio. Since both the metamodeling and hybrid approaches use the MC simulation for the same amount of data, the runtime required for the MC simulation is the same and thus omitted.