JOINT MODEL PREDICTION AND APPLICATION TO INDIVIDUAL-LEVEL LOSS RESERVING

Abstract Innon-life insurance, the payment history can be predictive of the timing of a settlement for individual claims. Ignoring the association between the payment process and the settlement process could bias the prediction of outstanding payments. To address this issue, we introduce into the literature of micro-level loss reserving a joint modeling framework that incorporates longitudinal payments of a claim into the intensity process of claim settlement. We discuss statistical inference and focus on the prediction aspects of the model. We demonstrate applications of the proposed model in the reserving practice with a detailed empirical analysis using data from a property insurance provider. The prediction results from an out-of-sample validation show that the joint model framework outperforms existing reserving models that ignore the payment–settlement association.


INTRODUCTION
A loss reserve represents the insurer's best estimate of outstanding liabilities for claims that occurred on or before a valuation date. Inaccurate prediction of unpaid claims may lead to under-reserving (inadequate reserves) or over-reserving (excessive reserves), which influences the insurer's key financial metrics that further feeds into the decision making of management, investors, and regulators (Petroni, 1992). For instance, inadequate reserves could lead to deficient rates and thereby increase solvency risk. Also, excessive reserves could increase the cost of capital and regulatory scrutiny. Therefore, reserving accuracy is essential for insurers to meet regulatory requirements, remain solvent, and stay competitive.
In claim management, it is common that small claims are settled faster than large claims, because large and complicated claims naturally require experienced adjusters, demand special expertise, involve multiple interested parties, and are more likely to be litigated. As a result, the duration of settlement and size of payments for individual claims are often positively correlated. See Figure 3 for an example in property insurance.
The payment-settlement association has important implications for the loss reserving practice. In loss reserving, actuaries predict the outstanding liabilities based on the claim history that is only observed up to a valuation date. When the settlement time and claim size are correlated, the historical claims that actuaries use for model building will not be representative of future payments, because large claims with longer settlement times will be more likely to be censored (not settled) by the valuation date, a type of selection bias. Specifically, when larger claims take more time to settle, outstanding payments would be underestimated if the selection bias in the sampling procedure is not accounted for. Similarly, one would expect overestimation of future payments if the claim size and settlement time are negatively correlated.
Further, the payment-settlement association suggests that payment history may help predict settlement time, which in turn feeds back into the prediction of unpaid losses. Then the relation between the two processes allows for the dynamic prediction of outstanding liabilities. The prediction is dynamic in the sense, when more information becomes available over time, an actuary could use claim history to update the prediction for the settlement time and ultimate claim payments.
The goal of this paper is to establish a micro-level loss reserving method that leverages claim level granular information while accounting for the paymentsettlement association, and thus improves accuracy in claim prediction. In doing so, we employ a joint modeling framework developed in the statistical literature for longitudinal outcomes and time-to-event data. The joint model (JM) for reserving purposes consists of two submodels, the longitudinal submodel governs the payment process for a given claim, and the survival submodel concerns the settlement process of the claim. The two components are joined via shared latent variables. The joint model has a natural interpretation in the reserving context, where the historical payments affect the instantaneous settlement probability, and the settlement intensity determines whether there are further payments.
For statistical inference of the joint model, we discuss both estimation and prediction with the focus on the latter. The properties of estimators and predictions are investigated using simulation studies, and we find that the advantages of the proposed joint model are more pronounced for long-tail lines of business. Furthermore, we present a detailed empirical analysis of the joint model framework using data from a property insurance provider with the focus on the Reported But Not Settled (RBNS) reserve prediction. We fit thejoint model to a training data set and find significant positive relation between the payment history and settlement time. The RBNS prediction performance of the joint model is compared to existing reserving models using out-of-sample data, and the results suggest that accounting for the payment-settlement association leads to better prediction.
We contribute to the literature in the following aspects: First, we introduce the joint model for longitudinal and time-to-event data into the micro-level loss reserving literature, and thus provide a novel solution to the sample selection issue that is due to the association between the size of claims and time of settlement. Second, because of the predictive nature of loss reserving, we investigate the predictive performance of the joint model, using both simulated and realworld data, which enriches the existing statistical literature that has primarily focused on the estimation aspect of inference. Third, the detailed analysis not only provides empirical evidence of the payment-settlement association and its role in the dynamic prediction but also provides guidance for practitioners to employ the proposed method in practice.
The rest of the paper is organized as follows: Section 2 reviews the literature on current loss reserving methods and joint models for longitudinal and time-to-event data. Section 3 introduces the joint modeling framework for individual-level loss reserving. Section 4 discusses estimation and prediction for the joint model. Section 5 evaluates the properties of the model using simulation studies. Section 6 describes the property insurance claims data set and its important characteristics that motivate the joint modeling framework, provides estimation results using a training data set and prediction results using a hold-out sample, and discusses limitations. Section 7 concludes the paper.

Literature on reserving models
In the actuarial literature, there are two main classes of reserving techniques: macro-level and micro-level. The macro-level models are based on aggregate claims data summarized in run-off triangles, and the reserve is estimated using the chain-ladder (CL) method and its extensions. See Wüthrich and Merz (2008) for a comprehensive review on macro-level reserving methods. The ease of implementation and interpretation is a major strength of macro-level models, but they come with a risk of inaccurate predictions. For instance, Friedland (2010) examined the effects of environmental changes on reserve prediction and found that the chain-ladder type methods are appropriate only in a steady-state. In case of environmental changes, some of the commonlyused macro-models can generate a reserve estimate without material errors. In this context, "environmental changes" refers to changes in the insurer's business that can affect loss reserving, for example, underwriting practices, claims processing, mix of products, and so forth. To handle environmental changes, macro-level methods consider either expected claims that allow actuaries to incorporate a priori reserve estimate, or trending techniques that treat environmental change as trend to adjust the development projections. However, highly dependent on actuaries' judgments, both techniques could lead to problematic reserve estimates (Jin, 2014).
Micro-level reserving techniques provide a Big Data approach to address the limitations of macro-level models. In recent years, following the general trend in analytics to look into detailed data, interests in micro-level techniques have spiked mostly because of their ability to leverage individual claim development in the prediction of outstanding liabilities. Granular covariate information allows one to account for both claim and policy specific effects, and thus naturally captures the environmental changes. Hence, reserve predictions from micro-level models are generally more accurate than those computed from aggregate data under non-homogeneous environmental conditions. The most studied method is the marked Poisson process (MPP) framework introduced by Arjas (1989), Jewell (1989), Norberg (1993) and(1999). Antonio and Plat (2014) provided the first empirical study with data from a personal-line general liability insurance portfolio. The MPP represents events, such as claims or claim payments, as a collection of time points on a timeline with some additional features (called marks) measured at each point. The marked Cox process provides an extension to allow for overdispersion and serial dependence (Avanzi et al., 2016;Badescu et al., 2016b and2016a). Another family of research using individual-level data employs generalized linear models (GLMs) in conjunction with survival analysis to incorporate settlement time as a predictor for ultimate claims (Taylor and Campbell, 2002;McGuire, 2004 andTaylor et al., 2008). Most recently, machine learning algorithms have become popular in individual-level loss reserving because they are highly flexible and can deal with structured and unstructured information (for example, see Wüthrich, 2018a and 2018b).

Literature on joint models
The existing micro-level reserving methods do not explicitly capture the dependence between the payment history and settlement process. Recently, papers such as Lopez et al., (2019) attempted to handle the bias due to the paymentsettlement association using a weighting scheme. But their framework does not explicitly capture the dependence between the payment history and settlement process. We further extend the literature by introducing the joint longitudinal survival model framework to handle such association explicitly.
The joint model has been proposed in the medical statistics literature for modeling longitudinal and survival outcomes when the two components are correlated (Elashoff et al., 2017). Two general frameworks have received extensive attention, the pattern mixture model and the selection model (Little, 2008). These two frameworks differ in the way the joint distribution is factorized. In the former, the joint distribution is specified using the marginal distribution of time-to-event outcome and the conditional distribution of longitudinal outcomes given the time-to-event outcome. In contrast, the joint distribution in the latter is specified using models for the marginal distribution of longitudinal outcomes and the conditional distribution of time-to-event outcome given longitudinal outcomes. Diggle and Kenward (1994) were the first to apply selection models to nonrandom drop-out in longitudinal studies by allowing the drop-out probabilities to depend on the history of measurement process up to the drop-out time. The two model families are primarily applied with discrete drop out times and cannot be easily extended to continuous time.
The properties of the joint models have been well developed in the biomedical literature in clinical studies (Ibrahim et al., 2010) and non-clinical studies (Liu, 2009). Tsiatis and Davidian (2004), Yu et al., (2004), and Verbeke et al., (2010) give excellent overviews of joint models. Besides, Rizopoulos (2010) and (2016) develop R packages for joint models.

General framework
In this section, we introduce the joint model framework to the loss reserving problem, focusing on a subset of selection models called shared-parameter models. In shared-parameter models, a latent random effects b i is used to capture the association between the longitudinal and the time-to-event outcomes (Rizopoulos, 2012). For this application, the sequence of payments from a reported claim forms the longitudinal outcomes, and the settlement time of the claim is the time-to-event outcome of interest. The development of claim payments may yield early indications of impending settlement, which introduces associations between the longitudinal and survival outcomes.
In this study, we set the time origin for a claim as its reporting time. For the ith claim (i = 1, . . . , N), we denote T * i and c i as the settlement time and valuation time, respectively. Assuming where b i denotes the vector of random effects that account for the claimspecific unobserved heterogeneity. The formulation in (3.1) relies on a conditional independence assumption. Figure 1 provides a graphical illustration of the cumulative payment process that experiences jumps at the time of each payment from the time of reporting to settlement. The size of the jump represents the amount of incremental payment. The left panel presents a closed claim where the entire development process of the claim is observed before the valuation time, that is ( i = 1, n i = n * i ). The right panel provides an example of an open claim where only a part of the development process of the claim is observed at the valuation time, that is ( i = 0, n i ≤ n * i ). The rest of the section focuses on the general modeling framework for both the longitudinal and survival submodels. Alternative model specifications under the general framework are considered in the empirical analysis and thus discussed in Section 6.

Longitudinal submodel of claim payments
The cumulative payments Y it are specified using generalized linear mixed effect models (GLMM). See, for instance, Frees (2004) and Molenberghs and Verbeke (2006) for details. Then, conditional on the random effects b i , the cumulative payment Y it is assumed to be from the exponential family with dispersion parameter φ. The conditional mean of Y it , μ it = E[Y it |b i ], is specified as a linear combination of covariates via a link function g(·), that is Here, f (t i ;β 1 ) is a function of payment time t parameterized by β 1 . Examples are linear functions, polynomial functions, and splines. Furthermore, x it and z it are the vectors of covariates in the fixed and random effects, respectively, and β = {β 1 , β 2 } is the regression coefficients to be estimated. In addition, b i , i = 1, . . . , N, are independent of each other and b i ∼ N(0, D), where D is the covariance matrix with unknown parameters ν. We emphasize that the covariates are indexed by t because, in addition to time-independent covariates like claim codes, the model allows us to include time-dependent covariates. In this model, we assume Y it are independent across time conditional on random effects b i and predictors x it .

Survival submodel of claim settlement
The time-to-settlement outcome of a claim is modeled using a proportional hazards model. The hazard function of T * i is specified as where h 0 (t) is the baseline hazard, w it is a vector of covariates, and γ is the corresponding regression coefficients. In this model, the association between the claim payment process and the settlement process is introduced through the effects of η it on the hazard of settlement that is measured by α. A positive α indicates a negative payment-settlement relation, that is larger payments will accelerate the settlement, and vice versa. From (3.3), the survival function of For the baseline hazard in (3.3), we consider both the Weibull model and an approximation based on splines. The Weibull baseline is given by h 0 (t) = λκt κ−1 , (3.5) where λ is the scale parameter, and κ is the shape parameter. When κ = 1, h 0 (t) reduces to an exponential baseline function. The Weibull model is commonly used because of its simplicity and the easy interpretability. A more flexible model is to approximate the baseline hazard using splines. Specifically, we consider: Here, B k (·) is a B-spline basis function, q denotes the degree of the B-spline basis function, K = q + m; where m is the number of interior knots, and λ = (λ 0 , λ 1 , · · · , λ K ) are the spline coefficients. For convenience, we denote ω to be the parameters in the baseline hazard model. Then, ω = {κ, λ} for the Weibull baseline and ω = {λ 0 , λ 1 , · · · , λ K } for the spline baseline.

Estimation
The parameters of the joint model are estimated using likelihood-based methods. Denote θ = (θ 1 , θ 2 ), where θ 1 = {β, ν, φ} summarizes the parameters of the longitudinal submodel including both regression coefficients and variance components, and θ 2 = {ω, γ , α} summarizes the parameters of the survival submodel that includes baseline hazard, regression coefficients, and association between claim payment and settlement. The likelihood function for observ- Given data collected on N individual claims, the MLE of model parameters are obtained byθ The variance ofθ is estimated using the inverse of the observed Information matrix, that is and the second-order derivative is approximated by the numerical Hessian matrix. Song et al., (2002) proposed an estimation procedure that does not require normality assumption for random effects and showed that estimation under normal assumption is robust to misspecification. In addition, Rizopoulos et al., (2008) showed that misspecification of the random effects distribution has a minimal effect in parameter estimation that wanes when the number of repeated measurements increases. Evaluation of the likelihood function is computationally difficult because of the integral in the likelihood function (4.1) and the integral in the survival density function (4.2). Numerical integration techniques such as Gaussian quadrature (Song et al., 2002), Monte Carlo (Henderson et al., 2000) and Laplace approximations (Rizopoulos et al., 2009) have been applied in the joint modeling framework. Maximization approaches include the EM algorithm that treats the random effects as missing data (Wulfsohn and Tsiatis, 1997) and a direct maximization of the log-likelihood using a quasi-Newton algorithm (Lange, 2004). In this paper, we employ the Gaussian quadrature numerical techniques to evaluate the likelihood function. It is worth noting that the computational aspect of the model is not the focus of our study. We refer to the aforementioned literature for details.
The random-effects estimateb i for claim specific predictions is obtained using Bayesian methods with posterior distribution: The meanb i of the posterior distribution is used as the empirical Bayes estimate and is obtained bŷ

Prediction
At the valuation time, an open claim i is characterized by the time since reporting c i and longitudinal claim history With the fitted joint model, we obtain the RBNS reserve prediction for the ith claim at the valuation time,R RBNS i (c i ), using the following steps which are elaborated in Algorithm 1: (a) Predict the future time when the ith claim will be settled,û i , given T * i > c i and Y i (c i ) using (4.11) from Section 4.2.1.
Let m be the number of open claims at the valuation time, that is, m = N i=1 I(δ i = 0). Then the total RBNS reserve amount is given bŷ (4.8)

Prediction of time-to-settlement
To predict the time-to-settlement for a RBNS claim, given that the claim survived (not settled) up to the valuation time, we are interested in estimating the conditional survival probability: where S i (·) is given by (3.4), and u > c i . w iu and w ic i are covariates at times u and c i . π i (u|c i ) gives the probability that there are further payments at future time u. Here, the probability prediction is dynamic because π i (u|c i ) depends on the expected claims amounts at valuation time c i and future time u given by η ic i and η iu , respectively. Then the predictions can be updated as more data becomes available. Using the MLE estimatesθ and the empirical Bayes estimateb i , an estimate of π i (u|c i ) is given bŷ (4.11)

Prediction of future claim Payments
For the future claim payments prediction of an open claim at the valuation time, we are interested in the expected cumulative payments at future (4.12) Here, g −1 (·) is the inverse of the link function, {x iu , z iu } are covariates, and β = {β 1 ,β 2 } are the maximum likelihood estimates. The point prediction of the ultimate amount of claim i,Ŷ ULT i (u), is given by the mean of all expected cumulative payments at simulated future times u > c i .
An algorithm for predicting the loss reserve using the joint model is given in Algorithm 1.
Algorithm 1 Reserve prediction routine for joint model. Calculateη To better understand the strength and limitations of the proposed joint model, we investigate the performance of estimation and prediction routines described in Section 4 using simulated data.

Simulation design
In the simulation, the longitudinal submodel is assumed to be a gamma regression with dispersion parameter 1/σ . The conditional mean is further specified as where f (t i ;β 1 ) = β 10 + tβ 11 is a linear function of the payment times, and x it = {x i1 , x i2 }. The random effects are generated from a normal distribution with mean zero and variance ν, N (0, ν). The survival submodel is a proportional hazards model with an exponential base hazard. Specifically, with w it = {x i1 , x i2 }, the conditional hazard function is The parameters used in data generation are summarized in Table 1. The payment times are assumed to be exogenous and are set at t = 0, 1, 2, ..., 9.
We assume x 1 ∼ Bernouli(0.5), representing a discrete predictor and x 2 ∼ Normal(1, 0.25), corresponding to a continuous predictor. The claims are evenly and independently distributed among ten accident years, and the censoring time is the end of calender year ten. Based on the work of Sweeting and Thompson (2011), we employ the Algorithm provided in the Online Appendix C to construct the training and validation data in the simulation study.

Parameter estimates
The main results on parameter estimation are summarized in Table 1. We consider different sample sizes (number of claims) and report the results for N = 500, 1000, and 1500. For each simulated sample, the parameter estimates and the associated standard error are obtained using the likelihood-based method described in Section 4. The results reported in Table 1 are based on S = 150 replications.
We show in the table the average bias (Bias) and the average standard error (SE) of the estimates. In addition, we calculate the nominal standard deviation of the point estimates (SD) and report the standard deviation of the average bias calculated as SD/ (S). As anticipated, both estimate and uncertainty of the average bias decrease as sample size increase. The average standard error are comparable to the nominal standard deviation, indicating the accuracy of variance estimates. Lastly, the standard errors are consistent with √ n convergence. To emphasize the importance of joint estimation, we explore two additional estimation strategies, independent and two-stage estimations. The former estimates the longitudinal and survival submodels separately ignoring the association between the two components. The latter estimates the parameters in the longitudinal submodel in the first stage, and then estimates the parameters in the survival submodel in the second stage holding the longitudinal model parameters fixed. Both estimation techniques turn out to introduce significant bias in the parameter estimates. Detailed discussions on the two alternative strategies and the associated estimation results are provided in the Online Appendix A.

RBNS prediction
This section focuses on the prediction performance of the proposed joint model in different scenarios. The prediction from the joint model is compared with the independent and two-stage estimates. Results presented in this section are based on sample size N = 1000 and S = 150 replications.
In this scenario, we investigate the effect of payment frequency from individual claims on the prediction accuracy. The payment frequency is defined as the number of payments per unit time period. The high-frequency payment case corresponds to the base model described in Section 5.1 where the maximum number of payments for each claim is ten, and payments are at times t = 0, 1, 2, ..., 9. In the low-frequency payment case, the maximum number of payments is reduced by half, and payment times are t = 0, 2, 4, 6, 8. Note that the payment frequency does not alter the settlement process, and it only affects the number of observations generated from the longitudinal submodel. Figure 2 illustrates the timeline of the payment times for the low-frequency and high-frequency payment models. It is seen that claims in the highfrequency model are likely to have more payment transactions than those in the low-frequency model. For instance, a claim that is to be settled at t = 1.5 will be closed with a single transaction under the low-frequency payment model. However, a claim with the same settlement time will be closed with two transactions under the high-frequency payment model.
One can think of the low-frequency payment scenario as a representation of short-tail business lines such as personal automobile collision insurance, where claims, once reported, are typically settled with a single payment within a relatively short period of time. In contrast, the high-frequency payment scenario mimics long-tail business lines such as workers' compensation insurance, where claim settlement is usually accompanied by more payment transactions than the short-tail lines. Table 2 shows the true RBNS reserve, the reserve error (estimated RBNS reserve minus the actual unpaid losses), the error as a percentage of actual unpaid losses, and the standard error of prediction divided by the number of replications (SE/ √ S). For the high-frequency scenario simulation, it is seen that joint model performs better than the independent and the two-stage estimation techniques with the smallest percentage error of 0.55%. The performance of the two-stage technique and independent technique in comparison to the joint model model emphasizes the point that when the endogenous nature of the cumulative payments and the association between cumulative payments and settlement process are ignored, it leads to biased estimates and consequently inaccurate predictions of unpaid losses.
For the low-frequency simulation, the joint model with the percentage error of 1.16% again performs better than the other estimation techniques. The slight increase in the percentage reserve error for the two-stage and the joint model compared to the high-frequency model indicates that the reduction in payment transactions reduces the accuracy of the individual claim random effects estimate used for the reserve predictions. Also, compared to the high-frequency model, the percentage reserve error for the independent model has reduced to −20.58%, which implies that the advantages of the joint model are significant for long-tail lines of business.

Data
The data analyzed in this paper are from the Wisconsin Local Government Property Insurance Fund (LGPIF), which was established to make property insurance available for local government units. The LGPIF offers three major types of coverage for local government properties: building and contents,  , 2006, and December 31, 2009, but settled between January 1, 2010, and December 31, 2013. Specifically, there are 163 claims with a total outstanding payment of $4, 511, 490. The training data are used to develop the joint model, and the validation data are used to evaluate the quality of reserve prediction. We emphasize that our analysis is based on ground-up losses, which allowed us to identify and exclude claims with reported losses less than the deductible, resulting in such claims being closed without payment. Therefore, our 3393 reported claims do not include claims which closed without payments. Figure 3 visualizes the relationship between settlement time (in quarters) and the ultimate payments (in log scale) for claims in the training data set. The settlement time of a claim is defined as the difference between the close date and reporting date of the claim. For reopened claims, the settlement time of a claim is defined as the difference between the final close date after reopening and reporting date of the claim. The left panel shows the distribution of ultimate payment by settlement time, and the right panel shows the scatter plot of the two outcomes. The solid curve in the right panel corresponds to the fit of the locally estimated scatter plot smoother (LOESS). Both plots suggest a strong positive relation between ultimate payment and settlement time, that is, it takes longer to close larger claims. In addition, the data set contains rich information regarding the policy, policyholder, claims, and payment transactions. Table 3 describes the variables that we select to use as predictors in the joint model. Table 4 provides descriptive statistics of the two outcomes of interest, that is, the settlement time and ultimate loss amount, as well as the continuous predictors. We note that the deductible and initial case estimate are right-skewed. To handle the skewness, we apply logarithmic transformation prior to model fitting. In addition, the table reports the Spearman correlation between the two outcomes, and then between each outcome and the predictors. The results suggest a substantial correlation between the settlement time and ultimate payment, and the selected covariates are expected to be predictive in the reserving application.

Estimation results
The joint longitudinal-survival framework is applied to the micro-level reserving problem using the property data from the Wisconsin LGPIF. Specifically, we use a gamma distribution with a log link and a nonlinear payment trend using B-splines with an internal knot at payment time 5 in the longitudinal submodel, and a Weibull baseline hazard in the survival submodel. The nonlinear payment trend is motivated by Figure 3, which suggests a nonlinear relationship between payment size and time. Also, a random intercept longitudinal submodel is assumed to follow a normal distribution, N (0, ν). With the random intercept longitudinal submodel, only the intercept parameter in the GLMM is random, and all slope parameters are fixed. The estimation results are presented in Table 5. We present the parameter estimates and standard errors for the continuous covariates. Here, B 1 . . . B 4 denotes the spline coefficients in the longitudinal submodel. For the categorical covariates, we present the likelihood ratio test statistics, the degrees of freedom, and the associated p-values that indicate their statistical significance in each submodel.
In the survival submodel, the association parameter α is interpreted as the percentage change in hazard or risk of the settlement when expected cumulative payments increase by one percent. The estimated value is −0.407 and is statistically significant at a 1% significance level. The negative association in the hazard model implies a positive relationship between the settlement time and payment amount.

Evaluation of survival submodel
The correct specification of the survival submodel is necessary to obtain accurate prediction results. In the modeling building process, we consider two alternative specifications for the baseline hazard function. A parametric Weibull model and a more flexible spline model. The spline baseline model was fitted with equally spaced five internal knots in the quantiles of the observed event times.
To examine the overall goodness-of-fit, we compare the Kaplan-Meier estimate of the Cox-Snell residuals from both survival submodels to the function of the unit exponential distribution (Rizopoulos, 2012). Figure 4 visualizes the fit for the survival submodel with the Weibull and spline baseline hazard functions in the left and right panel, respectively. The solid line is the Kaplan-Meier estimate of the survival function of the Cox-Snell residuals, and the dashed line is the survival function of the unit exponential distribution. It can be seen that both the Weibull and spline baseline functions fit the data very well. We choose the Weibull model due to easy interpretation.

Evaluation of longitudinal submodel
We also explore alternative specifications in the longitudinal submodel. First, we investigate the distributional assumption for the payment amount. To accommodate the skewness in the claim severity, we fit two joint models, one with a lognormal and the other with a gamma longitudinal submodel. The two methods amount to the transformation technique and generalized linear models for handing non-normal responses in regression respectively. In the former, we transform the response to a symmetric distributed outcome and apply linear regression model to the transformed variable. In the latter, we use a link To be more specific, we consider a linear trend and a nonlinear trend using splines. In Figure 5, we overlay the scatter plot of payments by time with the fitted trend. The left panel shows the linear trend, and the right panel shows the nonlinear trend using B-spline basis functions. The nonlinear trend shows a better fit, which is further supported by the AIC and BIC statistics that are not reported. As a result, we employed a gamma regression model with a nonlinear trend in the longitudinal submodel.

Reserve prediction
This section examines the reserve prediction from the proposed joint model. To recap, the validation data span from January 1, 2010 to December 31, 2013. All payments that occurred during this time period are the outstanding liabilities of the insurer as of the valuation date. Our goal in the reserving application is to predict the outstanding payments. One advantage of individual loss reserving models over macro-reserving methods is that we will be able to obtain not only the prediction for the insurance portfolio but also the prediction for each individual claim.

Prediction from the joint model
To obtain the RBNS reserve estimate from the fitted joint model, we follow the prediction routine described in Section 4.2. Specifically, the joint model allows us to make a prediction for both the time-to-settlement and the ultimate payment for individual claims. Given that the B-splines is used in the longitudinal submodel, prediction for the ultimate losses is based on a linear extrapolation for the settlement time beyond the maximum observed payment time in the training data. Figure 6 compares the actual (or realized) and predicted outcomes in the hold-out sample. The left panel presents the distribution of unpaid losses over time. The consistency between the actual and predicted unpaid losses suggests a satisfying performance of the joint model. Another advantage of the joint model is that it can be used to predict the time to settlement for open claims, which will be particularly useful in the run-off operation of an insurer. For example, in a run-off situation for a workers compensation insurer, losses for which claimants would not take an offered settlement usually involves regular payments until death (Kahn, 2002). Therefore, accurately predicting the settlement time or remaining months to live is important in the reserving exercise. The right panel shows the scatter plot of actual and predicted settlement time. The linear relationship is an indicator of prediction accuracy.
For reserving purposes, one is not only interested in the point prediction but also the predictive distribution. We discuss two predictive distributions for the insurer's outstanding payments that are relevant to the application. The first is the predictive distribution of the expected outstanding payments. The uncertainty associated with the expected payments is from the parameter estimation. This type of uncertainty is known as parameter uncertainty in the macro-loss reserving literature (see, for instance, England and Verrall, 2002). The second is the predictive distribution of the outstanding payments. In addition to parameter estimation, additional uncertainty arises due to the inherent randomness of the unpaid losses, which is known as process uncertainty in the macro-loss reserving literature. We obtain the two types of predictive distribution using simulation to incorporate the parameter and process uncertainty. For parameter uncertainty, we generate model parameters from the multivariate normal distribution with mean being the maximum likelihood estimatesθ and covariance matrix Var(θ ). For process uncertainty, we generate the ultimate payment from the joint model given the simulated parameters. To illustrate, we present in Figure 7 the predictive distributions for the entire insurance portfolio. The RBNS liability is generated for each individual claim and then aggregated for the portfolio. As anticipated, the predictive distribution of the outstanding payment is much wider than the predictive distribution of the expected payment because the former contains both parameter and process uncertainty, while the latter only considers parameter uncertainty. The vertical line in the figure indicates the actual outstanding payments in the hold-out sample. It corresponds to the 34th percentile and 53rd percentile in the predictive distributions of expected unpaid loss and unpaid loss, respectively.

Comparison with alternative methods
This section compares reserve prediction from the joint model with alternative individual loss reserving methods. Specifically, we consider the four methods below: In addition, we make a comparison with the reserves determined by a claim expert (initial estimate). Further, we provide a comparison of reserves on the aggregate level and provide reserve estimates from the chain-ladder model. We employ the Mack chain-ladder model (Mack, 1993) and obtain RBNS reserve estimates from a run-off triangle that is aggregated using the reporting and observation year instead of the occurrence year and development year. The analysis was performed in the ChainLadder R package (Carrato et al., 2020).
The comparison is based on predictions of unpaid losses for individual claims. Using the actual and predicted unpaid losses of individual claims, we calculate five validation statistics, mean absolute error (MAE), root mean squared error (RMSE), Pearson correlation, Gini correlation, and simple Gini (see Frees et al., 2011 andFrees et al., 2014 for details on the latter two metrics). The results are summarized in Table 6. Higher prediction accuracy is suggested by a smaller MAE and RMSE, and larger correlations. The validation statistics support the superior prediction from the proposed joint model to alternative individual reserving methods. We also calculate the reserve error on the aggregate level, which is the expected RBNS reserve minus the actual unpaid losses, as a percentage of the actual unpaid losses. The results show the chain-ladder method did not perform well in estimating the unpaid losses.

Limitations
Though the joint model displayed superior prediction results compared to models that ignore the payment-settlement association; there are some limitations in the current framework, which will be the subject of future research and are highlighted below: 1. Working with cumulative instead of incremental payments: From equation (3.3), we explicitly capture the dependence between the payment history and settlement process. To do this, we rely on the assumption that the cumulative payments follow a continuous growth curve. The implicit assumption is that the trends in time, fixed-effects, and the random effects b i together will soak up the temporal correlation among cumulative payments. The conditional independence assumption for cumulative payments could be too strong to result in a realistic claim evolution. One could explicitly account for the correlation in cumulative payments but at the cost of increased model complexity. We leave this work to explore in the future. 2. Intermediate payment prediction: The model immediately predicts the amount paid at settlement without intermediate cash flows, which may be limiting in situations where such quantities are needed. 3. Time-dependent covariates: The proposed model only allows for external time-varying covariates that are determined independently of the settlement process. Internal time-varying covariates such as case estimates, whose value is generated until settlement or censoring by the individual claims, require special treatment (Kalbfleisch and Prentice, 2002). Handling internal time-varying covariates is a topic for future research.

CONCLUSION
This paper concerns the claims reserving problem using an individuallevel loss reserving method. The work was primarily motivated by the payment-settlement association. Specifically, complex claims can be both more expensive in terms of severity and take longer to settle, suggesting that the payment process is correlated with the settlement process for individual claims. In this case, knowledge of paid losses may help predict settlement time, which in turn feeds back into the prediction of unpaid losses.
We introduced a joint model framework to the individual-level loss reserving literature to accommodate such association. The joint model consists of a longitudinal submodel for the cumulative payment process and a survival submodel for the settlement process, and the association between the two components is induced via a shared parameter model. In addition, the proposed joint model incorporates both observed and unobserved heterogeneity into the two sub-processes, which is desired when one is interested in the prediction at the individual claim level.
We have demonstrated that failing to incorporate the association between the payment processes and the settlement processes could lead to significant errors in reserving prediction. Specifically, the joint longitudinal-survival model (JM) framework was applied to the reserving problem using data from a property insurance provider. The prediction results from the joint model were compared to existing reserving models, and the results showed that accounting for the payment-settlement association leads to better prediction and lower reserve uncertainty compared to models that ignore the association.