PHASE-TYPE DISTRIBUTIONS FOR CLAIM SEVERITY REGRESSION MODELING

Published online by Cambridge University Press:  07 January 2022

Martin Bladt*
Affiliation:
Department of Actuarial Science, Faculty of Business and Economics, University of Lausanne, UNIL-Dorigny, 1015 Lausanne, Switzerland, E-Mail: martin.bladt@unil.ch

Abstract

This paper addresses the task of modeling loss severities using segmentation when the data distribution does not fall into the usual regression frameworks. This situation is not uncommon in lines of business such as third-party liability insurance, where heavy tails and multimodality often hamper a direct statistical analysis. We propose to use regression models based on phase-type distributions, regressing on their underlying inhomogeneous Markov intensity and using an extension of the expectation–maximization algorithm. These models are interpretable and tractable in terms of multistate processes, and they generalize the proportional hazards specification when the dimension of the state space is larger than 1. We show that the combination of matrix parameters, inhomogeneity transforms, and covariate information provides flexible regression models that effectively capture the entire distribution of loss severities.
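To make the link to proportional hazards concrete, the following is a minimal sketch, not the estimation procedure of the paper. It assumes a survival function of the form $\pi \exp\big(T\, g(y)\, e^{x\beta}\big)\mathbf{1}$, where $\pi$ is the initial distribution, $T$ the sub-intensity matrix, and $g(y)=y^{\theta}$ the cumulative Weibull-type inhomogeneity. The function names (`mat_exp`, `iph_survival`), the exponential link, and all parameter values are illustrative assumptions and do not come from this page.

```python
import math


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(a, terms=30):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    # Scale the matrix so that its norm is at most 0.5 before the series expansion.
    squarings = (max(0, math.ceil(math.log2(norm))) + 1) if norm > 0 else 0
    scale = 2.0 ** squarings
    scaled = [[v / scale for v in row] for row in a]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term holds scaled^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def iph_survival(y, x, pi, T, theta, beta):
    """Survival function under the assumed specification.

    Assumption: inhomogeneity lambda(s) = theta * s**(theta - 1), so the
    cumulative intensity is y**theta, multiplied by exp(x . beta).
    """
    g = (y ** theta) * math.exp(sum(xi * bi for xi, bi in zip(x, beta)))
    E = mat_exp([[t * g for t in row] for row in T])
    p = len(pi)
    return sum(pi[i] * E[i][j] for i in range(p) for j in range(p))


# Illustrative two-state example (all values are made up).
pi = [0.6, 0.4]
T = [[-1.0, 0.5], [0.0, -2.0]]
s_base = iph_survival(1.0, [0.0], pi, T, theta=1.5, beta=[0.3])
s_cov = iph_survival(1.0, [1.0], pi, T, theta=1.5, beta=[0.3])
```

With a single state, the sketch collapses to the scalar Weibull proportional hazards survival function $\exp(-t\, y^{\theta} e^{x\beta})$, which is the sense in which a larger state space generalizes that specification.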

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. Densities corresponding to a scalar Weibull distribution and a Matrix-Weibull distribution.

Figure 2. Mean functions of the PH regression model, as a function of a univariate regressor and for different specifications of the inhomogeneity function $\lambda$.

Figure 3. Underlying Markov structures. Names are borrowed from the corresponding PH representations but apply to our inhomogeneous setup as well. The state 0 is added for schematic reasons but is not part of the actual state space of the chain. The (F) general case has the intensities $t_{ij}$ and $t_{ji}$ between each pair of states $i,j\in\{0,\dots,p\}$ omitted for display purposes.

Figure 4. Histogram and kernel density estimate of the log-transformed simulated data.

Table 1. GLMs and PH regression models.

Figure 5. Ordered PITs from Equation (2.10) versus uniform order statistics for the simulated dataset. KS refers to the Kolmogorov–Smirnov statistic for testing uniformity.

Table 2. Summary of fitted marginal models to severities from the freMPL dataset.

Figure 6. Fitted distributions to claim severities from the MPL dataset. The log-transform is used exclusively for visual purposes; the estimation was carried out on the original scale. KS refers to the Kolmogorov–Smirnov statistic for testing uniformity.

Table 3. Summary statistics of the French MPL dataset: claim severity.

Table 4. Summary for GLM and PH regression models for the freMPL dataset.

Figure 7. Coefficients and p-values of IPH and GLM regression. For display: IPH coefficients multiplied by $-1$ and intercept of GLM omitted.

Figure 8. Left panels: aggregate observed losses versus aggregate implied model premia (expected value), normalized to sum to 1, for the Gamma GLM, and Pareto and Weibull PH regressions. Right panels: number of claims within each category.

Figure 9. Empirical quantiles by coverage category versus mean (across all other covariates) quantiles implied by the Gamma GLM, and Pareto and Weibull PH regressions.

Figure 10. Ordered PITs from Equation (2.10) versus uniform order statistics for the French MPL dataset. KS refers to the Kolmogorov–Smirnov statistic for testing uniformity.

Supplementary material: File

Bladt supplementary material

File 106.8 KB