Application of bivariate negative binomial regression model in analysing insurance count data

Published online by Cambridge University Press: 04 May 2017

Feng Liu*
Affiliation:
Department of Applied Finance and Actuarial Studies, Faculty of Business and Economics, Macquarie University, North Ryde NSW 2109, Australia
David Pitt
Affiliation:
Department of Applied Finance and Actuarial Studies, Faculty of Business and Economics, Macquarie University, North Ryde NSW 2109, Australia
*Correspondence to: Feng Liu, Department of Applied Finance and Actuarial Studies, Faculty of Business and Economics, Macquarie University, North Ryde NSW 2113, Australia. Tel: 61 2 9850 8455; E-mail: feng.liu4@hdr.mq.edu.au

Abstract

In this paper we analyse insurance claim frequency data using the bivariate negative binomial regression (BNBR) model. We use general insurance data on claims from simple third-party liability insurance and comprehensive insurance. We find that bivariate regression, with its capacity for modelling correlation between the two observed claim counts, provides both a superior fit and superior out-of-sample prediction compared with the more common practice of fitting univariate negative binomial regression models separately to each claim type. Noting the complexity of BNBR models and their potential for a large number of parameters, we explore the use of model shrinkage methodology, namely the least absolute shrinkage and selection operator (Lasso) and ridge regression. We find that models estimated using shrinkage methods outperform the ordinary likelihood-based models when used to make out-of-sample predictions, and that the Lasso performs better than ridge regression as a method of shrinkage.

Information

Type: Papers
Copyright: © Institute and Faculty of Actuaries 2017

Table 1 Explanatory variables in the regression model.

Table 2 Summary statistics of claim frequencies as classified by the explanatory variables.

Figure 1 Scatter plot of two insurance claim counts. The size of the dot at each point gives a relative indication of the number of observations. The trend line is also presented.

Table 3 Summary table of two types of insurance counts.

Table 4 Interaction terms used in the regression model.

Table 5 Modelling results of the BNBR model, two UNBR models and the BPR model, which are all classified as the full models.

Figure 2 Deviances from cross-validation at different ω values. Each deviance in the graph is calculated as the average of the ten deviances at the same ω generated in the tenfold cross-validation process.

Table 6 Modelling results for the original full bivariate negative binomial regression model and shrunken models.

Figure 3 Comparison of the least absolute shrinkage and selection operator (Lasso) (left) and ridge regression (right).

Figure 4 Shrunken coefficients: the least absolute shrinkage and selection operator (Lasso).

Figure 5 Shrunken coefficients: ridge regression.

Table 7 Modelling results of the original full univariate negative binomial regression (UNBR) model and UNBR models shrunken by the two methods.