Chapter Preview. This chapter considers regression methods where the analyst has the ability to follow a unit of analysis, such as a policyholder, over time. In the biomedical literature, this type of information is known as longitudinal data and, in the econometric literature, as panel data.
7.1 Introduction
7.1.1 What Are Longitudinal and Panel Data?
Regression is a statistical technique that serves to explain the distribution of an outcome of interest (y) in terms of other variables, often called “explanatory” or “predictor” variables (x's). For example, in a typical ratemaking exercise, the analyst gathers a cross-section of policyholders and uses various rating variables (x's) to explain losses (y's).
In this chapter, we assume that the analyst has the ability to follow each policyholder over time. In many situations, a policyholder's past loss experience can provide an important supplement to the information gleaned from available rating variables. We use the notation i to represent the unit of analysis (e.g., policyholder) that we will follow over time t. Using double subscripts, the notation y_{it} refers to a dependent variable y for policyholder i at time t, and similarly for explanatory variables x_{it}.
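To make the double-subscript notation concrete, the following minimal sketch stores a small panel in "long" format, with one record per policyholder i and year t. The field names (x_age, y_loss), the helper functions, and all numeric values are invented for illustration and are not taken from any data set in this chapter.

```python
# Toy long-format panel: one record per (policyholder i, year t).
# All values below are made up purely to illustrate the notation.
records = [
    {"i": 1, "t": 2019, "x_age": 34, "y_loss": 0.0},
    {"i": 1, "t": 2020, "x_age": 35, "y_loss": 1200.0},
    {"i": 1, "t": 2021, "x_age": 36, "y_loss": 0.0},
    {"i": 2, "t": 2020, "x_age": 51, "y_loss": 300.0},
    {"i": 2, "t": 2021, "x_age": 52, "y_loss": 450.0},
]

# y[(i, t)] plays the role of the dependent variable for unit i at time t;
# x[(i, t)] holds the corresponding explanatory variables as a tuple.
y = {(r["i"], r["t"]): r["y_loss"] for r in records}
x = {(r["i"], r["t"]): (r["x_age"],) for r in records}


def lagged_loss(y, i, t):
    """Return the loss of policyholder i at time t - 1, or None if unobserved."""
    return y.get((i, t - 1))


def history(y, i):
    """Return policyholder i's observed (t, loss) pairs in time order."""
    return [(t, loss) for (j, t), loss in sorted(y.items()) if j == i]


if __name__ == "__main__":
    print(lagged_loss(y, 1, 2020))  # prior-year loss for policyholder 1
    print(history(y, 2))            # loss history for policyholder 2
```

Policyholder 2 has no 2019 record, so the lookup for its prior-year loss in 2020 returns None. Panels in which units are observed for different numbers of periods are common in practice, and keying on the pair (i, t) handles them without padding.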
Consider four applications that an actuary might face that fall into this framework:
(1) Personal lines insurance such as automobile and homeowners: Here, i represents the policyholder that we follow over time t, y represents the policy loss or claim, and the vector x represents a set of rating variables.
[…]