A Markov multiple state model for epidemic and insurance modelling

Abstract With recent epidemics such as COVID-19, H1N1 and SARS causing devastating financial loss to the economy, it is important that insurance companies plan for financial costs of epidemics. This article proposes a new methodology for epidemic and insurance modelling by combining the existing deterministic compartmental models and the Markov multiple state models to facilitate actuarial computations to design new health insurance plans that cover epidemics. Our method is inspired by the seminal paper (Feng and Garrido (2011) North American Actuarial Journal, 15, 112–136.) of Feng and Garrido and complements the work of Hillairet and Lopez et al. in Hillairet and Lopez ((2021) Scandinavian Actuarial Journal, 2021(8), 671–694.) and Hillairet et al. ((2022) Insurance: Mathematics and Economics, 107, 88–101.) In this work, we use the deterministic SIR model and the Eyam epidemic data set to provide numerical illustrations for our method.


Introduction
Given the impact of epidemics such as COVID-19, and to a lesser extent H1N1, SARS, etc. on our society, health insurance providers must pay more attention to developing health insurance coverage to the public that includes those caused by infectious diseases.In that regard, the actuaries who design those health insurance plans may find the vast literature of epidemiological compartmental models immensely useful.Epidemiological compartmental models have been developed since the 1910s with the works of Ross in fighting malaria (Ross, 1910) and Kermack andMcKendrick (1927) and(1932) who introduced the celebrated SIR model.Those early compartmental models are deterministic (i.e., it gives the exact number of individuals in each compartment at any given time).In Bartlett (1949), Bartlett introduced a stochastic version of the SIR model known as the general stochastic epidemic model.Since then, stochastic compartmental models have been studied and generalised extensively (see Andersson and Britton, 2000;Britton et al., 2019).
Even though the literature on epidemiological compartmental models is vast, its applications to actuarial science and in designing insurance plans are modest in comparison.Early papers to investigate actuarial applications of epidemiological models include Hua and Cox (2009) by Hua and Cox and Feng and Garrido (2011) by Feng and Garrido. In Feng and Garrido (2011), the authors developed actuarial models to provide financial arrangement arising due to epidemics based on deterministic epidemiological compartmental models.Other papers in this direction include Chen (2021), Feng et al. (2022, 2020), and Hainaut (2021).In particular, the paper Feng et al. (2022) by Feng et al. contains a detailed bibliography of related works.
In another development, Lefevre, Picard and Simon in a series of papers (Lefevre et al., 2017-Lefevre andSimon, 2021) also developed actuarial applications of the general stochastic epidemic models such as one proposed in Bartlett (1949) and its generalisations.Using sophisticated techniques of probability theory, they obtained formulas for various important quantities of actuarial interest, for example, the final susceptible size distribution of the epidemic, the expected susceptibility and infectivity times, the ruin probability for insurers, etc.
It has been pointed out that modelling the financial impact of epidemics is significantly different from those caused by regular mortality or diseases.One of main differences is that the speed of infection depends heavily on the number of infected individuals in the population.As a result, stochastic Markovian epidemic models usually take the entire population into account rather than looking at each individual.This approach, even though is very natural and powerful, has one drawback that the state space grows with the population size which makes actuarial calculations impractical.
In this paper, we want to propose an idea that starting with a deterministic compartmental model as an exogenous input, we can construct a Markov multiple state model for the infection time line of an individual such that the behaviour of the population of such independent individuals match the deterministic compartmental model.Such a model can then be used to calculate premiums and reserves for an insurance policy covering epidemics.The advantage of this approach is that we can then utilise the full machinery of Markov multiple state models to study the financial impacts of epidemics at the individual level and then obtain related results for the entire population.To illustrate the idea, we analyse in detail the Markov multiple state model arising from the deterministic SIR model.Consequently, we can clarify and extend some of the results in Feng and Garrido (2011).
The idea of using Markov multiple state models at the individual level to study the financial impact of epidemics is not new.Indeed, many of the quantities introduced by Feng and Garrido in Feng and Garrido (2011) are remarkably similar to standard actuarial functions in Markov multiple state models.Furthermore, Billard and Dayananda constructed a multistage compartmental model for HIV-infected individuals based on waiting time in each compartment and studied its actuarial applications in Billard andDayananda (2014a) and(2014b).Nevertheless, Billard and Dayananda model the waiting times directly, whereas our approach involves setting the transition intensities to match the deterministic compartmental model.
It has come to our attention that our model is a special case of the model used in Hillairet and Lopez (2021) and Hillairet et al. (2022) by Hillairet and Lopez et al. in their study of cyber insurance management.Note that in Hillairet et al. (2022), the authors not only study the SIR model but also extend their analysis to multigroup SIR model as well.Nevertheless, our objective and motivation as well as methodology are significantly different from those of Hillairet and Lopez et al.We also acknowledge that there is a parallel paper (Francis and Steffensen, 2023) by Francis and Steffensen which has the same fundamental ideas as those in our paper.Francis and Steffensen not only treat the SIR model but also generalise it to several related models as well.The results obtained in Francis and Steffensen (2023) are very similar to ours.Although our paper focuses exclusively on the SIR model, we also discuss several topics not included in Francis and Steffensen (2023) such as the distributions of the waiting times, the duration and the final susceptible size of the epidemic.Thus, despite the overlap, we believe our paper makes a genuine contribution to the literature.
This paper is organised as follows.In Section 2, we recall the deterministic SIR model and its basic properties.In Section 3, we construct the Markov multiple state model based on the deterministic SIR model.In Section 4, we discuss the waiting times for an individual to remain susceptible or infected.In Section 5, we analyse the spread of the epidemic in a population.More precisely, we give the distributions of the final susceptible size and the duration of the epidemic.In Sections 6 and 7, we give formulas for the actuarial present values of the traditional insurance benefits and discuss premium and reserve calculations.In Section 8, we provide numerical illustrations for our method using the Eyam data set from Raggett (1982).

The deterministic SIR model
Let us briefly recall the deterministic SIR model in epidemiology.The readers should consult (Brauer et al., 2019) for further description of this model.The population is divided into three compartments Susceptible (S), Infected (I) and Removed (R) where possible transitions are indicated in Figure 1.Let S(t), I(t) and R(t) be the number of susceptible, infected and removed individuals at time t, respectively.Assume that at time 0, the population has N individuals and R(0) = 0. Let us define s(t), i(t) and r(t) as the corresponding proportions of susceptible, infected and removed individuals in the population, respectively.The SIR model states that the dynamics of this epidemic is governed by the following system of differential equations where α and β are positive constants.This is a first-order system of differential equations.Although explicit solutions in terms of elementary functions are not available, we can still numerically solve for (s(t), i(t), r(t)) for any t > 0. The following proposition gives the phase-plane equation for (s(t), i(t)) (see Brauer et al., 2019, p. 37).
Proposition 2.1.The functions s(t) and i(t) are related via the equation Equivalently, (2.3) Proposition 2.2.Let s(∞), i(∞) and r(∞) be the limits of s(t), i(t) and r(t), respectively, as t approaches where s(∞) is the unique solution between 0 and α/β of the equation In particular, both s(∞) and r(∞) are strictly between 0 and 1 and Proof.See Brauer et al. (2019, p. 37).
The following formula for the inverse of s(t) will be used in Subsection 8.6.
Proposition 2.3.The inverse function of s(t) is given by: for Integrating both sides and noting that s −1 (s(0)) = 0 yields the required formula.
The proof of the following proposition can be found in Brauer et al. (2019, p. 40).

The SIR Markov multiple state model
In this section, we will construct a Markov multiple state model that can capture the dynamics of the deterministic SIR model.Consider a random individual in this population.At any given time, this individual can be in either one of the three states 0,1 and 2 which correspond to the three compartments (S), (I) and (R).For t ≥ 0, let X t be the state that this individual is in at time t.We assume that (X t ) t≥0 is a continuous Markov process.Let P be the transition matrix for X t .That is for 0 ≤ z ≤ t, P(z, t) is the matrix whose entries are given by for all i, j ∈ {0, 1, 2}. s(t) = s(0)P 00 (0, t) i(t) = s(0)P 01 (0, t) + i(0)P 11 (0, t) (3.2) r(t) = s(0)P 02 (0, t) + i(0)P 12 (0, t).Equations (3.2) can be written in matrix form as follows (s(t), i(t), r(t)) = (s(0), i(0), r(0))P(0, t). (3.3) Proof.At time 0, there are S(0) susceptible individuals who can get infected at time t with probability P 01 (0, t), and there are I(0) infected individuals who remain infected at time t with probability P 11 (0, t).Therefore, the expected number of infected individuals at time t is S(0)P 01 (0, t) + I(0)P 11 (0, t).
Let us recall the Kolmogorov differential equations for multiple state models (see Haberman and Pitacco, 1998, p. 17).
Theorem 3.3.For a Markov multiple state model with transition matrix P and transition intensity matrix μ, the Kolmogorov forward differential equation is given by The initial condition is P(z, z) equals to the identity matrix.
Corollary 3.6.For the SIR multiple state model, the forward Kolmogorov equations are In addition, for t > 0, we have Theorem 3.7.For the SIR multiple state model, the transition probabilities are given by (3.12) In addition, Proof.The formulas for P 11 (z, t) and P 12 (z, t) follow immediately from μ 12 w ≡ α.We have Furthermore, We can directly verify that α e αw s(w)dw = e αw (s(w where c is any constant.Thus, the formula for P 01 (z, t) follows.
Combining (3.13) and Proposition 2.2 yield the following Corollary.
For z > t * = s −1 (α/β) and i(z) sufficiently small, for t > z we have Remark 3.9.The SIR multiple state model defined by Definition 3.5 is a special case of the model constructed by Hillairet and Lopez in (2021).
• We have showed that it is the only Markov multiple state model that satisfies (3.7).That is, the probability a randomly chosen individual being in each compartment matches with the proportion of the population in the corresponding compartment which is given by the deterministic SIR model.• We have also determined the transition probabilities of the SIR multiple state model and show that they can be given by simple expressions of (s(t), i(t), r(t)) (see (3.13)).Note that we could also obtain (3.13) by solving (3.4).
Remark 3.10.The technique that used to construct the SIR multiple state model can be extended to other deterministic compartment models as well.Suppose we are given a deterministic compartment model with compartments 0, 1, . . ., M. Let s(t) = (s i (t)) M i=0 be the row vector whose components are proportions of the population in each compartment.Assume that the dynamics of the population are characterised by the system of differential equations where F is a function of t and s(t).Consider a random individual in this population.Let X t be the compartment that this individual belongs to at time t.We want (X t ) t≥0 to follow a Markov multiple state model that matches the behaviour of the model (3.14) in aggregate.That is for t ≥ 0 . Let μ t be the transition intensity matrix.By the Chapman-Kolmogorov equation, Differentiating (3.15) and using (3.6) yields As a result, μ t satisfies the following equation For the deterministic SIR model, (3.16) reduces to (3.9) which in turn yields (3.8).
Remark 3.11.The fact that many of the constructions in Feng and Garrido (2011) are remarkably similar to the actuarial functions in Markov multiple state models, for example, forces of transition, EPVs of annuity and insurance benefits, etc. suggests that there might be a Markov multiple state model underlying all of these constructions.Moreover, this model should describe the state of a random individual in the population during the epidemic rather than the state of the entire population.The chance that an individual getting infected depends on the current state of the population.Unfortunately, this information is not available if we only consider one individual.Therefore, we use the expected state of the population which is given by the deterministic SIR model to approximate the actual state of the population.We are not claiming that this model actually describes the infection mechanism of the epidemic, but it is a reasonable approximation.

The distribution of waiting times
In this section, we will analyse the individual's waiting times in each state for the SIR multiple state model.The results derived here are useful for simulation (see Subsection 8.6).Consider a random individual in the population.Let T (0) and T (1) be the waiting times in states 0 and 1, respectively.If the individual is in state 1 at time 0, then we set T (0) equal to zero.Furthermore, let T = T (0) + T (1) be the time until removal or time to absorption.The following proposition gives the distributions of T (0) , T (1) and T .
Remark 4.2.The random variables T (0) , T (1) and T all have mixed distributions with mass probabilities at ∞, 0 and ∞, respectively, and continuous density elsewhere.The fact that P(T (0) = ∞) = s(∞)/s(0) can be interpreted as a susceptible individual in this population has a positive chance of never getting infected.Consequently, neither T (0) nor T has finite moments.However, conditioning on T (0) < ∞, all of their moments are finite as showed by the following proposition.

The duration and final susceptible size of the epidemic
In this section, we will investigate the spread of the epidemic in the whole population.Suppose our population consists of S 0 susceptible and I 0 infected individuals initially where S 0 , I 0 > 0. Let N = S 0 + I 0 .We assume that everyone's state is independent from one another and follows the Markov process described in Definition 3.5 with s(0) = S 0 /N and i(0) = I 0 /N.Let S(t) and Ĩ(t) be the number of susceptible and infected individuals at time t.The spread of the epidemic in the population is given by the process ( S(t), Ĩ(t)) t≥0 with ( S(0), Ĩ(0)) = (S 0 , I 0 ).We observe that ( S(t), Ĩ(t)) t≥0 is a continuous Markov process on the state space Proposition 5.1.S(t) follows the binomial distribution B(S 0 , P 00 (0, t)) and Ĩ(t) is the sum of two independent binomial random variables B(S 0 , P 01 (0, t)) and B(I 0 , P 11 (0, t)).In particular, E( S(t)) = Ns(t), E( Ĩ(t)) = Ni(t), Cov( S(t), Ĩ(t)) = −S 0 P 00 (0, t)P 01 (0, t).
Let be the distribution function of the standard normal distribution.For N large and 0 < α < 1, the (1 − α)-confidence intervals of s(t) and i(t) are contained in the intervals respectively.In particular, as N → ∞, S(t)/N and Ĩ(t)/N converge to s(t) and i(t) in probability.
Proof.For i = 0, 1, 2, let S0i (t) be the number of individuals in state i at time t having been susceptible at time 0. Let I 11 (t) be the number of infected individuals at time t who are infected at time 0. We have From the independence assumption between individuals, I 11 (t) ∼ B(I 0 , P 11 (0, t)) and In addition, I 11 (t) is independent of the S0i (t) for i = 0, 1, 2. The formulas for the expected values and covariance of S(t) and Ĩ(t) then follow.The formulas for the confidence intervals can be derived using normal approximations to the binomial distributions.
Remark 5.2.The final susceptible size of the epidemic is the total number of susceptible individuals remained at the end of the epidemic which is given by S(∞).It follows the binomial distribution B(S 0 , s(∞)/s(0)).The final size of the epidemic is given by S 0 − S(∞).
Remark 5.5.Using our notations, the general stochastic epidemic model proposed by Bartlett in Bartlett (1949) is characterised by the following relations (see Andersson andBritton, 2000, p. 14 or Britton et al., 2019, p. 26) (5.4) Compared with Proposition 5.4, the key difference is in the infection intensity where βS t I t /N is replaced by βS t i(t).In our model, the infection intensity is given exogenously from the SIR deterministic model, whereas in the general stochastic epidemic model (5.4), the infection intensity is determined endogenously by the actual number of susceptible and infected individuals in the population.Nevertheless, for large N, Ĩ(t)/N ≈ i(t) in probability.Hence, our model can be considered an approximation of the general stochastic epidemic model.It would be interesting to work out a comprehensive comparison between the two models, and we would like to revisit this question at a future date.
Remark 5.6.The independence assumption among members of the population allows for mathematical tractability.We offer the following heuristic argument to partially justify it.For another point of view, the readers are referred to Hillairet and Lopez (2021) and Hillairet et al. (2022).It is plausible to assume independence between individuals because the dependence nature of the epidemic has already been encoded in the individual's infection intensity.Let us elaborate by considering the following instances.
• Suppose there are many infected individuals in the population at time t, that is, Ĩ(t) is large.Then (5.1) implies that i(t) is also likely to be large.As the infection intensity for an individual is βi(t), a randomly chosen individual is more likely to get infected even though all individuals are independent.• Suppose we observe no infected individuals at time t.According to our model, further infection is possible which is counter intuitive.However, by (5.1), if Ĩ(t) = 0 then it is likely that i(t) is small.The infection intensity is also small and even though further infection is possible, it is also unlikely.
Hence, although the independence assumption is unrealistic, this model can still capture the dynamics of the epidemic to a certain extent because the aggregate level of infection in the population indirectly influences every individual's chance of getting infected.
We now turn to the question of the duration of the epidemic.
Definition 5.7.The duration of the epidemic, D, is defined as the first instance that there are no infected individuals and there are no further infections thereafter.In other words, D = inf{t ≥ 0: Ĩ(t) = 0, and there are no further infections after t.} Equivalently, the duration can be defined as where T i is the time to absorption of the ith individual as defined in Section 4.
Proposition 5.8.The distribution function of D is given by P(D ≤ t) = P 00 (0, ∞) + P 02 (0, t) S 0 1 − e −αt I 0 . (5.5) Proof.The epidemic is over by time t if and only if at time t • all I 0 infected individuals have been removed; • and all S 0 susceptible individuals have either been removed or will never be infected.
Let us give the asymptotic distribution of D as N → ∞ such that s(0) and i(0) are constants.Let D 0 be the maximum value of the time to absorption for the S 0 individuals who are susceptible at time 0 and are removed eventually.Let D 1 be the maximum value of the time to absorption for the I 0 individuals who are infected at time 0. Thus, D 0 is the maximum value of S 0 independently identically distributed (i.i.d.) random variables (Y i ) S 0 i=1 whose common distribution, F(t), is given by F(t) = P 00 (0, ∞) + P 02 (0, t). (5.6) On the other hand, D 1 is the maximum value of I 0 i.i.d.exponential random variables with rate α.
It is clear that D is the maximum value of D 0 and D 1 .The following proposition gives the limiting distributions of D 0 and D 1 .
Proposition 5.9.There exists a constant b such that both (α − βs(∞))D 0 − log (S 0 ) − b and αD 1 − log (I 0 ) converge in distribution to the standard Gumbel distribution as N → ∞.Furthermore, the variable D/ log (N) is bounded in probability.More precisely, (5.7) Proof.The fact that the normalised maximum of i.i.d.exponential random variables converges in distribution to the standard Gumbel distribution as sample size increases to infinity is well-known (see Embrechts et al., 1997, p. 125).From Corollary 3.8, 1 − F(t) ∼ e −(α−βs(∞))t .Thus, the random variable Y i is tail equivalent to the exponential random variable with rate α − βs(∞).From Embrechts et al. (1997, Proposition 3.3.28),the norming constants of the exponential distribution can be used for those of F(t) with the limiting distribution shifted by a constant.As D 0 and D 1 are independent, P(D ≤ t) = P(D 0 ≤ t)P(D 1 ≤ t).By direct calculation, (5.8) Remark 5.10.It is important to emphasise that the duration of the epidemic is not given by inf{t ≥ 0 : The reason is that even if Ĩ(t) = 0, further infections are still possible after time t.If that is the case, we say the epidemic restarts after time t.Since we cannot rule out the possibility that the epidemic may restart as long as there remain susceptible individuals, the duration is not an observable quantity.
For the rest of this section, we will estimate the duration of the epidemic under different sets of available information.Let us start with the following scenario.Suppose z is the last instance we observe infected individuals and t is the first instance we observe no infected individuals.That is, for z < t ( S(z), Ĩ(z)) = (S z , I z ), ( S(t), Ĩ(t)) = (S t , 0) (5.9) where S z , S t and I z > 0. Under these circumstances, the epidemic is not over at time z, but it is not clear whether it is over at time t.Proposition 5.11 below describes the conditional distribution of the duration given (5.9).Thus, we can estimate how long it takes until the epidemic is finally over having observed no infections for the first time.
Proposition 5.11.For w ≥ 0, the conditional distribution of D is given by ⎧ ⎨ ⎩ P 00 (t, ∞) St P 02 (z,z+w) (5.10) Note that the distribution above is continuous but not differentiable at w = t − z.
• For w > t − z, (5.10) gives the probability that the duration is over at time z + w > t based on the state of the population at time t.Then the formula is just a variation of (5.5).• For w < t − z, (5.10) gives the probability that the duration is over at time z + w < t given that there remain S t susceptible individuals at the end of the epidemic.Then the required probability is the product of the probability that all S t susceptible individuals at time t remain susceptible forever with the probability that all S z − S t susceptible individuals at time z being removed by time z + w given they are removed by time t with the probability that all I z infected individuals at time z being removed by time z + w given they are being removed by time t.
Corollary 5.13.Suppose that in addition to the information (5.9), we also know that there are no further infections after time t.For example, all the remaining susceptible individuals are removed for some other reasons.Then the distribution of the duration under the combined information is (5.11) Remark 5.15.Proposition 5.14 whose proof is omitted is applicable in the following situation.Suppose we observe that the infection number is zero at some time z after the peak of the epidemic.How long do we expect to wait until the epidemic is finally over ?From Proposition 5.1, Ĩ(z) = 0 implies that i(z) < 2/ √ N with at least 99% probability.Thus, if the population is large, we can be fairly confident that the condition of the Proposition 5.14 is applied.Then (5.11) gives an approximation to the expected duration of the epidemic from time z.Equation (5.11) is quite intuitive.The factor S z βi(z)/(α − βs(∞)) approximates the expected number of infections after time z and the factor 1/(α − βs(∞)) + 1/α approximates the expected time to absorption for individuals who get infected after time z.Hence, even though the epidemic can continue after zero infection level is observed, we do not expect it to last very long or be very severe.
Remark 5.16.Let us make a brief comparison between the model constructed in this section and the general stochastic epidemic model.According to Andersson and Britton (2000) Theorem 2.2, for the general stochastic epidemic model the distribution of the final size is typically bimodal.That is either a few individuals are infected or a fairly large number are infected eventually, whereas for our model the distribution for the final size is binomial.In addition, from Andersson and Britton (2000, p. 34), the asymptotic distribution of the duration is either short or grows as log (N), whereas for our model the duration grows as log (N).

Valuations of insurance benefits
In this section, we derive formulas for the expected present values (EPV) of insurance and annuity benefits in the SIR multiple state model.The results of this section will be used to analyse premiums and reserves on insurance products in the next section.

Review of general results
For the readers convenience, we collect some useful results for the EPV calculations in a general Markov multiple state model in this subsection.For comprehensive treatments, the readers are referred to Dickson et al. (2020) and Haberman and Pitacco (1998).Definition 6.1.Consider a Markov multiple state model with M + 1 states 0, 1, . . ., M. The transition matrix is P and the force of transition matrix is μ t .Let δ be the constant force of interest and v = e −δ .Recall the following standard actuarial functions.
• For any i,j, let āij (z, n) be the EPV of an insurance benefit that pays a continuous annuity at the rate of $1 per annum between time z and n to an individual, who is at state i at time 0, as long as he is in state j.
• For i = j, let Āij (z, n) be the EPV of an insurance benefit that pays $1 at time t for z ≤ t ≤ n if the policyholder who is at state i at time 0 moves to state j at time t.
• Let Bij (z, n) be the EPV of an insurance benefit that pays $1 at time t for z ≤ t ≤ n if the policyholder who was at state i at time 0 transfer out of state t at time t.
Lemma 6.2.We have Proof.Combining (3.6) and integration by parts yields Remark 6.4 Consider an insurance policy in force between time z and n for an individual who is currently in state i.He will deposit $1 to an insurance fund the moment he enters state j.In return, he will receive a continuous annuity of rate δ per annum as long as he is still in state j.His deposit will also be returned either on the moment he leaves state j or on the expiration of the policy.Then Lemma 6.3 states that for such a policy the EPV of the cost to the insurer is equal to the EPV of the fund paid to the insurer.
Corollary 6.5.If j is an absorbing state, then

Applications to the SIR multiple state model
Let us apply the results of the previous subsection to derive formulas for the standard actuarial functions in the context of the SIR multiple state model.Proposition 6.6.For the SIR multiple state model, we have In addition, Furthermore, the following equations hold Proof.The formulas for Ā12 (z, n), Ā02 (z, n), ā11 (z, n) and ā12 (z, n) can be derived directly from the fact that μ 12 t = α and P 11 (z, t) = e −α(t−z) .From Proposition 6.3, By Corollary 6.5, On the other hand, we have just showed that Ā02 (z, n) = α ā01 (z, n).Thus, we deduce (6.7).Equation (6.6) follows from (6.1) and (6.7).Finally, combining (6.2) with (6.6) yields (6.8).
Definition 6.7.Following Feng and Garrido (2011), we define the following actuarial functions: In addition, Let us refer to these actuarial functions as the aggregate actuarial functions.These functions are useful to calculate reserves and premiums for an insurance scheme consisting of all the population including those who are already infected or removed.For example, this insurance scheme might be part of a compulsory national insurance plan in which the entire population is required to participate.In this context, the aggregate actuarial functions have the following economic interpretations.If each susceptible (or infected or removed) individual is entitled to a continuous annuity at the rate of $1 per annum between time z and n, then the EPV of the cost to the insurer per individual is given by ās (z, n) (or āi (z, n) or ār (z, n)), respectively.Similarly, if each individual is compensated $1 immediately upon getting infected (or removed) between time z and n, then Āi (z, n) (or Ār (z, n)) represents the EPV of the cost to the insurer per individual.Lemma 6.8.From (3.4), the aggregate actuarial functions can be expressed in terms of the standard actuarial functions as follows.
Lemma 6.9. (6.9) Proof.Equation (6.9) can be derived from integration by parts.It has the following economic interpretation.Consider an insurance policy between time z and n where each individual deposits $1 to an insurance fund either at time z if he is already removed or the moment he is removed.In return, he is entitled to a continuous annuity of rate δ per annum until time n when the $1 deposited earlier will be returned.The EPV of the cost to the insurer per individual is given by The EPV of the fund paid to the insurer per individual is the sum of the payments of those who are already removed at time z and the payments by those who are not yet removed at time z.Thus, it is equal to As a result, (6.9) simply says that the EPV of the cost to the insurer is the same as the EPV of the fund paid to the insurer.Proposition 6.10.For the SIR multiple state model, the following identities hold (6.12) Proof.Different versions of these identities are proved in Feng and Garrido (2011).We note here that they can be derived from Proposition 6.6 and Lemma 6.8.Remark 6.11.The economic interpretations of Proposition 6.10 can be obtained by equating the EPV of the cost to the insurer and the EPV of the fund paid to the insurer for suitably chosen insurance policies.As an example, let us interpret (6.10).Consider the insurance policy between time z and n in which each susceptible or infected individual contributes $1 at time z.In return, he will receive a continuous payment of rate δ per annum until he is either removed or until the expiration of the policy when he is also returned his deposit.The other identities can be interpreted similarly (see Feng and Garrido, 2011 for more information).

Premiums
Consider the following insurance policy which is in force for n years: • If the policyholder is susceptible, then he pays a continuous premium at the rate of P per annum; • If the policyholder is infected, then he receives immediately a payment of S 1 and a continuous hospitalisation benefit at the rate of H per annum; • If the policyholder is removed, then he receives immediately a payment of S 2 .Note that n should not exceed 1 otherwise constant removal intensity might not be appropriate.We have the options of either work at the individual or aggregate level.That means we can either determine the premium by considering a single policy consisting of one susceptible policyholder or by considering a portfolio consisting of S 0 susceptible and I 0 infected policyholders.The latter is the approach taken in Feng and Garrido (2011).Nevertheless, being able to work at the individual level allows more flexibility, for example, if only a subset of the population is insured.
Proposition 7.1.Under the equivalence principle, the premium rate P at the individual level is The premium rate per individual π at the aggregate level is given by Proof.Let us prove (7.2) as (7.1) is clear.Let B (0) and B (1) be the EPVs of future benefits for an individual who is susceptible and infected at time 0, respectively.Then, Therefore, π = (s(0)B (0) + i(0)B (1) )/(s(0)ā 00 (0, n)) which can be rearranged to give (7.2).

Reserves
For reserves we differentiate 4 different types • prospective reserves at the individual or aggregate level, • retrospective reserves at the individual or aggregate level, where prospective means we are taking the EPV of the difference between future benefits and future premium payments, whereas retrospective means we are taking the accumulating value of the premium payments already received less that of the benefits already paid.
Proposition 7.2.Consider the insurance policy described at the beginning of this section.
The prospective reserves at the individual level at time t for a susceptible policyholder and an infected policyholder are, respectively, The retrospective reserves at the individual level at time t for a susceptible policyholder and an infected policyholder are, respectively, Furthermore, ( t V (0) , t V (1) ) satisfy the differential equation Proof.The formulas for the prospective and retrospective reserves are clear.We note that (7.7) is a special case of Thiele's differential equation for prospective reserves (see Dickson et al., 2020, p. 325) and (7.8) can be obtained by direct differentiation.
Proposition 7.3.Let W R (t) and W P (t) be the retrospective and prospective reserves per individual at the aggregate level.Then The prospective and retrospective reserves W P (t) and W R (t) are related via W R (t) = W P (t) − e δt W P (0).
In addition, both W R (t) and W P (t) satisfy the differential equation (7.9) Proof.The formulas for the prospective and retrospective reserves are clear due to the interpretation for the aggregate actuarial functions.The differential Equation (7.9) follows by direct differentiation.
Remark 7.4.Proposition 7.3 is proved in Feng and Garrido (2011) for retrospective reserves.Suppose that P is set equal to π .Then we have the following decompositions of aggregate reserves.

Parameter estimation
Given the values (S t i , I t i ) of the number of susceptible and infected individuals ( S(t i ), Ĩ(t i )) at time t i for i = 1, . . ., M where t 1 = 0, we can form the conditional log-likelihood function as follows.
Given α, β, numerical methods can be employed to solve the differential Equations (2.1) with initial values The transition probabilities can be obtained from (3.13).Then (8.1) can be calculated via Proposition 5.3.The graph of the function l(α, β) is given in Figure 2. The   These estimates are slightly different from those obtained by Raggett (1982) who used a different method of estimation.Note that (8.1) does not take into account the fact that the epidemic is already over on October 20.To take that into account, the term S t M log (s(∞)/s(t M )) needs to be added to (8.1).The modified log-likelihood function becomes l(α, β) = l(α, β) + S t M log (s(∞)/s(t M )). (8.5) Note that s(∞) can be calculated from Proposition 2.2.The estimates obtained from maximizing (8.5) are α0 = 35.090and β0 = 56.804.However, we do not use these estimates as we would like to treat the data as part of an ongoing epidemic.Figure 3 shows the graphs of the estimated s(t) and i(t) and the observed data.
Hence, at the start of the epidemic a susceptible individual has a 33.46% chance of never getting infected and 66.54% chance of getting removed eventually.From (2.7), the infection intensity peaks at t * = 0.12 (years) when approximately 10% of the population is infected.Using (3.13), we can calculate the transition probabilities P(z, z + t) for any z, t.The graphs of P 0i (z, z + t) for i = 0, 1, 2 are given in Figure 5.

Duration estimation
Note that the last time that we could observe infected individuals is when z = 0.2521 and the first time we could observe no infected individuals is when t = 0.3370.Based on this information alone, the distribution of the duration D of the epidemic is given by Proposition 5.11.On the other hand, if we could be sure that there are no further infections after time t = 0.3370, for example, all susceptible individuals are to die from some other causes, then the distribution of D is given by Corollary 5.13.The graphs of the distributions of D − z in both cases are given in Figure 6.The mean and standard deviation of the duration in the first case are 0.4653 and 0.0528, respectively.On the other hand, the mean and standard deviation of the duration in the second case are 0.3317 and 0.0048, respectively.
From (7.2), the annual aggregate premium rate is π = 49.5219.As pointed out in Feng and Garrido (2011), the aggregate retrospective reserves are usually negative since the infection rate decreases towards the end of the epidemic.Hence, to not have policyholders surrender policies once the peak of the epidemic has passed and running at a deficit, insurance providers should charge premiums at the higher rate than that determined by the equivalence principle.Adopting the algorithm in Feng and Garrido (2011), we obtained a premium rate of $113.90.From Figure 7, we observe that with the higher premium rate, the aggregate reserve is always non-negative.Thus, the insurer is protected from loss due to early surrender.However, at the end of the epidemic, there is a surplus of $26.79 which should be returned to all remaining susceptible individuals.

Simulations
In this subsection, we describe the simulation procedure for the process (X t ) t≥0 described in Section 3. Suppose X 0 = 0. From Proposition 4.1, T (0) and T (1) can be simulated by the below algorithm.Note that the inverse of s(t) is given by (2.6).
Figure 9 shows one possible realisation of the dynamics of the population.As expected, the observed number of susceptible and infected individuals fluctuated around the corresponding expected numbers.In addition, the first time zero infection was observed is around 0.36 years.The epidemic then restarted around 0.41 years and eventually ended around 0.45 years.We note that even though the epidemic continued for almost 0.1 years after zero infection was observed, the number of further infections is only 1. Thus, the restart of the epidemic in this case does not significantly alter the dynamics of the epidemic.This fact is consistent with Proposition 5.14 and Remark 5.15.

Conclusions
In this paper, we propose a method of studying the financial repercussions of epidemics by combining deterministic compartment models in epidemiology and Markov multiple state models in life insurance.For illustrative purposes, we focus on the classical SIR model.The model that we analyse, which is a special case of Hillairet & Lopez's model in Hillairet and Lopez (2021) and Hillairet et al. (2022) and which is also studied and generalised by Francis and Steffensen in Francis and Steffensen (2023), has clarified and extended some of the constructions and results of Feng and Garrido in (2011).In particular, our approach is more flexible in dealing with small portfolios and more complex insurance products.We also study the spread of an epidemic in a population and provide concrete descriptions of the distributions of the final susceptible size and the duration of the epidemic albeit under a very strong independence assumption.Clearly, further work is required to relax that assumption and to arrive at a more realistic model.Nevertheless, we believe our model having some desirable features can serve as a starting point for further research and the techniques used in this paper are applicable in other aspect of epidemic risk management for, for example, resource planning and allocation as in Chen (2021).

Figure 6 .
Figure 6.Distribution functions of the duration.

Figure 9 .
Figure 9.One possible realisation of the population dynamics.
To connect the deterministic SIR model with this Markov multiple state model, we will match the observed number of individuals in each compartment in the former model with the corresponding expected numbers of individuals in each compartment in the latter model.This leads us to the following proposition that allows us to construct the transition matrix from the values of S(t), I(t) and R(t).
≤ S t ≤ S z and 0 ≤ I t ≤ S z + I z .It is zero otherwise.Proof.Let k be the number of individuals who are in state 0 at time z but are in state 1 at time t.Therefore, • there are S t , k and S z − S t − k individuals who are in state 0 at time z and are in states 0,1 and 2, respectively, at time t.• there are I t − k and I z − (I t − k) individuals who are in state 1 at time z and are in states 1 and 2, respectively, at time t.Proof.From Proposition 5.3, the first probability is given by S t P 00 (t, t + t) St−1 P 01 (t, t + t)P 11 (t, t + t) It , whereas the second probability is given by I t P 00 (t, t + t) St P 11 (t, t + t) It−1 P 12 (t, t + t).
Raggett (1982)illustration, let us consider the Eyam data set fromRaggett (1982).This data set describes the evolution of the bubonic plague in the village of Eyam in 1665-1666.From a population of 261 in June 1666, there were only 83 survivors in October 1666 when the plague ended.Note that we have rounded non-integral values to the nearest integers.
) are the values calculated from the data.The initial estimates obtained from minimizing (8.3) are α0 = 34.739and β0 = 56.441.The maximum likelihood estimates of α and β are given by M k=1 ŝ(t k ) − s(t k ) 2 + M k=1 î(t k ) − i(t k ) 2 (8.3)where s(t k ), i(t k ) are the values obtained from solving (2.1) and ŝ(t k ), î(t k