1. Introduction
Risk sharing is a long-standing and continually evolving research topic in actuarial science. In the traditional centralized insurance model, risk sharing is relatively straightforward as it involves only two parties: the insurer and the insured. In this framework, the optimal risk sharing problem is conventionally known as the optimal (re)insurance problem. Following the seminal work of Borch (1960) and Arrow (1963), this topic has been extensively studied under various constraints and from various perspectives. Readers are referred to Cai and Chi (2020) for a comprehensive survey of recent developments.
Building on these works, more general problems of risk sharing among multiple (re)insurers have also been considered. Asimit et al. (2013) and Cong and Tan (2016) addressed optimal risk sharing among multiple reinsurers based on some specific risk measures. Subsequently, Boonen et al. (2016) studied optimal reinsurance in the presence of multiple reinsurers based on a general utility function. Asimit and Boonen (2018) discussed optimal insurance contract design arising from bargaining among multiple insurers. They discussed both risk sharing and premium allocation among insurers through a game-theoretic approach that allows for heterogeneous risk measures across insurers. Lin et al. (2023) studied optimal reinsurance in the framework of stochastic game theory. They considered a Stackelberg model to analyze the noncooperative game between one insurer and two reinsurers.
In recent years, nontraditional insurance markets, such as peer-to-peer insurance and distributed insurance, have been growing rapidly. These developments stimulate novel insurance models where risks are shared among multiple participants rather than between just two parties as in the traditional insurance models. This feature presents a completely new framework and calls for more profound studies on the optimal risk sharing strategies.
Some pioneering studies have been carried out. Denuit and Dhaene (2012) propose the principle of conditional mean risk sharing (CMRS). They show the advantage of the CMRS principle in reducing risks in the sense of convex order and establish its Pareto optimality. The application of the CMRS principle is explored in the context of peer-to-peer insurance in Denuit (2020) and Denuit et al. (2021), where a list of “conservation properties” and “improvement properties” is further established. The desirability of the CMRS principle is further studied by Jiao et al. (2022) through an axiomatic approach. While the CMRS principle possesses many desirable properties, it has a relatively restrictive optimization criterion (in the sense of convex order) and is difficult to calculate for some probability models. To address these limitations, researchers have started to focus on linear risk sharing strategies, under which the post-transfer risk of each participant is a linear combination of the pre-transfer risks in the pool. In this context, researchers study the optimal sharing ratios under less restrictive criteria, such as minimizing the total (weighted) variance in Feng et al. (2020), maximizing the “mutual-aid-efficiency” in Abdikerimova and Feng (2022), and maximizing the weighted sum of the expectation less the weighted sum of the second moments of the reserves for a multi-period problem in Abdikerimova et al. (2024). For more discussion on linear risk sharing strategies, readers are referred to Liu et al. (2022) and Feng (2023).
The assumption of linearity of the risk sharing strategies brings a computational advantage. It reduces the question of finding the optimal risk sharing strategy to finding the optimal sharing ratios, which can usually be derived explicitly or calculated through a numerical algorithm. However, the choice of a linear risk sharing strategy is yet to be theoretically justified. In other words, it remains unknown whether the linear risk sharing strategy is optimal over all possible risk sharing strategies, linear and nonlinear. In this paper, we aim to study the optimality of the linear risk sharing strategy in the context of peer-to-peer insurance. Specifically, we prove that, if the optimization goal is to minimize the total variance, the optimal strategy is to linearly share the residual risks. Although this result confirms the optimality of the linear form, it does not provide direct justification for the linear risk sharing strategies that have been studied in the literature. Specifically, the optimal linear risk sharing should be based not on the original risks but on their residual versions, where the residual version of a risk refers to the risk net of its expected value. The emergence of the residual form is not surprising, as it is rooted in the demand to maintain actuarial fairness. In addition to identifying the optimal risk sharing form, we also incorporate some constraints in the formulation of the optimization problems to reflect desirable properties required by the market. With those constraints, the optimal strategies turn out to favor market development, for example by incentivizing participation and guaranteeing fairness.
2. Problem formulation
Consider a peer-to-peer insurance model involving $n$ participants. Let $C_{1}, \ldots, C_{n}$ represent the potential loss random variables. Suppose the losses have finite standard deviations $\sigma_1, \ldots, \sigma_n$. Without loss of generality, assume $0 \lt \sigma_1 \lt \cdots \lt \sigma_n$. In this paper, these standard deviations are sometimes written as $\sigma(C_{1}), \ldots, \sigma(C_{n})$ to indicate the underlying random variable. These losses/risks, $C_1, \ldots, C_n$, referred to as pre-transfer losses/risks, are to be pooled and redistributed to the participants. The losses/risks after redistribution are referred to as post-transfer losses/risks and are denoted as $L_{1}, \ldots, L_n$. Furthermore, denote the aggregate risk by $S=\sum_{i=1}^n C_i$ and its variance by $\sigma_S^2$.
In the existing studies, the post-transfer risks are assumed to take a certain functional form of the pre-transfer risks. In this paper, to achieve flexibility to the largest extent, we do not impose any specific functional form on $L_1, \ldots, L_n$ except for two fundamental assumptions, self-retaining and actuarial fairness, which are described below.
- A1: (Self-retaining) $\sum\limits_{i=1}^{n}L_{i} = \sum\limits_{i=1}^{n}C_{i}$.
- A2: (Actuarial fairness) $E[L_{i}] = E[C_{i}]$ for $i=1, \ldots, n$.
All risk sharing strategies satisfying A1 and A2 comprise the admissible strategy class, denoted as $\mathcal{C}$. Mathematically,
\begin{equation*}\mathcal{C} = \left\{(L_1, \ldots, L_n)\left|\sum\limits_{i=1}^{n}L_{i} = \sum\limits_{i=1}^{n}C_{i}\quad \mbox{ and }\quad E[L_{i}] = E[C_{i}], \, \forall i=1, \ldots, n \right.\right\}.\end{equation*}
Both assumptions are straightforward. Self-retaining indicates that the risks are to be digested within the group, and actuarial fairness implies that a participant should not expect to lower the expected value of the risk through the risk redistribution.
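As a minimal numerical sketch of A1 and A2 (not part of the original formulation), the following Python snippet checks both assumptions scenario by scenario. The exponential loss model, the names `simulate_losses` and `is_admissible`, and the use of sample means in place of expectations are all assumptions made for illustration.

```python
import random
from statistics import fmean

def simulate_losses(n_participants, n_scenarios, seed=0):
    """Hypothetical pre-transfer losses: independent exponentials with means 1, 2, ..."""
    rng = random.Random(seed)
    return [[rng.expovariate(1.0 / (i + 1)) for _ in range(n_scenarios)]
            for i in range(n_participants)]

def is_admissible(C, L, tol=1e-9):
    """Check A1 (sum preserved in every scenario) and A2 (sample means preserved)."""
    n_scenarios = len(C[0])
    a1 = all(abs(sum(Li[k] for Li in L) - sum(Ci[k] for Ci in C)) <= tol
             for k in range(n_scenarios))
    a2 = all(abs(fmean(Li) - fmean(Ci)) <= tol for Li, Ci in zip(L, C))
    return a1 and a2

C = simulate_losses(3, 1000)
S = [sum(col) for col in zip(*C)]   # aggregate loss per scenario
mean_S = fmean(S)

# Keeping one's own loss is trivially admissible.
no_sharing = [row[:] for row in C]
# Splitting S equally satisfies A1 but shifts the means, so A2 fails here.
equal_split = [[s / 3 for s in S] for _ in range(3)]
# Splitting only the centered aggregate restores A2.
residual_split = [[fmean(Ci) + (s - mean_S) / 3 for s in S] for Ci in C]
```

In this sketch, `no_sharing` and `residual_split` pass the check while `equal_split` does not, because the simulated participants have different expected losses.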
What a participant should expect is to lower his/her level of riskiness through the diversification effect induced by pooling. While there are many measures to describe the level of riskiness, the most conventional measure is the variance, as studied in the literature. Note that in the presence of actuarial fairness, minimizing the variance is equivalent to minimizing the squared loss. Squared losses are frequently used as penalty functions in other fields. For example, in parameter estimation, Makov (1995) demonstrates that the Fisher-weighted squared-error loss function exhibits robustness in both risk and posterior loss. In actuarial reserving, the mean squared error of prediction (MSEP) is commonly employed as the loss function for prediction error; see Wuthrich and Merz (2008) for more details. In light of these conventions, we continue to set the criterion to minimizing the total variance and formulate the unconstrained risk sharing problem/model as follows:
 \begin{equation}\begin{aligned}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}}} \quad & \sum\limits_{i=1}^{n}Var[L_{i}]. \\\end{aligned}\end{equation}
Feng et al. (2020) study the optimal risk sharing problem among all linear strategies. To make a comparison to Problem (2.1), we rephrase their main problem as follows:
\begin{equation}\begin{aligned}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}_{\ell}}} \quad & \sum\limits_{i=1}^{n}Var[L_{i}], \\\end{aligned}\end{equation}
where
\begin{equation*}\mathcal{C}_{\ell} = \left\{(L_1, \ldots, L_n) \in \mathcal{C} \left| L_i = \sum_{j=1}^n {\alpha_{ij}} C_j, \, \forall \, i=1, \ldots, n\right.\right\}.\end{equation*}
Clearly, $\mathcal{C}_{\ell} \subset \mathcal{C}$ and thus Problem (2.1) is a direct generalization of Problem (2.2) in the sense that it enlarges the linear optimization space considered in Problem (2.2). This generalization proves to bring improvements in the optimal risk sharing strategy. Details will be discussed in Section 4.
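To make the membership requirement of $\mathcal{C}_{\ell}$ concrete, note that for $L_i = \sum_j \alpha_{ij} C_j$ it is sufficient for A1 that every column of $(\alpha_{ij})$ sums to one, while A2 becomes $\sum_j \alpha_{ij} E[C_j] = E[C_i]$. The sketch below checks these two conditions; the function name `linear_whole_is_admissible` and the numerical inputs are hypothetical and serve only as an illustration.

```python
def linear_whole_is_admissible(alpha, mu, tol=1e-12):
    """Sufficient check that L_i = sum_j alpha[i][j] * C_j satisfies A1 and A2.

    alpha: n x n list of sharing coefficients; mu: list of E[C_j] (assumed known).
    """
    n = len(mu)
    # A1: sum_i L_i = sum_j C_j holds whenever each column of alpha sums to one.
    a1 = all(abs(sum(alpha[i][j] for i in range(n)) - 1.0) <= tol
             for j in range(n))
    # A2: E[L_i] = sum_j alpha[i][j] * E[C_j] must equal E[C_i].
    a2 = all(abs(sum(alpha[i][j] * mu[j] for j in range(n)) - mu[i]) <= tol
             for i in range(n))
    return a1 and a2

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
equal_split = [[1.0 / 3.0] * 3 for _ in range(3)]

no_sharing_ok = linear_whole_is_admissible(identity, [1.0, 2.0, 3.0])
equal_split_unequal_means = linear_whole_is_admissible(equal_split, [1.0, 2.0, 3.0])
equal_split_equal_means = linear_whole_is_admissible(equal_split, [2.0, 2.0, 2.0])
```

With unequal expected losses the equal split violates A2, which shows how actuarial fairness ties the coefficients of a linear strategy on the whole risks to the means.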
While the setting of Problem (2.1) is straightforward and intuitive, it is not sufficient to capture the market demand in reality. One property to be expected from a desirable risk sharing principle is variance reduction, described as follows:
$\textrm{P}_{vr}$: (Variance reduction) $Var[L_i] \le Var[C_i]$ for $i=1, \ldots, n$.
This property indicates that the post-transfer loss is less risky than the pre-transfer loss, reflecting participants’ demand to benefit from risk pooling. To incorporate this demand, we formulate the following problem:
\begin{equation}\begin{aligned}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}_{vr}}} \quad & \sum\limits_{i=1}^{n}Var[L_{i}],\end{aligned}\end{equation}
where
\begin{equation*}\mathcal{C}_{vr} = \left\{\left.(L_1, \ldots, L_n) \in \mathcal{C} \right| Var[L_i] \le Var[C_i], \, \forall \, i=1, \ldots, n\right\}.\end{equation*}
The problem with the variance reduction constraint has been studied in the literature. For example, Feng et al. (2020) set up an algorithm to numerically solve the variance minimization problem with the “reduction in variance” constraint. Additionally, in Denuit et al. (2021), the variance reduction constraint is listed as a necessary condition for Pareto optimality since it is easy to compute.
In finding the optimal solution to Problem (2.3), one finds that the amounts of risk retained by participants after risk sharing may exhibit a certain inconsistency. Specifically, the variance of the post-transfer risk for a participant starting with a lower risk may end up higher than that for a participant with a higher initial risk. For example, while $Var[C_1]\le Var[C_2]$, their post-transfer risks may satisfy $Var[L_1] \gt Var[L_2]$. Such an inconsistency would create a certain unfairness and discourage the participation of those with low initial risks. To address this issue, we propose a set of retention consistency conditions:
 
$\textrm{P}^0_{rc}$: (0-retention consistency) ${Var[L_{1}]} \leq \ldots\leq {Var[L_{n}]}$.
$\textrm{P}^1_{rc}$: (1-retention consistency) $\frac{Var[L_{1}]}{Var[C_{1}]} \leq \ldots\leq \frac{Var[L_{n}]}{Var[C_{n}]}$.
$\textrm{P}^\gamma_{rc}$: ($\gamma$-retention consistency) $\frac{Var[L_{1}]}{Var[C_{1}]^\gamma} \leq \ldots\leq \frac{Var[L_{n}]}{Var[C_{n}]^\gamma}$.
The 0-retention consistency condition $\textrm{P}^0_{rc}$ indicates that the riskiness levels of the post-transfer losses, $L_1, \ldots, L_n$, should exhibit the same order as those of the pre-transfer losses, $C_1, \ldots, C_n$. This is a natural constraint to impose fairness among all participants. Another way to impose fairness is to require the retention ratios to follow the same order as the initial risks, which leads to the 1-retention consistency condition $\textrm{P}^1_{rc}$. Note that the 1-retention consistency condition is more restrictive than the 0-retention consistency condition since, for $i \lt j$, $\frac{Var[L_{i}]}{Var[C_{i}]}\le \frac{Var[L_{j}]}{Var[C_{j}]}$ together with $Var[C_i] \le Var[C_j]$ implies ${Var[L_{i}]} \leq {Var[L_{j}]}$. To bridge these two conditions and present a more general condition, we introduce the $\gamma$-retention consistency conditions. As indicated by the notation, $\textrm{P}^0_{rc}$ and ${\rm P}^1_{rc}$ are special cases of ${\rm P}^\gamma_{rc}$. The index $\gamma$ can be interpreted as the “strength” of consistency and can be used to reflect the level of fairness desired by participants.
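The properties $\textrm{P}_{vr}$ and $\textrm{P}^\gamma_{rc}$ only involve the pre-transfer and post-transfer variances, so they can be checked directly from two lists of numbers. The following sketch does so; the function names and the variance values are hypothetical, and the participants are assumed to be indexed by increasing pre-transfer variance, as in the text.

```python
def variance_reduction(var_L, var_C):
    """P_vr: Var[L_i] <= Var[C_i] for every participant."""
    return all(vl <= vc for vl, vc in zip(var_L, var_C))

def retention_consistent(var_L, var_C, gamma):
    """P^gamma_rc: Var[L_i] / Var[C_i]**gamma is nondecreasing in i."""
    ratios = [vl / vc ** gamma for vl, vc in zip(var_L, var_C)]
    return all(r1 <= r2 for r1, r2 in zip(ratios, ratios[1:]))

var_C = [1.0, 4.0, 9.0]      # hypothetical pre-transfer variances, increasing
var_L_a = [0.9, 1.0, 1.1]    # ordered variances, but decreasing retention ratios
var_L_b = [0.2, 1.0, 3.0]    # retention ratios 0.2, 0.25, 1/3 are increasing
```

Here `var_L_a` satisfies variance reduction and 0-retention consistency but not 1-retention consistency, whereas `var_L_b` satisfies both consistency conditions, in line with the remark that the case $\gamma=1$ is the more restrictive one.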
Below, we combine the retention consistency condition and the variance reduction condition to formulate the constrained risk sharing problem/model:
\begin{equation}\begin{aligned}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}_{vr} \cap \, \mathcal{C}^\gamma_{rc}}} \quad & \sum\limits_{i=1}^{n}Var[L_{i}],\end{aligned}\end{equation}
where
\begin{equation*}\mathcal{C}^\gamma_{rc} = \left\{(L_1, \ldots, L_n) \in \mathcal{C} \left|\frac{Var[L_{1}]}{Var[C_{1}]^\gamma} \leq\ldots\leq \frac{Var[L_{n}]}{Var[C_{n}]^\gamma}\right.\right\}.\end{equation*}
Two special cases of Problem (2.4) with $\gamma=0$ and $\gamma=1$ are of independent interest and are listed as follows:
\begin{equation}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}_{vr} \cap \, \mathcal{C}^0_{rc}}} \quad \sum\limits_{i=1}^{n}Var[L_{i}],\end{equation}
\begin{equation}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}_{vr} \cap \, \mathcal{C}^1_{rc}}} \quad \sum\limits_{i=1}^{n}Var[L_{i}].\end{equation}
Another relevant problem of interest is
 \begin{equation}\min_{(L_1, \ldots, L_n)\in \mathcal{C}_{rc}^1} \quad \sum\limits_{i=1}^{n}Var[L_i]. \end{equation}
Note that Problems (2.3) and (2.7) are analogous in the sense that they are obtained from Problems (2.5) and (2.6), respectively, by dropping one constraint: 0-retention consistency in the former case and variance reduction in the latter.
The rest of the paper is organized as follows. In Section 3, we identify the optimal form of risk sharing, namely linear residual risk sharing (defined in Section 3), and thus reduce the proposed optimization problems to finding the optimal linear sharing ratios. In Section 4, we solve the unconstrained risk sharing problem (2.1) and compare it with Problem (2.2). Problem (2.2) has been studied by Feng et al. (2020), with a focus on linear strategies. We show that, by enlarging the admissible strategy class from the linear space to a general functional space, the optimal strategies retain the linear form (with slight modifications) and the total variance is reduced under the new optimal strategy. We also demonstrate other advantages of the optimal solution to Problem (2.1) over that to Problem (2.2). In Section 5, we focus on solving Problem (2.4), constrained with the variance reduction and retention consistency conditions, and establish some desirable properties of the optimal solution.
In Section 6, we study Problems (2.5) and (2.6) as special cases of (2.4) and establish the equivalences between Problems (2.5) and (2.3) and between Problems (2.6) and (2.7). At the end of Section 6, we study the following problem:
 \begin{equation}\begin{aligned}\min_{\substack{(L_1, \ldots, L_n) \in \mathcal{C}_{vr} }} \quad & \sum\limits_{i=1}^{n} \frac{Var[L_{i}]}{{Var[C_i]}^{\gamma/2}}.\end{aligned}\end{equation}
While Problem (2.8) focuses on minimizing the total weighted post-transfer variance of all participants, it proves to be equivalent to Problem (2.4). This equivalence provides a new perspective for understanding Problem (2.4) and enables us to make further extensions. The relationship among these problems is illustrated in Figure 1.

Figure 1. Relation among proposed problems.
At the end of the paper, we design two case studies to demonstrate the effectiveness of the residual risk sharing strategy in Section 7 and provide some concluding remarks in Section 8.
3. Optimality of linear residual risk sharing
In this section, we prove that the optimal solutions to Problems (2.1), (2.3), (2.4), and (2.8) are all of the form of linear residual risk sharing.
Theorem 3.1. The optimal solution to Problem (2.1) (resp. (2.3), (2.4), and (2.8)), $(L_{1}^{*}, \ldots, L_n^*)$, admits the form of linear residual risk sharing, that is, there exist $a_1, \ldots, a_n \in [0,1]$ such that
\begin{equation} L_{i}^* = E[C_{i}] + a_{i}\sum\limits_{j=1}^{n}(C_{j}-E[C_{j}]) = E[C_{i}] + a_{i}(S-E[S]),\end{equation}
for $i=1, \ldots, n$.
Proof. We shall focus on Problem (2.4). The proofs for the other problems are similar. It suffices to show that for any risk sharing strategy $(L_1, \ldots, L_n)\in \mathcal{C}_{vr}\cap \mathcal{C}^{\gamma}_{rc}$, there exists a strategy $(\widetilde{L}_1, \ldots, \widetilde{L}_n)\in \mathcal{C}_{vr}\cap \mathcal{C}^{\gamma}_{rc}$ of form (3.1) that is at least as good as $(L_1, \ldots, L_n)$ in the sense that its total variance is no higher.
Construct $(\widetilde{L}_1, \ldots, \widetilde{L}_n)$ as follows:
\begin{equation} \widetilde{L}_{i} = E[C_{i}] + \frac{\sigma(L_{i})}{\sum\limits_{s=1}^{n}\sigma(L_{s})}(S-E[S]).\end{equation}
Note that
\begin{equation}\sigma(\widetilde{L}_{i}) = \frac{\sigma(L_{i})}{\sum\limits_{s=1}^{n}\sigma(L_{s})}\sigma\left(\sum\limits_{s=1}^{n}C_{s}\right) = \frac{\sigma(L_{i})}{\sum\limits_{s=1}^{n}\sigma(L_{s})}\sigma\left(\sum\limits_{s=1}^{n}L_{s}\right)\leq \sigma(L_{i}), \end{equation}
where the second equality uses A1 and the last inequality follows from the subadditivity of the standard deviation, $\sigma\left(\sum\limits_{s=1}^{n}L_{s}\right) \le \sum\limits_{s=1}^{n}\sigma(L_{s})$. This immediately implies that $(\widetilde{L}_1, \ldots, \widetilde{L}_n)$ has a total variance no higher than that of $({L}_1, \ldots, {L}_n)$ and is thus at least as good.
It remains to prove that $(\widetilde{L}_1, \ldots, \widetilde{L}_n) \in \mathcal{C}_{vr}\cap \mathcal{C}^{\gamma}_{rc}$ for any $({L}_1, \ldots, {L}_n) \in \mathcal{C}_{vr}\cap \mathcal{C}^{\gamma}_{rc}$, which breaks down into the following three steps.
- (i) By (3.2), it is straightforward to verify that $(\widetilde{L}_1, \ldots, \widetilde{L}_n)$ satisfies the properties of self-retaining and actuarial fairness, and thus belongs to $\mathcal{C}$.
- (ii) Since $(L_1, \ldots, L_n)\in \mathcal{C}_{vr}$ satisfies the variance reduction property, we have $Var[L_i] \le Var[C_i]$ and thus $Var[\widetilde{L}_i] \le Var[L_i] \le Var[C_i]$ following (3.3). Therefore, $(\widetilde{L}_1, \ldots, \widetilde{L}_n) \in \mathcal{C}_{vr}$.
- (iii) Since $(L_1, \ldots, L_n)\in \mathcal{C}^{\gamma}_{rc}$ satisfies the $\gamma$-retention consistency, it holds that for any $i \lt j$
\begin{equation}\frac{Var[L_{i}]}{Var[C_{i}]^\gamma} \leq \frac{Var[L_{j}]}{Var[C_{j}]^\gamma}.\end{equation}
Recall from (3.3) that
\begin{equation*}\sigma(\widetilde{L}_{i}) = \frac{\sigma(L_{i})}{\sum\limits_{s=1}^{n}\sigma(L_{s})}\sigma\left(\sum\limits_{s=1}^{n}L_{s}\right) = \frac{\sigma\left(\sum\limits_{s=1}^{n}L_{s}\right)}{\sum\limits_{s=1}^{n}\sigma(L_{s})}\times \sigma(L_i).\end{equation*}
Multiplying $\left(\frac{\sigma\left(\sum\limits_{s=1}^{n}L_{s}\right)}{\sum\limits_{s=1}^{n}\sigma(L_{s})}\right)^2$ to both sides of (3.4) yields
\begin{equation}\frac{Var[\widetilde{L}_{i}]}{Var[C_{i}]^\gamma} \leq \frac{Var[\widetilde{L}_{j}]}{Var[C_{j}]^\gamma}, \quad \mbox{ for any } i \lt j,\end{equation}
which implies $(\widetilde{L}_1, \ldots, \widetilde{L}_n)\in \mathcal{C}^{\gamma}_{rc}$.
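The construction used in the proof can be illustrated numerically. The sketch below builds $(\widetilde{L}_1, \ldots, \widetilde{L}_n)$ from (3.2) for a given admissible strategy and compares standard deviations as in (3.3). The normal loss model, the function name `residual_counterpart`, the choice of "no sharing" as the starting strategy, and the use of sample moments in place of population moments are assumptions made for illustration only.

```python
import random
from statistics import fmean, pstdev

def residual_counterpart(C, L):
    """Build L~ as in (3.2), with sample means and standard deviations."""
    S = [sum(col) for col in zip(*C)]       # aggregate loss per scenario
    mean_S = fmean(S)
    sd_L = [pstdev(Li) for Li in L]
    total_sd = sum(sd_L)                     # assumed to be positive
    return [[fmean(Ci) + (sd / total_sd) * (s - mean_S) for s in S]
            for Ci, sd in zip(C, sd_L)]

rng = random.Random(42)
# Hypothetical independent normal losses with standard deviations 1, 2, 3.
C = [[rng.gauss(5.0, float(i + 1)) for _ in range(500)] for i in range(3)]
L = [row[:] for row in C]                    # "no sharing" is trivially admissible
L_tilde = residual_counterpart(C, L)

sd_before = [pstdev(Li) for Li in L]
sd_after = [pstdev(Li) for Li in L_tilde]
```

In this example every entry of `sd_after` is no larger than the corresponding entry of `sd_before`, while the scenario-wise total of `L_tilde` still equals the aggregate loss.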
Theorem 3.1 suggests that the optimal risk sharing should be conducted in two steps. First, each participant takes a baseline risk that is equal to the expected value of his/her pre-pooling risk. Then, the remaining risk, $\sum\limits_{s=1}^{n}(C_{s} - E[C_{s}])$, is allocated linearly among all participants. This strategy is referred to as linear residual risk sharing because the sharing is based on the residual risks, $\{C_{s} - E[C_{s}], s=1, \ldots, n\}$. In contrast, we shall refer to the strategy in $\mathcal{C}_\ell$ as linear whole risk sharing, as the sharing is based on the whole risks, $\{C_1, \ldots, C_n\}$. It is worth noting that the linear residual risk sharing form has been briefly mentioned in Denuit (2020) (referred to as linear risk sharing therein). It is used there as an illustrative comparison to the CMRS rule, without being studied in detail.
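The two-step description of linear residual risk sharing can be sketched in a few lines of Python. The sharing ratios `a`, the normal loss model, and the function name `linear_residual_sharing` are hypothetical, and expectations are replaced by sample means; the snippet is an illustration of form (3.1), not a solution of any of the optimization problems.

```python
import random
from statistics import fmean, pvariance

def linear_residual_sharing(C, a):
    """L_i = E[C_i] + a_i * (S - E[S]), with sample means standing in for E."""
    S = [sum(col) for col in zip(*C)]        # aggregate loss per scenario
    mean_S = fmean(S)
    # Step 1: baseline E[C_i]; step 2: share a_i of the centered aggregate.
    return [[fmean(Ci) + ai * (s - mean_S) for s in S] for Ci, ai in zip(C, a)]

rng = random.Random(1)
# Hypothetical losses with means 10, 20, 30 and standard deviations 1, 2, 3.
C = [[rng.gauss(10.0 * (i + 1), float(i + 1)) for _ in range(2000)]
     for i in range(3)]
a = [0.2, 0.3, 0.5]                          # hypothetical ratios summing to one
L = linear_residual_sharing(C, a)

S = [sum(col) for col in zip(*C)]
var_S = pvariance(S)
var_L = [pvariance(Li) for Li in L]          # equals a_i**2 * var_S up to rounding
```

Because each post-transfer loss depends on the scenario only through the aggregate `S`, its sample variance is `a[i] ** 2 * var_S`, which is the quantity that drives the simplified problems stated later in this section.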
It is not surprising to see that the linear residual risk sharing strategy is superior to the linear whole risk sharing strategy. Intuitively, with the residualization treatment, the actuarial fairness condition is automatically satisfied, which allows more freedom for the linear residual risk sharing strategy to achieve a better result. A detailed comparison will be presented in Section 4.
The superiority of linear residual risk sharing also lies in the fact that it follows the aggregate risk sharing rule, as proposed in Denuit et al. (2021), which guarantees that the realization of the risk allocated to each participant is determined only by the realization of the aggregate risk and not by the individual risks. We refer the reader to Feng et al. (2020) for more discussion on this property. The aggregate risk sharing rule is also called risk anonymity, which is discussed in Jiao et al. (2022) as a desirable property that simplifies market operation and reduces potential moral hazard or legal disputes. Readers are referred to Jiao et al. (2022) for more detailed discussions.
With the optimal risk sharing form specified by Theorem 3.1, Problems (2.1) and (2.4) can be significantly simplified. For notational convenience, define
\begin{equation}\begin{aligned}\mathcal{A} = \left\{(a_1, \ldots, a_n) \in [0,1]^n \left | \sum_{i=1}^n a_i =1\right.\right\},\end{aligned}\end{equation}
\begin{equation}\begin{aligned}\mathcal{A}_{vr} =\left\{(a_1, \ldots, a_n) \in \mathcal{A} \left | a_{i}^{2} \leq \frac{\sigma_i^2}{\sigma_S^2}, \quad i=1,2,...,n\right.\right\},\end{aligned}\end{equation}
\begin{equation}\begin{aligned}\mathcal{A}^\gamma_{rc} =\left\{(a_1, \ldots, a_n) \in \mathcal{A} \left | \frac{a_{1}^{2}}{\sigma_1^{2\gamma}}\leq \cdots \leq \frac{a_{n}^{2}}{\sigma_n^{2\gamma}}\right.\right\}.\end{aligned}\end{equation}
By plugging the linear residual risk sharing form as specified by (3.1), Problems (2.1), (2.4), and (2.8), respectively, reduce to
\begin{equation}\begin{aligned} \min_{(a_1, \ldots, a_n)\in \mathcal{A}} \quad & \sum\limits_{i=1}^{n}a_{i}^{2},\end{aligned}\end{equation}
\begin{equation}\begin{aligned} \min_{(a_1, \ldots, a_n)\in \mathcal{A}_{vr} \cap \mathcal{A}^{\gamma}_{rc}} \quad & \sum\limits_{i=1}^{n}a_{i}^{2},\end{aligned}\end{equation}
\begin{equation}\begin{aligned} \min_{(a_1, \ldots, a_n)\in \mathcal{A}_{vr} } \quad & \sum\limits_{i=1}^{n} \frac{a_{i}^{2}}{\sigma_{i}^\gamma}.\end{aligned}\end{equation}
In Sections 4 and 5, we shall respectively solve Problems (3.9) and (3.10) and establish desirable properties of the optimal solutions. The solution to Problem (3.11) and its relation to Problem (3.10) will be discussed in Section 6.
4. Unconstrained residual risk sharing
In this section, we shall solve Problem (3.9) (and thus Problem (2.1)) and then compare the solution to Problem (2.2), which has been studied in Feng et al. (Reference Feng, Liu and Talyor2020). The comparison will demonstrate that, compared to the linear whole risk sharing, the linear residual risk sharing is more flexible, more effective in reducing risk, and more robust.
Theorem 4.1. The optimal solution of Problem (2.1) is given by:
\begin{equation} L_{i} = E[C_{i}] + \frac{1}{n}\sum\limits_{s=1}^{n}(C_{s} - E[C_{s}]), \quad i= 1, \ldots, n.\end{equation}
Proof. The result follows immediately from the inequality
\begin{equation*}\sum_{i=1}^n a_i^2 \ge \frac{\left(\sum_{i=1}^n a_i\right)^2}{n} = \frac{1}{n},\end{equation*}
with equality attained at $a_1 = \cdots = a_n = \frac{1}{n}$.
Theorem 4.1 indicates that the optimal strategy to minimize the total post-transfer variance is for all the participants to share the total residual risk $\sum\limits_{s=1}^{n}(C_{s} - E[C_{s}])$ equally. This is analogous to the uniform risk sharing strategy (see Denuit et al. (Reference Denuit, Dhaene and Robert2021)), except that it applies to the residual risks rather than the whole risks.
Under this risk sharing strategy, post-transfer variances are given by:
\begin{equation} Var[L_{i}] = \frac{1}{n^{2}}Var\left[\sum\limits_{i=1}^{n}C_{i}\right], \quad i= 1, \ldots, n.\end{equation}
Clearly, under this risk sharing arrangement, the total riskiness of the pool is significantly reduced in the sense that
\begin{equation}\sum\limits_{i=1}^{n}Var[L_{i}] = \frac{1}{n}Var\left[\sum\limits_{i=1}^{n}C_{i}\right]\leq \sum\limits_{i=1}^{n}Var[C_{i}].\end{equation}
Furthermore, if the pre-transfer risks are sufficiently diversified in the sense that
\begin{equation*}\frac{1}{n^{2}}Var\left[\sum\limits_{i=1}^{n}C_{i}\right] \le Var[C_1] = \min_{1\le i \le n} Var[C_i],\end{equation*}
then the diversification benefit will be extended to each participant so that his/her post-transfer variance will be less than the pre-transfer variance. If the above diversification condition is not satisfied, then not all participants will enjoy a reduced post-transfer variance. In that case, the variance reduction constraint should be incorporated to yield a more reasonable risk sharing scheme. This problem will be studied in Section 5.
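As a rough numerical illustration of Theorem 4.1 and of the diversification condition above, the following Python sketch evaluates the post-transfer variances under equal sharing of the residual risk. The covariance matrix is made up for illustration, and the function name `uniform_residual_variances` is hypothetical rather than taken from the paper.

```python
# Illustrative covariance matrix of n = 3 pre-transfer risks (made-up numbers).
cov = [
    [1.0, 0.2, 0.1],
    [0.2, 4.0, 0.3],
    [0.1, 0.3, 9.0],
]


def uniform_residual_variances(cov):
    """Var[L_i] when L_i = E[C_i] + (1/n) * sum_s (C_s - E[C_s])."""
    n = len(cov)
    var_sum = sum(sum(row) for row in cov)  # Var[sum_i C_i] = e^T Sigma e
    return [var_sum / n**2] * n


pre = [cov[i][i] for i in range(len(cov))]   # pre-transfer variances
post = uniform_residual_variances(cov)       # post-transfer variances

total_pre = sum(pre)
total_post = sum(post)

# Diversification condition: (1/n^2) Var[sum C_i] <= min_i Var[C_i].
sufficiently_diversified = post[0] <= min(pre)

print("pre-transfer variances: ", pre)
print("post-transfer variances:", post)
print("total pre / post:", total_pre, total_post)
print("every participant no worse off:", sufficiently_diversified)
```

In this made-up example the total variance falls, yet the lowest-risk participant ends up with a higher variance than before, which is the situation that motivates the variance reduction constraint.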
In the rest of this section, we investigate the difference between the linear residual risk sharing problem (3.9) and the linear whole risk sharing problem (2.2). To this end, we first note that Problem (2.2) has been studied in Feng et al. (Reference Feng, Liu and Talyor2020). Below, we cite their solution to Problem (2.2).
Theorem 4.2. (Theorem 1 of Feng et al. (Reference Feng, Liu and Talyor2020)). The solution to Problem (2.2) is
\begin{equation*}(P_1, \ldots, P_n)^T = \mathbf{A} (C_1, \ldots, C_n)^T,\end{equation*}
with $\mathbf{A}\in \mathbb{R}^{n\times n}$ specified by:
\begin{equation*} \mathbf{A}= \frac{1}{n}\mathbf{e}\mathbf{e}^{T} + k\left(\mathbf{I}-\frac{1}{n}\mathbf{e}\mathbf{e}^{T}\right)\boldsymbol{\mu}\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1},\end{equation*}
where $\mathbf{I} \in \mathbb{R}^{n\times n}$ is the identity matrix, $\mathbf{e}= (1, \ldots,1)^{T} \in \mathbb{R}^{n\times 1}$, $\boldsymbol\mu = (\mu_1, \ldots, \mu_n)^T$ and $\boldsymbol\Sigma$ are, respectively, the mean vector and covariance matrix of $(C_{1}, \ldots, C_{n})$, and $k^{-1} = \boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$.
Recall that Problem (2.2) is a sub-problem of Problem (2.1) (and thus of Problem (3.9)) in the sense that the optimization domain of Problem (2.2) is a subset of that of Problem (2.1). It is therefore anticipated that the optimal solution to Problem (3.9) will result in a lower total post-transfer variance than the optimal solution to Problem (2.2). In the following, we confirm this through a direct comparison between $\sum_{i=1}^n \text{Var}[P_i]$ and $\sum_{i=1}^n \text{Var}[L_i]$ and further uncover the relationship between these two problems.
Proposition 4.3. Let $(L_1, \ldots, L_n)$ and $(P_1, \ldots, P_n)$ be the solutions to Problems (2.1) and (2.2), respectively. Then,
\begin{equation*}\sum_{i=1}^n \text{Var}[L_i] \le \sum_{i=1}^n \text{Var}[P_i],\end{equation*}
with the equality attained if and only if $\mu_1 =\cdots = \mu_n$.
Proof. Feng et al. (Reference Feng, Liu and Talyor2020) obtain that
\begin{equation*} \text{Var}[P_{i}] = \frac{1}{n^{2}}\mathbf{e}^{T}\boldsymbol\Sigma \mathbf{e} + k\mu_{i}^{2} - \frac{k}{n^{2}}\left(\sum\limits_{s=1}^{n}\mu_{s}\right)^{2}.\end{equation*}
Thus,
\begin{equation*}\sum_{i=1}^n \text{Var}[P_i] = \frac{1}{n} \mathbf{e}^{T}\boldsymbol\Sigma \mathbf{e} + k \sum_{i=1}^n \mu_i^2 - \frac{k}{n} \left(\sum\limits_{i=1}^{n}\mu_{i}\right)^{2}.\end{equation*}
According to (4.2),
\begin{equation*}\sum_{i=1}^n \text{Var}[L_i] = \frac{1}{n} \text{Var} \left[\sum_{i=1}^n C_i \right] = \frac{1}{n} \mathbf{e}^{T}\boldsymbol\Sigma \mathbf{e}.\end{equation*}
Therefore, since $k \gt 0$,
\begin{equation*}\sum_{i=1}^n \text{Var}[L_i] \le \sum_{i=1}^n \text{Var}[P_i] \Longleftrightarrow \left( \sum_{i=1}^n \mu_i \right)^2 \le n \sum_{i=1}^n \mu_i^2,\end{equation*}
which holds by the Cauchy–Schwarz inequality, with the equality attained if and only if $\mu_1 =\cdots = \mu_n$.
When $\mu_1 =\cdots = \mu_n$, it is easy to verify the following two facts:
- (i) $\mathbf{A} = \frac{1}{n}\mathbf{e}\mathbf{e}^{T}$, and thus $P_i = \frac{1}{n}\sum_{s=1}^n C_s$ for each $i=1, \ldots, n$.
- (ii) $L_{i} = E[C_{i}] + \frac{1}{n}\sum\limits_{s=1}^{n}(C_{s} - E[C_{s}]) = \frac{1}{n}\sum_{s=1}^n C_s$ for each $i=1, \ldots, n$.
In other words, when $\mu_1 =\cdots = \mu_n$, the optimal linear whole risk sharing strategy coincides with the optimal linear residual risk sharing strategy. This is the only scenario in which the total post-transfer variance obtained by the former strategy reaches the same level as that of the latter strategy. In all other scenarios, it is strictly greater. This confirms that the augmentation of Problem (2.2) to Problem (2.1) is not trivial.
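The comparison in Proposition 4.3 can be checked numerically. The sketch below is a minimal illustration that assumes independent risks (a diagonal covariance matrix) so that no matrix inversion is needed, and takes $k^{-1} = \boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$, the form used in the proof of Proposition 4.4. The function name and the input numbers are hypothetical.

```python
def total_variances(mu, sigma2):
    """Return (sum_i Var[L_i], sum_i Var[P_i]) for independent risks.

    mu     : list of means mu_1, ..., mu_n
    sigma2 : list of variances (diagonal of Sigma; independence is an
             assumption of this sketch, not of the paper)
    """
    n = len(mu)
    var_sum = sum(sigma2)  # e^T Sigma e when Sigma is diagonal
    # k^{-1} = mu^T Sigma^{-1} mu for a diagonal Sigma
    k = 1.0 / sum(m * m / v for m, v in zip(mu, sigma2))
    total_L = var_sum / n
    total_P = var_sum / n + k * (sum(m * m for m in mu) - sum(mu) ** 2 / n)
    return total_L, total_P


# Heterogeneous means: whole risk sharing gives a strictly larger total variance.
L_het, P_het = total_variances([1.0, 2.0, 3.0], [1.0, 4.0, 9.0])

# Equal means: the two strategies coincide.
L_eq, P_eq = total_variances([2.0, 2.0, 2.0], [1.0, 4.0, 9.0])

print("heterogeneous means:", L_het, P_het)
print("equal means:        ", L_eq, P_eq)
```

With the heterogeneous means the gap equals $k\left(\sum_i \mu_i^2 - \frac{1}{n}(\sum_i \mu_i)^2\right)$, which vanishes exactly when all means coincide.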
In addition to the reduction in the total post-transfer variance, the optimal linear residual risk sharing strategy also possesses a few other advantages over the linear whole risk sharing strategy. First, as pointed out in Section 3, the linear residual risk sharing strategy is risk anonymous, while the linear whole risk sharing strategy is generally not. Second, the optimal sharing ratios in the residual risk sharing model, namely $\frac{1}{n}$ for each participant, do not rely on the distributional characteristics of the pre-transfer risks and thus ensure robustness to model uncertainty or estimation error in practical implementation. Lastly, the total post-transfer variance obtained by the optimal linear residual sharing strategy is more robust to the covariance matrix of the pre-transfer risks in the sense that it is less sensitive to changes in the covariance matrix, as demonstrated by the following proposition.
Proposition 4.4. Let $\left( C_1^H, \ldots, C_n^H \right)$ and $\left( C_1^L, \ldots, C_n^L \right)$ be two sets of pre-transfer losses with the same mean vector and different covariance matrices $\boldsymbol{\Sigma}^H$ and $\boldsymbol{\Sigma}^L$. Let $(L_1^\ast, \ldots, L_n^\ast)$ and $(P_1^\ast, \ldots, P_n^\ast)$ represent the post-transfer risks of $(C_1^\ast, \ldots, C_n^\ast)$ under the optimal linear residual risk sharing strategy and the optimal linear whole risk sharing strategy, respectively, where $\ast$ stands for H or L. If $\boldsymbol{\Sigma}^{H}-\boldsymbol{\Sigma}^{L}$ is positive definite, then
\begin{equation} \sum\limits_{i=1}^{n}\text{Var}[L_{i}^{H}] - \sum\limits_{i=1}^{n}\text{Var}[L_{i}^{L}]\leq \sum\limits_{i=1}^{n}\text{Var}[P_{i}^{H}] - \sum\limits_{i=1}^{n}\text{Var}[P_{i}^{L}].\end{equation}
Proof. According to the proof of Proposition 4.3, we have
\begin{equation*}\sum_{i=1}^n \text{Var}[L_i] = \frac{1}{n} \mathbf{e}^{T}\boldsymbol\Sigma \mathbf{e} \quad \text{and} \quad \sum_{i=1}^n \text{Var}[P_{i}] = \frac{1}{n}\mathbf{e}^{T}\boldsymbol\Sigma \mathbf{e} + k\sum_{i=1}^n\mu_{i}^{2} - \frac{k}{n}\left(\sum\limits_{i=1}^{n}\mu_{i}\right)^{2}.\end{equation*}
Therefore, writing $\bar{\mu} = \frac{1}{n}\sum_{i=1}^n \mu_i$,
\begin{equation} \sum\limits_{i=1}^{n}\text{Var}[L_{i}^{H}] - \sum\limits_{i=1}^{n}\text{Var}[L_{i}^{L}] = \frac{1}{n}\mathbf{e}^{T}\left(\boldsymbol{\Sigma}^{H}-\boldsymbol{\Sigma}^{L}\right)\mathbf{e},\end{equation}
\begin{equation} \sum\limits_{i=1}^{n}\text{Var}[P_{i}^{H}] - \sum\limits_{i=1}^{n}\text{Var}[P_{i}^{L}] = \frac{1}{n}\mathbf{e}^{T}\left(\boldsymbol{\Sigma}^{H}-\boldsymbol{\Sigma}^{L}\right)\mathbf{e} + \left(k^{H}-k^{L}\right)\left(\sum\limits_{i=1}^{n}\mu_{i}^{2} - n\bar{\mu}^{2}\right).\end{equation}
Since $\sum_{i=1}^{n}\mu_{i}^{2} - n\bar{\mu}^{2} \geq 0$, it suffices to show that $k^{H}- k^{L}\geq 0$. Noting that $k^{-1} = \boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$, we have
\begin{equation} k^{H} - k^{L} = k^{H}k^{L}\left(\frac{1}{k^{L}} - \frac{1}{k^{H}}\right) = k^{H}k^{L}\boldsymbol{\mu}^{T}\Big(\big(\boldsymbol{\Sigma}^{L}\big)^{-1} - \big(\boldsymbol{\Sigma}^{H}\big)^{-1}\Big)\boldsymbol{\mu} \geq 0,\end{equation}
where the last inequality follows from the fact that $\big(\boldsymbol{\Sigma}^{L}\big)^{-1} - \big(\boldsymbol{\Sigma}^{H}\big)^{-1}$ is positive definite according to Corollary 7.7.4 of Johnson and Horn (Reference Johnson and Horn2012).
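To make the sensitivity statement of Proposition 4.4 concrete, the following Python sketch compares how the two total post-transfer variances react when the covariance matrix increases. It is only an illustration under the simplifying assumption of diagonal covariance matrices (independent risks) with $k^{-1} = \boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$; the helper names and numbers are hypothetical.

```python
def k_factor(mu, sigma2):
    """k = 1 / (mu^T Sigma^{-1} mu) for a diagonal covariance matrix."""
    return 1.0 / sum(m * m / v for m, v in zip(mu, sigma2))


def sensitivity(mu, sigma2_low, sigma2_high):
    """Change in total post-transfer variance when Sigma^L is replaced by Sigma^H.

    Returns (change under residual sharing, change under whole risk sharing).
    Both covariance matrices are assumed diagonal in this sketch.
    """
    n = len(mu)
    # (1/n) e^T (Sigma^H - Sigma^L) e
    delta_L = (sum(sigma2_high) - sum(sigma2_low)) / n
    # sum_i mu_i^2 - n * mu_bar^2, nonnegative by Cauchy-Schwarz
    spread = sum(m * m for m in mu) - sum(mu) ** 2 / n
    delta_P = delta_L + (k_factor(mu, sigma2_high) - k_factor(mu, sigma2_low)) * spread
    return delta_L, delta_P


mu = [1.0, 2.0, 3.0]
sigma2_low = [1.0, 4.0, 9.0]
sigma2_high = [2.0, 5.0, 12.0]  # componentwise larger, so Sigma^H - Sigma^L is positive definite

delta_L, delta_P = sensitivity(mu, sigma2_low, sigma2_high)
print("change under residual sharing:  ", delta_L)
print("change under whole risk sharing:", delta_P)
```

In this example the increase under residual sharing is the smaller of the two, in line with the proposition.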
5. Constrained residual risk sharing
While the unconstrained residual risk sharing model (2.1) and its solution, as studied in Section 4, possess many desirable properties, they are not without problems. As mentioned in the comments following Theorem 4.1, if the pre-transfer risks are not sufficiently diversified, then some participants may end up with a post-transfer variance that is higher than their pre-transfer variance. This limitation motivates the addition of the variance reduction constraint $P_{vr}$.
Furthermore, Theorem 4.1 indicates that, in the unconstrained setup, the optimal linear residual risk sharing strategy leads to equal post-transfer variances for all participants. Recalling that the pre-transfer variances are ordered as $\text{Var}[C_1] \le \cdots \le \text{Var}[C_n]$, we have $ \frac{\text{Var}[L_{i}]}{\text{Var}[C_{i}]} \geq \frac{\text{Var}[L_{j}]}{\text{Var}[C_{j}]}$ for any $i \lt j$. This means that a low-risk participant will enjoy a lower risk mitigation efficiency (measured by the ratio of the post-transfer to the pre-transfer variance) by joining the risk sharing scheme. This discourages the participation of participants with low pre-transfer variances, who are typically regarded as high-quality clients. To address this limitation, the retention consistent constraint $P_{rc}^\gamma$ is imposed.
The addition of the constraints $P_{vr}$ and $P_{rc}^\gamma$ leads to the study of Problem (2.4), or equivalently, Problem (3.10). In this section, we shall derive the solution to this problem and discuss the desirable properties of the optimal solution.
5.1. Optimal solution to Problem (3.10)
A preliminary investigation reveals that the optimal solution to Problem (3.10) relies on the dependence structure among the pre-transfer risks $C_1, \ldots, C_n$. In order to better present the optimal solutions, we introduce the following sets to characterize different dependence scenarios:
Define
\begin{equation*}\begin{aligned}\mathcal{U}_0 &=\left\{(C_1, \ldots, C_n)\Big|\sigma_S \le n\sigma_1 \right\}, \\\mathcal{U}_{n} &= \left\{(C_1, \ldots, C_n)\Big|\sigma_S=\sum\limits_{i=1}^{n} \sigma_i\right\}.\end{aligned}\end{equation*}
Note that $(C_1, \ldots, C_n) \in \mathcal{U}_n$ if and only if $C_1, \ldots, C_n$ are perfectly linearly correlated, that is, the correlation coefficient $\rho_{ij}$ of $(C_i, C_j)$ is equal to 1 for any $i\neq j$. In this sense, $\mathcal{U}_n$ represents the strongest possible dependence scenario. In contrast, $\mathcal{U}_{1}$ represents the spectrum of dependence strength on the lower end. Note that in the discussion following Theorem 4.1, the term “sufficiently diversified” described by (4.3) corresponds to $(C_1, \ldots, C_n)\in \mathcal{U}_0$ with $\gamma = 0$. In order to partition the spectrum of dependence strength between $\mathcal{U}_0$ and $\mathcal{U}_n$, assume $\gamma \in [0,1]$ and define
\begin{equation*}\mathcal{U}_k = \left\{ (C_1, \ldots, C_n)\ \middle\vert d_k \le \sigma_S \lt d_{k+1} \right\},\end{equation*}
with $d_k = \sigma_{k}^{1-\gamma}\cdot {\sum\limits_{i=k}^{n}\sigma_{i}^{\gamma}} + \sum\limits_{i=1}^{k-1} \sigma_{i}$ for $k=1, \ldots, n-1$.
Note that $d_{k+1} - d_k = \left(\sigma_{k+1}^{1-\gamma} - \sigma_{k}^{1-\gamma}\right)\sum_{i=k+1}^{n} \sigma_i^\gamma \ge 0$ only when $\gamma\in[0,1]$. However, for the case of $\gamma \gt 1$, a similar partition can be defined by switching $a_{k+1}$ and $a_k$, and the study of the solution to Problem (3.10) can be conducted in a similar manner. Throughout the paper, we shall focus on the case of $\gamma\in [0,1]$ unless otherwise indicated.
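The partition can be explored numerically with a short Python sketch such as the one below. It computes the thresholds $d_k$ and locates a given $\sigma_S$ among them. The function names are hypothetical, the formula for $d_k$ is also applied at $k=n$ (where it reduces to $\sum_{i=1}^n \sigma_i$), and boundary cases are assigned with the half-open convention $d_k \le \sigma_S \lt d_{k+1}$, which is an assumption of this sketch.

```python
def thresholds(sigma, gamma):
    """Return [d_1, ..., d_n] for standard deviations sorted in increasing order.

    d_k = sigma_k^(1-gamma) * sum_{i=k}^n sigma_i^gamma + sum_{i=1}^{k-1} sigma_i
    """
    n = len(sigma)
    return [
        sigma[k] ** (1 - gamma) * sum(s ** gamma for s in sigma[k:]) + sum(sigma[:k])
        for k in range(n)  # list index k corresponds to d_{k+1}
    ]


def category(sigma, gamma, sigma_S):
    """Index k such that d_k <= sigma_S < d_{k+1} (k = 0 if sigma_S < d_1).

    Relies on the thresholds being nondecreasing, which holds for gamma in [0, 1].
    """
    return sum(1 for d in thresholds(sigma, gamma) if d <= sigma_S)


sigma = [1.0, 2.0, 3.0]

d_gamma0 = thresholds(sigma, 0.0)  # stepped tank
d_gamma1 = thresholds(sigma, 1.0)  # all thresholds collapse to sum(sigma)

# Independent risks: sigma_S = sqrt(sum of variances).
sigma_S_indep = sum(s * s for s in sigma) ** 0.5
cat_indep = category(sigma, 0.0, sigma_S_indep)

# Perfectly correlated risks: sigma_S = sum of standard deviations.
cat_comonotone = category(sigma, 0.0, sum(sigma))

print(d_gamma0, d_gamma1, cat_indep, cat_comonotone)
```

For these illustrative inputs the independent portfolio falls into $\mathcal{U}_1$ when $\gamma = 0$, while the perfectly correlated one falls into $\mathcal{U}_n$.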
To obtain an intuitive image of the sets $\mathcal{U}_0, \mathcal{U}_1, \ldots, \mathcal{U}_n$ and better understand the main results in this section, we develop the following “filling a stepped water tank” graphical interpretation for the case $\gamma=0$. Suppose there are n two-dimensional rectangular tanks, as shown in Figure 2(a). These tanks are capped from above at the heights of $\sigma_1, \ldots, \sigma_n$, respectively. The rectangular tanks are connected on the sides and form a big stepped tank. Let the “volume” of the water in the tank represent the quantity $\sigma_{S} = \sigma\left(\sum_{i=1}^n C_i\right)$. When the (highest) water line reaches the level of $\sigma_n$, the volume of the water equals $\sum_{i=1}^n \sigma_{i}$, corresponding to the category $\mathcal{U}_n$. When the water line is below or at the level of $\sigma_1$, it corresponds to the category $\mathcal{U}_0$. When the water line moves between the levels of $\sigma_1$ and $\sigma_n$, it corresponds in turn to the categories $\mathcal{U}_1, \ldots, \mathcal{U}_{n-1}$. For the general case of $\gamma\in (0,1]$, the filling-tank interpretation still works after adjusting the dimensions of the rectangular tanks. Specifically, set the base and height of each individual tank to be $\sigma_{i}^{\gamma}$ and $\sigma_{i}^{1-\gamma}$, respectively, so that the capacity of each tank remains at the level of $\sigma_i$. Notably, when $\gamma=1$, the big tank becomes a regular rectangle, as shown in Figure 2(b).

Figure 2. “Step-shaped water tank” graphical interpretation.
Clearly, the categorization of $(C_1, \ldots, C_n)$ is determined by the values of $\sigma_{1}, \ldots, \sigma_n$ and the correlation coefficients $\{\rho_{ij}, 1\le i \lt j \le n\}$. We assume that all correlation coefficients are nonnegative throughout the paper. Furthermore, denote
\begin{equation}\rho_l = \min \{\rho_{ij}, {1\le i \lt j \le n}\}, \qquad \rho_u = \max \{\rho_{ij}, {1\le i \lt j \le n}\}.\end{equation}
The following proposition provides sufficient conditions for the categorization of $(C_1, \ldots, C_n)$ under the scenario $\gamma=0$.
Proposition 5.1. Consider the case of $\gamma =0$.
- (i) $(C_1, \ldots, C_n)\in \cup_{i=0}^{k} \mathcal{U}_i$, that is, $\sigma_S \lt \sum_{i=1}^k \sigma_i + (n-k)\sigma_{k+1}$, if
\begin{equation}\sigma_{k+1}^{2} \gt \left(\rho_{u} + \frac{1-\rho_u}{n-k}\right) \sigma_{n}^{2}. \tag{5.2}\end{equation}
- (ii) $(C_1, \ldots, C_n)\in \cup_{i=k+1}^{n} \mathcal{U}_i$, that is, $\sigma_S \ge \sum_{i=1}^k \sigma_i + (n-k)\sigma_{k+1}$, if one of the following conditions is satisfied:
\begin{equation}\left( \rho_l + \frac{1-\rho_l}{n-k-1} \right)\sigma_{k+2}^2 \ge \left( \rho_l + (1-\rho_l)\frac{n^2-k-1}{(n-k-1)^2} \right)\sigma_{k+1}^2, \tag{5.3}\end{equation}
\begin{equation}\sigma_{k+2} \ge \frac{n/\sqrt{\rho_{l}}-k-1}{n-k-1} \sigma_{k+1}. \tag{5.4}\end{equation}
Proof. See Appendix A.1 in the supplementary document.
Conditions (5.2) and (5.3) give sufficient conditions for $(C_1, \ldots, C_n)$ to fall in the lower $k+1$ categories and the upper $n-k$ categories, respectively. Combining these two conditions yields a sufficient condition for $(C_1, \ldots, C_n)\in \mathcal{U}_k$. Condition (5.2) indicates that, for $(C_1, \ldots, C_n)$ to fall in the lower $k+1$ categories, the riskiness levels, namely, $\sigma_{k+1}, \ldots, \sigma_n$, of the $n-k$ largest risks should stay relatively close to each other, avoiding significant dispersion. This condition is relatively easy to satisfy when the overall dependence strength (characterized by $\rho_u$) is low and n is large. On the other hand, (5.3) implies that the riskiness levels of the $(n-k)$ largest risks and those of the other k risks should be far enough apart for $(C_1, \ldots, C_n)$ to fall in the upper $n-k$ categories. The condition becomes less restrictive when the overall dependence strength, as measured by $\rho_l$, increases. This is more clearly illustrated by Condition (5.4).
For the cases with general $\gamma\in [0,1]$, it is easy to verify the following statements.
Proposition 5.2. Consider the general case of $\gamma \in [0,1]$.
- (i) If $\sigma_1 = \cdots =\sigma_n$, then $(C_1, \ldots, C_n) \in \mathcal{U}_0$.
- (ii) If $\rho_{ij}=1$ for any $1\le i \lt j \le n$, then $(C_1, \ldots, C_n) \in \mathcal{U}_n$.
- (iii) If $\rho_{ij}=0$ for any $1\le i \lt j \le n$ and $\sigma_n^2 \le (n+1)\sigma_1^2$, then $(C_1, \ldots, C_n) \in \mathcal{U}_0$.
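As a brief sketch of why statement (iii) holds, recall the ordering $\sigma_1 \le \cdots \le \sigma_n$ and the definition $\mathcal{U}_0 = \{\sigma_S \le n\sigma_1\}$ given above. With all correlation coefficients equal to zero,
\begin{equation*}\sigma_S^2 = \sum_{i=1}^n \sigma_i^2 \le \sigma_1^2 + (n-1)\sigma_n^2 \le \sigma_1^2 + (n-1)(n+1)\sigma_1^2 = n^2 \sigma_1^2,\end{equation*}
so that $\sigma_S \le n\sigma_1$. Statement (i) follows similarly from $\sigma_S \le \sum_{i=1}^n \sigma_i = n\sigma_1$ when all $\sigma_i$ are equal.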
Proposition 5.2(ii) and (iii) confirm the intuition that the categorization of $(C_1, \ldots, C_n)$ is closely related to the dependence strength. Specifically, as the dependence strength increases, it is more likely for $(C_1, \ldots, C_n)$ to fall into upper categories. Another factor that affects the categorization is the degree of dispersion of the riskiness levels. As suggested by Proposition 5.2(i), when the riskiness levels become equal (see Footnote 1), $(C_1, \ldots, C_n)$ will always stay in the lowest category $\mathcal{U}_0$.
Theorem 5.3. Assume $\gamma \in [0,1]$ and $(C_1, \ldots, C_n)\in \mathcal{U}_k$ for some $k \in \{0, 1, \ldots, n\}$. Then the optimal sharing ratios, $\{a_1, \ldots, a_n\}$, that solve Problem (3.10) are given by:
\begin{equation}a_i =\begin{cases} \dfrac{\sigma_i}{\sigma_S}, & \quad \text{for } i=1, \ldots, k;\\[15pt]\dfrac{\sigma_S - \sum_{j=1}^k\sigma_j}{\sigma_S} \times \dfrac{\sigma_i^\gamma}{\sum_{j=k+1}^n \sigma_{j}^\gamma}, & \quad \text{for } i=k+1, \ldots, n,\end{cases}\tag{5.5}\end{equation}
with the convention of $\sum_{j=1}^0 x_j =0$. Under the optimal risk sharing strategy, the post-transfer standard deviations are specified by:
\begin{equation}\sigma(L_i) =\begin{cases} {\sigma_i}, & \quad \text{for } i=1, \ldots, k;\\[5pt] \left(\sigma_S - \sum_{j=1}^k\sigma_j\right) \times \dfrac{\sigma_i^\gamma}{\sum_{j=k+1}^n \sigma_{j}^\gamma}, & \quad \text{for } i=k+1, \ldots, n.\end{cases}\tag{5.6}\end{equation}
Proof. See Appendix A.2 in the supplementary document.
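As a rough numerical companion to Theorem 5.3, the closed form can be evaluated with a short Python sketch. It assumes the risks are indexed so that $\sigma_1 \le \cdots \le \sigma_n$ and that $\mathcal{U}_k$ corresponds to $d_k \le \sigma_S \lt d_{k+1}$ with $d_k = \sigma_k^{1-\gamma} \sum_{s=k}^n \sigma_s^\gamma + \sum_{s=1}^{k-1}\sigma_s$. The function names `thresholds`, `category`, and `optimal_ratios` are illustrative and do not come from the paper.

```python
import math

def thresholds(sigma, gamma):
    """Return [d_1, ..., d_n] for standard deviations sorted in ascending order."""
    n = len(sigma)
    d = []
    for k in range(1, n + 1):
        tail = sum(s ** gamma for s in sigma[k - 1:])  # sum_{s=k}^n sigma_s^gamma
        head = sum(sigma[:k - 1])                      # sum_{s=1}^{k-1} sigma_s
        d.append(sigma[k - 1] ** (1 - gamma) * tail + head)
    return d

def category(sigma, sigma_S, gamma, tol=1e-12):
    """Return k such that d_k <= sigma_S < d_{k+1} (k = 0 if sigma_S < d_1)."""
    k = 0
    for d_k in thresholds(sigma, gamma):  # the d_k are nondecreasing in k
        if sigma_S >= d_k - tol:
            k += 1
        else:
            break
    return k

def optimal_ratios(sigma, sigma_S, gamma):
    """Evaluate the sharing ratios of Theorem 5.3 for the detected category k."""
    k = category(sigma, sigma_S, gamma)
    filled = sum(sigma[:k])                    # sigma_1 + ... + sigma_k
    tail = sum(s ** gamma for s in sigma[k:])  # sum_{j=k+1}^n sigma_j^gamma
    ratios = [s / sigma_S for s in sigma[:k]]
    ratios += [(sigma_S - filled) / sigma_S * s ** gamma / tail for s in sigma[k:]]
    return k, ratios

# Hypothetical example: three independent risks, gamma = 0.
sigma = [1.0, 2.0, 4.0]
sigma_S = math.sqrt(sum(s * s for s in sigma))
k, a = optimal_ratios(sigma, sigma_S, gamma=0.0)
post_sd = [a_i * sigma_S for a_i in a]  # post-transfer standard deviations
```

In this example $d_1 = 3 \le \sigma_S = \sqrt{21} \lt d_2 = 5$, so $k=1$: the first participant keeps $\sigma(L_1) = \sigma_1$, and the other two split the remainder equally because $\gamma = 0$.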
For the case $\gamma=1$, the structure of the optimal solution can be significantly simplified.
Proposition 5.4. When $\gamma=1$, the optimal sharing ratios that solve Problem (3.10) are given by:
\begin{equation}a_i = \frac{\sigma_i}{\sum_{j=1}^n \sigma_j}, \qquad i=1, \ldots, n.\tag{5.7}\end{equation}
Proof. Note that when $\gamma=1$, $d_1 = \cdots = d_n = \sum_{i=1}^n \sigma_i$. Thus, $\mathcal{U}_1, \ldots, \mathcal{U}_{n-1}$ all degenerate to $\emptyset$, and
\begin{equation*}\mathcal{U}_0 = \left\{(C_1, \ldots, C_n) \left| \sigma_S \lt {\sum_{i=1}^n \sigma_i}\right.\right\}, \quad \text{and} \quad\mathcal{U}_n = \left\{(C_1, \ldots, C_n)\left| \sigma_S = {\sum_{i=1}^n \sigma_i}\right.\right\}.\end{equation*}
Following (5.5), we have
- (i) When $(C_1, \ldots, C_n) \in \mathcal{U}_0$,
\begin{equation}a_i = \frac{\sigma_S - \sum_{j=1}^0\sigma_j}{\sigma_S} \times \frac{\sigma_i}{\sum_{j=1}^n \sigma_{j}} = \frac{\sigma_i}{\sum_{j=1}^n \sigma_{j}}, \quad \text{ for } i=1, \ldots, n.\tag{5.8}\end{equation}
- (ii) When $(C_1, \ldots, C_n) \in \mathcal{U}_n$, $\sigma_S=\sum_{j=1}^n \sigma_j$, and
\begin{equation}a_i = \frac{\sigma_i}{\sigma_S}= \frac{\sigma_i}{\sum_{j=1}^n \sigma_{j}}, \quad \text{ for } i=1, \ldots, n.\tag{5.9}\end{equation}
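The proportional rule for $\gamma=1$ is easy to check numerically. The sketch below is only an illustration: it assumes a common pairwise correlation `rho` across all risks (a simplification that the paper does not impose), and `proportional_ratios` and `aggregate_sd` are hypothetical helper names. It uses $\sigma(L_i) = a_i \sigma_S$, which follows from $L_i - E[L_i] = a_i (S - E[S])$.

```python
import math

def proportional_ratios(sigma):
    """Sharing ratios a_i = sigma_i / sum_j sigma_j (the gamma = 1 case)."""
    total = sum(sigma)
    return [s / total for s in sigma]

def aggregate_sd(sigma, rho):
    """Standard deviation of the sum, assuming one common pairwise correlation rho."""
    n = len(sigma)
    var = sum(s * s for s in sigma)
    for i in range(n):
        for j in range(i + 1, n):
            var += 2.0 * rho * sigma[i] * sigma[j]
    return math.sqrt(var)

sigma = [1.0, 2.0, 3.0]
a = proportional_ratios(sigma)

# Post-transfer standard deviations sigma(L_i) = a_i * sigma_S for several rho.
post_sd_by_rho = {}
for rho in (0.0, 0.5, 1.0):
    sigma_S = aggregate_sd(sigma, rho)
    post_sd_by_rho[rho] = [a_i * sigma_S for a_i in a]
```

Because $\sigma_S \le \sum_j \sigma_j$, each $\sigma(L_i)$ stays at or below $\sigma_i$, with equality only in the comonotonic case $\rho = 1$.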
Below we develop graphical interpretations of Theorem 5.3 and Proposition 5.4. Construct $n$ rectangular tanks with base-height dimensions $\left\{\left(\sigma_1^{\gamma}, \sigma_1^{1-\gamma}\right), \ldots, \left(\sigma_n^{\gamma}, \sigma_n^{1-\gamma}\right)\right\}$, respectively. The capacities of the tanks serve as a visualization of the pre-transfer standard deviations, namely, $\sigma_1, \ldots, \sigma_n$. Due to the assumption of variance reduction, we naturally use a portion of each tank’s capacity to represent the post-transfer standard deviation $\sigma(L_i)$, as visualized by the shaded rectangular area in Figure 3. With this setup, the $\gamma$-retention consistency
\begin{equation}\frac{\sigma(L_1)}{\sigma_1^{\gamma}} \le \cdots \le \frac{\sigma(L_n)}{\sigma_n^{\gamma}},\tag{5.10}\end{equation}
can be interpreted as requiring the heights of the shaded rectangles to be in ascending order, as shown in Figure 3.

Figure 3. Visualization of Theorem 5.3.
Recall that the goal is to minimize $\sum_{i=1}^n [\sigma(L_i)]^2$, while $\sum_{i=1}^n \sigma(L_i) = \sigma_S$ due to the linearity among $L_1, \ldots, L_n$ proved in Theorem 3.1. In order to attain the minimum, $(\sigma(L_1), \ldots, \sigma(L_n))$ should be least majorized (Theorem 1.12 in Khan et al. (2019)), meaning that the values of $\sigma(L_1), \ldots, \sigma(L_n)$ should stay as close to each other as possible. Note that the heights and bases of the shaded rectangles are both in ascending order and the bases are fixed. The requirement of least majorization forces the heights to stay at the same level whenever possible.
Suppose all rectangular tanks are connected at the sides and form a step-shaped tank. Consider the experiment of filling this big tank with water. The volume of the water represents the standard deviation, $\sigma_S$, of the aggregate risk. Note that as the water flows in ($\sigma_S$ increases), the waterline stays at the same level across the rectangular tanks until it reaches the individual caps, $\sigma_1, \ldots, \sigma_n$. In this sense, the “filling water” process properly simulates the minimization problem with the constraints of variance reduction and $\gamma$-retention consistency. Therefore, the volume of water that ends up in each rectangular tank gives the optimal post-transfer standard deviation.
Specifically, when $(C_1, \ldots, C_n) \in \mathcal{U}_0$, that is, $\sigma_S \lt \sigma_1^{1-\gamma}\sum_{i=1}^n \sigma_{i}^{\gamma}$, the waterline does not touch the lowest cap $\sigma_1$ and stays at the same level for all rectangular tanks. Thus, each rectangular tank receives a volume of water proportional to its base, leading to $\sigma(L_i) = \frac{\sigma_i^\gamma}{ \sigma_{1}^\gamma + \cdots + \sigma_{n}^\gamma} \times \sigma_S$ or equivalently, $a_i = \frac{\sigma_i^\gamma}{ \sigma_{1}^\gamma + \cdots + \sigma_{n}^\gamma}$ for all $i=1, \ldots, n$. When $(C_1, \ldots, C_n) \in \mathcal{U}_k$, the total volume of water $\sigma_S$ satisfies
\begin{equation*}\sigma_k^{1-\gamma} \sum_{s=k}^n \sigma_s^\gamma + \sum_{s=1}^{k-1}\sigma_s \le \sigma_S \lt \sigma_{k+1}^{1-\gamma} \sum_{s=k+1}^n \sigma_s^\gamma + \sum_{s=1}^{k}\sigma_s.\end{equation*}
Thus, the first $k$ rectangular tanks are fully filled, and the remaining rectangular tanks share the remaining volume in proportion to their bases. This leads to formula (5.5).
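The water-filling picture can also be simulated directly. The following Python sketch finds the common waterline by bisection and reads off the volume in each tank. It assumes the standard deviations are sorted in ascending order and that $0 \lt \sigma_S \le \sum_i \sigma_i$; the function name `water_fill` and the example numbers are hypothetical.

```python
import math

def water_fill(sigma, sigma_S, gamma, iters=200):
    """Bisection on the waterline h so that the total water volume equals sigma_S."""
    bases = [s ** gamma for s in sigma]         # tank bases sigma_i^gamma
    caps = [s ** (1 - gamma) for s in sigma]    # tank heights sigma_i^(1-gamma)

    def volume(h):
        # Each tank holds base * min(waterline, its own height).
        return sum(b * min(h, c) for b, c in zip(bases, caps))

    lo, hi = 0.0, max(caps)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if volume(mid) < sigma_S:
            lo = mid
        else:
            hi = mid
    h = 0.5 * (lo + hi)
    return h, [b * min(h, c) for b, c in zip(bases, caps)]

# Hypothetical example: three independent risks with gamma = 0.5.
sigma = [1.0, 2.0, 4.0]
sigma_S = math.sqrt(sum(s * s for s in sigma))
height, post_sd = water_fill(sigma, sigma_S, gamma=0.5)

# Closed form for the second participant when only the first tank is full.
closed_form_2 = (sigma_S - sigma[0]) * sigma[1] ** 0.5 / (sigma[1] ** 0.5 + sigma[2] ** 0.5)
```

Here only the first tank is full, and the simulated volumes agree with the closed form for $\mathcal{U}_1$ up to the bisection tolerance.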
For the case of $\gamma=1$, the bases of the rectangular tanks become $\sigma_1, \ldots, \sigma_n$ and their heights all equal 1. Thus, the combined tank degenerates to a big rectangular tank without the stepped shape. Therefore, the water-filling process becomes relatively straightforward: all individual tanks always share the total volume proportionally.
5.2. Properties of the optimal solution
In this subsection, we establish some properties of the optimal solution to Problem (2.4), including the behaviors of both the optimal sharing ratios and post-transfer risks. These results further demonstrate the desirability of the residual risk sharing model.
Proposition 5.5. The optimal solution satisfies the following properties:
- (a) Low risk, low sharing ratio: For any $i \lt j$, $a_i \le a_j$.
- (b) Pre-/post-riskiness consistency: For any $i \lt j$, $L_i$ has less variability than $L_j$ in the following senses:
  - (i) $|L_i - {E}[L_i]| \le_{a.s.} |L_j - {E}[L_j]|$
  - (ii) $|L_i - {E}[L_i]| \le_{cx} |L_j - {E}[L_j]|$
- (c) High risk, high reduction: $\sigma_i - \sigma(L_i) \le \sigma_j - \sigma(L_j)$ for any $i \lt j$.
- (d) Monotonicity in pre-transfer standard deviation: $\sigma(L_i)$ increases as $\sigma_i$ increases.
Proof. Suppose $(C_1, \ldots, C_n)\in \mathcal{U}_k$, which implies $\sigma_S \ge d_k =\sigma_k^{1-\gamma} \sum_{s=k}^n \sigma_s^\gamma + \sum_{s=1}^{k-1}\sigma_s$.
- (a) For any $i \lt j \le k$, $a_i = \frac{\sigma_i}{\sigma_S} \le \frac{\sigma_j}{\sigma_S} =a_j$.
For any $k+1 \le i \lt j$, it holds that
\begin{align*}a_i &= \frac{\sigma_S - (\sigma_1+\cdots + \sigma_k)}{\sigma_S} \times \frac{\sigma_i^\gamma}{ \sigma_{k+1}^\gamma + \cdots + \sigma_{n}^\gamma}\\& \le \frac{\sigma_S - (\sigma_1+\cdots + \sigma_k)}{\sigma_S} \times \frac{\sigma_j^\gamma}{ \sigma_{k+1}^\gamma + \cdots + \sigma_{n}^\gamma} =a_j.\end{align*}
For any $i\le k$ and $j \ge k+1$, noting that $a_i \le a_{k}$ and $a_j \ge a_{k+1}$, it suffices to prove that $a_{k}\le a_{k+1}$, or equivalently
\begin{equation}\frac{\sigma_k}{\sigma_S} \le \frac{\sigma_S - (\sigma_1+\cdots + \sigma_k)}{\sigma_S} \times \frac{\sigma_{k+1}^\gamma}{ \sigma_{k+1}^\gamma + \cdots + \sigma_{n}^\gamma},\tag{5.11}\end{equation}
which holds true due to the assumption of $(C_1, \ldots, C_n)\in \mathcal{U}_k$.
- (b) Recall from (3.1) that $L_i - {E}[L_i] = a_i(S - {E}[S])$. Property b(i) follows immediately from (a), and b(ii) follows from Theorem 3.A.18 of Shaked and Shanthikumar (2007).
- (c) For any $i\le k$, $\sigma(L_i) = \sigma_i$, and thus $\sigma_i - \sigma(L_i) = 0 \le \sigma_j -\sigma(L_j)$ for any $j \gt i$.
For any $i \gt k$, note that
\begin{equation}\sigma_i - \sigma(L_i) = \sigma_i^{\gamma} \times \left(\sigma_i^{1-\gamma} - \frac{\sigma_S - (\sigma_1+ \cdots + \sigma_k)}{\sigma_{k+1}^\gamma + \cdots + \sigma_n^\gamma} \right).\tag{5.12}\end{equation}
Since both factors on the right-hand side are nonnegative and increasing in $i$, the conclusion of (c) immediately follows.
- (d) For any $i\le k$, $\sigma(L_i)=\sigma_i$ and is thus clearly increasing in $\sigma_i$.
For $i \gt k$, $\sigma(L_i) = ({\sigma_S - (\sigma_1+\cdots + \sigma_k)}) \times \frac{\sigma_i^\gamma}{ \sigma_{k+1}^\gamma + \cdots + \sigma_{n}^\gamma}$. Since all the correlation coefficients are assumed to be nonnegative, it is easy to conclude that $\sigma_S$ increases as any $\sigma_i$ increases. Therefore, both factors of $\sigma(L_i)$ are increasing in $\sigma_i$, which implies that $\sigma(L_i)$ is increasing in $\sigma_i$.
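The inequality $a_k \le a_{k+1}$ in part (a) of the proof is stated without its intermediate steps. One way to fill them in, assuming the risks are indexed so that $\sigma_k \le \sigma_{k+1}$ (an ordering assumption, consistent with $\sigma_1$ being the lowest cap) and $1 \le k \le n-1$, is the following sketch:
\begin{align*}\sigma_S - \sum_{s=1}^{k}\sigma_s &\ge d_k - \sum_{s=1}^{k}\sigma_s = \sigma_k^{1-\gamma}\sum_{s=k}^{n}\sigma_s^\gamma - \sigma_k = \sigma_k^{1-\gamma}\sum_{s=k+1}^{n}\sigma_s^\gamma,\\ \left(\sigma_S - \sum_{s=1}^{k}\sigma_s\right) \times \frac{\sigma_{k+1}^\gamma}{\sum_{s=k+1}^{n}\sigma_s^\gamma} &\ge \sigma_k^{1-\gamma}\,\sigma_{k+1}^\gamma \ge \sigma_k^{1-\gamma}\,\sigma_k^\gamma = \sigma_k.\end{align*}
Dividing both sides of the second line by $\sigma_S$ gives $a_{k+1} \ge a_k$.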
Although Properties b(i) and b(ii) follow from Property (a), they have a different focus. While Property (a) describes the behavior of the optimal sharing ratios, Property (b) compares the riskiness levels of the post-transfer risks. Specifically, it indicates that a participant with a lower pre-transfer risk will end up with a lower post-transfer risk. The post-transfer risks are ordered not only in terms of variance (as implied by b(i)) but also in the much stronger senses described in b(i) and b(ii). Property b(ii) is also expressed by saying that $L_i$ is less than $L_j$ in the dilation order, which compares the variability of random variables. More discussion of this order can be found in Shaked and Shanthikumar (2007). Note that Properties b(i) and b(ii) do not imply each other.
Properties (a) and (c) in Proposition 5.5 bear straightforward interpretations through Figure 3. Specifically, $\sigma(L_i)$ and $\sigma_i-\sigma(L_i)$ are, respectively, represented by the shaded area and unshaded area in the $i^{th}$ rectangular tank. For two tanks sharing the same waterline, the one with the wider base naturally has a larger shaded area and a larger unshaded area. When one or both tanks are fully filled, the comparison is also clearly demonstrated by Figure 3.
The properties established in Proposition 5.5 confirm the desirability of the residual risk sharing model in practice. Specifically, Properties (a) and (b) express riskiness fairness, that is, a participant with a lower pre-transfer risk will end up with a lower risk sharing ratio and a lower post-transfer risk (in multiple senses). Meanwhile, Property (c) encourages the participation of agents with high riskiness levels, because they will enjoy a larger reduction in riskiness. Property (d) ensures that no one would benefit by intentionally increasing the level of riskiness (the pre-transfer variance) and thus prevents moral hazard to a certain extent.
In practice, a valid concern is whether a risk sharing program is sustainable, that is, whether the mechanism discourages existing participants from staying or new participants from joining. The rationale for an existing participant to remain in the program is ensured by the variance reduction condition. Below, we investigate the motivation for a new participant to join.
Let $C_{n+1}$ be the risk to be added to the original risk pool, namely, $\{C_1, \ldots, C_n\}$. Denote by $\sigma_{n+1}$ the standard deviation of $C_{n+1}$ and by $\rho_{i\, n+1}$ the correlation coefficient between $C_{n+1}$ and $C_i$ for $i=1, \ldots, n$. Furthermore, define
\begin{equation} \widehat{\rho}_h = \max_{1\le i \le n} \rho_{i\,n+1}.\tag{5.13}\end{equation}
Denote $S = \sum_{i=1}^n C_i$ and $\widehat{S} = \sum_{i=1}^{n+1} C_i$. Let $\{L_1, \ldots, L_n\}$ and $\{\widehat{L}_1, \ldots, \widehat{L}_n, \widehat{L}_{n+1}\}$ represent the post-transfer risks, respectively, for the original risk pool and the augmented risk pool (with the addition of $C_{n+1}$) under the optimal strategy.
Proposition 5.6. The addition of $C_{n+1}$ benefits every existing participant in the sense that $Var[\widehat{L}_i] \le Var[{L}_i]$ for $i=1, \ldots, n$, if
\begin{equation}\sigma_{n+1} \leq 2\left( 1- \frac{n\left(\widehat{\rho}_h - \rho_l\right)}{1-\rho_l}\right)\frac{\sum_{j=1}^n \sigma_j}{n-1}.\tag{5.14}\end{equation}
Proof. See Appendix A.3 in the supplementary document.
Proposition 5.6 provides a sufficient condition, (5.14), under which it is sensible to enlarge the risk pool, which makes the risk sharing program sustainable. This condition is referred to as the sustainability condition. It indicates that, for a new risk to be added, its riskiness level, $\sigma_{n+1}$, should not exceed the threshold $2\left( 1- \frac{n(\widehat{\rho}_h - \rho_l)}{1-\rho_l}\right)\frac{\sum_{j=1}^n \sigma_j}{n-1}$, which can be regarded as the baseline $2 \frac{\sum_{j=1}^n \sigma_j}{n-1}$ adjusted by a factor reflecting the dependence structure. The baseline is slightly above two times the average riskiness of the existing pool, which serves as a reasonable upper bound. The adjustment term, $\frac{n(\widehat{\rho}_h - \rho_l)}{1-\rho_l}$, reflects the competing impacts of the dependence strength among the existing risks, characterized by $\rho_l$, and the dependence strength between the new risk and the existing risks, characterized by $\widehat{\rho}_h$. In particular, the adjustment lowers the threshold if $\widehat{\rho}_h \gt \rho_l$, indicating a more restrictive sustainability condition. This is because the addition of $C_{n+1}$ introduces a highly dependent risk and thus weakens the diversification effect among the existing risks.
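To get a feel for the sustainability condition, the threshold on $\sigma_{n+1}$ can be computed for a few inputs. The Python sketch below is illustrative only: `sustainability_threshold` is a hypothetical helper name, and $\rho_l$ is passed in as a number, on the assumption that it has been computed from the existing pool according to the definition given earlier in the paper.

```python
def sustainability_threshold(sigma, rho_hat_h, rho_l):
    """Upper bound on sigma_{n+1} from the sufficient condition of Proposition 5.6."""
    n = len(sigma)
    if n < 2:
        raise ValueError("the bound divides by n - 1, so at least two risks are needed")
    if rho_l >= 1.0:
        raise ValueError("the bound divides by 1 - rho_l, so rho_l must be below 1")
    adjustment = n * (rho_hat_h - rho_l) / (1.0 - rho_l)  # dependence adjustment
    baseline = 2.0 * sum(sigma) / (n - 1)                 # about twice the average riskiness
    return (1.0 - adjustment) * baseline

# Hypothetical pool of three risks.
sigma = [1.0, 2.0, 3.0]
same = sustainability_threshold(sigma, rho_hat_h=0.1, rho_l=0.1)
higher = sustainability_threshold(sigma, rho_hat_h=0.2, rho_l=0.1)
```

With these inputs the threshold equals the baseline of 6 when $\widehat{\rho}_h = \rho_l$, and it drops to about 4 when $\widehat{\rho}_h$ rises to 0.2, showing how quickly a more strongly correlated newcomer tightens the condition.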
6. Other relevant models
In this section, we study a few problems relevant to Problem (2.4). Through uncovering the connections among these problems, we gain insights on how to formulate the risk sharing problems in a more general setup.
To this end, we recall Problems (2.3) and (2.7) below:
\begin{equation*}\min_{(L_1, \ldots, L_n)\in \mathcal{C}_{vr}} \quad \sum\limits_{i=1}^{n}Var[L_i], \end{equation*}
\begin{equation*}\min_{(L_1, \ldots, L_n)\in \mathcal{C}_{rc}^1} \quad \sum\limits_{i=1}^{n}Var[L_i]. \end{equation*}
Clearly, Problems (2.3) and (2.7) are modifications of Problems (2.5) and (2.6) (which are two special cases of Problem (2.4)), respectively. Each modification lies in the omission of one constraint, namely the constraint of $(L_1, \ldots, L_n)\in \mathcal{C}_{rc}^0$ for Problem (2.3) and the constraint of $(L_1, \ldots, L_n)\in \mathcal{C}_{vr}$ for Problem (2.7). Interestingly, such an omission does not alter the optimal solution. In other words, Problems (2.3) and (2.5) are equivalent, and so are Problems (2.7) and (2.6), as demonstrated by the following propositions.
Proof. See Appendix B.1 in the supplementary document.
Compared to Problem (2.3), Problem (2.5) has an additional constraint: 0-retention consistency. The equivalence between these two problems indicates that this constraint is redundant. As demonstrated by the proof, the solution to (2.3) inherently satisfies the constraint of 0-retention consistency, even without imposing this condition. This can be intuitively explained by the water-filling model. By the nature of water-filling, the waterline starts at the same level across individual tanks until it reaches the caps, from low to high, respectively. This guarantees that the volume of water in a tank with a lower cap never exceeds that in a tank with a higher cap, which is exactly the requirement of 0-retention consistency.
Similarly, we have the equivalence between Problem (2.7) and Problem (2.6).
Proof. See Appendix B.2 in the supplementary document.
The equivalence between Problems (2.6) and (2.7) indicates the redundancy of the variance reduction condition in the presence of the constraint of 1-retention consistency. Indeed, when 1-retention consistency is considered, the water-filling process is simplified because the step-shaped tank degenerates to a regular rectangle, as shown in Figure 2(b). When the total volume of water, $\sigma_S$, reaches its highest possible value, $\sum_{i=1}^n \sigma_i$, the water volumes in the individual tanks simultaneously reach the levels of $\sigma_1, \ldots, \sigma_n$, respectively, leading to an inherent fulfillment of the variance reduction constraint.
Proof. See Appendix B.3 in the supplementary document.
The constraint of $\gamma$-retention consistency in Problem (2.4) reflects the degree of discrepancy in risk attitudes among the participants. Specifically, a larger value of $\gamma$ signifies a greater discrepancy in the maximum riskiness levels that different participants can tolerate, in the sense that $\frac{Var[L_i]}{Var[L_j]} \le \frac{\sigma_i^\gamma}{\sigma_j^\gamma}$ for any $i \lt j$. The equivalence between Problems (2.4) and (2.8) suggests that the constraint on the discrepancy of risk attitudes can be inherently incorporated by reshaping the objective function into a weighted-average format. This equivalence provides a new perspective for understanding Problem (2.4) and offers insights for formulating and solving more general problems. For instance, when considering risk measures other than variance, the discrepancy in risk attitudes based on the new measures may be challenging to study in the format of $\gamma$-retention consistency. However, it might be more manageable when incorporated into a weighted-average objective function. This is subject to further investigation.
7. Case studies
In this section, we present two case studies to demonstrate the effectiveness of the residual risk sharing strategy. We also analyze the behavior of post-sharing risks within these scenarios and discuss potential avenues for future research inspired by these observed behaviors.
7.1. The two-agent model
In this subsection, we illustrate the effectiveness of the residual risk sharing model in variance reduction using a toy model with only two agents and examine how the effects of variance reduction vary across different models under varying constraints.
Let $C_1$ and $C_2$, respectively, denote the pre-transfer risks of Participant 1 and Participant 2. Their standard deviations are, respectively, denoted as $\sigma_{1}$ and $\sigma_{2}$. Assume the correlation coefficient between $C_1$ and $C_2$ is $\rho$.
Suppose the goal is to minimize the total post-transfer variance without imposing any additional constraints, that is, to solve the unconstrained risk sharing problem (2.1). According to Theorem 4.1, the optimal risk sharing strategy is given by:
\begin{equation*}L_i = E[C_i] + \frac{1}{2}(C_1 + C_2 - E[C_1] - E[C_2]), \quad \text{for }i=1, 2. \end{equation*}
The total variance of the post-transfer risks is given by:
 \begin{equation*}Var[L_1] + Var[L_2] = \frac{1}{2}Var[C_1+C_2] = \frac{1}{2}\left(\sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2\right).\end{equation*}
It is easy to verify that the total variance of the post-transfer risks does not exceed that of the pre-transfer risks, which is expected as variance reduction is the primary goal of the risk sharing. To quantify the effect of variance reduction, we use the concept of variance reduction ratio as discussed in Feng et al. (2020). Specifically, the variance reduction ratio is calculated as:
\begin{equation} 1 - \frac{Var[L_{1}]+Var[L_{2}]}{Var[C_{1}]+Var[C_{2}]} = \frac{1}{2} - \rho \cdot \left(\frac{\sigma_{2}}{\sigma_{1}} + \frac{\sigma_{1}}{\sigma_{2}}\right)^{-1}. \tag{7.1}\end{equation}
Equation (7.1) indicates that the variance reduction ratio is determined solely by the correlation coefficient and the ratio of the two standard deviations. This relationship allows us to make several observations.
First, for a fixed ratio of $\sigma_2/\sigma_1$, the variance reduction ratio decreases as the correlation coefficient $\rho$ increases. This aligns with the intuition that lower correlation leads to greater diversification. In particular, if $\rho =-1$, the variance reduction ratio reaches its maximum possible value 1 when $\sigma_1= \sigma_2$, as the two risks would perfectly hedge each other and result in a zero total post-transfer variance. On the contrary, if $\rho=1$, the variance reduction ratio reaches its minimum value 0 when $\sigma_1= \sigma_2$, as the two risks would be perfectly correlated, completely eliminating the diversification effect.
Second, when the correlation coefficient is positive, the variance reduction ratio attains its minimum when $\sigma_2/\sigma_1=1$ and increases as this ratio moves away from 1 in either direction. This observation suggests that the residual risk sharing mechanism favors risk pools with a greater disparity in risk levels. In other words, the larger the difference between $\sigma_1$ and $\sigma_2$, the greater the potential variance reduction achieved through residual risk sharing. This somewhat contrasts with the traditional insurance preference for pooling homogeneous risks over heterogeneous ones. Indeed, a closer examination reveals that when $\sigma_2 / \sigma_1$ is large, the unconstrained optimal risk sharing strategy may not be fair. The significant reduction in total variance is achieved by transferring variance from the larger risk $C_2$ to the smaller risk $C_1$, potentially resulting in the smaller risk having a post-transfer variance that exceeds its pre-transfer variance. This situation is clearly undesirable and motivates the constrained risk sharing models discussed in Section 2, which will be demonstrated in the upcoming case studies.
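The fairness concern can be checked numerically. The following is a minimal Python sketch, assuming the two-agent unconstrained solution above; the function names and the parameter values ($\sigma_1=1$, $\sigma_2=3$, $\rho=0.5$) are hypothetical and chosen only for illustration.

```python
def unconstrained_post_variance(sigma1: float, sigma2: float, rho: float) -> float:
    """Post-transfer variance of each agent under equal residual sharing."""
    # Var[L_i] = Var[(C_1 + C_2) / 2] = (sigma1^2 + sigma2^2 + 2*rho*sigma1*sigma2) / 4
    return (sigma1**2 + sigma2**2 + 2.0 * rho * sigma1 * sigma2) / 4.0


def unconstrained_reduction_ratio(sigma1: float, sigma2: float, rho: float) -> float:
    """Variance reduction ratio (7.1): 1/2 - rho / (sigma2/sigma1 + sigma1/sigma2)."""
    return 0.5 - rho / (sigma2 / sigma1 + sigma1 / sigma2)


# Hypothetical parameters: agent 2 is three times as risky as agent 1.
sigma1, sigma2, rho = 1.0, 3.0, 0.5
post_var = unconstrained_post_variance(sigma1, sigma2, rho)
ratio = unconstrained_reduction_ratio(sigma1, sigma2, rho)

# Agent 1 is worse off if its post-transfer variance exceeds sigma1^2.
agent1_worse_off = post_var > sigma1**2

print(f"total variance reduction ratio: {ratio:.3f}")
print(f"Var[L_1] = {post_var:.3f} versus Var[C_1] = {sigma1**2:.3f}")
print(f"agent 1 worse off: {agent1_worse_off}")
```

With these illustrative inputs the pool as a whole reduces its variance, while the smaller risk ends up carrying more variance than it started with, which is the situation the constrained models are designed to rule out.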
The above observations are illustrated in the contour plots of the variance reduction ratio as a function of $\sigma_1$ and $\sigma_2$, as shown in Figure 4. The four contour plots correspond to different levels of the correlation coefficient $\rho$. Each contour is labeled with a value indicating the variance reduction ratio. Notably, each contour is a straight line, confirming that for a fixed $\rho$, the variance reduction ratio depends solely on the ratio $\sigma_2 / \sigma_1$. The relations of the variance reduction ratio to $\rho$ and to $\sigma_2/\sigma_1$ are, respectively, demonstrated by the comparison across different plots and the examination of contour values within each plot.

Figure 4. Percentage variance reduction for unconstrained residual risk sharing.
In the following, we use the two-agent model to demonstrate the solution to Problem (2.5), the optimal risk sharing problem with the constraints of variance reduction and 0-retention consistency. As seen in (7.1), the variance reduction ratio is a function of $\sigma_2/\sigma_1$. To closely study the behavior of this function in the constrained model, we denote $r=\sigma_2/\sigma_1$ and assume $r\ge 1$.
Following Theorem 5.3, the post-transfer variances of the two agents under the optimal risk sharing strategy are specified as follows:
- (i) If $\sigma(C_1+C_2) \lt 2\sigma_1$, or equivalently, $2\rho r + r^2 \lt 3$,
\begin{equation*}Var[L_{1}] = Var[L_2] = \frac{1}{4}\left(\sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2\right) = \frac{1}{4}\sigma_1^2\left(1+2\rho r + r^2\right). \end{equation*}
- (ii) If $\sigma(C_1+C_2)\ge 2\sigma_1$, or equivalently, $2\rho r + r^2 \ge 3$,
\begin{equation*}Var[L_{1}] = \sigma_1^2 \quad \text{ and }\quad Var[L_2] = (\sigma(C_1+C_2)-\sigma_1)^{2} = \sigma_1^2 \left(\sqrt{1+2\rho r + r^2}-1\right)^{2}. \end{equation*}
The total variance reduction ratio can be expressed as:
 \begin{equation} 1 - \dfrac{Var[L_{1}]+Var[L_{2}]}{Var[C_{1}]+Var[C_{2}]} = \begin{cases} \dfrac{1}{2} - \dfrac{\rho r}{1+r^2} & \text{if $2\rho r + r^2 \lt 3$}\\[15pt]\dfrac{2\left(\sqrt{1+ 2\rho r + r^2}-1-\rho r\right)}{1+r^2} & \text{if $2\rho r + r^2 \ge 3$} \end{cases}. \end{equation}
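As a consistency check, the second branch follows directly from case (ii). Write $A = 1+2\rho r + r^2$ (a shorthand introduced here for this check only), so that $Var[L_1]+Var[L_2] = \sigma_1^2\left(1+(\sqrt{A}-1)^2\right)$ and $Var[C_1]+Var[C_2] = \sigma_1^2(1+r^2)$. Then
\begin{equation*}1 - \frac{1+(\sqrt{A}-1)^2}{1+r^2} = \frac{r^2 - A + 2\sqrt{A} - 1}{1+r^2} = \frac{2\left(\sqrt{A}-1-\rho r\right)}{1+r^2}.\end{equation*}
The two branches agree on the boundary $2\rho r + r^2 = 3$, where $\sqrt{A} = 2$ and both expressions reduce to $(2-2\rho r)/(1+r^2)$.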
The behavior of the variance reduction ratio as a function of $\rho$ is similar to that in the unconstrained model, as discussed earlier in this section. However, its behavior as a function of $r = \sigma_2 / \sigma_1$ differs significantly. Specifically, in the unconstrained model with a positive correlation coefficient, the variance reduction ratio increases with $r \in [1, \infty)$, reaching its maximum as $r \to \infty$, which, as previously noted, raises fairness concerns. In the constrained model, however, this is no longer the case; it is evident that the variance reduction ratio falls to zero as $r$ approaches infinity. A numerical analysis shows that the optimal variance reduction ratio of $0.265$ is achieved at $r = 1.603$ when $\rho=0.5$, which aligns more closely with our intuition.
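The numerical analysis can be reproduced approximately with a simple grid search. The sketch below is one possible approach, not necessarily the procedure used for the reported figures; the function names, the search interval $[1, 5]$, and the step size are assumptions made for illustration.

```python
import math


def constrained_reduction_ratio(r: float, rho: float) -> float:
    """Total variance reduction ratio in the constrained two-agent model.

    r = sigma2 / sigma1 is assumed to satisfy r >= 1.
    """
    a = 1.0 + 2.0 * rho * r + r * r  # Var[C_1 + C_2] / sigma1^2
    if 2.0 * rho * r + r * r < 3.0:
        # Case (i): both agents share the residual risk equally.
        return 0.5 - rho * r / (1.0 + r * r)
    # Case (ii): agent 1 retains exactly its pre-transfer variance.
    return 2.0 * (math.sqrt(a) - 1.0 - rho * r) / (1.0 + r * r)


def argmax_on_grid(rho: float, r_min: float = 1.0, r_max: float = 5.0,
                   step: float = 0.001) -> tuple[float, float]:
    """Return (r, ratio) maximizing the ratio over an evenly spaced grid."""
    n_steps = int(round((r_max - r_min) / step))
    grid = [r_min + k * step for k in range(n_steps + 1)]
    best_r = max(grid, key=lambda r: constrained_reduction_ratio(r, rho))
    return best_r, constrained_reduction_ratio(best_r, rho)


best_r, best_ratio = argmax_on_grid(rho=0.5)
print(f"rho = 0.5: maximum ratio {best_ratio:.3f} near r = {best_r:.3f}")
```

Because the ratio is very flat around its maximum, the location of the maximizer is sensitive to the grid resolution, whereas the maximal value itself is stable.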
The behavior of the variance reduction ratio is illustrated by the contour plots in Figure 5. In the plots for $\rho=0.25$ and $\rho=0.5$, note that the two contour lines in the lower right corner have repeating values, indicating a maximum is reached between them. This contrasts with the consistent trend observed in Figure 4, reflecting the different behavior of the variance reduction ratio as a function of $\sigma_2 / \sigma_1$ discussed above.

Figure 5. Percentage total variance reduction for residual risk sharing satisfying variance reduction and 0-retention consistency.
7.2. Application to flood insurance
In this subsection, we apply the residual risk sharing strategy to a flood risk pool, which was studied in Feng et al. (2020). We shall investigate the effectiveness of the residual risk sharing strategy and compare it to the study in Feng et al. (2020). Through this case study, we shall also illustrate the impact of the constraint of $\gamma$-retention consistency at different levels of $\gamma$, which reflects the degree of discrepancy in risk attitude.
The data for this study is sourced from the US National Flood Insurance Program (NFIP), as documented in FEMA (2020). This dataset includes millions of publicly available records spanning decades of issued flood insurance policies. To make our study comparable to that of Feng et al. (2020), we focus on claims filed within the United States between 2000 and 2019. We further narrow the dataset to claims related to single-floor properties, which allows for direct comparison without additional normalization for property size or type.
Each state is treated as one participant of the peer-to-peer insurance program. Using the prepared dataset, the total payment amount for flood claims in each quarter for each state is calculated and used as a data point of the state’s flood risk. In this way, a sample of 40 data points for each state is created, as is a joint sample representing the risk vector formed by the combined risks of all 50 states. From this sample, we estimate the standard deviations of each individual state’s risk and of the aggregate risk. We then apply the optimal residual risk sharing strategies under the unconstrained model (2.1) and the constrained model (2.4) with $\gamma = 0$ and $\gamma = 1$, respectively, and evaluate both the overall and individual variance reduction ratios for each state according to the formulas in Theorems 4.1 and 5.3. The results are summarized in Tables 1 and 2.
Table 1. Standard deviation of claims paid (in units of thousands) and variance reduction between pre- and post-pooling.

Table 2. Optimal residual risk sharing under 0- and 1-retention consistency.

Table 1 presents the variance reduction ratios (in the column of “Residual risk”) under the optimal strategy for the unconstrained model. The results from Feng et al. (2020) are also listed (in the column of “Whole risk”) for the purpose of comparison.
The results indicate that the residual risk sharing strategy reduces the total variance by 92.54%, surpassing the ratio of 88.03% achieved by the whole risk sharing strategy. This aligns with the findings discussed in Proposition 4.3. In addition, while Feng et al. (2020) observed that high-exposure states, particularly those along the Gulf Coast such as Florida, Louisiana, and Texas, experienced the lowest individual variance reduction, our model shows improved variance reduction for these states. This is because residual risk sharing tends to provide a better variance reduction effect to those with higher pre-transfer variance. Furthermore, residual risk sharing yields substantial risk reduction for states with unexpected flood events. For example, North Dakota experiences a 99% reduction in pre-transfer risk, which is attributed to its high pre-transfer variance caused by the unexpected flooding of the Souris River in 2011.
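The computation behind the “Residual risk” column can be sketched as follows. This is a minimal Python illustration, assuming equal sharing of the aggregate residual risk $S-E[S]$ among $n$ participants; the small claims matrix is made up for illustration and is not NFIP data, and the variable names are hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical quarterly claim totals (rows = quarters, columns = participants).
# These numbers are invented for illustration only.
claims = [
    [12.0, 3.0, 40.0],
    [15.0, 2.0, 5.0],
    [9.0, 4.0, 80.0],
    [11.0, 1.0, 10.0],
    [14.0, 5.0, 20.0],
    [10.0, 3.0, 60.0],
]

n = len(claims[0])
columns = list(zip(*claims))
individual_var = [pvariance(col) for col in columns]  # pre-transfer variances
aggregate = [sum(row) for row in claims]              # realizations of S
aggregate_var = pvariance(aggregate)

# Equal sharing of S - E[S]: each participant carries Var[S] / n^2.
post_var_each = aggregate_var / n**2
individual_ratio = [1.0 - post_var_each / v for v in individual_var]
total_ratio = 1.0 - n * post_var_each / sum(individual_var)

# Cross-check: apply L_i = E[C_i] + (S - E[S]) / n to the sample itself.
mean_s = mean(aggregate)
allocated = [[mean(col) + (s - mean_s) / n for col in columns] for s in aggregate]
empirical_post_var = [pvariance(col) for col in zip(*allocated)]

print("individual ratios:", [round(x, 3) for x in individual_ratio])
print("total ratio:", round(total_ratio, 3))
```

In this toy sample the participant with the largest pre-transfer variance obtains the largest individual reduction, in line with the observation above, while participants with very small pre-transfer variance can end up with a negative individual ratio under the unconstrained strategy.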
Table 2 presents the variance reduction ratios under the optimal strategies for the constrained model, respectively, with $\gamma = 0$ and $\gamma = 1$. It is interesting to note that, when $\gamma = 0$, the variance reduction ratios are the same as those achieved by the unconstrained model, as shown in Table 1. This finding may seem unexpected at first, as it implies that our dataset falls within $\mathcal{U}_{0}$ as defined at the start of Section 5.1, that is, $\sigma_{S} \leq n \sigma_{1}$, where $\sigma_{S}$ represents the total flood risk standard deviation across the United States, and $\sigma_{1}$ denotes the smallest pre-transfer standard deviation among the states. However, this is not a coincidence; it is well justified by Theorem 5.3. Specifically, the flood risks from different states fall into category $\mathcal{U}_0$, which means that the dependence strength among risks is low enough that equally sharing the residual risks provides sufficient diversification, leading to a minimum total post-transfer variance. For the model with $\gamma=1$, all the individual variance reduction ratios are equal. This is because the optimal strategy is for each participant to share the residual risks proportionally to his own pre-transfer standard deviation, according to Proposition 5.4. It is also worth noting that, with $\gamma = 1$, the total variance reduction ratio is only reduced by 2% compared to the model with $\gamma=0$.
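The two constrained cases can be illustrated with a short Python sketch. It assumes that, for $\gamma = 1$, participant $i$ takes the share $\sigma_i / \sum_j \sigma_j$ of the aggregate residual risk (the normalization is an assumption made here so that the shares sum to one); the function names and the three standard deviations are hypothetical.

```python
import math


def in_u0(sigmas: list[float], sigma_s: float) -> bool:
    """Condition quoted in the text: sigma_S <= n * (smallest sigma_i)."""
    return sigma_s <= len(sigmas) * min(sigmas)


def equal_sharing_total_ratio(sigmas: list[float], sigma_s: float) -> float:
    """Total variance reduction ratio when S - E[S] is shared equally."""
    n = len(sigmas)
    return 1.0 - (sigma_s**2 / n) / sum(s**2 for s in sigmas)


def proportional_individual_ratios(sigmas: list[float], sigma_s: float) -> list[float]:
    """Individual ratios when shares are sigma_i / sum(sigmas) (assumed normalization)."""
    total = sum(sigmas)
    # Var[L_i] = (sigma_i / total)^2 * sigma_S^2
    return [1.0 - ((s / total) * sigma_s) ** 2 / s**2 for s in sigmas]


# Hypothetical inputs: three uncorrelated risks, so sigma_S^2 is the sum of variances.
sigmas = [2.0, 3.0, 4.0]
sigma_s = math.sqrt(sum(s**2 for s in sigmas))

u0 = in_u0(sigmas, sigma_s)
equal_ratio = equal_sharing_total_ratio(sigmas, sigma_s)
prop_ratios = proportional_individual_ratios(sigmas, sigma_s)

print("in U_0:", u0)
print("equal sharing, total ratio:", round(equal_ratio, 4))
print("proportional sharing, individual ratios:", [round(p, 4) for p in prop_ratios])
```

Under proportional sharing every individual ratio equals $1 - \sigma_S^2 / (\sum_j \sigma_j)^2$, which is also the total ratio, and by the Cauchy-Schwarz inequality it cannot exceed the total ratio under equal sharing.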
In both case studies, the findings about the behavior of the variance reduction ratio naturally raise a key question: how would the optimal risk sharing strategy adapt in response to changes in parameters such as $\rho$, $\sigma_i$, and $\gamma$? Furthermore, how would these parameter shifts impact the optimal total and individual post-transfer variance reduction ratios? Addressing these questions could provide insights for developing more efficient and adaptable risk sharing frameworks. This is subject to future research.
8. Conclusion
In this paper, we examine the optimal risk sharing problem in peer-to-peer insurance. By expanding the class of admissible strategies to a more general functional space, our study extends beyond the existing models that are limited to linear strategies. We demonstrate that, even with this expansion of the admissible strategy class, the optimal risk sharing strategy remains linear in form, though it is based on the residual risks rather than the original risks. The linear residual risk sharing strategy offers several advantages. First, it indeed improves the objective function (total post-transfer variance) compared to the linear strategies applied to the original risks. Second, its simple structure allows an explicit optimal solution without requiring numerical procedures, favoring its implementation in practice. Third, the optimal sharing strategy preserves risk anonymity in the sense that the loss allocated to each participant does not rely on the realization of individual losses but only on the aggregate loss.
When no constraint is imposed, the optimal strategy is to equally share the aggregate residual risk $S-E[S]$ among all participants. To promote market development, we introduce the constraints of variance reduction and $\gamma$-retention consistency. With these constraints in place, the optimal strategy still follows the form of linear residual risk sharing. For participants with “high” riskiness, the optimal risk sharing ratios remain uniform, while for those with “low” riskiness, the ratios are restricted by the constraints. The categorization of “high” and “low” riskiness depends on the comparison between the aggregate riskiness $\sigma_S$ and the individual riskiness levels $\{\sigma_1, \ldots, \sigma_n\}$. We provide sufficient conditions and simplify the characterization into a more direct comparison between the overall dependence strength, as measured by the smallest and largest correlation coefficients, and the degree of dispersion in the individual riskiness levels.
In addition to deriving explicit solutions, we establish several desirable properties that enhance practical implementation and promote market development. Specifically, we demonstrate the robustness of the optimal solutions against model uncertainty and estimation errors. We also ensure the consistency of the solutions to promote fair practices. Furthermore, we provide sufficient conditions for sustainability to support healthy market expansion.
Admittedly, the linearity of the optimal risk sharing strategy largely depends on the choice of risk measures. While the risk measure variance is adopted and investigated in this paper, it remains an open question which other risk measures also favor the optimality of the linear residual risk sharing. A natural follow-up question is, beyond these risk measures, what form the optimal risk sharing strategy should take and how to find explicit solutions. These questions are undoubtedly challenging to answer, but the study in this paper provides some valuable insights. First, the intuitive water-filling model we developed effectively explains the main results throughout the paper in a coherent manner. Second, by studying risk sharing problems under different combinations of constraints, we establish a clear relationship between the problems as well as constraints. These two findings enhance our understanding of the nature of the problem, the constraint, and the risk measure and thus provide insights for future generalizations.
Acknowledgments
The authors thank the editor and the two anonymous reviewers for their valuable comments, which have greatly improved the paper. Both authors acknowledge the support from the Campus Research Award (RB24052) provided by University of Illinois Urbana-Champaign.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/asb.2024.37.