1. Introduction
Since the pioneering contribution of Hanoch and Levy [Reference Hanoch and Levy14], stochastic dominance (SD) theory has become an important tool for comparing risks, which provides a systematic and efficient standard paradigm for analyzing people's decision-making behavior under uncertainty. SD theory is consistent with expected utility theory, but it does not require a specific utility function. It uses the whole probability distribution rather than some common numerical characteristics such as mean and standard derivation. SD theory has been well-developed, and there are hundreds of papers on SD and its applications (see, e.g., [Reference Hadar and Russell13,Reference Rothschild and Stiglitz28]). The most popular notions of SD in practical applications are the first-order stochastic dominance (FSD) and the second-order stochastic dominance (SSD). They both possess some simple and tractable properties such as equivalent criteria based on integral conditions and probability transfer [Reference Levy19,Reference Müller and Stoyan24].
Although SD theory is attractive, the related literature has shown that its practical scope is rather limited [Reference Atkinson1]. The dominance relation between risky prospects $F$ and $G$ may fail to hold even when the vast majority of investors would prefer $F$ over $G$ (e.g., when $F$ lies almost entirely below $G$). To provide a standard for comparing such risks and to reveal a preference shared by “most” decision makers, Leshno and Levy [Reference Leshno and Levy17] extended SD theory to almost stochastic dominance (ASD) by eliminating pathological preferences. Bali et al. [Reference Bali, Demirtas, Levy and Wolf3] illustrated that ASD can unambiguously explain some investor behaviors, such as a higher stock-to-bond ratio for long-term investment. ASD has since played an important role in a number of fields, especially insurance and finance, and has found many important applications [Reference Guo, Zhu, Wong and Zhu10,Reference Levy18,Reference Levy, Leshno and Leibovitch21,Reference Tzeng, Huang and Shih31].
In the recent literature, several papers have connected distorted distributions with stochastic comparisons (see, e.g., [Reference Denuit, Dhaene, Goovaerts and Kaas5,Reference Lando, Arab and Oliveira16,Reference Müller and Stoyan24,Reference Shaked and Shanthikumar30]). A function $H$ from $[0,1]$ to $[0,1]$ is called a distortion function if $H$ is increasing and satisfies $H(0)=0$ and $H(1)=1$. Let $X$ be a random variable on an atomless probability space $(\Omega, \mathscr {F}, P)$ with cumulative distribution function (cdf) $F$. The distorted distribution function of $F$ is defined as $H\circ F$, which can be regarded as the cdf of some random variable $X^{H}$ whenever $H$ is right continuous. Generally, the distortion function $H$ is interpreted as a subjective reweighting of the original cdf $F$ and reflects the risk attitude of decision makers [Reference Yaari35]. Intuitively, a concave $H$ attaches more weight to smaller outcomes, conforming to the idea of risk aversion. Yaari [Reference Yaari35] first applied the distorted distribution function in the dual theory of choice under risk, and several researchers proved that some SD rules derived from expected utility theory can be characterized through the distortion transform [Reference Levy and Wiener20,Reference Müller and Stoyan24]. For more applications of distortions in insurance and actuarial science, see, for example, Wang [Reference Wang32] and Denuit et al. [Reference Denuit, Dhaene, Goovaerts and Kaas5].
In this paper, we further develop ASD theory and its applications in reliability theory and biostatistics. Throughout, a distortion function $H$ is always assumed to be right continuous. The risk of $X$ is valued by its distorted expectation $\mathbb {E}_{H}(X)$, defined by
$$\mathbb {E}_{H}(X)=\int _{0}^{1}F^{-1}(u)\,\mathrm {d}H(u), \tag{1.1}$$
where $F^{-1}(u)=\inf \{x: F(x)\ge u\}$ for $u\in [0,1]$. Let $\mathscr {L}=\{X: X \text { is a random variable on } (\Omega, \mathscr {F}, P)\}$. Several researchers have investigated equivalent characterizations of classical SD rules based on distorted expectations. For random variables $X_i\in \mathscr {L}$, $i=1,2$, Levy and Wiener [Reference Levy and Wiener20] proved that $X_1$ is greater than $X_2$ in FSD if and only if, for all distortion functions $H$,
$$\mathbb {E}_{H}(X_1)\ge \mathbb {E}_{H}(X_2). \tag{1.2}$$
Levy and Wiener [Reference Levy and Wiener20] also showed that $X_1$ is greater than $X_2$ in SSD if and only if (1.2) holds for all convex distortion functions $H$. Wang and Young [Reference Wang and Young34] showed that $X_1$ is greater than $X_2$ in the increasing convex order if and only if (1.2) holds for all concave distortion functions $H$. These results are meaningful because many useful risk measures can be expressed as expectations of distorted distributions. Although FSD, SSD and the increasing convex order can all be characterized via distorted expectations, this is not true in general for ASD, because SD rules are defined through integrated distributions, whereas distorted expectations are related to integrated quantiles [Reference Muliere and Scarsini23]. We explore equivalent characterizations of AFSD rules based on distorted expectations and find that, for any $0<\varepsilon <1/2$, $X_1$ being greater than $X_2$ in $\varepsilon$-AFSD is equivalent to condition (1.2) holding for all $H$ in a certain class of distortion functions. Furthermore, we use this main result to derive further properties of AFSD under the distortion transform.
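Numerically, the distorted expectation can be approximated directly from this quantile-integral form. The following sketch is purely illustrative (the standard exponential example and all function names are our own choices, not part of the paper's development); it approximates $\mathbb {E}_{H}(X)$ by a midpoint rule for a distortion $H$ with density $H'$ on $(0,1)$:

```python
import numpy as np

def distorted_expectation(quantile, H_prime, n=200_000):
    # E_H(X) = int_0^1 F^{-1}(u) dH(u), approximated by a midpoint rule;
    # assumes the distortion H has a density H_prime on (0, 1).
    u = (np.arange(n) + 0.5) / n
    return np.mean(quantile(u) * H_prime(u))

# Standard exponential: F^{-1}(u) = -log(1 - u).
q = lambda u: -np.log1p(-u)
plain_mean = distorted_expectation(q, lambda u: np.ones_like(u))  # H(u) = u
# H(u) = 1 - (1-u)^2 turns F into the cdf of min(X1, X2), whose mean is 1/2.
gini2 = distorted_expectation(q, lambda u: 2.0 * (1.0 - u))
```

With $H(u)=u$ the distorted expectation reduces to the ordinary mean, and with $H(u)=1-(1-u)^2$ the distorted cdf $H\circ F$ is that of the minimum of two iid copies, so the two computed values should be close to $1$ and $1/2$, respectively.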
The paper is organized as follows. In Section 2, we recall the definition of AFSD and illustrate that AFSD does not possess invariance under increasing concave or convex transforms. The main characterization result of AFSD via distorted expectation is given in Section 3.1. The other properties of AFSD under distortion transform are listed in Section 3.2. The main results in Section 3 are applied to establish stochastic comparisons of order statistics and ROC curves via AFSD in Section 4.
2. Almost stochastic dominance and distorted expectation
2.1. Almost stochastic dominance
We begin by introducing some notations. Let $X_i$ denote the random asset with cdf $F_i$, $i=1,2$, and let $\mathscr {U}$ be the set of all differentiable utility functions on $\mathbb {R}$. For $0<\varepsilon <1/2$, define
Throughout, denote $x_+=\max \{x, 0\}$ for any $x\in \mathbb {R}$, and define $\|\varphi \|=\int ^{\infty }_{-\infty } |\varphi (x)|\,\mathrm {d} x$ for any function $\varphi :\mathbb {R}\to \mathbb {R}$. We recall the definition of AFSD.
Definition 2.1. [Reference Leshno and Levy17]
We say that $X_2$ is dominated by $X_1$ in $\varepsilon$-AFSD, denoted by $X_1\ge _{\varepsilon \text {-AFSD}} X_2$ or $F_1\ge _{\varepsilon \text {-AFSD}} F_2$, if and only if
(i) $\mathbb {E} [U(X_1)]\ge \mathbb {E}[U(X_2)]$ for all $U \in \mathscr {U}_{1}^{\varepsilon }$, or
(ii) $\|(F_1-F_2)_+\| \le \varepsilon \|F_1-F_2\|$.
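Condition (ii) suggests a direct numerical check of $\varepsilon$-AFSD between two given cdfs. The sketch below is our own illustration (the two uniform distributions are a hypothetical choice, not from the paper); it approximates the smallest admissible $\varepsilon$ by the trapezoid rule on a grid:

```python
import numpy as np

def afsd_epsilon(F1, F2, grid):
    # Smallest eps with ||(F1 - F2)_+|| <= eps * ||F1 - F2||, by the trapezoid
    # rule; X1 dominates X2 in eps-AFSD for every eps at or above this ratio.
    d = F1(grid) - F2(grid)
    dx = np.diff(grid)
    pos = np.sum(dx * (np.maximum(d[1:], 0) + np.maximum(d[:-1], 0)) / 2)
    tot = np.sum(dx * (np.abs(d[1:]) + np.abs(d[:-1])) / 2)
    return pos / tot

x = np.linspace(0.0, 2.0, 80_001)
F1 = lambda t: np.clip(t / 2.0, 0.0, 1.0)    # uniform on [0, 2], mean 1
F2 = lambda t: np.clip(t - 0.25, 0.0, 1.0)   # uniform on [0.25, 1.25], mean 0.75
eps0 = afsd_epsilon(F1, F2, x)               # the cdfs cross, so 0 < eps0 < 1/2
```

Here the two cdfs cross once, FSD fails in both directions, yet the ratio equals $0.1$, so the first uniform dominates the second in $\varepsilon$-AFSD for every $\varepsilon \ge 0.1$.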
FSD can be regarded as $\varepsilon$-AFSD with $\varepsilon =0$. Levy and Wiener [Reference Levy and Wiener20] studied FSD and SSD through distorted expectations and investigated the classes of distortion functions that preserve FSD and SSD. In this paper, we mainly study the properties of AFSD under distortion transforms. It should be mentioned that AFSD is not invariant under increasing convex or concave transforms, as the following two examples show.
Example 2.1. Assume that $X_1$ and $X_2$ are two random variables with respective probability mass functions $F_1$ and $F_2$ described by Tables 1 and 2.
Hence,
Thus, $X_1$ dominates $X_2$ in terms of $\varepsilon$-AFSD, if and only if,
that is, $1/6\leq \varepsilon <1/2$. Choose the distortion function $H(u)=\sqrt {u}$, which is increasing and concave on $[0,1]$. Then,
Hence, $\| (H\circ F_1-H\circ F_2)_+\|=1/4$ and $\|H\circ F_1 - H\circ F_2\| =1/2$. Thus, for any $0<\varepsilon <1/2$, it holds that
That is, $X^{H}_1$ does not dominate $X_2^{H}$ in $\varepsilon$-AFSD.
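The probability mass functions in Tables 1 and 2 are not reproduced here, but the phenomenon is easy to replicate numerically. In the following hypothetical two-point example (our own choice of distributions, in the spirit of Example 2.1), the concave distortion $H(u)=\sqrt {u}$ pushes the AFSD ratio above $1/2$:

```python
import numpy as np

# Hypothetical two-point example: X1 is 0 or 10 with probabilities 0.3 / 0.7,
# while X2 equals 5 with probability 1 (so E[X1] = 7 > 5 = E[X2]).
x = np.linspace(0.0, 10.0, 100_001)
F1 = np.where(x < 10.0, 0.3, 1.0)
F2 = np.where(x < 5.0, 0.0, 1.0)

def afsd_ratio(G1, G2, grid):
    # ||(G1 - G2)_+|| / ||G1 - G2|| by the trapezoid rule
    d = G1 - G2
    dx = np.diff(grid)
    pos = np.sum(dx * (np.maximum(d[1:], 0) + np.maximum(d[:-1], 0)) / 2)
    tot = np.sum(dx * (np.abs(d[1:]) + np.abs(d[:-1])) / 2)
    return pos / tot

before = afsd_ratio(F1, F2, x)                   # 0.3: eps-AFSD for eps >= 0.3
after = afsd_ratio(np.sqrt(F1), np.sqrt(F2), x)  # sqrt(0.3) > 1/2: dominance lost
```

Before the transform, $X_1$ dominates $X_2$ in $\varepsilon$-AFSD for all $\varepsilon \ge 0.3$; after applying $H(u)=\sqrt {u}$, the ratio becomes $\sqrt {0.3}\approx 0.548>1/2$, so no $\varepsilon$-AFSD relation survives.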
Example 2.2. Assume that $X_1$ and $X_2$ are two random variables with respective probability mass functions $F_1$ and $F_2$ described by Tables 3 and 4.
Then,
Hence, $\|(F_1-F_2)_+\| =1/12$ and $\|F_1-F_2\|=7/12$. Thus, $X_2$ is dominated by $X_1$ in $\varepsilon$-AFSD, if and only if,
that is, $1/7\leq \varepsilon <1/2$. Let $H(u)=u^{2}$, which is increasing and convex on $[0,1]$. Then,
Hence, $\|(H\circ F_1-H\circ F_2)_+\|=17/144$ and $\|H\circ F_1-H\circ F_2\| = 53/144$. Then, for $\varepsilon =16/53$,
That is, $X^{H}_1$ does not dominate $X^{H}_2$ in $16/53$-AFSD.
2.2. Distorted expectation
Distorted expectations play an important role in the statistical, financial and economic literature. Suppose that $X$ is a random variable with cdf $F$ and that $H$ is a right-continuous distortion function, which leads to the distorted probability distribution $H\circ F$. It can be shown that the Choquet integral of $X$ with respect to the distortion function $H$ equals the expectation of $X$ under the distorted distribution $H\circ F$:
In finance and insurance, the Choquet integral with the distortion function has been proposed to measure risks [Reference Wang33]. For any nonnegative random variable $X$ such as loss, depending on the chosen distortion function, different distorted risk measures are obtained. When $H(u)=1-(1-u)^{n}$, $u\in [0, 1]$, for a positive integer $n$, we obtain
which is the generalized Gini index introduced by Donaldson and Weymark [Reference Donaldson and Weymark6]. When
for any $0\le \alpha \le 1$, we obtain
which is the expression of the popular risk measure ${\rm VaR}_{\alpha }$ (Value-at-Risk at confidence level $\alpha$). Let $\widetilde {H}(u) = 1-H(1-u)$, $u\in [0, 1]$, denote the dual distortion function of $H$. In insurance, $\mathbb {E}_{\widetilde {H}}(X)$ has been proposed to measure the risk and compute risk premiums of $X$, see Wang [Reference Wang32,Reference Wang33] and Denuit et al. [Reference Denuit, Dhaene, Goovaerts and Kaas5].
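As a quick numerical illustration of these distorted risk measures (our own sketch; the standard exponential example is an assumption for concreteness), note that the dual of the $n=2$ generalized Gini distortion $1-(1-u)^{2}$ is $\widetilde {H}(u)=u^{2}$, and $\widetilde {H}\circ F=F^{2}$ is the cdf of the maximum of two iid copies, so $\mathbb {E}_{\widetilde {H}}(X)=\mathbb {E}[\max (X_1,X_2)]$:

```python
import numpy as np

rng = np.random.default_rng(0)

# The dual of the n = 2 Gini distortion 1-(1-u)^2 is H(u) = u^2, and
# H o F = F^2 is the cdf of max(X1, X2); hence E_H(X) = E[max(X1, X2)],
# which equals 3/2 for iid standard exponential samples.
u = (np.arange(400_000) + 0.5) / 400_000
e_distorted = np.mean(-np.log1p(-u) * 2.0 * u)  # int_0^1 F^{-1}(u) H'(u) du
e_maximum = rng.exponential(size=(400_000, 2)).max(axis=1).mean()
```

Both the quadrature value and the Monte Carlo estimate should be close to $3/2$, matching the mean of the larger of two iid standard exponentials.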
Let $\mathscr {H}$ be the class of all distortion functions. For $H\in \mathscr {H}$, define
and
For $0<\varepsilon <1/2$, set
It is clear that $H_1\in \mathscr {H}_{1}^{\varepsilon }$, and some piecewise linear distortion functions without differentiability provided by Balbás et al. [Reference Balbás, Garrido and Mayoral2] are also included in $\mathscr {H}_{1}^{\varepsilon }$; for example:
For any $H\in \mathscr {H}_{1}^{\varepsilon }$, the distorted expectation $\mathbb {E}_{H}(\cdot )$, viewed as a risk measure, has some desirable properties such as law invariance, translation invariance, positive homogeneity and monotonicity. However, subadditivity is not necessarily satisfied, because a distortion function $H\in \mathscr {H}_{1}^{\varepsilon }$ need not be convex. The famous Kusuoka representation shows that, on an atomless probability space, any law-invariant coherent risk measure is a supremum of distorted expectations with convex distortions [Reference Föllmer and Schied8, Section 4.6], and it is necessarily a single distorted expectation under the additional assumption of comonotone additivity. The failure of subadditivity is illustrated by the next example.
Example 2.3. Suppose that $X_1$ and $X_2$ are identically distributed Bernoulli random variables with the joint distribution given in Table 5.
Assume that distortion function $H$ is as follows:
Clearly, $H\in \mathscr {H}_{1}^{\varepsilon }$ for $0<\varepsilon \leq \frac {3}{19}$. The distorted expectations are
and
Thus, we obtain
It is well known that distorted risk measures with convex distortions preserve several popular stochastic dominance relations. In this paper, we investigate the isotonicity of the distorted expectation with distortion function $H\in \mathscr {H}_{1}^{\varepsilon }$ under stochastic dominance, thereby developing the dual theory of choice under risk.
3. Main results
In this section, we characterize AFSD via distorted expectation, and then investigate the properties of AFSD under distortion transforms.
3.1. Characterization of AFSD via distorted expectations
In this subsection, we first show that the distorted expectation is isotonic under AFSD by using the following Lemma 3.1.
Lemma 3.1. For any $0<\varepsilon < 1/2$, $X_1\ge _{\varepsilon \text {-AFSD}} X_2$ if and only if
Proof. Note that
and
The desired result now follows directly.
Theorem 3.2. For $0<\varepsilon <1/2$, the following two statements are equivalent:
(i) $X_1\ge _{\varepsilon \text {-AFSD}} X_2$;
(ii) $\mathbb {E}_H[X_1]\ge \mathbb {E}_H[X_2]$ for all $H\in \mathscr {H}_1^{\varepsilon }$.
Proof. (i) $\Rightarrow$ (ii): From Lemma 3.1, we have, for any $0<\varepsilon <1/2$, $X_1\ge _{\varepsilon \text {-AFSD}}X_2$ if and only if
Thus, for any $H\in \mathscr {H}_{1}^{\varepsilon }$,
That is, $\mathbb {E}_{H}(X_1)\ge \mathbb {E}_{H}(X_2)$.
(ii) $\Rightarrow$ (i): Define a distortion function $H$ with right derivative
It is easy to verify that $H\in \mathscr {H}_{1}^{\varepsilon }$. Thus,
Hence, for all $H\in \mathscr {H}_{1}^{\varepsilon }$, $\mathbb {E}_{H} (X_1)\ge \mathbb {E}_{H}(X_2)$ implies (3.2). That is, $X_1\ge _{\varepsilon \text {-AFSD}}X_2$. This completes the proof of the theorem.
3.2. Properties under distortion transform
Levy and Wiener [Reference Levy and Wiener20] studied the closure of SD under a distortion transform on the space of distribution functions. They found that all distortion functions preserve FSD, whereas only concave distortion functions preserve SSD. In this subsection, we will prove that under the proper conditions, the $\varepsilon$-AFSD is also preserved under a distortion transform.
Proposition 3.3. For any $0<\varepsilon, \varepsilon _1<1/2$ such that $0<\varepsilon /\varepsilon _1 <1/2$, we have
Proof. From Theorem 3.2, it follows that $X_1^{H} \geq _{(\varepsilon /\varepsilon _1) \text {-AFSD}} X_2^{H}$ if and only if $\mathbb {E}_h [X_1^{H}] \ge \mathbb {E}_h [ X_2^{H}]$ for all $h\in \mathscr {H}_1^{\varepsilon /\varepsilon _1}$, that is,
holds. Set $H^{-1}(u)=v$, then (3.3) reduces to
Hence, to prove that (3.4) holds for all $h\in \mathscr {H}_1^{\varepsilon /\varepsilon _1}$, it suffices to verify $h\circ H\in \mathscr {H}_1^{\varepsilon }$. Since $h\in \mathscr {H}_1^{\varepsilon /\varepsilon _1}$ and $H\in \mathscr {H}_1^{\varepsilon _1}$, for $v\in [0, 1]$, we have
This implies that $h\circ H\in \mathscr {H}_1^{\varepsilon }$. This completes the proof.
Corollary 3.4. Let $\varepsilon _1, \varepsilon _2\in (0, 1/2)$ be such that $0<\varepsilon _1/\varepsilon _2< 1/2$. If $H_1$ and $H_2$ are two distortion functions with $H_2\circ H_1^{-1}\in \mathscr {H}_1^{\varepsilon _1/\varepsilon _2}$, then
Proof. Since $X^{H_1}_1 \geq _{\varepsilon _1\hbox {-AFSD}} X^{H_1}_2$ and $H_2\circ H_1^{-1}\in \mathscr {H}_1^{\varepsilon _1/\varepsilon _2}$, we have
by Proposition 3.3, that is, $X_1^{H_2} \geq _{\varepsilon _2\text {-AFSD}} X_2^{H_2}$. This completes the proof.
Proposition 3.5. For any $0<\varepsilon <1/2$, if the distortion functions $H_1$ and $H_2$ satisfy $H_1(u)\leq H_2(u)$ for any $u\in [0,1]$, then
Proof. Since $X_1^{H_1}\geq _{\varepsilon \text {-AFSD}} X_2^{H_1}$, it follows from Theorem 3.2 that
for any $H\in \mathscr {H}_1^{\varepsilon }$. Since $H_1(u)\leq H_2(u)$ for any $u\in [0,1]$, we have $H_1\circ F_2(x)\leq H_2\circ F_2(x)$ for any $x\in \mathbb {R}$, and hence, $F_2^{-1}\circ H^{-1}_1(u)\geq F_2^{-1}\circ H^{-1}_2(u)$ for $u\in [0,1]$. Therefore, for any $H\in \mathscr {H}_1^{\varepsilon }$,
Combining (3.5) with (3.6), we have
for any $H\in \mathscr {H}_1^{\varepsilon }$, that is, $X_{1}^{H_1}\geq _{\varepsilon \text {-AFSD}} X_2^{H_2}$. This completes the proof.
Proposition 3.6. For two cdfs $F_1$ and $F_2$, if $F_1$ singly crosses $F_2$ from below, and $H$ is a concave distortion function, then for $0<\varepsilon <1/2$,
Proof. By Lemma 3.1, $F_1 \ge _{\varepsilon \text {-AFSD}} F_2$ if and only if
To prove $H\circ F_1\geq _{\varepsilon \text {-AFSD}} H\circ F_2$, it is sufficient to prove
or, equivalently,
by setting $v=H^{-1}(u)$. Define
Since $F_1$ singly crosses $F_2$ from below, it follows that $\Delta _1 (v)\le \Delta _1(1)\le 0$ for all $v\in [0,1]$. Since $H$ is concave, we have
This completes the proof.
Proposition 3.7. Let $F_1$ and $F_2$ be two cdfs with a common support $(a, b)$, where $-\infty \le a< b\le +\infty$. If $F_1$ singly crosses $F_2$ from below, and $H$ is an increasing and convex function such that $F_1\circ H$ and $F_2\circ H$ are also two cdfs, then for $0<\varepsilon <1/2$,
Proof. By Definition 2.1, $F_1\ge _{\varepsilon \text {-AFSD}} F_2$ if and only if
To prove $F_1\circ H\geq _{\varepsilon \text {-AFSD}} F_2\circ H$, it suffices to prove
or, equivalently,
Define
Since $F_1$ singly crosses $F_2$ from below, it follows that $\Delta _2(y)\le \Delta _2(b)\le 0$ for all $y\in [a, b]$. Define $\eta (y)= H^{-1}(y)$ for $y\in [a, b]$, and let $\eta '_+(y)$ denote the right derivative of $\eta$ at $y$. Therefore,
where $\eta _+'(b)=\lim _{y\to b-}\eta _+'(y)$, and the last inequality follows from the convexity of $H$. This completes the proof.
4. Applications
In this section, we will apply our results to establish stochastic comparisons of order statistics and ROC curves with respect to AFSD.
4.1. Stochastic comparisons of order statistics
Order statistics are widely used in reliability, data analysis, risk management, auction theory, statistical inference, and many other applied areas. Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a distribution $F$. Denote by $X_{1:n}\le X_{2:n}\leq \cdots \leq X_{n:n}$ its order statistics. If $X_1, \ldots, X_n$ are independent, it is well known that the cdf of $X_{k:n}$ is given by $F_{B}\circ F$, where $F_B$ is the beta distribution with parameters $k$ and $n-k+1$.
Order statistics are closely connected with the lifetimes of $k$-out-of-$n$ systems. In reliability theory, a $k$-out-of-$n$ system consists of $n$ components and operates if and only if at least $k$ of the $n$ components work. For a $k$-out-of-$n$ system whose components have a common cdf $F$, the system lifetime $T_{k:n}$ coincides with $X_{(n-k+1):n}$.
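The identity $\mathrm {cdf}(X_{k:n})=F_B\circ F$ can be checked by simulation. The following sketch is our own illustration (the 2-out-of-3 exponential system is an assumed example); it uses the equivalent binomial-sum form of the beta cdf:

```python
import numpy as np
from math import comb

def order_stat_cdf(p, k, n):
    # cdf of X_{k:n} at a point where the component cdf equals p: this is
    # (F_B o F)(x) with F_B = beta(k, n-k+1), written in binomial-sum form.
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# A 2-out-of-3 system with standard exponential components: T_{2:3} = X_{2:3}.
rng = np.random.default_rng(1)
t = 0.8
theory = order_stat_cdf(1.0 - np.exp(-t), 2, 3)
second_failures = np.sort(rng.exponential(size=(200_000, 3)), axis=1)[:, 1]
empirical = np.mean(second_failures <= t)
```

The system fails at the second component failure, so its lifetime is the second-smallest of the three component lifetimes; the simulated cdf at $t=0.8$ should match the binomial-sum value closely.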
A significant body of literature on stochastic comparisons of order statistics has accumulated over the past two decades; see Shaked and Shanthikumar [Reference Shaked and Shanthikumar30] and the references therein. In the following, we deal with the problem of comparing the lifetimes of $k$-out-of-$n$ systems with respect to AFSD. First, we recall some popular ageing notions.
Definition 4.1. [Reference Lando, Arab and Oliveira16]
Let $X$ be a random variable with cdf $F$. Then, $X$ or $F$ is said to be
• Convex if $F$ is convex on its support, denoted by $F\in \mathscr {F}_{\rm C}$;
• Logit-convex if $\log \left (F/{\bar F}\right )$ is convex on the support of $F$, denoted by $F\in \mathscr {F}_{\rm CL}$;
• Odds-convex if $F/{\bar F}$ is convex on the support of $F$, denoted by $F\in \mathscr {F}_{\rm CO}$;
• IFR (increasing failure rate) if ${\bar F} (x)$ is log-concave, denoted by $F\in \mathscr {F}_{\rm IFR}$.
If $X$ has a probability density function $f$, then $F\in \mathscr {F}_{\rm C}$ if and only if $f$ is increasing. Define $\lambda (x)=f(x)/{\bar F}(x)$ on $\{x:F(x)<1\}$, which is called the failure rate function of $F$. Then, $F\in \mathscr {F}_{\rm IFR}$ if and only if $\lambda (x)$ is increasing on $\{x:F(x)<1\}$. $\mathscr {F}_{\rm IFR}$ is an important class in reliability theory and contains many relevant models. It is clear that $\mathscr {F}_{\rm C}\subset \mathscr {F}_{\rm IFR}$.
The ageing class $\mathscr {F}_{\rm CL}$ contains distributions with unbounded support and has been studied by Zimmer et al. [Reference Zimmer, Wang and Pathak36], Sankaran and Jayakumar [Reference Sankaran and Jayakumar29] and Navarro et al. [Reference Navarro, Ruiz and Aguila27]. We have $F\in \mathscr {F}_{\rm CL}$ if and only if the log-odds rate $\lambda (x)/F(x)$ is increasing. Therefore, $\mathscr {F}_{\rm CL}\subset \mathscr {F}_{\rm IFR}$.
The ageing class $\mathscr {F}_{\rm CO}$ can be characterized as follows: $F \in \mathscr {F}_{\rm CO}$ if and only if the ratio between the failure rate and the survival function, $\lambda (x)/{\bar F}(x)$, is increasing [Reference Kirmani and Gupta15]. Therefore, $\mathscr {F}_{\rm CO}$ is the widest of these ageing classes and $\mathscr {F}_{\rm IFR}\subset \mathscr {F}_{\rm CO}$. Several examples of distributions belonging to $\mathscr {F}_{\rm C}$, $\mathscr {F}_{\rm CL}$, $\mathscr {F}_{\rm CO}$ and $\mathscr {F}_{\rm IFR}$ can be found in Lando et al. [Reference Lando, Arab and Oliveira16].
To study ageing patterns of lifetimes of $k$-out-of-$n$ systems, Lando et al. [Reference Lando, Arab and Oliveira16] compared different order statistics with respect to SSD and derived sufficient dominance conditions by identifying the class of component lifetimes. In this subsection, we will explore stochastic comparison of order statistics via AFSD. The following lemma is due to Theorem 7 in Müller et al. [Reference Müller, Scarsini, Tsetlin and Winkler26] since $(1+\gamma, 1+\gamma )$-SD is equivalent to $\varepsilon$-AFSD with $\varepsilon =\gamma /(1+\gamma )$ (see [Reference Müller, Scarsini, Tsetlin and Winkler25], Definition 4.2]).
Lemma 4.1. [Reference Müller, Scarsini, Tsetlin and Winkler26]
Let $X_1$ and $X_2$ be random variables with $\mathbb {E} (X_1)=\mu _1$, $\mathbb {E} (X_2)=\mu _2$, ${\rm Var} (X_1)=\sigma _1^{2}$, ${\rm Var} (X_2)=\sigma _2^{2}$ and $\mu _1>\mu _2$. Define
and
Then $X_1\ge _{\varepsilon \text {-AFSD}} X_2$ for $\varepsilon ^{*}(t)<\varepsilon <1/2$.
Let $B_{i,n}\sim {\rm beta}(i,n-i+1)$. Denote
For the sake of simplification, denote
It is known that if $j\ge i$ and $n-m\ge i-j$, then $X_{j:m}\ge _{\rm FSD} X_{i:n}$ (see, e.g., [Reference Boland, Hu, Shaked and Shanthikumar4]).
The next two propositions give the conditions under which $X_{j:m}$ and $X_{i:n}$ can be ordered by $\varepsilon$-AFSD when the condition $j\ge i$ is violated and $F$ belongs to any one of $\mathscr {F}_{\rm C}$, $\mathscr {F}_{\rm CO}$, $\mathscr {F}_{\rm IFR}$ and $\mathscr {F}_{\rm CL}$.
Proposition 4.2. For any $F\in \mathscr {F}_{\rm C}$, if $n-m> i-j> 0$ and $i/(n+1)>j/(m+1)$, then $X_{i:n}\geq _{\varepsilon \text {-AFSD}} X_{j:m}$ for $\varepsilon ^{\ast } (t_1) \leq \varepsilon < 1/2$.
Proof. Let $B_{i,n}\sim {\rm beta}(i,n-i+1)$ and $B_{j,m}\sim {\rm beta}(j,m-j+1)$. Since $n-m> i-j>0$, the cdf $F_{B_{i,n}}$ of $B_{i,n}$ singly crosses the cdf $F_{B_{j,m}}$ of $B_{j,m}$ from below by Lemma A.1. Since $i/(n+1)>j/(m+1)$, we have $\mu _{B}(i,n):=\mathbb {E} [ B_{i,n}] > \mathbb {E}[B_{j,m}]=:\mu _{B}(j,m)$. Therefore, from Lemma 4.1, we have $B_{i,n}\geq _{\varepsilon \text {-AFSD}} B_{j,m}$ for $\varepsilon ^{\ast }(t_1)\leq \varepsilon < 1/2$.
On the other hand, the cdf of $X_{i:n}$ can be formulated as $F_{B_{i,n}}\circ F$. Since $F$ is convex, it follows from Proposition 3.7 that $F_{B_{i,n}}\circ F\geq _{\varepsilon \text {-AFSD}} F_{B_{j,m}}\circ F$. That is, $X_{i:n}\geq _{\varepsilon \text {-AFSD}} X_{j:m}$. This completes the proof of this proposition.
Define
where $\mu _{h_\ell }(\cdot,\cdot )$ and $\sigma _{h_\ell }(\cdot,\cdot )$ are defined in Lemmas A.2 and A.3.
Proposition 4.3. Assume that $n-m>i-j>0$, and define $\psi (x) = \Gamma '(x)/\Gamma (x)$ for $x>0$.
(1) For any $F\in \mathscr {F}_{\rm CO}$, if $i/(n-i)>j/(m-j)$, then
$$X_{i:n}\geq_{\varepsilon\text{-AFSD}} X_{j:m}\quad \text{for}\ \varepsilon^{{\ast}} (t_2)\leq \varepsilon< \tfrac 12;$$(2) For any $F\in \mathscr {F}_{\rm CL}$, if $\psi (i)-\psi (n-i+1)>\psi (j)-\psi (m-j+1)$, then
$$X_{i:n}\geq_{\varepsilon\text{-AFSD}} X_{j:m}\quad \text{for}\ \varepsilon^{{\ast}}(t_3)\leq\varepsilon< \tfrac 12;$$(3) For any $F\in \mathscr {F}_{\rm IFR}$, if $\psi (n+1)-\psi (n-i+1)>\psi (m+1)-\psi (m-j+1)$, then
$$X_{i:n}\geq_{\varepsilon\text{-AFSD}} X_{j:m}\quad \text{for}\ \varepsilon^{{\ast}}(t_4) \leq\varepsilon< \tfrac 12.$$
Proof. (1) Since $h_2(p)= p/(1-p)$ for $p\in (0,1)$, its inverse function $h_2^{-1}(x)=x/(1+x)$ is increasing in $x\in \mathbb {R}_+$. Note that $F_{B_{i,n}}\circ h_2^{-1}$ is the cdf of random variable $h_2(B_{i,n})$. From Lemmas A.2 and 4.1, it follows that, under the condition $i/(n-i)>j/(m-j)$,
By Lemma A.1, the condition $n-m>i-j>0$ implies that $F_{B_{i,n}}$ singly crosses $F_{B_{j,m}}$ from below. Hence, $F_{B_{i,n}}\circ h_2^{-1}$ singly crosses $F_{B_{j,m}}\circ h_2^{-1}$ from below. On the other hand, since $F\in \mathscr {F}_{\rm CO}$, the composition $h_2 \circ F$ is convex. Therefore, from Proposition 3.7, we have
for $\varepsilon ^{\ast }(t_2)\leq \varepsilon <1/2$. That is, $X_{i:n}\geq _{\varepsilon \text {-AFSD}} X_{j:m}$ for $\varepsilon ^{\ast }(t_2)\leq \varepsilon <1/2$.
(2) and (3) The proof is similar to part (1) by applying Lemmas A.1, A.3 and 4.1. This completes the proof of the proposition.
4.2. Stochastic comparison of ROC curves
The Receiver Operating Characteristic (ROC) curve is one of the most common statistical tools for assessing classifier performance [Reference Lusted22]. It is generated by plotting the true positive rate (TPR) against the false positive rate (FPR) as the operating point (the decision threshold or misclassification cost) varies [Reference Fawcett7]. Consider a binary classifier that assigns a real-valued score to classify items into two categories, good or bad. Let the random variables $X_B$ and $X_G$ represent the scores of the bad population, with cdf $F_B$, and of the good population, with cdf $F_G$. Then, the ROC curve can be defined as
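Under one common scoring convention (an assumption here, for concreteness: an item is classified as good when its score exceeds a threshold $t$, so ${\rm FPR}(t)=1-F_B(t)$, ${\rm TPR}(t)=1-F_G(t)$, ${\rm ROC}(u)=1-F_G(F_B^{-1}(1-u))$ and ${\rm AUC}=P(X_G>X_B)$), the curve and its area are easy to approximate. The exponential score distributions below are our own illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(2)
# Scores: the "good" population stochastically exceeds the "bad" one.
xb = rng.exponential(0.5, size=200_000)  # bad-population scores,  Exp(rate 2)
xg = rng.exponential(1.0, size=200_000)  # good-population scores, Exp(rate 1)
auc_mc = np.mean(xg > xb)                # AUC = P(X_G > X_B) = 2/3 here

# For these distributions ROC(u) = 1 - F_G(F_B^{-1}(1-u)) = sqrt(u), so the
# area under the ROC curve can also be obtained by quadrature.
u = (np.arange(100_000) + 0.5) / 100_000
auc_quad = np.mean(np.sqrt(u))
```

Both estimates should agree with the closed-form value $\int _0^1 \sqrt {u}\,\mathrm {d}u = 2/3$.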
However, selecting the best classifier is quite challenging when ROC curves intersect. The Area Under the Curve (AUC) is one of the most common measures of classifier performance, but it has well-understood weaknesses when comparing ROC curves that cross. Gigliarano et al. [Reference Gigliarano, Figini and Muliere9] proposed a novel approach to ROC comparison and investigated the relationships between ROC orderings and integer-degree stochastic dominance in a theoretical framework. In this subsection, we extend their methodological approach to AFSD.
Proposition 4.4. Let $F_1, F_2$ and $G$ be three cdfs such that $F_1\circ G^{-1}$ and $F_2\circ G^{-1}$ are also cdfs. If $F_1$ singly crosses $F_2$ from below, and $G$ is concave, then for any $0<\varepsilon <1/2$,
Proof. The desired result follows from Proposition 3.7 by observing that $G^{-1}$ is increasing and convex.
Proposition 4.5. Let $G_1, G_2$ and $F$ be three cdfs such that $F\circ G_1^{-1}$ and $F\circ G_2^{-1}$ are also cdfs. If $G_2$ singly crosses $G_1$ from below, and $F$ is concave, then for any $0<\varepsilon <1/2$,
Proof. Without loss of generality, let the single crossing point of $G_1$ and $G_2$ be $x_0$. Define
for $t \in \mathbb {R}$. Then $\varphi _1(t)$ is decreasing on $(-\infty, x_0]$ and increasing on $(x_0, \infty )$. Since $G_2\ge _{\varepsilon \text {-AFSD}} G_1$, we get $\varphi _1(t) \le 0$ for any $t\in \mathbb {R}$. It is clear that
where the last inequality follows from the decreasing property of $F'_+$. Therefore, $F\circ G_1^{-1} \ge _{\varepsilon \text {-AFSD}} F\circ G_2^{-1}$ for any $0<\varepsilon <1/2$.
Proposition 4.6. Let $G_i$ and $F_i$ be cdfs such that $F_i\circ G_j^{-1}$ is also a cdf for any $i, j\in \{1, 2\}$. If $G_2$ singly crosses $G_1$ from below, $F_1$ singly crosses $F_2$ from below, and $F_2$ and $G_1$ are both concave (or $F_1$ and $G_2$ are both concave), then for any $0<\varepsilon <1/2$,
Proof. It suffices to prove that
Since
we have
where the last inequality follows from Propositions 4.4 and 4.5. This completes the proof.
We will end this section by presenting two examples to illustrate the applications of Proposition 4.6.
Example 4.1. Assume that $X\sim \mathcal {P}(a,b)$, the Pareto distribution with parameters $a>0$ and $b>0$, and with the cdf given by
Since $F'(x; a, b)=ab^{a} x^{-a-1} > 0$ and $F''(x; a,b) = -a(a+1)b^{a} x^{-a-2} < 0$ for $x\ge b$, the cdf $F(\cdot ; a, b)$ is increasing and concave on its support. Furthermore, assume that $X_1\sim F(\cdot ; a_1,b_1)$ and $X_2\sim F(\cdot ; a_2,b_2)$, and denote $F_i=F(\cdot ; a_i,b_i)$ for $i=1,2$. We claim that if
then $X_{1}\geq _{\varepsilon \text {-AFSD}} X_2$ for $\varepsilon (a_1,a_2,b_1,b_2)<\varepsilon <1/2$, where $\varepsilon (a_1,a_2,b_1,b_2)$ is to be determined later.
Now, we assume that (4.1) holds. First, for $a_1>a_2>0$, it can be checked that $F_1$ singly crosses $F_2$ from below with crossing point $x_0:=(b_1^{a_1}/b_2^{a_2})^{\frac {1}{a_1-a_2}}$. Next, note that, for any $0<\varepsilon <1/2$, $X_1\geq _{\varepsilon \text {-AFSD}} X_2$ if and only if
or, equivalently,
Define
It can be checked that
and
Define
Then, $0<\varepsilon (a_1,a_2,b_1,b_2)<1/2$ in view of (4.1). So, (4.2) holds if $\varepsilon (a_1,a_2,b_1,b_2)<\varepsilon <1/2$. This proves $X_1\geq _{\varepsilon \text {-AFSD}} X_2$.
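The threshold $\varepsilon (a_1,a_2,b_1,b_2)$ derived above can be cross-checked numerically. The sketch below is our own illustration; it approximates the AFSD ratio $\|(F_1-F_2)_+\|/\|F_1-F_2\|$ on a grid for the Pareto parameters that reappear in Example 4.2, $(a_1,b_1)=(8,3.5)$ and $(a_2,b_2)=(7,3)$:

```python
import numpy as np

def pareto_cdf(x, a, b):
    # Pareto(a, b) cdf: 1 - (b/x)^a for x >= b and 0 otherwise
    return np.where(x >= b, 1.0 - (b / x)**a, 0.0)

# Pareto parameters reused in Example 4.2: X1 ~ P(8, 3.5), X2 ~ P(7, 3)
x = np.linspace(3.0, 200.0, 400_001)
d = pareto_cdf(x, 8.0, 3.5) - pareto_cdf(x, 7.0, 3.0)
dx = np.diff(x)
pos = np.sum(dx * (np.maximum(d[1:], 0) + np.maximum(d[:-1], 0)) / 2)
tot = np.sum(dx * (np.abs(d[1:]) + np.abs(d[:-1])) / 2)
eps0 = pos / tot   # X1 dominates X2 in eps-AFSD for all eps >= eps0
```

The two cdfs cross (FSD fails), but the positive part of $F_1-F_2$ is tiny relative to the total area (which is close to $\mathbb {E}X_1-\mathbb {E}X_2 = 4-3.5 = 0.5$), so $X_1$ dominates $X_2$ in $\varepsilon$-AFSD for all but the smallest $\varepsilon$.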
Example 4.2. Suppose that we have measurements from two diagnostic tests. Let $X_i$ denote the measurement from test $i$ for a diseased subject, and let $Y_i$ denote the corresponding measurement for a healthy subject, $i=1,2$. Assume that $X_1\sim \mathcal {P}(8,3.5)$, $X_2\sim \mathcal {P}(7,3)$, $Y_1\sim \mathcal {P}(4.5,2.5)$ and $Y_2\sim \mathcal {P}(5,3)$. The ROC curves of the two tests are plotted in Figure 1; they intersect at many points. The AUC of the first test is $0.5652$ and that of the second is $0.5655$, which are almost the same. Therefore, we cannot compare the accuracy of these two diagnostic tests through the classical ROC comparison methods, but we can rank the two ROC curves by AFSD rules.
To see it, denote $X_i\sim F_i$ and $Y_i\sim G_i$, and let $\varepsilon ^{\ast } <\varepsilon <1/2$, where
From Example 4.1, we have $X_1\ge _{\varepsilon \text {-AFSD}} X_2$ and $Y_2\ge _{\varepsilon \text {-AFSD}} Y_1$, and the cdfs $F_i$ and $G_i$ are all concave. By Proposition 4.6, we get $F_1\circ G_1^{-1} \ge _{\varepsilon \text {-AFSD}} F_2\circ G_2^{-1}$, which also implies that the AUC of the first test is smaller than that of the second.
Acknowledgments
J. Yang was supported by the NNSF of China (No. 11701518, 12071438), Zhejiang Provincial Natural Science Foundation (No. LQ17A010011) and Zhejiang SCI-TECH University Foundation (No. 16062097-Y). W. Zhuang was supported by the NNSF of China (No. 71971204) and Excellent Youth Foundation of Anhui Scientific Committee (No. 2208085J43).
Appendix
Lemma A.1. Let $B_1\sim beta(\alpha _1, \beta _1)$ and $B_2\sim beta(\alpha _2, \beta _2)$ with $\alpha _i>0$ and $\beta _i>0$ for $i=1, 2$. If $\alpha _1>\alpha _2$ and $\beta _1>\beta _2$, then $F_{B_1}$ singly crosses $F_{B_2}$ from below.
Proof. Let $f_{B_1}$ and $f_{B_2}$ be the respective probability density functions of $B_1$ and $B_2$, and let $F_{B_1}$ and $F_{B_2}$ be the respective cdfs of $B_1$ and $B_2$. Then
Taking the derivative of $\ell (x)$, we have
Denote the number of sign changes of the function $a(x)$ in $\mathbb {R}$ by $S^{-}(a)$. Then, if $\alpha _1>\alpha _2$ and $\beta _1>\beta _2$, we have $S^{-}(\ell '(x))= 1$ and the sign sequence is $+$, $-$. This implies that $\ell (x)$ is first increasing and then decreasing with $\ell (0)= \ell (1)=0$. It is clear that $S^{-}(f_{B_1}-f_{B_2})=S^{-}(\ell -1)=2$ and the sign sequence is $-, +, -$. Therefore, $S^{-}(F_{B_1}-F_{B_2})=1$ and the sign sequence is $-, +$. This completes the proof of the lemma.
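The single-crossing property of Lemma A.1 can be verified numerically for integer parameters, using the binomial-sum form of the beta cdf. This is our own sketch; the parameter choice $(\alpha _1,\beta _1)=(3,3)$ and $(\alpha _2,\beta _2)=(2,2)$ is illustrative:

```python
import numpy as np
from math import comb

def beta_cdf_int(u, a, b):
    # Beta(a, b) cdf for integer a, b via P(B <= u) = P(Binomial(a+b-1, u) >= a)
    n = a + b - 1
    return sum(comb(n, j) * u**j * (1 - u)**(n - j) for j in range(a, n + 1))

# alpha_1 > alpha_2 and beta_1 > beta_2: F_{B_1} should cross F_{B_2} from below.
u = np.linspace(1e-3, 1.0 - 1e-3, 2001)
d = beta_cdf_int(u, 3, 3) - beta_cdf_int(u, 2, 2)
signs = np.sign(d[np.abs(d) > 1e-12])   # drop the (numerically) zero crossing
crossings = int(np.count_nonzero(np.diff(signs) != 0))
```

The difference $F_{B_1}-F_{B_2}$ changes sign exactly once on the grid, with sign sequence $-$, $+$, as the lemma asserts.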
Lemma A.2. Let $B_{i,n}\sim beta(i,n-i+1)$, where $1\leq i\leq n$. Then,
where $h_2(p)=p/(1-p)$.
Proof. Note that
and
Therefore,
This completes the proof.
Lemma A.3. Let $B_{i,n}\sim beta(i,n-i+1)$, $1\le i\le n$, and define $\psi (x)= \Gamma ^{\prime }(x)/\Gamma (x)$ for $x>0$.
(1) If $h_3(p) = \log [p/(1-p)]$, then $\mu _{h_3}(i,n): = \mathbb {E} [ h_3(B_{i,n})] = \psi (i)-\psi (n-i+1)$ and
$$\sigma^{2}_{h_3}(i,n): ={\rm Var} (h_{3}(B_{i,n})) = \psi'(i)+\psi'(n-i+1);$$(2) If $h_4(p): = -\log (1-p)$, then $\mu _{h_4}(i, n): = \mathbb {E} [ h_4(B_{i,n})] = \psi (n+1)-\psi (n-i+1)$ and
$$\sigma^{2}_{h_4}(i, n): = {\rm Var}(h_4(B_{i,n})) =\psi'(n-i+1)-\psi'(n+1).$$
Proof. (1) Note that the moment generating function of $h_3(B_{i,n})$ is
Then,
and
Therefore,
(2) The proof is similar to part (1), and hence is omitted. This completes the proof.