1. Introduction
The concept of entropy was first introduced and thoroughly explored by Shannon [Reference Shannon19], an electrical engineer at Bell Telephone Laboratories, in his research on communication networks. Around the same time, Wiener [Reference Wiener22] independently investigated the concept in his work on Cybernetics, albeit with a different motivation. Let
$X$ be a non-negative random variable (rv) with an absolutely continuous cumulative distribution function (cdf)
$F(x) = P[X \leq x]$ and with probability density function (pdf)
$f(x)$. Then the Shannon entropy associated with
$X$ is defined as
\begin{equation}
H(X) = - \int\limits_0^{+\infty} {f(x)\log f(x)\ dx}.
\end{equation}Starting from the pioneering work of Shannon [Reference Shannon19], various researchers have demonstrated applications of entropy in different fields. Apart from thermodynamics and information theory, applications of Shannon entropy span diverse areas such as statistics, economics, finance, psychology, wavelet analysis, image recognition, computer science, fuzzy sets and so on. Despite its well-known applications, this measure needs to be modified under different scenarios; the concept of residual entropy proposed by Ebrahimi [Reference Ebrahimi7] substantiates this claim.
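For instance, for an exponential density
$f(x) = \lambda e^{-\lambda x}$,
$x \geq 0$, a direct calculation gives
\begin{equation*}
H(X) = - \int\limits_0^{+\infty} \lambda e^{-\lambda x}\left(\log \lambda - \lambda x\right)\ dx = 1 - \log \lambda,
\end{equation*}
so the uncertainty decreases as the rate
$\lambda$ increases; this distribution is used again in Sections 4 and 5.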
If a system has already survived for some units of time, Shannon entropy is no longer applicable; this limitation led to the development of residual entropy. Residual entropy is the Shannon entropy associated with the rv
$[X-t|X \gt t]$,
$t\geq 0$, and is defined as (see, Ebrahimi [Reference Ebrahimi7])
\begin{equation}
\begin{aligned}
H(f;t) & = - \int\limits_t^{+\infty} {\frac{{f(x)}}{{\overline F (t)}}\log \left(\frac{{f(x)}}{{\overline F (t)}}\right)dx},
\end{aligned}
\end{equation} where
$\overline F(t)$ denotes the survival function. Given that an item has survived up to time
$t$,
$H(f;t)$ measures the uncertainty about its remaining life.
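In particular, for an exponential lifetime the memoryless property implies that the residual life
$[X-t|X \gt t]$ is again exponential with the same rate, so that
$H(f;t) = 1-\log \lambda$ for every
$t \geq 0$; the exponential law thus has constant residual entropy.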
Ebrahimi [Reference Ebrahimi7] used this concept to define new ageing classes such as decreasing (increasing) uncertainty of residual life (DURL (IURL)). Belzunce et al. [Reference Belzunce, Guillamón, Navarro and Ruiz3] proposed kernel-type estimation of the residual entropy function in the case of independent complete data sets. Belzunce et al. [Reference Belzunce, Navarro, Ruiz and del Aguila4] showed that for a rv
$X$ with an increasing residual entropy
$H(f;t)$, the function
$H(f;t)$ uniquely determines the cdf. Rajesh et al. [Reference Rajesh, Abdul-Sathar, Maya and Nair15] thoroughly elucidated the necessity of developing inferential aspects under dependent sample and proposed non-parametric kernel type estimators for the residual entropy function based on
$\alpha$-mixing dependent sample.
Di Crescenzo and Longobardi [Reference Di Crescenzo and Longobardi6] discussed the necessity of developing the uncertainty measure based on the reversed residual life (past lifetime) and they developed the concept of past entropy. If
$X$ denotes the lifetime of a component, a system or a living organism, then the past entropy of
$X$ at time
$t$ is defined as
\begin{equation}
\begin{aligned}
\overline H(f;t) & = - \int\limits_0^ t {\frac{{f(x)}}{{F (t)}}\log \frac{{f(x)}}{{F (t)}}\ dx}.
\end{aligned}
\end{equation} This measure has become significant in forensic science, particularly in determining the exact time of failure (or death, in the case of humans) when a unit is found in a failed state at some time
$t$. It also has applications in actuarial science, as discussed by Sachlas and Papaioannou [Reference Sachlas and Papaioannou18]. Maya [Reference Maya12] introduced non-parametric estimators for this measure, applicable to both complete and censored
$\alpha$-mixing dependent samples.
Numerous researchers have explored various definitions of entropy and their associated properties. Balakrishnan et al. [Reference Balakrishnan, Buono and Longobardi2] established relationships between certain cumulative entropies and the moments of order statistics. Furthermore, a general formulation of entropy was presented in Balakrishnan et al. [Reference Balakrishnan, Buono and Longobardi1]. Building on the successful applications of the past entropy function, Gupta and Nanda [Reference Gupta and Nanda9] extended (1.3) to define the past entropy function of order
$\beta$. Now the past entropy function of order
$\beta$ is defined as
\begin{equation}
\overline H^\beta (f;t) = \frac{1}{(1-\beta )}\left\{\log \int\limits
_0^t \left(\frac{f (x)}{F (t)}\right)^\beta
dx\right\},\ \text{for} \ \beta \neq 1, \beta \gt 0.
\end{equation}Later on, Nanda and Paul [Reference Nanda and Paul13] elucidated certain ordering and ageing properties, as well as some characterization results, based on this generalized past entropy measure. As stated by Ebrahimi [Reference Ebrahimi7], systems or components with higher uncertainty tend to exhibit lower reliability. Building on this idea, these measures can be effectively used to choose the most reliable system among competing models.
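As a brief numerical illustration of (1.4), the following R sketch (with an assumed exponential(
$\lambda$) lifetime and illustrative values of
$\beta$ and
$t$) evaluates the order-
$\beta$ past entropy by numerical integration and compares it with the closed-form expression derived for the exponential case in Section 5.
# Sketch: evaluate (1.4) numerically for an exponential(lambda) lifetime and
# compare with the exponential closed form given in Section 5.
lambda <- 1; beta <- 2; t <- 1
f <- function(x) dexp(x, rate = lambda) # density f(x)
Ft <- pexp(t, rate = lambda) # cdf F(t)
I <- integrate(function(x) (f(x) / Ft)^beta, lower = 0, upper = t)$value
H_num <- log(I) / (1 - beta) # numerical value of (1.4)
H_cf <- ((beta - 1) * log(lambda) - log(beta) -
         beta * log(1 - exp(-lambda * t)) +
         log(1 - exp(-lambda * beta * t))) / (1 - beta)
c(numerical = H_num, closed_form = H_cf) # the two values agree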
However, no studies on the inferential aspects of this measure appear to have been conducted in the existing literature. This gap motivates the authors to develop a non-parametric estimator of the generalized past entropy function using kernel-based estimation. The study focuses on cases where the observations exhibit dependence. More specifically, the proposed estimator is based on an
$\alpha$-mixing dependent sample (see Rosenblatt [Reference Rosenblatt17]), and its definition is provided below.
Definition 1.1. Let
$\{X_i;i\geq 1\}$ denote a sequence of rvs. Given a positive integer
$n$, set
\begin{equation}
\alpha (n) = \sup\limits_ {k \geq 1} \{|P(A \cap B )-P(A)P(B)|;
A\in \mathfrak{F}_1^k,\ B \in \mathfrak{F}_{k+n}^{+\infty}\},
\end{equation} where
$\mathfrak{F}_i^k$ denotes the
$\sigma$-field of events generated by
$\{X_j; i \leq j \leq k\}$. The sequence is said to be
$\alpha$-mixing (strong mixing) if the mixing coefficient
$\alpha(n)\to 0$ as
$n \to +\infty$.
Among various mixing conditions,
$\alpha$-mixing is reasonably weak and has many practical applications. Along similar lines, Irshad et al. [Reference Irshad, Maya, Buono and Longobardi10] proposed non-parametric kernel type estimators for two versions of the cumulative residual Tsallis entropy.
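To fix ideas, an
$\alpha$-mixing sequence is easy to simulate; a minimal R sketch of the exponential AR(1) process used later in Section 4 and in the Appendix (with illustrative parameter values) is given below.
# Sketch: simulate an exponential AR(1) process; this is the alpha-mixing
# dependence structure used in the simulation study of Section 4.
set.seed(1)
n <- 500; phi <- 0.2; lambda <- 1
X <- numeric(n); X[1] <- rexp(1, rate = lambda)
for (i in 2:n) {
  eps <- ifelse(runif(1) < phi, 0, rexp(1, rate = lambda)) # innovation
  X[i] <- phi * X[i - 1] + eps
}
acf(X, lag.max = 10) # the serial dependence decays quickly with the lag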
The rest of the article is organized as follows. In Section 2, we propose a non-parametric kernel type estimator of
$\overline H^{\beta}(f;t)$ given in (1.4) based on an
$\alpha$-mixing dependent sample. Asymptotic properties of the proposed estimator are discussed in Section 3. The performance of the estimator is validated through a simulation study in Section 4. In Section 5, a real data set is used to illustrate the performance of the proposed estimator, and the asymptotic normality is then examined with the help of histograms. Finally, the article is concluded in Section 6.
2. Non-parametric estimation of generalized past entropy function
In this section, we propose a non-parametric kernel type estimator for the generalized past entropy function.
Let
$\{X_i; 1\leq i\leq n\}$ be a sequence of identically distributed rvs. Note that
$X_i$'s need not be mutually independent; that is, the observations are assumed to be
$\alpha$-mixing. Rosenblatt [Reference Rosenblatt16] and Parzen [Reference Parzen14] proposed a non-parametric kernel type estimator of
$f (x)$ that is given by
\begin{equation}
f_n (x) = \frac{1}{n b_n}\sum\limits_{j = 1}^n {K\left( {\frac{{x -
X_j }}{{b_n }}} \right)},
\end{equation} where
$K(x)$ is a kernel of order
$s$ satisfying the conditions:
a. $K(x) \geq 0$ for all $x$;
b. $\int\limits_{-\infty}^{+\infty} K(x)\ dx = 1$;
c. $K(\cdot)$ is symmetric about zero and satisfies the Lipschitz condition, that is, there exists a constant $M$ such that $|K(x)-K(y)| \leq M |x-y|$;
d. $K_n (x) = \frac{1}{b_n} {K\left( {\frac{{x }}{{b_n }}} \right)}$, where $\{b_n\}$ is a bandwidth sequence of positive numbers such that $b_n \rightarrow 0$ and $nb_n \rightarrow +\infty$ as $n \rightarrow +\infty$.
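A minimal R sketch of the estimator (2.1) with a Gaussian kernel (a kernel of order
$s=2$ satisfying the above conditions) and an arbitrary, user-chosen bandwidth is given below.
# Sketch: the Rosenblatt-Parzen estimator (2.1) with a Gaussian kernel,
# evaluated on a grid of points; bn is a user-chosen bandwidth.
fn <- function(x, data, bn) {
  sapply(x, function(xi) mean(dnorm((xi - data) / bn)) / bn)
}
set.seed(2)
data <- rexp(200, rate = 1) # illustrative sample
grid <- seq(0, 4, by = 0.05)
plot(grid, fn(grid, data, bn = 0.3), type = "l",
     xlab = "x", ylab = expression(f[n](x)))
lines(grid, dexp(grid, rate = 1), lty = 2) # true density for comparison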
The expressions for the bias and variance of
$f_n (x)$ under
an $\alpha$-mixing dependent sample are respectively given by
\begin{equation}
\mathrm{Bias}(f_n (x)) \backsimeq \frac{b_n^{s}\Delta_s}{s!}\,f^{(s)} (x)
\end{equation}
and
\begin{equation}
\mathrm{Var}(f_n (x)) \backsimeq \frac{f(x)}{nb_n}\,\Delta_K,
\end{equation} where
$\Delta_s = \int\limits_{-\infty}^{+\infty} {u^s K(u)\ du} $,
$\Delta_K = \int\limits_{-\infty}^{+\infty} {K^2 (u)\ du}$ and
$f^{(s)}(\cdot)$ denotes the
$s$-th derivative of
$f$.
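The approximation (2.3) can be checked empirically; the following R sketch (using an i.i.d. exponential sample for simplicity, a Gaussian kernel, for which
$\Delta_K = 1/(2\sqrt{\pi})$, and an arbitrary evaluation point) compares the Monte Carlo variance of
$f_n(x_0)$ with the right-hand side of (2.3).
# Sketch: Monte Carlo check of the variance approximation (2.3) at a point x0,
# using i.i.d. exponential(1) data and a Gaussian kernel.
set.seed(3)
n <- 400; bn <- n^(-1 / 5); x0 <- 1
fn_at_x0 <- replicate(1000, {
  X <- rexp(n, rate = 1)
  mean(dnorm((x0 - X) / bn)) / bn # f_n(x0) as in (2.1)
})
var(fn_at_x0) # simulated variance of f_n(x0)
dexp(x0, rate = 1) / (n * bn) / (2 * sqrt(pi)) # approximation (2.3)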
Based on (2.1), we propose a non-parametric kernel type estimator for
$\overline H^{\beta} (f;t)$ that is defined as
\begin{equation}
\overline H_n^\beta (f;t) = \frac{1}{(1-\beta)}\left\{\log \int\limits_0 ^
t f_n^\beta (x)\ dx - \beta \log F_n (t) \right\},
\end{equation} where
$f_n (x)$ is a non-parametric estimator of
$f (x)$ and is given in (2.1) and
$ F_n (t)= \int\limits_0^t f_n (x) dx $ is a non-parametric estimator of cdf
$ F (t)$. The bias and variance of
$ F_n (t)$ are respectively given by (see, Maya [Reference Maya12])
\begin{equation}
\mathrm{Bias}(F_n (t)) \backsimeq \frac{b_n^{s}\Delta_{s}}{s!}\int\limits_0^t f^{(s)}(x)\ dx
\end{equation}
and
\begin{equation}
\mathrm{Var}(F_n (t)) \backsimeq \frac{1}{nb_n}\,\Delta_K F(t).
\end{equation}
3. Asymptotic properties of generalized past entropy function
In this section, some asymptotic properties of the proposed estimator of the generalized past entropy function are established.
For computational simplicity, define the following
\begin{equation}
a_n (t) = \log\int\limits_0^t {f_n^{\beta} (x)\ dx},\qquad a(t) =
\log\int\limits_0^t {f^{\beta} (x)\ dx},
\end{equation}
and
\begin{equation}
m_n (t) = \log F_n (t),\qquad m(t) = \log F (t).
\end{equation}
Therefore,
\begin{equation}
\overline H^\beta (f;t) = \frac{1}{(1-\beta)}\left[ a(t)- \beta\ m
(t) \right]
\end{equation}and
\begin{equation}
\overline H_n^\beta (f;t) = \frac{1}{(1-\beta)}\left[ a_n (t)- \beta\ m_n
(t) \right].
\end{equation} In the following theorems, the weak and strong consistency properties of
$\overline H_n^\beta (f;t)$ are proved.
Theorem 3.1. Suppose
$\overline H_n^\beta (f;t)$ is a non-parametric estimator of
$\overline H^\beta (f;t)$ defined in (2.4) satisfying the assumptions given in Section 2. Then
$\overline H_n^\beta (f;t) $ is a consistent estimator of
$\overline H^\beta (f;t) $.
Proof. By using a Taylor series expansion, we have
\begin{equation*}
\log\int\limits_0^t {f_n^{\beta} (x)\ dx}\backsimeq
\log\int\limits_0^t {f^{\beta} (x)\ dx}+(f_n(t)-f(t))\frac{\int\limits_0^t\beta f^{\beta-1}(x)dx }{\int\limits_0^t f^\beta(x)dx}.
\end{equation*}Using (3.1), we get
\begin{equation}
a_n(t) \backsimeq a(t)+(f_n(t)-f(t))\,u(t),
\end{equation}
where
\begin{equation*}u(t)=\frac{\int\limits_0^t\beta f^{\beta-1}(x)dx }{\int\limits_0^t f^\beta(x)dx}.\end{equation*}Also,
\begin{equation*}
\log F_n (t) \backsimeq\log F(t)+\frac{(F_n(t)-F(t))}{F(t)}.
\end{equation*}From (3.2), we write
\begin{equation}
m_n(t) \backsimeq m(t)+\frac{(F_n(t)-F(t))}{F(t)}.
\end{equation} Using (3.5) and (3.6), the bias and variance of
$a_n(t)$ and
$m_n(t)$ are given by
\begin{equation}
\mathrm{Bias}(a_n (t)) \backsimeq \frac{b_n^{s}}{s!}\Delta_{s}\, f^{(s)}(t)u(t),
\end{equation}
\begin{equation}
\mathrm{Var}(a_n (t)) \backsimeq \frac{1}{nb_n}\,\Delta_K f(t)u^2(t),
\end{equation}
\begin{equation}
\mathrm{Bias}(m_n (t)) \backsimeq \frac{b_n^{s}\Delta_{s}}{s!F(t)}\int\limits_0^t f^{(s)}(x)\ dx,
\end{equation}
and
\begin{equation}
\mathrm{Var}(m_n (t)) \backsimeq \frac{1}{nb_n}\,\frac{\Delta_K}{F(t)}.
\end{equation} From (3.7) and (3.8), as
$n \to +\infty$, both the bias and the variance of $a_n(t)$ tend to zero.
Therefore, the estimator
$a_n(t)$ is consistent (in the probability sense), i.e.,
\begin{equation*}
{a_n (t)}\mathop \to \limits ^ p a(t).
\end{equation*} From (3.9) and (3.10), as
$n \to +\infty$, both the bias and the variance of $m_n(t)$ tend to zero.
Therefore, the estimator
$m_n(t)$ is consistent (in the probability sense), i.e.,
\begin{equation*}
{m_n (t)}\mathop \to \limits ^ p m(t).
\end{equation*}Therefore
\begin{equation*}
\overline H_n^\beta (f;t) =\frac{1}{(1-\beta)}\left[ a_n (t)-
\beta\ m_n (t) \right] \mathop \to \limits^p
\frac{1}{(1-\beta)}\left[a(t) -\beta\ m(t)\right]= \overline H^\beta (f;t) .
\end{equation*} That is,
$\overline H_n^\beta (f;t)$ is a consistent (in the probability sense) estimator of
$\overline H^\beta (f;t)$.
Theorem 3.2. Let
$\overline H_n^\beta (f;t)$ be a non-parametric estimator of
$\overline H^\beta (f;t)$ defined in (2.4) satisfying the assumptions given in Section 2. Suppose the kernel
$K(\cdot)$ satisfies the requirements:
a. $K(u)\to 0$ as $|u|\to +\infty$;
b. $\int\limits_{-\infty}^{+\infty} |K^{'}(u)|\ du \lt +\infty$;
c. $\int\limits_{-\infty}^{+\infty} |u| K(u)\ du \lt +\infty$.
Let
$\overline t=\sup \{t \in \mathbb R; F(t) \lt 1\}$ and
$J$ be any compact subset of
$(0,\ \overline t)$. Then
\begin{equation*}
\lim \limits _{n \to +\infty} \sup \limits_ {t
\in J} \left|\overline H_n^\beta (f;t)-\overline H^\beta (f;t)\right|=
0 \ a.s.
\end{equation*}Proof. By direct calculations, we get
\begin{equation*}
\begin{aligned}
\overline H_n^\beta (f;t)-\overline H^\beta (f;t)&= \frac{1}{(1-\beta)}\left(a_n
(t)-a(t)\right)-\frac{\beta}{(1-\beta)}\left(m_n (t)-m(t)\right).\\
\end{aligned}
\end{equation*}
\begin{equation*}\nonumber
\begin{aligned}
|\overline H_n^\beta (f;t)-\overline H^\beta (f;t)|
& \backsimeq \frac{u(t)}{(1-\beta)}\left|f_n
(t)-f(t)\right|+\frac{\beta}{(1-\beta)}\left(\frac{| F_n
(t)- F(t)|}{F(t)}\right).
\end{aligned}
\end{equation*} By using Theorem 3.3 and Theorem 4.1 of Cai and Roussas [Reference Cai and Roussas5], the result is immediate. Thus, we conclude that
$\overline H_n^\beta (f;t)$ is a strongly consistent estimator of
$\overline H^\beta (f;t)$.
In order to prove that
$\overline H_n^\beta (f;t)$ is an integratedly uniformly consistent in quadratic mean estimator of
$\overline H^\beta (f;t)$, we consider the following definition from Wegman [Reference Wegman21].
Definition 3.1. The density estimator
$f_n(x)$ is said to be integratedly uniformly consistent in quadratic mean if the mean integrated squared error (MISE) approaches zero, i.e.,
\begin{equation*}
\lim_{n \to +\infty} \mathbb{E} \left[ \int (f_n(x) -f(x))^2 \,dx \right] = 0
\end{equation*} In the following theorem, we prove that
$\overline H_n^\beta (f;t)$ is integratedly uniformly consistent in quadratic mean estimator of
$\overline H^\beta (f;t)$.
Theorem 3.3. Suppose
$\overline H_n^\beta (f;t)$ is a kernel estimator of
$\overline H^\beta (f;t)$ as defined in (2.4). Then,
$\overline H_n^\beta (f;t)$ is integratedly uniformly consistent in quadratic mean estimator of
$\overline H^\beta (f;t)$.
Proof. The MISE of
$\overline H_n^\beta (f;t)$ is given by
\begin{equation}
\begin{aligned}
{\rm MISE}(\overline H_n^\beta (f;t))&= {\rm E} \int\limits_{0}^{+\infty} \left[\overline H_n^\beta (f;t)-\overline H^\beta (f;t)\right]^2 dt\\
& =\int\limits_{0}^{+\infty} \left[[{\rm Bias}(\overline H_n^\beta (f;t))]^2+{\rm Var}(\overline H_n^\beta (f;t)) \right]dt\\
&= \int\limits_0^{+\infty}\left( \frac{b^{s}_n \Delta_s }{(1-\beta)s!} \left(f^{(s)}(t) u(t)-\frac{\beta}{F(t)} \int\limits_{0}^ {t} f^{(s)}(x) dx \right) \right)^2 dt\\
& +\int\limits_0^{+\infty} \left(\frac{\Delta_K}{nb_n(1-\beta)^{2}} \left(f(t) u^{2}(t)+\frac{\beta^{2}}{F(t)} \right)\right) dt.\\
\end{aligned}
\end{equation} We have, as
$ n \to +\infty$,
\begin{equation*}
{\rm MSE}\left( \overline H_n^\beta (f;t)\right) = [{\rm Bias}(\overline H_n^\beta (f;t))]^2+{\rm Var}(\overline H_n^\beta (f;t)) \to 0.
\end{equation*}Therefore, from (3.11), we have
\begin{equation}
{\rm MISE}\left(\overline H_n^\beta (f;t)\right) \to 0 \quad \text{as}~ n \to +\infty.
\end{equation} From (3.12), we can say that
$\overline H_n^\beta (f;t)$ is integratedly uniformly consistent in quadratic mean estimator of
$\overline H^\beta (f;t)$.
Thus the theorem is proved.
In the following theorem, we obtain the optimal bandwidth.
Theorem 3.4. Suppose
$\overline H_n^\beta (f;t)$ is a non-parametric estimator of
$ \overline H^\beta (f;t)$ defined in (2.4) satisfying the assumptions given in Section 2. Then the optimal bandwidth is given by
\begin{equation}
b^{'}_n=\left[\frac{\frac{1}{n} \int\limits_{0}^ {+\infty} \Delta_K \left(f(t) u^{2}(t)+\frac{\beta^{2}}{F(t)} \right) dt }{2s\int\limits_{0}^ {+\infty} \left(\frac{\Delta_s}{s!}\left(f^{(s)}(t) u(t)-\frac{\beta}{F(t)} \int\limits_{0}^ {t} f^{(s)}(x) dx \right)\right)^2dt} \right] ^{\frac{1}{2s+1}}.
\end{equation}Proof. By using (3.11), the asymptotic-MISE (A-MISE) is given by
\begin{equation*}
\begin{aligned}
\text{A-MISE}(\overline H^{\beta}_n (f;t)) & = {b^{2s}_{n}}\int\limits_0^{+\infty}\left(\frac{\Delta_s}{(1-\beta)s!}\left(f^{(s)}(t)u(t) -\frac{\beta}{F(t)}\int\limits_0^t f^{(s)}(x)\ dx \right)\right)^2dt\\
& \quad + \frac{1}{n b_n} \int\limits_0^{+\infty} \frac{\Delta_K}{(1-\beta)^{2}}\left(f(t)u^{2}(t) +\frac{\beta^{2}}{F(t)} \right) dt.\\
\end{aligned}
\end{equation*} By minimizing
$\text{A-MISE}(\overline H^{\beta}_n (f;t))$ with respect to the bandwidth
$b_n$, we get the optimal bandwidth
$b^{'}_n$. Setting
$\frac{\partial \text{A-MISE}(\overline H^{\beta}_n (f;t))}{\partial b_{n}}=0$ gives
\begin{equation*}
\begin{aligned}
2s b^{2s-1}_{n}\int\limits_0^{+\infty}\left\{\frac{\Delta_s}{(1-\beta)s!}\left(f^{(s)}(t)u(t) -\frac{\beta}{F(t)}\int\limits_0^t f^{(s)}(x)\ dx \right) \right\} ^2dt &= \\
\frac{1}{n b^{2}_n} \int\limits_0^{+\infty} \frac{\Delta_K}{(1-\beta)^{2}}\left(f(t)u^{2}(t) +\frac{\beta^{2}}{F(t)} \right) dt,
\end{aligned}
\end{equation*}
so that
\begin{equation*}
b^{2s+1}_{n}=\frac{\frac{1}{n } \int\limits_0^{+\infty} \frac{\Delta_K}{(1-\beta)^{2}}\left(f(t)u^{2}(t) +\frac{\beta^{2}}{F(t)} \right) dt}{2s \int\limits_0^{+\infty}\left\{\frac{\Delta_s}{(1-\beta)s!}\left(f^{(s)}(t)u(t) -\frac{\beta}{F(t)}\int\limits_0^t f^{(s)}(x)\ dx \right) \right\} ^2dt}.
\end{equation*}Therefore,
\begin{equation*}
\begin{aligned}
b^{'}_{n} &= \left[\frac{\int\limits_0^{+\infty} \Delta_K\left(f(t)u^{2}(t) +\frac{\beta^{2}}{F(t)} \right) {\rm d}t}{2s \int\limits_0^{+\infty}\left\{\frac{\Delta_s}{s!}\left(f^{(s)}(t)u(t) -\frac{\beta}{F(t)}\int\limits_0^t f^{(s)}(x)\ dx \right) \right\} ^2{\rm d}t}\right]^{\frac{1}{2s+1}}n^{-\frac{1}{2s+1}} \\
&= O\left(n^{-\frac{1}{2s+1}}\right), \qquad s=1,2,\ldots\\
\end{aligned}
\end{equation*}
\end{equation*} The following lemma is used to derive the asymptotic normality of
$\overline H^{\beta}(f;t)$ and is established in Theorem 3.4.
Lemma 3.1. Let
$K(x)$ be a kernel of order
$s$, let
$\{\alpha (i)\}$ be the mixing coefficients and let
$\{b_n\}$ be a sequence of numbers satisfying the assumptions given in Section 2. Then, for every fixed point
$x$ with
$f(x) \gt 0$,
\begin{equation}
\left(nb_n\right)^\frac{1}{2}\left\{\frac{(f_n (x)- f(x))}{\sigma_f}\right\}
\end{equation} has a standard normal distribution as
$n \to +\infty $ with
$\sigma_{f}^2 \backsimeq \frac{1}{nb_n}f(x)\Delta_K$.
Proof. The proof follows directly by applying the same steps as in the proof of Theorem 8 in Masry [Reference Masry11].
Theorem 3.5. Let
$\overline H^{\beta}_n (f;t)$ be a non-parametric estimator of
$\overline H^{\beta}(f;t)$,
$K(x)$ be a kernel of order
$s$ satisfying the assumptions given in Section 2. Then for fixed
$t$,
\begin{equation}
{\sqrt {nb_n} }\left\{\frac{{(\overline H^{\beta}_n (f;t) -\overline H^{\beta} (f;t))}}{{\sigma _{\overline H}}}\right\}
\end{equation} has a standard normal distribution as
$n \to +\infty$ with
\begin{equation}
\sigma^2_{\overline H} \backsimeq \frac{\Delta_K }{nb_n(1-\beta)^2}\left\{f(t)u^2(t)+ \frac{\beta^2}{F(t)} \right\}.
\end{equation}
Proof. By direct computation, we have
\begin{align*}
\sqrt {n b_n}\left(\overline H^{\beta}_n (f;t)-\overline H^{\beta}(f;t) \right)&= \frac{1}{(1-\beta)}(nb_n)^{\frac{1}{2}} \left\{\log\! \int\limits_0^t f_{n}^{\beta}(x)\ dx- \log\! \int\limits_0^t f^{\beta}(x)\ dx\right\} \\
&\quad - \frac{\beta}{(1-\beta)}(nb_n)^{\frac{1}{2}}\left\{\log F_n (t)-\log F (t)\right\}\\
&\backsimeq \frac{1}{(1-\beta)}(nb_n)^{\frac{1}{2}} \left\{(f_n (t)-f(t))u(t) \right\} \\
&\quad- \frac{\beta }{(1-\beta)}(nb_n)^{\frac{1}{2}}\frac{\int\limits_0^t (f_n(x)-f(x))\ dx}{F(t)}.
\end{align*} By using the asymptotic normality of
$f_n (x)$ given in Lemma 3.1, the proof is immediate.
4. Simulation
A simulation study is conducted to evaluate the performance of the proposed estimator
$\overline H^{\beta}_n (f;t)$. Here the process
$\left\{X_i\right\}$ is generated from an exponential AR(1) process with correlation coefficient
$\phi=0.2$ and parameter
$\lambda=1$. The Gaussian kernel is employed as the kernel function for estimation. The estimated value, bias, and MSE of the proposed estimator are calculated for various sample sizes. We vary the values of
$\beta$, and the bandwidth parameter is determined using the plug-in method proposed by Sheather and Jones [Reference Sheather and Jones20]. The results for the exponential AR(1) process are presented in Table 1.
Table 1. Estimated value, bias and
$MSE$ of
$\overline H^{\beta}_n (f;t)$ for the Exponential AR(1) process along with the corresponding theoretical value
$\overline H^{\beta} (f;t)$.

From Table 1, it is evident that both the
$MSE$ and the bias of the estimator decrease as the sample size increases. The reduction in
$MSE$ indicates that the estimates move closer to the true values with larger sample sizes, reflecting improved accuracy and efficiency of the estimation. Similarly, the decreasing bias highlights the growing accuracy of the estimator.
5. Numerical examples
To demonstrate the practical applicability of the proposed estimator discussed in Section 2, we analyze real-world data consisting of the failure times (in months) of 20 electric carts used for internal transport and delivery in a large manufacturing facility (see Zimmer et al. [Reference Zimmer, Keats and Wang23]). The bootstrapping procedure is employed to determine the optimal value of
$b_n$ (see Efron [Reference Efron8]).
For estimation, we utilize the Gaussian kernel function:
\begin{equation*}
K(z) = \frac{1}{\sqrt{2\pi}} \exp \left( \frac{-z^2}{2} \right).
\end{equation*} At each value of
$t$, we compute the biases and mean-squared errors of
$\overline{H}_{n}^{\beta} (f;t)$ using 250 bootstrap samples of size 20. Table 2 presents the bootstrap biases and mean-squared errors for the estimator
$\overline{H}_{n}^{\beta} (f;t)$. Figure 1 compares the theoretical value
$\overline{H}^{\beta} (f;t)$ with the estimated value
$\overline{H}_n^{\beta} (f;t)$. From Figure 1, it is evident that for the given dataset, the generalized past entropy function increases over time.
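For reproducibility, a condensed R sketch of how estimates of the form shown in Figure 1 can be computed from a data vector is given below; the sample used here is a placeholder (the electric-cart failure times are not reproduced), and the plug-in bandwidth stands in for the bootstrap choice of
$b_n$.
# Sketch: the estimator (2.4) for a generic data vector; the data below are a
# placeholder sample, not the failure times from Zimmer et al.
Hbeta_n <- function(data, t, beta, bn) {
  fn <- function(x) sapply(x, function(xi) mean(dnorm((xi - data) / bn)) / bn)
  Fn_t <- integrate(fn, lower = 0, upper = t)$value # F_n(t)
  int_fb <- integrate(function(x) fn(x)^beta, lower = 0, upper = t)$value
  (log(int_fb) - beta * log(Fn_t)) / (1 - beta)
}
set.seed(4)
data <- rexp(20, rate = 1 / 10) # placeholder sample of size 20
bn <- bw.SJ(data) # plug-in bandwidth as a stand-in for the bootstrap choice
sapply(c(5, 10, 15, 20), function(t) Hbeta_n(data, t, beta = 2, bn = bn))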

Figure 1. Plots of estimates of generalized past entropy function for the first failure of 20 electric carts.
Table 2. Bootstrap bias and mean-squared error of
$\overline H_{n}^{\beta}(f;t)$ for the real data set.

Next, we aim to examine the asymptotic normality of the estimator presented in Theorem 3.5. To achieve this, we conduct a numerical evaluation of
$\overline{H}^{\beta}_n (f;t)$ and analyze it through Monte Carlo simulations. Let
$X$ follow an exponential distribution with parameter
$\lambda$, where the mean is given by
$\mu = 1/\lambda$. Then, the past entropy function of order
$\beta$ is expressed as
\begin{equation}
\overline H^{\beta} (f;t)=\frac{1}{1-\beta}\left[(\beta-1)\log\lambda-\log\beta-\beta\log(1-e^{-\lambda t})+\log(1-e^{-\lambda\beta t})\right].
\end{equation} Before obtaining our estimator, it is necessary to fix a function
$K$ and a sequence
$\{b_n\}_{n\in\mathbb N}$ which satisfy the assumptions given in Section 2. Here, we consider
\begin{eqnarray}
K(x)&=&\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right), \ \ x\in\mathbb R
\end{eqnarray}
\begin{eqnarray}
b_n&=&\frac{1}{\sqrt{n}}, \ \ n\in\mathbb N.
\end{eqnarray}By using these assumptions, we obtain
\begin{equation*}
\nonumber
\Delta_K=\frac{1}{2\sqrt\pi},
\end{equation*}and, for the exponential distribution, we have
\begin{equation*}
\nonumber
u(t)=\frac{\beta^2}{\lambda(\beta-1)}\frac{1-e^{-\lambda(\beta-1)t}}{1-e^{-\lambda\beta t}}.
\end{equation*} We aim to verify whether the quantity defined in (3.15) follows a standard normal distribution. Using the exprnd function in MATLAB, we generate 500 samples, each of size
$N = 50$, from an exponential distribution with parameter
$\lambda$. For each simulated sample, we extract only the values that do not exceed a fixed threshold
$t$, and the count of these selected values is used as
$n$ in our evaluations. Subsequently, using the fixed parameters, we compute the quantity given in (3.15) for each sample. To assess its asymptotic normality, we construct a histogram based on the 500 computed values. Specifically, we explore various choices for
$\lambda$,
$\beta$, and
$t$, as illustrated in Figure 2.
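An analogous check can be carried out in R. The sketch below reuses the kernel-based estimator from the sketch given earlier in this section, takes the bandwidth as
$1/\sqrt{N}$ computed from the full sample size (a simplification of (5.3)) and, rather than plugging in the asymptotic variance (3.16), standardizes the 500 replicated estimates by their empirical mean and standard deviation before comparing the histogram with the standard normal density.
# Sketch: Monte Carlo check of the approximate normality of the estimator;
# the replicates are standardized empirically rather than by (3.16).
set.seed(5)
lambda <- 1; beta <- 2; t <- 1; N <- 50; R <- 500
Hbeta_n <- function(data, t, beta, bn) {
  fn <- function(x) sapply(x, function(xi) mean(dnorm((xi - data) / bn)) / bn)
  Fn_t <- integrate(fn, lower = 0, upper = t)$value
  int_fb <- integrate(function(x) fn(x)^beta, lower = 0, upper = t)$value
  (log(int_fb) - beta * log(Fn_t)) / (1 - beta)
}
est <- replicate(R, Hbeta_n(rexp(N, rate = lambda), t, beta, bn = 1 / sqrt(N)))
z <- (est - mean(est)) / sd(est) # empirical standardization
hist(z, freq = FALSE, breaks = 20, main = "", xlab = "standardized estimate")
curve(dnorm(x), add = TRUE) # standard normal density for comparison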
6. Conclusion
This paper examines non-parametric estimators for the generalized past entropy function using kernel-based estimation, with
$\alpha$-mixing dependent observations. The asymptotic properties of the proposed estimator are analyzed and established. Additionally, a simulation study and a real data analysis are carried out to evaluate the performance of the proposed estimator. The estimator is found to perform well in both the simulation study and the data analysis, in terms of bias and
$MSE$. As discussed earlier, these estimates can be used to select a reliable system from among the available competing models.
Future research will extend beyond the
$\alpha$-mixing dependence condition to explore the inferential properties of the generalized past entropy function under
$\phi$-mixing and
$\rho$-mixing dependence conditions.
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
Francesco Buono and Maria Longobardi are members of the research group GNAMPA of INdAM (Istituto Nazionale di Alta Matematica). ML has been partially supported by MIUR—PRIN 2022 PNRR, project “Stochastic Modeling of Compound Events”, no. P2022KZJTZ, funded by European Union—Next Generation EU.
Appendix
R code to compute
$\overline H^{\beta}_n (f;t)$ by simulation for sample size
$n = 100$.
rm(list = ls())
library(stats) # dexp, pexp, dnorm, integrate and bw.SJ are all in base/stats
# Upper limit of integration
t <- 1
# PDF of the exponential distribution with rate = 1
f <- function(x) {
dexp(x, rate = 1)
}
# CDF of the exponential distribution at t
F_t <- pexp(t, rate = 1)
betas <- c( 2,4, 6, 8, 10) # Add more beta values as needed
results_list <- list()
for (beta in betas) {
r2 <- NULL # Reset results for each beta
n <- 100
r <- 1000
t <- 1
Hhat <- numeric(r)
for (s in 1:r) {
print(c(beta, s)) # progress indicator
#set.seed(10)
# Generate data for the process
lambda <- 1
phi <- 0.2
X <- numeric(n + 1) # one extra slot; the initial value X[1] = 0 is dropped below
a <- numeric(n + 1)
for (i in 2:(n + 1)) {
a[i] <- sample(c(0, rexp(1, lambda)), prob = c(phi, 1 - phi), size = 1)
X[i] <- phi * X[i - 1] + a[i]
}
X <- X[-1]
# Bandwidth estimation
h <- bw.SJ(X)
# The integrand function: (f(x) / F_t)^beta
integrand <- function(x) {
(f(x) / F_t)^beta
}
# Integrate the function from 0 to t, passing beta
integral_result <- integrate(integrand, lower = 0, upper = t)
# Extract the integral value
integral_value <- integral_result$value
# PDF approximation using a kernel density estimator
int <- function(x) {
fn <- numeric(length(x))
for (i in 1:length(x)) {
fn[i] <- mean(dnorm((x[i] - X) / h) / h)
}
return(fn)
}
# Compute the non-parametric CDF Fn(t)
compute_Fn <- function(data, t) {
integrate(Vectorize(int), lower = 0, upper = t)$value
}
# Compute the integral of f_n(x)^beta from 0 to t
compute_integral <- function(data, t, beta){
integrand <- function(x) int(x)^beta
integrate(Vectorize(integrand), lower = 0, upper = t)$value
}
# Function to compute H^beta_n(f; t)
H_beta_n <- function(data, t, beta) {
integral_value <- compute_integral(data, t, beta)
Fn_t <- compute_Fn(data, t)
result <- (log(integral_value) - beta * log(Fn_t))
return(result)
}
# Compute H^beta_n(f; t)
Hhat[s] <- (1 / (1 - beta)) * H_beta_n(X, t, beta)
}
# Compute the theoretical value and the summary metrics for this beta
est <- (1 / (1 - beta)) * log(integral_value) # theoretical H^beta(f;t)
H <- mean(Hhat) # average of the simulated estimates
bias <- H - est
MSE <- mean((Hhat - est)^2)
results_list[[as.character(beta)]] <- c(beta = beta, true = est,
estimate = H, bias = bias, MSE = MSE)
}
# Summary table over all beta values
print(do.call(rbind, results_list))