Generalized Entropies, Variance and Applications

The generalized cumulative residual entropy is a recently defined dispersion measure. In this paper, we obtain some further results for such a measure, in relation to the generalized cumulative residual entropy and the variance of random lifetimes. We show that it has an intimate connection with the non-homogeneous Poisson process. We also get new expressions, bounds and stochastic comparisons involving such measures. Moreover, the dynamic version of the mentioned notions is studied through the residual lifetimes and suitable aging notions. In this framework we achieve some findings of interest in reliability theory, such as a characterization for the exponential distribution, various results on k-out-of-n systems, and a connection to the excess wealth order. We also obtain similar results for the generalized cumulative entropy, which is a dual measure to the generalized cumulative residual entropy.


Introduction and Background
The notions of uncertainty and information are relative and involve comparisons of distributions from a probabilistic point of view. In information theory a relevant role is played by the concept of entropy; see Shannon [1]. We recall that, for an absolutely continuous nonnegative random variable X with probability density function (pdf) f, the (differential) entropy of X is defined as
$$H(X) = -E[\log f(X)] = -\int_0^{\infty} f(x)\,\log f(x)\,dx, \tag{1}$$
where E(·) denotes expectation and "log" is the natural logarithm, with the convention 0 log 0 = 0. Despite its many advantages, the Shannon entropy has some limitations; see, e.g., Rao et al. [2] and Schroeder [3]. We recall the following concerns about the differential entropy.
(i) H(X) cannot be used when X is a mixed random variable.
(ii) The differential entropy of an absolutely continuous random variable may take any value on the extended real line, whereas in the discrete case the entropy is always nonnegative.
(iii) H(X) is inconsistent, in the sense that for a uniform distribution it may be zero, negative, or positive.
(iv) The differential entropy is decreased by conditioning. Moreover, the vanishing of the conditional differential entropy of X given Y does not imply that X is a function of Y, whereas in the discrete case the conditional entropy of X given Y is equal to zero if and only if X is a function of Y.
(v) In general, the entropy of the empirical distribution does not provide a proper approximation of the differential entropy of an absolutely continuous random variable.
(vi) The differential entropy, like the Shannon entropy, is location independent.
To overcome these drawbacks, various alternative measures of uncertainty have been proposed in the recent literature. Here we recall some basic concepts which are involved in their definitions.
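Let X be a random variable with support (0, ∞), having cumulative distribution function (cdf) F(x) = P(X ≤ x) and survival function $\bar F(x) = 1 - F(x)$. As customary, we denote by
$$\Lambda(x) = -\log \bar F(x) = \int_0^x \lambda(t)\,dt, \qquad x > 0,$$
the cumulative hazard function and by
$$\lambda(x) = \Lambda'(x) = \frac{f(x)}{\bar F(x)}, \qquad x > 0, \tag{2}$$
the hazard rate function of X. Let X describe the lifetime of a system. Under the condition that the system has survived up to age t > 0, the survival function and the cumulative hazard function of the residual lifetime $X_t := [X - t \mid X > t]$, t > 0, are respectively given by
$$\bar F_t(x) = \frac{\bar F(x+t)}{\bar F(t)}, \qquad \Lambda_t(x) = -\log \bar F_t(x) = \Lambda(x+t) - \Lambda(t), \qquad x > 0, \tag{3}$$
where [X|B] refers to a random variable having the same distribution as X conditioned on B. The mean residual lifetime (MRL) function of X is defined as
$$m(t) = E[X - t \mid X > t] = \frac{1}{\bar F(t)} \int_t^{\infty} \bar F(x)\,dx, \qquad t > 0. \tag{4}$$
Similarly, one can also introduce the reversed hazard rate function of X, given by τ(t) = f(t)/F(t), t > 0, and the cumulative reversed hazard function
$$T(t) = -\log F(t) = \int_t^{\infty} \tau(x)\,dx, \qquad t > 0.$$
Under the condition that the system has been found failed at age t > 0, the inactivity time is defined as $X_{[t]} = [t - X \mid X \le t]$, t > 0. The mean inactivity time (MIT) function of X is
$$\tilde\mu(t) = E[t - X \mid X \le t] = \frac{1}{F(t)} \int_0^t F(x)\,dx, \qquad t > 0.$$
It is also known as the mean past lifetime and the mean waiting time function of X.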
Let us now recall various useful information measures that have been defined recently in terms of the functions reported above. For clarity and brevity, we report them in Table 1. The cumulative residual entropy (CRE) has been proposed and studied by Rao et al. [2] and Rao [4] as an alternative measure of uncertainty; see case (i) of Table 1. This measure is obtained by replacing the pdf with the survival function in (1). The CRE can be expressed as (see, e.g., Mitra [5] and Asadi and Zohrevand [6])
$$\mathcal{E}(X) = E[m(X)]. \tag{6}$$
The CRE has been used by Leser [7] in order to measure the elasticity of life expectancy with respect to a proportional change of mortality in life tables. Recently, it has been extended in Psarrakos and Navarro [8] to the generalized cumulative residual entropy (GCRE) of X, for n ∈ N_0 ≡ N ∪ {0} = {0, 1, 2, . . .}, as given in (iii) of Table 1. It is worth noting that the GCRE is a dispersion measure related to the (upper) record values of a sequence of independent and identically distributed random variables and to the relevation transform. We recall that the relevation transform of X, denoted by X(X′), is defined through the survival function
$$\bar F_{X(X')}(x) = \bar F(x)\,[1 + \Lambda(x)], \qquad x > 0, \tag{7}$$
with X′ an independent copy of X.

Table 1. Information measures of interest, for a given random lifetime X, with n ∈ N_0 for cases (iii) and (v), n ∈ N for cases (iv) and (vi), and t > 0 for cases (v) and (vi).
(i) cumulative residual entropy (CRE): $\mathcal{E}(X) = -\int_0^{\infty} \bar F(x) \log \bar F(x)\,dx = \int_0^{\infty} \bar F(x)\,\Lambda(x)\,dx$
(ii) cumulative entropy (CE): $\mathcal{CE}(X) = -\int_0^{\infty} F(x) \log F(x)\,dx = \int_0^{\infty} F(x)\,T(x)\,dx$
(iii) generalized cumulative residual entropy (GCRE): $\mathcal{E}_n(X) = \frac{1}{n!} \int_0^{\infty} \bar F(x)\,\Lambda^n(x)\,dx$
(iv) generalized cumulative entropy (GCE): $\mathcal{CE}_n(X) = \frac{1}{n!} \int_0^{\infty} F(x)\,T^n(x)\,dx$
(v) dynamic gen. cumulative residual entropy (DGCRE): $\mathcal{E}_n(X; t) = \frac{1}{n!} \int_t^{\infty} \frac{\bar F(x)}{\bar F(t)}\,[\Lambda(x) - \Lambda(t)]^n\,dx$
(vi) dynamic gen. cumulative entropy (DGCE): $\mathcal{CE}_n(X; t) = \frac{1}{n!} \int_0^{t} \frac{F(x)}{F(t)}\,[T(x) - T(t)]^n\,dx$

For n = 0, one has $\mathcal{E}_0(X) = E(X)$, whereas for n = 1 the GCRE coincides with the CRE (see Table 1). In order to capture the effect of the age t > 0 of a system on the information of the residual distribution, according to Equation (8) of Psarrakos and Navarro [8], the dynamic version of the GCRE, denoted DGCRE for short, is defined by substituting $\bar F(x)$ with $\bar F_t(x)$ in the definition (see cases (iii) and (v) of Table 1). It is evident that $\mathcal{E}_n(X; 0) = \mathcal{E}_n(X)$. Moreover, for n = 0 we have
$$\mathcal{E}_0(X; t) = m(t), \qquad t > 0, \tag{8}$$
i.e., the mean residual life function (4). Some properties of $\mathcal{E}_1(X) \equiv \mathcal{E}(X)$ and its dynamic version are widely discussed in Asadi and Zohrevand [6] and Navarro et al. [10], among others. An application of the CRE in reliability engineering systems is given by Toomaj et al. [11]. Several properties of the GCRE and the DGCRE, such as bounds, stochastic ordering, aging classes properties, relationships with other entropy concepts, and characterization results, were investigated by Navarro and Psarrakos [12], Psarrakos and Navarro [8], and Psarrakos and Toomaj [13]. Specifically, in the latter paper the GCRE was studied as a risk measure, compared to the standard deviation and the right-tail risk measure, the latter being introduced by Wang [14].

Various dual measures can be introduced by analogy, as can be seen in cases (ii), (iv), and (vi) of Table 1, by replacing $\bar F(x)$ with F(x), and Λ(x) with T(x). Indeed, Di Crescenzo and Longobardi [15] defined the cumulative entropy (CE) as an alternative measure of uncertainty for the inactivity time by replacing the survival function with the distribution function; see (ii) of Table 1. We also recall that $\mathcal{CE}(X) = E[\tilde\mu(X)]$. A suitable extension of the CE, in analogy with the GCRE, was defined by Kayal [16] as the generalized cumulative entropy (GCE); see case (iv) of Table 1.
Furthermore, Kayal [16] considered the corresponding dynamic version, named dynamic generalized cumulative entropy (DGCE), as given in case (vi) of Table 1. Note that lim t→∞ CE n (X; t) = CE n (X), which coincides with the GCE of X. Various results on GCE and DGCE, related to bounds, stochastic ordering, aging classes properties, relationships with other entropy concepts, and characterizations, were obtained by Kayal [16] and Di Crescenzo and Toomaj [17].
The aim of the present paper is to investigate further results on the GCRE and the GCE. Specifically, we first illustrate an intimate connection between the GCRE and the non-homogeneous Poisson process. Moreover, we perform various stochastic comparisons, and provide various inequalities, including an upper bound for the GCRE in terms of the standard deviation. Some of our results are based on the fact that the GCRE is related to the record values of a sequence of independent and identically distributed (i.i.d.) random variables and the relevation transform, and that its dual, i.e., the GCE, is related to the lower record values and the reversed relevation transform. We recall that, in analogy with the relevation transform, the reversed relevation transform of X, denoted by X[X′], is defined through the distribution function
$$F_{X[X']}(x) = F(x)\,[1 + T(x)], \qquad x > 0,$$
with X′ an independent copy of the lifetime X.
In this paper we also focus on certain connections with the excess wealth order, based on the result that the GCRE can be expressed in terms of the excess wealth transform. Our investigation also deals with classical coherent systems of interest in reliability theory, i.e., k-out-of-n systems, by focusing on monotonicity properties and comparison results for their residual variance and GCRE.
The rest of this paper is organized as follows. In Section 2 we investigate a connection of the GCRE with non-homogeneous Poisson processes (NHPPs). Further, an expression for the variance of a random variable based on its mean residual lifetime is obtained, and the connection of the GCRE with the standard deviation is considered. The connection of the GCRE with the excess wealth order is obtained in Section 3. Dynamic properties of the GCRE and of the residual variance are given in Section 4. A characterization of the exponential distribution in terms of the variance residual life function and the dynamic CRE is also provided in Section 4. In Section 5, we discuss applications of the dynamic GCRE and of the residual variance in reliability, also with reference to k-out-of-n systems. In Section 6, similar results for the GCE are studied, whereas the dynamic GCE and the inactivity variance are investigated in Section 7. Finally, Section 8 concludes the paper with some closing remarks.
For simplicity, in the rest of the paper we write φ^n(x) instead of [φ(x)]^n for any given function φ. Moreover, φ′ denotes the derivative of φ. Note that the terms increasing and decreasing are used in a non-strict sense.

Connection of Variance and GCRE
An intimate connection of the GCRE with the expected interepochs (or successive times) of a non-homogeneous Poisson process and the relevation transform has been pointed out by Psarrakos and Navarro [8]. These notions arise naturally in preventive maintenance, repairable systems, cold standby systems, and shock models. In fact, such connection shows that the known results on the GCRE can be used in various applied contexts thanks to the large usefulness of the Poisson process. In particular, there is an intimate connection among non-homogeneous Poisson processes (NHPPs), record values, and minimal repair processes. For an excellent review, we refer the readers to, e.g., Gupta and Kirmani [18] and Kirmani and Gupta [19].
Let X be an absolutely continuous nonnegative random variable with pdf f(x) and survival function $\bar F(x)$. Assume that 0 ≡ X_0 ≤ X_1 ≤ X_2 ≤ · · · denote the epoch times of an NHPP with intensity function (2), where X_1 has the same distribution as X. In this case, X_{n+1} − X_n, n ∈ N_0, describes the duration of the interepoch intervals or the interoccurrence times. Denoting by $\bar F_{n+1}(x)$ the survival function of X_{n+1}, n ∈ N_0, it follows that (see Baxter [20])
$$\bar F_{n+1}(x) = \bar F(x) \sum_{k=0}^{n} \frac{\Lambda^k(x)}{k!}, \qquad x \ge 0, \tag{9}$$
so that the pdf of X_{n+1} is given by
$$f_{n+1}(x) = \frac{\Lambda^n(x)}{n!}\, f(x), \qquad x > 0. \tag{10}$$
Recalling the definition of the GCRE given in case (iii) of Table 1, from (9) we obtain
$$\mathcal{E}_n(X) = \frac{1}{n!} \int_0^{\infty} \bar F(x)\,\Lambda^n(x)\,dx = E[X_{n+1}] - E[X_n] = E[X_{n+1} - X_n]. \tag{11}$$
Thus, for n ∈ N_0, the GCRE of order n corresponds to the expected interepoch interval in a non-homogeneous Poisson process, with $\mathcal{E}_0(X) = E(X)$. As mentioned by Psarrakos and Toomaj [13], there is a close relationship between the GCRE and the standard deviation σ(X). Indeed, $\mathcal{E}_n(X)$, n ∈ N, can be used for measuring the closeness of X to a degenerate distribution; i.e., it is a dispersion measure.
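The reading of the GCRE as an expected interepoch interval can be checked by simulation. The sketch below is only an illustration and is not part of the original analysis. It assumes the definition (1/n!) ∫ S(x) Λ^n(x) dx of the GCRE from case (iii) of Table 1, a Weibull-type lifetime with cumulative hazard Λ(x) = x² (our choice), and the standard fact that Λ(X_1), Λ(X_2), ... are the epochs of a unit-rate Poisson process. All function names are hypothetical.

```python
import math
import random


def gcre_closed_form(n):
    """GCRE of order n for the survival function exp(-x^2).

    (1/n!) * integral_0^inf exp(-x^2) x^(2n) dx = Gamma(n + 1/2) / (2 n!).
    """
    return math.gamma(n + 0.5) / (2.0 * math.factorial(n))


def mean_interepochs(n_max, runs, seed=12345):
    """Monte Carlo estimate of E[X_{n+1} - X_n] for n = 0, ..., n_max."""
    rng = random.Random(seed)
    sums = [0.0] * (n_max + 1)
    for _ in range(runs):
        unit_epoch = 0.0  # current epoch of the unit-rate Poisson process
        previous = 0.0    # X_0 = 0
        for n in range(n_max + 1):
            unit_epoch += rng.expovariate(1.0)
            current = math.sqrt(unit_epoch)  # X_{n+1} = inverse of Lambda at the epoch
            sums[n] += current - previous
            previous = current
    return [s / runs for s in sums]


if __name__ == "__main__":
    estimates = mean_interepochs(n_max=3, runs=100_000)
    for n, estimate in enumerate(estimates):
        print(n, round(estimate, 4), round(gcre_closed_form(n), 4))
```

Under these assumptions the simulated mean gaps should agree with the closed-form values up to Monte Carlo error, and both decrease in n, as expected for a lifetime with increasing hazard rate.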
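Let Y = φ(X), where φ is a strictly increasing differentiable function such that φ′(u) ≥ 1 for all u > 0. Then, $\mathcal{E}_n(X) \le \mathcal{E}_n(Y)$ for all n ∈ N.
Proof. From (iii) of Table 1, denoting by φ^{−1} the inverse of φ, so that Y has survival function $\bar F(\varphi^{-1}(y))$, the substitution y = φ(u) gives
$$\mathcal{E}_n(Y) = \frac{1}{n!} \int_0^{\infty} \varphi'(u)\, \bar F(u)\, \Lambda^n(u)\,du.$$
Since φ′(u) ≥ 1, the proof immediately follows.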
This result is similar to that shown in Theorem 1 of Ebrahimi et al. [21] for the differential entropy. It should be noted that one advantage of the GCRE is that it exists for heavy tailed distributions with mean E(X) < ∞ and E(X 2 ) = ∞, while in this case σ(X) is not finite; see Psarrakos and Toomaj [13] and Yang [22]. Similar results also hold for the generalized entropy.
Hereafter, we provide a connection between the variance and the generalized cumulative residual entropy. First, we obtain an expression for the variance, σ²(X), in terms of the mean residual life function. It is based on the results in Hall and Wellner [23] (see also Fernandez-Ponce et al. [24]).
Theorem 1. Let X be an absolutely continuous nonnegative random variable with pdf f, distribution function F, and mean residual life function m(x), where the second moment E(X²) is finite. Then
$$\sigma^2(X) = E[m^2(X)]. \tag{12}$$
The following remark shows that Theorem 1 need not hold for a discrete random variable.

Remark 1.
Let X be an integer-valued random variable with probability mass function p(j) = P(X = j) and survival function $P(k) = P(X \ge k) = \sum_{i=k}^{\infty} p(i)$. The discrete mean residual life m(k) is defined as (see Roy and Gupta [25])
$$m(k) = E[X - k \mid X > k] = \frac{1}{P(k+1)} \sum_{j=k+1}^{\infty} P(j), \qquad P(k+1) > 0. \tag{13}$$
For example, consider a three-point discrete distribution with probability mass function p(0) = 1/2, p(1) = 1/4 and p(2) = 1/4. Then from (13) we obtain m(0) = 3/2 and m(1) = 1, while m(2) = 0. It is easily seen that σ²(X) = 11/16. However, we get
$$E[m^2(X)] = \frac{9}{4}\cdot\frac{1}{2} + 1\cdot\frac{1}{4} + 0\cdot\frac{1}{4} = \frac{11}{8} \ne \sigma^2(X).$$
Applying Theorem 1, one can get variance inequalities under suitable sufficient conditions. First, we recall that X is said to have increasing (decreasing) mean residual life, IMRL (DMRL) for short, if m(t) is increasing (decreasing) in t > 0. Moreover, for absolutely continuous nonnegative random variables X and Y with cdfs F and G, pdfs f and g, and mean residual lifetime functions m_X(t) and m_Y(t), respectively, we say that X is smaller than Y in the usual stochastic order (X ≤_st Y) if $\bar F(t) \le \bar G(t)$ for all t > 0; in the hazard rate order (X ≤_hr Y) if $\bar G(t)/\bar F(t)$ is increasing in t > 0; in the likelihood ratio order (X ≤_lr Y) if g(t)/f(t) is increasing in t > 0; and in the mean residual life order (X ≤_mrl Y) if m_X(t) ≤ m_Y(t) for all t > 0. For further information and properties about these concepts and on the stochastic orders that will be used in this paper (i.e., ≤_hr, ≤_rhr, ≤_ew, ≤_disp), we refer the reader to Shaked and Shanthikumar [26]. In view of the following theorem, we recall that neither of the orders ≤_st and ≤_mrl implies the other (see Section 2.A.2 of [26]).
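The three-point counterexample of Remark 1 can be verified with exact rational arithmetic. The sketch below is an illustration under one assumption: the discrete mean residual life is taken as m(k) = E[X − k | X > k], set to 0 when P(X > k) = 0, which is the reading that reproduces the values m(0) = 3/2, m(1) = 1 and m(2) = 0 quoted in the remark. The function names are hypothetical.

```python
from fractions import Fraction

# Probability mass function of the three-point example in Remark 1.
PMF = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}


def mean_residual_life(k, pmf=PMF):
    """m(k) = E[X - k | X > k]; returns 0 when P(X > k) = 0 (assumed convention)."""
    tail_mass = sum((p for j, p in pmf.items() if j > k), Fraction(0))
    if tail_mass == 0:
        return Fraction(0)
    excess = sum(((j - k) * p for j, p in pmf.items() if j > k), Fraction(0))
    return excess / tail_mass


def variance(pmf=PMF):
    """Exact variance of the discrete distribution."""
    mean = sum(j * p for j, p in pmf.items())
    return sum((j - mean) ** 2 * p for j, p in pmf.items())


def expected_squared_mrl(pmf=PMF):
    """E[m^2(X)], the quantity that equals the variance in the continuous case."""
    return sum(mean_residual_life(k, pmf) ** 2 * p for k, p in pmf.items())


if __name__ == "__main__":
    print("variance  =", variance())
    print("E[m^2(X)] =", expected_squared_mrl())
```

With this convention the two printed quantities differ, which is the point of the remark: the identity of Theorem 1 relies on absolute continuity.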

Theorem 2.
Let X and Y be nonnegative random variables with mean residual life functions m X (x) and m Y (x), respectively. Then The first inequality is obtained from the assumption X ≤ mrl Y while the last inequality is obtained by the fact that X ≤ st Y implies that E(ψ(X)) ≤ E(ψ(Y)) for all increasing functions ψ(·). Now, let X be IMRL. Then, we similarly have and hence the result stated in (i) follows. The proof of (ii) is similar. Arriaza et al. [27] for a recent characterization of the generalized Pareto distribution). It is well known that X has linear mean residual life function m( , with analogous constraints on the parameters. It is clear that if a = c and b < d, then X ≤ st Y and X ≤ mrl Y. If, in addition, a = c > 0, then both X and Y are IMRL. Indeed, in this case the result stated in point (i) of Theorem 2 is confirmed, being In the sequel, we give an upper bound for the generalized CRE in terms of the standard deviation. First, as extension of Theorem 2.1 of Asadi and Zohrevand [6] we provide an expression for the GCRE in terms of the mean residual life function.
Theorem 3. Let X be an absolutely continuous nonnegative random variable with survival function $\bar F(x)$ and mean residual life function m(x), and let X_n be the nth epoch time of an NHPP with intensity function (2). Then, for all n ∈ N,
$$\mathcal{E}_n(X) = E[m(X_n)]. \tag{14}$$
The identity (14) is a special case of Equation (34), which is proved in Section 4 below. As an application of the representations (12) and (14), consider the following examples.

Example 2.
(i) Assume that X is exponentially distributed with mean µ. Then, it is known that the MRL function of X is m(t) = µ, t > 0. Hence, we have σ 2 (X) = µ 2 , and E n (X) = µ, for any n ∈ N due to (14).
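(ii) Let X be uniformly distributed on (0, b), b > 0. Then m(t) = (b − t)/2 for 0 < t < b, so that (12) and (14) give
$$\sigma^2(X) = \frac{b^2}{12}, \qquad \mathcal{E}_n(X) = \frac{b - E[X_n]}{2} = \frac{b}{2^{n+1}}, \qquad n \in N,$$
where the last equality is obtained by noting that E[X_n] = b − b/2^n.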
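(iii) Let X have a Pareto (Lomax) distribution with parameters α > 1 and β > 0, with survival function and mean residual life given respectively by
$$\bar F(x) = \left(\frac{\beta}{x+\beta}\right)^{\alpha}, \qquad m(x) = \frac{x+\beta}{\alpha-1}, \qquad x \ge 0.$$
Hence, we get
$$\sigma^2(X) = E[m^2(X)] = \frac{\alpha\beta^2}{(\alpha-1)^2(\alpha-2)}, \qquad \alpha > 2,$$
and
$$\mathcal{E}_n(X) = E[m(X_n)] = \frac{E[X_n]+\beta}{\alpha-1} = \frac{\beta\,\alpha^n}{(\alpha-1)^{n+1}},$$
where the last equality is obtained by E[X_n] = βα^n/(α − 1)^n − β, for n ∈ N.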
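The closed forms in Example 2 can be cross-checked by integrating the definition of the GCRE numerically. The sketch below is an illustration only. It assumes the definition (1/n!) ∫ S(x) [−log S(x)]^n dx from case (iii) of Table 1, and the values µ (exponential), b/2^{n+1} (uniform) and βα^n/(α−1)^{n+1} (Lomax) that follow from the expressions for E[X_n] quoted in the example. The quadrature rule and the function names are our own choices.

```python
import math


def gcre(survival, n, steps=50_000):
    """Midpoint-rule approximation of (1/n!) * int_0^inf S(x) [-log S(x)]^n dx.

    The half-line is mapped onto (0, 1) through x = t / (1 - t).
    """
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        x = t / (1.0 - t)
        s = survival(x)
        if s > 0.0:  # outside the support the integrand vanishes
            total += s * (-math.log(s)) ** n / (1.0 - t) ** 2
    return total / steps / math.factorial(n)


def exponential_survival(mu):
    return lambda x: math.exp(-x / mu)


def uniform_survival(b):
    return lambda x: max(0.0, 1.0 - x / b)


def lomax_survival(alpha, beta):
    return lambda x: (beta / (x + beta)) ** alpha


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n,
              gcre(exponential_survival(2.0), n),      # compare with mu = 2
              gcre(uniform_survival(2.0), n),          # compare with b / 2^(n+1)
              gcre(lomax_survival(3.0, 2.0), n))       # compare with beta a^n / (a-1)^(n+1)
```

The Lomax case illustrates the remark made earlier in this section: the GCRE grows with n for this IMRL family, while it shrinks for the uniform law.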

Remark 2.
A class of distributions of interest in maintenance policies and shock models is the class of new better (worse) than used in expectation, NBUE (NWUE), distributions. A random lifetime X is said to be NBUE (NWUE) if m(t) ≤ (≥) m(0) = µ for all t > 0. For more details on these concepts, we refer the reader to, e.g., Barlow and Proschan [28]. If X is NBUE (NWUE), then under the assumptions of Theorem 3 we have
$$\mathcal{E}_n(X) \le (\ge)\ \mu, \qquad n \in N. \tag{15}$$
Moreover, from (12) one can see that σ²(X) ≤ (≥) µ², or equivalently
$$\sigma(X) \le (\ge)\ \mu. \tag{16}$$
The upper (lower) bounds in (15) and (16) show that the expected interepoch intervals (or the amount of uncertainty of a random variable X based on the GCRE) and the standard deviation are less (greater) than the expected value of X if it is NBUE (NWUE).
Thanks to (14), an iterative formula for the GCRE is proved in the following theorem. where Proof. By virtue of (14) and this giving the desired result.
Hereafter we provide an alternative iterative formula for the GCRE.

Theorem 5.
Under the assumption of Theorem 3, for all n ∈ N, we have where Z is an absolutely continuous nonnegative random variable having pdf Proof. The result follows from the probabilistic mean value theorem (cf. Theorem 4.1 of [29]) and making use of Equations (11) and (14).

Remark 3.
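It is worth mentioning that if X is IMRL (DMRL), since m′(x) ≥ (≤) 0, then as an immediate consequence of (17) or (18), we have
$$\mathcal{E}_n(X) \ge (\le)\ \mathcal{E}_{n-1}(X), \qquad n \in N.$$
Hereafter, we obtain an upper bound for the GCRE of X in terms of its standard deviation, thereby providing a suitable relation between two dispersion measures.
Theorem 6. Let X be an absolutely continuous nonnegative random variable with standard deviation σ(X) and GCRE function $\mathcal{E}_n(X)$. Then, for all n ∈ N,
$$\mathcal{E}_n(X) \le \frac{\sqrt{(2n-2)!}}{(n-1)!}\,\sigma(X) = \sqrt{\binom{2n-2}{n-1}}\;\sigma(X). \tag{20}$$
Proof. By the Cauchy-Schwarz inequality, for all n ∈ N we obtain
$$E[m(X)\,\Lambda^{n-1}(X)] \le \sqrt{E[m^2(X)]}\,\sqrt{E[\Lambda^{2n-2}(X)]}.$$
On the other hand, since Λ(X) is exponentially distributed with unit mean, it holds that
$$E[\Lambda^{2n-2}(X)] = \int_0^{\infty} u^{2n-2} e^{-u}\,du = (2n-2)!.$$
Therefore, noting that $E[m(X_n)] = E[m(X)\,\Lambda^{n-1}(X)]/(n-1)!$ due to (10), use of Theorems 1 and 3 completes the proof.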
In the special case n = 1, from Theorem 6 one can obtain a close relationship between the standard deviation and the CRE; i.e.,
$$C\, e^{H(X)} \le \mathcal{E}(X) \le \sigma(X), \tag{21}$$
where, for $C = \exp\{\int_0^1 \log(x\,|\log x|)\,dx\} \approx 0.2065$, the first inequality is obtained from Rao et al. [2]. It is worth pointing out that the inequalities given in (21) involve three uncertainty measures. Moreover, the related inequality between the differential entropy and the standard deviation is similar to relation (2) of Ebrahimi et al. [21]. They also pointed out that the first inequality in (21) becomes an equality if and only if X is normal. Moreover, if X is exponential then the second inequality is satisfied as an equality. As is well known, the entropy is a measure of disparity of the density function f(x) from the uniform distribution. On the other hand, the variance measures an average of distances of outcomes of the probability distribution f(x) from the mean. Additionally, as mentioned by Psarrakos and Toomaj [13] (see also Toomaj et al. [11]), the CRE acts like the standard deviation, i.e., it is a dispersion measure in spite of its formal similarity with the Shannon entropy. Although all three measures, i.e., entropy, standard deviation, and CRE, are measures of dispersion and uncertainty, the lack of a simple relationship between orderings of a distribution by the three measures derives from their quite substantial and subtle differences. All such measures reflect "concentration", but their respective metrics for concentration are different.
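The constant C and the bounds discussed above are easy to evaluate numerically. The following sketch is an illustration under stated assumptions: it reads (21) as C·exp{H(X)} ≤ CRE ≤ σ(X), it uses the Cauchy-Schwarz factor sqrt(C(2n−2, n−1)) as the multiplier of σ(X) in the bound of Theorem 6, and it evaluates the chain only for an exponential lifetime, for which H(X) = 1 + log µ and both the CRE and σ(X) equal µ. The closed form C = exp(−1 − γ), with γ the Euler-Mascheroni constant, follows from ∫ log x dx = −1 and ∫ log|log x| dx = −γ over (0, 1). Function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def rao_constant_closed_form():
    """C = exp{int_0^1 log(x |log x|) dx} = exp(-1 - gamma)."""
    return math.exp(-1.0 - EULER_GAMMA)


def rao_constant_numeric(steps=200_000):
    """Midpoint-rule evaluation of the same integral, as an independent check."""
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) / steps
        total += math.log(x * abs(math.log(x)))
    return math.exp(total / steps)


def variance_bound_factor(n):
    """Factor multiplying sigma(X) in the Cauchy-Schwarz bound for the GCRE of order n."""
    return math.sqrt(math.comb(2 * n - 2, n - 1))


def exponential_chain(mu):
    """Return (C * exp(H), CRE, sigma) for an exponential lifetime with mean mu."""
    entropy = 1.0 + math.log(mu)
    return rao_constant_closed_form() * math.exp(entropy), mu, mu


if __name__ == "__main__":
    print(rao_constant_closed_form(), rao_constant_numeric())
    print(exponential_chain(2.0))
    print([variance_bound_factor(n) for n in range(1, 6)])
```

For the exponential case the right-hand inequality is attained, while the left-hand side is strictly smaller, so the two ends of the chain are not simultaneously tight.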
We remark also that the Stirling formula allows one to obtain the following asymptotic result for the term given in the right-hand-side of (20): For the last theorem of this section, we provide suitable bounds for the GCRE. These are useful in reliability applications, since in a typical situation the only fact known a priori, for example, is that the component has an increasing (decreasing) failure rate due to wear (reverse ageing). We recall that if X is an absolutely continuous nonnegative random variable with hazard rate function λ(t), then X is said to have an increasing (decreasing) failure rate, say IFR (DFR), if λ(t) is increasing (decreasing) in t > 0. Theorem 7. Let X be an absolutely continuous nonnegative random variable with hazard rate function λ(t) and GCRE function E n (X). If X is IFR (DFR), then for all n ∈ N Hence, recalling that X has survival function F(t) and pdf f (x), from (iii) of Table 1 we have this giving the proof.
It is not hard to verify that if X is IFR, then We recall that the condition that X is IFR implies that X is IFRA, i.e., − 1 t log F(t) is increasing in t, so that the above relation indicates that the upper bound in Proposition 3.1 of Psarrakos and Toomaj [13] is a sharper bound. On the other hand, if X is DFR, then it is also DFRA. In this case, the inequality in (23) is reversed. This shows also that the lower bound in Proposition 3.1 of [13] improves the bound given in Theorem 7 when X is DFR.

Connection with the Excess Wealth Order
One of the most important issues in statistics, probability, actuarial science, risk theory, and other related areas is the concept of variability. The simplest way of comparing the variabilities of two distributions involves the associated standard deviations. However, the comparison of numerical measures is not always sufficiently informative. Several transforms and stochastic orders for comparing variabilities have been introduced and widely studied (see Shaked and Shanthikumar [26]). One such order for the measure of spread is the excess wealth order. For a nonnegative random variable X with probability distribution function F, we recall that the quantile function is defined by
$$F^{-1}(p) = \inf\{x : F(x) \ge p\}, \qquad p \in (0, 1).$$
The excess wealth transform (or right spread function) is, for p ∈ (0, 1),
$$W_X(p) = \int_{F^{-1}(p)}^{\infty} \bar F(x)\,dx.$$
This function is also related to the mean residual life function (4) by the following relation:
$$W_X(p) = (1-p)\, m_X(F^{-1}(p)), \qquad p \in (0, 1).$$
For an absolutely continuous nonnegative random variable X with strictly increasing distribution function F, Fernández-Ponce et al. [24] obtained the following representation for the variance of X in terms of excess wealth:
$$\sigma^2(X) = \int_0^1 [m_X(F^{-1}(p))]^2\,dp.$$
Analogously, in the following theorem we show that the GCRE can be expressed in terms of the excess wealth transform by means of (14). The proof is straightforward and thus is omitted.

Theorem 8.
For an absolutely continuous nonnegative random variable X, for any n ∈ N one has
$$\mathcal{E}_n(X) = \frac{1}{(n-1)!} \int_0^1 m_X(F^{-1}(p))\,[-\log(1-p)]^{n-1}\,dp. \tag{26}$$
Thus, using (26), for any n ∈ N we obtain
(ii) Let us consider the Pareto distribution given in case (iii) of Example 2. It is easily seen that m_X(F^{−1}(p)) = β(1 − p)^{−1/α}/(α − 1). Thus, from (26), for any n ∈ N we have
$$\mathcal{E}_n(X) = \frac{\beta\,\alpha^n}{(\alpha-1)^{n+1}}.$$
Let X and Y be random variables with distribution functions F and G, respectively. Then X is said to be smaller than Y in the excess wealth order, denoted as X ≤_ew Y, when W_X(p) ≤ W_Y(p) for all p ∈ (0, 1). It is known that X ≤_ew Y implies σ²(X) ≤ σ²(Y); see Shaked and Shanthikumar [26]. From (26) we thus obtain the following result, whose proof is straightforward.
Theorem 9. Let X and Y be absolutely continuous nonnegative random variables. If X ≤_ew Y, then $\mathcal{E}_n(X) \le \mathcal{E}_n(Y)$, for any n ∈ N.
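Consequently, we have the following implications:
$$X \le_{disp} Y \ \Longrightarrow\ X \le_{ew} Y \ \Longrightarrow\ \mathcal{E}_n(X) \le \mathcal{E}_n(Y),$$
for any n ∈ N. We recall that X is smaller than Y in the dispersive order (denoted by X ≤_disp Y) if F^{−1}(v) − F^{−1}(u) ≤ G^{−1}(v) − G^{−1}(u) for all 0 < u ≤ v < 1 (for more details, see, e.g., Shaked and Shanthikumar [26]).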
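The quantile-based representation of the GCRE discussed in this section lends itself to a direct numerical check. The sketch below is an illustration, not the authors' computation. It assumes that the representation of Theorem 8 reads (1/(n−1)!) ∫ m_X(F^{−1}(p)) [−log(1−p)]^{n−1} dp over (0, 1), which is what E[m(X_n)] becomes after the substitution p = F(x). To avoid round-off near p = 1, the mean residual life is passed as a function of the tail probability q = 1 − p, and the integral is computed after the change of variable u = −log(1 − p). Names and parameter values are hypothetical.

```python
import math


def gcre_from_excess_wealth(mrl_at_tail, n, upper=60.0, steps=60_000):
    """Approximate (1/(n-1)!) * int_0^1 m(F^{-1}(p)) [-log(1-p)]^(n-1) dp.

    With u = -log(1 - p) the integral becomes
    int_0^inf m(F^{-1}(1 - e^{-u})) u^(n-1) e^{-u} du, truncated at `upper`.
    `mrl_at_tail(q)` must return m(F^{-1}(1 - q)) for q in (0, 1].
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        q = math.exp(-u)  # tail probability 1 - p
        total += mrl_at_tail(q) * u ** (n - 1) * q
    return total * h / math.factorial(n - 1)


def lomax_mrl_at_tail(alpha, beta):
    # m(F^{-1}(p)) = beta (1 - p)^(-1/alpha) / (alpha - 1), as stated for the Pareto case
    return lambda q: beta * q ** (-1.0 / alpha) / (alpha - 1.0)


def uniform_mrl_at_tail(b):
    # Uniform law on (0, b): m(x) = (b - x)/2 and F^{-1}(p) = b p
    return lambda q: b * q / 2.0


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n,
              gcre_from_excess_wealth(lomax_mrl_at_tail(3.0, 2.0), n),
              gcre_from_excess_wealth(uniform_mrl_at_tail(2.0), n))
```

Because the representation depends on X only through p ↦ m_X(F^{−1}(p)), two lifetimes ordered pointwise in that function are automatically ordered in the GCRE of every order, which is the mechanism behind Theorem 9.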

Results on Dynamic GCRE and Residual Variance
In this section, we extend the previous results from the GCRE to the DGCRE (cf. Table 1), as well as to the variance residual life function. The mean residual life function at age t has been employed in life length studies by various authors; see, e.g., Hollander and Proschan [30], Bhattacharjee [31], Hall and Wellner [23], Gupta [32], and the references therein. Another quantity which has generated interest in recent years is the variance of the residual life function, defined by
$$\sigma^2(X_t) = \mathrm{Var}(X - t \mid X > t) = \frac{2}{\bar F(t)} \int_t^{\infty}\!\!\int_x^{\infty} \bar F(y)\,dy\,dx - m^2(t), \qquad t > 0. \tag{27}$$
It is involved in the formula for $\mathrm{Var}[\hat m_n(t)]$, where $\hat m_n(t)$ is an estimator of the mean residual life function; see Hall and Wellner [23]. Several properties of the variance residual life function are studied in Gupta [32], Gupta et al. [33], and Gupta and Kirmani [18], among others. In this section, we study some further results on this quantity and the GCRE. Similarly to (12), one can obtain an analogous representation for the variance residual life function in terms of the mean residual life function as follows:
$$\sigma^2(X_t) = E[m^2(X) \mid X > t] = \frac{1}{\bar F(t)} \int_t^{\infty} m^2(x)\, f(x)\,dx, \tag{28}$$
for all t > 0. Furthermore, its derivative is expressed as (see also Gupta [32])
$$\frac{d}{dt}\,\sigma^2(X_t) = \lambda(t)\,[\sigma^2(X_t) - m^2(t)]. \tag{29}$$
If X is an absolutely continuous nonnegative random variable, it is shown in Psarrakos and Navarro [8] and Navarro and Psarrakos [12] that, for t > 0 and n ∈ N, the DGCRE satisfies the following equalities:
$$\frac{d}{dt}\,\mathcal{E}_n(X; t) = \lambda(t)\,[\mathcal{E}_n(X; t) - \mathcal{E}_{n-1}(X; t)], \tag{30}$$
$$\mathcal{E}_n(X; t) = \frac{1}{\bar F(t)} \int_t^{\infty} \mathcal{E}_{n-1}(X; x)\, f(x)\,dx. \tag{31}$$
Moreover, recalling (3), in analogy with (11) we have:
$$\mathcal{E}_n(X; t) = E[X_{n+1} - X_n \mid X > t]. \tag{32}$$
Hereafter we specify the dynamic version of identity (14). To this aim we point out that
$$f_n(x \mid t) = \frac{[\Lambda(x) - \Lambda(t)]^{n-1}}{(n-1)!}\,\frac{f(x)}{\bar F(t)}, \qquad x \ge t, \tag{33}$$
is the pdf of [X_n | X > t], n ∈ N. Recalling the correspondence between the structure of upper record values and the occurrence times of non-homogeneous Poisson processes under minimal repair (see Gupta and Kirmani [18]), we note that f_n(x | t) can be viewed also as the pdf of the nth epoch time of a (delayed) NHPP whose intensity function is λ(x) 1_{x≥t}.

Theorem 10.
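Under the assumptions of Theorem 1, for all n ∈ N and t ≥ 0 we have
$$\mathcal{E}_n(X; t) = E[m(X_n) \mid X > t] = \int_t^{\infty} m(x)\, f_n(x \mid t)\,dx. \tag{34}$$
Proof. Recalling (3), since $\lambda_t(z) = \frac{d}{dz}\Lambda_t(z) = \lambda(t + z)$, it is not hard to see that for all n ∈ N one has
$$\frac{\Lambda_t^n(x)}{n!} = \int_0^x \lambda(t+u)\,\frac{\Lambda_t^{n-1}(u)}{(n-1)!}\,du, \qquad x \ge 0.$$
From the above relation, from the definition given in (iii) of Table 1 applied to X_t, and using Fubini's theorem, we obtain
$$\mathcal{E}_n(X; t) = \int_0^{\infty} \bar F_t(x)\,\frac{\Lambda_t^n(x)}{n!}\,dx = \int_0^{\infty} \lambda(t+u)\,\frac{\Lambda_t^{n-1}(u)}{(n-1)!} \left[\int_u^{\infty} \bar F_t(x)\,dx\right] du.$$
By setting y = u + t and recalling (4), we have
$$\mathcal{E}_n(X; t) = \int_t^{\infty} m(y)\,\frac{[\Lambda(y) - \Lambda(t)]^{n-1}}{(n-1)!}\,\frac{f(y)}{\bar F(t)}\,dy.$$
Thus, due to (33), we get the result (34).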
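The dynamic identity of Theorem 10 can be illustrated numerically on a case where everything is explicit. The sketch below is an illustration under stated assumptions: X is uniform on (0, b), so that S(x)/S(t) = (b − x)/(b − t) and m(x) = (b − x)/2; the DGCRE is taken as in case (v) of Table 1, i.e. (1/n!) ∫ over (t, ∞) of [S(x)/S(t)] [Λ(x) − Λ(t)]^n dx; and the conditional density is taken as f_n(x | t) = [Λ(x) − Λ(t)]^{n−1} f(x) / ((n−1)! S(t)) for x ≥ t. Both sides should then equal (b − t)/2^{n+1}, since the residual lifetime of a uniform law is again uniform. Function names are hypothetical.

```python
import math


def dgcre_uniform(b, t, n, steps=50_000):
    """Left side: (1/n!) int_t^b [S(x)/S(t)] [Lambda(x) - Lambda(t)]^n dx."""
    h = (b - t) / steps
    total = 0.0
    for i in range(steps):
        x = t + (i + 0.5) * h
        ratio = (b - x) / (b - t)  # S(x)/S(t) for the uniform law on (0, b)
        total += ratio * (-math.log(ratio)) ** n
    return total * h / math.factorial(n)


def mrl_average_uniform(b, t, n, steps=50_000):
    """Right side: int_t^b m(x) f_n(x | t) dx with m(x) = (b - x)/2."""
    h = (b - t) / steps
    total = 0.0
    for i in range(steps):
        x = t + (i + 0.5) * h
        ratio = (b - x) / (b - t)
        # f_n(x | t) = [-log ratio]^(n-1) / (n-1)! * f(x)/S(t), and f(x)/S(t) = 1/(b - t)
        density = (-math.log(ratio)) ** (n - 1) / math.factorial(n - 1) / (b - t)
        total += 0.5 * (b - x) * density
    return total * h


if __name__ == "__main__":
    b, t = 2.0, 0.5
    for n in (1, 2, 3):
        print(n, dgcre_uniform(b, t, n), mrl_average_uniform(b, t, n), (b - t) / 2 ** (n + 1))
```

Setting t = 0 in the same functions gives a check of the static identity for the GCRE in the uniform case.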
Clearly, for t = 0, Equation (34) yields identity (14). We remark that the dynamic version of Theorem 6 provides the following upper bound: In the next theorem, through two suitable expressions we provide different probabilistic meanings for the DGCRE. The first is given in terms of a conditional expectation involving the cumulative hazard function and the hazard rate function. The second involves the conditional covariance of the n-th epoch time and the random variable Λ(X n ). Theorem 11. For any t ≥ 0 and for all n ∈ N, it holds that Proof. (i) Recalling (33) we have this giving the desired result due to the definition of DGCRE given in (v) of Table 1.
(ii) By the definition, we have From (33), one can easily obtain Therefore, the result follows by means of (32).
For t = 0, Theorem 11 immediately yields the following result.
Let us now provide a result for the variance residual life functions of two random variables satisfying the proportional mean residual life model (see Zahedi [34] and Nanda et al. [35] for some contributions on this topic).

Theorem 12. Let X and Y be absolutely continuous nonnegative random variables with survival functions F(t)
and G(t), hazard functions λ X (t) and λ Y (t), and residual variances σ 2 (X t ) and σ 2 (Y t ), respectively. Assume that X and Y satisfy the proportional mean residual life model, so that where c is constant. If c > 1 and if σ 2 (X t ) is an increasing function of t, then σ 2 (Y t ) is also increasing in t, Proof. From differentiation of (35), after some calculations and making use of Equations (2) and (4), we first obtain that the hazard functions are related to the mean residual life of X by the following relation: Since σ 2 (X t ) is increasing in t by assumption, from (29) we have for all t > 0. We need to show that or, due to (37), the stronger condition To that aim, we define the function We shall prove that ϕ(t) ≤ 0. Differentiating ϕ(t) and using (29), (35), and (36), we have Hence, due to (37), since c > 1, we have ϕ (t) ≥ 0. From the fact that σ 2 (X t ) ≥ 0 and assumption Since ϕ(t) is increasing, this implies that ϕ(t) ≤ 0, and the desired result then follows.
The next theorem generalizes Theorem 4.5 of Asadi and Zohrevand [6] for the GCRE and the variance residual life function. We first recall the following result, which is stated in Navarro and Psarrakos [12]. Lemma 1. If X is IMRL (DMRL), then E n (X; t) is increasing (decreasing) in t > 0, for all n ∈ N. Theorem 13. Let X and Y be absolutely continuous nonnegative random variables with survival functions F(t) and G(t), variance residual life functions σ 2 (X t ) and σ 2 (Y t ), and DGCRE functions E n (X; t) and E n (Y; t), respectively, such that E n (X; 0) < ∞ and E n (Y; 0) < ∞, for fixed n ∈ N. Let X ≤ hr Y, and let either X or Y be IMRL. Then (i) E n (X; t) ≤ E n (Y; t), for all t > 0 and for all n ∈ N; (ii) σ 2 (X t ) ≤ σ 2 (Y t ), for all t > 0.

Proof. (i) We prove it by induction, when
for all increasing functions ψ(·), and in particular for m X (·) since X is IMRL. Hence, from these conditions, recalling (31) one can obtain, for all t > 0, Assume now that E n−1 (X; t) ≤ E n−1 (Y; t), for all t > 0, and for a fixed n ≥ 2. We recall that if X is IMRL, then E n (Y; t) is increasing in t > 0, for all n ∈ N (see Lemma 1). Thus, using similar arguments of n = 1, from (31) we obtain, for all t > 0, for n ≥ 2. Thus, in this case the proof is completed. The case when Y is IMRL can be treated similarly.
(ii) By using similar arguments of part (i), and due to (28), the result is obtained.
In the sequel, we compare the GCRE and variance residual life functions of two random variables under weaker conditions.
Theorem 14. Let X and Y be absolutely continuous nonnegative random variables with survival functions $\bar F(t)$ and $\bar G(t)$, mean residual life functions m_X(t) and m_Y(t), and DGCRE functions $\mathcal{E}_n(X; t)$ and $\mathcal{E}_n(Y; t)$, respectively. If X ≤_mrl Y, and if X is DMRL and Y is IMRL, then
(i) $\mathcal{E}_n(X; t) \le \mathcal{E}_n(Y; t)$, for all t > 0 and for all n ∈ N;
(ii) σ²(X_t) ≤ σ²(Y_t), for all t > 0.
Proof. (i) The proof proceeds by induction. For n = 1, we have The first inequality in (38) is obtained by noting that Y is IMRL and X is DMRL, while the last inequality is obtained from the assumption X ≤ mrl Y. This gives the stated result for n = 1. Now we assume that E n−1 (X; t) ≤ E n−1 (Y; t), for all t > 0, and for a given n ≥ 2. Since X is DMRL and Y is IMRL, from Lemma 1 we have that E n−1 (X; t) and E n−1 (Y; t) are decreasing and increasing, respectively. Hence, recalling (31) we get where the last inequality is due to the inductive hypothesis. This completes the proof.
(ii) The proof of σ 2 (X t ) ≤ σ 2 (Y t ), t > 0, can be obtained similarly, starting from (28), making use of the assumption that m X (x) and m Y (x) are decreasing and increasing, respectively, and finally, using X ≤ mrl Y.
Finally, we conclude this section with a characterization of the exponential distribution in terms of variance residual life function and DCRE.
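Theorem 15. Let X be a nonnegative, absolutely continuous random variable having support (0, ∞), with survival function $\bar F(t)$, residual variance σ²(X_t), and DCRE $\mathcal{E}(X; t)$. Then
$$\sigma(X_t) = \mathcal{E}(X; t) \quad \text{for all } t > 0 \tag{39}$$
if and only if X is exponentially distributed.
Proof. The necessity is obvious. To prove the sufficiency, let Equation (39) hold so that
$$\sigma^2(X_t) = \mathcal{E}^2(X; t), \qquad t > 0. \tag{40}$$
By differentiating both sides of such identity with respect to t, recalling (29) and (30) for n = 1, and (8), we obtain
$$\lambda(t)\,[\sigma^2(X_t) - m^2(t)] = 2\,\lambda(t)\,\mathcal{E}(X; t)\,[\mathcal{E}(X; t) - m(t)].$$
Hence, from (40) we get
$$\mathcal{E}^2(X; t) - m^2(t) = 2\,\mathcal{E}(X; t)\,[\mathcal{E}(X; t) - m(t)].$$
This reduces to $[\mathcal{E}(X; t) - m(t)]^2 = 0$, which yields
$$\mathcal{E}(X; t) = m(t), \qquad t > 0.$$
The proof is thus complete; recall that the latter identity is fulfilled if and only if X is exponentially distributed (see Theorem 4.8 of Asadi and Zohrevand [6]).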

Applications in Reliability Theory
In the study of reliability of redundant systems, parallel systems play an important role. A parallel system consisting of m components, is a system which functions if and only if at least one of its m components functions. Let X, X 1 , X 2 , . . . , X m be i.i.d. random variables with common absolutely continuous distribution function F and survival function F = 1 − F, and assume that X i , i = 1, 2, . . . , m, is the time up to the failure of the ith component of a coherent system. We denote by X k:m , k = 1, 2, . . . , m, the lifetime of the component having the kth smallest lifetime among m i.i.d. system components. The random lifetime X k:m may represent the lifetime of an (m − k + 1)-out-of-m system, which consists of m components and functions if and only if at least (m − k + 1) out of m components functions. For the analysis of an (m − k + 1)-out-of-m system, we can define X r,k,m (t) = [X k:m − t|X r:m > t], for all 1 ≤ r ≤ k ≤ m. The conditional random variable X r,k,m (t) denotes the residual lifetime of the system under the condition that at least m − r + 1 components of the system are working at time t. The survival function of this lifetime is expressed as for all x ≥ t > 0. We denote the corresponding mean residual lifetime by m r,k,m (X, t) = E[X k:m − t|X r:m > t], 1 ≤ r ≤ k ≤ m.
Asadi and Goliforushani [36] provided the following expression where and where φ(t) = F(t)/F̄(t), t > 0, is the odds function. This measure has been studied by several authors in the reliability literature. Applying (42) and recalling (12), the residual variance of X_{r,k,m}(t), for t > 0 will be denoted is the pdf of X_{r,k,m}(t). Recalling (31), one can obtain the GCRE of X_{r,k,m}(t) as E_n(X_{r,k,m}(t)) = ∫_t^∞ E_{n−1}(X_{r,k,m}(x)) f_{r,k,m}(x|t) dx, t > 0, for all 1 ≤ r ≤ k ≤ m, and for all n ∈ N.
Asadi and Goliforushani [36] have shown that, when the components of an (m − k + 1)-out-of-m system are IFR, the mean residual life function of X_{r,k,m}(t) is a decreasing function of time t. In the following theorem, we prove similar results for the variance residual life function and the GCRE. We prove the result only for the variance, since the proof for the other measure is similar.
Theorem 16. Let X_1, ..., X_m denote the i.i.d. lifetimes of an (m − k + 1)-out-of-m system. If X is IFR, then σ²_{r,k,m}(X, t) and E_n(X_{r,k,m}(t)) are decreasing functions of t.
Proof. The results follow from Theorem 2.4 of Asadi and Goliforushani [36] and Theorem 3.3 of Gupta [32], together with Lemma 1.
Other comparisons of (m − k + 1)-out-of-m systems are given in the next two theorems.
Theorem 17. Let X_1, ..., X_m denote the lifetimes of an (m − k + 1)-out-of-m system. If X is DFR, then
Proof. (i) Let X be DFR. Then, from (43) we have The first inequality follows from Remark 3 of Asadi and Bayramoglu [37], by noting that m²_{1,k−1,m−1}(X, x) ≤ m²_{1,k,m}(X, x) for all x > 0. The second inequality follows from two facts: when X is DFR, m_{1,k,m}(X, x) is an increasing function of x by Theorem 2 of Asadi and Bayramoglu [37]; and X_{1,k−1,m−1}(t) ≤_hr X_{1,k,m}(t) for all t > 0. The latter relation can be obtained as follows. From Remark 1 of Asadi and Bayramoglu [37], the distribution of X_{1,k,m}(t) is identical to the distribution of the k-th order statistic of a sample taken from the conditional distribution of [X − t | X > t]. Therefore, from Corollary 1.C.38 of Shaked and Shanthikumar [26] it holds that X_{1,k−1,m−1}(t) ≤_lr X_{1,k,m}(t), which implies X_{1,k−1,m−1}(t) ≤_hr X_{1,k,m}(t) for all t > 0. (ii) The proof is similar to that of (i) and thus is omitted.
The following theorem gives sufficient conditions for the comparison of two (m − k + 1)-out-of-m systems based on their variance residual life functions and dynamic GCREs.
Theorem 18. Let us consider two (m − k + 1)-out-of-m systems with i.i.d. component lifetimes distributed as X and Y, and having distribution functions F and G, respectively. If X ≤_hr Y and X is DFR, then
Proof. For t > 0, from (43) we have Since X is DFR, m²_{1,k,m}(X, t) is an increasing function of t due to Theorem 2 of Asadi and Bayramoglu [37]. Moreover, X ≤_hr Y implies X_{1,k,m}(t) ≤_hr Y_{1,k,m}(t), and hence the first inequality is obtained. The second inequality follows from Theorem 3 of Asadi and Bayramoglu [37]. Finally, the proof of part (ii) is similar to the proof of part (i) and thus is omitted.

Results on the GCE
As pointed out by Kayal [16], there is an intimate connection between the GCE and lower record values. Let Y_1, Y_2, ... be a sequence of i.i.d. nonnegative random variables with a common absolutely continuous cdf F and pdf f(x). An observation Y_i is said to be a lower record if Y_i < Y_j for all j < i. Assuming that Y_i occurs at time i, the record time sequence is defined as The random variables X_{n+1} = Y_{L(n+1)}, n ∈ N_0, are said to be the lower records, with Y_{L(1)} =_d X. Denoting by F_{n+1}(x) the cumulative distribution function of X_{n+1}, n ∈ N_0, it follows that so that the pdf of X_{n+1} is given by where T(x) is the cumulative reversed hazard function defined in (5). Recalling the definition of GCE given in case (iv) of Table 1, from (44) we obtain Thus, the GCE of order n corresponds to the expected spacings of lower record values. Note that for n = 0, CE_0(X) = ∫_0^∞ F(x) dx, which may be divergent.
With reference to the GCE, in the following theorem we obtain a result analogous to (14). The proof is omitted, being similar to that of Theorem 3 since
Theorem 19. Let X be an absolutely continuous nonnegative random variable with mean inactivity time function µ(t). Then, for all n ∈ N one has
In the following theorem, we determine equivalent expressions for the GCE analogous to those given in Theorems 4 and 5, and thus the proof is omitted.
Theorem 20. Under the assumptions of Theorem 19, for all n ∈ N, we have (i) where Z is an absolutely continuous nonnegative random variable having pdf
For an absolutely continuous nonnegative random variable X, for all t > 0 such that F(t) > 0, the mean inactivity time (6) can be written as where denotes the mean failure time of a system conditioned by a failure before time t, also named "mean past lifetime". The derivative of this function is given by for all t > 0. In the next theorem, we prove that the variance of a random variable can also be written in terms of the mean inactivity time function.
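Returning to the record-value interpretation of the GCE described in this section, the following sketch illustrates it for a uniform parent on (0, 1). It assumes the usual form CE_n(X) = (1/n!) ∫ F(x) [T(x)]^n dx with T(x) = −log F(x), and it reads "expected spacing" as E[X_n − X_{n+1}]; both are assumptions of this sketch, since the displayed formulas are not reproduced here. The simulation also uses the fact that, for a uniform parent, the next lower record given the current record x is uniform on (0, x). All function names are illustrative.

```python
import math
import random

def gce(cdf, n, lower, upper, steps=100_000):
    """Midpoint-rule value of (1/n!) * int F(x) * (-log F(x))**n dx."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        F = cdf(lower + (i + 0.5) * h)
        if F > 0.0:
            total += F * (-math.log(F)) ** n
    return h * total / math.factorial(n)

def mean_lower_records(n_records, n_sims=100_000, seed=0):
    """Estimate E[X_1], ..., E[X_{n_records}] for a uniform(0, 1) parent."""
    rng = random.Random(seed)
    sums = [0.0] * n_records
    for _ in range(n_sims):
        x = 1.0
        for j in range(n_records):
            x *= rng.random()   # next lower record is uniform on (0, x)
            sums[j] += x
    return [s / n_sims for s in sums]

if __name__ == "__main__":
    means = mean_lower_records(4)
    for n in (1, 2, 3):
        spacing = means[n - 1] - means[n]          # E[X_n - X_{n+1}]
        print(n, gce(lambda x: x, n, 0.0, 1.0), spacing)
```

For the uniform case both columns should be close to 2^{−(n+1)}, since E[X_n] = 2^{−n} and ∫_0^1 x (−log x)^n dx = n!/2^{n+1}.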
Theorem 21. Let X be an absolutely continuous nonnegative random variable with pdf f , cdf F, and mean inactivity time µ(x), where the second moment E(X 2 ) is finite. Then Proof. Let us set Using (47), we obtain Recalling (48), it holds that Integration by parts gives By substituting Equation (51) into (50), we have and this gives (49).
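As a numerical illustration of Theorem 21, the sketch below compares the variance computed directly with the expected squared mean inactivity time, E[µ²(X)]. It assumes that identity (49) reads σ²(X) = E[µ²(X)], which is what the dual of the classical mean-residual-life representation suggests, and it uses µ(t) = ∫_0^t F(x) dx / F(t). The power-law cdf F(x) = x^a on (0, 1) and all function names are illustrative choices, not taken from the paper.

```python
def integrate(func, a, b, steps=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def mean_inactivity_time(cdf, t, steps=400):
    """mu(t) = int_0^t F(x) dx / F(t), for t with F(t) > 0."""
    return integrate(cdf, 0.0, t, steps) / cdf(t)

def variance_direct(pdf, upper):
    """Var(X) = E[X^2] - (E[X])^2 for a pdf supported on (0, upper)."""
    mean = integrate(lambda x: x * pdf(x), 0.0, upper)
    second = integrate(lambda x: x * x * pdf(x), 0.0, upper)
    return second - mean ** 2

def variance_via_mit(cdf, pdf, upper):
    """E[mu(X)^2], the quantity assumed to equal Var(X)."""
    return integrate(
        lambda x: mean_inactivity_time(cdf, x) ** 2 * pdf(x), 0.0, upper
    )

if __name__ == "__main__":
    a = 2.0  # illustrative shape parameter: F(x) = x**a on (0, 1)
    cdf = lambda x: x ** a
    pdf = lambda x: a * x ** (a - 1.0)
    print(variance_direct(pdf, 1.0), variance_via_mit(cdf, pdf, 1.0))
```

With a = 2 one has µ(t) = t/3, so E[µ²(X)] = E[X²]/9 = 1/18, which agrees with the direct computation 1/2 − (2/3)² = 1/18.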
In analogy with the result given in Theorem 2, on the basis of identity (49) we can now state the following theorem. First we remark that, for two absolutely continuous nonnegative random variables X and Y with cdf's F and G, pdf's f and g, and mean inactivity time functions µ_X(t) and µ_Y(t), respectively, we say that X is smaller than Y in the mean inactivity time order, denoted by X ≤_mit Y, if µ_X(t) ≥ µ_Y(t) for all t > 0. Moreover, we recall that X is said to have increasing mean inactivity time (IMIT) if µ(t) is increasing in t > 0. The proof of the following theorem is omitted, being similar to that of Theorem 2.
Theorem 22. Let X and Y be nonnegative random variables with mean inactivity time functions µ X (t) and µ Y (t), respectively. Let X ≤ st Y and X ≤ mit Y. If either X or Y is IMIT, then σ 2 (X) ≤ σ 2 (Y).
Analogously to (19), hereafter we obtain an upper bound for the GCE in terms of the expected value of the squared mean inactivity time. The proof is omitted being similar to that of Theorem 6.

Theorem 23. Let X be an absolutely continuous nonnegative random variable with mean inactivity time µ(t). Then, for all n ∈ N,
In the special case n = 1, from Theorem 23, one can obtain a close relationship between the standard deviation and the cumulative entropy; i.e., C e^{H(X)} ≤ CE_1(X) ≤ σ(X), similarly to (21).
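The chain of bounds for n = 1 can be checked numerically on a simple family. The sketch below uses the cdf F(x) = x^a on (0, 1) with a ≥ 1, an illustrative choice. It assumes CE_1(X) = −∫ F(x) log F(x) dx and takes the constant in the lower bound to be C = exp(∫_0^1 log(x |log x|) dx) = exp(−1 − γ) ≈ 0.2065, with γ the Euler constant; this value of C is an assumption based on the known lower bound for the cumulative entropy and is not stated in this section.

```python
import math

EULER_GAMMA = 0.5772156649015329

def integrate(func, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def bounds_for_power_law(a):
    """Return (C * exp(H(X)), CE_1(X), sigma(X)) for F(x) = x**a on (0, 1)."""
    cdf = lambda x: x ** a
    pdf = lambda x: a * x ** (a - 1.0)
    mean = integrate(lambda x: x * pdf(x), 0.0, 1.0)
    second = integrate(lambda x: x * x * pdf(x), 0.0, 1.0)
    sigma = math.sqrt(second - mean ** 2)
    ce1 = -integrate(lambda x: cdf(x) * math.log(cdf(x)), 0.0, 1.0)
    entropy = -integrate(lambda x: pdf(x) * math.log(pdf(x)), 0.0, 1.0)
    c = math.exp(-1.0 - EULER_GAMMA)   # assumed value of the constant C
    return c * math.exp(entropy), ce1, sigma

if __name__ == "__main__":
    for a in (1.0, 2.0, 3.0):
        print(a, bounds_for_power_law(a))
```

For this family CE_1(X) = a/(a + 1)², so the uniform case a = 1 gives CE_1(X) = 1/4, which lies between C ≈ 0.2065 and σ(X) = 1/√12 ≈ 0.2887.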

Results on Past Variance and DGCE
In analogy with the variance of the residual life function defined in (27), we can introduce the variance of inactivity time (VIT) as (see, e.g., Mahdy [38]) Comparing the dispersion of inactivity time distributions is of interest in some situations, especially when the mean inactivity times of these distributions are not ordered. For example, suppose that the random variable X represents the failure time (or time of death) of a component or a living organism. Assume that a component or a patient with lifetime X fails (dies) at time t or sometime before t, so that the component can be regarded as a black box, in the sense that the exact failure time or death time X is unknown. In this case, one wishes to estimate the time that has elapsed since the failure and to study the dispersion of this elapsed interval of time. Therefore, several authors have considered properties and stochastic orders of the variance inactivity time function; see Mahdy [38,39] and Kayid and Izadkhah [40] and the references therein.
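To make the black-box interpretation concrete, the sketch below computes the mean and the variance of the inactivity time X_[t] = [t − X | X ≤ t] from the cdf alone. It uses the survival function of X_[t], namely F(t − u)/F(t) for 0 < u < t, which gives E[X_[t]] = ∫_0^t F(x) dx / F(t) and E[X_[t]²] = 2 ∫_0^t (t − x) F(x) dx / F(t). The function names and the example cdf are illustrative and not taken from the paper.

```python
def integrate(func, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def mean_inactivity_time(cdf, t):
    """E[t - X | X <= t] = int_0^t F(x) dx / F(t)."""
    return integrate(cdf, 0.0, t) / cdf(t)

def variance_inactivity_time(cdf, t):
    """Var[t - X | X <= t] from the first two moments of X_[t]."""
    second = 2.0 * integrate(lambda x: (t - x) * cdf(x), 0.0, t) / cdf(t)
    return second - mean_inactivity_time(cdf, t) ** 2

if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
    for t in (0.25, 0.5, 1.0):
        print(t, mean_inactivity_time(uniform_cdf, t),
              variance_inactivity_time(uniform_cdf, t))
```

For a uniform lifetime on (0, 1), the conditional law of X given X ≤ t is uniform on (0, t), so the two printed quantities should be close to t/2 and t²/12.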
Similarly to (49), one can obtain an analogous representation for the variance inactivity time function in terms of the mean inactivity time function as follows: In analogy with (46) we have: Hereafter, we provide a result that is dual to Theorem 10, and hence its proof is omitted. To that end, we note that is the pdf of [X_n | X ≤ t], n ∈ N.
Theorem 24. Under the assumptions of Theorem 19, for all n ∈ N and t ≥ 0, we have
We remark that the dynamic version of Theorem 23 provides the following upper bound:
Now, with reference to Theorem 11, we obtain the dual results for the past time. In fact, through two suitable expressions we provide different probabilistic meanings for the DGCE, where the first is given in terms of a conditional expectation involving the cumulative reversed hazard function as well as the reversed hazard rate function.

Theorem 25.
For any t ≥ 0 and for all n ∈ N, it holds that τ(X n ) X ≤ t = CE n (X; t); (ii) 1 n Cov[X n , T(X n ) | X ≤ t] = −CE n (X; t).
The analogies between Equations (27) and (52) allow us to provide various results for the VIT that are similar to those for the variance of the residual life. The following theorem is an analogue of Theorem 12; the proof is similar and thus is omitted. In this case we consider random variables having proportional mean inactivity time functions.
Theorem 26. Let X and Y be absolutely continuous nonnegative random variables with survival functions F(t) and G(t), mean inactivity time functions µ_X(t) and µ_Y(t), and variances of inactivity time σ²[X_[t]] and σ²[Y_[t]], respectively. Assume that µ_Y(t) = c µ_X(t), where c is constant. If c > 1 and if σ²[X_[t]] is an increasing function of t, then σ²[Y_[t]] is also increasing in t.
We conclude with a result for the generalized cumulative entropy and the variance of inactivity time that is analogous to Theorem 13. Again, the proof is similar and thus is omitted. Recall that X is IMIT if µ(t) is increasing in t > 0. Moreover, consider the following lemma, which is stated in Di Crescenzo and Toomaj [17].

Lemma 2. If X is IMIT, then CE_n(X; t) is increasing in t > 0, for all n ∈ N.
Theorem 27. Let X and Y be absolutely continuous nonnegative random variables with variances of inactivity time σ²[X_[t]] and σ²[Y_[t]], and DGCE functions CE_n(X; t) and CE_n(Y; t), respectively, such that CE_n(X; 0) < ∞ and CE_n(Y; 0) < ∞, for fixed n ∈ N. Let X ≤_rhr Y, and let either X or Y be IMIT. Then (i) CE_n(X; t) ≤ CE_n(Y; t), for all t > 0 and for all n ∈ N; (ii) σ²[X_[t]] ≤ σ²[Y_[t]], for all t > 0.

Conclusions
Besides being related to the upper record values of a sequence of independent and identically distributed random variables and to the relevation transform, the GCRE also has an intimate connection with the non-homogeneous Poisson process. Hence, several findings in the literature on the GCRE can be used in shock models, preventive maintenance, repairable systems, and standby systems. It is known that the GCRE has some important variability properties, such as location invariance and positive homogeneity (i.e., E_n(aX + b) = a E_n(X) for all a > 0 and b ∈ R) and standardization (i.e., for all c ∈ R, one has E_n(c) = 0). These properties suggest that the GCRE can also be considered as a dispersion measure. In this respect, a connection of the GCRE with the standard deviation has been derived. We have shown that, when a random variable is absolutely continuous, its variance can be represented in terms of either the mean residual lifetime or the mean inactivity time. Based on these expressions, several preservation properties with respect to some well-known stochastic orders have been investigated. Moreover, the connection of the GCRE with the excess wealth order has been studied, together with some properties of the dynamic version of the GCRE. Similar results have also been obtained for the GCE, which is the dual measure of the GCRE.
Future developments based on the ideas considered in this paper will be oriented to the case of weighted measures involving cumulative weighted random variables (see Toomaj and Di Crescenzo [41]). Other related research lines may involve (i) the information-based measures of statistical dispersion based on entropy and Fisher information treated by Kostal et al. [42], and (ii) the variance of typical information measures, such as the differential entropy and the Tsallis entropy (see, e.g., Di Crescenzo and Paolillo [43], and Wei [44]).
Finally, we note that a new kind of entropy that combines the concepts of CRE and CE is the cumulative paired φ-entropy. The construction of such a notion is based on both the distribution function and the survival function of X, through the functional CPE_φ(X) = ∫_R [φ(F(x)) + φ(F̄(x))] dx. See Klein et al. [45] and Klein and Doll [46] for detailed investigations on CPE_φ, also dealing with relationships of the cumulative paired entropy to the variance. This entropy has the properties of a measure of scale (or dispersion). Suitable choices of φ allow one to introduce further generalized measures. This approach can also be adopted to construct paired combinations of the GCRE and GCE, and of their dynamic counterparts, which suggests new research ideas for future developments.