Interval Entropy and Informative Distance

The Shannon interval entropy function has been proposed in the reliability literature as a useful dynamic measure of uncertainty for two-sided truncated random variables. In this paper, we show that interval entropy can uniquely determine the distribution function. Furthermore, we propose a measure of discrepancy between two lifetime distributions over an interval of time, based on Kullback-Leibler discrimination information. We study various properties of this measure, including its connection with residual and past measures of discrepancy and interval entropy, and we obtain its upper and lower bounds.


Introduction
Recently, information theory has attracted the attention of statisticians. Sunoj et al. [1] have explored the use of information measures for doubly truncated random variables, which play a significant role in studying the various aspects of a system when it fails between two time points. In reliability theory and survival analysis, the residual entropy was considered in Ebrahimi and Pellerey [2]; it measures the expected uncertainty contained in the remaining lifetime of a system. The residual entropy has been used to measure the wear and tear of components and to characterize, classify and order distributions of lifetimes by Belzunce et al. [3] and Ebrahimi [4]. The notion of past entropy, which can be viewed as the entropy of the inactivity time of a system, was introduced in Di Crescenzo and Longobardi [5].
Ebrahimi and Kirmani [6] introduced the residual discrimination measure and studied the minimum discrimination principle. Di Crescenzo and Longobardi [7] have considered the past discrepancy measure and presented a characterization of the proportional reversed hazards model. Furthermore, the use of information measures for doubly truncated random variables was explored by Misagh and Yari [8,9]. In this paper, continuing their work, we propose a new measure of discrepancy between two doubly truncated life distributions. The remainder of this paper is organized as follows: in Section 2, some results, including the uniqueness of interval entropy and its invariance property, are presented. Section 3 is devoted to definitions of dynamic measures of discrimination for residual and past lifetimes, and the notion of interval discrimination measure is introduced. In Section 4 we present some results and properties of interval entropy and interval discrimination measures. Some conclusions are given in Section 5. Throughout this paper we consider absolutely continuous random variables.

Interval Entropy
Let $X$ be a non-negative random variable describing a system failure time. We denote the probability density function of $X$ as $f(x)$, the failure distribution as $F(x) = P(X \le x)$ and the survival function as $\bar{F}(x) = P(X > x)$. The Shannon [10] information measure of uncertainty is defined as:

$$H(X) = -E[\log f(X)] = -\int_0^{\infty} f(x) \log f(x)\,dx, \tag{1}$$

where log denotes the natural logarithm. Ebrahimi and Pellerey [2] considered the residual entropy of the non-negative random variable $X$ at time $t$ as:

$$H_X(t) = -\int_t^{\infty} \frac{f(x)}{\bar{F}(t)} \log \frac{f(x)}{\bar{F}(t)}\,dx. \tag{2}$$

Given that a system has survived up to time $t$, $H_X(t)$ essentially measures the uncertainty represented by the remaining lifetime. The residual entropy has been used to measure the wear and tear of systems and to characterize, classify and order distributions of lifetimes. See Belzunce et al. [3], Ebrahimi [4] and Ebrahimi and Kirmani [6]. Di Crescenzo and Longobardi [5] introduced the notion of past entropy and motivated its use in real-life situations. They also discussed its relationship with the residual entropy. Formally, the past entropy of $X$ at time $t$ is defined as follows:

$$\bar{H}_X(t) = -\int_0^{t} \frac{f(x)}{F(t)} \log \frac{f(x)}{F(t)}\,dx. \tag{3}$$

Given that the system $X$ has been found failed at time $t$, $\bar{H}_X(t)$ measures the uncertainty regarding its past lifetime. Now recall that the probability density function of $(X \mid t_1 < X < t_2)$ for all $0 < t_1 < t_2$ is given by $f(x)/(F(t_2) - F(t_1))$. Sunoj et al. [1] considered the notion of interval entropy of $X$ in the interval $(t_1, t_2)$ as the uncertainty contained in $(X \mid t_1 < X < t_2)$, which is denoted by:

$$IH_X(t_1, t_2) = -\int_{t_1}^{t_2} \frac{f(x)}{F(t_2) - F(t_1)} \log \frac{f(x)}{F(t_2) - F(t_1)}\,dx. \tag{4}$$

We can rewrite the interval entropy as:

$$IH_X(t_1, t_2) = 1 - \frac{1}{F(t_2) - F(t_1)} \int_{t_1}^{t_2} f(x) \log r(x)\,dx + \frac{1}{F(t_2) - F(t_1)} \Big[ \bar{F}(t_2) \log \bar{F}(t_2) - \bar{F}(t_1) \log \bar{F}(t_1) + \big(F(t_2) - F(t_1)\big) \log \big(F(t_2) - F(t_1)\big) \Big],$$

where $r(x) = f(x)/\bar{F}(x)$ is the hazard function of $X$. Note that interval entropy can be negative and it can also be $-\infty$ or $\infty$. Given that a system has survived up to time $t_1$ and has been found to be down at time $t_2$, $IH_X(t_1, t_2)$ measures the uncertainty about its lifetime between $t_1$ and $t_2$. Misagh and Yari [9] introduced a shift-dependent version of $IH_X(t_1, t_2)$. The entropy (4) has been used to characterize and order random lifetime distributions. See Misagh and Yari [8] and Sunoj et al. [1].
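As an illustration of the definition above, the following minimal sketch evaluates the interval entropy numerically in two ways: directly from the truncated density, and through the hazard-rate form. The exponential lifetime, its rate `lam`, the interval endpoints and the Simpson-rule helper are assumptions chosen for illustration only; they are not taken from the paper.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def interval_entropy(pdf, cdf, t1, t2):
    """Entropy of (X | t1 < X < t2), computed from the truncated density."""
    d = cdf(t2) - cdf(t1)
    def integrand(x):
        p = pdf(x) / d
        return -p * math.log(p) if p > 0 else 0.0
    return simpson(integrand, t1, t2)

def interval_entropy_hazard(pdf, cdf, t1, t2):
    """Same quantity written with the hazard function r(x) = f(x) / survival(x)."""
    d = cdf(t2) - cdf(t1)
    sf = lambda x: 1.0 - cdf(x)
    xlogx = lambda u: u * math.log(u) if u > 0 else 0.0
    log_r_term = simpson(lambda x: pdf(x) * math.log(pdf(x) / sf(x)), t1, t2)
    return 1.0 + math.log(d) - (log_r_term + xlogx(sf(t1)) - xlogx(sf(t2))) / d

# Hypothetical example: exponential lifetime with rate lam (an assumption).
lam = 0.5
pdf = lambda x: lam * math.exp(-lam * x)
cdf = lambda x: 1.0 - math.exp(-lam * x)

ih_def = interval_entropy(pdf, cdf, 1.0, 3.0)
ih_haz = interval_entropy_hazard(pdf, cdf, 1.0, 3.0)
print(ih_def, ih_haz)  # the two evaluations should agree up to numerical error
```

For a uniform lifetime on (0, 1) the truncated distribution is uniform on the interval, so the same function returns the logarithm of the interval length, which is negative for short intervals.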
The general characterization problem is to determine when the interval entropy uniquely determines the distribution function. The following proposition addresses this problem. We first give the definition of general failure rate (GFR) functions, taken from Navarro and Ruiz [11].
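Definition 2.1. The GFRs of a random variable $X$ having density function $f(x)$ and cumulative distribution function $F(x)$ are given by $h_1(t_1, t_2) = \dfrac{f(t_1)}{F(t_2) - F(t_1)}$ and $h_2(t_1, t_2) = \dfrac{f(t_2)}{F(t_2) - F(t_1)}$.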
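Proof. By differentiating $IH_X(t_1, t_2)$ with respect to $t_1$ and $t_2$, we have:

$$\frac{\partial IH_X(t_1, t_2)}{\partial t_1} = h_1(t_1, t_2) \big[ IH_X(t_1, t_2) - 1 + \log h_1(t_1, t_2) \big]$$

and:

$$\frac{\partial IH_X(t_1, t_2)}{\partial t_2} = -h_2(t_1, t_2) \big[ IH_X(t_1, t_2) - 1 + \log h_2(t_1, t_2) \big].$$

Thus, for fixed $t_1$ and arbitrary $t_2$, $h_1(t_1, t_2)$ is a positive solution of the following equation:

$$g(x_{t_2}) = x_{t_2} \big[ IH_X(t_1, t_2) - 1 + \log x_{t_2} \big] - \frac{\partial IH_X(t_1, t_2)}{\partial t_1} = 0. \tag{5}$$

Similarly, for fixed $t_2$ and arbitrary $t_1$, we have $h_2(t_1, t_2)$ as a positive solution of the following equation:

$$\gamma(y_{t_1}) = y_{t_1} \big[ IH_X(t_1, t_2) - 1 + \log y_{t_1} \big] + \frac{\partial IH_X(t_1, t_2)}{\partial t_2} = 0. \tag{6}$$

By differentiating $g$ and $\gamma$ with respect to $x_{t_2}$ and $y_{t_1}$, we get $\partial g / \partial x_{t_2} = \log x_{t_2} + IH_X(t_1, t_2)$ and $\partial \gamma / \partial y_{t_1} = \log y_{t_1} + IH_X(t_1, t_2)$, respectively. Then the functions $g$ and $\gamma$ are minimized at the points $x_{t_2} = e^{-IH_X(t_1, t_2)}$ and $y_{t_1} = e^{-IH_X(t_1, t_2)}$. So, both functions $g$ and $\gamma$ first decrease and then increase with respect to $x_{t_2}$ and $y_{t_1}$, respectively. Since $g(0^{+}) = -\partial IH_X(t_1, t_2)/\partial t_1$, $\gamma(0^{+}) = \partial IH_X(t_1, t_2)/\partial t_2$ and both functions tend to $\infty$ as their arguments grow, it follows that, whenever $IH_X(t_1, t_2)$ is increasing in $t_1$ and decreasing in $t_2$, Equations (5) and (6) have unique positive roots $h_1(t_1, t_2)$ and $h_2(t_1, t_2)$, respectively. Now, $IH_X(t_1, t_2)$ uniquely determines the GFRs and, by virtue of Remark 2.1, the distribution function.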
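The proof rests on the fact that differentiating the definition of interval entropy in $t_1$ and $t_2$ gives $\partial IH/\partial t_1 = h_1 [IH - 1 + \log h_1]$ and $\partial IH/\partial t_2 = -h_2 [IH - 1 + \log h_2]$, where $h_1 = f(t_1)/(F(t_2) - F(t_1))$ and $h_2 = f(t_2)/(F(t_2) - F(t_1))$. The sketch below is only a finite-difference sanity check of these two identities; the exponential lifetime, the rate `LAM`, the endpoints and the step `eps` are illustrative assumptions.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

LAM = 1.5  # hypothetical exponential rate (assumption)

def pdf(x):
    return LAM * math.exp(-LAM * x)

def cdf(x):
    return 1.0 - math.exp(-LAM * x)

def interval_entropy(t1, t2):
    """Entropy of (X | t1 < X < t2) from the truncated density."""
    d = cdf(t2) - cdf(t1)
    return simpson(lambda x: -(pdf(x) / d) * math.log(pdf(x) / d), t1, t2)

def gfr(t1, t2):
    """General failure rates h1 and h2 on the interval (t1, t2)."""
    d = cdf(t2) - cdf(t1)
    return pdf(t1) / d, pdf(t2) / d

t1, t2, eps = 0.4, 1.7, 1e-4
ih = interval_entropy(t1, t2)
h1, h2 = gfr(t1, t2)

# Central finite differences of the interval entropy in each endpoint.
d_t1 = (interval_entropy(t1 + eps, t2) - interval_entropy(t1 - eps, t2)) / (2 * eps)
d_t2 = (interval_entropy(t1, t2 + eps) - interval_entropy(t1, t2 - eps)) / (2 * eps)

# Right-hand sides of the two identities.
rhs_t1 = h1 * (ih - 1.0 + math.log(h1))
rhs_t2 = -h2 * (ih - 1.0 + math.log(h2))
print(d_t1, rhs_t1, d_t2, rhs_t2)
```

Agreement of the two sides for one distribution does not prove the identities; it only guards against sign or index slips when they are used to recover the GFRs from the interval entropy.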
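The effect of monotone transformations on the residual and past entropy has been discussed in Ebrahimi and Kirmani [6] and Di Crescenzo and Longobardi [5], respectively. The following proposition gives a similar result for interval entropy.

Proposition 2.2. Let $X$ be a non-negative random variable with cumulative distribution function $F$ and survival function $\bar{F}$; let $Y = \varphi(X)$, with $\varphi$ a strictly increasing and differentiable function. Then:

$$IH_Y(t_1, t_2) = IH_X\big(\varphi^{-1}(t_1), \varphi^{-1}(t_2)\big) + E\big[\log \varphi'(X) \mid \varphi^{-1}(t_1) < X < \varphi^{-1}(t_2)\big].$$

Proof. Recalling (1), the Shannon information of $X$ and $Y$ can be expressed as:

$$H(Y) = H(X) + E[\log \varphi'(X)]. \tag{7}$$

From Theorem 2 of Ebrahimi and Kirmani [6] and Proposition 2.4 of Di Crescenzo and Longobardi [5], we have:

$$H_Y(t) = H_X\big(\varphi^{-1}(t)\big) + E\big[\log \varphi'(X) \mid X > \varphi^{-1}(t)\big] \tag{8}$$

and:

$$\bar{H}_Y(t) = \bar{H}_X\big(\varphi^{-1}(t)\big) + E\big[\log \varphi'(X) \mid X < \varphi^{-1}(t)\big]. \tag{9}$$

Due to Equation 2.8 of Sunoj et al. [1], there holds:

$$H(Y) = G(t_1) \bar{H}_Y(t_1) + \big[G(t_2) - G(t_1)\big] IH_Y(t_1, t_2) + \bar{G}(t_2) H_Y(t_2) + H\big[G(t_1), G(t_2) - G(t_1), \bar{G}(t_2)\big], \tag{10}$$

where $G$ and $\bar{G}$ denote the distribution and survival functions of $Y$, respectively, and $H[p_1, p_2, p_3] = -\sum_i p_i \log p_i$ is the entropy of a trivalent discrete distribution. Substituting $H(Y)$, $H_Y(t_2)$ and $\bar{H}_Y(t_1)$ from (7), (8) and (9) into (10), and writing $a = \varphi^{-1}(t_1)$ and $b = \varphi^{-1}(t_2)$, we get:

$$\big[G(t_2) - G(t_1)\big] IH_Y(t_1, t_2) = H(X) + E[\log \varphi'(X)] - G(t_1) \big\{ \bar{H}_X(a) + E[\log \varphi'(X) \mid X < a] \big\} - \bar{G}(t_2) \big\{ H_X(b) + E[\log \varphi'(X) \mid X > b] \big\} - H\big[G(t_1), G(t_2) - G(t_1), \bar{G}(t_2)\big]. \tag{11}$$

Since $G(t) = F(\varphi^{-1}(t))$, applying the decomposition (10) to $X$ at the points $a$ and $b$ shows that the right-hand side of (11) is equal to $\big[F(b) - F(a)\big] \big\{ IH_X(a, b) + E[\log \varphi'(X) \mid a < X < b] \big\}$, and the proof is complete.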
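The effect of a strictly increasing transformation can be checked numerically. By a change of variables in the truncated density, the interval entropy of $Y = \varphi(X)$ on $(t_1, t_2)$ equals the interval entropy of $X$ on $(\varphi^{-1}(t_1), \varphi^{-1}(t_2))$ plus the conditional mean of $\log \varphi'(X)$ on that interval. The sketch below verifies this for one case; the unit-rate exponential lifetime, the map $\varphi(x) = x^2$ and the endpoints are assumptions made for illustration.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def interval_entropy(pdf, cdf, t1, t2):
    """Entropy of a lifetime truncated to (t1, t2)."""
    d = cdf(t2) - cdf(t1)
    return simpson(lambda x: -(pdf(x) / d) * math.log(pdf(x) / d), t1, t2)

# Hypothetical setting: X ~ exponential(1) and phi(x) = x**2 on x > 0.
pdf_x = lambda x: math.exp(-x)
cdf_x = lambda x: 1.0 - math.exp(-x)
phi_inv = math.sqrt            # inverse of phi
dphi = lambda x: 2.0 * x       # derivative of phi

# Density and distribution function of Y = phi(X) by change of variables.
pdf_y = lambda y: pdf_x(phi_inv(y)) / dphi(phi_inv(y))
cdf_y = lambda y: cdf_x(phi_inv(y))

t1, t2 = 1.0, 4.0
a, b = phi_inv(t1), phi_inv(t2)

lhs = interval_entropy(pdf_y, cdf_y, t1, t2)
d = cdf_x(b) - cdf_x(a)
cond_mean_log_dphi = simpson(lambda x: pdf_x(x) * math.log(dphi(x)), a, b) / d
rhs = interval_entropy(pdf_x, cdf_x, a, b) + cond_mean_log_dphi
print(lhs, rhs)
```

For a pure scale change $\varphi(x) = cx$ the correction term reduces to $\log c$, which mirrors the behaviour of Shannon's differential entropy.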

Informative Distance
In this section, we review some basic definitions and facts for measures of discrimination between two residual and past lifetime distributions.We introduce a measure of discrepancy between two random variables at an interval of time.
Let $X$ and $Y$ be two non-negative random variables describing times to failure of two systems. We denote the probability density functions of $X$ and $Y$ as $f(x)$ and $g(y)$, the failure distributions as $F(x) = P(X \le x)$ and $G(y) = P(Y \le y)$, and the survival functions as $\bar{F}(x) = P(X > x)$ and $\bar{G}(y) = P(Y > y)$, respectively, with $\bar{F}(0) = \bar{G}(0) = 1$. The Kullback-Leibler [12] informative distance between $F$ and $G$ is defined by:

$$I_{X,Y} = \int_0^{\infty} f(x) \log \frac{f(x)}{g(x)}\,dx, \tag{12}$$

where log denotes the natural logarithm. $I_{X,Y}$ is known as relative entropy and it is shift and scale invariant. However, it is not a metric, since symmetry and the triangle inequality do not hold. We point out the Jensen-Shannon divergence (JSD), which is based on the Kullback-Leibler divergence, with the notable differences that it is always finite and its square root is a metric. See Nielsen [13] and Amari et al. [14]. The application of $I_{X,Y}$ as an informative distance in residual and past lifetimes has been increasingly studied in recent years. In particular, Ebrahimi and Kirmani [6] considered the residual Kullback-Leibler discrimination information of non-negative lifetimes of the systems $X$ and $Y$ at time $t$ as:

$$I_{X,Y}(t) = \int_t^{\infty} \frac{f(x)}{\bar{F}(t)} \log \frac{f(x)/\bar{F}(t)}{g(x)/\bar{G}(t)}\,dx. \tag{13}$$

Given that both systems have survived up to time $t$, $I_{X,Y}(t)$ identifies with the relative entropy of the remaining lifetimes $(X \mid X > t)$ and $(Y \mid Y > t)$. Furthermore, the Kullback-Leibler distance for two past lifetimes was studied in Di Crescenzo and Longobardi [7]; it is dual to (13) in the sense that it is an informative distance between the past lifetimes $(X \mid X < t)$ and $(Y \mid Y < t)$. Formally, the past Kullback-Leibler distance of non-negative random lifetimes of the systems $X$ and $Y$ at time $t$ is defined as:

$$\bar{I}_{X,Y}(t) = \int_0^{t} \frac{f(x)}{F(t)} \log \frac{f(x)/F(t)}{g(x)/G(t)}\,dx. \tag{14}$$

Given that at time $t$ both systems have been found to be down, $\bar{I}_{X,Y}(t)$ measures the informative distance between their past lives.
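To make the three distances concrete, the following sketch computes the Kullback-Leibler distance and its residual and past versions by numerical integration. It assumes two exponential lifetimes with hypothetical rates `LAM` and `MU`, and it replaces the infinite upper limit by a finite cut-off `UPPER`; all of these are illustrative choices rather than part of the paper. For two exponentials the full distance has the known closed form $\log(\lambda/\mu) + \mu/\lambda - 1$, and by the memoryless property the residual distance takes the same value at every $t$.

```python
import math

def simpson(func, a, b, n=8000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

UPPER = 40.0        # finite stand-in for infinity (assumption)
LAM, MU = 2.0, 1.0  # hypothetical exponential rates (assumption)

f = lambda x: LAM * math.exp(-LAM * x)
g = lambda x: MU * math.exp(-MU * x)
F = lambda x: 1.0 - math.exp(-LAM * x)
G = lambda x: 1.0 - math.exp(-MU * x)

def kl(p_pdf, q_pdf, lo, hi, norm_p=1.0, norm_q=1.0):
    """KL distance between p_pdf/norm_p and q_pdf/norm_q restricted to (lo, hi)."""
    def integrand(x):
        p, q = p_pdf(x) / norm_p, q_pdf(x) / norm_q
        return p * math.log(p / q)
    return simpson(integrand, lo, hi)

def residual_kl(t):
    """Distance between (X | X > t) and (Y | Y > t)."""
    return kl(f, g, t, UPPER, 1.0 - F(t), 1.0 - G(t))

def past_kl(t):
    """Distance between (X | X < t) and (Y | Y < t)."""
    return kl(f, g, 0.0, t, F(t), G(t))

full_xy = kl(f, g, 0.0, UPPER)
full_yx = kl(g, f, 0.0, UPPER)   # arguments swapped: the distance is not symmetric
closed_form = math.log(LAM / MU) + MU / LAM - 1.0
print(full_xy, full_yx, residual_kl(1.0), past_kl(1.0))
```

The swapped call shows why the relative entropy is not a metric: the two directions give different values even in this simple case.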
Example 3.1. Suppose $X$ and $Y$ are random lifetimes of two systems with joint density function: $0 < y < 4$, respectively. Because $X$ and $Y$ belong to different domains, using relative entropy to measure the informative distance between $X$ and $Y$ is not interpretable. The interval distances between $X$ and $Y$ in the intervals $(0, 1.5)$ and $(1.5, 2)$ are 0.01 and 0.16, respectively. Hence, the informative distance between $X$ and $Y$ in the interval $(1.5, 2)$ is greater than that in the interval $(0, 1.5)$.
In the following proposition we decompose the Kullback-Leibler discrimination function in terms of the residual, past and interval discrepancy measures. The proof is straightforward.

Proposition 3.1. Let $X$ and $Y$ be two non-negative random lifetimes of two systems. For all $0 \le t_1 < t_2 < \infty$, the Kullback-Leibler discrimination measure is decomposed as follows:

$$I_{X,Y} = F(t_1)\,\bar{I}_{X,Y}(t_1) + \big[F(t_2) - F(t_1)\big]\,ID_{X,Y}(t_1, t_2) + \bar{F}(t_2)\,I_{X,Y}(t_2) + I_{t_1, t_2},$$

where $ID_{X,Y}(t_1, t_2)$ denotes the interval distance, that is, the Kullback-Leibler distance between $(X \mid t_1 < X < t_2)$ and $(Y \mid t_1 < Y < t_2)$, and where:

$$I_{t_1, t_2} = F(t_1) \log \frac{F(t_1)}{G(t_1)} + \big[F(t_2) - F(t_1)\big] \log \frac{F(t_2) - F(t_1)}{G(t_2) - G(t_1)} + \bar{F}(t_2) \log \frac{\bar{F}(t_2)}{\bar{G}(t_2)}$$

is the Kullback-Leibler distance between two trivalent discrete random variables. Proposition 3.1 admits the following interpretation: the Kullback-Leibler discrepancy measure between the random lifetimes of systems $X$ and $Y$ is composed of four parts: (i) the discrepancy between the past lives of the two systems at time $t_1$; (ii) the discrepancy between the residual lifetimes of $X$ and $Y$ that have both survived up to time $t_2$; (iii) the discrepancy between the lifetimes of both systems in the interval $(t_1, t_2)$; (iv) the discrepancy between two random variables which determine whether the systems have been found to be failing before $t_1$, between $t_1$ and $t_2$, or after $t_2$.
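The four-part decomposition can be verified numerically: weight the past distance at $t_1$, the interval distance on $(t_1, t_2)$ and the residual distance at $t_2$ by the probabilities that $X$ falls in each region, then add the distance between the two three-point distributions of region membership. The sketch below does this for two exponential lifetimes; the rates `LAM` and `MU`, the points `t1` and `t2`, and the finite cut-off `UPPER` used in place of infinity are assumptions for illustration.

```python
import math

def simpson(func, a, b, n=8000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

LAM, MU, UPPER = 2.0, 1.0, 40.0   # hypothetical rates and cut-off (assumptions)
f = lambda x: LAM * math.exp(-LAM * x)
g = lambda x: MU * math.exp(-MU * x)
F = lambda x: 1.0 - math.exp(-LAM * x)
G = lambda x: 1.0 - math.exp(-MU * x)

def truncated_kl(lo, hi):
    """KL distance between (X | lo < X < hi) and (Y | lo < Y < hi)."""
    df, dg = F(hi) - F(lo), G(hi) - G(lo)
    return simpson(lambda x: (f(x) / df) * math.log((f(x) / df) / (g(x) / dg)), lo, hi)

t1, t2 = 0.5, 1.5

# Probabilities of failing before t1, between t1 and t2, and after t2.
p = [F(t1), F(t2) - F(t1), 1.0 - F(t2)]
q = [G(t1), G(t2) - G(t1), 1.0 - G(t2)]
discrete = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

total = truncated_kl(0.0, UPPER)                 # full distance
parts = (p[0] * truncated_kl(0.0, t1)            # (i) past lives at t1
         + p[2] * truncated_kl(t2, UPPER)        # (ii) residual lives at t2
         + p[1] * truncated_kl(t1, t2)           # (iii) interval (t1, t2)
         + discrete)                             # (iv) region membership
print(total, parts)
```

The identity is the chain rule for relative entropy applied to the partition of the time axis at $t_1$ and $t_2$, so it holds for any pair of densities for which the terms are finite.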

Some Results on Interval Based Measures
In this section we study the properties of $ID_{X,Y}(t_1, t_2)$ and point out certain similarities with those of $I_{X,Y}(t)$ and $\bar{I}_{X,Y}(t)$. The following proposition gives lower and upper bounds for the interval distance. We first give the definition of likelihood ratio ordering. Several results regarding the ordering in Definition 4.1 were provided in Ebrahimi and Pellerey [2].
Proposition 4.1. Let $X$ and $Y$ be random variables with common support $(0, \infty)$. Then: is decreasing in $x > 0$, then the inequalities in (19) are reversed.
Proof. Because $f(x)/g(x)$ is increasing in $x > 0$, from (15), we have: and: is decreasing, the proof is similar. Furthermore, for all $t_1 < x < t_2$, decreasing $g(x)$ in $x > 0$ implies $g(t_2) < g(x) < g(t_1)$, then from (16) we get: and: so that (20) holds. When $g(x)$ is increasing the proof is similar. where $h(\cdot)$ denotes the Laplace transform of $\omega(\cdot)$ given by $h(\theta) = \int_0^{\infty} \omega(x) e^{-\theta x}\,dx$, $\theta > 0$; therefore, for $\lambda \ne \mu$ the interval distance between $X$ and $Y$ at the interval $(t_1, t_2)$ is the following: A similar expression is available in Maya and Sunoj [15] for past lifetime. Due to (22) and from the non-negativity of $ID_{X,Y}(t_1, t_2)$ we have: which is a direct result of Markov inequality for concave functions.
Proof. From (17) we have: where the first inequality comes from the non-negativity of the corresponding interval distance on $(t_1, t_2)$ and the second one follows from the increasing Proof. The proof is straightforward.
The following remarks clarify the invariance of interval discrimination measure under location and scale transformation.
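A location and scale change applied to both lifetimes leaves the interval distance unchanged, provided the interval endpoints are transformed in the same way. The sketch below checks this for one case. The exponential lifetimes, the rates `LAM` and `MU`, and the constants `A` (scale) and `B` (shift) are hypothetical values chosen for illustration.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def interval_distance(f, F, g, G, t1, t2):
    """KL distance between the two lifetimes truncated to (t1, t2)."""
    df, dg = F(t2) - F(t1), G(t2) - G(t1)
    return simpson(lambda x: (f(x) / df) * math.log((f(x) / df) / (g(x) / dg)), t1, t2)

LAM, MU = 2.0, 1.0   # hypothetical exponential rates (assumption)
f = lambda x: LAM * math.exp(-LAM * x)
g = lambda x: MU * math.exp(-MU * x)
F = lambda x: 1.0 - math.exp(-LAM * x)
G = lambda x: 1.0 - math.exp(-MU * x)

A, B = 3.0, 0.5      # hypothetical scale and shift (assumption)

def shift_scale(pdf, cdf):
    """Density and distribution function of A*Z + B when Z has (pdf, cdf)."""
    return (lambda y: pdf((y - B) / A) / A, lambda y: cdf((y - B) / A))

fa, Fa = shift_scale(f, F)
ga, Ga = shift_scale(g, G)

t1, t2 = 0.5, 1.5
id_original = interval_distance(f, F, g, G, t1, t2)
id_transformed = interval_distance(fa, Fa, ga, Ga, A * t1 + B, A * t2 + B)
print(id_original, id_transformed)
```

The Jacobian factor $1/A$ appears in both truncated densities and cancels inside the logarithm, which is why the two values coincide.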

Conclusions
In this paper, we presented two novel measures of information which are based on a time interval and are more general than the well-known Shannon differential entropy and Kullback-Leibler divergence measure. These new measures are consistent in that they are valid for both past and residual lifetimes. We call these measures of information the interval entropy and the informative distance. We obtained conditions under which interval entropy uniquely determines the distribution function. We presented several propositions and remarks, some of which parallel those for Shannon entropy and Kullback-Leibler divergence and others that are more general. The advantages of interval entropy and informative distance were outlined as well. We believe that interval-based measures will have many applications in reliability, stochastic processes and other areas in the near future. The results presented here are by no means comprehensive but hopefully will pave the way for studying entropy in a different and more general setting.

Definition 4.1.
$X$ is said to be larger than $Y$ in likelihood ratio ($X \ge_{LR} Y$) if $f(x)/g(x)$ is increasing in $x$ over the union of the supports of $X$ and $Y$.

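Remark 4.1.
Consider two non-negative random variables $X$ and $Y$ corresponding to weighted exponential distributions with positive rates $\lambda$ and $\mu$, respectively, and with common positive real weight function $\omega(\cdot)$. The densities of $X$ and $Y$ are $f(x) = \omega(x) e^{-\lambda x}/h(\lambda)$ and $g(x) = \omega(x) e^{-\mu x}/h(\mu)$, $x > 0$, respectively,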

Example 4.1.
For $\omega(x) = x^{n-1}$ and $h(\theta) = (n-1)!/\theta^{n}$, the distributions of the random variables in Remark 4.1 are called Erlang distributions with scale parameters $\lambda$ and $\mu$ and with common shape parameter $n$. The conditional mean of $(X \mid t_1 < X < t_2)$ is the following:
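For weighted exponential densities of the form $\omega(x) e^{-\theta x}/h(\theta)$, substituting into the definition of the interval distance gives $\log\frac{h(\mu)}{h(\lambda)} + \log\frac{G(t_2) - G(t_1)}{F(t_2) - F(t_1)} - (\lambda - \mu)\,E[X \mid t_1 < X < t_2]$, so the distance depends on the data only through the conditional mean. The sketch below checks this expression against direct numerical integration in the Erlang case. The expression is derived here from the definitions rather than transcribed from the paper, and the shape `N`, the rates `LAM` and `MU`, and the endpoints are illustrative assumptions.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

N = 2                # hypothetical common shape parameter (assumption)
LAM, MU = 2.0, 1.0   # hypothetical rates (assumption)

def h(theta):
    """Laplace transform of the weight omega(x) = x**(N - 1)."""
    return math.factorial(N - 1) / theta ** N

def density(rate):
    """Weighted exponential (Erlang) density with the given rate."""
    return lambda x: x ** (N - 1) * math.exp(-rate * x) / h(rate)

f, g = density(LAM), density(MU)
t1, t2 = 0.5, 1.5

df = simpson(f, t1, t2)                      # F(t2) - F(t1)
dg = simpson(g, t1, t2)                      # G(t2) - G(t1)
cond_mean = simpson(lambda x: x * f(x), t1, t2) / df

# Interval distance by direct integration of the truncated densities.
id_numeric = simpson(lambda x: (f(x) / df) * math.log((f(x) / df) / (g(x) / dg)), t1, t2)

# The same quantity through the conditional mean of (X | t1 < X < t2).
id_closed = math.log(h(MU) / h(LAM)) + math.log(dg / df) - (LAM - MU) * cond_mean
print(cond_mean, id_numeric, id_closed)
```

Because the weight function cancels in the ratio of the two densities, the same check works for any positive weight whose Laplace transform is available.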