Article

Interval Entropy and Informative Distance

by Fakhroddin Misagh 1,* and Gholamhossein Yari 2
1 Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, 14778-93855, Iran
2 School of Mathematics, Iran University of Science and Technology, Tehran, 16846-13114, Iran
* Author to whom correspondence should be addressed.
Entropy 2012, 14(3), 480-490; https://doi.org/10.3390/e14030480
Submission received: 20 December 2011 / Revised: 4 February 2012 / Accepted: 7 February 2012 / Published: 2 March 2012
(This article belongs to the Special Issue Concepts of Entropy and Their Applications)

Abstract: The Shannon interval entropy function has been proposed in the reliability literature as a useful dynamic measure of uncertainty for two-sided truncated random variables. In this paper, we show that the interval entropy can uniquely determine the distribution function. Furthermore, we propose a measure of discrepancy between two lifetime distributions on an interval of time, based on Kullback-Leibler discrimination information. We study various properties of this measure, including its connection with residual and past measures of discrepancy and with the interval entropy, and we obtain its upper and lower bounds.
AMS Classification:
62N05; 62B10

1. Introduction

Recently, information theory has attracted the attention of statisticians. Sunoj et al. [1] explored the use of information measures for doubly truncated random variables, which play a significant role in studying the various aspects of a system when it fails between two time points. In reliability theory and survival analysis, the residual entropy was considered by Ebrahimi and Pellerey [2]; it measures the expected uncertainty contained in the remaining lifetime of a system. The residual entropy has been used to measure the wear and tear of components and to characterize, classify and order lifetime distributions by Belzunce et al. [3] and Ebrahimi [4]. The notion of past entropy, which can be viewed as the entropy of the inactivity time of a system, was introduced by Di Crescenzo and Longobardi [5].
Ebrahimi and Kirmani [6] introduced the residual discrimination measure and studied the minimum discrimination principle. Di Crescenzo and Longobardi [7] considered the past discrepancy measure and presented a characterization of the proportional reversed hazards model. Furthermore, the use of information measures for doubly truncated random variables was explored by Misagh and Yari [8,9]. In this paper, continuing their work, we propose a new measure of discrepancy between two doubly truncated life distributions. The remainder of this paper is organized as follows: in Section 2, some results, including the uniqueness of the interval entropy and its invariance property, are presented. Section 3 is devoted to definitions of dynamic measures of discrimination for residual and past lifetimes, and the notion of an interval discrimination measure is introduced. In Section 4 we present some results and properties of the interval entropy and the interval discrimination measure. Some conclusions are given in Section 5. Throughout this paper we consider absolutely continuous random variables.

2. Interval Entropy

Let $X$ be a non-negative random variable describing a system failure time. We denote the probability density function of $X$ by $f(x)$, the failure distribution by $F(x)=P(X\le x)$ and the survival function by $\bar F(x)=P(X>x)$. The Shannon [10] information measure of uncertainty is defined as:
$$H(X) = -E(\log f(X)) = -\int_0^\infty f(x)\log f(x)\,dx \qquad (1)$$
where log denotes the natural logarithm. Ebrahimi and Pellerey [2] considered the residual entropy of the non-negative random variable X at time t as:
$$H_X(t) = -\int_t^\infty \frac{f(x)}{\bar F(t)}\log\frac{f(x)}{\bar F(t)}\,dx \qquad (2)$$
Given that a system has survived up to time $t$, $H_X(t)$ essentially measures the uncertainty represented by its remaining lifetime. The residual entropy has been used to measure the wear and tear of systems and to characterize, classify and order lifetime distributions; see Belzunce et al. [3], Ebrahimi [4] and Ebrahimi and Kirmani [6]. Di Crescenzo and Longobardi [5] introduced the notion of past entropy and motivated its use in real-life situations. They also discussed its relationship with the residual entropy. Formally, the past entropy of $X$ at time $t$ is defined as follows:
$$\bar H_X(t) = -\int_0^t \frac{f(x)}{F(t)}\log\frac{f(x)}{F(t)}\,dx \qquad (3)$$
Given that the system $X$ has failed at time $t$, $\bar H_X(t)$ measures the uncertainty regarding its past lifetime. Now recall that the probability density function of $(X\mid t_1<X<t_2)$, for all $0<t_1<t_2$, is given by $f(x)/[F(t_2)-F(t_1)]$. Sunoj et al. [1] considered the interval entropy of $X$ on the interval $(t_1,t_2)$ as the uncertainty contained in $(X\mid t_1<X<t_2)$, which is denoted by:
$$IH_X(t_1,t_2) = -\int_{t_1}^{t_2} \frac{f(x)}{F(t_2)-F(t_1)}\log\frac{f(x)}{F(t_2)-F(t_1)}\,dx \qquad (4)$$
We can rewrite the interval entropy as:
$$IH_X(t_1,t_2) = 1 - \frac{1}{F(t_2)-F(t_1)}\int_{t_1}^{t_2} f(x)\log r(x)\,dx + \frac{1}{F(t_2)-F(t_1)}\Big\{\bar F(t_2)\log\bar F(t_2) - \bar F(t_1)\log\bar F(t_1) + [F(t_2)-F(t_1)]\log[F(t_2)-F(t_1)]\Big\}$$
where $r(x)=f(x)/\bar F(x)$ is the hazard function of $X$. Note that the interval entropy can be negative, and it can also be $-\infty$ or $+\infty$. Given that a system has survived up to time $t_1$ and has been found to be down at time $t_2$, $IH_X(t_1,t_2)$ measures the uncertainty about its lifetime between $t_1$ and $t_2$. Misagh and Yari [9] introduced a shift-dependent version of $IH_X(t_1,t_2)$. The entropy (4) has been used to characterize and order random lifetime distributions; see Misagh and Yari [8] and Sunoj et al. [1].
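As a concrete illustration (not part of the original paper), the interval entropy (4) can be evaluated by numerical quadrature; in the sketch below the unit-rate exponential lifetime and the function name interval_entropy are assumptions made for the example.

```python
# A minimal numerical sketch (assumed setup: exponential(1) lifetime):
# interval entropy IH_X(t1, t2) of Equation (4), evaluated by quadrature.
import numpy as np
from scipy import integrate
from scipy.stats import expon

def interval_entropy(pdf, cdf, t1, t2):
    """IH(t1, t2) = -int_{t1}^{t2} [f/(F(t2)-F(t1))] * log[f/(F(t2)-F(t1))] dx."""
    mass = cdf(t2) - cdf(t1)                      # F(t2) - F(t1)
    def integrand(x):
        p = pdf(x) / mass                         # doubly truncated density
        return -p * np.log(p) if p > 0 else 0.0
    value, _ = integrate.quad(integrand, t1, t2)
    return value

if __name__ == "__main__":
    X = expon(scale=1.0)                          # assumed exponential(1) lifetime
    print(interval_entropy(X.pdf, X.cdf, 0.5, 2.0))
```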
The general characterization problem is to determine when the interval entropy uniquely determines the distribution function. The following proposition addresses this problem. We first give the definition of the general failure rate (GFR) functions, taken from Navarro and Ruiz [11].
Definition 2.1. The GFRs of a random variable $X$ having density function $f(x)$ and cumulative distribution function $F(x)$ are given by $h_1^X(t_1,t_2) = \frac{f(t_1)}{F(t_2)-F(t_1)}$ and $h_2^X(t_1,t_2) = \frac{f(t_2)}{F(t_2)-F(t_1)}$.
Remark 2.1. The GFR functions determine the distribution function uniquely. See Navarro and Ruiz [11].
Proposition 2.1. Let $X$ be a non-negative random variable and assume that $IH_X(t_1,t_2)$ is increasing with respect to $t_1$ and decreasing with respect to $t_2$. Then $IH_X(t_1,t_2)$ uniquely determines $F(x)$.
Proof. By differentiating $IH_X(t_1,t_2)$ with respect to $t_1$ and $t_2$, we have:
$$\frac{\partial IH_X(t_1,t_2)}{\partial t_1} = h_1(t_1,t_2)\big[IH_X(t_1,t_2) - 1 + \log h_1(t_1,t_2)\big]$$
and:
$$\frac{\partial IH_X(t_1,t_2)}{\partial t_2} = -h_2(t_1,t_2)\big[IH_X(t_1,t_2) - 1 + \log h_2(t_1,t_2)\big]$$
Thus, for fixed $t_1$ and arbitrary $t_2$, $h_1(t_1,t_2)$ is a positive solution of the following equation:
$$g(x_{t_2}) = x_{t_2}\big[IH_X(t_1,t_2) - 1 + \log x_{t_2}\big] - \frac{\partial IH_X(t_1,t_2)}{\partial t_1} = 0 \qquad (5)$$
Similarly, for fixed $t_2$ and arbitrary $t_1$, we have $h_2(t_1,t_2)$ as a positive solution of the following equation:
$$\gamma(y_{t_1}) = y_{t_1}\big[IH_X(t_1,t_2) - 1 + \log y_{t_1}\big] + \frac{\partial IH_X(t_1,t_2)}{\partial t_2} = 0 \qquad (6)$$
By differentiating $g$ and $\gamma$ with respect to $x_{t_2}$ and $y_{t_1}$, we get $\frac{\partial g(x_{t_2})}{\partial x_{t_2}} = \log x_{t_2} + IH_X(t_1,t_2)$ and $\frac{\partial \gamma(y_{t_1})}{\partial y_{t_1}} = \log y_{t_1} + IH_X(t_1,t_2)$. Furthermore, the second-order derivatives of $g$ and $\gamma$ with respect to $x_{t_2}$ and $y_{t_1}$ are $\frac{1}{x_{t_2}}>0$ and $\frac{1}{y_{t_1}}>0$, respectively. Hence the functions $g$ and $\gamma$ are minimized at the points $x_{t_2} = e^{-IH_X(t_1,t_2)}$ and $y_{t_1} = e^{-IH_X(t_1,t_2)}$, respectively. In addition, $g(0) = -\frac{\partial IH_X(t_1,t_2)}{\partial t_1} < 0$, $g(\infty)=\infty$ and $\gamma(0) = \frac{\partial IH_X(t_1,t_2)}{\partial t_2} < 0$, $\gamma(\infty)=\infty$. So both $g$ and $\gamma$ first decrease and then increase with respect to $x_{t_2}$ and $y_{t_1}$, respectively, which implies that Equations (5) and (6) have unique positive roots $h_1(t_1,t_2)$ and $h_2(t_1,t_2)$, respectively. Therefore $IH_X(t_1,t_2)$ uniquely determines the GFRs and, by virtue of Remark 2.1, the distribution function.
The effect of monotone transformations on the residual and past entropy has been discussed by Ebrahimi and Kirmani [6] and by Di Crescenzo and Longobardi [5], respectively. The following proposition gives a similar result for the interval entropy.
Proposition 2.2. Let $X$ be a non-negative random variable with cumulative distribution function $F$ and survival function $\bar F$, and let $Y = \varphi(X)$, where $\varphi$ is a strictly increasing and differentiable function. Then for all $0<t_1<t_2<\infty$:
$$IH_Y(t_1,t_2) = IH_X(\varphi^{-1}(t_1),\varphi^{-1}(t_2)) + \frac{1}{F(\varphi^{-1}(t_2)) - F(\varphi^{-1}(t_1))}\Big\{E\big(\log\varphi'(X)\big) - F(\varphi^{-1}(t_1))\,E\big(\log\varphi'(X)\mid X<\varphi^{-1}(t_1)\big) - \bar F(\varphi^{-1}(t_2))\,E\big(\log\varphi'(X)\mid X>\varphi^{-1}(t_2)\big)\Big\}$$
Proof. Recalling (1), the Shannon entropies of $X$ and $Y$ are related by:
$$H(Y) = H(X) + E\big(\log\varphi'(X)\big) \qquad (7)$$
From Theorem 2 of Ebrahimi and Kirmani [6] and Proposition 2.4 of Di Crescenzo and Longobardi [5], we have:
$$H_Y(t_2) = H_X(\varphi^{-1}(t_2)) + E\big(\log\varphi'(X)\mid X>\varphi^{-1}(t_2)\big) \qquad (8)$$
and:
$$\bar H_Y(t_1) = \bar H_X(\varphi^{-1}(t_1)) + E\big(\log\varphi'(X)\mid X<\varphi^{-1}(t_1)\big) \qquad (9)$$
Due to Equation 2.8 of Sunoj et al. [1], the following decomposition holds:
$$H(Y) = H\big(G(t_1),\,\bar G(t_2),\,1-G(t_1)-\bar G(t_2)\big) + G(t_1)\,\bar H_Y(t_1) + \bar G(t_2)\,H_Y(t_2) + \big[1-G(t_1)-\bar G(t_2)\big]\,IH_Y(t_1,t_2) \qquad (10)$$
where $G$ and $\bar G$ denote the distribution and survival functions of $Y$, respectively. Substituting (7), (8) and (9) for $H(Y)$, $H_Y(t_2)$ and $\bar H_Y(t_1)$ in (10), we get:
$$H(X) + E\big(\log\varphi'(X)\big) = \big[F(\varphi^{-1}(t_2)) - F(\varphi^{-1}(t_1))\big]\,IH_Y(t_1,t_2) + F(\varphi^{-1}(t_1))\,E\big(\log\varphi'(X)\mid X<\varphi^{-1}(t_1)\big) + \bar F(\varphi^{-1}(t_2))\,E\big(\log\varphi'(X)\mid X>\varphi^{-1}(t_2)\big) + F(\varphi^{-1}(t_1))\,\bar H_X(\varphi^{-1}(t_1)) + \bar F(\varphi^{-1}(t_2))\,H_X(\varphi^{-1}(t_2)) + H\big(F(\varphi^{-1}(t_1)),\,\bar F(\varphi^{-1}(t_2)),\,F(\varphi^{-1}(t_2))-F(\varphi^{-1}(t_1))\big) \qquad (11)$$
Applying the decomposition (10) to $X$ on the interval $(\varphi^{-1}(t_1),\varphi^{-1}(t_2))$, the last three terms on the right hand side of (11) are equal to:
$$H(X) - \big[F(\varphi^{-1}(t_2)) - F(\varphi^{-1}(t_1))\big]\,IH_X(\varphi^{-1}(t_1),\varphi^{-1}(t_2))$$
Substituting this into (11) and solving for $IH_Y(t_1,t_2)$ completes the proof.
Remark 2.2. Suppose $\varphi(X) = F(X)$. Then $\varphi$ satisfies the assumptions of Proposition 2.2, $F(X)$ is uniformly distributed over $(0,1)$, and $\varphi'(x) = f(x)$, so that:
$$IH_{F(X)}(t_1,t_2) = IH_X\big(F^{-1}(t_1),F^{-1}(t_2)\big) - \frac{1}{t_2-t_1}\Big\{H(X) + t_1\,E\big(\log f(X)\mid X<F^{-1}(t_1)\big) + (1-t_2)\,E\big(\log f(X)\mid X>F^{-1}(t_2)\big)\Big\}$$
Remark 2.3. For all $0<\theta<t_1$, we get $IH_{X+\theta}(t_1,t_2) = IH_X(t_1-\theta,\,t_2-\theta)$.
Remark 2.4. Let $Y = aX$ with $a>0$. Then $IH_{aX}(t_1,t_2) = IH_X(t_1/a,\,t_2/a) + \log a$.
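As an illustration (not from the paper), these two invariance properties can be checked numerically with the interval_entropy sketch given above; the exponential lifetime and the particular values of a, θ, t1 and t2 are assumptions made for the example.

```python
# Illustrative numerical check of Remarks 2.3 and 2.4 (assumed exponential(1) lifetime),
# reusing interval_entropy() from the sketch above.
import numpy as np
from scipy.stats import expon

lam, a, theta, t1, t2 = 1.0, 2.0, 0.3, 0.5, 2.0
X = expon(scale=1.0 / lam)
X_shift = expon(loc=theta, scale=1.0 / lam)      # distribution of X + theta
aX = expon(scale=a / lam)                        # distribution of aX

# Remark 2.3: IH_{X+theta}(t1, t2) = IH_X(t1 - theta, t2 - theta), valid since theta < t1
print(interval_entropy(X_shift.pdf, X_shift.cdf, t1, t2),
      interval_entropy(X.pdf, X.cdf, t1 - theta, t2 - theta))

# Remark 2.4: IH_{aX}(t1, t2) = IH_X(t1/a, t2/a) + log a
print(interval_entropy(aX.pdf, aX.cdf, t1, t2),
      interval_entropy(X.pdf, X.cdf, t1 / a, t2 / a) + np.log(a))
```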

3. Informative Distance

In this section, we review some basic definitions and facts for measures of discrimination between two residual and past lifetime distributions. We introduce a measure of discrepancy between two random variables at an interval of time.
Let $X$ and $Y$ be two non-negative random variables describing the times to failure of two systems. We denote the probability density functions of $X$ and $Y$ by $f(x)$ and $g(y)$, the failure distributions by $F(x)=P(X\le x)$ and $G(y)=P(Y\le y)$, and the survival functions by $\bar F(x)=P(X>x)$ and $\bar G(y)=P(Y>y)$, respectively, with $\bar F(0)=\bar G(0)=1$. The Kullback-Leibler [12] informative distance between $F$ and $G$ is defined by:
$$I_{X,Y} = \int_0^\infty f(x)\log\frac{f(x)}{g(x)}\,dx \qquad (12)$$
where $\log$ denotes the natural logarithm. $I_{X,Y}$ is known as the relative entropy and it is shift and scale invariant. However, it is not a metric, since it is not symmetric and does not satisfy the triangle inequality. We point out the Jensen-Shannon divergence (JSD), which is based on the Kullback-Leibler divergence, with the notable differences that it is always finite and its square root is a metric; see Nielsen [13] and Amari et al. [14]. The application of $I_{X,Y}$ as an informative distance for residual and past lifetimes has been studied increasingly in recent years. In particular, Ebrahimi and Kirmani [6] considered the residual Kullback-Leibler discrimination information between the non-negative lifetimes of the systems $X$ and $Y$ at time $t$ as:
$$I_{X,Y}(t) = \int_t^\infty \frac{f(x)}{\bar F(t)}\log\frac{f(x)/\bar F(t)}{g(x)/\bar G(t)}\,dx \qquad (13)$$
Given that both systems have survived up to time $t$, $I_{X,Y}(t)$ coincides with the relative entropy of the remaining lifetimes $(X\mid X>t)$ and $(Y\mid Y>t)$. Furthermore, the Kullback-Leibler distance for two past lifetimes was studied by Di Crescenzo and Longobardi [7]; it is dual to (13) in the sense that it is an informative distance between the past lifetimes $(X\mid X<t)$ and $(Y\mid Y<t)$. Formally, the past Kullback-Leibler distance between the non-negative random lifetimes of the systems $X$ and $Y$ at time $t$ is defined as:
$$\bar I_{X,Y}(t) = \int_0^t \frac{f(x)}{F(t)}\log\frac{f(x)/F(t)}{g(x)/G(t)}\,dx \qquad (14)$$
Given that at time $t$ both systems have been found to be down, $\bar I_{X,Y}(t)$ measures the informative distance between their past lives.
Along a similar line, we define a new discrepancy measure that completes studying informative distance between two random lifetimes.
Definition 3.1. The interval distance between the random lifetimes $X$ and $Y$ on the interval $(t_1,t_2)$ is the Kullback-Leibler discrimination measure between the truncated lifetimes $(X\mid t_1<X<t_2)$ and $(Y\mid t_1<Y<t_2)$:
$$ID_{X,Y}(t_1,t_2) = \int_{t_1}^{t_2} \frac{f(x)}{F(t_2)-F(t_1)}\log\frac{f(x)/[F(t_2)-F(t_1)]}{g(x)/[G(t_2)-G(t_1)]}\,dx \qquad (15)$$
Remark 3.1. Clearly $ID_{X,Y}(0,t) = \bar I_{X,Y}(t)$, $ID_{X,Y}(t,\infty) = I_{X,Y}(t)$ and $ID_{X,Y}(0,\infty) = I_{X,Y}$.
Given that both systems $X$ and $Y$ have survived up to time $t_1$ and have been found to be down at time $t_2$, $ID_{X,Y}(t_1,t_2)$ measures the discrepancy between their unknown failure times in the interval $(t_1,t_2)$. $ID_{X,Y}(t_1,t_2)$ satisfies all the properties of the Kullback-Leibler discrimination measure and can be rewritten as:
$$ID_{X,Y}(t_1,t_2) = -\int_{t_1}^{t_2} \frac{f(x)}{F(t_2)-F(t_1)}\log\frac{g(x)}{G(t_2)-G(t_1)}\,dx - IH_X(t_1,t_2) \qquad (16)$$
where $IH_X(t_1,t_2)$ is the interval entropy of $X$ given in (4).
An alternative way of writing (16) is the following:
$$ID_{X,Y}(t_1,t_2) = \log\frac{G(t_2)-G(t_1)}{F(t_2)-F(t_1)} + \frac{1}{F(t_2)-F(t_1)}\int_{t_1}^{t_2} f(x)\log\frac{f(x)}{g(x)}\,dx \qquad (17)$$
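For illustration (not part of the paper), the representation (17) lends itself to direct numerical evaluation; in the sketch below the two exponential lifetimes with rates 1 and 2 and the name interval_distance are assumptions made for the example.

```python
# A minimal numerical sketch of the interval distance ID_{X,Y}(t1, t2), Equation (17),
# under assumed exponential lifetimes with rates 1 and 2.
import numpy as np
from scipy import integrate
from scipy.stats import expon

def interval_distance(f_pdf, f_cdf, g_pdf, g_cdf, t1, t2):
    """ID = log[(G(t2)-G(t1))/(F(t2)-F(t1))] + (1/(F(t2)-F(t1))) * int_{t1}^{t2} f log(f/g) dx."""
    dF = f_cdf(t2) - f_cdf(t1)
    dG = g_cdf(t2) - g_cdf(t1)
    integral, _ = integrate.quad(lambda x: f_pdf(x) * np.log(f_pdf(x) / g_pdf(x)), t1, t2)
    return np.log(dG / dF) + integral / dF

if __name__ == "__main__":
    X, Y = expon(scale=1.0), expon(scale=0.5)     # assumed rates 1 and 2
    print(interval_distance(X.pdf, X.cdf, Y.pdf, Y.cdf, 0.5, 2.0))
```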
The following example clarifies the effectiveness of the interval discrimination measure.
Example 3.1. Suppose $X$ and $Y$ are the random lifetimes of two systems with joint density function:
$$f(x,y) = \frac{1}{4}, \qquad 0<x<2,\;\; 0<y<4-2x$$
so that the marginal densities of $X$ and $Y$ are $f(x) = \frac{1}{2}(2-x)$, $0<x<2$, and $g(y) = \frac{1}{8}(4-y)$, $0<y<4$, respectively. Because $X$ and $Y$ have different supports, using the relative entropy to measure the informative distance between $X$ and $Y$ is not interpretable. The interval distances between $X$ and $Y$ on the intervals $(0,1.5)$ and $(1.5,2)$ are 0.01 and 0.16, respectively. Hence, the informative distance between $X$ and $Y$ on the interval $(1.5,2)$ is greater than that on the interval $(0,1.5)$.
In the following proposition we decompose the Kullback-Leibler discrimination measure in terms of the residual, past and interval discrepancy measures. The proof is straightforward.
Proposition 3.1. Let $X$ and $Y$ be the non-negative random lifetimes of two systems. For all $0\le t_1<t_2<\infty$, the Kullback-Leibler discrimination measure is decomposed as follows:
$$I_{X,Y} = [F(t_2)-F(t_1)]\,ID_{X,Y}(t_1,t_2) + F(t_1)\,\bar I_{X,Y}(t_1) + \bar F(t_2)\,I_{X,Y}(t_2) + I_{U,V}(t_1,t_2) \qquad (18)$$
where:
$$I_{U,V}(t_1,t_2) = F(t_1)\log\frac{F(t_1)}{G(t_1)} + \bar F(t_2)\log\frac{\bar F(t_2)}{\bar G(t_2)} + [F(t_2)-F(t_1)]\log\frac{F(t_2)-F(t_1)}{G(t_2)-G(t_1)}$$
is the Kullback-Leibler distance between two trivalent discrete random variables.
Proposition 3.1 admits the following interpretation: the Kullback-Leibler discrepancy between the random lifetimes of the systems $X$ and $Y$ is composed of four parts: (i) the discrepancy between the past lives of the two systems at time $t_1$; (ii) the discrepancy between the residual lifetimes of $X$ and $Y$ given that both have survived up to time $t_2$; (iii) the discrepancy between the lifetimes of the two systems in the interval $(t_1,t_2)$; (iv) the discrepancy between two discrete random variables which indicate whether each system has failed before $t_1$, between $t_1$ and $t_2$, or after $t_2$.
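As an informal check (not from the paper), the decomposition (18) can be verified numerically; the sketch below reuses the interval_distance function given above and assumes two exponential lifetimes with rates 1 and 2, with a finite upper truncation standing in for infinity.

```python
# Illustrative numerical check of the decomposition (18), reusing interval_distance()
# from the sketch above (assumed: X ~ exponential(1), Y ~ exponential(2)).
import numpy as np
from scipy.stats import expon

X, Y = expon(scale=1.0), expon(scale=0.5)
F, G = X.cdf, Y.cdf
t1, t2 = 0.5, 2.0
UPPER = 40.0                                                       # numerically ~ infinity here

kl_full     = interval_distance(X.pdf, F, Y.pdf, G, 0.0, UPPER)    # I_{X,Y}
kl_past     = interval_distance(X.pdf, F, Y.pdf, G, 0.0, t1)       # past distance at t1
kl_residual = interval_distance(X.pdf, F, Y.pdf, G, t2, UPPER)     # residual distance at t2
kl_interval = interval_distance(X.pdf, F, Y.pdf, G, t1, t2)        # interval distance

p = np.array([F(t1), 1 - F(t2), F(t2) - F(t1)])                    # trivalent law for X
q = np.array([G(t1), 1 - G(t2), G(t2) - G(t1)])                    # trivalent law for Y
kl_discrete = float(np.sum(p * np.log(p / q)))                     # I_{U,V}(t1, t2)

rhs = ((F(t2) - F(t1)) * kl_interval + F(t1) * kl_past
       + (1 - F(t2)) * kl_residual + kl_discrete)
print(kl_full, rhs)                                                # the two should agree
```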

4. Some Results on Interval Based Measures

In this section we study the properties of $ID_{X,Y}(t_1,t_2)$ and point out certain similarities with those of $I_{X,Y}(t)$ and $\bar I_{X,Y}(t)$. The following proposition gives lower and upper bounds for the interval distance. We first give the definition of the likelihood ratio ordering.
Definition 4.1. $X$ is said to be larger than $Y$ in the likelihood ratio order ($X \ge_{LR} Y$) if $f(x)/g(x)$ is increasing in $x$ over the union of the supports of $X$ and $Y$.
Several results regarding the ordering in Definition 4.1 were provided by Ebrahimi and Pellerey [2].
Proposition 4.1. Let $X$ and $Y$ be random variables with common support $(0,\infty)$. Then:
(i) $X \ge_{LR} Y$ implies:
$$\log\frac{h_1^X(t_1,t_2)}{h_1^Y(t_1,t_2)} \le ID_{X,Y}(t_1,t_2) \le \log\frac{h_2^X(t_1,t_2)}{h_2^Y(t_1,t_2)} \qquad (19)$$
When $f(x)/g(x)$ is decreasing in $x>0$, the inequalities in (19) are reversed.
(ii) If $g(x)$ is decreasing in $x>0$, then:
$$\log\frac{1}{h_1^Y(t_1,t_2)} \le ID_{X,Y}(t_1,t_2) + IH_X(t_1,t_2) \le \log\frac{1}{h_2^Y(t_1,t_2)} \qquad (20)$$
If $g(x)$ is increasing, the inequalities in (20) are reversed.
Proof. Since $f(x)/g(x)$ is increasing in $x>0$, from (15) we have:
$$ID_{X,Y}(t_1,t_2) \le \int_{t_1}^{t_2} \frac{f(x)}{F(t_2)-F(t_1)}\log\frac{f(t_2)/[F(t_2)-F(t_1)]}{g(t_2)/[G(t_2)-G(t_1)]}\,dx = \log\frac{h_2^X(t_1,t_2)}{h_2^Y(t_1,t_2)}$$
and:
$$ID_{X,Y}(t_1,t_2) \ge \int_{t_1}^{t_2} \frac{f(x)}{F(t_2)-F(t_1)}\log\frac{f(t_1)/[F(t_2)-F(t_1)]}{g(t_1)/[G(t_2)-G(t_1)]}\,dx = \log\frac{h_1^X(t_1,t_2)}{h_1^Y(t_1,t_2)}$$
which gives (19). When $f(x)/g(x)$ is decreasing, the proof is similar. Furthermore, for all $t_1<x<t_2$, a decreasing $g(x)$ in $x>0$ implies $g(t_2)<g(x)<g(t_1)$; then from (16) we get:
$$ID_{X,Y}(t_1,t_2) \le -\log h_2^Y(t_1,t_2) - IH_X(t_1,t_2)$$
and:
$$ID_{X,Y}(t_1,t_2) \ge -\log h_1^Y(t_1,t_2) - IH_X(t_1,t_2)$$
so that (20) holds. When $g(x)$ is increasing, the proof is similar.
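As a numerical illustration (not from the paper), the bounds in (19) can be checked against the GFR functions of Definition 2.1; the sketch below reuses interval_distance and assumes X and Y exponential with rates 1 and 2, so that f(x)/g(x) is increasing and the hypothesis of part (i) holds.

```python
# Illustrative check of the bounds (19) (assumed: X ~ exponential(1), Y ~ exponential(2),
# so f(x)/g(x) = exp(x)/2 is increasing, i.e. X >=_LR Y), reusing interval_distance().
import numpy as np
from scipy.stats import expon

X, Y = expon(scale=1.0), expon(scale=0.5)
t1, t2 = 0.5, 2.0

def gfr(dist, t, t1, t2):
    """General failure rate of Definition 2.1: f(t) / [F(t2) - F(t1)]."""
    return dist.pdf(t) / (dist.cdf(t2) - dist.cdf(t1))

lower = np.log(gfr(X, t1, t1, t2) / gfr(Y, t1, t1, t2))   # log h1^X / h1^Y
upper = np.log(gfr(X, t2, t1, t2) / gfr(Y, t2, t1, t2))   # log h2^X / h2^Y
middle = interval_distance(X.pdf, X.cdf, Y.pdf, Y.cdf, t1, t2)
print(lower <= middle <= upper, (lower, middle, upper))
```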
Remark 4.1. Consider two non-negative random variables $X$ and $Y$ having weighted exponential distributions with positive rates $\lambda$ and $\mu$, respectively, and a common positive real weight function $\omega(\cdot)$. The densities of $X$ and $Y$ are $f(x) = \frac{\omega(x)e^{-\lambda x}}{h(\lambda)}$ and $g(x) = \frac{\omega(x)e^{-\mu x}}{h(\mu)}$, respectively, where $h(\cdot)$ denotes the Laplace transform of $\omega(\cdot)$, given by $h(\theta) = \int_0^\infty \omega(x)e^{-\theta x}\,dx$, $\theta>0$. Therefore, for $\lambda\ne\mu$, the interval distance between $X$ and $Y$ on the interval $(t_1,t_2)$ is the following:
$$ID_{X,Y}(t_1,t_2) = \log\frac{G(t_2)-G(t_1)}{F(t_2)-F(t_1)} + \log\frac{h(\mu)}{h(\lambda)} - (\lambda-\mu)\,E(X\mid t_1<X<t_2) \qquad (21)$$
Remark 4.2. Let $X$ be a non-negative random lifetime with density function $f(x)$ and cumulative distribution function $F(t)=P(X\le t)$. The density and cumulative distribution functions of the weighted random variable $X_\omega$ associated with a positive real function $\omega(\cdot)$ are $f_\omega(x) = \frac{\omega(x)}{E(\omega(X))}f(x)$ and $F_\omega(t) = \frac{E(\omega(X)\mid X\le t)}{E(\omega(X))}F(t)$, respectively, where $E(\omega(X)) = \int_0^\infty \omega(x)f(x)\,dx$. Then, from (17) we have:
$$ID_{X,X_\omega}(t_1,t_2) = \log E\big(\omega(X)\mid t_1<X<t_2\big) - E\big(\log\omega(X)\mid t_1<X<t_2\big) \qquad (22)$$
A similar expression is given by Maya and Sunoj [15] for the past lifetime. Due to (22) and the non-negativity of $ID_{X,X_\omega}(t_1,t_2)$, we have:
$$\log E\big(\omega(X)\mid t_1<X<t_2\big) \ge E\big(\log\omega(X)\mid t_1<X<t_2\big)$$
which is also a direct consequence of Jensen's inequality for concave functions.
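To make the identity concrete (an illustration, not part of the paper), the sketch below evaluates both sides of (22) for an assumed exponential X with weight ω(x) = x, in which case X_ω is Gamma(2, 1); it reuses interval_distance from above.

```python
# Illustrative check of (22): assumed X ~ exponential(1) and weight w(x) = x, so that
# E(w(X)) = 1 and X_w ~ Gamma(shape=2, scale=1). Reuses interval_distance().
import numpy as np
from scipy import integrate
from scipy.stats import expon, gamma

X, Xw = expon(scale=1.0), gamma(a=2, scale=1.0)
t1, t2 = 0.5, 2.0
dF = X.cdf(t2) - X.cdf(t1)

E_w    = integrate.quad(lambda x: x * X.pdf(x), t1, t2)[0] / dF           # E(w(X) | t1<X<t2)
E_logw = integrate.quad(lambda x: np.log(x) * X.pdf(x), t1, t2)[0] / dF   # E(log w(X) | t1<X<t2)

print(np.log(E_w) - E_logw,
      interval_distance(X.pdf, X.cdf, Xw.pdf, Xw.cdf, t1, t2))            # the two should agree
```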
Example 4.1. For $\omega(x) = x^{n-1}$ and $h(\theta) = (n-1)!/\theta^n$, the random variables of Remark 4.1 have Erlang distributions with rate parameters $\lambda$ and $\mu$ and a common shape parameter $n$. The conditional mean of $(X\mid t_1<X<t_2)$ is the following:
$$E(X\mid t_1<X<t_2) = \frac{1}{F(t_2)-F(t_1)}\int_{t_1}^{t_2} x\,\frac{x^{n-1}\lambda^n e^{-\lambda x}}{(n-1)!}\,dx = \frac{\gamma(n+1,\lambda t_2)-\gamma(n+1,\lambda t_1)}{\lambda\,(n-1)!\,[F(t_2)-F(t_1)]}$$
where $\gamma(\alpha,x) = \int_0^x e^{-u}u^{\alpha-1}\,du$ is the lower incomplete gamma function. From (21) we obtain:
$$ID_{X,Y}(t_1,t_2) = \log\frac{\gamma(n,\mu t_2)-\gamma(n,\mu t_1)}{\gamma(n,\lambda t_2)-\gamma(n,\lambda t_1)} + n\log\frac{\lambda}{\mu} - (\lambda-\mu)\,\frac{\gamma(n+1,\lambda t_2)-\gamma(n+1,\lambda t_1)}{\lambda\,(n-1)!\,[F(t_2)-F(t_1)]}$$
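As a numerical cross-check (not from the paper), the closed form above can be compared against direct integration of (17); the parameter values n = 3, λ = 2, μ = 1 in the sketch below are assumptions, and interval_distance is reused from the earlier sketch.

```python
# Illustrative cross-check of the Erlang closed form against Equation (17),
# reusing interval_distance() (assumed parameters: n = 3, lambda = 2, mu = 1).
import math
import numpy as np
from scipy.special import gammainc, gamma as gamma_fn
from scipy.stats import gamma as gamma_dist

n, lam, mu, t1, t2 = 3, 2.0, 1.0, 0.5, 2.0
X = gamma_dist(a=n, scale=1.0 / lam)              # Erlang(n, rate lambda)
Y = gamma_dist(a=n, scale=1.0 / mu)               # Erlang(n, rate mu)

def lig(a, x):
    """Lower incomplete gamma gamma(a, x); scipy's gammainc is the regularized version."""
    return gammainc(a, x) * gamma_fn(a)

dF = X.cdf(t2) - X.cdf(t1)
closed = (np.log((lig(n, mu * t2) - lig(n, mu * t1)) / (lig(n, lam * t2) - lig(n, lam * t1)))
          + n * np.log(lam / mu)
          - (lam - mu) * (lig(n + 1, lam * t2) - lig(n + 1, lam * t1))
            / (lam * math.factorial(n - 1) * dF))

print(closed, interval_distance(X.pdf, X.cdf, Y.pdf, Y.cdf, t1, t2))   # should agree
```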
In the following proposition, a sufficient condition for $ID_{X_1,Y}(t_1,t_2)$ to be smaller than $ID_{X_2,Y}(t_1,t_2)$ is provided.
Proposition 4.2. Consider three non-negative random variables $X_1$, $X_2$ and $Y$ with probability density functions $f_1$, $f_2$ and $g$, respectively. Then $X_1 \ge_{LR} Y$ implies $ID_{X_1,Y}(t_1,t_2) \le ID_{X_2,Y}(t_1,t_2)$.
Proof. From (17) we have:
$$ID_{X_1,Y}(t_1,t_2) - ID_{X_2,Y}(t_1,t_2) = -ID_{X_2,X_1}(t_1,t_2) + \int_{t_1}^{t_2}\left(\frac{f_1(x)}{F_1(t_2)-F_1(t_1)} - \frac{f_2(x)}{F_2(t_2)-F_2(t_1)}\right)\log\frac{f_1(x)}{g(x)}\,dx$$
$$\le \int_{t_1}^{t_2}\left(\frac{f_1(x)}{F_1(t_2)-F_1(t_1)} - \frac{f_2(x)}{F_2(t_2)-F_2(t_1)}\right)\log\frac{f_1(x)}{g(x)}\,dx \le \log\frac{f_1(t_2)}{g(t_2)}\int_{t_1}^{t_2}\left(\frac{f_1(x)}{F_1(t_2)-F_1(t_1)} - \frac{f_2(x)}{F_2(t_2)-F_2(t_1)}\right)dx = 0$$
where the first inequality comes from the fact that $ID_{X_2,X_1}(t_1,t_2) \ge 0$ and the second one follows from $f_1(x)/g(x)$ being increasing in $x>0$.
Example 4.2. Let $\{N(t),\,t\ge 0\}$ be a non-homogeneous Poisson process with a differentiable mean function $M(t) = E(N(t))$ such that $M(t)\to\infty$ as $t\to\infty$. Let $R_n$, $n = 1,2,3,\dots$, denote the time of occurrence of the $n$-th event of the process. Then $R_n$ has the following density function:
$$f_n(x) = \frac{\big(M(x)\big)^{n-1}}{(n-1)!}\,f_1(x), \qquad x>0$$
where:
$$f_1(x) = -\frac{d}{dx}\exp\big(-M(x)\big), \qquad x>0$$
Clearly $f_n(x)/f_1(x)$ is increasing in $x>0$. It follows from Proposition 4.2 that for all $m\le n$:
$$ID_{R_n,R_1}(t_1,t_2) \le ID_{R_m,R_1}(t_1,t_2)$$
Proposition 4.3. Let $X$ and $Y$ be random variables with common support $(0,\infty)$, and let $\varphi$ be a continuous and increasing function. Then:
$$ID_{\varphi(X),\varphi(Y)}(t_1,t_2) = ID_{X,Y}\big(\varphi^{-1}(t_1),\varphi^{-1}(t_2)\big)$$
Proof. The proof is straightforward.
The following remarks state the invariance of the interval discrimination measure under location and scale transformations.
Remark 4.3. For all $0\le\theta<t_1$, we get $ID_{X+\theta,Y+\theta}(t_1,t_2) = ID_{X,Y}(t_1-\theta,\,t_2-\theta)$.
Remark 4.4. For $a>0$, we have $ID_{aX,aY}(t_1,t_2) = ID_{X,Y}(t_1/a,\,t_2/a)$.

5. Conclusions

In this paper, we presented two novel measures of information which are based on a time interval and are more general than the well-known Shannon differential entropy and the Kullback-Leibler divergence. These new measures are consistent in the sense that they include the residual and past lifetime measures as special cases. We call them the interval entropy and the informative distance. We obtained conditions under which the interval entropy uniquely determines the distribution function. We presented several propositions and remarks, some of which parallel those for the Shannon entropy and the Kullback-Leibler divergence, and others that are more general. The advantages of the interval entropy and the informative distance were outlined as well. We believe that interval-based measures will have many applications in reliability, stochastic processes and other areas in the near future. The results presented here are by no means comprehensive, but hopefully they will pave the way for studying entropy in a different and more general setting.

Acknowledgment

We are very grateful to the anonymous referees for important and constructive comments, which improved the presentation significantly. We also thank Assistant Editor Sarah Shao.

References

  1. Sunoj, S.M.; Sankaran, P.G.; Maya, S.S. Characterizations of life distributions using conditional expectations of doubly (Interval) truncated random variables. Comm. Stat. Theor. Meth. 2009, 38, 1441–1452. [Google Scholar] [CrossRef]
  2. Ebrahimi, N.; Pellerey, F. New partial ordering of survival functions based on the notion of uncertainty. J. Appl. Prob. 1995, 32, 202–211. [Google Scholar] [CrossRef]
  3. Belzunce, F.; Navarro, J.; Ruiz, J.M.; del Águila, Y. Some results on residual entropy function. Metrika 2004, 59, 147–161. [Google Scholar] [CrossRef]
  4. Ebrahimi, N. Testing whether lifetime distribution is increasing uncertainty. J. Statist. Plann. Infer. 1997, 64, 9–19. [Google Scholar] [CrossRef]
  5. Di Crescenzo, A.; Longobardi, M. Entropy based measure of uncertainty in past lifetime distributions. J. Appl. Prob. 2002, 39, 434–440. [Google Scholar]
  6. Ebrahimi, N.; Kirmani, S.N.U.A. Some results on ordering of survival functions through uncertainty. Statist. Prob. Lett. 1996, 29, 167–176. [Google Scholar] [CrossRef]
  7. Di Crescenzo, A.; Longobardi, M. A measure of discrimination between past lifetime distributions. Stat. Prob. Lett. 2004, 67, 173–182. [Google Scholar]
  8. Misagh, F.; Yari, G.H. A novel entropy-based measure of uncertainty to lifetime distributions characterizations. In Proceedings of the First International Conference on Mathematics and Statistics, AUS-ICMS’10, American University of Sharjah, Sharjah, UAE, 18–21 March 2010.
  9. Misagh, F.; Yari, G.H. On weighted interval entropy. Statist. Prob. Lett. 2011, 29, 167–176. [Google Scholar] [CrossRef]
  10. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef]
  11. Navarro, J.; Ruiz, J.M. Failure rate functions for doubly truncated random variables. IEEE Trans. Reliab. 1996, 45, 685–690. [Google Scholar] [CrossRef]
  12. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Statist. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  13. Nielsen, F. A family of statistical symmetric divergences based on Jensen’s inequality. arXiv, 2011; arXiv:1009.4004v2[cs.CV]. [Google Scholar]
  14. Amari, S.-I.; Barndorff-Nielsen, O.E.; Kass, R.E.; Lauritzen, S.L.; Rao, C.R. Differential geometry in statistical inference. IMS Lecture Notes 1987, 10, 217. [Google Scholar]
  15. Maya, S.S.; Sunoj, S.M. Some dynamic generalized information measures in the context of weighted models. Statistica Anno. 2008, LXVIII 1, 71–84. [Google Scholar]
