Information Measure in Terms of the Hazard Function and Its Estimate

It is well-known that some information measures, including Fisher information and entropy, can be represented in terms of the hazard function. In this paper, we provide representations of further information measures, including the quantal Fisher information and the quantal Kullback-Leibler information, in terms of the hazard function and the reverse hazard function. We provide some estimators of the quantal KL information, which include the Anderson-Darling test statistic, and compare their performance.


Introduction
Suppose that X is a random variable with a continuous probability density function (p.d.f.) f(x; θ), where θ is a real-valued scalar parameter. It is well-known that the Fisher information, defined as

I(θ) = E[{∂/∂θ log f(X; θ)}²],

plays an important role in statistical estimation and inference. A Fisher information identity in terms of the hazard function has been provided by Efron and Johnstone [1] as

I(θ) = E[{∂/∂θ log h(X; θ)}²],

where h(x; θ) is the hazard function defined as f(x; θ)/(1 − F(x; θ)) and F(x; θ) is the cumulative distribution function. It is also well-known that the entropy (Teitler et al., 1986) and the Kullback-Leibler information [2] can be represented in terms of the hazard function, respectively, as

H(f) = 1 − E[log h_f(X)]

and

KL(f : g) = E[h_g(X)/h_f(X) − log{h_g(X)/h_f(X)} − 1],

where h_f(x) and h_g(x) are the hazard functions defined as f(x)/(1 − F(x)) and g(x)/(1 − G(x)), respectively.
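The hazard-function representation of the KL information can be checked numerically. The following sketch (not from the paper) verifies KL(f : g) = E_f[h_g/h_f − log(h_g/h_f) − 1], first for two exponential densities, where both sides have closed forms, and then for two normal densities by numerical integration:

```python
# Numerical check of the hazard-function representation
#   KL(f : g) = E_f[ h_g(X)/h_f(X) - log{h_g(X)/h_f(X)} - 1 ].
import math

# Exponential case: Exp(1) vs Exp(2); the hazards are the constant rates.
lam_f, lam_g = 1.0, 2.0
rhs_exp = lam_g / lam_f - math.log(lam_g / lam_f) - 1.0
lhs_exp = math.log(lam_f / lam_g) + lam_g / lam_f - 1.0   # closed-form KL

# Normal case: N(0,1) vs N(0.3,1), where KL = 0.3**2 / 2 = 0.045.
def phi(x): return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
def S(x):   return 0.5 * math.erfc(x / math.sqrt(2.0))    # survival function

n, lo, hi = 20001, -8.0, 8.0
dt = (hi - lo) / (n - 1)
rhs_norm = 0.0
for i in range(n):
    x = lo + i * dt
    ratio = (phi(x - 0.3) / S(x - 0.3)) / (phi(x) / S(x))  # h_g / h_f
    w = 1.0 if 0 < i < n - 1 else 0.5                      # trapezoidal rule
    rhs_norm += w * (ratio - math.log(ratio) - 1.0) * phi(x) * dt

print(abs(lhs_exp - rhs_exp), abs(rhs_norm - 0.045))
```

Both differences are numerically negligible, as the representation predicts.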
The quantal (randomly censored) Fisher information and the quantal (randomly censored) Kullback-Leibler information have been defined [3], respectively, as

I_QF(θ) = ∫ {(∂F(t; θ)/∂θ)² / [F(t; θ){1 − F(t; θ)}]} dW(t)

and

QKL(F : G) = ∫ [F(t) log{F(t)/G(t)} + {1 − F(t)} log{(1 − F(t))/(1 − G(t))}] dW(t),

where W(t) is an appropriate weight function which satisfies ∫ dW(x) = 1. The quantal Fisher information is related to the Fisher information in the ranked set sample, and the quantal Kullback-Leibler information is related to the cumulative residual entropy [4] and the cumulative entropy [5], defined respectively as

−∫ {1 − F(x)} log{1 − F(x)} dx and −∫ F(x) log F(x) dx.

The information representation in terms of the cumulative functions enables us to estimate the information measure by employing the empirical distribution function. The organization of this article is as follows: In Section 2, we discuss the relation between the quantal Fisher information and the quantal Kullback-Leibler information. In Section 3, we provide the expression of the quantal Fisher information in terms of the hazard function h(x; θ) and the reverse hazard function r(x; θ), together with the expression of the quantal (randomly censored) KL information in terms of the hazard functions h_f(x) = f(x)/(1 − F(x)) and h_g(x) = g(x)/(1 − G(x)) and the reverse hazard functions r_f(x) = f(x)/F(x) and r_g(x) = g(x)/G(x). These representations enable us to estimate the quantal information by employing a nonparametric hazard function estimator. In Section 4, we discuss the choice of the weight function W(x) in terms of maximizing the related Fisher information. In Section 5, we provide the estimator of (2) and evaluate its performance as a goodness-of-fit test statistic. Finally, in Section 6, some concluding remarks are provided.

Quantal Fisher Information and Quantal Kullback-Leibler Information
If we define the quantal response variable Y at t as Y = 1 if X ≤ t and Y = 0 otherwise, its density function is

f_Y(y; θ) = F(t; θ)^y {1 − F(t; θ)}^{1−y}, y = 0, 1.

Then, the conditional Fisher information in the quantal response at t about θ can be obtained as

I_t(θ) = (∂F(t; θ)/∂θ)² / [F(t; θ){1 − F(t; θ)}].

This conditional Fisher information has been studied in terms of censoring by Gertsbakh [6] and Park [7], and its weighted average has been defined to be the quantal (randomly censored) Fisher information [3] as

I_QF(θ) = ∫ I_t(θ) dW(t),

where W(t) is an appropriate weight function. The expression (5) shows that I_QF(θ) may be called the cumulative Fisher information and can be written in a simpler form.
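As a numerical sketch (an illustration under assumed choices, not from the paper), the quantal Fisher information for the N(µ, 1) location parameter can be computed by integrating the Bernoulli information I_t(µ) against the weight dW = dΦ, using ∂F(t; µ)/∂µ = −φ(t − µ). Since each I_t(µ) is the information in a single Bernoulli observation, I_t(µ) ≤ I(µ) = 1 and hence I_QF(µ) ≤ 1:

```python
# Quantal Fisher information I_QF(mu) for the N(mu, 1) location parameter
# at mu = 0, with assumed weight dW(t) = dPhi(t):
#   I_QF = integral of phi(t)^2 / [Phi(t) (1 - Phi(t))] dPhi(t).
import math

def phi(x): return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
def Phi(x): return 0.5 * math.erfc(-x / math.sqrt(2.0))

n, lo, hi = 20001, -7.0, 7.0
dt = (hi - lo) / (n - 1)
I_QF = 0.0
for i in range(n):
    t = lo + i * dt
    I_t = phi(t) ** 2 / (Phi(t) * (1.0 - Phi(t)))  # Bernoulli information at t
    w = 1.0 if 0 < i < n - 1 else 0.5              # trapezoidal rule
    I_QF += w * I_t * phi(t) * dt                  # weight dW = dPhi

print(round(I_QF, 3))
```

The value stays strictly below the full-sample Fisher information I(µ) = 1, reflecting the loss of information in reducing X to a quantal response.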

Remark 1.
If we take W(x) to be F(x; θ), I_QF(θ) is related to the Fisher information in the ranked set sample [8], where I_SRS(θ) is the Fisher information in a simple random sample of size n, which equals nI(θ), and I_RSS(θ) is the Fisher information in a ranked set sample. The result means that the ranked set sample carries additional ordering information, in the n(n + 1) pairs, beyond the simple random sample. Hence, 1 + (n + 1)I_QF(θ)/I(θ) represents the efficiency of the ranked set sample relative to the simple random sample.
In a similar way, the Kullback-Leibler (KL) information between two quantal random variables can be obtained as

KL_t(F : G) = F(t) log{F(t)/G(t)} + {1 − F(t)} log{(1 − F(t))/(1 − G(t))}.

Then, the weighted average of KL_t(F : G) has been defined to be the quantal (randomly censored) divergence [3], as

QKL(F : G) = ∫ KL_t(F : G) dW(t).

We note that the quantal KL information (quantal divergence) with dW(x) = dx equals the sum of the cumulative KL information (Park, 2015) and the cumulative residual KL information [9]. This quantal Kullback-Leibler information has been discussed in constructing goodness-of-fit test statistics by Zhang [10].
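The quantal KL information can be evaluated numerically. The following sketch (with assumed choices F = N(0, 1), G = N(0.5, 1), and dW = dF) integrates the Bernoulli KL divergence KL_t(F : G) against the weight; since each integrand is itself a KL divergence, QKL(F : G) ≥ 0, with equality when F = G:

```python
# Quantal KL information QKL(F : G) = ∫ KL_t(F : G) dW(t) with dW = dF,
# for F = N(0, 1) and G = N(mu_g, 1) (mu_g assumed for illustration).
import math

def phi(x): return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
def Phi(x): return 0.5 * math.erfc(-x / math.sqrt(2.0))

def qkl(mu_g, n=20001, lo=-7.0, hi=7.0):
    dt = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        t = lo + i * dt
        F, G = Phi(t), Phi(t - mu_g)
        # Bernoulli KL divergence between quantal responses at t
        kl_t = F * math.log(F / G) + (1 - F) * math.log((1 - F) / (1 - G))
        w = 1.0 if 0 < i < n - 1 else 0.5      # trapezoidal rule
        total += w * kl_t * phi(t) * dt        # weight dW(t) = dPhi(t)
    return total

print(qkl(0.0), qkl(0.5))
```

The first value is zero (identical distributions) and the second is strictly positive.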
The following approximation of the KL information in terms of the Fisher information is well-known [11]:

KL(f_θ : f_{θ+Δθ}) = {(Δθ)²/2} I(θ) + o((Δθ)²).

Hence, we can also apply Taylor's expansion to (2) to obtain the approximation of the quantal KL information in terms of the quantal Fisher information as follows:

Lemma 1. QKL(F_θ : F_{θ+Δθ}) = {(Δθ)²/2} I_QF(θ) + o((Δθ)²).

Proof of Lemma 1. By applying the Taylor expansion, we have KL_t(F_θ : F_{θ+Δθ}) = {(Δθ)²/2} I_t(θ) + o((Δθ)²). Then, we can integrate with respect to dW(t) to obtain the result.
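The expansion used in this proof can be sketched as follows (a reconstruction from the Bernoulli form of the quantal response; the shorthand p and q is not used elsewhere in the paper):

```latex
% Second-order expansion of the Bernoulli KL divergence at t.
% Write p = F(t;\theta) and q = F(t;\theta+\Delta\theta).
\begin{align*}
KL_t(F_\theta : F_{\theta+\Delta\theta})
  &= p\log\frac{p}{q} + (1-p)\log\frac{1-p}{1-q} \\
  &= \frac{(q-p)^2}{2\,p(1-p)} + o\!\left((q-p)^2\right)
     \qquad\text{(expanding in $q$ around $p$)} \\
  &= \frac{(\Delta\theta)^2}{2}\,
     \frac{\{\partial F(t;\theta)/\partial\theta\}^2}{F(t;\theta)\{1-F(t;\theta)\}}
     + o\!\left((\Delta\theta)^2\right)
   = \frac{(\Delta\theta)^2}{2}\, I_t(\theta) + o\!\left((\Delta\theta)^2\right).
\end{align*}
% Integrating with respect to dW(t) gives
% QKL(F_\theta : F_{\theta+\Delta\theta})
%   = \{(\Delta\theta)^2/2\} I_{QF}(\theta) + o((\Delta\theta)^2).
```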

Quantal Fisher Information in Terms of the (Reversed) Hazard Function
It is well-known that the Fisher information can be represented in terms of the hazard function [1] as

I(θ) = E[{∂/∂θ log h(X; θ)}²],

where h(x; θ) = f(x; θ)/(1 − F(x; θ)). The mirror image of (1) provides another representation of the Fisher information in terms of the reverse hazard function [12] as

I(θ) = E[{∂/∂θ log r(X; θ)}²],

where r(x; θ) is the reverse hazard function defined as f(x; θ)/F(x; θ). Then, (6) can be written again in terms of both the hazard function and the reverse hazard function in view of (7) and (8). Now, we show that the quantal Fisher information can also be expressed in terms of both the hazard function and the reverse hazard function, as follows.
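The hazard-function identity of Efron and Johnstone can be checked numerically. For the N(µ, 1) location family at µ = 0, one has h(x; µ) = φ(x − µ)/{1 − Φ(x − µ)} and ∂/∂µ log h(x; µ)|_{µ=0} = x − h(x), so the identity asserts E[{X − h(X)}²] = I(µ) = 1. The following check (not from the paper) verifies this:

```python
# Numerical check of I(theta) = E[{d/dtheta log h(X;theta)}^2]
# for the N(mu, 1) location family at mu = 0, where I(mu) = 1.
import math

def phi(x): return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
def S(x):   return 0.5 * math.erfc(x / math.sqrt(2.0))   # 1 - Phi(x)

n, lo, hi = 40001, -10.0, 10.0
dt = (hi - lo) / (n - 1)
total = 0.0
for i in range(n):
    x = lo + i * dt
    h = phi(x) / S(x)            # hazard function (inverse Mills ratio)
    score_h = x - h              # d/dmu log h(x; mu) evaluated at mu = 0
    w = 1.0 if 0 < i < n - 1 else 0.5
    total += w * score_h ** 2 * phi(x) * dt

print(round(total, 4))
```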
Theorem 1. Suppose that W(x) is bounded and that the regularity conditions for the existence of the Fisher information hold. Then, I_QF(θ) can be expressed in terms of the hazard function h(x; θ) and the reverse hazard function r(x; θ).
Proof of Theorem 1. In view of Park [7], we have the decomposition of the Fisher information as in (10). Hence, I_t^QF(θ) can also be expressed from (10), and we can take the expectation of (11) and apply Fubini's theorem to get the result. Equation (9) can be written in terms of the Fisher information in order statistics, because it has been shown in Park (1996) that it involves I_{i:n}(θ), the Fisher information in the ith order statistic from an independently and identically distributed sample of size n.

Quantal KL Information and Choice of the Weight Function in Terms of Maximizing the Quantal Fisher Information
Because Lemma 2 shows that the approximation of the Kullback-Leibler information can be represented in terms of the hazard function and the reverse hazard function, the following representations of the KL information have been shown in Park and Shin [2]:

KL(f : g) = E[h_g(X)/h_f(X) − log{h_g(X)/h_f(X)} − 1]

and

KL(f : g) = E[r_g(X)/r_f(X) − log{r_g(X)/r_f(X)} − 1].

In a similar context, Lemma 2 and Theorem 1 say that the approximation of the quantal Kullback-Leibler information can also be represented in terms of the hazard function and the reverse hazard function; hence, we can expect the corresponding quantal KL information representation in terms of the hazard function and the reverse hazard function.
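The reverse-hazard representation KL(f : g) = E_f[r_g/r_f − log(r_g/r_f) − 1] can be checked with a pair of densities whose reverse hazards are simple. The following sketch (not from the paper) uses power densities on (0, 1), for which the ratio of reverse hazards is constant and both sides have closed forms:

```python
# Check of the reverse-hazard representation of the KL information
# with power densities on (0,1): f(x) = a x^(a-1), g(x) = b x^(b-1),
# for which r_f(x) = a/x, r_g(x) = b/x, so r_g/r_f = b/a is constant.
import math

a, b = 2.0, 3.0
rhs = b / a - math.log(b / a) - 1.0            # expectation of a constant

# Direct closed form: KL(f : g) = log(a/b) + b/a - 1 for these densities,
# using E_f[log X] = -1/a.
lhs = math.log(a / b) + b / a - 1.0

print(abs(lhs - rhs) < 1e-12)
```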
Proof of Theorem 2. We can apply integration by parts to (2) to get the result.
Equation (13) can be rewritten in terms of the cumulative distribution function. Hence, the quantal KL information has another representation in terms of the cumulative distribution function, which measures the weighted differences between the distribution functions and the log odds ratios. Now, we consider the choice of the weight function W(x) in QKL(F : G), which has not been discussed much so far. Here, we adopt the criterion of maximizing the quantal Fisher information in Theorem 1. For the multi-parameter case, we have the quantal Fisher information matrix and can consider its determinant, which is called the generalized Fisher information.
For illustration, we take F(x) to be the normal distribution with cumulative distribution function Φ(x). We then consider several choices of dW(x), whose shapes are plotted in Figure 1, where dW_1(x) is the bimodal weight function and the shapes become more centralized as i in dW_i(x) increases. We calculate the corresponding quantal Fisher information and summarize the results in Table 1. We can see from Table 1 that I_QF(θ) about the location parameter gets larger as the weight function becomes more centralized. We also note that I_QF(θ) about the scale parameter is maximized at the bimodal weight function. However, the generalized quantal Fisher information is maximized at dW(x) = dΦ(x).
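The generalized quantal Fisher information for a given weight can be computed numerically. The following sketch (an illustration with the assumed weight dW = dΦ, evaluated at (µ, σ) = (0, 1)) builds the 2 × 2 quantal Fisher information matrix for the normal parameters, using ∂F/∂µ = −φ(z)/σ and ∂F/∂σ = −zφ(z)/σ with z = (t − µ)/σ, and takes its determinant:

```python
# Quantal Fisher information matrix for the N(mu, sigma) parameters at
# (0, 1) with assumed weight dW = dPhi; entries are
#   I_ij = ∫ (dF/dtheta_i)(dF/dtheta_j) / [F(1 - F)] dW(t).
import math

def phi(x): return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
def Phi(x): return 0.5 * math.erfc(-x / math.sqrt(2.0))

n, lo, hi = 20001, -7.0, 7.0
dt = (hi - lo) / (n - 1)
I = [[0.0, 0.0], [0.0, 0.0]]
for i in range(n):
    t = lo + i * dt
    F = Phi(t)
    d = [-phi(t), -t * phi(t)]     # (dF/dmu, dF/dsigma) at (mu, sigma) = (0, 1)
    w = (1.0 if 0 < i < n - 1 else 0.5) * phi(t) * dt   # dW = dPhi
    for r in range(2):
        for c in range(2):
            I[r][c] += w * d[r] * d[c] / (F * (1.0 - F))

det = I[0][0] * I[1][1] - I[0][1] * I[1][0]   # generalized quantal information
print([[round(v, 4) for v in row] for row in I], round(det, 4))
```

The off-diagonal entry vanishes by symmetry, so the determinant is the product of the location and scale informations.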

Estimation of the Quantal KL Information
Suppose that we have an independent and identically distributed (IID) sample of size n, (x_1, · · · , x_n), from an assumed density function f_θ(x), and that (x_{1:n}, · · · , x_{n:n}) are the ordered values. Then, the distance between the sample distribution and the assumed distribution can be measured by KL(f_n : f_θ), where f_n is an appropriate nonparametric density function estimator, and its estimate has been studied as a goodness-of-fit test statistic by many authors, including Pakyari and Balakrishnan [13], Noughabi and Arghami [14], and Qiu and Jia [15], who considered a piecewise uniform density function estimator or a nonparametric kernel density function estimator. In the same manner, the estimate of (12) has been studied by Park and Shin (2015) for the same purpose by considering a nonparametric hazard function estimator. However, we note that the critical values based on those nonparametric density (hazard) function estimators depend on the choice of a bandwidth-type parameter.
We can also measure the distance between the sample distribution and the assumed distribution with QKL(F_n : F_θ) if we choose the weight function to be F_n(x) in view of Section 4, where F_n is the empirical distribution function. Then, F_n(x_{i:n}) can be taken to be i/n, and dF_n(x) assigns mass 1/n to each x_{i:n}, so that (14) can be written in terms of ξ_i = F_θ(x_{i:n}), with ξ_0 = 0 and ξ_{n+1} = 1, which yields QKL_R. However, because the empirical distribution function is only right-continuous, we can also take 1 − F_n(x_{i:n}) to be (n − i + 1)/n, so that F_n(x_{i:n}) is (i − 1)/n, which yields QKL_L. Hence, we may take the average of both and obtain QKL_n(F_n : F_θ), which is actually equivalent to the Anderson-Darling test statistic. Zhang [10] proposed a test statistic, Z_A, by choosing a different weight function. For illustration, we consider the performance of the above statistics for testing the following hypothesis: H_0: the true distribution function is N(µ, σ) versus H_1: the true distribution function is not N(µ, σ). The unknown parameters, µ and σ, are estimated with the sample mean and the sample standard deviation, respectively. We also consider the classical Kolmogorov-Smirnov test statistic (Lilliefors test) for comparison,

D_n = max_x |F_n(x) − Φ(z)|,

where z = (x − x̄)/s, and x̄ and s are the sample mean and the sample standard deviation, respectively.
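The Anderson-Darling statistic to which QKL_n reduces can be sketched as follows (the standard computational form with estimated parameters, Lilliefors-style; the helper name is ours):

```python
# Anderson-Darling statistic for normality with estimated parameters:
#   A^2 = -n - (1/n) * sum_{i=1}^n (2i-1) [log u_i + log(1 - u_{n+1-i})],
# where u_i = Phi((x_{i:n} - xbar)/s).
import math
import random

def Phi(x): return 0.5 * math.erfc(-x / math.sqrt(2.0))

def anderson_darling_normal(xs):
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    u = sorted(Phi((x - xbar) / s) for x in xs)
    return -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
                    for i in range(1, n + 1)) / n

rng = random.Random(1)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
expo_sample = [rng.expovariate(1.0) for _ in range(500)]
print(anderson_darling_normal(normal_sample))   # small under H_0
print(anderson_darling_normal(expo_sample))     # large under this alternative
```

The statistic is small for normal data and large for the skewed exponential alternative, as the power comparisons below suggest.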
We provide the critical values of the above test statistics for n = 10, 20, · · · , 100 in Table 2, which are obtained by employing Monte Carlo simulations of size 200,000. Then, we compare the power estimates of the above test statistics, for illustration, against the following alternatives:
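The critical-value computation can be sketched with a much smaller simulation size than the 200,000 replications used for Table 2 (here 5,000, for n = 20, with the Anderson-Darling form of the statistic; all concrete choices are ours):

```python
# Monte Carlo sketch: estimate the null 5% critical value of the
# Anderson-Darling statistic for normality with estimated parameters.
import math
import random

def Phi(x): return 0.5 * math.erfc(-x / math.sqrt(2.0))

def a2_normal(xs):
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    u = sorted(Phi((x - m) / s) for x in xs)
    return -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
                    for i in range(1, n + 1)) / n

rng = random.Random(123)
reps, n = 5000, 20
stats = sorted(a2_normal([rng.gauss(0.0, 1.0) for _ in range(n)])
               for _ in range(reps))
crit = stats[int(0.95 * reps)]          # estimated 95th percentile under H_0
print(round(crit, 3))
```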
We also employed the Monte Carlo simulation to estimate the powers against the above alternatives for n = 20, 50, 100, respectively, where the simulation size is 100,000. The numerical results are summarized in Tables 3-5. These show that QKL_n performs better than QKL_R and QKL_L against symmetric alternatives, and that the powers of QKL_n against asymmetric alternatives lie between those of QKL_R and QKL_L. They all outperform the classical Kolmogorov-Smirnov test statistic. Z_A generally performs better than QKL_n against asymmetric alternatives, but the simulation results show that Z_A seems to be a biased test, which can be seen from the power estimate against Beta(2, 2) for n = 20.

Concluding Remarks
It is well-known that both the Fisher information and the Kullback-Leibler information can be represented in terms of the hazard function or the reverse hazard function. We considered the quantal response variable and showed that the quantal Fisher information and the quantal KL information can also be represented in terms of both the hazard function and the reverse hazard function. We also provided the criterion of maximizing the generalized quantal Fisher information in choosing the weight function in the quantal KL information. For illustration, we considered the normal distribution, studied the choice of the weight function, and compared the performance of the estimators of the quantal KL information as goodness-of-fit tests.