Estimating the Entropy of a Weibull Distribution under Generalized Progressive Hybrid Censoring

Recently, progressive hybrid censoring schemes have become quite popular in life-testing problems and reliability analysis. However, the limitation of the progressive hybrid censoring scheme is that it cannot be applied when few failures occur before time $T$. Therefore, a generalized progressive hybrid censoring scheme was introduced. In this paper, the estimation of the entropy of a two-parameter Weibull distribution based on a generalized progressive hybrid censored sample is considered. The Bayes estimators for the entropy of the Weibull distribution based on symmetric and asymmetric loss functions, such as the squared error, linex and general entropy loss functions, are provided. The Bayes estimators cannot be obtained explicitly, so Lindley's approximation is used to compute them. Simulation experiments are performed to assess the effectiveness of the different estimators. Finally, a real dataset is analyzed for illustrative purposes.


Introduction
Entropy, which is one of the important terms in statistical mechanics, was originally defined in physics, especially in the second law of thermodynamics. Shannon [1] re-defined it in information theory using the concepts of probability and statistics. Let $X$ be a random variable with a continuous cumulative distribution function (cdf) $F(x)$ and probability density function (pdf) $f(x)$. The differential entropy $H(X)$ of the random variable is defined by Cover and Thomas [2] as:
$$H(X) = -\int_{-\infty}^{\infty} f(x) \log f(x)\, dx.$$
It is seen that a very sharply peaked distribution has very low entropy, whereas if the probability is spread out, the entropy is much higher. In this sense, $H(X)$ is a measure of the uncertainty associated with $f$. Furthermore, as $H(X)$ increases, $f(x)$ approaches uniformity. Thus, entropy can be viewed as measuring the uniformity of a distribution (Jiheel and Shanubhoque [3]). Many authors have worked on the estimation of entropy for different life distributions. Baratpour et al. [4] developed the entropy of upper record values and provided several upper and lower bounds for this entropy by using the hazard rate function. Cramer and Bagh [5] discussed the entropy in the Weibull distribution for progressive censoring. Abo-Eleneen [6] suggested an efficient method for the simple computation of entropy based on progressively Type II censored samples. Cho et al. [7] derived estimators for the entropy function of a Rayleigh distribution based on doubly-generalized Type II hybrid censored samples by using the maximum likelihood estimator (MLE), approximate MLEs and Bayes estimators.
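The remark that a sharply peaked density has lower entropy than a spread-out one can be checked numerically. The following is a minimal sketch, not part of the paper's method: it uses a zero-mean normal density purely as an illustration (an assumption made here, since its entropy $\frac{1}{2}\log(2\pi e\sigma^2)$ is known in closed form), and the helper names `normal_pdf`, `numeric_entropy` and `normal_entropy` are hypothetical.

```python
import math

def normal_pdf(x, sigma):
    """Density of a zero-mean normal distribution with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def numeric_entropy(pdf, lo, hi, steps=20000):
    """Midpoint-rule approximation of -integral f(x) log f(x) dx over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        fx = pdf(lo + (i + 0.5) * dx)
        if fx > 0.0:  # f log f tends to 0 as f tends to 0, so such points add nothing
            total -= fx * math.log(fx) * dx
    return total

def normal_entropy(sigma):
    """Closed-form differential entropy of a normal distribution."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# Illustrative choice: a peaked density (sigma = 0.1) and a spread-out one (sigma = 1).
peaked = numeric_entropy(lambda x: normal_pdf(x, 0.1), -2.0, 2.0)
spread = numeric_entropy(lambda x: normal_pdf(x, 1.0), -20.0, 20.0)
```

Under these assumptions `peaked` is smaller than `spread`, and it is in fact negative, which is a reminder that differential entropy, unlike discrete entropy, is not bounded below by zero.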
The conventional censoring schemes (Type I, Type II censoring, hybrid censoring and generalized hybrid censoring [8]) do not allow units to be removed from the test at points other than the final termination point. Intermediate removal may be desirable when a compromise between reduced experimentation time and the observation of at least some extreme lifetimes is sought, or when some of the surviving units that are removed early can be used for other tests. Moreover, the loss of units at points other than the final termination point may be unavoidable, as in the case of the accidental breakage of experimental units or the loss of contact with individuals under experiment. These reasons and motivations lead reliability practitioners and theoreticians directly into the area of progressive censoring (Balakrishnan and Aggarwala [9], Balakrishnan and Cramer [10]). A progressive censoring scheme can be described as follows. Immediately following the first observed failure, $R_1$ surviving units are removed from the test at random. Similarly, following the second observed failure, $R_2$ surviving units are removed from the test at random. This process continues until, immediately following the $m$-th observed failure, all of the remaining $R_m = n - R_1 - \cdots - R_{m-1} - m$ units are removed from the experiment. In this experiment, the progressive censoring scheme $R = (R_1, R_2, \ldots, R_m)$ is pre-fixed. The resulting $m$ ordered failure times, which we denote by $X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n}$, are referred to as a progressive Type II censored sample.
A disadvantage of the progressive Type II censoring scheme is that the time of the experiment can be very long if the units are highly reliable. Because of that, Kundu and Joarder [11] proposed a progressive hybrid censoring scheme in the context of a life-testing experiment in which $n$ identical units are placed in an experiment with the progressive Type II censoring scheme $(R_1, R_2, \ldots, R_m)$, and the experiment is terminated at time $\min\{X_{m:m:n}, T\}$, where $T \in (0, \infty)$ and $1 \le m \le n$ are fixed in advance and $X_{1:m:n} \le X_{2:m:n} \le \cdots \le X_{m:m:n}$ are the ordered failure times from the experiment. Under a progressive hybrid censoring scheme, the time of the experiment will be no more than $T$. Recent studies on progressive hybrid censoring include Lin et al. [12], Lin and Huang [13] and Lin et al. [14]. One limitation of the progressive hybrid censoring scheme is that it cannot be applied when very few failures occur before time $T$. In that situation, the MLE of a parameter of the underlying distribution may not be computable, or its accuracy will be extremely low. For this reason, Cho et al. [7] proposed a generalized progressive hybrid censoring scheme, which allows us to observe a pre-specified number of failures.
In Section 2, we introduce a generalized progressive hybrid censoring scheme. In Section 3, we describe the computation of the entropy function with the MLE. The Bayes estimates for the entropy are derived in Section 4 for different loss functions using Lindley's approximation. A real dataset is analyzed in Section 5. In Section 6, the different estimators are compared by means of a Monte Carlo simulation, and Section 7 concludes.

Generalized Progressive Hybrid Censoring
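Consider a life-testing experiment in which $n$ identical units are put on test. Assume that $X_1, X_2, \ldots, X_n$ denote the corresponding lifetimes from a distribution with cdf $F(x)$ and pdf $f(x)$. A generalized progressive hybrid censoring scheme is described as follows (Cho et al. [15]). The integers $k$ and $m$ with $1 \le k < m \le n$, the censoring scheme $(R_1, R_2, \ldots, R_m)$ with $n = m + R_1 + \cdots + R_m$ and the time point $T \in (0, \infty)$ are fixed in advance. At the time of the first failure, $R_1$ of the remaining units are randomly removed. Similarly, at the time of the second failure, $R_2$ of the remaining units are removed, and so on. This process continues until, immediately following the termination time $T^* = \max\{X_{k:m:n}, \min\{X_{m:m:n}, T\}\}$, all of the remaining units are removed from the experiment. This generalized progressive hybrid censoring scheme modifies the progressive hybrid censoring scheme by allowing the experiment to continue beyond time $T$ if very few failures have been observed up to time $T$. Under this scheme, the experimenter would ideally like to observe $m$ failures, but is willing to accept a bare minimum of $k$ failures. Let $D$ denote the number of observed failures up to time $T$. In this scheme, we have one of the following types of observations:
$$\begin{array}{ll}
\text{Case I:} & X_{1:m:n}, \ldots, X_{k:m:n}, \quad \text{if } T < X_{k:m:n} < X_{m:m:n}, \\
\text{Case II:} & X_{1:m:n}, \ldots, X_{D:m:n}, \quad \text{if } X_{k:m:n} < T < X_{m:m:n}, \\
\text{Case III:} & X_{1:m:n}, \ldots, X_{m:m:n}, \quad \text{if } X_{k:m:n} < X_{m:m:n} < T.
\end{array}$$
Note that for Case I, $T < X_{k:m:n} < X_{m:m:n}$ and $X_{k+1:m:n}, \ldots, X_{m:m:n}$ are not observed. For Case II, $X_{D:m:n} < T < X_{D+1:m:n}$ and $X_{D+1:m:n}, \ldots, X_{m:m:n}$ are not observed. A schematic representation of the generalized progressive hybrid censoring scheme is presented in Figure 1. Given a generalized progressive hybrid censored sample, the likelihood functions for the three cases are as follows:
$$\begin{aligned}
L_{I} &= C_k \left[\prod_{j=1}^{k-1} f(x_{j:m:n})\,[1 - F(x_{j:m:n})]^{R_j}\right] f(x_{k:m:n})\,[1 - F(x_{k:m:n})]^{R_k^*}, \\
L_{II} &= C_D \left[\prod_{j=1}^{D} f(x_{j:m:n})\,[1 - F(x_{j:m:n})]^{R_j}\right] [1 - F(T)]^{R_{D+1}^*}, \\
L_{III} &= C_m \prod_{j=1}^{m} f(x_{j:m:n})\,[1 - F(x_{j:m:n})]^{R_j},
\end{aligned} \tag{1}$$
where $C_J = \prod_{j=1}^{J} \sum_{i=j}^{m} (R_i + 1)$, $R_k^* = n - k - \sum_{j=1}^{k-1} R_j$ is the number of units removed at $X_{k:m:n}$ in Case I and $R_{D+1}^* = n - D - \sum_{j=1}^{D} R_j$ is the number of units removed at $T$ in Case II.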
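The three observation types can be made concrete with a small routine that takes a full progressive Type II censored sample and returns what the experimenter would actually see. This is a minimal sketch under stated assumptions: the function name `gphc_observe` and its return convention are hypothetical, the input list is assumed to be sorted, and ties between $T$ and a failure time (a probability-zero event for continuous lifetimes) are not treated specially.

```python
from bisect import bisect_right

def gphc_observe(x, k, T):
    """Classify a progressive Type II sample under generalized progressive hybrid censoring.

    x : sorted failure times X_{1:m:n} <= ... <= X_{m:m:n} (length m)
    k : minimum number of failures to be observed (1 <= k < m)
    T : pre-fixed time point
    Returns (case, observed failure times, termination time T*).
    """
    m = len(x)
    if not 1 <= k < m:
        raise ValueError("k must satisfy 1 <= k < m")
    if T < x[k - 1]:
        # Case I: fewer than k failures by T, so the test runs on to X_{k:m:n}.
        return "I", x[:k], x[k - 1]
    if T < x[m - 1]:
        # Case II: at least k but fewer than m failures by T, so the test stops at T.
        D = bisect_right(x, T)  # number of failures observed up to time T
        return "II", x[:D], T
    # Case III: all m failures occur before T, so the test stops at X_{m:m:n}.
    return "III", x[:], x[m - 1]
```

In every branch the returned termination time agrees with $T^* = \max\{X_{k:m:n}, \min\{X_{m:m:n}, T\}\}$, which is a convenient sanity check on the case logic.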

Maximum Likelihood Estimation
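Let the lifetimes follow a two-parameter Weibull distribution with pdf and cdf:
$$f(x) = \alpha\lambda x^{\alpha-1} e^{-\lambda x^{\alpha}}, \qquad F(x) = 1 - e^{-\lambda x^{\alpha}}, \qquad x > 0,\ \alpha > 0,\ \lambda > 0. \tag{2}$$
Utilizing Equations (1) and (2), the likelihood functions of $\alpha$ and $\lambda$ are given, up to constants that do not depend on the parameters, by:
$$\begin{aligned}
\text{Case I:} \quad & L_{I}(\alpha,\lambda) \propto \alpha^{k}\lambda^{k} \left[\prod_{j=1}^{k} x_{j:m:n}^{\alpha-1}\right] \exp\left\{-\lambda\left[\sum_{j=1}^{k-1} x_{j:m:n}^{\alpha}(1+R_j) + x_{k:m:n}^{\alpha}(1+R_k^*)\right]\right\}, \\
\text{Case II:} \quad & L_{II}(\alpha,\lambda) \propto \alpha^{D}\lambda^{D}\, e^{-\lambda T^{\alpha} R_{D+1}^*} \prod_{j=1}^{D} x_{j:m:n}^{\alpha-1} e^{-\lambda x_{j:m:n}^{\alpha}(1+R_j)}, \\
\text{Case III:} \quad & L_{III}(\alpha,\lambda) \propto \alpha^{m}\lambda^{m} \prod_{j=1}^{m} x_{j:m:n}^{\alpha-1} e^{-\lambda x_{j:m:n}^{\alpha}(1+R_j)},
\end{aligned}$$
with $R_k^*$ and $R_{D+1}^*$ denoting the numbers of units removed at $X_{k:m:n}$ in Case I and at $T$ in Case II, respectively.
Taking logarithms, the log-likelihood functions for Cases I, II and III can be combined and written, up to an additive constant, as:
$$l(\alpha,\lambda) = J\log\alpha + J\log\lambda + (\alpha-1)\sum_{j=1}^{J}\log x_{j:m:n} - \lambda\left[\sum_{j=1}^{J} x_{j:m:n}^{\alpha}(1+R_j) + W(\alpha)\right],$$
where $J = k$, $R_k = R_k^*$ and $W(\alpha) = 0$ for Case I, $J = D$ and $W(\alpha) = T^{\alpha} R_{D+1}^*$ for Case II and $J = m$ and $W(\alpha) = 0$ for Case III.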
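The combined log-likelihood lends itself to a short numerical sketch of the MLE and of the plug-in entropy estimate. Everything below rests on assumptions that should be checked against the paper: the Weibull density is taken as $f(x) = \alpha\lambda x^{\alpha-1}e^{-\lambda x^{\alpha}}$, for which setting $\partial l/\partial\lambda = 0$ gives $\hat\lambda(\alpha) = J/[\sum_j x_j^{\alpha}(1+R_j) + W(\alpha)]$ and for which the entropy works out to $H(\alpha,\lambda) = 1 + \gamma(1 - 1/\alpha) - \log\alpha - (\log\lambda)/\alpha$, with $\gamma$ Euler's constant. The root of the profile score in $\alpha$ is found here by plain bisection, which is swapped in for the algorithm of Kundu [16] that the paper cites; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def weibull_entropy(alpha, lam):
    """Entropy of the Weibull density alpha*lam*x^(alpha-1)*exp(-lam*x^alpha) (assumed form)."""
    return 1.0 + EULER_GAMMA * (1.0 - 1.0 / alpha) - math.log(alpha) - math.log(lam) / alpha

def _sums(alpha, x, r, T, r_star):
    """Return S = sum x^a (1+R) + W(a) and V = sum x^a (1+R) log x + W'(a)."""
    s = sum((1 + rj) * xj ** alpha for xj, rj in zip(x, r))
    v = sum((1 + rj) * xj ** alpha * math.log(xj) for xj, rj in zip(x, r))
    if T is not None:  # Case II only: W(a) = r_star * T^a
        s += r_star * T ** alpha
        v += r_star * T ** alpha * math.log(T)
    return s, v

def profile_score(alpha, x, r, T=None, r_star=0):
    """d l / d alpha after substituting lambda = J / S(alpha)."""
    s, v = _sums(alpha, x, r, T, r_star)
    return len(x) / alpha + sum(math.log(xj) for xj in x) - len(x) * v / s

def weibull_mle(x, r, T=None, r_star=0, lo=1e-3, hi=50.0, tol=1e-10):
    """Bisection on the profile score; x are the J observed failures, r their removals.

    In Case I the last entry of r should already be R*_k; in Case II pass T and r_star.
    """
    if profile_score(lo, x, r, T, r_star) <= 0 or profile_score(hi, x, r, T, r_star) >= 0:
        raise ValueError("root of the profile score is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile_score(mid, x, r, T, r_star) > 0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    s, _ = _sums(alpha, x, r, T, r_star)
    return alpha, len(x) / s
```

By the invariance property of maximum likelihood, the entropy estimate would then be `weibull_entropy(*weibull_mle(x, r, T, r_star))`. Bisection is chosen only because it is easy to verify; it needs a bracket on which the profile score changes sign.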

Bayes Estimation
In this section, we derive the Bayes estimators for the entropy of a Weibull distribution under symmetric, as well as asymmetric, loss functions. A very well-known symmetric loss function is the squared error loss (SEL) function, which is defined as $L_1(\theta, \delta) = (\delta - \theta)^2$, with $\delta$ being an estimate of $\theta$. For this situation, the Bayes estimate, say $\hat\theta_S$, is given by the posterior mean of $\theta$.
One of the most commonly used asymmetric loss functions is the linex loss (LL) function, which is defined by:
$$L_2(\theta, \delta) = e^{h(\delta-\theta)} - h(\delta-\theta) - 1, \qquad h \neq 0.$$
The sign of the parameter $h$ represents the direction of asymmetry, and its magnitude reflects the degree of asymmetry. For $h < 0$, underestimation is more serious than overestimation, and for $h > 0$, overestimation is more serious than underestimation. For $h$ close to zero, the LL function is approximately the SEL function. In this case, the Bayes estimate of $\theta$ is obtained as:
$$\hat\theta_L = -\frac{1}{h}\log\left[E\left(e^{-h\theta} \mid x\right)\right],$$
provided the above expectation exists. Another commonly used asymmetric loss function is the general entropy loss (EL) function given by:
$$L_3(\theta, \delta) = \left(\frac{\delta}{\theta}\right)^{q} - q\log\left(\frac{\delta}{\theta}\right) - 1, \qquad q \neq 0.$$
For $q > 0$, a positive error has a more serious effect than a negative error, and for $q < 0$, a negative error has a more serious effect than a positive error. Note that for $q = -1$, the Bayes estimate coincides with the Bayes estimate under the SEL function. In this case, the Bayes estimate of $\theta$ is obtained as:
$$\hat\theta_E = \left[E\left(\theta^{-q} \mid x\right)\right]^{-1/q},$$
provided the above expectation exists.
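The three Bayes estimates are all posterior expectations, so their behaviour can be explored with any set of posterior draws. The sketch below is only an illustration under assumptions: it takes the linex loss as $e^{h(\delta-\theta)} - h(\delta-\theta) - 1$ and the general entropy loss as $(\delta/\theta)^q - q\log(\delta/\theta) - 1$, replaces the posterior expectation with an average over a list of draws (the paper instead uses Lindley's approximation), and uses hypothetical function names. The general entropy estimate requires positive draws.

```python
import math

def bayes_sel(draws):
    """Squared error loss: posterior mean."""
    return sum(draws) / len(draws)

def bayes_linex(draws, h):
    """Linex loss: -(1/h) * log E[exp(-h * theta)], with h != 0."""
    if h == 0:
        raise ValueError("h must be nonzero")
    mean_exp = sum(math.exp(-h * t) for t in draws) / len(draws)
    return -math.log(mean_exp) / h

def bayes_gel(draws, q):
    """General entropy loss: (E[theta^(-q)])^(-1/q), with q != 0 and positive draws."""
    if q == 0:
        raise ValueError("q must be nonzero")
    if min(draws) <= 0:
        raise ValueError("general entropy loss needs positive values of theta")
    mean_pow = sum(t ** (-q) for t in draws) / len(draws)
    return mean_pow ** (-1.0 / q)

# Hypothetical posterior draws, used only to exercise the formulas.
draws = [0.5, 1.0, 1.5, 2.0]
```

With these draws, `bayes_gel(draws, -1)` reproduces the posterior mean, `bayes_linex` with a very small `h` is close to it, and a positive `h` pulls the estimate below the mean because overestimation is penalized more heavily.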

Prior and Posterior Distribution
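It is assumed here that the parameters $\alpha$ and $\lambda$ are independent and follow gamma$(c, d)$ and gamma$(a, b)$ prior distributions, respectively, with $a > 0$, $b > 0$, $c > 0$ and $d > 0$. Therefore, the joint prior distribution of $\alpha$ and $\lambda$ is of the form:
$$\pi(\alpha, \lambda) \propto \alpha^{c-1} e^{-d\alpha}\, \lambda^{a-1} e^{-b\lambda}, \qquad \alpha > 0,\ \lambda > 0.$$
Based on the above joint prior distribution, the joint density of $\alpha$, $\lambda$ and $X$ can be written as the product of the likelihood function $L(\alpha, \lambda \mid x)$ and the prior:
$$\pi(\alpha, \lambda, x) = L(\alpha, \lambda \mid x)\, \pi(\alpha, \lambda).$$
Then, the posterior distribution of $\alpha$ and $\lambda$, given $X = x$, is obtained as:
$$\pi(\alpha, \lambda \mid x) = \frac{L(\alpha, \lambda \mid x)\, \pi(\alpha, \lambda)}{\int_0^\infty \int_0^\infty L(\alpha, \lambda \mid x)\, \pi(\alpha, \lambda)\, d\alpha\, d\lambda}.$$
Now, we obtain Bayes estimates of the entropy $H = H(\alpha, \lambda)$ under the SEL, LL and EL functions when the prior distribution is taken to be $\pi(\alpha, \lambda)$. The Bayes estimates of the entropy under the SEL, LL and EL functions are respectively obtained as:
$$\hat H_S = E(H \mid x), \qquad \hat H_L = -\frac{1}{h}\log\left[E\left(e^{-hH} \mid x\right)\right], \qquad \hat H_E = \left[E\left(H^{-q} \mid x\right)\right]^{-1/q},$$
where each expectation is taken with respect to $\pi(\alpha, \lambda \mid x)$.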

Lindley's Approximation
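In this subsection, based on a generalized progressive hybrid censored sample, we obtain several Bayes estimates of the entropy of a Weibull distribution. These Bayes estimates are derived under the SEL, LL and EL functions. It is easily observed that all of these estimates are in the form of a ratio of two integrals for which simplified closed forms are not available. Thus, we use Lindley's method to approximate all of the Bayes estimates. For the two-parameter case, $(\theta_1, \theta_2)$, Lindley's approximation can be written as:
$$\hat g = g(\hat\theta_1, \hat\theta_2) + \frac{1}{2}\left(A + l_{30}B_{12} + l_{03}B_{21} + l_{21}C_{12} + l_{12}C_{21}\right) + p_1 A_{12} + p_2 A_{21}, \tag{9}$$
where:
$$A = \sum_{i=1}^{2}\sum_{j=1}^{2} w_{ij}\tau_{ij}, \qquad l_{ij} = \frac{\partial^{i+j} l}{\partial\theta_1^{i}\,\partial\theta_2^{j}}\ (i + j = 3), \qquad p_i = \frac{\partial p}{\partial\theta_i}, \qquad w_i = \frac{\partial g}{\partial\theta_i}, \qquad w_{ij} = \frac{\partial^2 g}{\partial\theta_i\,\partial\theta_j},$$
$$p = \log\pi(\theta_1, \theta_2), \qquad A_{ij} = w_i\tau_{ii} + w_j\tau_{ji}, \qquad B_{ij} = (w_i\tau_{ii} + w_j\tau_{ij})\tau_{ii}, \qquad C_{ij} = 3w_i\tau_{ii}\tau_{ij} + w_j(\tau_{ii}\tau_{jj} + 2\tau_{ij}^2).$$
We further note that $l$ denotes the log-likelihood, $\tau_{ij}$ denotes the $(i, j)$-th element of the inverse of the matrix $[-\partial^2 l/\partial\theta_i\,\partial\theta_j]$ and all quantities are evaluated at the MLE $(\hat\theta_1, \hat\theta_2)$. Now, we compute $\hat g$ for our estimation problem with $(\theta_1, \theta_2) = (\alpha, \lambda)$; we have:
$$l_{30} = \frac{2J}{\alpha^3} - \lambda\left[\sum_{i=1}^{J} x_{i:m:n}^{\alpha}(1 + R_i)(\log x_{i:m:n})^3 + W^{(3)}(\alpha)\right], \qquad l_{03} = \frac{2J}{\lambda^3},$$
$$l_{21} = -\left[\sum_{i=1}^{J} x_{i:m:n}^{\alpha}(1 + R_i)(\log x_{i:m:n})^2 + W^{(2)}(\alpha)\right], \qquad l_{12} = 0,$$
where $W^{(r)}(\alpha) = T^{\alpha} R_{D+1}^* (\log T)^r$, $r = 1, 2, 3$, for Case II and $W^{(1)}(\alpha) = W^{(2)}(\alpha) = W^{(3)}(\alpha) = 0$ for Case I and Case III.
Furthermore, we have:
$$\tau_{11} = \frac{G}{UG - V^2}, \qquad \tau_{12} = \tau_{21} = -\frac{V}{UG - V^2}, \qquad \tau_{22} = \frac{U}{UG - V^2},$$
where $U = J/\alpha^2 + \lambda\left[\sum_{i=1}^{J} x_{i:m:n}^{\alpha}(1 + R_i)(\log x_{i:m:n})^2 + W^{(2)}(\alpha)\right]$, $G = J/\lambda^2$ and $V = \sum_{i=1}^{J} x_{i:m:n}^{\alpha}(1 + R_i)\log x_{i:m:n} + W^{(1)}(\alpha)$. For the gamma priors, $p_1 = (c - 1)/\alpha - d$ and $p_2 = (a - 1)/\lambda - b$. Now, we compute the approximate Bayes estimates based on Lindley's approximation, starting with the approximate Bayes estimator of entropy under the SEL function. For the Weibull distribution with pdf $\alpha\lambda x^{\alpha-1}e^{-\lambda x^{\alpha}}$, the entropy is $H(\alpha, \lambda) = 1 + \gamma(1 - 1/\alpha) - \log\alpha - (\log\lambda)/\alpha$, where $\gamma$ is Euler's constant. In this case, we observe that $g(\alpha, \lambda) = H(\alpha, \lambda)$ and:
$$w_1 = H_1 = \frac{\gamma + \log\lambda - \alpha}{\alpha^2}, \quad w_2 = H_2 = -\frac{1}{\alpha\lambda}, \quad w_{11} = H_{11} = \frac{\alpha - 2\gamma - 2\log\lambda}{\alpha^3}, \quad w_{12} = w_{21} = H_{12} = \frac{1}{\alpha^2\lambda}, \quad w_{22} = H_{22} = \frac{1}{\alpha\lambda^2}.$$
Using Equation (9), the approximate Bayes estimator of entropy under the SEL function is given by:
$$\hat H_S = H(\hat\alpha, \hat\lambda) + \frac{1}{2}\left(A + l_{30}B_{12} + l_{03}B_{21} + l_{21}C_{12}\right) + p_1 A_{12} + p_2 A_{21}.$$
Next, we compute the approximate Bayes estimator of entropy under the LL function; we observe that $g(\alpha, \lambda) = e^{-hH}$, so that:
$$w_i = -h e^{-hH} H_i, \qquad w_{ij} = h e^{-hH}\left(h H_i H_j - H_{ij}\right), \qquad i, j = 1, 2.$$
Using Equation (9), the approximate Bayes estimator of entropy under the LL function is given by:
$$\hat H_L = -\frac{1}{h}\log\left[e^{-hH(\hat\alpha, \hat\lambda)} + \frac{1}{2}\left(A + l_{30}B_{12} + l_{03}B_{21} + l_{21}C_{12}\right) + p_1 A_{12} + p_2 A_{21}\right].$$
Finally, we compute the approximate Bayes estimator of entropy under the EL function; we observe that $g(\alpha, \lambda) = H^{-q}$, so that:
$$w_i = -q H^{-q-1} H_i, \qquad w_{ij} = q H^{-q-2}\left[(q + 1) H_i H_j - H H_{ij}\right], \qquad i, j = 1, 2.$$
Using Equation (9), the approximate Bayes estimator of entropy under the EL function is given by:
$$\hat H_E = \left[H(\hat\alpha, \hat\lambda)^{-q} + \frac{1}{2}\left(A + l_{30}B_{12} + l_{03}B_{21} + l_{21}C_{12}\right) + p_1 A_{12} + p_2 A_{21}\right]^{-1/q}.$$
In each case, $A$, $A_{ij}$, $B_{ij}$ and $C_{ij}$ are computed from the $w_i$ and $w_{ij}$ of the corresponding function $g$.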

Bayes Estimation Based on Balanced Loss Function
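From a Bayes perspective, the choice of loss function is an essential part of estimation and prediction problems. Recently, a more general loss function, called the balanced loss function (Jozani et al. [17]), was introduced, of the form:
$$L_{\rho,w,\delta_0}(\theta, \delta) = w\,\rho(\delta_0, \delta) + (1 - w)\,\rho(\theta, \delta), \tag{13}$$
where $\rho$ is an arbitrary loss function, while $\delta_0$ is a chosen a priori target estimator of $\theta$, obtained, for instance, using the criterion of MLE. The loss $L_{\rho,w,\delta_0}$, which depends on the observed value of $\delta_0$, reflects a desire for the closeness of $\delta$ to both the target estimator $\delta_0$ and the unknown parameter $\theta$, with the relative importance of these criteria governed by the choice of $w \in [0, 1)$. A general development with regard to Bayes estimators under $L_{\rho,w,\delta_0}$ is given, namely by relating such estimators to Bayes solutions for the unbalanced case, i.e., $L_{\rho,w,\delta_0}$ with $w = 0$. $L_{\rho,w,\delta_0}$ can be specialized to various choices of loss function, such as the SEL, LL and EL functions (Ahmadi et al. [18]). By choosing $\rho(\theta, \delta) = (\delta - \theta)^2$, Equation (13) reduces to the balanced squared error loss (BSEL) function, in the form:
$$L(\theta, \delta) = w(\delta - \delta_0)^2 + (1 - w)(\delta - \theta)^2,$$
and the corresponding Bayes estimate of $\theta$ is obtained as:
$$\hat\theta_{BS} = w\delta_0 + (1 - w)E(\theta \mid x). \tag{14}$$
By choosing $\rho(\theta, \delta) = e^{h(\delta-\theta)} - h(\delta - \theta) - 1$; $h \neq 0$, Equation (13) reduces to the balanced linex loss (BLL) function, in the form:
$$L(\theta, \delta) = w\left[e^{h(\delta-\delta_0)} - h(\delta - \delta_0) - 1\right] + (1 - w)\left[e^{h(\delta-\theta)} - h(\delta - \theta) - 1\right],$$
and the corresponding Bayes estimate of $\theta$ is obtained as:
$$\hat\theta_{BL} = -\frac{1}{h}\log\left[w e^{-h\delta_0} + (1 - w)E\left(e^{-h\theta} \mid x\right)\right]. \tag{15}$$
By choosing $\rho(\theta, \delta) = (\delta/\theta)^q - q\log(\delta/\theta) - 1$; $q \neq 0$, Equation (13) reduces to the balanced general entropy loss (BEL) function, in the form:
$$L(\theta, \delta) = w\left[\left(\frac{\delta}{\delta_0}\right)^{q} - q\log\left(\frac{\delta}{\delta_0}\right) - 1\right] + (1 - w)\left[\left(\frac{\delta}{\theta}\right)^{q} - q\log\left(\frac{\delta}{\theta}\right) - 1\right],$$
and the corresponding Bayes estimate of $\theta$ is obtained as:
$$\hat\theta_{BE} = \left[w\delta_0^{-q} + (1 - w)E\left(\theta^{-q} \mid x\right)\right]^{-1/q}. \tag{16}$$
It is clear that the balanced loss functions are more general and include the maximum likelihood estimate and both symmetric and asymmetric Bayes estimates as special cases. Now, we compute approximate Bayes estimates based on the balanced loss function. Using Equations (14)-(16), with the MLE of the entropy $\hat H_{ML}$ as the target estimator and $\hat H_S$, $\hat H_L$ and $\hat H_E$ the approximate Bayes estimates under the SEL, LL and EL functions, the approximate Bayes estimates of the entropy under the BSEL, BLL and BEL functions are given by:
$$\hat H_{BS} = w\hat H_{ML} + (1 - w)\hat H_S, \qquad \hat H_{BL} = -\frac{1}{h}\log\left[w e^{-h\hat H_{ML}} + (1 - w)e^{-h\hat H_L}\right]$$
and
$$\hat H_{BE} = \left[w\hat H_{ML}^{-q} + (1 - w)\hat H_E^{-q}\right]^{-1/q}.$$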
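The way a balanced loss interpolates between a target estimator and the ordinary Bayes estimate can be seen in a few lines. This is a sketch under assumptions, not the paper's code: it takes the balanced loss as $w\rho(\delta_0,\delta) + (1-w)\rho(\theta,\delta)$ with the squared error, linex and general entropy choices of $\rho$, and it expects the caller to supply the unbalanced Bayes estimates (for example, Lindley approximations). The function names are hypothetical.

```python
import math

def balanced_sel(target, bayes_sel_est, w):
    """BSEL: convex combination of the target estimator and the SEL Bayes estimate."""
    return w * target + (1.0 - w) * bayes_sel_est

def balanced_linex(target, bayes_ll_est, w, h):
    """BLL: uses E[exp(-h*theta) | x] = exp(-h * bayes_ll_est) from the unbalanced case."""
    mix = w * math.exp(-h * target) + (1.0 - w) * math.exp(-h * bayes_ll_est)
    return -math.log(mix) / h

def balanced_gel(target, bayes_el_est, w, q):
    """BEL: uses E[theta^(-q) | x] = bayes_el_est^(-q); both inputs must be positive."""
    mix = w * target ** (-q) + (1.0 - w) * bayes_el_est ** (-q)
    return mix ** (-1.0 / q)

# Hypothetical values: an MLE of 1.2 as target and an unbalanced Bayes estimate of 1.0.
example = balanced_sel(1.2, 1.0, 0.3)
```

Setting `w = 0` returns the unbalanced Bayes estimate in all three functions, and letting `w` approach 1 moves the result toward the target estimator, which is the sense in which the balanced estimators contain both as special cases.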

Illustrative Example
For illustrative purposes, we present here a real data analysis using the proposed methods. The following dataset consists of the failure times of the air conditioning system of an airplane (Linhart and Zucchini [19]). This dataset was analyzed by Gupta and Kundu [20]. One question that arises is whether the data fit the Weibull distribution or not. To check the goodness of fit, we compute the Anderson-Darling statistic: it is 0.552, and the associated p-value is 0.159. Since the p-value is quite high, we cannot reject the null hypothesis that the data come from the Weibull distribution. The ordered data are as follows: 1, 3, 5, 7, 11, 11, 11, 12, 14, 14, 14, 16, 16, 20, 21, 23, 42, 47, 52, 62, 71, 71, 87, 90, 95, 120, 120, 225, 246, 261. From the above sample, we created artificial data by progressive Type II censoring. We have $n = 30$, and we took $m = 10$. For the Bayesian inference, the Bayes estimator based on the non-informative prior is obtained. Furthermore, the Bayes estimator based on the balanced loss function with $w = 0.3$, 0.5 and 0.7 is obtained. Table 1 presents the estimates of the entropy for the generalized progressive hybrid censored sample.
Table 1. Estimation of entropy as an example.

Simulation Results
In this section, a Monte Carlo simulation study is conducted to compare the performance of the different estimators. We consider different values of $n$, $m$, $k$ and $T$. We have used three different progressive Type II censoring schemes, Schemes I, II and III; in Scheme I, $R_m = n - m$ and $R_i = 0$ for $i \neq m$. Before progressing further, we first describe how we generate progressive Type II censored data for a given set $n$, $m$, $R_1, R_2, \ldots, R_m$. Let $Y_{1:m:n} \le Y_{2:m:n} \le \cdots \le Y_{m:m:n}$ denote a progressive Type II censored sample from the standard exponential distribution. We use the following transformation suggested in Balakrishnan and Aggarwala [9]:
$$Z_1 = nY_{1:m:n}, \quad Z_2 = (n - R_1 - 1)(Y_{2:m:n} - Y_{1:m:n}), \quad \ldots, \quad Z_m = (n - R_1 - \cdots - R_{m-1} - m + 1)(Y_{m:m:n} - Y_{m-1:m:n}). \tag{20}$$
It is known that if the $Y_{i:m:n}$'s form a progressive Type II censored sample from the standard exponential distribution, then the spacings $Z_i$'s are i.i.d. standard exponential random variables. From Equation (20), it follows that:
$$Y_{i:m:n} = \sum_{j=1}^{i} \frac{Z_j}{n - \sum_{l=1}^{j-1} R_l - j + 1}, \qquad i = 1, 2, \ldots, m.$$
Finally, we set $X_{i:m:n} = F^{-1}(1 - \exp(-Y_{i:m:n}))$, for $i = 1, 2, \ldots, m$, where $F^{-1}(\cdot)$ is the inverse cumulative distribution function of the Weibull distribution. Then, $X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n}$ is the required progressive Type II censored sample from the Weibull distribution.
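The generation steps above can be sketched as a short routine. It is a minimal illustration under assumptions: the Weibull cdf is taken as $F(x) = 1 - e^{-\lambda x^{\alpha}}$, so that $F^{-1}(1 - e^{-y}) = (y/\lambda)^{1/\alpha}$, and the function name and argument order are hypothetical. Only the standard library is used.

```python
import random

def progressive_type2_weibull(n, R, alpha, lam, rng):
    """Generate X_{1:m:n} <= ... <= X_{m:m:n} from an assumed Weibull(alpha, lam).

    n     : number of units on test
    R     : censoring scheme (R_1, ..., R_m) with sum(R) + m == n
    rng   : a random.Random instance, so that results are reproducible
    """
    m = len(R)
    if sum(R) + m != n:
        raise ValueError("scheme must satisfy R_1 + ... + R_m + m = n")
    y = 0.0          # running value of Y_{i:m:n}
    at_risk = n      # units on test just before the i-th failure
    sample = []
    for r_i in R:
        z = rng.expovariate(1.0)                    # i.i.d. standard exponential spacing Z_i
        y += z / at_risk                            # Y_i = Y_{i-1} + Z_i / (units at risk)
        sample.append((y / lam) ** (1.0 / alpha))   # Weibull inverse cdf at 1 - exp(-y)
        at_risk -= r_i + 1                          # one failure plus r_i removals
    return sample

# Example call: n = 10, m = 5, all removals at the last failure (Scheme I in the text).
xs = progressive_type2_weibull(10, [0, 0, 0, 0, 5], 1.0, 1.0, random.Random(0))
```

Because each spacing is divided by the number of units still on test, the cumulative sums are increasing by construction, so no sorting step is needed.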
Using Equation (20), generalized progressive hybrid censored data can be easily generated as follows. If $T < X_{k:m:n} < X_{m:m:n}$, then we have Case I, and the corresponding generalized progressive hybrid censored sample is $(X_{1:m:n}, X_{2:m:n}, \ldots, X_{k:m:n})$. If $X_{k:m:n} < T < X_{m:m:n}$, then we have Case II, and we find $D$, such that $X_{D:m:n} < T < X_{D+1:m:n}$. The corresponding generalized progressive hybrid censored sample is $(X_{1:m:n}, X_{2:m:n}, \ldots, X_{D:m:n})$. If $X_{k:m:n} < X_{m:m:n} < T$, then we have Case III, and the corresponding generalized progressive hybrid censored sample is $(X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n})$.
Without loss of generality, we take $\alpha = \lambda = 1$ in each case. We replicate the process 10,000 times in each case. The associated MLE is computed using the algorithm of Kundu [16]. All Bayes estimates are computed with respect to the non-informative prior distribution. This corresponds to the case when the hyperparameters take the values $a = b = c = d = 0$. Bayes estimates of entropy are derived with respect to six different loss functions: the SEL, LL, EL, BSEL, BLL and BEL functions. The Lindley approximation is then used to derive approximate explicit expressions for these estimates. Under LL, the associated estimates are obtained for $h = 1.0, 1.5, 2.0$. Under EL, the associated estimates are obtained for $q = 1.0, 1.5, 2.0$. Furthermore, the Bayes estimates with respect to the balanced loss function are computed for three distinct values of $w = 0.3, 0.5, 0.7$. Finally, different schemes have been taken into consideration to compute the MSE and bias values of all estimates, and these values are tabulated in Tables 2 and 3. We present the following discussion based on the MSEs.
In Tables 2 and 3, the MSE and bias values of all estimates of entropy are presented for various choices of $n$, $m$, $k$, $T$ and generalized progressive hybrid censoring schemes. We have tabulated the MSE and bias values of the respective MLE in the sixth column of the table. All other columns uniformly contain four values. The first value corresponds to the MSE of entropy using Lindley's approximation based on a non-informative prior. The last three values correspond to the MSE and bias of the Bayes estimates of entropy based on a non-informative prior under the balanced loss function.
In general, we observed that the MSE values decrease as the sample size $n$ increases. For a fixed sample size, the MSE values generally decrease as the number of progressively censored units $R_i$ decreases. Furthermore, we observed that the Bayes estimates are superior to the respective MLE in terms of MSE values. In particular, the respective Bayes estimates of entropy under LL and EL are better than the corresponding MLE. For estimating the entropy, the choice $h = 2$ seems to be a reasonable choice under LL. In the case of EL, the choice $q = 2$ seems to be a reasonable choice for Lindley's estimates. For estimating the entropy, the choice $w = 0.3$ seems to be a reasonable choice under BLL and BEL, while $w = 0.7$ is a good choice under BSEL. Overall, the Bayes estimator using the EL function based on the non-informative prior provides better estimates than the other estimators.
Table 2. The relative MSEs and biases of entropy estimators with MLE and the Bayes estimator (T = 0.5).

Conclusions
The disadvantage of the progressive Type II censoring scheme is that the time of the experiment can be very long if the units are highly reliable. Because of that, Kundu and Joarder [11] proposed a progressive hybrid censoring scheme in the context of a life-testing experiment in which $n$ identical units are placed in an experiment with the progressive Type II censoring scheme, and the experiment is terminated at time $\min\{X_{m:m:n}, T\}$. Under the progressive hybrid censoring scheme, the time of the experiment will be no more than $T$. One limitation of the progressive hybrid censoring scheme is that it cannot be applied when very few failures occur before time $T$. In that situation, the MLE of a parameter of the underlying distribution may not be computable, or its accuracy will be extremely low. For this reason, Cho et al. [7] proposed a generalized progressive hybrid censoring scheme, which allows us to observe a pre-specified number of failures.
In this paper, we discussed entropy estimators for the Weibull distribution based on generalized progressive hybrid censored samples. We derived entropy estimators by using the MLE and Bayes estimators and compared them in terms of their MSE. Bayes estimates using the non-informative prior are obtained under six types of loss function, and it is observed that the Bayes estimate with respect to the non-informative prior under the EL function works quite well in this case.
Although we focused on the entropy estimate of the Weibull distribution in this article, the proposed estimation can be easily extended to other distributions. In particular, the Bayes estimation can be applied to any other distribution. Estimation of the entropy of other distributions under generalized progressive hybrid censoring is of potential interest for future research.

Figure 1. Schematic representation of the generalized progressive hybrid censoring scheme.

Table 3. The relative MSEs and biases of entropy estimators with MLE and the Bayes estimators (T = 0.75).