Inference for Kumaraswamy Distribution under Generalized Progressive Hybrid Censoring

Abstract: In this paper, generalized progressive hybrid censoring is discussed; this scheme is designed to provide a flexible and symmetrical scenario for collecting failure information over the whole life cycle of units. When the lifetime of units follows the Kumaraswamy distribution, inference is investigated under classical and Bayesian approaches. The maximum likelihood estimates are derived, their existence and uniqueness are established, and approximate confidence intervals for the unknown parameters are provided based on large-sample asymptotic theory. Moreover, Bayes estimates along with highest posterior density credible intervals are developed, using a Markov chain Monte Carlo sampling technique to approximate the associated posteriors. Simulation studies and a real-life example are presented for illustration purposes.


Introduction
Sample size heavily affects the accuracy of estimation, yet complete life testing is often impractical because advances in manufacturing design and technology have yielded modern products with high reliability and long lifespans. In this situation, censoring schemes (CSs) are used for reasons such as time constraints and cost reduction. When a CS is applied, failure times are collected for only a part of the units on test. Many CSs are used in practice; the most common ones include the conventional Type-I/-II CSs and the progressive Type-I/-II CSs, where the latter allow engineers to remove units from the test at various stages. A lot of work has been carried out on these CSs in different situations. See, for example, the works of Balakrishnan and Han [1], Fernandez [2], Han and Kundu [3], Panahi and Sayyareh [4], Soliman [5], and Wang et al. [6]. For more details, see the monographs of Balakrishnan and Aggarwala [7] and Lawless [8], among others. In life testing and reliability experiments, however, one major drawback of these CSs is that there may be too few failures or, in the worst case, no failures at all, or that the experiment may take a long time to complete. In these situations, the inferential accuracy or the test efficiency may be heavily affected. The hybrid censoring scheme was therefore introduced to overcome this drawback; it can be regarded as a mixture of Type-I and Type-II censoring in the conventional and/or progressive cases, and includes the Type-I hybrid CS by Epstein [9], the Type-II hybrid CS by Childs et al. [10] and the progressive hybrid CS by Kundu and Joarder [11]. These hybrid CSs have received considerable attention in the literature. See, for example, the work of Balakrishnan and Kundu [12], Lin et al. [13], and Kundu [14], as well as the references therein.
In lifetime studies, many conventional models are used in practice, such as the exponential, Weibull, normal, and gamma distributions, among others. Although the survival time of a unit is usually treated as any value greater than zero, the lifetime cannot be infinite from a practical perspective. It is therefore appropriate to fit real-life data with bounded models, which may assign more weight to the failure data and provide better inferential accuracy. Motivated by these reasons, this paper discusses the statistical inference problem for a bounded lifetime distribution with range (0, 1), and classical and Bayesian approaches are used for parameter estimation under generalized progressive hybrid censoring.
The rest of the paper is organized as follows: Section 2 presents the data description and notation. Classical estimation is provided in Section 3. Section 4 discusses the corresponding Bayesian estimation. Simulation studies and an illustrative example are presented in Section 5. Finally, some conclusions are given in Section 6.

Preliminaries
Recently, Cho et al. [15] introduced a flexible hybrid CS called the generalized progressive hybrid CS, which is conducted as follows. Suppose n units are placed on test, and let k and m be integers fixed in advance with 1 ≤ k ≤ m ≤ n. Furthermore, a time point T and non-negative integers $r_1, r_2, \ldots, r_m$ are fixed in advance with $\sum_{i=1}^{m} r_i + m = n$. Following the procedure of the progressive censoring scheme, let $X_{i:m:n}$ denote the ith failure time; the test then stops at $T^* = \max\{X_{k:m:n}, \min\{T, X_{m:m:n}\}\}$, i.e.,
\[
T^* = \begin{cases}
X_{k:m:n}, & \text{if } T < X_{k:m:n} < X_{m:m:n}; \\
T, & \text{if } X_{k:m:n} < T < X_{m:m:n}; \\
X_{m:m:n}, & \text{if } X_{k:m:n} < X_{m:m:n} < T.
\end{cases}
\]
Following Cho et al. [15], the following types of failure times are observed under generalized progressive hybrid censoring:
\[
\begin{cases}
\text{Case I: } X_{1:m:n}, X_{2:m:n}, \ldots, X_{k:m:n}, & \text{if } T < X_{k:m:n} < X_{m:m:n}; \\
\text{Case II: } X_{1:m:n}, \ldots, X_{k:m:n}, \ldots, X_{d:m:n}, & \text{if } X_{k:m:n} < T < X_{m:m:n}; \\
\text{Case III: } X_{1:m:n}, \ldots, X_{k:m:n}, \ldots, X_{m:m:n}, & \text{if } X_{k:m:n} < X_{m:m:n} < T,
\end{cases}
\tag{1}
\]
where, in Case II, d denotes the number of failures observed before time T. A plot of the generalized progressive hybrid censoring scheme is shown in Figure 1. It is seen from (1) and Figure 1 that the scheme controls the testing flow through the variables k and T. Under such a mechanism, this censoring provides a flexible and symmetrical testing scheme that yields a pre-specified minimum number of failures, while the associated testing time is still taken into account in the experimental cycle. The scheme therefore not only saves testing time and cost, but also guarantees a certain number of failures in the life test, which helps to improve the efficiency of inference because more lifetime information is collected.
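The stopping rule and the three cases in (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `gph_censor` is hypothetical, and the sketch assumes that the observed data are obtained by truncating an already available progressively Type-II censored sample, which is how the simulation section of this paper describes the procedure.

```python
def gph_censor(x, k, T):
    """Apply generalized progressive hybrid censoring to a sorted
    progressively Type-II censored sample x = [X_{1:m:n}, ..., X_{m:m:n}].

    Returns (case, observed_failures, stop_time), where stop_time equals
    max(X_{k:m:n}, min(T, X_{m:m:n})).
    """
    m = len(x)
    if not 1 <= k <= m:
        raise ValueError("k must satisfy 1 <= k <= m")
    if T < x[k - 1]:
        # Case I: T is reached before the k-th failure, so the test
        # continues until X_{k:m:n}.
        return "I", x[:k], x[k - 1]
    if T < x[m - 1]:
        # Case II: at least k failures occurred before T, so the test
        # stops at T with d observed failures.
        d = sum(1 for xi in x if xi <= T)
        return "II", x[:d], T
    # Case III: all m failures occurred before T.
    return "III", x[:], x[m - 1]
```

For example, with five ordered failure times and k = 3, a small T leads to Case I (three failures kept), an intermediate T to Case II, and a large T to Case III.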
Suppose that the lifetimes of the test units have probability density function (PDF) f(·) and cumulative distribution function (CDF) F(·), with survival function S(·) = 1 − F(·). Under the generalized progressive hybrid censoring (1), the joint density function of the observed data is given by (2). Let X be a random variable from the Kumaraswamy distribution with parameters α > 0 and β > 0; the CDF and the PDF of X can be written as
\[
F(x; \alpha, \beta) = 1 - (1 - x^{\alpha})^{\beta} \quad \text{and} \quad f(x; \alpha, \beta) = \alpha \beta x^{\alpha - 1} (1 - x^{\alpha})^{\beta - 1}, \qquad 0 < x < 1. \tag{3}
\]
The distribution (3) was introduced by Kumaraswamy [16] as a better alternative to the beta model for describing hydrological phenomena, which is also illustrated by Nadarajah [17]. The Kumaraswamy distribution is a very flexible model whose density function can be unimodal (α > 1 and β > 1), uniantimodal (α < 1 and β < 1), increasing (α > 1 and β ≤ 1), decreasing (α ≤ 1 and β > 1) or constant (α = β = 1), depending on its parameters. Due to its flexibility, the Kumaraswamy distribution has attracted extensive attention in the literature and has been discussed by many authors. See, for example, the contributions of Jones [18], Nadarajah [17], Ghosh and Nadarajah [19], and Ponnambalam et al. [20]. In this paper, inference for the Kumaraswamy distribution under generalized progressive hybrid censoring is studied under classical and Bayesian procedures.
Suppose the failure times (1), obtained from n units on test, come from the Kumaraswamy population (3); the likelihood function of α and β is then given by (4).
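As a small illustration of the Kumaraswamy model, the sketch below codes its CDF, PDF and quantile function in Python. The closed forms are the standard ones for this distribution; the function names are chosen here for illustration only.

```python
def kum_cdf(x, alpha, beta):
    """Kumaraswamy CDF: F(x) = 1 - (1 - x^alpha)^beta for 0 < x < 1."""
    return 1.0 - (1.0 - x ** alpha) ** beta


def kum_pdf(x, alpha, beta):
    """Kumaraswamy PDF: f(x) = alpha*beta*x^(alpha-1)*(1-x^alpha)^(beta-1)."""
    return alpha * beta * x ** (alpha - 1.0) * (1.0 - x ** alpha) ** (beta - 1.0)


def kum_quantile(u, alpha, beta):
    """Inverse CDF, obtained by solving F(x) = u for x, with 0 < u < 1."""
    return (1.0 - (1.0 - u) ** (1.0 / beta)) ** (1.0 / alpha)
```

Because the quantile function is available in closed form, Kumaraswamy variates can be generated by inversion, for example `kum_quantile(random.random(), alpha, beta)`.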
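For orientation, one plausible form of this likelihood is sketched below. It is an assumption-based reconstruction rather than a quotation of the paper's displays: it combines the usual progressive-censoring likelihood with the Kumaraswamy functions, and the symbols $r_i^*$ and $r^*$, as well as the explicit forms of $w_1$ and $w_2$, are notation introduced here. The factorization is chosen so that it agrees with the expression $\hat\beta = -d^*/w_2(\hat\alpha)$ that appears in the proof of Theorem 1.

```latex
% Assumed notation (not taken from the source):
%   d^* = k, d, m in Cases I, II, III (number of observed failures);
%   r_i^* = r_i, except r_k^* = n - k - \sum_{i=1}^{k-1} r_i in Case I;
%   r^* = n - d - \sum_{i=1}^{d} r_i in Case II, and r^* = 0 otherwise.
\begin{align*}
L(\alpha, \beta)
  &\propto \prod_{i=1}^{d^*} f(x_{i:m:n}) \, [S(x_{i:m:n})]^{r_i^*} \; [S(T)]^{r^*} \\
  &= \alpha^{d^*} \beta^{d^*}
     \prod_{i=1}^{d^*} x_{i:m:n}^{\alpha - 1} (1 - x_{i:m:n}^{\alpha})^{\beta (r_i^* + 1) - 1}
     \; (1 - T^{\alpha})^{\beta r^*} \\
  &= \alpha^{d^*} \beta^{d^*} \, w_1(\alpha) \exp\{\beta \, w_2(\alpha)\},
\end{align*}
\begin{align*}
w_1(\alpha) &= \prod_{i=1}^{d^*} \frac{x_{i:m:n}^{\alpha - 1}}{1 - x_{i:m:n}^{\alpha}}, &
w_2(\alpha) &= \sum_{i=1}^{d^*} (r_i^* + 1) \ln(1 - x_{i:m:n}^{\alpha}) + r^* \ln(1 - T^{\alpha}).
\end{align*}
```

Under this sketch, $w_2(\alpha) < 0$ for every α > 0, so $-d^*/w_2(\alpha)$ is positive, as an estimate of β must be.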

Classical Inference
This section provides maximum likelihood estimators (MLEs) of unknown parameters, and associated approximate confidence intervals (ACIs) are also obtained by using the asymptotic theory of MLEs.

Maximum Likelihood Estimation
From (4), the likelihood function can be re-expressed as (5), and from (5) the corresponding log-likelihood function can be written as (6). The MLEs of the Kumaraswamy parameters α and β are given as follows.
Theorem 1. Suppose that the generalized progressively hybrid censored sample (1) is from the Kumaraswamy distribution (3) with parameters α and β. The MLE $\hat\beta$ of β is
\[
\hat\beta = -\frac{d^*}{w_2(\hat\alpha)},
\]
where $\hat\alpha$ is the MLE of α obtained from equation (7), whose case-wise sums (Cases I to III) are built from terms of the form $x_{i:m:n}^{\alpha} \ln x_{i:m:n} / (1 - x_{i:m:n}^{\alpha})$.
Proof. Taking derivatives of (6) with respect to α and β and equating them to zero, one directly obtains that the MLE of β is $\hat\beta = -d^*/w_2(\hat\alpha)$, and the associated MLE of α can be obtained from (7). In the following, we will show the existence and uniqueness of the resulting MLEs.
It is seen from (7) that the MLE of α cannot be obtained in closed form, so an iterative algorithm is needed to approximate it. Once the MLE of α is obtained, the MLE of β follows directly.
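One possible iterative scheme is sketched below in Python. It is not the authors' algorithm: it maximizes the profile log-likelihood of α by golden-section search on ln α and then plugs the result into $\hat\beta = -d^*/w_2(\hat\alpha)$. The sketch assumes a likelihood of the form $\alpha^{d^*}\beta^{d^*} w_1(\alpha)\exp\{\beta w_2(\alpha)\}$ with $w_1(\alpha) = \prod x_i^{\alpha-1}/(1-x_i^{\alpha})$ and $w_2(\alpha) = \sum (r_i+1)\ln(1-x_i^{\alpha}) + r^*\ln(1-T^{\alpha})$; these explicit forms, the argument `r_star` (number of units still alive at T in Case II) and all function names are assumptions made for illustration. The search also relies on the profile being unimodal, in line with the uniqueness property stated above.

```python
import math


def log_w1(alpha, x):
    # log of the assumed w1(alpha) = prod x_i^(alpha-1) / (1 - x_i^alpha)
    return sum((alpha - 1.0) * math.log(xi) - math.log1p(-xi ** alpha) for xi in x)


def w2(alpha, x, r, T=None, r_star=0):
    # assumed w2(alpha); r holds the effective removal counts for the
    # observed failures, r_star the survivors at T (Case II only)
    s = sum((ri + 1) * math.log1p(-xi ** alpha) for xi, ri in zip(x, r))
    if r_star > 0:
        s += r_star * math.log1p(-T ** alpha)
    return s


def profile_loglik(alpha, x, r, T=None, r_star=0):
    # log-likelihood with beta replaced by -d/w2(alpha)
    d = len(x)
    beta_hat = -d / w2(alpha, x, r, T, r_star)
    return d * math.log(alpha) + d * math.log(beta_hat) + log_w1(alpha, x) - d


def kum_mle(x, r, T=None, r_star=0, lo=1e-2, hi=1e2, tol=1e-8):
    """Golden-section search for the maximizer of the profile log-likelihood."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    f = lambda t: profile_loglik(math.exp(t), x, r, T, r_star)
    a, b = math.log(lo), math.log(hi)
    c, e = b - g * (b - a), a + g * (b - a)
    fc, fe = f(c), f(e)
    while b - a > tol:
        if fc < fe:  # maximum lies in [c, b]
            a, c, fc = c, e, fe
            e = a + g * (b - a)
            fe = f(e)
        else:        # maximum lies in [a, e]
            b, e, fe = e, c, fc
            c = b - g * (b - a)
            fc = f(c)
    alpha_hat = math.exp(0.5 * (a + b))
    beta_hat = -len(x) / w2(alpha_hat, x, r, T, r_star)
    return alpha_hat, beta_hat
```

For a complete sample, one would call `kum_mle(x, [0] * len(x))`; for a Case II sample, `T` and `r_star` are supplied as well. A Newton-type iteration on the score equation would be an equally reasonable choice.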

Approximate Confidence Intervals
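From (6), the second derivatives of the log-likelihood with respect to α and β can be obtained separately for Cases I, II and III; they involve terms such as $x_{i:m:n}^{\alpha} \ln^{2} x_{i:m:n}$. Under some mild regularity conditions, the asymptotic distribution of the MLE $(\hat\alpha, \hat\beta)$ is
\[
(\hat\alpha, \hat\beta)^{\top} - (\alpha, \beta)^{\top} \to N\big(0, I^{-1}(\hat\alpha, \hat\beta)\big),
\]
where $I^{-1}(\hat\alpha, \hat\beta)$ is the inverse of the observed information matrix
\[
I(\hat\alpha, \hat\beta) = -\begin{pmatrix}
\dfrac{\partial^2 \ell}{\partial \alpha^2} & \dfrac{\partial^2 \ell}{\partial \alpha \, \partial \beta} \\
\dfrac{\partial^2 \ell}{\partial \beta \, \partial \alpha} & \dfrac{\partial^2 \ell}{\partial \beta^2}
\end{pmatrix}_{(\alpha, \beta) = (\hat\alpha, \hat\beta)},
\]
with ℓ denoting the log-likelihood (6). For arbitrary 0 < γ < 1, the 100(1 − γ)% ACIs of α and β can be expressed as
\[
\hat\alpha \pm z_{\gamma/2} \sqrt{\mathrm{Var}(\hat\alpha)} \quad \text{and} \quad \hat\beta \pm z_{\gamma/2} \sqrt{\mathrm{Var}(\hat\beta)},
\]
where $\mathrm{Var}(\hat\alpha)$ and $\mathrm{Var}(\hat\beta)$ are the diagonal elements of $I^{-1}(\hat\alpha, \hat\beta)$ and $z_{\gamma}$ is the upper γ-th quantile of the standard normal distribution. Sometimes the above ACIs may have negative lower bounds. To solve this problem, the asymptotic normal distributions of $\ln\hat\alpha$ and $\ln\hat\beta$ are approximated by the delta method as
\[
\frac{\ln\hat\alpha - \ln\alpha}{\sqrt{\mathrm{Var}(\ln\hat\alpha)}} \to N(0, 1) \quad \text{and} \quad \frac{\ln\hat\beta - \ln\beta}{\sqrt{\mathrm{Var}(\ln\hat\beta)}} \to N(0, 1),
\]
with $\mathrm{Var}(\ln\hat\alpha) = \mathrm{Var}(\hat\alpha)/\hat\alpha^{2}$ and $\mathrm{Var}(\ln\hat\beta) = \mathrm{Var}(\hat\beta)/\hat\beta^{2}$. Therefore, the 100(1 − γ)% ACIs of ln α and ln β are
\[
(A_1, A_2) = \Big(\ln\hat\alpha - z_{\gamma/2}\sqrt{\mathrm{Var}(\ln\hat\alpha)}, \; \ln\hat\alpha + z_{\gamma/2}\sqrt{\mathrm{Var}(\ln\hat\alpha)}\Big), \quad
(B_1, B_2) = \Big(\ln\hat\beta - z_{\gamma/2}\sqrt{\mathrm{Var}(\ln\hat\beta)}, \; \ln\hat\beta + z_{\gamma/2}\sqrt{\mathrm{Var}(\ln\hat\beta)}\Big),
\]
which further implies that the 100(1 − γ)% ACIs of α and β can be constructed as $(e^{A_1}, e^{A_2})$ and $(e^{B_1}, e^{B_2})$, respectively.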
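The two interval constructions can be illustrated with a short Python sketch. It assumes that the observed information matrix has already been computed (analytically or numerically) and uses the standard delta-method variance Var(ln θ̂) ≈ Var(θ̂)/θ̂²; the helper names are hypothetical.

```python
import math
from statistics import NormalDist


def invert_2x2(info):
    """Inverse of a 2x2 observed information matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = info
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def aci(est, var, gamma=0.05):
    """Normal-approximation interval: est +/- z_{gamma/2} * sqrt(var)."""
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    half = z * math.sqrt(var)
    return est - half, est + half


def aci_log(est, var, gamma=0.05):
    """Interval built on the log scale and mapped back by exp().

    Uses Var(ln est) ~ var / est^2 (delta method), so both bounds are
    positive whenever est > 0.
    """
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    half = z * math.sqrt(var) / est
    return est * math.exp(-half), est * math.exp(half)
```

With an estimate of 0.5 and a variance of 0.09, for instance, the plain interval has a negative lower bound, whereas the log-transformed interval stays inside the positive half-line.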

Bayesian Inference
In this section, Bayes estimates and the associated highest posterior density (HPD) credible intervals are constructed for unknown Kumaraswamy parameters.
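Following the idea of Ghosh and Nadarajah [19], the parameters α and β are assumed to follow independent gamma priors with densities
\[
\pi_1(\alpha) \propto \alpha^{a_1 - 1} e^{-b_1 \alpha} \quad \text{and} \quad \pi_2(\beta) \propto \beta^{a_2 - 1} e^{-b_2 \beta}, \qquad \alpha, \beta > 0.
\]
Therefore, the joint prior density of (α, β) is given by
\[
\pi(\alpha, \beta) \propto \alpha^{a_1 - 1} \beta^{a_2 - 1} \exp\{-b_1 \alpha - b_2 \beta\}, \tag{8}
\]
and the posterior density of α and β can be obtained from (5) and (8) as
\[
\pi(\alpha, \beta \mid x) = \frac{L(\alpha, \beta) \, \pi(\alpha, \beta)}{\int_0^\infty \int_0^\infty L(\alpha, \beta) \, \pi(\alpha, \beta) \, d\alpha \, d\beta}. \tag{9}
\]
Under squared error loss, for an arbitrary function η(α, β) of α and β, the Bayes estimator of η(α, β) is the posterior expectation
\[
\hat\eta(\alpha, \beta) = \frac{\int_0^\infty \int_0^\infty \eta(\alpha, \beta) \, L(\alpha, \beta) \, \pi(\alpha, \beta) \, d\alpha \, d\beta}{\int_0^\infty \int_0^\infty L(\alpha, \beta) \, \pi(\alpha, \beta) \, d\alpha \, d\beta}.
\]
It is evident that there is no closed form for the Bayes estimator $\hat\eta(\alpha, \beta)$; thus, a numerical technique should be employed to approximate the associated estimate.
Ignoring the normalizing constant, the posterior density of (α, β) from (9) can be expressed, up to proportionality, as (10). It follows directly from (10) that the conditional posterior density of β given α is the gamma density (11), and that the conditional posterior density of α given β is given by (12). Hence random values of β can be generated directly from the gamma distribution (11). However, the conditional posterior distribution of α cannot be reduced analytically to a well-known distribution. Following Devroye [21] and Kizilaslan and Nadar [22], one can use the Metropolis-Hastings method with a normal proposal distribution to generate random samples of α from (12).
Note that when $a_1 = b_1 = a_2 = b_2 = 0$, the prior (8) is not proper, but the conditional posterior densities π(α|β, x) and π(β|α, x) are still proper. Hence, the proposed MCMC method (Algorithm 1) can still be used to find the corresponding Bayes estimates.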

Algorithm 1 MCMC sampling algorithm.
Step 1 Use the MLEs or any other method to obtain estimates of α and β as the starting point of the iteration; denote these estimates by α^(0) and β^(0).
Step 3 Continue the iterative procedure in this way, repeating Step 2 K times.
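A sampler of this kind can be sketched in Python as follows. The sketch is an illustration under stated assumptions, not a transcription of Algorithm 1: it assumes gamma priors with shape and rate hyper-parameters (a1, b1) and (a2, b2), a likelihood of the form $\alpha^{d^*}\beta^{d^*} w_1(\alpha)\exp\{\beta w_2(\alpha)\}$ with the explicit $w_1$ and $w_2$ coded below, a gamma update for β given α, and a random-walk Metropolis-Hastings update with a normal proposal for α. The function names, the proposal standard deviation `step` and the HPD routine (shortest interval covering a fraction 1 − γ of the sorted draws) are choices made here.

```python
import math
import random


def log_w1(alpha, x):
    # log of the assumed w1(alpha) = prod x_i^(alpha-1) / (1 - x_i^alpha)
    return sum((alpha - 1.0) * math.log(xi) - math.log1p(-xi ** alpha) for xi in x)


def w2(alpha, x, r, T=None, r_star=0):
    # assumed w2(alpha) = sum (r_i+1) ln(1-x_i^alpha) + r_star ln(1-T^alpha)
    s = sum((ri + 1) * math.log1p(-xi ** alpha) for xi, ri in zip(x, r))
    if r_star > 0:
        s += r_star * math.log1p(-T ** alpha)
    return s


def kum_mcmc(x, r, T=None, r_star=0, a1=0.001, b1=0.001, a2=0.001, b2=0.001,
             n_iter=5000, burn_in=1000, step=0.1, alpha0=1.0, seed=None):
    """Gibbs step for beta, Metropolis-Hastings step for alpha."""
    rng = random.Random(seed)
    d = len(x)
    alpha = alpha0
    lw1, w2v = log_w1(alpha, x), w2(alpha, x, r, T, r_star)
    draws = []
    for j in range(n_iter):
        # beta | alpha, x ~ Gamma(shape = d + a2, rate = b2 - w2(alpha));
        # random.gammavariate takes a scale, hence the reciprocal.
        beta = rng.gammavariate(d + a2, 1.0 / (b2 - w2v))
        # Normal random-walk proposal for alpha; non-positive proposals
        # have zero posterior density and are rejected.
        prop = rng.gauss(alpha, step)
        if prop > 0.0:
            lw1_p, w2_p = log_w1(prop, x), w2(prop, x, r, T, r_star)
            log_ratio = ((d + a1 - 1.0) * (math.log(prop) - math.log(alpha))
                         - b1 * (prop - alpha) + (lw1_p - lw1) + beta * (w2_p - w2v))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                alpha, lw1, w2v = prop, lw1_p, w2_p
        if j >= burn_in:
            draws.append((alpha, beta))
    return draws


def hpd_interval(samples, gamma=0.05):
    """Shortest interval spanning int((1 - gamma) * n) steps of the sorted draws."""
    s = sorted(samples)
    n = len(s)
    width = int((1.0 - gamma) * n)
    best = min(range(n - width), key=lambda i: s[i + width] - s[i])
    return s[best], s[best + width]
```

Under squared error loss, the Bayes estimates are approximated by the averages of the retained draws, and `hpd_interval` applied to the α draws or the β draws gives the corresponding credible interval. The proposal scale usually needs tuning to the data at hand.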

Simulation Studies
In this section, simulation studies are conducted to evaluate the performance of the point and interval estimates under different choices of n, m, k, T, R = {r_i : i = 1, 2, . . . , m} and values of α and β. Point estimates are compared in terms of absolute bias (AB) and mean squared error (MSE), and interval estimates in terms of average width (AW) and coverage probability (CP). In addition, the following censoring schemes are considered in the simulation: CS I: r_1 = r_2 = · · · = r_{m−1} = 0 and r_m = n − m; CS II: r_1 = n − m and r_2 = · · · = r_m = 0; and a third scheme, CS III. In this simulation, the algorithm proposed by Balakrishnan and Sandhu [23] is used to generate a progressively Type-II censored sample, and the corresponding generalized progressively hybrid censored sample is then obtained by comparing the prefixed T and k with the generated progressively Type-II censored data. The simulation was conducted with 10,000 repetitions, and the results are presented in Tables 1-8, where the confidence level for interval estimates is 1 − γ = 0.95. Since there is no prior information, all hyper-parameters of the Bayesian estimation are set to 0.001 (instead of 0); these priors are proper but almost non-informative. Tables 1, 3, 5 and 7 show that as n, m, k, T or any combination of them increases, the ABs and MSEs of both the MLEs and the Bayes estimates decrease for parameters α and β; for given n, m, k and T, the ABs and MSEs of the MLEs of each parameter are similar under CSs I, II and III. A similar phenomenon also appears for the Bayes estimates. Furthermore, the Bayes estimates under the non-informative prior generally perform slightly better than the MLEs in terms of ABs and MSEs. Meanwhile, for the interval estimates shown in Tables 2, 4, 6 and 8, one can observe that the CPs increase when n, m, k, T or any combination of them increases, that the AWs decrease correspondingly, and that both ACIs and HPD credible intervals have similar CPs and AWs under CSs I, II and III.
Moreover, it is noted that, in most cases, the CPs of both ACIs and HPD credible intervals are close to the nominal level. Under the same sample setting, the AWs of the HPD credible intervals obtained under the non-informative prior are slightly shorter than those of the ACIs for each parameter.
Overall, the simulation results show that the performance of both the MLEs and the Bayes estimates is satisfactory; although the Bayes estimates are obtained under almost non-informative priors, they are slightly superior to the MLEs in terms of ABs and MSEs. Meanwhile, according to the AWs and CPs of the interval estimates, if one wishes to obtain interval estimates with slightly shorter AWs, HPD credible intervals may be an appropriate choice; otherwise, if one prefers an interval whose CP is closer to the nominal level and the width of the interval is not a major concern, ACIs can be used to provide a balance between CPs and AWs.
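The sample-generation step used in this section can be illustrated as follows. The sketch implements the uniform-transformation algorithm of Balakrishnan and Sandhu for progressively Type-II censored samples and maps the resulting uniform order statistics through the Kumaraswamy quantile function; the function name and argument layout are chosen here for illustration, and the generalized progressive hybrid censored sample would then be obtained by comparing T and k with the output.

```python
import random


def progressive_type2_sample(n, r, alpha, beta, rng=None):
    """Generate a progressively Type-II censored Kumaraswamy sample.

    n    : number of units on test
    r    : removal counts r_1, ..., r_m with sum(r) + m == n
    Returns the ordered failure times X_{1:m:n} <= ... <= X_{m:m:n}.
    """
    m = len(r)
    if sum(r) + m != n:
        raise ValueError("removal counts must satisfy sum(r) + m == n")
    rng = rng or random.Random()
    w = [rng.random() for _ in range(m)]
    # V_i = W_i^(1 / (i + r_m + r_{m-1} + ... + r_{m-i+1})), i = 1, ..., m
    v, tail = [], 0
    for i in range(1, m + 1):
        tail += r[m - i]
        v.append(w[i - 1] ** (1.0 / (i + tail)))
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1} is a progressively
    # censored sample from the uniform(0, 1) distribution.
    u, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= v[m - i]
        u.append(1.0 - prod)
    # Kumaraswamy quantile function: x = (1 - (1 - u)^(1/beta))^(1/alpha)
    return [(1.0 - (1.0 - ui) ** (1.0 / beta)) ** (1.0 / alpha) for ui in u]
```

For CS I with n = 20 and m = 10, for example, the removal vector is `[0] * 9 + [10]`, and for CS II it is `[10] + [0] * 9`.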

Illustrated Examples
The monthly water capacity data of the Shasta reservoir in California, USA, for the month of February from 1991 to 2010, taken from Nadar et al. [24], are used for illustration. Since the maximum capacity of the reservoir is 4,552,000 acre-feet, the expression t = (x − x_min)/(x_max − x_min) is used to convert the original data into [0, 1]; here x_min and x_max represent the lower and upper bounds of the original variable x, respectively, and t is the corresponding transformed value. The original and the transformed monthly capacity data are listed in Table 9, where the first numbers in brackets are the actual values of monthly capacity and the second ones are the associated proportions of total capacity.
Table 9. Monthly capacity for August and proportion of total capacity for Shasta reservoir.
Before further investigation, we first check whether the Kumaraswamy distribution provides a proper fit for the real data. By computation, the Kolmogorov-Smirnov distance is 0.1709 and the corresponding p-value is 0.4816. Therefore, the Kumaraswamy distribution is a proper model for this data set. In addition, based on the complete capacity proportion data, the MLEs of α and β are 6.3474 and 4.4892, respectively. The empirical cumulative distribution with the fitted Kumaraswamy distribution (left panel of Figure 2) and the probability-probability (P-P) plot (right panel of Figure 2) are provided as well, and they also suggest that the Kumaraswamy distribution is a suitable model. Based on Table 9, three groups of generalized progressively hybrid censored samples are generated and shown in Table 10. Using the proposed methods, the various point and interval estimates are obtained and given in Table 11. Since we do not have any prior information about the unknown parameters, the Bayes estimates are obtained under an almost non-informative prior with all hyper-parameters equal to 0.001; the interval estimates are obtained at a significance level of 0.05, and the interval lengths are provided in square brackets.
From Table 11, it is observed that the MLEs and the Bayes estimates are close to each other under the different data sets, which indicates that the classical and Bayes results have similar performance in general. In addition, the estimated CDFs based on the MLEs and the Bayes estimates are provided in Figure 3, and the plots also show similar performance for each data set. Moreover, one can also note that the Bayesian credible intervals of the unknown parameters are superior to the ACIs in terms of interval length.
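The data transformation and the goodness-of-fit distance used in this example can be sketched as follows. The code is illustrative only: it contains none of the reservoir data, the function names are hypothetical, and the Kolmogorov-Smirnov distance is computed with the standard one-sample formula against a user-supplied CDF.

```python
def min_max_transform(values, x_min, x_max):
    """Map raw values into [0, 1] via t = (x - x_min) / (x_max - x_min)."""
    return [(x - x_min) / (x_max - x_min) for x in values]


def ks_distance(data, cdf):
    """One-sample Kolmogorov-Smirnov distance sup |F_n(x) - F(x)|.

    data : observed values
    cdf  : callable returning the hypothesized CDF at a point
    """
    s = sorted(data)
    n = len(s)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(s)
    )


def kum_cdf(x, alpha, beta):
    """Kumaraswamy CDF, F(x) = 1 - (1 - x^alpha)^beta."""
    return 1.0 - (1.0 - x ** alpha) ** beta
```

With the transformed proportions in a list `t` and fitted values `a_hat` and `b_hat`, the distance would be obtained as `ks_distance(t, lambda x: kum_cdf(x, a_hat, b_hat))`.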

Conclusions
In this paper, inference is considered for the Kumaraswamy distribution under generalized progressive hybrid censoring. Under the classical approach, the existence and uniqueness of the MLEs of the unknown parameters are established; under the Bayesian approach, a Markov chain Monte Carlo sampling method is used to approximate the Bayes estimates and the HPD credible intervals. The simulation studies and the real-life example show that both the classical and the Bayesian methods work satisfactorily, and that the Bayesian approach performs slightly better than classical estimation in terms of ABs, MSEs and interval widths. For further study, the optimal design of the sampling scheme under generalized progressive hybrid censoring also seems interesting and will be discussed in future research.