The Exponentiated Lindley Geometric Distribution with Applications

We introduce a new three-parameter lifetime distribution, the exponentiated Lindley geometric distribution, which exhibits increasing, decreasing, unimodal, and bathtub shaped hazard rates. We provide statistical properties of the new distribution, including shape of the probability density function, hazard rate function, quantile function, order statistics, moments, residual life function, mean deviations, Bonferroni and Lorenz curves, and entropies. We use maximum likelihood estimation of the unknown parameters, and an Expectation-Maximization algorithm is also developed to find the maximum likelihood estimates. The Fisher information matrix is provided to construct the asymptotic confidence intervals. Finally, two real-data examples are analyzed for illustrative purposes.


Introduction
Suppose that a company has N systems functioning independently and producing a certain product at a given time, where N is a random variable determined by economy, customers demand, etc. The reason for considering N as a random variable comes from a practical viewpoint, because failure (of a device for example) often occurs due to the present of an unknown number of initial defects in the system. In this paper, we consider the case in which N is taken to be a geometric random variable with the probability mass function given by P(N = n) = (1 − p)p n−1 , for 0 < p < 1 and n is a positive integer. We may take N to follow other discrete distributions, such as binomial, Poisson, etc, whereas they need to be truncated 0 because one must have N ≥ 1. Another rationale by taking N to be a geometric random variable is that the "optimum" number can be interpreted as "number to event", matching up with the definition of a geometric random variable, as commented by [1]. The geometric distribution has been widely used for the number of "systems" in the literature; see, for example, [2,3]. It has also been adopted to obtain some new class of distributions; see [4] for the exponential geometric (EG) distribution, [5] for the exponentiated exponential geometric (EEG) distribution, [6] for the Weibull geometric distribution, [1] for the geometric exponential Poisson (GEP) distribution, to name just a few.
On the other hand, we assume that each of N systems is made of α parallel components, and therefore, the system will completely shutdown if all of the components fail. Meanwhile, we assume that the failure times of the components for the ith system, denoted by Z i1 , . . . , Z iα , are independent and identically distributed (iid) with the cumulative distribution function (cdf) G(z) and the probability density function (pdf) g(z). For simplicity of notation, let Y i stand for the failure time of the ith system and X denote the time to failure of the first out of the N functioning systems, i.e., X = min(Y 1 , . . . , Y N ). Then it can be seen from [5] that the conditional cdf of X given N is given by and the unconditional cdf of X can thus be written as The new class of distribution in (1) depends on the cdf of the failure times of the components in the system, which may follow some continuous probability distributions, such as the exponential, Lindley, and Weibull distributions. As an illustration, if the failure times of the components for the ith system are iid exponential random variables with the rate parameter λ, i.e., G(z) = 1 − e −λz , then we obtain the EEG distribution due to [5]. Its cdf is given by Please note that in reliability engineering and lifetime analysis, we often assume that the failure times of the components within each system follow the exponential lifetimes; see, for example [4,5,7], among others. This assumption may be unreasonable because the hazard rate of the exponential distribution is a constant, whereas some real-life systems may not have constant hazard rates, and the components of a system are often more rigid than the system itself. Accordingly, it becomes reasonable to consider the components of a system following a distribution with a non-constant hazard function that has flexible hazard function shapes.
In this paper, we propose a new three-parameter lifetime distribution by compounding the Lindley and geometric distributions based on the new class of distribution in (1). The Lindley distribution was first proposed by [8] in the context of Bayesian statistics, as a counterexample of fiducial statistics. It has recently received considerable attention as an appropriate model to analyze lifetime data especially in applications modeling stress-strength reliability; see, for example, [9][10][11]. Ghitany et al. [12] argue that the Lindley distribution could be a better lifetime model than the exponential distribution through a numerical example and show that the hazard function of the Lindley distribution does not exhibit a constant hazard rate, indicating the flexibility of the Lindley distribution over the exponential distribution. These observations motivate us to study the structure properties of the distribution in (1) when the failure times of the units for the ith system are iid Lindley random variables with the parameter θ, i.e., where the parameter θ > 0. Its corresponding cdf is given by where the parameters α > 0, θ > 0, and 0 < p < 1. We call the distribution as the exponentiated Lindley geometric (ELG) distribution. Indeed, it is necessary to compute the entropy measure for ELG distribution under the assumption that errors are non-Gaussian distributed (e.g., [13]). Other motivations of the ELG distribution are briefly summarized as follows. (i) It contains several lifetime distributions as special cases, such as the Lindley-geometric (LG) distribution due to [14] when α = 1. (ii) It can be viewed as a mixture of exponentiated Lindley distributions introduced by [15]. (iii) The ELG distribution is a flexible model which can be widely used for modeling lifetime data in reliability and survival analysis. (iv) It exhibits monotonically increasing, decreasing, unimodal (upper-down bathtub), and bathtub shaped hazard rates but does not exhibit a constant hazard rate, which makes the ELG distribution to be superior to other lifetime distributions, which exhibit only monotonically increasing/decreasing, or constant hazard rates. The remainder of the paper is organized as follows. In Section 2, we discuss various statistical properties of the new distribution. The maximum-likelihood estimation is considered in Section 3, and an EM algorithm is proposed to find the maximum likelihood estimates because they cannot be obtained in closed form. The maximum-likelihood estimation for censored data is also discussed briefly. In Section 4, two real-data applications are provided for illustrative purposes. Some concluding remarks are given in Section 5.

Properties of the ELG distribution
We provide statistical properties of the ELG distribution. These include the pdf and its shape (Section 2.1), hazard rate function and its shape (Section 2.2), quantile function (Section 2.3), order statistics (Section 2.4), expressions for the nth moments (Section 2.5), residual life function (Section 2.6), mean deviations (Section 2.7), Bonferroni and Lorenz curves (Section 2.8), and entropies (Section 2.9).

Probability Density function
The corresponding pdf of the ELG distribution corresponding to the cdf in (4) is given by for x > 0, α > 0, θ > 0, and 0 < p < 1.
It should be noted that the pdf in (5) is still a well-defined density function when p ≤ 0. Thus, we can define the ELG distribution in (5) to any p < 1. As mentioned in Section 1, the ELG distribution includes several special submodels. When α = 1, it becomes the LG distribution due to [14]. When p = 0 and α = 1, it turns out to be the Lindley distribution due to [8]. It converges a distribution degenerating at the point 0 when p → 1 − . Figure 1 displays the pdf of the ELG distribution in (5) with selected values of α, θ, and p. We observe from Figure 1 that the shape of the pdf is monotonically decreasing with the modal value of ∞ at x = 0 when α < 1 and the shape of the pdf appears upside-down bathtub for α > 1. When α = 1, we observe that the shape exhibits monotonically decreasing as well as unimodal. This observation coincides with Theorem 1 of [14], which states that the density function of the LG distribution is (i) decreasing for all values of p and θ for which p > 1−θ 2 1+θ 2 , (ii) unimodal for all values of p and θ for which p ≤ 1−θ 2 1+θ 2 .

Hazard Rate Function
The failure rate function, also known as the hazard rate (hf) function, is an important characteristic for lifetime modeling. For a continuous distribution with the cdf F(x) and the pdf f (x), its failure rate function is defined as is the survival function of X. The hf of the ELG distribution is given by for x > 0, α > 0, θ > 0, and p < 1. Figure 2 depicts shapes of the hf with selected values of α, θ, and p. We observe that the hf of the ELG distribution is quite flexible. For example, the shape appears monotonically decreasing if α is sufficiently small and p is not sufficiently large. The shape appears monotonically increasing for small p and large α. The shape appears bathtub-shaped or first increases then bathtub-shaped for α = 1. We may conclude that the ELG distribution exhibits increasing, decreasing, upside-down bathtub, and bathtub shaped failure hazard rates, but does not exhibit a constant hazard rate. Note also that as x → 0, the initial hf behaves as h( for α = 1, and h(0) = 0 for α > 1.

Quantile Function
Let Z denote a Lindley random variable with the cdf in (3). We observe from [16] that the quantile function of the Lindley distribution is where 0 < u < 1 and W −1 (·) denotes the negative branch of the Lambert W function (i.e., the solution of the equation W(z)e W(z) = z), which can be calculated by using the Lambert-W function in the R package lamW; see [17] in detail. Let X be a ELG random variable with the cdf F(x) in (4). By inverting F(x) = u for 0 < u < 1, we obtain u − up 1 − up It follows from Equation (7) that the quantile function of the ELG distribution is given by is also unique. Thus, one can use Equation (8) for generating random data from the ELG distribution.
In particular, the quartiles of the ELG distribution, respectively, are given by

Order Statistics
Suppose X 1 , . . . , X n is a random sample from the ELG distribution. Let X (1) < X (2) < · · · < X (n) be the corresponding order statistics. The pdf for the rth order statistic of the ELG distribution, say Y = X (r) , is given by The corresponding cdf of Y is given by In practice, we may be interested in studying the asymptotic distribution of the extreme values X (1) and X (n) . By using L'Hospital's rule, we have In addition, by using L'Hospital's rule, it can be easily shown that By following Theorem 1.6.2 in [18] we observe that there must be some normalizing constants a n > 0, b n , c n > 0, and d n , such that Pr a n (X (1) as n → ∞. The form of the normalizing constants can be determined by using Corollary 1.6.3 in [18]. As an illustration, one can see that a n = θ and b n = F −1 (1 − 1/n), where F −1 (·) denotes the inverse function of F(·).

Moment Properties
Many important features of a distribution can be characterized through its moments, such as dispersion, skewness, and kurtosis. To derive the nth moment of the ELG distribution, we consider the Taylor series expansion of the form which converges for |x| < 1. This provides that where ( −1 k ) is the generalized binomial coefficient. Therefore, we can rewrite Equation (4) as We observe that the ELG distribution is a mixture of exponentiated Lindley distributions introduced by [15], i.e., They show that if Y is an exponentiated Lindley random variable with parameters θ and β, the nth moment and the moment generating function of Y are, respectively, given by and By using Equation (11), we obtain the nth moment of X can be rewritten as for n = 1, 2, . . . Equation (12) can be adopted to compute the third and fourth central moments of the ELG distribution, which are then used to define skewness and kurtosis, respectively. For instance, based on the first four moments of the ELG distribution, the measures of skewness γ and kurtosis κ of the ELG distribution are, respectively, given by The moment generating function of the ELG distribution, denoted by M X (t), is given by Thereafter, we can use M X (t) to obtain the nth moment about zero of the ELG distribution. In particular, if | p 1−p G α (x)| < 1, then Equation (11) can be simplified to The corresponding nth moment of X can be simplified as for n = 1, 2, · · · , and the moment generating function of the ELG distribution is given by

Residual Life Function
Given that a component of a system survives up to time t ≥ 0, the residual life will be the period beyond t until the time of failure occurs in the system and is thus defined by the conditional random variable X − t | X > t. The mean residual life plays an important role in survival analysis and reliability of characterizing lifetime, because it can be used to determine a unique corresponding lifetime distribution. The rth moment of the residual life of the ELG distribution can be obtained by the general formula where S(t) = 1 − F(t) is the survival function defined before. Noting that the ELG distribution is a mixture of exponentiated Lindley distributions, we may calculate m r (t) by using the expression in Lemma 2 of [15], which is given by x t a−1 exp(−t) dx represents the complementary incomplete gamma function. Let X be an ELG random variable. By using the Taylor series expansion in (9), it can be easily shown that From the binomial expansion for (x − t) r , we get that the rth order moment of the residual life of the ELG distribution is given by The mean and variance of the residual life function of the ELG distribution can be easily obtained using m 1 (t) and m 2 (t), and are not shown here for simplicity. In a similar way as done for Equation (13), it can be shown that if | p and the rth order moment of the residual life of the ELG distribution can be written as

Mean Deviations
We consider the totality of deviations from the mean and median and the mean deviation from the mean, which is often used to estimate the amount of scatter in a population. The mean deviation is a more robust statistic to outliers in the data set than the standard deviation and the mean deviation from the median is a measure of statistical dispersion, which is a more robust statistic to outliers than the sample variance or standard deviation.
Let X denote a random variable with the pdf f (x), the cdf F(x), mean µ, and median M. The mean deviation about the mean and the mean deviation about the median are defined by and respectively. Of particular note is that when | p 1−p G α (x)| < 1, the mean deviations above can be further simplified as

Bonferroni and Lorenz Curves
The Bonferroni and Lorenz curves (Bonferroni 1930) have many practical applications not only in economics and poverty, but also in other fields like reliability, lifetime testing, insurance, and medicine. For a random variable X with cdf F(·), the Bonferroni and Lorenz curves are defined by where µ = IE(X), and respectively. If X is an ELG random variable with the pdf in (5), we observe Equation (17) can be written as which is obtained by using Equation (16) with t = q and r = 1. By using Equation (18), it follows easily that the Lorenz curve of the ELG distribution is given by L[F(x)] = µB[F(x)].

Entropies
It is well known that an entropy of a random variable X is a measure of variation of the uncertainty. The Rényi entropy is defined as where γ > 0 and γ = 1. The Shannon entropy is defined as E − log( f (x)) , which is a particular case of the Rényi entropy as γ → 1. We first observe that which shows that the Rényi entropy of the ELG distribution is given by It can be shown that the Shannon entropy of the ELG distribution is given by which can be easily evaluated using a unidimensional integral. Figure 3 depicts shapes of the Shannon entropy of the ELG distribution with several selected values of α, θ, and p. It deserves mentioning that the entropy measure of the ELG distribution can be estimated by using numerical integration methods with the (plug-in) estimators found in the following section.

Estimation of Parameters
We adopt the maximum likelihood estimation to estimate the unknown parameters (Section 3.1) and develop an Expectation-Maximization (EM) algorithm to find the maximum likelihood estimate (MLE) (Section 3.2). We also discuss the MLEs of the unknown parameters when the data is censored (Section 3.3).

Maximum Likelihood Estimation
It is well known that the MLE is often used to estimate the unknown parameter of a distribution because of its attractive properties, such as consistency, asymptotic normality, etc. Let X 1 , . . . , X n be a random sample from the ELG distribution with unknown parameter vector φ = (θ, α, p). Then the log-likelihood function l = l(φ; x) is given by For notational convenience, let . . , n. The MLEs of the unknown parameters can be obtained by taking the first partial derivatives of Equation (19) with respect to α, θ, and p and putting them equal to 0. We have the following likelihood equations ∂l ∂θ ∂l ∂p Please note that the MLEs, respectivelyα,θ andp of α, θ and p cannot be solved analytically. Numerical iteration techniques, such as the Newton-Raphson algorithm, are required to solve these equations, whereas the second derivatives of the log-likelihood are required for all iterations involved in numerical iteration techniques. We thus develop an EM algorithm to estimate the MLEs of the unknown parameters. For interval estimation of the parameters, we consider suitable pivotal quantities based on the asymptotic properties of the MLEs and approximate the distributions of these quantities by the normal distribution. We observe that . . , n. The observed Fisher information matrix of α, θ, and p can be written as so the variance-covariance matrix of the MLEsα,θ andp may be approximated by inverting the matrix I and is thus given by The asymptotic joint distribution of the MLEsα,θ, andp can be treated as being approximately multivariate normal and is given by Since V involves the unknown parameters α, θ, and p, we replace these parameters by their corresponding MLEs to obtain an estimate of V denoted by The asymptotic 100(1 − γ)% confidence intervals of α, θ, and p are determined by α − z γ/2 var(α),α + z γ/2 var(α) , respectively, where z p is the upper pth percentile of the standard normal distribution. The likelihood ratio (LR) can be used to evaluate the difference between the ELG distribution and its special submodels. We partition the parameters of the ELG distribution into (φ 1 , φ 2 ) , where φ 1 is the parameter of interest and φ 2 is the remaining parameters. Consider the hypotheses The LR statistic for the test of the null hypothesis in (24) is given by whereφ andφ * are the restricted and unrestricted maximum likelihood estimators under H 0 and H 1 , respectively. Under H 0 , it follows where D −→ denotes convergence in distribution as n → ∞ and κ is the dimension of the subset φ 1 of interest. For instance, we can compare the ELG and LG distributions by testing H 0 : α = 1 versus H 1 : α = 1. The ELG and Lindley distributions are compared by testing H 0 : (α, p) = (1, 0) versus H 1 : (α, p) = (1, 0).

Expectation-Maximization Algorithm
Dempster et al. [19] introduce an EM algorithm to estimate the parameters when some observations are treated as incomplete data. Suppose that X = (X 1 , X 2 , . . . , X n ) and Z = (Z 1 , Z 2 , . . . , Z n ) represent the observed and hypothetical data, respectively. Here, the hypothetical data can be thought of as missing data because Z 1 , Z 2 , . . . , Z n are not observable. We formulate the problem of finding the MLEs as an incomplete data problem, and thus, the EM algorithm is applicable to determine the MLEs of the ELG distribution. Let W = (X, Z) denote the complete data. To start this algorithm, define the pdf of each (X i , Z i ) for i = 1, . . . , n as The E-step of an EM cycle requires the conditional expectation of (Z | X, α (r) , θ (r) , p (r) ), where (α (r) , θ (r) , p (r) ) is the current estimate of (α, θ, p) in the rthe iteration. Please note that the pdf of Z given X, say g(z | x), is given by Thus, the conditional expectation is given by The log-likelihood function l c (W; α, θ, p) of the complete data after ignoring the constants can be written as Next the M-step involves the maximization of the pseudo log-likelihood function in (27). The components of the score function are given by For notational convenience, let τ (r) for i = 1, . . . , n. By replacing the missing Z's with their conditional expectations IE[Z | X, α (r) , θ (r) , p (r) ], we obtain an iterative procedure of the EM algorithm given by the following equations.
, for i = 1, . . . , n. Please note that some efficient numerical methods, such as the Newton-Raphson algorithm, are only needed for solving Equations (28) and (29).

Censored Maximum Likelihood Estimation
Censored data often occur in lifetime data analysis. Several popular mechanisms of censoring, such as type-I censoring and type-II censoring, have received much attention in the literature. The survival function of the ELG distribution has a simple closed-form expression, and therefore, it can be used in analyzing lifetime data in the presence of censoring. We briefly discuss the general case of multicensored data. Suppose that n = n 0 + n 1 + n 2 subjects of which • n 0 is known to have failed at the times t 1 , . . . , t n 0 , • n 1 is known to have failed into the interval [s i−1 , s i ] for i = 1, . . . , n 1 , • n 2 is known to have survived at a time r i for i = 1, . . . n 2 but not observed any longer.
Please note that Type-I censoring and Type-II censoring are contained as particular cases of multicensoring above. The log-likelihood function of φ = (θ, α, p) of the ELG distribution for this multicensoring takes the form It is straightforward to derive the first derivatives of the log-likelihood function with respect to the three unknown parameters α, θ, and p. Thereafter, the MLEs of the unknown parameters can be obtained by setting the first derivatives equal to zero, i.e., Please note that the Newton-Raphson algorithm or other optimization algorithms may be employed to solve the above system of equations, because the MLEs of the unknown parameters cannot be obtained in closed-forms. Finally, the corresponding information matrix for φ is too complicated to be presented here.

Two Real-Data Applications
In this section, we illustrate the applicability of the ELG distribution using two real-data examples. We use the same data sets to compare the ELG distribution with the Gamma, Weibull, Lindley geometric (LG), Weibull geometric (WG) distributions, whose densities are given by LG(θ, p) for x > 0, respectively. To compare the ELG distribution with the four distributions listed above, we advocate the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the AIC with a correction (AICc) for the two-real data sets. In addition, we apply two formal goodness-of-fit tests: the Cramér-von Mises (W * ) and Anderson-Darling (A * ) statistics to further verify which distribution fits better to the data; see, for example, [5,20], among others. The smaller the value of the considered criterion, the better the fit to the data. The first data set is about the remission time (in months) of a random sample of 128 bladder cancer patients. This data set presented in Table 1 was studied by [21] in fitting the extended Lomax distribution and [22] for the modified Weibull geometric distribution. Table 2 shows the MLEs of the parameters, AIC, BIC, and AICc for the ELG, Gamma, Weibull, LG, and WG distributions for the first data set. We observe from Table 2 that the ELG distribution and its special case LG provide an improved fit over other distributions that are commonly used for fitting lifetime data. The plots of the fitted probability density and survival function are also shown in Figure 4. Please note that the density and survival functions of the ELG distribution seem to be better than Gamma, Weibull, and WG density and survival functions. In addition, we observe from the values of goodness-of-fit tests in Table 3 that the ELG distribution fits the current data better than other distributions under consideration.   As mentioned in Section 3.1, we can adopt the LR statistic to compare between the ELG distribution and its special submodels. For example, the LR statistic for testing between the LG and ELG distributions (i.e., H 0 : α = 1 versus H 1 : α = 1) is ω = 0.5645 and the corresponding p-value is 0.4525. Thus, we fail to reject H 0 and conclude that there is no statistical difference between the fits to this data using the ELG and its submodel LG. This is quite reasonable because the estimate of α in the ELG model isα = 1.0792, which is close to 1 in the LG model.
In the second data set, we consider the waiting time (in minutes) before service of 100 bank customers. The data are presented in Table 4. This data set was used by [12] in fitting the Lindley distribution. Table 5 shows the MLEs of the parameters, AIC, BIC, and AICc for the ELG, Gamma, Weibull, LG, and WG distributions for the second data set. Table 5 indicates that the ELG distribution is still a strong competitor to other lifetime distributions. In addition, the plots of the fitted probability density and survival function are shown in Figure 5. Please note that the ELG and WG distributions perform identically and that the empirical and fitted five survival curves almost overlap for this data set, supporting that the ELG distribution fits this data at least as good as the four alternative distributions. In addition, we observe from the values of goodness-of-fit tests in Table 6 that the ELG distribution fits the current data better than the Gamma, Weibull, and LG distributions and is comparable with the WG distribution. 8.9 9.5 9.6 9.7 9.8 10.7 10.9 11.0 11.0 11.1 11.2 11.2 11.5 11.9 12.   Figure 5. Plots of the estimated density and survival function of the fitted models for the second data set.

Concluding Remarks
In this paper, we introduced the exponentiated Lindley geometric distribution, which generalizes the LG distribution due to [14] and the Lindley distribution proposed by [23]. We have studied various statistical properties of the new distribution. Estimations of the unknown parameters of the distribution are discussed based on the maximum likelihood estimation and an EM algorithm is provided for estimating the parameters. In an ongoing project, we study and Bayesian inference of these parameters and results will be reported elsewhere.
Author Contributions: M.W. and B.P. initiated and carried out the study. M.W. drafted the manuscript. Z.X. and B.P. participated in the data analysis and discussion. All authors read and approved the final manuscript.
Funding: The first author was partially supported by Scientific Innovation Program of Sichuan Province (Major Engineering Project: 2018RZ0093), Nanchong Scientific Council (Strategic Cooperation Program between University and City: NC17SY4020).