Statistical Inference for Geometric Process with the Power Lindley Distribution

The geometric process (GP) is a simple and direct approach to modeling of the successive inter-arrival time data set with a monotonic trend. In addition, it is a quite important alternative to the non-homogeneous Poisson process. In the present paper, the parameter estimation problem for GP is considered, when the distribution of the first occurrence time is Power Lindley with parameters α and λ. To overcome the parameter estimation problem for GP, the maximum likelihood, modified moments, modified L-moments and modified least-squares estimators are obtained for parameters a, α and λ. The mean, bias and mean squared error (MSE) values associated with these estimators are evaluated for small, moderate and large sample sizes by using Monte Carlo simulations. Furthermore, two illustrative examples using real data sets are presented in the paper.


Introduction
The Renewal process (RP) is a commonly used method for the statistical analysis of the successive inter-arrival times data set observed from a counting process. When the data set is non-trending, independently and identically distributed (iid), RP is a possible approach to modeling the data set. However, if the data follow a monotone trend, the data set can be modeled by a more possible approach than RP, such as the non-homogeneous Poisson process with a monotone intensity function, or a GP.
GP is first introduced as a direct approach to modeling of the inter-arrival times data with the monotone trend by Lam [1]. Actually, GP is a generalization of the RP with a ratio parameter. However, GP is a more flexible approach than RP for modeling of the successive inter-arrival time data with a trend. Because of this feature, GP has been successfully used as a model in many real-life problems from science, engineering, and health.
Before progressing further, let us recall the following definition of GP given in [2]. Definition 1. Let X i be the inter-arrival time the (i − 1)th and ith events of a counting process {N(t), t ≥ 0} for i = 1, 2, · · · . Then, the stochastic process {X n , n = 1, 2, · · · } generated by X i random variables is said to be a geometric process (GP) with parameter a if there exists a real number a > 0 such that Y i = a i−1 X i , i = 1, 2, · · · (1) are iid random variables with the distribution function F, where a is called the ratio parameter of the GP.
Clearly, the parameter a arranges monotonic behavior of the GP. In Table 1, the monotonic behavior of the GP is given.

Value of Parameter a
Behavior of GP a > 1 Stochastically decreasing process a < 1 Stochastically increasing process a = 1 Stable (GP is an RP) In the GP, the assumption on the distribution of X 1 has a special significance because of the fact that the distribution of X 1 and the other random variables X 2 , · · · , X n are from the same family of distributions with a different set of parameters. Namely, X i s (i = 1, 2, · · · , n) are distributed independently, but not identically. This is trivial from Definition 1. By considering this property, the expectation and variance of X i s are immediately obtained as where µ and σ 2 are the expectation and variance of the random variable X 1, respectively. Since the distribution of the random variable X 1 determines the distribution of the other variables, the selection of the distribution of X 1 based on the observed data is quite important to optimal statistical inference [3,4].
There are many studies in the literature on the solution to the parameter estimation problem of GP while selecting some special distributions for X 1 . Chan et al. [5] investigated the parameter estimation problem of GP by assuming that the distribution of random variable X 1 was Gamma. Lam et al. [6] investigated the statistical inference problem for GP with Log-Normal distribution according to parametric and non-parametric methods. When the distribution of random variable X 1 was the inverse Gaussian, Rayleigh, two-parameter Rayleigh and Lindley, the problems of statistical inference for GP were investigated according to ML and modified moment method by [7][8][9][10], respectively. The main objective of this study extensively investigates the solution of parameter estimation problem for GP when the distribution of the first arrival time is Power Lindley. In accordance with this objective, estimators for the parameters of GP with the Power Lindley distribution are obtained according to methods of maximum likelihood (ML), modified moment (MM), modified L-moment (MLM) and modified least-squares (MLS). The method of moments and the least-squares estimators of Power Lindley distribution are available in the literature [11]. The L-moments estimator of the Power Lindley distribution is obtained with this paper. In addition, the novelty of this paper is that the distribution of first inter-arrival time is assumed to be Power Lindley for GP and the ML and MLM estimators under this assumption are obtained.
The rest of this study is organized as follows: Section 2 includes the detailed infomation about the Power Lindley Distribution. In Section 3, the ML, the MLS, the MM and the MLM estimators of the parameters a, α and λ are obtained. Section 4 presents the results of performed Monte Carlo simulations for comparing the performances of the estimators obtained in Section 3. For illustrative purposes, two examples with real data sets are given in Section 5. Section 6 concludes the study.

An Overview to Power Lindley Distribution
The Power Lindley distribution was originally introduced by Githany et al. [11] as an important alternative for modeling of failure times. The distribution has a powerful modeling capability for positive data from different areas, such as reliability, lifetime testing, etc. In addition, for modeling the data sets with various shapes, many different extensions of the Power Lindley distribution have been attempted by researchers under different scenarios [12][13][14][15][16][17][18][19][20].
Let X be a Power Lindley distributed random variable with the parameters α and λ. From now on, a Power Lindley distributed random variable X will be indicated as X ∼ PL (α, λ) for brevity. The probability density function (pdf) of the random variable X is and the corresponding cumulative distribution function (cdf) is where α and λ are the positive and real-valued scale parameter and shape parameter of the distribution, respectively. Essentially, the Power Lindley distribution is a two-component mixture distribution with mixing ratio λ 1+λ in which the first component is a Weibull distribution with parameters α and λ and the second component is a generalized Gamma distribution with parameter 2, α and λ. The Power Lindley distribution is a quite important alternative to standard distribution families for analyzing of lifetime data, since its distribution function, survival function and hazard function are expressed in explicit form. The Power Lindley distribution also has uni-mode and belongs to the exponential family. To clearly show the shape of the distribution, we present Figure 1. Figure 1a,  Some basic measures of the Power Lindley distribution are tabulated in Table 2.

Characteristic Value
Advanced readers can refer to [11] for more information on the Power Lindley distribution.

Shannon and Rényi Entropy of the Power Lindley Distribution
The entropy is a measure of variation or uncertainty of a random variable. In this subsection, we investigate the Shannon and Rényi entropy, which are the two most popular entropies for Power Lindley distribution. The Shannon entropy (SE) of a random variable X with pdf f is defined as, see [21], Then, by using the pdf (4), the SE of the Power Lindley distribution is found as where Ψ (.) is the digamma function [22]. The Rényi entropy of a random variable with pdf f is defined as By using the pdf (4), Rényi entropy of the Power Lindley distribution is obtained as follows: Applying the power expansion formula and gamma function to Equation (9), RE X (ξ) is obtained as

Maximum Likelihood Estimation
Let X 1 , X 2 , · · · , X n be a random sample from a GP with ratio a and X 1 ∼ PL (α, λ). The likelihood function for X i , i = 1, 2, · · · , n is L(a, α, λ) = a n(n−1) 2 From Equation (11), the corresponding log-likelihood function can be written as below: ln L(a, α, λ) = n(n−1) 2 ln a + n [ln (α) + 2 ln λ − ln (λ If the first derivatives of Equation (12) with respect to a, α and λ are taken, we have the following likelihood equations: ∂ ln L(a,α,λ) ∂α and ∂ ln L(a, α, λ) ∂λ From the solution of likelihood Equations (13)- (15), the parameter λ is obtained as However, analytical expressions for the ML estimators of the parameters a and α cannot be obtained from likelihood Equations (13)- (15). In order to estimate these parameters, Equations (13)-(15) must be simultaneously solved by using a numerical method such as Newton's method.
Under these notations, Newton's method is given as where m is the iteration number and H −1 (θ) is the inverse of the Hessian matrix H (θ) , H (θ) ∈ R 3×3 . The elements of the matrix H (θ) are of the second derivatives of the log-likelihood function given in Equation (11) with respect to parameters a, α and λ. Let h ij be the (i, j) th (i, j = 1, 2, 3) element of the matrix H (θ) . The h ij s are obtained as below: Hence, H −1 (θ) is obtained as where Det (H (θ)) is the determinant of the matrice H (θ) and it is calculated by Thus, by starting with being given an initial estimation θ 0 , the parameter vector θ can be estimated with an iterative method given by (18). Hence, the ML estimates of the parameters a, α and λ, sayâ ML , The elements of the Fisher information matrix I given by (21) are immediately written from elements of the Hessian matrix H (θ). However, an explicit form of the Fisher information matrix I cannot be derived. Fortunately, as an estimator, the observed information matrix can be used instead of matrix I. Note that the observed information matrix of the estimators is the negative value of the matrix H (θ) obtained at the last iteration.

Modified Methods
Since GP is a monotonic stochastic process, some divergence problems may arise in the estimation stage of the ratio parameter a. To overcome this problem, estimating the ratio parameter a by nonparametrically is a widely used method in statistical inference for GP [2,24]. A nonparametric estimator for the ratio parameter a is given by, see [6,25], The estimatorâ NP is an unbiased estimator and follows the asymptotic normal distribution, see [25]. When theâ NP given by (22) is substituted into Equation (1), it can be immediately written Thus, the parameters α and λ can be estimated with a selected estimation method by using the estimatorsŶ i (i = 1, 2, · · · , n) . This estimation rule is called modified method (see [6]).

MM Estimation
Let X 1 , X 2 , · · · , X n be a random sample from a GP with ratio a and X 1 ∼ PL(α, λ). In addition, we assume that the ratio parameter a is nonparametrically estimated by (22). For the sample X 1 , X 2 , · · · , X n , first and second sample moments, m 1 and m 2 , are calculated by and respectively. On the other hand, from Table 2, first and second population moments of the distribution PL(α, λ), say µ 1 and µ 2 , can be easily written as and respectively. Thus, the MM estimators of the parameters α and λ,α MM andλ MM , respectively, can be obtained from the solution of the following nonlinear equation system:

.2. MLM Estimation
In this subsection, the MLM estimatorsα MLM andλ MLM are obtained for the parameters α and λ, respectively, when the ratio parameter a is estimated by (22). L-moments' estimators have been proposed as a method based on the linear combination of the order statistics by Hosking [26]. Due to their useful and robust structure, the L-moments estimators have been intensively studied and, in order to estimate the unknown parameters of many probability distributions, L-moment estimators have been obtained. For more information about the L-moments, we refer the readers to [26].
As in the method of moments, to obtain the L-moment estimators, population L-moments are equated to sample L-moments. In our problem, the first two samples and population L-moments are necessary for obtaining the estimatorsα MLM andλ MLM . Under the transformŶ i =â i−1 X i , (i = 1, · · · , n) , first and second sample L-moments are calculated as follows, see [26]: whereŶ (i) , (i = 1, 2, · · · , n) represents the ordered observations. On the other hand, using the notations in [26], first and second population L-moments of PL (λ, α) are and See Appendix A for the derivation of population L-moments. Thus, the estimatorsα MLM andλ MLM are obtained from the numerical solution of the following nonlinear system:

MLS Estimation
The least-squares estimator (LSE) is a regression-based method proposed by Swain et al. [27]. Essentially, the method is a nonlinear curve fitting for the cdf of a random variable by using the empirical distribution function. In the least-squares method, the estimator(s) is determined such that the squared difference between the empirical distribution function and the fitted curve is minimum. Let Z 1 , · · · , Z n be a random sample from a distribution function F Z (.). In addition, the order statistics of the random sample Z 1 , · · · , Z n are represented by Z (1) , · · · , Z (n) . In this situation, the LSE of the parameters are obtained minimizing with respect to the parameters of F Z (.) [27], where expectation E F Z (Z (i) is calculated by Therefore, in our problem, the MLS estimators of parameters α and λ, sayα MLS andλ MLS , respectively, are obtained by minimizing with respect to α and λ.

Simulation Study
In order to evaluate the estimation performances of the ML, the MLS, the MM and the MLM estimators obtained in the previous section, some Monte Carlo simulation experiments are presented in this section. Throughout the simulation experiments, the scale parameter λ is assumed to be 0.5, without loss of generality and also the parameter α set as 0.5, 1, 1.5 and 2. For different sample size n = 30, 50, 100, 150 and ratio parameter a = 0.90, 0.95, 1.05, 1.10, values of means, biases and MSEs have been calculated for estimating the ML, the MLS, the MM and the MLM. The obtained results based on 10,000 replications are displayed in Tables 3-6.
When the results of the simulation experiments given in Tables 3-6 are analyzed, it is seen that the ML estimators have smaller MSE values than the other estimators in all cases. Therefore, the ML estimators outperform the modified estimators with smaller bias and MSE. In addition, when the sample size n increases, the values of the bias and MSE decrease for all estimators. Based on this result, it can be said that all estimators obtained in the previous section are asymptotically unbiased and consistent.

Illustrative Examples
Practical applications of the parameter estimation with developed procedures in Section 3 are illustrated in this section with two data sets called Aircraft data set and Coal mining disaster data set.
In the examples, we use two criteria, the mean-squared error (MSE*) [28] for the fitted values and the maximum percentage error (MPE), which are defined in [25], for comparing the stochastic processes GP and RP. The MSE* and MPE are described as follows: whereX k is calculated bŷ and S k = X 1 + X 2 + · · · + X k , k = 1, 2, · · · , n andŜ k = k ∑ j=1X j . Then, in order to compare the relative performances of the RP and four GPs with the ML, the MLS, the MM and the MLM estimators, the plot of S k andŜ k against k, k = 1, 2, · · · , n can be used.

Example 1. Aircraft data.
The aircraft dataset consists of 30 observations that deal with the air-conditioning system failure times of a Boeing 720 aircraft (aircraft number 7912). The aircraft dataset was originally studied by Proschan [29]. The successive failure times in aircraft dataset are 23, 261, 87, 7, 120, 14, 62, 47, 225, 71, 246, 21, 42, 20, 5, 12, 120, 11, 3, 14, 71, 11, 14, 11, 16, 90, 1, 16, 52, 95. We first investigate the underlying distribution of the set of data. To test whether the underlying distribution of the data {X 1 , · · · , X n } is the Power Lindley, the following procedures can be used. From Definition 1, we know that the Y i = a i−1 X i and the Y i s follow the Power Lindley. We can write immediately as ln Y i = (i − 1) ln a + ln X i by taking the logarithm of Y i . Note that ln Y i s follow the Log-Power Lindley distribution. Therefore, a linear regression model can be defined, where µ = E (ln Y i ) and exp(ε i ) ∼ PL (θ, β). If the exponential errors are Power Lindley distributed, then the underlying distribution of the set of data is Power Lindley. The error term ε i in Equation (39) can be estimated bŷ whereμ = n(n−1) 2 lnâ NL + ∑ n i=1 ln X i . Thus, consistency of the exponentiated errors to Power Lindley distribution can be tested by using a goodness of fit test such as Kolmogorov-Smirnov (K-S). The parameter estimates of the exponentiated errors are θ = 0.9717 and β = 0.7985 and also the value of the K-S test is 0.1225 and the corresponding p-value is 0.7134. Therefore, it can be said that the underlying distribution of this data set is Power Lindley. This can also be seen from Figure 2, which illustrates the plots of the empirical and the fitted cdf. When a GP with the Power Lindley distribution is used for modeling of this dataset, the ML, the MLS, the MM and the MLM estimates of the parameters a, α and λ and also MSE* and MPE values are tabulated in Table 3.
As it can be seen from Table 7, the ML estimates have the smallest MSE * values consistent with the simulation experiments. The estimator having the smallest MPE is the MM, but estimators of the ML and the MLM are very close to the MM. Now, we check the model optimality. In order to select an optimal model to a data set, Akaike information criterion (AIC) and maximized log-likelihood (L) value are commonly used methods. For deciding an optimal model among the Power Lindley and its alternatives (Log-Normal, Gamma and inverse Gaussian) for this data set, we compute the -L and AIC values. The -L and AIC values for the models are given in Table 8. The results given in Table 8 show that the Power Lindley distribution is an optimal model for the aircraft dataset with smaller AIC and -L values. In Figure 3, the failure times for the aircraft data and their fitted times are plotted. The modeling performances of the RP and GP with the ML, the MLS, the MM and the MLM estimators can be easily compared by Figure 3. By Figure 3, the GP with the ML, the MLS, the MM and the MLM estimators outperform the RP. This result is compatible with Table 7.

Example 2. Coal mining disaster data.
This example is from [30] on the intervals in days between successive disasters in Great Britain from 1851 to 1962. The coal mining disaster data set, which includes 190 successive intervals, was used as an illustrative example for GP by [7,10,24].
For this data set, the value of K-S test is 0.0396 and the corresponding p-value is 0.9148. Thus, the Power Lindley distribution is an appropriate model for the coal mining disaster data. This is also supported by the following Q-Q plot, Figure 4, which is constructed by plotting the ordered exponential errors exp (ε i ) against the quantiles of the PL(0.9552, 0.7930) because the data points fall approximately on the straight line in Figure 4.  For the coal mining disaster data set, the estimates of the parameters a, λ and α are given in Table 9. Moreover, calculated AIC and L values for the coal mining disaster data set with the different models are tabulated in Table 10. We can say that the Power Lindley distribution is an optimal model for this data set since it has minimum AIC and -L values. Under the assumption that the underlying distribution of the data is Power Lindley, we present Figure 5 for comparing the modeling performance of the RP and the GP with four estimators obtained in the previous section. Figure 5 plots S k andŜ k versus the number of disasters k, k = 1, 2, · · · , n. As can be seen in Figure 5, the GP with the ML, the MLS, the MM and the MLM estimators more fairly follow real values than the RP. It can also be seen from Table 9 that the MSE* and MPE values of GP models are much smaller than RP. Thus, according to Figure 5 and Table 9, it is concluded that the GP provides the better data fit than RP.

Conclusions
In this paper, we have discussed the parameter estimation problem for GP by assuming that distribution of the first inter-arrival time is Power Lindley with parameters α and λ. In the paper, the parameter estimation problem has been solved from two points of view as the parametric (ML) and nonparametric (MLS, MM and MLM). Parametric estimators, ML, of the parameters A, ALF and LAM are also asymptotically normally distributed. However, more work should be done in order to say something about the asymptotic properties of nonparametric estimators MLS, MM and MLM. In addition, this is usually not an easy task because an analytical form of these estimators cannot be written.
Numerical study results have shown that the ML estimators outperform the MLS, MLM and MLS estimators with smaller bias and MSE measures. In addition, it has been observed that both bias and MSE values of all estimators decrease when the sample size increases. Hence, in light of numerical studies, it can be concluded that all of the estimators are asymptotically unbiased and consistent.
In the illustrative examples presented to demonstrate the data modeling performance of GP with the obtained estimators, the GP with Power Lindley distribution gives a better data fit than the RP in both examples. In addition, according to AIC and -L values, it can be said that modeling both the aircraft dataset and the coal mining disaster dataset using a GP with Power Lindley distribution is more appropriate than a GP with Gamma, log-normal or inverse Gaussian distribution. Therefore, it can be said that a GP with Power Lindley distribution is a quite important alternative to a GP with famous distributions such as Gamma, Log-normal or inverse Gaussian in modeling the successive inter-arrival times.

Funding: This research received no external funding
Acknowledgments: The author is sincerely grateful to the anonymous referees for their comments and suggestions that greatly improved an early version of this paper.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A
According to [26], the first two population L-moments of a continuous distribution are calculated by and respectively, where f (x) and F (x) are the pdf and the cdf of the random variable X. Obviously, the first popuplation L-moment L 1 is the expectation of random variable X. Thus, from Table 2, the first L-moment of Power Lindley distribution is Now, we calculate the second population L-moment of Power Lindley distribution. Substituting the cdf given by (5) into (A2), we can write In the last equality, using f (x) given in (4), it can be easily written x 2α e −2λx α dx + ∞ 0 x 3α e −2λx α dx .
(A4) Therefore, by using the gamma function, we have (A5)