Entropy
  • Article
  • Open Access

21 September 2018

Statistical Inference for Geometric Process with the Power Lindley Distribution

Statistics Department, University of Kirikkale, 71450 Kirikkale, Turkey
This article belongs to the Section Information Theory, Probability and Statistics

Abstract

The geometric process (GP) is a simple and direct approach to modeling successive inter-arrival time data with a monotonic trend. In addition, it is an important alternative to the non-homogeneous Poisson process. In the present paper, the parameter estimation problem for the GP is considered when the distribution of the first occurrence time is Power Lindley with parameters α and λ. To solve the parameter estimation problem for the GP, the maximum likelihood, modified moments, modified L-moments and modified least-squares estimators are obtained for the parameters a, α and λ. The mean, bias and mean squared error (MSE) values associated with these estimators are evaluated for small, moderate and large sample sizes by using Monte Carlo simulations. Furthermore, two illustrative examples using real data sets are presented in the paper.

1. Introduction

The renewal process (RP) is a commonly used model for the statistical analysis of successive inter-arrival time data observed from a counting process. When the data are non-trending and independently and identically distributed (iid), the RP is a suitable approach to modeling the data set. However, if the data follow a monotone trend, the data set can be modeled by a more suitable approach than the RP, such as a non-homogeneous Poisson process with a monotone intensity function, or a GP.
The GP was first introduced by Lam [1] as a direct approach to modeling inter-arrival time data with a monotone trend. Actually, the GP is a generalization of the RP with a ratio parameter, and it is a more flexible approach than the RP for modeling successive inter-arrival time data with a trend. Because of this feature, the GP has been successfully used as a model in many real-life problems from science, engineering, and health.
Before progressing further, let us recall the following definition of GP given in [2].
Definition 1.
Let $X_i$ be the inter-arrival time between the $(i-1)$th and $i$th events of a counting process $\{N(t), t \ge 0\}$ for $i = 1, 2, \ldots$. Then, the stochastic process $\{X_n, n = 1, 2, \ldots\}$ generated by the random variables $X_i$ is said to be a geometric process (GP) with parameter $a$ if there exists a real number $a > 0$ such that
$Y_i = a^{i-1} X_i, \quad i = 1, 2, \ldots$  (1)
are iid random variables with the distribution function $F$, where $a$ is called the ratio parameter of the GP.
Clearly, the parameter a arranges monotonic behavior of the GP. In Table 1, the monotonic behavior of the GP is given.
Table 1. Behavior of GP according to values of ratio parameter a.
In the GP, the assumption on the distribution of $X_1$ has a special significance because the distribution of $X_1$ and those of the other random variables $X_2, \ldots, X_n$ belong to the same family of distributions with different sets of parameters. Namely, the $X_i$'s $(i = 1, 2, \ldots, n)$ are distributed independently, but not identically. This is immediate from Definition 1. By considering this property, the expectation and variance of the $X_i$'s are obtained as
$E[X_i] = \dfrac{\mu}{a^{i-1}}, \quad i = 1, 2, \ldots,$
$Var[X_i] = \dfrac{\sigma^2}{a^{2(i-1)}}, \quad i = 1, 2, \ldots,$
where $\mu$ and $\sigma^2$ are the expectation and variance of the random variable $X_1$, respectively. Since the distribution of the random variable $X_1$ determines the distributions of the other variables, selecting the distribution of $X_1$ on the basis of the observed data is quite important for optimal statistical inference [3,4]. There are many studies in the literature on the parameter estimation problem of the GP under special distributional assumptions for $X_1$. Chan et al. [5] investigated the parameter estimation problem of the GP by assuming that the distribution of the random variable $X_1$ was Gamma. Lam et al. [6] investigated the statistical inference problem for the GP with the Log-Normal distribution using parametric and non-parametric methods. When the distribution of the random variable $X_1$ was inverse Gaussian, Rayleigh, two-parameter Rayleigh and Lindley, the statistical inference problems for the GP were investigated by means of the ML and modified moment methods in [7,8,9,10], respectively.
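The defining relation can be checked by direct simulation. The sketch below is our own minimal illustration, not part of the paper: it generates a GP path from iid draws of a user-supplied first-arrival sampler (an Exponential sampler stands in here purely for the example; the paper's Power Lindley choice is introduced in Section 2).

```python
import random

def gp_sample(n, a, y_sampler, rng):
    """Generate X_1, ..., X_n from a GP with ratio a.

    By Definition 1, Y_i = a**(i-1) * X_i are iid with distribution F,
    so X_i = Y_i / a**(i-1), and hence E[X_i] = mu / a**(i-1).
    """
    return [y_sampler(rng) / a ** (i - 1) for i in range(1, n + 1)]

# Example: Exponential first-arrival times with mean mu = 1 and ratio a = 1.1
# (a decreasing GP), so E[X_5] = 1 / 1.1**4.
rng = random.Random(3)
x5 = [gp_sample(5, 1.1, lambda r: r.expovariate(1.0), rng)[4] for _ in range(30000)]
print(sum(x5) / len(x5))  # close to 1 / 1.1**4
```

With `a > 1` the simulated inter-arrival times shrink geometrically in expectation, matching the monotonic behavior summarized in Table 1.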
The main objective of this study is to investigate extensively the solution of the parameter estimation problem for the GP when the distribution of the first arrival time is Power Lindley. In accordance with this objective, estimators for the parameters of the GP with the Power Lindley distribution are obtained by the methods of maximum likelihood (ML), modified moments (MM), modified L-moments (MLM) and modified least-squares (MLS). The method-of-moments and least-squares estimators of the Power Lindley distribution are available in the literature [11]. The L-moments estimator of the Power Lindley distribution is first obtained in this paper. In addition, the novelty of this paper is that the distribution of the first inter-arrival time of the GP is assumed to be Power Lindley, and the ML and MLM estimators are obtained under this assumption.
The rest of this study is organized as follows: Section 2 includes detailed information about the Power Lindley distribution. In Section 3, the ML, the MLS, the MM and the MLM estimators of the parameters $a$, $\alpha$ and $\lambda$ are obtained. Section 4 presents the results of Monte Carlo simulations performed to compare the performances of the estimators obtained in Section 3. For illustrative purposes, two examples with real data sets are given in Section 5. Section 6 concludes the study.

2. An Overview of the Power Lindley Distribution

The Power Lindley distribution was originally introduced by Ghitany et al. [11] as an important alternative for the modeling of failure times. The distribution has a powerful modeling capability for positive data from different areas, such as reliability, lifetime testing, etc. In addition, for modeling data sets with various shapes, many different extensions of the Power Lindley distribution have been developed by researchers under different scenarios [12,13,14,15,16,17,18,19,20].
Let $X$ be a Power Lindley distributed random variable with the parameters $\alpha$ and $\lambda$. From now on, a Power Lindley distributed random variable $X$ will be denoted by $X \sim PL(\alpha, \lambda)$ for brevity. The probability density function (pdf) of the random variable $X$ is
$f(x; \alpha, \lambda) = \dfrac{\alpha \lambda^{2}}{1 + \lambda} \left(1 + x^{\alpha}\right) x^{\alpha - 1} e^{-\lambda x^{\alpha}}, \quad x > 0, \ \lambda > 0, \ \alpha > 0$  (4)
and the corresponding cumulative distribution function (cdf) is
$F(x; \alpha, \lambda) = 1 - \left(1 + \dfrac{\lambda}{\lambda + 1} x^{\alpha}\right) e^{-\lambda x^{\alpha}}, \quad x > 0,$  (5)
where $\alpha$ and $\lambda$ are the positive, real-valued shape and scale parameters of the distribution, respectively. Essentially, the Power Lindley distribution is a two-component mixture with mixing proportion $\lambda/(1+\lambda)$, in which the first component is a Weibull distribution with parameters $\alpha$ and $\lambda$, and the second component is a generalized Gamma distribution with parameters 2, $\alpha$ and $\lambda$. The Power Lindley distribution is an important alternative to standard distribution families for analyzing lifetime data, since its distribution function, survival function and hazard function can be expressed in explicit form. The Power Lindley distribution is also unimodal and belongs to the exponential family. To clearly show the shape of the distribution, we present Figure 1. Figure 1a,b show the behavior of the pdf of the Power Lindley distribution for different values of the parameters $\alpha$ and $\lambda$.
Figure 1. Pdf of the Power Lindley distribution (a): α = 0.25 , 0.50 , 0.75 , 1 , 2 , 5 , 10 and λ = 2 ; (b) α = 2 and λ = 0.25 , 0.50 , 0.75 , 1 , 2 , 5 , 10 .
Some basic measures of the Power Lindley distribution are tabulated in Table 2.
Table 2. Some basic measures of the Power Lindley distributions.
Advanced readers can refer to [11] for more information on the Power Lindley distribution.
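As a quick computational companion (ours, not part of the paper; the function names are our own choices), the pdf, cdf and the two-component mixture representation above can be sketched in Python as follows:

```python
import math
import random

def pl_pdf(x, alpha, lam):
    """Power Lindley pdf f(x; alpha, lambda), x > 0."""
    return (alpha * lam ** 2 / (1 + lam)) * (1 + x ** alpha) \
        * x ** (alpha - 1) * math.exp(-lam * x ** alpha)

def pl_cdf(x, alpha, lam):
    """Power Lindley cdf F(x; alpha, lambda)."""
    return 1 - (1 + lam / (lam + 1) * x ** alpha) * math.exp(-lam * x ** alpha)

def pl_sample(alpha, lam, rng):
    """Draw from PL(alpha, lambda) via the mixture representation:
    with probability lam/(1+lam) a Weibull(alpha, lam) draw, i.e. Exp(lam)**(1/alpha);
    otherwise a generalized Gamma(2, alpha, lam) draw, i.e. Gamma(2, 1/lam)**(1/alpha)."""
    if rng.random() < lam / (1 + lam):
        g = rng.expovariate(lam)              # Gamma(shape 1, scale 1/lam)
    else:
        g = rng.gammavariate(2.0, 1.0 / lam)  # Gamma(shape 2, scale 1/lam)
    return g ** (1.0 / alpha)
```

Comparing a Monte Carlo mean of `pl_sample` draws with the first moment in Table 2 gives a simple sanity check of both the sampler and the moment formulas.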

2.1. Shannon and Rényi Entropy of the Power Lindley Distribution

The entropy is a measure of variation or uncertainty of a random variable. In this subsection, we investigate the Shannon and Rényi entropy, which are the two most popular entropies for Power Lindley distribution. The Shannon entropy (SE) of a random variable X with pdf f is defined as, see [21],
$H(X) = -E\left[\ln f(X)\right].$
Then, by using the pdf (4), the SE of the Power Lindley distribution is found as
$H(X) = -E\left[\ln f(X; \alpha, \lambda)\right]$
$= -\int_0^{\infty} \left[\ln\dfrac{\alpha\lambda^{2}}{1+\lambda} + (\alpha-1)\ln x - \lambda x^{\alpha}\right] f(x; \alpha, \lambda)\, dx$
$= -\ln\dfrac{\alpha\lambda^{2}}{1+\lambda} - (\alpha-1)\int_0^{\infty} \ln x\, f(x; \alpha, \lambda)\, dx + \lambda \int_0^{\infty} x^{\alpha} f(x; \alpha, \lambda)\, dx$
$= -\ln\dfrac{\alpha\lambda^{2}}{1+\lambda} - (\alpha-1)E[\ln X] + \lambda E[X^{\alpha}]$
$= \lambda\,\dfrac{\alpha\left[\alpha(\lambda+1)+\alpha\right]}{\alpha^{2}\lambda(\lambda+1)} - \dfrac{\alpha-1}{\alpha}\left(\dfrac{1}{1+\lambda} + \Psi(1) - \ln\lambda\right) - \ln\dfrac{\alpha\lambda^{2}}{1+\lambda},$
where $\Psi(\cdot)$ is the digamma function [22]. The Rényi entropy of a random variable with pdf $f$ is defined as
$RE_X(\xi) = \dfrac{1}{1-\xi} \log \int f^{\xi}(x)\, dx.$
By using the pdf (4), Rényi entropy of the Power Lindley distribution is obtained as follows:
$RE_X(\xi) = \dfrac{1}{1-\xi} \log \int_0^{\infty} \left[\dfrac{\alpha\lambda^{2}}{1+\lambda}\left(1+x^{\alpha}\right) x^{\alpha-1} e^{-\lambda x^{\alpha}}\right]^{\xi} dx = \dfrac{1}{1-\xi} \log \left\{\left(\dfrac{\alpha\lambda^{2}}{1+\lambda}\right)^{\xi} \int_0^{\infty} x^{\xi(\alpha-1)} e^{-\xi\lambda x^{\alpha}} \left(1+x^{\alpha}\right)^{\xi} dx\right\}.$  (9)
Applying the power expansion formula and gamma function to Equation (9), R E X ξ is obtained as
$RE_X(\xi) = \dfrac{1}{1-\xi} \log \left\{\left(\dfrac{\alpha\lambda^{2}}{1+\lambda}\right)^{\xi} \sum_{i=0}^{\xi} \binom{\xi}{i} \int_0^{\infty} x^{\xi(\alpha-1)+i\alpha}\, e^{-\xi\lambda x^{\alpha}}\, dx\right\} = \dfrac{1}{1-\xi} \log \left\{\left(\dfrac{\alpha\lambda^{2}}{1+\lambda}\right)^{\xi} \sum_{i=0}^{\xi} \binom{\xi}{i} \dfrac{(\lambda\xi)^{-\frac{\xi(\alpha-1)+i\alpha+1}{\alpha}}}{\alpha}\, \Gamma\!\left(\dfrac{\xi(\alpha-1)+i\alpha+1}{\alpha}\right)\right\}.$
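Both entropies can also be evaluated numerically, which provides a check that is independent of the closed forms. The sketch below (our own; grid step and upper limit are arbitrary choices) integrates on a simple grid and uses the fact that the Rényi entropy tends to the Shannon entropy as $\xi \to 1$:

```python
import math

def pl_pdf(x, alpha, lam):
    """Power Lindley pdf."""
    return (alpha * lam ** 2 / (1 + lam)) * (1 + x ** alpha) \
        * x ** (alpha - 1) * math.exp(-lam * x ** alpha)

def shannon_entropy(alpha, lam, h=1e-4, upper=30.0):
    """H(X) = -integral of f ln f, by a left Riemann sum on (0, upper)."""
    s, x = 0.0, h
    while x < upper:
        fx = pl_pdf(x, alpha, lam)
        if fx > 0.0:
            s -= fx * math.log(fx) * h
        x += h
    return s

def renyi_entropy(xi, alpha, lam, h=1e-4, upper=30.0):
    """RE_X(xi) = log(integral of f**xi) / (1 - xi), xi != 1."""
    s, x = 0.0, h
    while x < upper:
        s += pl_pdf(x, alpha, lam) ** xi * h
        x += h
    return math.log(s) / (1 - xi)
```

For instance, `renyi_entropy(1.01, 1.5, 0.5)` should land close to `shannon_entropy(1.5, 0.5)`.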

3. Estimation of Parameters of GP with Power Lindley Distribution

3.1. Maximum Likelihood Estimation

Let $X_1, X_2, \ldots, X_n$ be a random sample from a GP with ratio $a$ and $X_1 \sim PL(\alpha, \lambda)$. The likelihood function for $X_i$, $i = 1, 2, \ldots, n$ is
$L(a, \alpha, \lambda) = a^{\frac{n(n-1)}{2}}\, \dfrac{\alpha^{n} \lambda^{2n}}{(\lambda+1)^{n}} \prod_{i=1}^{n} \left[1 + \left(a^{i-1} x_i\right)^{\alpha}\right] \left(a^{i-1} x_i\right)^{\alpha-1} e^{-\lambda \left(a^{i-1} x_i\right)^{\alpha}}.$  (11)
From Equation (11), the corresponding log-likelihood function can be written as below:
$\ln L(a, \alpha, \lambda) = \dfrac{n(n-1)}{2} \ln a + n\left[\ln\alpha + 2\ln\lambda - \ln(\lambda+1)\right] + \sum_{i=1}^{n} \ln\left[1 + \left(a^{i-1} x_i\right)^{\alpha}\right] + (\alpha-1) \sum_{i=1}^{n} \ln\left(a^{i-1} x_i\right) - \lambda \sum_{i=1}^{n} \left(a^{i-1} x_i\right)^{\alpha}.$  (12)
If the first derivatives of Equation (12) with respect to a , α and λ are taken, we have the following likelihood equations:
$\dfrac{\partial \ln L(a, \alpha, \lambda)}{\partial a} = \dfrac{\alpha n(n-1)}{2a} + \alpha \sum_{i=1}^{n} \dfrac{(i-1)\, a^{i-2} x_i \left(a^{i-1} x_i\right)^{\alpha-1}}{\left(a^{i-1} x_i\right)^{\alpha} + 1} - \lambda \alpha \sum_{i=1}^{n} (i-1)\, a^{i-2} x_i \left(a^{i-1} x_i\right)^{\alpha-1} = 0,$  (13)
$\dfrac{\partial \ln L(a, \alpha, \lambda)}{\partial \alpha} = \dfrac{n}{\alpha} + \sum_{i=1}^{n} \dfrac{\ln\left(a^{i-1} x_i\right) \left(a^{i-1} x_i\right)^{\alpha}}{\left(a^{i-1} x_i\right)^{\alpha} + 1} + \sum_{i=1}^{n} \ln\left(a^{i-1} x_i\right) - \lambda \sum_{i=1}^{n} \ln\left(a^{i-1} x_i\right) \left(a^{i-1} x_i\right)^{\alpha} = 0$  (14)
and
$\dfrac{\partial \ln L(a, \alpha, \lambda)}{\partial \lambda} = \dfrac{2n}{\lambda} - \dfrac{n}{\lambda+1} - \sum_{i=1}^{n} \left(a^{i-1} x_i\right)^{\alpha} = 0.$  (15)
From the solution of likelihood Equations (13)–(15), the parameter λ is obtained as
$\hat{\lambda}(\hat{a}, \hat{\alpha}) = \dfrac{1 - \bar{X}_{a\alpha} + \sqrt{\bar{X}_{a\alpha}^{2} + 6\bar{X}_{a\alpha} + 1}}{2\bar{X}_{a\alpha}},$
where $\bar{X}_{a\alpha} = \frac{1}{n}\sum_{i=1}^{n} \left(\hat{a}^{i-1} x_i\right)^{\hat{\alpha}}$. However, analytical expressions for the ML estimators of the parameters $a$ and $\alpha$ cannot be obtained from the likelihood Equations (13)–(15). In order to estimate these parameters, Equations (13)–(15) must be solved simultaneously by a numerical method such as Newton's method.
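The closed-form expression for $\hat{\lambda}$ above can be verified numerically: plugging it back into the score equation for $\lambda$ must give zero. The following sketch is ours (variable names are assumptions, not the paper's notation) and implements the log-likelihood (12) together with the profile estimate of $\lambda$:

```python
import math

def lambda_hat(xbar):
    """Closed-form ML estimate of lambda given
    xbar = (1/n) * sum_i (a**(i-1) * x_i)**alpha."""
    return (1.0 - xbar + math.sqrt(xbar ** 2 + 6.0 * xbar + 1.0)) / (2.0 * xbar)

def loglik(a, alpha, lam, x):
    """Log-likelihood (12) of a GP whose first arrival is PL(alpha, lambda)."""
    n = len(x)
    ll = n * (n - 1) / 2 * math.log(a) \
        + n * (math.log(alpha) + 2 * math.log(lam) - math.log(lam + 1))
    for i, xi in enumerate(x, start=1):
        y = a ** (i - 1) * xi  # transformed observation a**(i-1) * x_i
        ll += math.log(1 + y ** alpha) + (alpha - 1) * math.log(y) - lam * y ** alpha
    return ll
```

Because $\hat{\lambda}$ maximizes the log-likelihood in $\lambda$ for fixed $(a, \alpha)$, the three-dimensional search reduces to a two-dimensional one over $(a, \alpha)$.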
Let $\theta = (a, \alpha, \lambda)^{T}$ be the parameter vector and $\nabla(\theta)$ the corresponding gradient vector, i.e.,
$\nabla(\theta) = \left(\dfrac{\partial \ln L(a, \alpha, \lambda)}{\partial a}, \dfrac{\partial \ln L(a, \alpha, \lambda)}{\partial \alpha}, \dfrac{\partial \ln L(a, \alpha, \lambda)}{\partial \lambda}\right)^{T}.$
Under these notations, Newton’s method is given as
$\theta^{(m+1)} = \theta^{(m)} - H^{-1}\!\left(\theta^{(m)}\right) \nabla\!\left(\theta^{(m)}\right),$  (18)
where $m$ is the iteration number and $H^{-1}(\theta)$ is the inverse of the Hessian matrix $H(\theta) \in \mathbb{R}^{3\times 3}$. The elements of the matrix $H(\theta)$ are the second derivatives of the log-likelihood function given in Equation (12) with respect to the parameters $a$, $\alpha$ and $\lambda$. Let $h_{ij}$ be the $(i,j)$th $(i,j = 1, 2, 3)$ element of the matrix $H(\theta)$. The $h_{ij}$'s are obtained as below:
$h_{11} = \dfrac{\partial^2 \ln L}{\partial a^2} = -\dfrac{\alpha n(n-1)}{2a^2} + \dfrac{\alpha}{a^2} \sum_{i=1}^{n} \dfrac{(i-1)\left(a^{i-1}x_i\right)^{\alpha} \left[\alpha(i-1) - \left(a^{i-1}x_i\right)^{\alpha} - 1\right]}{\left[\left(a^{i-1}x_i\right)^{\alpha} + 1\right]^{2}} - \dfrac{\lambda\alpha}{a^2} \sum_{i=1}^{n} (i-1)\left[\alpha(i-1) - 1\right] \left(a^{i-1}x_i\right)^{\alpha},$
$h_{12} = \dfrac{\partial^2 \ln L}{\partial a\, \partial \alpha} = \dfrac{n(n-1)}{2a} + \dfrac{1}{a} \sum_{i=1}^{n} \dfrac{(i-1)\left(a^{i-1}x_i\right)^{\alpha}}{\left(a^{i-1}x_i\right)^{\alpha} + 1} + \dfrac{\alpha}{a} \sum_{i=1}^{n} \dfrac{(i-1)\ln\left(a^{i-1}x_i\right) \left(a^{i-1}x_i\right)^{\alpha}}{\left[\left(a^{i-1}x_i\right)^{\alpha} + 1\right]^{2}} - \lambda \sum_{i=1}^{n} (i-1)\, a^{i-2} x_i \left(a^{i-1}x_i\right)^{\alpha-1} \left[\alpha \ln\left(a^{i-1}x_i\right) + 1\right],$
$h_{13} = \dfrac{\partial^2 \ln L}{\partial a\, \partial \lambda} = -\alpha \sum_{i=1}^{n} (i-1)\, a^{i-2} x_i \left(a^{i-1}x_i\right)^{\alpha-1},$
$h_{21} = \dfrac{\partial^2 \ln L}{\partial \alpha\, \partial a} = \dfrac{n(n-1)}{2a} + \dfrac{1}{a} \sum_{i=1}^{n} \dfrac{(i-1)\left(a^{i-1}x_i\right)^{\alpha} \left[\alpha \ln\left(a^{i-1}x_i\right) + \left(a^{i-1}x_i\right)^{\alpha} + 1\right]}{\left[\left(a^{i-1}x_i\right)^{\alpha} + 1\right]^{2}} - \dfrac{\lambda}{a} \sum_{i=1}^{n} (i-1)\left(a^{i-1}x_i\right)^{\alpha} \left[\alpha \ln\left(a^{i-1}x_i\right) + 1\right],$
$h_{22} = \dfrac{\partial^2 \ln L}{\partial \alpha^2} = -\dfrac{n}{\alpha^2} + \sum_{i=1}^{n} \dfrac{\left[\ln\left(a^{i-1}x_i\right)\right]^{2} \left(a^{i-1}x_i\right)^{\alpha}}{\left[\left(a^{i-1}x_i\right)^{\alpha} + 1\right]^{2}} - \lambda \sum_{i=1}^{n} \left[\ln\left(a^{i-1}x_i\right)\right]^{2} \left(a^{i-1}x_i\right)^{\alpha},$
$h_{23} = \dfrac{\partial^2 \ln L}{\partial \alpha\, \partial \lambda} = -\sum_{i=1}^{n} \ln\left(a^{i-1}x_i\right) \left(a^{i-1}x_i\right)^{\alpha},$
$h_{31} = \dfrac{\partial^2 \ln L}{\partial \lambda\, \partial a} = -\alpha \sum_{i=1}^{n} (i-1)\, a^{i-2} x_i \left(a^{i-1}x_i\right)^{\alpha-1},$
$h_{32} = \dfrac{\partial^2 \ln L}{\partial \lambda\, \partial \alpha} = -\sum_{i=1}^{n} \ln\left(a^{i-1}x_i\right) \left(a^{i-1}x_i\right)^{\alpha},$
$h_{33} = \dfrac{\partial^2 \ln L}{\partial \lambda^2} = -\dfrac{2n}{\lambda^2} + \dfrac{n}{(\lambda+1)^2}.$
Hence, H 1 θ is obtained as
$H^{-1}(\theta) = \dfrac{1}{\det H(\theta)} \begin{pmatrix} h_{22}h_{33} - h_{23}h_{32} & -(h_{12}h_{33} - h_{13}h_{32}) & h_{12}h_{23} - h_{13}h_{22} \\ -(h_{21}h_{33} - h_{31}h_{23}) & h_{11}h_{33} - h_{13}h_{31} & -(h_{11}h_{23} - h_{21}h_{13}) \\ h_{21}h_{32} - h_{22}h_{31} & -(h_{11}h_{32} - h_{12}h_{31}) & h_{11}h_{22} - h_{12}h_{21} \end{pmatrix},$
where $\det H(\theta)$ is the determinant of the matrix $H(\theta)$, calculated by
$\det H(\theta) = h_{11}h_{22}h_{33} - h_{11}h_{23}h_{32} - h_{12}h_{21}h_{33} + h_{12}h_{31}h_{23} + h_{21}h_{13}h_{32} - h_{13}h_{22}h_{31}.$
Thus, starting from a given initial estimate $\theta^{(0)}$, the parameter vector $\theta$ can be estimated with the iterative method given by (18). Hence, the ML estimates of the parameters $a$, $\alpha$ and $\lambda$, say $\hat{a}_{ML}$, $\hat{\alpha}_{ML}$ and $\hat{\lambda}_{ML}$, respectively, are obtained as the corresponding elements of $\theta^{(m)}$ at convergence. The joint distribution of the estimators $\hat{a}_{ML}$, $\hat{\alpha}_{ML}$ and $\hat{\lambda}_{ML}$ is asymptotically normal with mean vector $(a, \alpha, \lambda)$ and covariance matrix $I^{-1}(a, \alpha, \lambda)$ (see [23]), where $I$ is the Fisher information matrix, i.e.,
$I = -E \begin{pmatrix} \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial a^2} & \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial a\, \partial \alpha} & \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial a\, \partial \lambda} \\ \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial \alpha\, \partial a} & \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial \alpha^2} & \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial \alpha\, \partial \lambda} \\ \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial \lambda\, \partial a} & \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial \lambda\, \partial \alpha} & \dfrac{\partial^2 \ln L(a,\alpha,\lambda)}{\partial \lambda^2} \end{pmatrix}.$  (21)
The elements of the Fisher information matrix I given by (21) are immediately written from elements of the Hessian matrix H θ . However, an explicit form of the Fisher information matrix I cannot be derived. Fortunately, as an estimator, the observed information matrix can be used instead of matrix I. Note that the observed information matrix of the estimators is the negative value of the matrix H θ obtained at the last iteration.

3.2. Modified Methods

Since the GP is a monotonic stochastic process, some divergence problems may arise when estimating the ratio parameter $a$. To overcome this problem, estimating the ratio parameter $a$ nonparametrically is a widely used approach in statistical inference for the GP [2,24]. A nonparametric estimator for the ratio parameter $a$ is given by (see [6,25])
$\hat{a}_{NP} = \exp\left\{\dfrac{6 \sum_{i=1}^{n} (n - 2i + 1) \ln X_i}{(n-1)\, n\, (n+1)}\right\}.$  (22)
The estimator $\hat{a}_{NP}$ is unbiased and asymptotically normally distributed, see [25]. When $\hat{a}_{NP}$ given by (22) is substituted into Equation (1), we immediately obtain
$\hat{Y}_i = \hat{a}_{NP}^{\,i-1} X_i, \quad i = 1, 2, \ldots, n.$
Thus, the parameters $\alpha$ and $\lambda$ can be estimated by applying a selected estimation method to the estimators $\hat{Y}_i$ $(i = 1, 2, \ldots, n)$. This estimation rule is called the modified method (see [6]).
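A direct implementation of the nonparametric estimator (22) and of the modified sample $\hat{Y}_i$ can look as follows (a sketch of ours; function names are assumptions). As a sanity check, for a noise-free geometric sequence $X_i = c / a^{i-1}$ the estimator recovers $a$ exactly:

```python
import math

def a_np(x):
    """Nonparametric estimator of the GP ratio a, Eq. (22)."""
    n = len(x)
    s = sum((n - 2 * i + 1) * math.log(xi) for i, xi in enumerate(x, start=1))
    return math.exp(6.0 * s / ((n - 1) * n * (n + 1)))

def modified_sample(x):
    """Y_hat_i = a_np**(i-1) * X_i: approximately iid draws
    from the first-arrival distribution."""
    a = a_np(x)
    return [a ** (i - 1) * xi for i, xi in enumerate(x, start=1)]

# noise-free decreasing GP with ratio 1.07: a_np recovers the ratio exactly
x = [2.0 / 1.07 ** (i - 1) for i in range(1, 11)]
print(a_np(x))  # 1.07 up to floating-point error
```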

3.2.1. MM Estimation

Let $X_1, X_2, \ldots, X_n$ be a random sample from a GP with ratio $a$ and $X_1 \sim PL(\alpha, \lambda)$. In addition, we assume that the ratio parameter $a$ is nonparametrically estimated by (22). For the sample $X_1, X_2, \ldots, X_n$, the first and second sample moments, $m_1$ and $m_2$, are calculated by
$m_1 = \dfrac{1}{n} \sum_{i=1}^{n} \hat{a}_{NP}^{\,i-1} X_i$
and
$m_2 = \dfrac{1}{n} \sum_{i=1}^{n} \hat{a}_{NP}^{\,2(i-1)} X_i^{2},$
respectively. On the other hand, from Table 2, the first and second population moments of the distribution $PL(\alpha, \lambda)$, say $\mu_1$ and $\mu_2$, can be written as
$\mu_1 = \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)}$
and
$\mu_2 = \dfrac{2\,\Gamma(2/\alpha)\left[\alpha(\lambda+1)+2\right]}{\alpha^{2} \lambda^{2/\alpha} (\lambda+1)},$
respectively. Thus, the MM estimators of the parameters $\alpha$ and $\lambda$, $\hat{\alpha}_{MM}$ and $\hat{\lambda}_{MM}$, respectively, can be obtained from the solution of the following nonlinear equation system:
$\dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - m_1 = 0,$
$\dfrac{2\,\Gamma(2/\alpha)\left[\alpha(\lambda+1)+2\right]}{\alpha^{2} \lambda^{2/\alpha} (\lambda+1)} - m_2 = 0.$
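A rough numerical solution of the moment system above can be obtained even without a dedicated root-finder. The sketch below is ours (the grid ranges and step are arbitrary choices, not from the paper): it minimizes the sum of squared moment-equation residuals over a grid, as a stand-in for a proper nonlinear solver.

```python
import math

def pl_moments(alpha, lam):
    """First two raw moments of PL(alpha, lambda)."""
    mu1 = math.gamma(1 / alpha) * (alpha * (lam + 1) + 1) \
        / (alpha ** 2 * lam ** (1 / alpha) * (lam + 1))
    mu2 = 2 * math.gamma(2 / alpha) * (alpha * (lam + 1) + 2) \
        / (alpha ** 2 * lam ** (2 / alpha) * (lam + 1))
    return mu1, mu2

def mm_estimate(m1, m2):
    """Coarse grid search for (alpha, lambda) solving the moment system;
    a crude stand-in for Newton-type root-finding."""
    best, arg = float("inf"), (None, None)
    for ia in range(50, 301):          # alpha in [0.50, 3.00]
        alpha = ia / 100.0
        for il in range(10, 201):      # lambda in [0.10, 2.00]
            lam = il / 100.0
            mu1, mu2 = pl_moments(alpha, lam)
            err = (mu1 - m1) ** 2 + (mu2 - m2) ** 2
            if err < best:
                best, arg = err, (alpha, lam)
    return arg
```

Feeding the exact population moments back in should recover the generating parameters up to the grid resolution.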

3.2.2. MLM Estimation

In this subsection, the MLM estimators $\hat{\alpha}_{MLM}$ and $\hat{\lambda}_{MLM}$ are obtained for the parameters $\alpha$ and $\lambda$, respectively, when the ratio parameter $a$ is estimated by (22). L-moments estimators, which are based on linear combinations of order statistics, were proposed by Hosking [26]. Due to their useful and robust structure, L-moments estimators have been intensively studied and have been derived for the unknown parameters of many probability distributions. For more information about the L-moments, we refer the reader to [26].
As in the method of moments, the L-moments estimators are obtained by equating the population L-moments to the sample L-moments. In our problem, the first two sample and population L-moments are needed to obtain the estimators $\hat{\alpha}_{MLM}$ and $\hat{\lambda}_{MLM}$. Under the transformation $\hat{Y}_i = \hat{a}^{\,i-1} X_i$, $i = 1, \ldots, n$, the first and second sample L-moments are calculated as follows, see [26]:
$l_1 = \dfrac{1}{n} \sum_{i=1}^{n} \hat{Y}_{(i)},$
$l_2 = \dfrac{2}{n(n-1)} \sum_{i=1}^{n} (i-1)\, \hat{Y}_{(i)} - l_1,$
where $\hat{Y}_{(i)}$, $i = 1, 2, \ldots, n$ denote the ordered observations. On the other hand, using the notation of [26], the first and second population L-moments of $PL(\alpha, \lambda)$ are
$L_1 = \dfrac{\lambda}{\lambda+1}\, \dfrac{\Gamma\!\left(\frac{1}{\alpha}+1\right)}{\lambda^{1/\alpha}} + \dfrac{1}{\lambda+1}\, \dfrac{\Gamma\!\left(\frac{1}{\alpha}+2\right)}{\lambda^{1/\alpha}} = \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)}$
and
$L_2 = \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - \dfrac{2\alpha\lambda^{2}}{1+\lambda}\left[\dfrac{\Gamma\!\left(\frac{\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{\alpha+1}{\alpha}}} + \dfrac{\Gamma\!\left(\frac{2\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{2\alpha+1}{\alpha}}}\right] - \dfrac{2\alpha\lambda^{3}}{(1+\lambda)^{2}}\left[\dfrac{\Gamma\!\left(\frac{2\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{2\alpha+1}{\alpha}}} + \dfrac{\Gamma\!\left(\frac{3\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{3\alpha+1}{\alpha}}}\right].$
See Appendix A for the derivation of population L-moments.
Thus, the estimators α ^ M L M and λ ^ M L M are obtained from the numerical solution of the following nonlinear system:
$\dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - l_1 = 0,$
$\dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - \dfrac{2\alpha\lambda^{2}}{1+\lambda}\left[\dfrac{\Gamma\!\left(\frac{\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{\alpha+1}{\alpha}}} + \dfrac{\Gamma\!\left(\frac{2\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{2\alpha+1}{\alpha}}}\right] - \dfrac{2\alpha\lambda^{3}}{(1+\lambda)^{2}}\left[\dfrac{\Gamma\!\left(\frac{2\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{2\alpha+1}{\alpha}}} + \dfrac{\Gamma\!\left(\frac{3\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{3\alpha+1}{\alpha}}}\right] - l_2 = 0.$
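The sample L-moments $l_1$ and $l_2$ above are simple to compute; the helper below is our own sketch. For the toy sample $\{1, 2, 3\}$, $l_2$ equals the average half-range over all pairs, $2/3$, matching Hosking's definition:

```python
def sample_lmoments(y):
    """First two sample L-moments (Hosking, 1990):
    l1 is the sample mean of the ordered values,
    l2 = 2/(n(n-1)) * sum_i (i-1) * y_(i) - l1."""
    ys = sorted(y)
    n = len(ys)
    l1 = sum(ys) / n
    l2 = 2.0 / (n * (n - 1)) * sum((i - 1) * yi
                                   for i, yi in enumerate(ys, start=1)) - l1
    return l1, l2
```

Feeding the modified sample $\hat{Y}_i$ into `sample_lmoments` and equating the result to $(L_1, L_2)$ yields $\hat{\alpha}_{MLM}$ and $\hat{\lambda}_{MLM}$.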

3.2.3. MLS Estimation

The least-squares estimator (LSE) is a regression-based method proposed by Swain et al. [27]. Essentially, the method performs a nonlinear curve fit of the cdf of a random variable to the empirical distribution function. In the least-squares method, the estimator(s) are determined such that the squared difference between the empirical distribution function and the fitted curve is minimized. Let $Z_1, \ldots, Z_n$ be a random sample from a distribution with cdf $F_Z(\cdot)$, and let $Z_{(1)}, \ldots, Z_{(n)}$ denote the corresponding order statistics. In this situation, the LSEs of the parameters are obtained by minimizing
$\sum_{i=1}^{n} \left[F_Z\!\left(Z_{(i)}\right) - E\!\left[F_Z\!\left(Z_{(i)}\right)\right]\right]^{2}$
with respect to the parameters of $F_Z(\cdot)$ [27], where the expectation $E[F_Z(Z_{(i)})]$ is calculated by
$E\!\left[F_Z\!\left(Z_{(i)}\right)\right] = \dfrac{i}{n+1}, \quad i = 1, 2, \ldots, n.$
Therefore, in our problem, the MLS estimators of the parameters $\alpha$ and $\lambda$, say $\hat{\alpha}_{MLS}$ and $\hat{\lambda}_{MLS}$, respectively, are obtained by minimizing
$\sum_{j=1}^{n} \left[1 - \left(1 + \dfrac{\lambda}{\lambda+1}\, \hat{Y}_{(j)}^{\alpha}\right) e^{-\lambda \hat{Y}_{(j)}^{\alpha}} - \dfrac{j}{n+1}\right]^{2}$
with respect to α and λ .
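The least-squares criterion above can be coded directly from the Power Lindley cdf. In the sketch below (ours; names and the bisection bounds are assumptions), the quantile function is inverted by bisection solely to manufacture a test sample whose order statistics sit exactly at the plotting positions $j/(n+1)$, where the criterion vanishes at the true parameters:

```python
import math

def pl_cdf(x, alpha, lam):
    """Power Lindley cdf."""
    return 1 - (1 + lam / (lam + 1) * x ** alpha) * math.exp(-lam * x ** alpha)

def mls_objective(alpha, lam, y):
    """Sum of squared differences between the fitted cdf at the ordered
    sample and the plotting positions j/(n+1)."""
    ys = sorted(y)
    n = len(ys)
    return sum((pl_cdf(yj, alpha, lam) - j / (n + 1)) ** 2
               for j, yj in enumerate(ys, start=1))

def pl_quantile(p, alpha, lam, lo=1e-9, hi=100.0):
    """Invert the cdf by bisection (the cdf is monotone in x)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pl_cdf(mid, alpha, lam) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Minimizing `mls_objective` over $(\alpha, \lambda)$, e.g. with a grid or a general-purpose optimizer, gives $\hat{\alpha}_{MLS}$ and $\hat{\lambda}_{MLS}$.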

4. Simulation Study

In order to evaluate the estimation performances of the ML, the MLS, the MM and the MLM estimators obtained in the previous section, some Monte Carlo simulation experiments are presented in this section. Throughout the simulation experiments, the scale parameter $\lambda$ is taken as 0.5 without loss of generality, and the parameter $\alpha$ is set to 0.5, 1, 1.5 and 2. For the sample sizes $n = 30, 50, 100, 150$ and ratio parameter values $a = 0.90, 0.95, 1.05, 1.10$, the means, biases and MSEs of the ML, the MLS, the MM and the MLM estimators have been calculated. The obtained results, based on 10,000 replications, are displayed in Table 3, Table 4, Table 5 and Table 6.
Table 3. The simulated Means, Biases and n × MSE values for the MLE, MLS, MM and MLM estimators of the parameters a, α and λ , when α = 0.5 .
Table 4. The simulated Means, Biases and n × MSE values for the MLE, MLS, MM and MLM estimators of the parameters a, α and λ , when α = 1 .
Table 5. The simulated Means, Biases and n × MSE values for the MLE, MLS, MM and MLM estimators of the parameters a, α and λ , when α = 1.5 .
Table 6. The simulated Means, Biases and n × MSE values for the MLE, MLS, MM and MLM estimators of the parameters a, α and λ , when α = 2 .
When the results of the simulation experiments given in Table 3, Table 4, Table 5 and Table 6 are analyzed, it is seen that the ML estimators have smaller MSE values than the other estimators in all cases. Therefore, the ML estimators outperform the modified estimators with smaller bias and MSE. In addition, when the sample size n increases, the values of the bias and MSE decrease for all estimators. Based on this result, it can be said that all estimators obtained in the previous section are asymptotically unbiased and consistent.

5. Illustrative Examples

Practical applications of the parameter estimation procedures developed in Section 3 are illustrated in this section with two data sets, the Aircraft data set and the Coal mining disaster data set.
In the examples, we use two criteria, the mean-squared error (MSE*) [28] for the fitted values and the maximum percentage error (MPE), which are defined in [25], for comparing the stochastic processes GP and RP. The MSE* and MPE are described as follows:
  • MSE* $= \dfrac{1}{n} \sum_{k=1}^{n} \left(X_k - \hat{X}_k\right)^{2}$,
  • MPE $= \max_{1 \le k \le n} \left|S_k - \hat{S}_k\right| / S_k$,
where $\hat{X}_k$ is calculated by
$\hat{X}_k = \begin{cases} \hat{\mu}_{ML} / \hat{a}_{ML}^{\,k-1}, & \text{GP with the ML estimators}, \\ \hat{\mu}_{MLS} / \hat{a}_{MLS}^{\,k-1}, & \text{GP with the MLS estimators}, \\ \hat{\mu}_{MM} / \hat{a}_{MM}^{\,k-1}, & \text{GP with the MM estimators}, \\ \hat{\mu}_{MLM} / \hat{a}_{MLM}^{\,k-1}, & \text{GP with the MLM estimators}, \\ \hat{\mu}_{ML}, & \text{RP with the ML estimators}, \end{cases}$
and $S_k = X_1 + X_2 + \cdots + X_k$, $k = 1, 2, \ldots, n$ and $\hat{S}_k = \sum_{j=1}^{k} \hat{X}_j$. Then, in order to compare the relative performances of the RP and the four GPs with the ML, the MLS, the MM and the MLM estimators, the plots of $S_k$ and $\hat{S}_k$ against $k$, $k = 1, 2, \ldots, n$ can be used.
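The two comparison criteria are straightforward to compute from a fitted mean and ratio. The helper functions below are our own sketch (with $\hat{a} = 1$ the GP fitted values reduce to the RP ones):

```python
def fitted_values(mu_hat, a_hat, n):
    """X_hat_k = mu_hat / a_hat**(k-1); a_hat = 1 gives the RP fit."""
    return [mu_hat / a_hat ** (k - 1) for k in range(1, n + 1)]

def mse_star(x, x_hat):
    """MSE* = (1/n) * sum_k (X_k - X_hat_k)**2."""
    return sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat)) / len(x)

def mpe(x, x_hat):
    """MPE = max_k |S_k - S_hat_k| / S_k over the partial sums."""
    s = sh = 0.0
    worst = 0.0
    for xi, xh in zip(x, x_hat):
        s += xi
        sh += xh
        worst = max(worst, abs(s - sh) / s)
    return worst
```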
Example 1.
Aircraft data.
The aircraft dataset consists of 30 observations that deal with the air-conditioning system failure times of a Boeing 720 aircraft (aircraft number 7912). The aircraft dataset was originally studied by Proschan [29]. The successive failure times in aircraft dataset are 23, 261, 87, 7, 120, 14, 62, 47, 225, 71, 246, 21, 42, 20, 5, 12, 120, 11, 3, 14, 71, 11, 14, 11, 16, 90, 1, 16, 52, 95.
We first investigate the underlying distribution of the data set. To test whether the underlying distribution of the data $\{X_1, \ldots, X_n\}$ is Power Lindley, the following procedure can be used. From Definition 1, we know that $Y_i = a^{i-1} X_i$ and that the $Y_i$'s follow the Power Lindley distribution. Taking the logarithm of $Y_i$, we can immediately write $\ln Y_i = (i-1)\ln a + \ln X_i$. Note that the $\ln Y_i$'s follow the Log-Power Lindley distribution. Therefore, a linear regression model
$\ln X_i = \mu - (i-1)\ln a + \varepsilon_i$  (39)
can be defined, where $\mu = E[\ln Y_i]$ and $\exp(\varepsilon_i) \sim PL(\theta, \beta)$. If the exponentiated errors are Power Lindley distributed, then the underlying distribution of the data set is Power Lindley. The error term $\varepsilon_i$ in Equation (39) can be estimated by
$\hat{\varepsilon}_i = \ln X_i - \hat{\mu} + (i-1)\ln \hat{a}_{NP},$
where $\hat{\mu} = \frac{1}{n}\left[\frac{n(n-1)}{2}\ln \hat{a}_{NP} + \sum_{i=1}^{n} \ln X_i\right]$. Thus, the consistency of the exponentiated errors with the Power Lindley distribution can be tested by using a goodness-of-fit test such as the Kolmogorov–Smirnov (K-S) test. The parameter estimates for the exponentiated errors are $\hat{\theta} = 0.9717$ and $\hat{\beta} = 0.7985$; the value of the K-S test statistic is 0.1225 and the corresponding p-value is 0.7134. Therefore, it can be said that the underlying distribution of this data set is Power Lindley. This can also be seen from Figure 2, which illustrates the plots of the empirical and the fitted cdf.
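The residual-based check can be scripted as below (our own sketch; the function name is an assumption). For an exactly geometric, noise-free sequence the residuals vanish, which validates both the $\hat{\mu}$ formula and the residual expression above:

```python
import math

def gp_residuals(x):
    """epsilon_hat_i = ln X_i - mu_hat + (i-1) * ln a_np,
    from the regression model (39)."""
    n = len(x)
    logs = [math.log(xi) for xi in x]
    s = sum((n - 2 * i + 1) * li for i, li in enumerate(logs, start=1))
    ln_a = 6.0 * s / ((n - 1) * n * (n + 1))           # ln a_np, Eq. (22)
    mu_hat = (n * (n - 1) / 2 * ln_a + sum(logs)) / n  # estimate of E[ln Y_i]
    return [li - mu_hat + (i - 1) * ln_a for i, li in enumerate(logs, start=1)]
```

The exponentiated residuals $\exp(\hat{\varepsilon}_i)$ are then compared with the fitted $PL(\theta, \beta)$ cdf via the K-S statistic.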
Figure 2. Empirical and fitted cdf of the exp ( ε ^ i ) .
When a GP with the Power Lindley distribution is used for modeling this data set, the ML, the MLS, the MM and the MLM estimates of the parameters $a$, $\alpha$ and $\lambda$, together with the MSE* and MPE values, are tabulated in Table 7.
As can be seen from Table 7, the ML estimates have the smallest MSE* values, consistent with the simulation experiments. The estimator having the smallest MPE is the MM, but the MPE values of the ML and the MLM estimators are very close to that of the MM.
Table 7. Estimation of parameters for the Aircraft data set and the process comparison.
Now, we check the model optimality. In order to select an optimal model for a data set, the Akaike information criterion (AIC) and the maximized log-likelihood (L) value are commonly used. To decide on an optimal model for this data set among the Power Lindley distribution and its alternatives (Log-Normal, Gamma and inverse Gaussian), we compute the −L and AIC values. The −L and AIC values for the models are given in Table 8.
Table 8. AIC and negative Log-Likelihood values for the Aircraft data set.
The results given in Table 8 show that the Power Lindley distribution is the optimal model for the aircraft data set, with the smallest AIC and −L values. In Figure 3, the failure times for the aircraft data and their fitted times are plotted.
Figure 3. The plots of S k and S ^ k against the number of failures for the aircraft data.
The modeling performances of the RP and the GPs with the ML, the MLS, the MM and the MLM estimators can be easily compared using Figure 3. As Figure 3 shows, the GPs with the ML, the MLS, the MM and the MLM estimators outperform the RP. This result is compatible with Table 7.
Example 2.
Coal mining disaster data.
This example is from [30] on the intervals in days between successive disasters in Great Britain from 1851 to 1962. The coal mining disaster data set, which includes 190 successive intervals, was used as an illustrative example for GP by [7,10,24].
For this data set, the value of the K-S test statistic is 0.0396 and the corresponding p-value is 0.9148. Thus, the Power Lindley distribution is an appropriate model for the coal mining disaster data. This is also supported by the Q-Q plot in Figure 4, constructed by plotting the ordered exponentiated errors $\exp(\hat{\varepsilon}_i)$ against the quantiles of $PL(0.9552, 0.7930)$: the data points fall approximately on the straight line.
Figure 4. Q-Q plot of the coal mining disaster data.
For the coal mining disaster data set, the estimates of the parameters a, λ and α are given in Table 9.
Table 9. Estimation of parameters for the coal mining disaster data and process comparison.
Moreover, calculated AIC and L values for the coal mining disaster data set with the different models are tabulated in Table 10.
Table 10. AIC and negative Log-Likelihood values for coal mining disaster data.
We can say that the Power Lindley distribution is the optimal model for this data set since it has the minimum AIC and −L values. Under the assumption that the underlying distribution of the data is Power Lindley, we present Figure 5 for comparing the modeling performance of the RP and the GPs with the four estimators obtained in the previous section. Figure 5 plots $S_k$ and $\hat{S}_k$ versus the number of disasters $k$, $k = 1, 2, \ldots, n$.
Figure 5. The plots S k and S ^ k against the number of failures for the coal mining disaster data.
As can be seen in Figure 5, the GPs with the ML, the MLS, the MM and the MLM estimators follow the real values more closely than the RP. It can also be seen from Table 9 that the MSE* and MPE values of the GP models are much smaller than those of the RP. Thus, according to Figure 5 and Table 9, it is concluded that the GP provides a better data fit than the RP.

6. Conclusions

In this paper, we have discussed the parameter estimation problem for the GP by assuming that the distribution of the first inter-arrival time is Power Lindley with parameters α and λ. The parameter estimation problem has been addressed from two points of view, parametric (ML) and nonparametric (MLS, MM and MLM). The parametric (ML) estimators of the parameters $a$, $\alpha$ and $\lambda$ are also asymptotically normally distributed. However, more work needs to be done before anything can be said about the asymptotic properties of the nonparametric estimators MLS, MM and MLM. This is usually not an easy task, because an analytical form of these estimators cannot be written.
Numerical results have shown that the ML estimators outperform the MLS, MM and MLM estimators with smaller bias and MSE measures. In addition, it has been observed that both the bias and MSE values of all estimators decrease when the sample size increases. Hence, in light of the numerical studies, it can be concluded that all of the estimators are asymptotically unbiased and consistent.
In the illustrative examples presented to demonstrate the data modeling performance of GP with the obtained estimators, the GP with Power Lindley distribution gives a better data fit than the RP in both examples. In addition, according to AIC and -L values, it can be said that modeling both the aircraft dataset and the coal mining disaster dataset using a GP with Power Lindley distribution is more appropriate than a GP with Gamma, log-normal or inverse Gaussian distribution. Therefore, it can be said that a GP with Power Lindley distribution is a quite important alternative to a GP with famous distributions such as Gamma, Log-normal or inverse Gaussian in modeling the successive inter-arrival times.

Funding

This research received no external funding.

Acknowledgments

The author is sincerely grateful to the anonymous referees for their comments and suggestions that greatly improved an early version of this paper.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

According to [26], the first two population L-moments of a continuous distribution are calculated by
$L_1 = \int x f(x)\, dx$  (A1)
and
$L_2 = \int x \left[2F(x) - 1\right] f(x)\, dx,$  (A2)
respectively, where $f(x)$ and $F(x)$ are the pdf and the cdf of the random variable $X$. Obviously, the first population L-moment $L_1$ is the expectation of the random variable $X$. Thus, from Table 2, the first L-moment of the Power Lindley distribution is
$L_1 = \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)}.$
Now, we calculate the second population L-moment of Power Lindley distribution. Substituting the cdf given by (5) into (A2), we can write
$L_2 = \int_0^{\infty} x \left[2\left(1 - \left(1 + \dfrac{\lambda}{\lambda+1} x^{\alpha}\right) e^{-\lambda x^{\alpha}}\right) - 1\right] f(x)\, dx$
$= \int_0^{\infty} x f(x)\, dx - 2 \int_0^{\infty} x \left(1 + \dfrac{\lambda}{\lambda+1} x^{\alpha}\right) e^{-\lambda x^{\alpha}} f(x)\, dx$
$= E[X] - 2 \int_0^{\infty} x\, e^{-\lambda x^{\alpha}} f(x)\, dx - \dfrac{2\lambda}{\lambda+1} \int_0^{\infty} x^{\alpha+1} e^{-\lambda x^{\alpha}} f(x)\, dx$
$= \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - 2 \int_0^{\infty} x\, e^{-\lambda x^{\alpha}} f(x)\, dx - \dfrac{2\lambda}{\lambda+1} \int_0^{\infty} x^{\alpha+1} e^{-\lambda x^{\alpha}} f(x)\, dx.$
In the last equality, using f x given in (4), it can be easily written
$L_2 = \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - \dfrac{2\alpha\lambda^{2}}{1+\lambda} \int_0^{\infty} \left(1 + x^{\alpha}\right) x^{\alpha} e^{-2\lambda x^{\alpha}}\, dx - \dfrac{2\alpha\lambda^{3}}{(1+\lambda)^{2}} \int_0^{\infty} \left(1 + x^{\alpha}\right) x^{2\alpha} e^{-2\lambda x^{\alpha}}\, dx$
$= \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - \dfrac{2\alpha\lambda^{2}}{1+\lambda} \left[\int_0^{\infty} x^{\alpha} e^{-2\lambda x^{\alpha}}\, dx + \int_0^{\infty} x^{2\alpha} e^{-2\lambda x^{\alpha}}\, dx\right] - \dfrac{2\alpha\lambda^{3}}{(1+\lambda)^{2}} \left[\int_0^{\infty} x^{2\alpha} e^{-2\lambda x^{\alpha}}\, dx + \int_0^{\infty} x^{3\alpha} e^{-2\lambda x^{\alpha}}\, dx\right].$
Therefore, by using the gamma function, we have
$L_2 = \dfrac{\Gamma(1/\alpha)\left[\alpha(\lambda+1)+1\right]}{\alpha^{2} \lambda^{1/\alpha} (\lambda+1)} - \dfrac{2\alpha\lambda^{2}}{1+\lambda}\left[\dfrac{\Gamma\!\left(\frac{\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{\alpha+1}{\alpha}}} + \dfrac{\Gamma\!\left(\frac{2\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{2\alpha+1}{\alpha}}}\right] - \dfrac{2\alpha\lambda^{3}}{(1+\lambda)^{2}}\left[\dfrac{\Gamma\!\left(\frac{2\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{2\alpha+1}{\alpha}}} + \dfrac{\Gamma\!\left(\frac{3\alpha+1}{\alpha}\right)}{\alpha\, (2\lambda)^{\frac{3\alpha+1}{\alpha}}}\right].$

References

  1. Lam, Y. Geometric processes and replacement problem. Acta Math. Appl. Sin. 1988, 4, 366–377.
  2. Lam, Y. The Geometric Process and Its Applications; World Scientific: Singapore, 2007.
  3. Demirci Bicer, H.; Bicer, C. Discrimination Between Gamma and Lognormal Distributions for Geometric Process Data. In Researches on Science and Art in 21st Century Turkey; Gece Publishing: Ankara, Turkey, 2017; Chapter 315; pp. 2830–2836.
  4. Demirci Bicer, H.; Bicer, C. Discriminating between the Gamma and Weibull Distributions for Geometric Process Data. Int. J. Econ. Admin. Stud. 2018, 18, 239–252.
  5. Chan, J.S.; Lam, Y.; Leung, D.Y. Statistical inference for geometric processes with gamma distributions. Comput. Stat. Data Anal. 2004, 47, 565–581.
  6. Lam, Y.; Chan, S.K. Statistical inference for geometric processes with lognormal distribution. Comput. Stat. Data Anal. 1998, 27, 99–112.
  7. Kara, M.; Aydogdu, H.; Turksen, O. Statistical inference for geometric process with the inverse Gaussian distribution. J. Stat. Comput. Simul. 2015, 85, 3206–3215.
  8. Biçer, C.; Biçer, H.D.; Kara, M.; Aydoğdu, H. Statistical inference for geometric process with the Rayleigh distribution. Commun. Fac. Sci. Univ. Ankara Ser. A1 Math. Stat. 2019, 68, 149–160.
  9. Bicer, C. Statistical Inference for Geometric Process with the Two-parameter Rayleigh Distribution. In The Most Recent Studies in Science and Art; Gece Publishing: Ankara, Turkey, 2018; Chapter 43; pp. 576–583.
  10. Bicer, C.; Demirci Bicer, H. Statistical Inference for Geometric Process with the Lindley Distribution. In Researches on Science and Art in 21st Century Turkey; Gece Publishing: Ankara, Turkey, 2017; Chapter 314; pp. 2821–2829.
  11. Ghitany, M.; Al-Mutairi, D.K.; Balakrishnan, N.; Al-Enezi, L. Power Lindley distribution and associated inference. Comput. Stat. Data Anal. 2013, 64, 20–33.
  12. Pararai, M.; Warahena-Liyanage, G.; Oluyede, B.O. Exponentiated power Lindley–Poisson distribution: Properties and applications. Commun. Stat. Theory Methods 2017, 46, 4726–4755.
  13. Ashour, S.K.; Eltehiwy, M.A. Exponentiated power Lindley distribution. J. Adv. Res. 2015, 6, 895–905.
  14. Alizadeh, M.; MirMostafaee, S.; Ghosh, I. A new extension of power Lindley distribution for analyzing bimodal data. Chil. J. Stat. 2017, 8, 67–86.
  15. Warahena-Liyanage, G.; Pararai, M. A generalized power Lindley distribution with applications. Asian J. Math. Appl. 2014, 2014, ama0169.
  16. Alizadeh, M.; Altun, E.; Ozel, G. Odd Burr power Lindley distribution with properties and applications. Gazi Univ. J. Sci. 2017, 30, 139–159.
  17. Iriarte, Y.A.; Rojas, M.A. Slashed power-Lindley distribution. Commun. Stat. Theory Methods 2018, 1–12.
  18. Barco, K.V.P.; Mazucheli, J.; Janeiro, V. The inverse power Lindley distribution. Commun. Stat. Simul. Comput. 2017, 46, 6308–6323.
  19. Alizadeh, M.; MirMostafaee, S.K.; Altun, E.; Ozel, G.; Khan Ahmadi, M. The odd log-logistic Marshall-Olkin power Lindley distribution: Properties and applications. J. Stat. Manag. Syst. 2017, 20, 1065–1093.
  20. Demirci Bicer, H. Parameter Estimation for Finite Mixtures of Power Lindley Distributions. In The Most Recent Studies in Science and Art; Gece Publishing: Ankara, Turkey, 2018; Chapter 128; pp. 1659–1669.
  21. Arellano-Valle, R.B.; Contreras-Reyes, J.E.; Stehlík, M. Generalized skew-normal negentropy and its application to fish condition factor time series. Entropy 2017, 19, 528.
  22. Abramowitz, M.; Stegun, I.A. Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables; Courier Corporation: Chelmsford, MA, USA, 1964; Volume 55.
  23. Cox, D.; Barndorff-Nielsen, O. Inference and Asymptotics; CRC Press: Boca Raton, FL, USA, 1994.
  24. Yeh, L. Nonparametric inference for geometric processes. Commun. Stat. Theory Methods 1992, 21, 2083–2105.
  25. Lam, Y.; Zhu, L.X.; Chan, J.S.; Liu, Q. Analysis of data from a series of events by a geometric process model. Acta Math. Appl. Sin. Engl. Ser. 2004, 20, 263–282.
  26. Hosking, J.R. L-moments: Analysis and estimation of distributions using linear combinations of order statistics. J. R. Stat. Soc. Ser. B (Methodol.) 1990, 52, 105–124.
  27. Swain, J.J.; Venkatraman, S.; Wilson, J.R. Least-squares estimation of distribution functions in Johnson’s translation system. J. Stat. Comput. Simul. 1988, 29, 271–297.
  28. Wand, M.; Jones, M. Kernel Smoothing; Chapman & Hall: London, UK, 1995.
  29. Proschan, F. Theoretical explanation of observed decreasing failure rate. Technometrics 1963, 5, 375–383.
  30. Andrews, D.; Herzberg, A. Data; Springer: New York, NY, USA, 1985.
