1. Introduction
The exponential and Lindley distributions [1] play an essential role in distribution theory as baselines of many generalizations. Although the Lindley distribution can outperform the exponential distribution in some cases [2], neither distribution may be adequate in many problems. Therefore, there is room for improvement.
Let T be a non-negative random variable with the probability density function (PDF) given by:
where λ > 0 is the shape parameter.
The cumulative distribution function (CDF) related to the proposed distribution is:
The proposed distribution can be viewed as a combination of a reparametrized version of the Zakerzadeh and Dolati [3] distribution with a particular case of the gamma model and the occurrence of zero value. This paper aims to provide some properties of the proposed distribution (1). The remainder of this study is set out as follows.
Section 2 discusses the genesis of the proposed model and its relationship with the exponential and Lindley distributions.
Section 3 gives the rth moment, mean, variance, coefficient of variation, skewness, and kurtosis.
Section 4 describes the shapes of the PDF, survival, hazard, and mean residual life functions of the proposed distribution.
Section 5 presents the maximum likelihood estimator (MLE) and the moment estimator (ME) of λ.
Section 6 presents a simulation study to compare the MLE and ME performance.
Section 7 illustrates the relevance of the proposed distribution with four real datasets.
Section 8 summarizes the present study.
2. Genesis
The exponential distribution has played a crucial role in statistical theory, either to describe real situations or as a basis for more flexible models. Its PDF is given by:
Most current generalized models use (2) as a baseline (see Tahir and Cordeiro [4] for a recent discussion of generalized models). Rao [5] discussed another concept, weighted distributions, for describing sampling situations.
Let
be a baseline distribution of the random variable
X and
be the weight function (see Bartoszewicz [6] for a detailed discussion) where
where:
Then, the weighted distribution of X has the PDF given by:
If
can be rewritten as (3), then
is a weighted distribution of f(x).
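For reference, the standard definition of a weighted distribution can be written as below. This is only a sketch in assumed notation: the symbols $w$ for the weight function, $f_w$ for the weighted PDF, and the expectation $\mathbb{E}[w(X)]$ are chosen here for illustration and may differ from those used in the displayed equations of this section. The integral is taken over the support of a non-negative random variable $X$ with baseline PDF $f$.

```latex
f_w(x) = \frac{w(x)\, f(x)}{\mathbb{E}[w(X)]},
\qquad
\mathbb{E}[w(X)] = \int_0^{\infty} w(x)\, f(x)\, dx,
\qquad
w(x) \ge 0, \quad 0 < \mathbb{E}[w(X)] < \infty .
```

Dividing by $\mathbb{E}[w(X)]$ is what makes $f_w$ integrate to one, so any non-negative weight with a finite, positive expectation yields a proper density.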
Besides the exponential distribution, another important baseline model is the Lindley distribution, which has the PDF given by:
Although this distribution was proposed in the context of fiducial distributions, it can easily be noted that:
where
. Therefore, the Lindley distribution is a weighted model. Following the same idea, we considered that:
Note that the three distributions are closely related and differ by weight functions and . The new distribution tends to behave more like the Lindley distribution for small values of , while as and , it tends to recapture the exponential distribution shape.
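The statement that the Lindley distribution is a weighted version of the exponential model can be checked numerically. The sketch below assumes the rate parameterization of the exponential PDF, θ·exp(−θx), and the usual one-parameter Lindley PDF, θ²/(1+θ)·(1+x)·exp(−θx); the name `theta`, the weight w(x) = 1 + x, and the integration settings are illustrative choices and are not taken from the displayed equations of this paper.

```python
import math

def exponential_pdf(x, theta):
    """Exponential density with rate theta (assumed parameterization)."""
    return theta * math.exp(-theta * x)

def lindley_pdf(x, theta):
    """Lindley density in its usual one-parameter form."""
    return theta ** 2 / (1.0 + theta) * (1.0 + x) * math.exp(-theta * x)

def weight(x):
    """Weight w(x) = 1 + x that maps the exponential to the Lindley model."""
    return 1.0 + x

def expected_weight(theta, upper=100.0, steps=100_000):
    """Approximate E[w(X)] under the exponential model (trapezoid rule)."""
    h = upper / steps
    total = 0.5 * (weight(0.0) * exponential_pdf(0.0, theta)
                   + weight(upper) * exponential_pdf(upper, theta))
    for i in range(1, steps):
        x = i * h
        total += weight(x) * exponential_pdf(x, theta)
    return total * h

def weighted_pdf(x, theta, norm):
    """Weighted density w(x) f(x) / E[w(X)] with the exponential baseline."""
    return weight(x) * exponential_pdf(x, theta) / norm

theta = 1.5                      # illustrative value only
norm = expected_weight(theta)    # closed form is 1 + 1/theta
points = [0.0, 0.5, 1.0, 2.0, 5.0]
max_gap = max(abs(weighted_pdf(x, theta, norm) - lindley_pdf(x, theta))
              for x in points)
```

Here `max_gap` measures the largest pointwise difference between the weighted exponential density and the Lindley density on a few test points; it should be at the level of the numerical integration error.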
In Figure 1, we can observe the influence of the weight functions
and
on the different distributions. As just pointed out above, for small values of
, the proposed distribution behaves more like the Lindley distribution. On the other hand, as
increases, it behaves more like the exponential distribution. However, the exponential decay of the proposed distribution is slower than that of the exponential distribution, which allows us to fit data with extreme values. Therefore, our distribution is more flexible for describing different types of lifetime data.
Recently, Zakerzadeh and Dolati [3] introduced a three-parameter generalized Lindley distribution with the PDF given by:
where
and
. Note that, by considering
where
,
and
, we have the PDF (1). On the other hand, to achieve our distribution, one needs to combine the reparametrized model with
when
and then extend the obtained model for
. It is worth mentioning that Zakerzadeh and Dolati’s model [3] is not defined at zero, whereas accommodating zero values is one of our main aims.
3. Moments
The rth moment for the proposed distribution is given by:
Moreover, the rth central moment is given as follows:
In particular, it is simple to show that the mean and the variance are given by:
From (5), the coefficient of variation, skewness, and kurtosis are given by:
Figure 2 shows some examples of the shapes of the coefficient of variation, skewness, and kurtosis according to different values of λ.
4. Shapes
The PDF is unimodal for
and decreasing for
. The behavior of (
Figure 3) when
and
is, respectively, given by:
Therefore, the proposed distribution allows us to fit data with zero occurrences (inliers), which are common in many problems, especially in hydrology and reliability. In Figure 3, we illustrate the shapes of the PDF for different values of λ.
The survival function for the proposed distribution is given by:
The hazard function plays an important role in lifetime distributions, being one of the most important quantities to characterize the lifetime phenomenon. For the proposed distribution, its hazard function is given by:
Note that,
is increasing for all
t and
, which implies that the hazard rate function (7) has an increasing shape. Figure 4 gives examples of the shapes of the hazard function for different values of λ.
The behavior of the hazard function (Figure 4) when
and
is, respectively, given by:
From Figure 4, we can see that,
as
, i.e., the hazard rate converges to a constant. Therefore, the new distribution converges to an exponential distribution when
.
The mean residual life (MRL) has been widely used in survival analysis and represents the expected additional lifetime given that a component has survived until time t. The MRL function is computed by:
Solving the integral, we have:
which implies that:
The behavior of the MRL when
and
is
and
. For a non-negative random variable, if the hazard function is increasing, then the MRL function is decreasing (see Bryson and Siddiqui [7]).
Figure 5 presents some shapes of the MRL function for different values of λ.
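The link between an increasing hazard and a decreasing MRL can be illustrated with the Lindley baseline, whose survival, hazard, and MRL functions have well-known closed forms. The sketch below is not the proposed model: it uses the Lindley distribution as a stand-in, with the parameter name `theta` and the integration settings chosen here for illustration. The MRL is computed from its general definition, the integral of S(y) over y ≥ t divided by S(t), and compared with the closed form.

```python
import math

def lindley_survival(t, theta):
    """Survival function of the Lindley distribution."""
    return (1.0 + theta + theta * t) / (1.0 + theta) * math.exp(-theta * t)

def lindley_hazard(t, theta):
    """Hazard f(t)/S(t) of the Lindley distribution."""
    return theta ** 2 * (1.0 + t) / (1.0 + theta + theta * t)

def lindley_mrl_closed_form(t, theta):
    """Closed-form mean residual life of the Lindley distribution."""
    return (2.0 + theta + theta * t) / (theta * (1.0 + theta + theta * t))

def mrl_numeric(t, theta, width=60.0, steps=60_000):
    """MRL from its definition, truncating the integral at t + width."""
    h = width / steps
    total = 0.5 * (lindley_survival(t, theta)
                   + lindley_survival(t + width, theta))
    for i in range(1, steps):
        total += lindley_survival(t + i * h, theta)
    return total * h / lindley_survival(t, theta)

theta = 1.0                                  # illustrative value only
grid = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
hazards = [lindley_hazard(t, theta) for t in grid]
mrls = [lindley_mrl_closed_form(t, theta) for t in grid]

hazard_increasing = all(a < b for a, b in zip(hazards, hazards[1:]))
mrl_decreasing = all(a > b for a, b in zip(mrls, mrls[1:]))
max_error = max(abs(mrl_numeric(t, theta) - m) for t, m in zip(grid, mrls))
```

For this stand-in, the hazard rises toward θ and the MRL falls toward 1/θ as t grows, which is the same qualitative pattern (an increasing hazard paired with a decreasing MRL) described above.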
Zakerzadeh and Dolati [3] presented the moment generating function for the three-parameter distribution, which, after some algebra and using the reparametrization discussed in Section 2, can be used to obtain the rth moment of our distribution. Moreover, they also studied the behavior of the hazard function of (4), which has an increasing shape when
, confirming that our hazard function is increasing. Hereafter, we discuss many properties of the proposed model. Owing to the different form of our proposed distribution, as well as its simple structure, the results discussed in this paper differ from those presented by the authors above.
5. Inference
In this section, we present the maximum likelihood and the moment estimators for the parameter of the proposed distribution.
The method of moments is one of the simplest estimation procedures, which for a one-parameter distribution can be obtained by equating the first theoretical moment with the sample mean, i.e.,
, where
. After some algebraic manipulation, the solution has a closed-form expression and is given by:
We can construct asymptotic confidence intervals for
by using the fact that
, as
, where the standard deviation
can be obtained by the delta method. The expression is obtained by
in which
, and
and
are given in (
6). After some algebraic manipulations, we have
if
or:
For the maximum likelihood estimation, let T_1, …, T_n be a random sample such that T has the PDF given in (1); then the likelihood function is given by:
The log-likelihood function
is given by:
and the score function is given by:
Solving , we obtain , i.e., the MLE of .
Remark 1. If , then is positive for all ; on the other hand, if λ increases, becomes negative for all . This implies that there is at least one solution for . However, proving that the solution is unique is a hard task since it is difficult to study the behavior of the derivative .
Under mild conditions, the maximum likelihood estimate is asymptotically normally distributed, with the distribution given by
, where
is the Fisher information element given by:
and Ei
, where
is the exponential integral function.
6. Simulation Analysis
In this section, a simulation study is presented to compare the efficiency of the maximum likelihood method with the method of moments. The simulation study is performed over N = 5,000,000 samples generated from the new distribution with
and
. This comparison is performed by computing the bias and the mean squared errors (MSE) given by:
where N is the number of samples, as well as the coverage probabilities with a
confidence level. Note that the new distribution can be expressed as a two-component mixture:
where
(or
) and
for
. Therefore, the dataset is generated as follows,
Generate , and ;
If , then set , otherwise, set .
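The two-step scheme above is the usual way of drawing from a two-component mixture. A minimal sketch follows, in which the mixing weight `p` and the two gamma components are hypothetical placeholders and not the components defined in the text. It draws only the selected component instead of generating both candidates first, which gives the same distribution.

```python
import random

def sample_mixture(p, draw_first, draw_second, rng):
    """Draw one value: first component with probability p, else second."""
    u = rng.random()
    return draw_first(rng) if u <= p else draw_second(rng)

def sample_many(n, p, draw_first, draw_second, seed=0):
    """Draw n independent values from the two-component mixture."""
    rng = random.Random(seed)
    return [sample_mixture(p, draw_first, draw_second, rng) for _ in range(n)]

# Hypothetical components, used only to exercise the sampler:
def draw_gamma_shape1(rng):
    return rng.gammavariate(1.0, 1.0)   # shape 1, scale 1 (mean 1)

def draw_gamma_shape2(rng):
    return rng.gammavariate(2.0, 1.0)   # shape 2, scale 1 (mean 2)

p = 0.3                                  # hypothetical mixing weight
sample = sample_many(20_000, p, draw_gamma_shape1, draw_gamma_shape2, seed=42)
sample_mean = sum(sample) / len(sample)
theoretical_mean = p * 1.0 + (1.0 - p) * 2.0
```

Comparing `sample_mean` with `theoretical_mean` is a quick sanity check that the mixing weight is applied to the intended component.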
By considering this approach, it is expected that the best estimation method will return the average bias and the MSE close to zero. For the coverage probability, the frequencies of intervals that covered the true values of under a confidence level of should be closer to .
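The three performance measures used in this comparison (bias, MSE, and coverage probability) can be computed as in the following sketch. Because it is meant only to show the bookkeeping, it uses the exponential distribution with rate `theta` as a stand-in for the proposed model, the MLE n / Σt, and a Wald interval with standard error θ̂/√n; the sample size, the number of replications, and the 95% level are illustrative assumptions.

```python
import math
import random

def simulate(theta=2.0, n=50, n_rep=2000, seed=1):
    """Monte Carlo bias, MSE and coverage of the exponential-rate MLE."""
    rng = random.Random(seed)
    z = 1.959963984540054            # 97.5% standard normal quantile
    errors = []
    covered = 0
    for _ in range(n_rep):
        sample = [rng.expovariate(theta) for _ in range(n)]
        theta_hat = n / sum(sample)              # MLE of the rate
        se = theta_hat / math.sqrt(n)            # from the Fisher information
        errors.append(theta_hat - theta)
        if theta_hat - z * se <= theta <= theta_hat + z * se:
            covered += 1
    bias = sum(errors) / n_rep
    mse = sum(e * e for e in errors) / n_rep
    coverage = covered / n_rep
    return bias, mse, coverage

bias, mse, coverage = simulate()
```

An estimator that performs well should give `bias` and `mse` close to zero and `coverage` close to the nominal level, and all three should improve as `n` grows.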
The results are condensed in Table 1, Table 2 and Table 3, which present the average bias and average MSE, and the coverage probability with a
confidence level of the estimates obtained from the maximum likelihood and the moment estimation approaches for different samples.
From the obtained results, we can conclude that as n increases, both bias and MSE tend to zero and the coverage probability of the confidence levels tends to the nominal value of . However, the MLE returned better results especially for small values of since the bias, and the MSE were closer to zero than the ME. It is important to point out that the results and the same conclusions were obtained considering different choices of . Therefore, the MLE should be used as an estimator for the parameter of the new proposed distribution.
7. Application
In this section, we fit four datasets using the exponential, the Lindley, and the new distribution. The maximum likelihood estimates of the parameters were computed, as well as the respective standard errors (SE). The Kolmogorov–Smirnov (KS) statistic (see Massey Jr. [8] for more details) is presented to check the goodness of fit. The Akaike [9] information criterion (AIC), computed by $\mathrm{AIC} = -2\ell(\hat{\theta}; t) + 2k$, is used as the discrimination criterion, where $\hat{\theta}$ is the MLE of $\theta$ and k is the number of parameters in the model. The best distribution is the one that provides the minimum AIC.
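A comparison by AIC can be sketched as follows for the two baseline models. The data values are hypothetical and are not any of the four datasets analyzed here. The exponential is taken in its rate parameterization, the Lindley MLE uses the closed-form root of its score equation, and the function names are illustrative. The proposed distribution is omitted because its likelihood is not reproduced in this sketch.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2 * log-likelihood + 2 * k."""
    return -2.0 * loglik + 2.0 * k

def fit_exponential(data):
    """MLE of the exponential rate and the maximized log-likelihood."""
    n = len(data)
    theta = n / sum(data)
    loglik = n * math.log(theta) - theta * sum(data)
    return theta, loglik

def fit_lindley(data):
    """MLE of the Lindley parameter (closed form) and its log-likelihood."""
    n = len(data)
    mean = sum(data) / n
    theta = (-(mean - 1.0)
             + math.sqrt((mean - 1.0) ** 2 + 8.0 * mean)) / (2.0 * mean)
    loglik = (n * (2.0 * math.log(theta) - math.log(1.0 + theta))
              + sum(math.log(1.0 + x) for x in data)
              - theta * sum(data))
    return theta, loglik

# Hypothetical lifetimes, for illustration only.
data = [0.5, 1.2, 2.3, 0.8, 3.1, 1.7, 0.9, 2.6, 1.4, 4.0]

theta_exp, loglik_exp = fit_exponential(data)
theta_lin, loglik_lin = fit_lindley(data)
aic_exp = aic(loglik_exp, k=1)
aic_lin = aic(loglik_lin, k=1)
```

Since both models have a single parameter, ranking them by AIC here is equivalent to ranking them by maximized log-likelihood; the penalty term matters only when the candidate models differ in the number of parameters.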
First, we recall two hydrologic datasets analyzed by Muralidharan and Khabia [10] related to monthly rainfall (in mm) of surface runoff in Andhra Pradesh. Datasets 1 and 2 contain zero values. Table 4 summarizes the obtained results. From the p-value of the KS test, we observe that all three distributions are candidates for fitting the data at a significance level of 5%. However, the proposed distribution provided a better fit for both datasets since it had the minimum AIC values.
Now, we consider two datasets related to the failure time of electronic components (Dataset 3) and mechanical components (Dataset 4) of an agricultural machine. Such a machine harvests more than 10 tons of plants per hour, and the study of its lifetime components is of main interest.
Table 5 summarizes the results related to Datasets 3 and 4. Using the p-value of the KS test, we observe that all three distributions are candidates for fitting the data at a significance level of 5%. The proposed distribution provided a better fit for both datasets since it had the minimum AIC values.
8. Discussion and Extensions
We introduced a simple one-parameter distribution that can be used to describe lifetime data in the presence of instantaneous failures. The mathematical properties of the proposed distribution were presented, such as the mean, variance, coefficient of variation, skewness, kurtosis, and the rth moment. The survival, hazard, and mean residual life functions were also presented for the proposed model. The parameter estimation was discussed under the methods of moments and maximum likelihood. The asymptotic confidence intervals were presented for both estimation methods. The results of the application section showed that our distribution outperformed the exponential and Lindley distributions on the four datasets considered.
The simple structure of the proposed model provides many possible extensions of the current work. For instance, Sankaran [11] introduced the discrete Poisson–Lindley distribution by compounding the Poisson distribution with the Lindley one. Here, we can consider the same approach by compounding the Poisson distribution with the new one. For instance, let the parameter
of the Poisson distribution have a distribution function
in which:
Then, the compound distribution is given by:
After some algebra, we have a new discrete compound distribution given by:
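As a point of comparison for the compounding step, the sketch below reproduces the known Poisson–Lindley case of Sankaran [11] and not the new compound distribution: the Poisson probability is integrated against the Lindley density and compared with the closed-form mass function θ²(x + θ + 2)/(θ + 1)^(x+3). The parameter name `theta`, its value, and the integration settings are illustrative assumptions.

```python
import math

def lindley_pdf(lam, theta):
    """Lindley density used as the mixing distribution of the Poisson mean."""
    return theta ** 2 / (1.0 + theta) * (1.0 + lam) * math.exp(-theta * lam)

def poisson_pmf(x, lam):
    """Poisson probability of the count x for mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def compound_pmf_numeric(x, theta, upper=80.0, steps=80_000):
    """Integrate Poisson(x | lam) * Lindley(lam) over lam (trapezoid rule)."""
    h = upper / steps
    total = 0.5 * (poisson_pmf(x, 0.0) * lindley_pdf(0.0, theta)
                   + poisson_pmf(x, upper) * lindley_pdf(upper, theta))
    for i in range(1, steps):
        lam = i * h
        total += poisson_pmf(x, lam) * lindley_pdf(lam, theta)
    return total * h

def poisson_lindley_pmf(x, theta):
    """Closed-form Poisson-Lindley probability mass function."""
    return theta ** 2 * (x + theta + 2.0) / (theta + 1.0) ** (x + 3)

theta = 1.5                                  # illustrative value only
counts = [0, 1, 2, 5]
max_gap = max(abs(compound_pmf_numeric(x, theta) - poisson_lindley_pmf(x, theta))
              for x in counts)
total_mass = sum(poisson_lindley_pmf(x, theta) for x in range(200))
```

The same numerical integration could serve as a check on any closed-form compound mass function obtained by replacing the Lindley mixing density with another one.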
Due to the simple structure of the CDF of our proposed model, many extensions could be introduced by including extra parameters using the established G-family of distributions. For instance, considering Lehmann Type 1 and Type 2 alternatives, we have the CDFs given by:
where
,
, and
. Tahir and Cordeiro [4] discussed many G-family distributions that can be used to extend the proposed model. Our approach should be investigated further in these contexts.
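For reference, the Lehmann Type 1 and Type 2 alternatives are usually written as below. This is a sketch in assumed notation: $F$ denotes the CDF of the baseline model, and the extra parameters $\alpha$ and $\beta$ are symbols chosen here for illustration, which may differ from those in the displayed equations above.

```latex
G_1(t) = \left[ F(t) \right]^{\alpha},
\qquad
G_2(t) = 1 - \left[ 1 - F(t) \right]^{\beta},
\qquad
\alpha > 0, \quad \beta > 0 .
```

Setting $\alpha = 1$ or $\beta = 1$ recovers the baseline CDF, so each family nests the original one-parameter model.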
Author Contributions
All authors contributed significantly to the study and preparation of the article.
Funding
This research was funded by FAPESP Grant Proc. 2017/25971-0 and CNPq Grant Proc. 301976/2017-1.
Acknowledgments
The authors are very grateful to the Editor and the three reviewers for their helpful and useful comments, which improved the manuscript.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.
References
1. Lindley, D.V. Fiducial distributions and Bayes’ theorem. J. R. Stat. Soc. Ser. B 1958, 20, 102–107.
2. Ghitany, M.; Atieh, B.; Nadarajah, S. Lindley distribution and its application. Math. Comput. Simul. 2008, 78, 493–506.
3. Zakerzadeh, H.; Dolati, A. Generalized Lindley Distribution. J. Math. Ext. 2009, 3, 1–17.
4. Tahir, M.H.; Cordeiro, G.M. Compounding of distributions: A survey and new generalized classes. J. Stat. Distrib. Appl. 2016, 3, 13.
5. Rao, C.R. Weighted distributions arising out of methods of ascertainment: What population does a sample represent. In A Celebration of Statistics; Springer: Berlin, Germany, 1985; pp. 543–569.
6. Bartoszewicz, J. On a representation of weighted distributions. Stat. Probab. Lett. 2009, 79, 1690–1694.
7. Bryson, M.C.; Siddiqui, M. Some criteria for aging. J. Am. Stat. Assoc. 1969, 64, 1472–1483.
8. Massey, F.J., Jr. The Kolmogorov-Smirnov test for goodness of fit. J. Am. Stat. Assoc. 1951, 46, 68–78.
9. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
10. Muralidharan, K.; Khabia, A. Some statistical inferences on Inlier(s) models. Int. J. Syst. Assur. Eng. Manag. 2017, 8, 18–25.
11. Sankaran, M. 275. Note: The discrete Poisson-Lindley distribution. Biometrics 1970, 26, 145–149.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).