A Distribution for Instantaneous Failures

Abstract: A new one-parameter distribution is proposed in this paper. The new distribution allows for the occurrence of instantaneous failures (inliers), which arise naturally in many areas. Closed-form expressions are obtained for the moments, mean, variance, coefficient of variation, skewness, kurtosis, and mean residual life. The relationship of the new distribution to the exponential and Lindley distributions is presented. The new distribution can be viewed as a combination of a reparametrized version of the Zakerzadeh and Dolati distribution with a particular case of the gamma model and the occurrence of zero value. Parameter estimation is discussed under the method of moments and maximum likelihood. A simulation study is performed to assess the efficiency of both estimation methods by computing the bias, mean squared errors, and coverage probabilities. The proposed distribution is compared with some competing distributions by analyzing four real lifetime datasets.


Introduction
The exponential and Lindley distributions [1] play an essential role in distribution theory as baselines of many generalizations. Although the Lindley distribution can outperform the exponential distribution in some cases [2], neither distribution is adequate for many problems, which leaves room for improvement.
Let T be a non-negative random variable with the probability density function (PDF) given by:
$$f(t; \lambda) = \frac{\lambda^{2} - 2\lambda + t}{\lambda^{2}(\lambda - 1)}\, e^{-t/\lambda}, \quad t \geq 0, \tag{1}$$
where λ ≥ 2 is the shape parameter. The cumulative distribution function (CDF) related to the proposed distribution is:
$$F(t; \lambda) = 1 - \frac{\lambda^{2} - \lambda + t}{\lambda(\lambda - 1)}\, e^{-t/\lambda}.$$
The proposed distribution can be viewed as a combination of a reparametrized version of the Zakerzadeh and Dolati [3] distribution with a particular case of the gamma model and the occurrence of zero value. This paper aims to provide some properties of the proposed distribution (1). The remainder of this study is set out as follows. Section 2 discusses the genesis of the proposed model and its relationship with the exponential and Lindley distributions. Section 3 gives the rth moments, mean, variance, coefficient of variation, skewness, and kurtosis. Section 4 shows the relationship of the proposed distribution with the Lindley and exponential distributions. Section 5 presents the estimators of λ based on the maximum likelihood estimator (MLE) and the moment estimator (ME). Section 6 presents a simulation study to compare the MLE and ME performance. Section 7 illustrates the relevance of our proposed methodology using four real datasets. Section 8 summarizes the present study.
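As a quick numerical companion to the density and distribution function, the following Python sketch evaluates both. It assumes the closed forms f(t; λ) = (λ² − 2λ + t)e^{−t/λ}/(λ²(λ − 1)) and F(t; λ) = 1 − (λ² − λ + t)e^{−t/λ}/(λ(λ − 1)), which follow from the two-component mixture representation given in the simulation section. The function names `new_pdf` and `new_cdf` are illustrative only.

```python
import math


def new_pdf(t, lam):
    """Assumed density: (lam^2 - 2*lam + t) * exp(-t/lam) / (lam^2 * (lam - 1))."""
    if lam < 2 or t < 0:
        raise ValueError("requires lam >= 2 and t >= 0")
    return (lam * lam - 2 * lam + t) * math.exp(-t / lam) / (lam * lam * (lam - 1))


def new_cdf(t, lam):
    """Assumed CDF: 1 - (lam^2 - lam + t) * exp(-t/lam) / (lam * (lam - 1))."""
    if lam < 2 or t < 0:
        raise ValueError("requires lam >= 2 and t >= 0")
    return 1.0 - (lam * lam - lam + t) * math.exp(-t / lam) / (lam * (lam - 1))
```

Under these assumptions, `new_pdf(0.0, lam)` is strictly positive whenever λ > 2, which is the feature that accommodates observations at zero, while `new_pdf(0.0, 2.0)` is zero.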

Genesis
The exponential distribution has played a crucial role in statistical theory, either to describe real situations or as a basis for more flexible models. Its PDF is given by:
$$f_E(t; \lambda) = \frac{1}{\lambda}\, e^{-t/\lambda}, \quad t \geq 0. \tag{2}$$
Most current generalized models have (2) as a baseline (see Tahir and Cordeiro [4] for a recent discussion of generalized models). Rao [5] discussed another concept, that of weighted distributions, to describe sampling situations.
Let f(x) be a baseline distribution of the random variable X and w(x): ℝ → ℝ⁺ be the weight function (see Bartoszewicz [6] for a detailed discussion), where 0 < E[w(X)] < ∞ and:
$$E[w(X)] = \int_{-\infty}^{\infty} w(x) f(x)\, dx.$$
Then, the weighted distribution of X has the PDF given by:
$$f_w(x) = \frac{w(x) f(x)}{E[w(X)]}. \tag{3}$$
If $f_w(x)$ can be rewritten as (3), then $f_w(x)$ is a weighted distribution of f(x). Besides the exponential distribution, another important baseline model is the Lindley distribution, which has the PDF given by:
$$f_L(t; \lambda) = \frac{1 + t}{\lambda(1 + \lambda)}\, e^{-t/\lambda}, \quad t \geq 0.$$
Although this distribution was proposed in the context of fiducial distributions, it can easily be noted that:
$$f_L(t; \lambda) = w_L(t; \lambda)\, f_E(t; \lambda), \quad w_L(t; \lambda) = \frac{1 + t}{1 + \lambda}.$$
Therefore, the Lindley distribution is a weighted model. Following the same idea, we considered that:
$$f(t; \lambda) = w_N(t; \lambda)\, f_E(t; \lambda), \quad w_N(t; \lambda) = \frac{\lambda^{2} - 2\lambda + t}{\lambda(\lambda - 1)}.$$
Note that the three distributions are closely related and differ only by the weight functions $w_L(t; \lambda)$ and $w_N(t; \lambda)$. The new distribution tends to behave more like the Lindley distribution for small values of λ, while as λ → ∞, $w_N(t; \lambda)$ → 1 and it tends to recover the shape of the exponential distribution.
In Figure 1, we can observe the influence of the weight functions $w_L(t; \lambda)$ and $w_N(t; \lambda)$ on the different distributions. As pointed out above, for small values of λ, the proposed distribution behaves more like the Lindley distribution. On the other hand, as λ increases, it behaves more like the exponential distribution. However, the tail of the proposed distribution decays more slowly than that of the exponential distribution, which allows us to fit data with extreme values. Therefore, our distribution is more flexible in describing different types of lifetime data.
Recently, Zakerzadeh and Dolati [3] introduced a three-parameter generalized Lindley distribution with the PDF given by:
$$f(x; \alpha, \theta, \gamma) = \frac{\theta^{2} (\theta x)^{\alpha - 1} (\alpha + \gamma x)\, e^{-\theta x}}{(\gamma + \theta)\, \Gamma(\alpha + 1)}, \tag{4}$$
where x > 0 and γ, θ, α > 0. Note that, by considering $\gamma = \lambda^{-1}(\lambda - 2)^{-1}$ where λ > 2, $\theta = \lambda^{-1}$ and α = 1, we have the PDF (1). On the other hand, to achieve our distribution, one needs to combine the reparametrized model with $f(x) = \tfrac{1}{4} x e^{-x/2}$ when λ = 2 and then extend the obtained model for x ≥ 0.
It is worth mentioning that Zakerzadeh and Dolati's model [3] is not defined at zero, whereas allowing for zero values is one of our main goals.
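The reparametrization can be checked numerically. The sketch below assumes the three-parameter density f(x; α, θ, γ) = θ²(θx)^{α−1}(α + γx)e^{−θx}/((γ + θ)Γ(α + 1)) for the Zakerzadeh and Dolati model and the density f(t; λ) = (λ² − 2λ + t)e^{−t/λ}/(λ²(λ − 1)) for the new model; both forms, the weight w_N(t; λ) = (λ² − 2λ + t)/(λ(λ − 1)), and all function names are assumptions made for illustration.

```python
import math


def zd_pdf(x, alpha, theta, gamma):
    """Three-parameter generalized Lindley density (form assumed here)."""
    return (theta ** 2 * (theta * x) ** (alpha - 1) * (alpha + gamma * x)
            * math.exp(-theta * x) / ((gamma + theta) * math.gamma(alpha + 1)))


def new_pdf(t, lam):
    """Assumed density of the new one-parameter model."""
    return (lam * lam - 2 * lam + t) * math.exp(-t / lam) / (lam * lam * (lam - 1))


def weight_new(t, lam):
    """Assumed normalized weight w_N(t; lam) applied to the exponential density."""
    return (lam * lam - 2 * lam + t) / (lam * (lam - 1))


def max_gap(lam, grid):
    """Largest absolute difference between the two densities on a grid (lam > 2)."""
    gamma = 1.0 / (lam * (lam - 2.0))
    return max(abs(zd_pdf(x, 1.0, 1.0 / lam, gamma) - new_pdf(x, lam)) for x in grid)
```

A value of `max_gap` at rounding-error level for several λ > 2 supports the stated substitution γ = λ⁻¹(λ − 2)⁻¹, θ = λ⁻¹, α = 1, and `weight_new(t, lam)` approaching 1 for large `lam` reflects the exponential limit.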

Moments
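The rth moment for the proposed distribution is given by:
$$E[T^{r}] = \frac{r!\, \lambda^{r} (\lambda + r - 1)}{\lambda - 1}, \quad r = 1, 2, \ldots \tag{5}$$
Moreover, the rth central moment is given as follows:
$$M_r = E[(T - \mu)^{r}] = \sum_{k=0}^{r} \binom{r}{k} (-\mu)^{r-k} E[T^{k}].$$
In particular, it is simple to show that the mean and the variance are given by:
$$\mu = \frac{\lambda^{2}}{\lambda - 1} \quad \text{and} \quad \sigma^{2} = \frac{\lambda^{2}(\lambda^{2} - 2)}{(\lambda - 1)^{2}}. \tag{6}$$
From (5), the coefficient of variation, skewness, and kurtosis are given by:
$$CV = \frac{\sqrt{\lambda^{2} - 2}}{\lambda}, \quad \text{Skewness} = \frac{2(\lambda^{3} - 6\lambda + 6)}{(\lambda^{2} - 2)^{3/2}}, \quad \text{Kurtosis} = \frac{3(3\lambda^{4} - 28\lambda^{2} + 48\lambda - 24)}{(\lambda^{2} - 2)^{2}}.$$
Figure 2 shows some examples of the shapes of the coefficient of variation, skewness, and kurtosis according to different values of λ.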
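The moment-based quantities can be computed directly from the raw moments. The following sketch assumes the raw moment formula E[Tʳ] = r! λʳ(λ + r − 1)/(λ − 1), obtained from the exponential and gamma components of the mixture representation, and derives the remaining summaries from it; `raw_moment` and `summary` are hypothetical helper names.

```python
import math


def raw_moment(r, lam):
    """Assumed r-th raw moment: r! * lam^r * (lam + r - 1) / (lam - 1)."""
    return math.factorial(r) * lam ** r * (lam + r - 1) / (lam - 1)


def summary(lam):
    """Mean, variance, CV, skewness and kurtosis built from raw moments."""
    m1, m2, m3, m4 = (raw_moment(r, lam) for r in range(1, 5))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                      # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4   # fourth central moment
    return {
        "mean": m1,
        "variance": var,
        "cv": math.sqrt(var) / m1,
        "skewness": mu3 / var ** 1.5,
        "kurtosis": mu4 / var ** 2,
    }
```

At λ = 2 the model reduces to a gamma distribution with shape 2 and scale 2, so `summary(2.0)` should reproduce that distribution's mean 4, variance 8, skewness √2 and kurtosis 6, which is a convenient sanity check.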

Shapes
The PDF is unimodal for 2 ≤ λ < 3, with mode at t = λ(3 − λ), and decreasing for λ ≥ 3. The behavior of the PDF (1) when t = 0 and t → ∞ is, respectively, given by:
$$f(0; \lambda) = \frac{\lambda - 2}{\lambda(\lambda - 1)} \quad \text{and} \quad \lim_{t \to \infty} f(t; \lambda) = 0.$$
Therefore, the proposed distribution allows us to fit data with zero occurrences (inliers), which are common in many problems, especially in hydrology and reliability. In Figure 3, we illustrate the shapes of the PDF for different values of λ.
The survival function for the proposed distribution is given by:
$$S(t; \lambda) = \frac{\lambda^{2} - \lambda + t}{\lambda(\lambda - 1)}\, e^{-t/\lambda}.$$
The hazard function plays an important role in lifetime distributions, being one of the most important quantities to characterize the lifetime phenomenon. For the proposed distribution, its hazard function is given by:
$$h(t; \lambda) = \frac{f(t; \lambda)}{S(t; \lambda)} = \frac{\lambda^{2} - 2\lambda + t}{\lambda(\lambda^{2} - \lambda + t)}. \tag{7}$$
Its derivative with respect to t, $h'(t; \lambda) = (\lambda^{2} - \lambda + t)^{-2}$, is positive for all t ≥ 0 and λ ≥ 2, which implies that the hazard rate function (7) has an increasing shape. Figure 4 gives examples of the shapes of the hazard function for different values of λ. The behavior of the hazard function (7) when t = 0 and t → ∞ is, respectively, given by:
$$h(0; \lambda) = \frac{\lambda - 2}{\lambda(\lambda - 1)} \quad \text{and} \quad \lim_{t \to \infty} h(t; \lambda) = \frac{1}{\lambda}.$$
From Figure 4, we can see that h(t; λ) ↑ 1/λ as t → ∞. Moreover, the ratio h(0; λ)/h(∞; λ) = (λ − 2)/(λ − 1) tends to 1 as λ → ∞, i.e., the hazard rate becomes approximately constant. Therefore, the new distribution converges to an exponential distribution when λ → ∞.
The mean residual life (MRL) has been widely used in survival analysis and represents the expected additional lifetime given that a component has survived until time t. The MRL function is computed by:
$$r(t; \lambda) = \frac{1}{S(t; \lambda)} \int_{t}^{\infty} S(y; \lambda)\, dy.$$
Solving the integral, we have:
$$\int_{t}^{\infty} S(y; \lambda)\, dy = \frac{\lambda^{2} + t}{\lambda - 1}\, e^{-t/\lambda},$$
which implies that:
$$r(t; \lambda) = \frac{\lambda(\lambda^{2} + t)}{\lambda^{2} - \lambda + t}.$$
The behavior of the MRL when t = 0 and t → ∞ is $r(0; \lambda) = \lambda^{2}(\lambda - 1)^{-1}$ and r(∞; λ) = λ. For a non-negative random variable, if h(t; λ) is increasing, then r(t; λ) is decreasing (see Bryson and Siddiqui [7]). Figure 5 presents some shapes of the MRL function for different values of λ.
Zakerzadeh and Dolati [3] presented the moment generating function for the three-parameter distribution, which, after some algebra and using the reparametrization discussed in Section 2, can be used to obtain the rth moment of our distribution. Moreover, they also studied the behavior of the hazard function of (4), which has an increasing shape when α ≥ 1, confirming that our hazard function is increasing. Hereafter, we discuss many properties of the proposed model. Due to the different form of our proposed distribution, as well as its simple structure, the results discussed in this paper differ from those presented by the authors above.
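The shape statements for the density and hazard can be explored with a few lines of Python. This is a minimal sketch assuming the hazard h(t; λ) = (λ² − 2λ + t)/(λ(λ² − λ + t)), obtained as the ratio of the assumed density and survival function, and the mode λ(3 − λ) for 2 ≤ λ < 3, obtained by setting the derivative of the assumed density to zero; the names `hazard` and `mode` are illustrative.

```python
def hazard(t, lam):
    """Assumed hazard rate: (lam^2 - 2*lam + t) / (lam * (lam^2 - lam + t))."""
    if lam < 2 or t < 0:
        raise ValueError("requires lam >= 2 and t >= 0")
    return (lam * lam - 2 * lam + t) / (lam * (lam * lam - lam + t))


def mode(lam):
    """Mode of the assumed density: lam*(3 - lam) if lam < 3, otherwise 0."""
    if lam < 2:
        raise ValueError("requires lam >= 2")
    return max(0.0, lam * (3.0 - lam))
```

Evaluating `hazard` on an increasing grid of `t` values shows it rising from (λ − 2)/(λ(λ − 1)) at t = 0 toward 1/λ, in line with the limits stated in the text.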
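To make the mean residual life concrete, the sketch below compares a closed form with a direct numerical integration of the survival function. The closed form r(t; λ) = λ(λ² + t)/(λ² − λ + t) and the survival function S(t; λ) = (λ² − λ + t)e^{−t/λ}/(λ(λ − 1)) are assumptions derived from the mixture representation of the model; the trapezoidal rule and the truncation point are arbitrary illustrative choices.

```python
import math


def survival(t, lam):
    """Assumed survival function of the new model."""
    return (lam * lam - lam + t) * math.exp(-t / lam) / (lam * (lam - 1))


def mrl(t, lam):
    """Assumed closed-form mean residual life."""
    return lam * (lam * lam + t) / (lam * lam - lam + t)


def mrl_numeric(t, lam, width=60.0, steps=20000):
    """Trapezoidal approximation of (1/S(t)) * integral of S over [t, t + width*lam]."""
    upper = t + width * lam            # the tail beyond this point is negligible
    h = (upper - t) / steps
    total = 0.5 * (survival(t, lam) + survival(upper, lam))
    for i in range(1, steps):
        total += survival(t + i * h, lam)
    return h * total / survival(t, lam)
```

Agreement between `mrl` and `mrl_numeric` at a few points, together with `mrl(0, lam)` equal to λ²/(λ − 1), gives a simple consistency check on the expressions.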

Inference
In this section, we present the maximum likelihood and the moment estimators for the λ parameter of the proposed distribution.
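The method of moments is one of the simplest estimation procedures, which for a one-parameter distribution can be obtained by equating the first theoretical moment with the sample mean, i.e., $\lambda^{2}(\lambda - 1)^{-1} = \bar{t}$, where $\bar{t} = \sum_{i=1}^{n} t_i / n$. After some algebraic manipulation, the solution has a closed-form expression and, taking the root that satisfies λ ≥ 2, is given by:
$$\hat{\lambda}_{ME} = \frac{\bar{t} + \sqrt{\bar{t}(\bar{t} - 4)}}{2}.$$
We can construct asymptotic confidence intervals for λ by using the fact that $\hat{\lambda}_{ME} \sim N(\lambda, \sigma^{2}_{\hat{\lambda}})$ as n → ∞, where the standard deviation $\sigma_{\hat{\lambda}}$ can be obtained by the delta method. The expression is obtained by $\sigma^{2}_{\hat{\lambda}} \approx [g'(\mu)]^{2} \sigma^{2} / n$, where $g(\mu) = \tfrac{1}{2}\left(\mu + \sqrt{\mu(\mu - 4)}\right)$, and µ and σ² are given in (6).
After some algebraic manipulations, we have $\sigma^{2}_{\hat{\lambda}} \approx 4/n$ if λ = 2 or, for λ > 2:
$$\sigma^{2}_{\hat{\lambda}} \approx \frac{(\lambda - 1)^{2}(\lambda^{2} - 2)}{n(\lambda - 2)^{2}}.$$
For the maximum likelihood estimation, let $T_1, \ldots, T_n$ be a random sample such that T has the PDF given in (1). Then the likelihood function is given by:
$$L(\lambda; t) = \frac{1}{[\lambda^{2}(\lambda - 1)]^{n}} \left\{ \prod_{i=1}^{n} (\lambda^{2} - 2\lambda + t_i) \right\} \exp\left( -\frac{1}{\lambda} \sum_{i=1}^{n} t_i \right).$$
The log-likelihood function l(λ; t) = log L(λ; t) is given by:
$$l(\lambda; t) = -2n \log \lambda - n \log(\lambda - 1) + \sum_{i=1}^{n} \log(\lambda^{2} - 2\lambda + t_i) - \frac{1}{\lambda} \sum_{i=1}^{n} t_i,$$
and the score function is given by:
$$U(\lambda; t) = -\frac{2n}{\lambda} - \frac{n}{\lambda - 1} + \sum_{i=1}^{n} \frac{2(\lambda - 1)}{\lambda^{2} - 2\lambda + t_i} + \frac{1}{\lambda^{2}} \sum_{i=1}^{n} t_i.$$
Solving U(λ; t) = 0, we obtain $\hat{\lambda}_{MLE}$, i.e., the MLE of λ.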
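Both estimators are straightforward to compute. The following Python sketch is one possible implementation under stated assumptions: the log-likelihood is taken as −2n log λ − n log(λ − 1) + Σ log(λ² − 2λ + tᵢ) − Σ tᵢ/λ; the moment estimator is set to the boundary value 2 when the sample mean is below 4 (a convention chosen here, not taken from the text); and the likelihood is maximized by golden-section search instead of solving the score equation, which implicitly assumes a single maximum on the search interval. The upper search bound is a heuristic, and all function names are hypothetical.

```python
import math


def moment_estimate(data):
    """Larger root of lam^2 - tbar*lam + tbar = 0; boundary value 2 if tbar < 4."""
    tbar = sum(data) / len(data)
    if tbar < 4.0:
        return 2.0  # assumption: project onto the parameter space lam >= 2
    return 0.5 * (tbar + math.sqrt(tbar * (tbar - 4.0)))


def log_likelihood(lam, data):
    """Assumed log-likelihood; returns -inf where the density vanishes."""
    n = len(data)
    total = -2 * n * math.log(lam) - n * math.log(lam - 1) - sum(data) / lam
    for t in data:
        arg = lam * lam - 2 * lam + t
        if arg <= 0:
            return -math.inf
        total += math.log(arg)
    return total


def mle(data, tol=1e-8):
    """Golden-section maximization of the log-likelihood over [2, hi]."""
    a, b = 2.0, max(3.0, 2.0 * sum(data) / len(data))  # heuristic upper bound
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = log_likelihood(c, data), log_likelihood(d, data)
    while b - a > tol:
        if fc > fd:          # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = log_likelihood(c, data)
        else:                # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = log_likelihood(d, data)
    return 0.5 * (a + b)
```

A bounded search is used because the likelihood is zero at λ = 2 whenever the sample contains a zero observation, so the maximizer then lies strictly inside the parameter space.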

Remark 1.
If λ = 2, then U(λ; t) is positive for all $t_i > 0$, i = 1, . . . , n; on the other hand, as λ increases, U(λ; t) becomes negative for all $t_i \geq 0$, i = 1, . . . , n. This implies that there is at least one solution for $\hat{\lambda}_{MLE}$. However, proving that the solution is unique is a hard task since it is difficult to study the behavior of the derivative of U(λ; t).
Under mild conditions, the maximum likelihood estimator is asymptotically normally distributed, with $\hat{\lambda}_{MLE} \sim N(\lambda, I^{-1}(\lambda))$ as n → ∞, where $I(\lambda) = -E[\partial U(\lambda; T) / \partial \lambda]$ is the Fisher information, whose closed form involves the exponential integral function
$$\mathrm{Ei}(z) = -\int_{z}^{\infty} \frac{e^{-t}}{t}\, dt, \quad z \geq 0.$$
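For readers who wish to see where the exponential integral enters, the following is a sketch of the Fisher information calculation for a sample of size n. It assumes the density f(t; λ) = (λ² − 2λ + t)e^{−t/λ}/(λ²(λ − 1)) and is written with the standard function E₁(z) = ∫_z^∞ e^{−t}/t dt to avoid ambiguity in sign conventions for the exponential integral; it is a derivation under these assumptions, not necessarily the expression in the form the authors report.

```latex
\begin{align*}
\frac{\partial^{2}}{\partial\lambda^{2}} \log f(t;\lambda)
  &= \frac{2}{\lambda^{2}} + \frac{1}{(\lambda-1)^{2}}
     + \frac{2}{\lambda^{2}-2\lambda+t}
     - \frac{4(\lambda-1)^{2}}{(\lambda^{2}-2\lambda+t)^{2}}
     - \frac{2t}{\lambda^{3}}, \\
E\!\left[\frac{1}{\lambda^{2}-2\lambda+T}\right]
  &= \frac{1}{\lambda(\lambda-1)}, \qquad
E\!\left[\frac{1}{(\lambda^{2}-2\lambda+T)^{2}}\right]
   = \frac{e^{\lambda-2}\,E_{1}(\lambda-2)}{\lambda^{2}(\lambda-1)}, \qquad
E[T] = \frac{\lambda^{2}}{\lambda-1}, \\
I(\lambda)
  &= -n\,E\!\left[\frac{\partial^{2}}{\partial\lambda^{2}} \log f(T;\lambda)\right]
   = n\left\{ \frac{4(\lambda-1)\,e^{\lambda-2}\,E_{1}(\lambda-2)}{\lambda^{2}}
     - \frac{2}{\lambda^{2}} - \frac{1}{(\lambda-1)^{2}} \right\},
     \qquad \lambda > 2 .
\end{align*}
```

As a sanity check, the large-argument expansion e^{z}E₁(z) ≈ 1/z − 1/z² + 2/z³ gives I(λ) ≈ n/λ² for large λ, which is the Fisher information of an exponential distribution with mean λ.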

Simulation Analysis
In this section, a simulation study is presented to compare the efficiency of the maximum likelihood method with the method of moments. The simulation study is performed over N = 5,000,000 samples generated from the new distribution with λ = 2.5, 3, 4, 6, 8 and n = 10, 20, 50, 100, 200, 500. This comparison is performed by computing the bias and the mean squared error (MSE), given by:
$$\text{Bias}(\hat{\lambda}) = \frac{1}{N} \sum_{i=1}^{N} (\hat{\lambda}_i - \lambda) \quad \text{and} \quad \text{MSE}(\hat{\lambda}) = \frac{1}{N} \sum_{i=1}^{N} (\hat{\lambda}_i - \lambda)^{2},$$
where N is the number of samples and $\hat{\lambda}_i$ is the estimate obtained from the ith sample, as well as the coverage probabilities with a 95% confidence level. Note that the new distribution can be expressed as a two-component mixture:
$$f(t; \lambda) = p\, f_1(t; \lambda) + (1 - p)\, f_2(t; \lambda),$$
where 1 − p = 1/(λ − 1) (or p = (λ − 2)/(λ − 1)) and $f_j(t; \lambda) = \lambda^{-j} t^{j-1} e^{-t/\lambda}$ for j = 1, 2, i.e., $f_1$ is an exponential density with mean λ and $f_2$ is a gamma density with shape 2 and scale λ. Therefore, the dataset is generated as follows:
1. Generate $U_i$ ∼ Uniform(0, 1), $X_i$ ∼ Exp(λ), and $Y_i$ ∼ Gamma(2, λ), i = 1, . . . , n;
2. If $U_i \leq p$, set $T_i = X_i$; otherwise, set $T_i = Y_i$, i = 1, . . . , n.
By considering this approach, it is expected that the best estimation method will return an average bias and MSE close to zero. For the coverage probability, the frequency of intervals that cover the true value of λ under a confidence level of 95% should be close to 0.95.
The results are condensed in Tables 1-3, which present the average bias, the average MSE, and the coverage probability with a 95% confidence level of the estimates obtained from the maximum likelihood and the moment estimation approaches for different sample sizes.
Table 3. The coverage probability with N = 5,000,000 simulated samples.
From the obtained results, we can conclude that as n increases, both the bias and the MSE tend to zero and the coverage probability of the confidence intervals tends to the nominal value of 0.95. However, the MLE returned better results, especially for small values of λ, since its bias and MSE were closer to zero than those of the ME. It is important to point out that similar results and the same conclusions were obtained considering different choices of λ. Therefore, the MLE should be used as the estimator for the parameter of the new proposed distribution.
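The mixture-based generation scheme and the bias and MSE summaries can be sketched in Python as follows. The sketch uses only the standard library and, to stay short, evaluates the moment estimator only; it assumes Exp(λ) denotes an exponential with mean λ and Gamma(2, λ) a gamma with shape 2 and scale λ, consistent with the component densities f₁ and f₂. Returning the boundary value 2 when the sample mean falls below 4 is a convention adopted here, and the function names are illustrative.

```python
import math
import random


def sample(n, lam, rng):
    """Draw n values from the two-component mixture with p = (lam-2)/(lam-1)."""
    p = (lam - 2.0) / (lam - 1.0)
    out = []
    for _ in range(n):
        if rng.random() < p:
            out.append(rng.expovariate(1.0 / lam))    # exponential, mean lam
        else:
            out.append(rng.gammavariate(2.0, lam))    # gamma, shape 2, scale lam
    return out


def moment_estimate(data):
    """Closed-form moment estimator; boundary value 2 if the mean is below 4."""
    tbar = sum(data) / len(data)
    if tbar < 4.0:
        return 2.0
    return 0.5 * (tbar + math.sqrt(tbar * (tbar - 4.0)))


def bias_and_mse(lam, n, reps, seed=0):
    """Monte Carlo bias and MSE of the moment estimator over `reps` samples."""
    rng = random.Random(seed)
    errors = [moment_estimate(sample(n, lam, rng)) - lam for _ in range(reps)]
    bias = sum(errors) / reps
    mse = sum(e * e for e in errors) / reps
    return bias, mse
```

A call such as `bias_and_mse(4.0, 100, 500)` runs in well under a second; the study described in the text uses far more replications (N = 5,000,000) and also evaluates the MLE and the coverage probabilities.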

Application
In this section, we fit four datasets using the exponential, the Lindley, and the new distribution. The maximum likelihood estimates of the parameters were computed, as well as the respective standard errors (SE). The Kolmogorov-Smirnov (KS) statistic (see Massey Jr. [8] for more details) is presented to check the goodness of fit. The Akaike [9] information criterion (AIC), computed as $-2\, l(\hat{\lambda}; t) + 2k$, is used as the discrimination criterion, where $\hat{\lambda}$ is the MLE of λ and k is the number of parameters in the model. The best distribution is the one that provides the minimum AIC.
Firstly, we recall two hydrologic datasets analyzed by Muralidharan and Khabia [10] related to monthly rainfall (in mm) of surface runoff in Andhra Pradesh. Datasets 1 and 2 include zero values. Table 4 summarizes the obtained results. From the p-value of the KS test, we observe that all three distributions are candidates for fitting the data under a significance level of 5%. However, the proposed distribution provided a better fit for both datasets since it had the minimum AIC values. Now, we consider two datasets related to the failure times of electronic components (Dataset 3) and mechanical components (Dataset 4) of an agricultural machine. Such a machine harvests more than 10 tons of plants per hour, and the study of the lifetime of its components is of main interest. Table 5 summarizes the results related to Datasets 3 and 4. Using the p-value of the KS test, we observe that all three distributions are candidates for fitting the data under a significance level of 5%. The proposed distribution again provided a better fit for both datasets since it had the minimum AIC values.
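The two model-comparison quantities are simple to compute once a fitted log-likelihood and CDF are available. The sketch below is a generic illustration: `aic` implements −2l + 2k, and `ks_statistic` computes the one-sample Kolmogorov-Smirnov distance between the empirical distribution function and a user-supplied CDF. Both helper names are hypothetical, and no p-value is computed.

```python
def aic(loglik, k):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * loglik + 2.0 * k


def ks_statistic(data, cdf):
    """One-sample KS distance: sup over t of |F_n(t) - F(t)| for a given CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # compare F with the empirical CDF just after and just before x
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d
```

For instance, `ks_statistic(sample, lambda t: fitted_cdf(t, lam_hat))` would give the distance for a fitted model, where `fitted_cdf` and `lam_hat` stand for whichever CDF and estimate are being assessed.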

Discussion And Extensions
We introduced a simple one-parameter distribution that can be used to describe lifetime data in the presence of instantaneous failures. The mathematical properties of the proposed distribution were presented, such as the mean, variance, coefficient of variation, skewness, kurtosis, and the rth moment. The survival, hazard, and mean residual life functions were also presented for the proposed model. The parameter estimation was discussed under the method of moments and maximum likelihood. Asymptotic confidence intervals were presented for both estimation methods. The results of the application section showed that our distribution outperformed the common one-parameter distributions considered in all four datasets.
The simple structure of the proposed model provides many possible extensions of the current work. For instance, Sankaran [11] introduced the discrete Poisson-Lindley distribution by compounding the Poisson distribution with the Lindley one. Here, we can consider the same approach by compounding the Poisson distribution with the new one. For instance, let the parameter θ of the Poisson distribution have a distribution function F(θ) in which:
$$dF(\theta) = \frac{\lambda^{2} - 2\lambda + \theta}{\lambda^{2}(\lambda - 1)}\, e^{-\theta/\lambda}\, d\theta, \quad \theta \geq 0.$$
Then, the compound distribution is given by:
$$P(X = x) = \int_{0}^{\infty} \frac{e^{-\theta} \theta^{x}}{x!}\, dF(\theta), \quad x = 0, 1, 2, \ldots$$
After some algebra, we have a new discrete compound distribution given by:
$$P(X = x) = \frac{\lambda^{x} (\lambda^{2} - \lambda - 1 + x)}{(\lambda - 1)(\lambda + 1)^{x+2}}, \quad x = 0, 1, 2, \ldots$$
Due to the simple structure of the CDF of our proposed model, many extensions could be introduced by including extra parameters using the established G-family of distributions. For instance, considering Lehmann Type 1 and Type 2 alternatives, we have the CDFs given by:
$$F_1(x; \lambda, \alpha) = \left[ 1 - \frac{\lambda^{2} - \lambda + x}{\lambda(\lambda - 1)}\, e^{-x/\lambda} \right]^{\alpha} \quad \text{and} \quad F_2(x; \lambda, \alpha) = 1 - \left[ \frac{\lambda^{2} - \lambda + x}{\lambda(\lambda - 1)}\, e^{-x/\lambda} \right]^{\alpha},$$
where x ≥ 0, λ ≥ 2, and α > 0. Tahir and Cordeiro [4] discussed many G-family distributions that can be used to extend the proposed model. Our approach should be investigated further in these contexts.
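The compound Poisson construction can be verified numerically. The sketch below assumes that integrating the Poisson probability against the density f(θ; λ) = (λ² − 2λ + θ)e^{−θ/λ}/(λ²(λ − 1)) yields P(X = x) = λˣ(λ² − λ − 1 + x)/((λ − 1)(λ + 1)^{x+2}); this closed form is a derivation made here under that assumed density, and `compound_pmf` is a hypothetical name.

```python
def compound_pmf(x, lam):
    """Assumed PMF of the Poisson compound: q^x (lam^2 - lam - 1 + x) / ((lam+1)^2 (lam-1))."""
    if lam < 2 or x < 0:
        raise ValueError("requires lam >= 2 and x >= 0")
    q = lam / (lam + 1.0)  # geometric-type ratio; avoids overflow of lam**x
    return q ** x * (lam * lam - lam - 1.0 + x) / ((lam + 1.0) ** 2 * (lam - 1.0))


def truncated_sum(lam, upper=2000):
    """Total probability and mean over x = 0, ..., upper - 1."""
    probs = [compound_pmf(x, lam) for x in range(upper)]
    return sum(probs), sum(x * p for x, p in enumerate(probs))
```

Two checks are natural here: the probabilities should sum to one, and, by the tower property, the mean of the compound distribution should equal the mean of the mixing distribution, λ²/(λ − 1).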
Author Contributions: All authors contributed significantly to the study and preparation of the article.