Generalising Exponential Distributions Using an Extended Marshall–Olkin Procedure

Abstract: This paper presents a three-parameter family of distributions which includes the common exponential and the Marshall–Olkin exponential as special cases. This distribution exhibits a monotone failure rate function, which makes it appealing for practitioners interested in reliability, and means it can be included in the catalogue of appropriate non-symmetric distributions to model these issues, such as the gamma and Weibull three-parameter families. Given the lack of symmetry of this kind of distribution, various statistical and reliability properties of this model are examined. Numerical examples based on real data reflect the suitable behaviour of this distribution for modelling purposes.


Introduction
Several recent attempts have been made to extend the exponential distribution in order to increase its versatility for modelling purposes. Among others, the two-parameter exponentiated exponential distribution ([1-3] and references therein) and the three-parameter generalised exponential distribution [4] have been presented as feasible alternatives to the gamma, Weibull and lognormal distributions, although both standard and extended distributions are known to present drawbacks. The latter have been used to analyse lifetime data that present a monotonic (increasing or decreasing) hazard rate function (also known as the failure rate function). These distributions are popular among researchers interested in areas such as reliability engineering and software reliability [5,6]. Thorough reviews of the exponential distribution and its applications may be found in a recent book by Balakrishnan (2019) [7]. References on extensions of the exponential distribution include Johnson et al. (2019) [8] and references therein.
Inspired by the seminal paper of Marshall and Olkin (1997) [18], in this paper we introduce a family of three-parameter univariate distributions that present both decreasing and increasing hazard rates and include the exponential distribution as a particular case.
Let $X_n$ be a random sequence given by

$$X_n = \begin{cases} \varepsilon^1_n & \text{with probability } \delta, \\ \min\left(X_{n-1}, \varepsilon^2_n\right) & \text{with probability } 1-\delta, \end{cases} \qquad (1)$$

where $\varepsilon^1_n$, $\varepsilon^2_n$ are sequences of i.i.d. exponential random variables with parameters $\lambda_1$ and $\lambda_2$, respectively, and $0 < \delta < 1$.
Let $S_n(x)$ be the survival function of $X_n$; i.e., $S_n(x) = \Pr(X_n > x)$. Then,

$$S_n(x) = \delta \Pr(\varepsilon^1_n > x) + (1-\delta)\, S_{n-1}(x) \Pr(\varepsilon^2_n > x).$$

Assuming stability for $S_n(x) = S(x|\lambda_1, \lambda_2, \delta)$,

$$S(x|\lambda_1, \lambda_2, \delta) = \frac{\delta \Pr(\varepsilon^1_n > x)}{1 - (1-\delta)\Pr(\varepsilon^2_n > x)}. \qquad (2)$$

Observe that (2) yields a mechanism to extend a distribution.
The rest of this paper is organised as follows: Section 2 presents and discusses general conditions for generalised Marshall-Olkin exponential (GMOE) distributions. Section 3 then shows some interesting properties of the GMOE distributions. First, we show that their hazard rate function is related to the constant hazard rate function of an exponential distribution with parameter $\lambda_1$ according to the value of $\delta$. Secondly, a closed expression for the moments is obtained, from which the mean, variance, skewness coefficient, etc., are easily derived. Finally, a brief study of the mode location is conducted in Section 3. Section 4 presents the expressions for model parameter estimation, and a simulation study is performed to determine the performance of the maximum likelihood estimators for certain sample sizes. Some real-world applications are presented in Section 5. Finally, Sections 6 and 7 present some extensions of the proposed methodology and the main conclusions drawn, respectively.
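The stability argument can be illustrated numerically. The following is a minimal sketch, assuming that the recursion takes the form "restart from an Exp(λ1) draw with probability δ, otherwise take the minimum of the previous value and an Exp(λ2) draw", so that the stable survival function is δe^{-λ1 x}/(1 - (1 - δ)e^{-λ2 x}). The function names `stationary_survival` and `simulate_chain` are hypothetical and introduced only for this illustration.

```python
import math
import random


def stationary_survival(x, lam1, lam2, delta):
    """Fixed point of S = delta*S1 + (1 - delta)*S*S2 for exponential S1, S2 (assumed form)."""
    return delta * math.exp(-lam1 * x) / (1.0 - (1.0 - delta) * math.exp(-lam2 * x))


def simulate_chain(n_steps, lam1, lam2, delta, rng, burn_in=100):
    """Run the assumed recursion and return the states visited after the burn-in."""
    x = rng.expovariate(lam1)  # arbitrary starting value
    states = []
    for step in range(n_steps):
        if rng.random() < delta:
            x = rng.expovariate(lam1)          # restart from eps^1_n
        else:
            x = min(x, rng.expovariate(lam2))  # min(X_{n-1}, eps^2_n)
        if step >= burn_in:
            states.append(x)
    return states


if __name__ == "__main__":
    rng = random.Random(0)
    states = simulate_chain(100_000, 1.0, 2.0, 0.5, rng)
    empirical = sum(1 for s in states if s > 0.5) / len(states)
    print(empirical, stationary_survival(0.5, 1.0, 2.0, 0.5))
```

The empirical proportion of states exceeding a threshold should be close to the stable survival function evaluated at that threshold; the states are serially dependent, so the agreement is only approximate for a finite run.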

The Generalised Marshall-Olkin Exponential Distribution
In this section, we introduce the three-parameter generalised Marshall-Olkin exponential (GMOE) distribution, using the mechanism described by (2).
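In order to reach stability in (2), some initial considerations are needed. Let us denote by $S_i(x)$ and $F_i(x)$ the respective survival and cumulative distribution functions (cdfs) of $\varepsilon^i_n$, $i = 1, 2$. For $x \ge 0$, we have

$$S(x|\lambda_1, \lambda_2, \delta) = \frac{\delta S_1(x)}{1 - (1-\delta) S_2(x)} = \frac{\delta e^{-\lambda_1 x}}{1 - (1-\delta) e^{-\lambda_2 x}}. \qquad (3)$$

Thus, its associated cdf is given by

$$G(x|\lambda_1, \lambda_2, \delta) = \frac{1 - (1-\delta) e^{-\lambda_2 x} - \delta e^{-\lambda_1 x}}{1 - (1-\delta) e^{-\lambda_2 x}}. \qquad (4)$$

Clearly, for any $\lambda_1, \lambda_2 > 0$ and $\delta > 0$, we have $G(0|\lambda_1, \lambda_2, \delta) = 0$ and $\lim_{x \to \infty} G(x|\lambda_1, \lambda_2, \delta) = 1$. Thus, for $G(x|\lambda_1, \lambda_2, \delta)$ to be a cdf, it is only required that it be non-decreasing, that is,

$$\lambda_1 \left[1 - (1-\delta) e^{-\lambda_2 x}\right] + (1-\delta) \lambda_2 e^{-\lambda_2 x} \ge 0 \quad \text{for all } x \ge 0. \qquad (5)$$

Expression (5) is non-negative for all $x$ when $0 < \delta < 1$. However, if we wish to extend this scheme to cases where $\delta > 1$, as in the Marshall-Olkin scheme, then it is required that

$$(\delta - 1) \lambda_2 \le \delta \lambda_1,$$

which is a constraint when $\delta > 1$ and is true for any $0 < \delta \le 1$.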
Remark 1. For $\delta = 1$, we obtain the exponential distribution with parameter $\lambda_1$. In other words, with the GMO scheme the original family, $F_1$, can be generalised by the insertion of an additional parameter $\delta$ and by the effect of an auxiliary distribution $F_2(\lambda_2)$. For several values of the parameters $\lambda_1$, $\lambda_2$ and $\delta$, the plots of the density function of GMOE distributions are shown in Figure 1.
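The definitions of this section can be sketched in code as follows. This is a minimal illustration, assuming the survival function δe^{-λ1 x}/(1 - (1 - δ)e^{-λ2 x}) and the parameter constraint (δ - 1)λ2 ≤ δλ1; the function names are hypothetical and do not come from the paper.

```python
import math


def gmoe_is_valid(lam1, lam2, delta):
    """Return True when the parameters give a non-negative density (assumed constraint)."""
    if lam1 <= 0 or lam2 <= 0 or delta <= 0:
        return False
    return (delta - 1.0) * lam2 <= delta * lam1


def gmoe_survival(x, lam1, lam2, delta):
    """Assumed GMOE survival function for x >= 0."""
    return delta * math.exp(-lam1 * x) / (1.0 - (1.0 - delta) * math.exp(-lam2 * x))


def gmoe_cdf(x, lam1, lam2, delta):
    """Cdf obtained as one minus the survival function."""
    return 1.0 - gmoe_survival(x, lam1, lam2, delta)


def gmoe_pdf(x, lam1, lam2, delta):
    """Density obtained by differentiating the cdf."""
    e2 = math.exp(-lam2 * x)
    denom = 1.0 - (1.0 - delta) * e2
    numer = lam1 * denom + (1.0 - delta) * lam2 * e2
    return delta * math.exp(-lam1 * x) * numer / denom ** 2


if __name__ == "__main__":
    print(gmoe_is_valid(1.0, 2.0, 0.5), gmoe_pdf(0.3, 1.0, 2.0, 0.5))
    # With delta = 1 the density reduces to that of an Exp(lam1) variable.
    print(gmoe_pdf(0.3, 2.0, 7.0, 1.0), 2.0 * math.exp(-0.6))
```

When the constraint is violated, for example with λ1 = 1, λ2 = 5 and δ = 3, the expression returned by `gmoe_pdf` is negative at the origin, which is why such parameter values must be excluded.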

Some Properties of the GMO-Exponential Distribution
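Some important properties of the GMOE$(\lambda_1, \lambda_2, \delta)$ distributions are shown in this section. From expression (3) and given $q \in (0, 1)$, we can obtain the $q$th quantile $x_q$ of the GMOE distribution as follows. Let $q = G(x_q|\lambda_1, \lambda_2, \delta)$, that is,

$$q = 1 - \frac{\delta e^{-\lambda_1 x_q}}{1 - (1-\delta) e^{-\lambda_2 x_q}}.$$

Then, we rewrite this expression as

$$\delta\, t_q^{\lambda_1} + (1-q)(1-\delta)\, t_q^{\lambda_2} - (1-q) = 0, \qquad (9)$$

where $t_q = e^{-x_q}$. Given the real solution of this equation, $t_q$, $x_q = -\log t_q$ is the $q$th quantile of a GMOE$(\lambda_1, \lambda_2, \delta)$ distributed variable. Notice that Equation (9) has only one solution in the interval $(0, 1)$, which is the relevant range because $t_q = e^{-x_q}$ with $x_q > 0$.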
Additionally, observe that this procedure is useful to obtain the quantile function of the GMOE distribution which is given by Q(u) = − log(t(u)), where, when replacing q by u, t(u) is the unique solution to (9).Therefore, if U is a uniform variate on the interval (0, 1), then the random variable X = Q(U) has pdf (8).
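The quantile procedure lends itself to a simple numerical sketch. The code below assumes that the quantile equation can be written as δt^{λ1} + (1 - q)(1 - δ)t^{λ2} - (1 - q) = 0 with t = e^{-x}, which follows from the assumed survival function δe^{-λ1 x}/(1 - (1 - δ)e^{-λ2 x}). It solves for t by bisection on (0, 1) and then uses the inverse transform to draw random variates. The names `gmoe_quantile` and `gmoe_sample` are hypothetical.

```python
import math
import random


def gmoe_cdf(x, lam1, lam2, delta):
    """Assumed GMOE cdf, used here only to check the quantiles."""
    return 1.0 - delta * math.exp(-lam1 * x) / (1.0 - (1.0 - delta) * math.exp(-lam2 * x))


def gmoe_quantile(q, lam1, lam2, delta, tol=1e-12):
    """Solve the quantile equation for t in (0, 1) by bisection and return -log(t)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")

    def phi(t):
        return delta * t ** lam1 + (1.0 - q) * (1.0 - delta) * t ** lam2 - (1.0 - q)

    lo, hi = 0.0, 1.0  # phi(0) = -(1 - q) < 0 and phi(1) = delta * q > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return -math.log(0.5 * (lo + hi))


def gmoe_sample(n, lam1, lam2, delta, rng):
    """Inverse-transform sampling: X = Q(U) with U uniform on (0, 1)."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # rng.random() may return exactly 0.0
            draws.append(gmoe_quantile(u, lam1, lam2, delta))
    return draws


if __name__ == "__main__":
    x_med = gmoe_quantile(0.5, 1.0, 2.0, 0.5)
    print(x_med, gmoe_cdf(x_med, 1.0, 2.0, 0.5))
    print(gmoe_sample(3, 1.0, 2.0, 0.5, random.Random(1)))
```

Bisection is chosen here only because it needs no starting value and no derivative; any bracketing root finder would serve equally well. Quantiles extremely close to 1 correspond to values of t below the tolerance and would need a finer tolerance.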

Moments
The moments of a GMO-exponential distribution can be written in closed form with the help of the well-known Hurwitz-Lerch transcendent function, $\Phi(z, s, a)$, which is defined by the expression

$$\Phi(z, s, a) = \sum_{k=0}^{\infty} \frac{z^k}{(k + a)^s},$$

and which can also be expressed in an integral form as follows:

$$\Phi(z, s, a) = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{t^{s-1} e^{-a t}}{1 - z e^{-t}}\, dt.$$

Symbolic and numerical evaluations of this function are easily obtained with Mathematica software using the command HurwitzLerchPhi.

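Proposition 1. The $n$th moment of the GMOE distribution is given by

$$E(X^n) = \frac{\delta\, n!}{\lambda_2^{n}}\, \Phi\!\left(1 - \delta,\, n,\, \frac{\lambda_1}{\lambda_2}\right).$$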
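Proof. Using expressions (8) and (10) and letting $t = \lambda_2 x$, the $n$th moment reduces to integrals in $t$ whose denominators are powers of $1 - (1-\delta) e^{-t}$. The integral with the squared denominator can be obtained by parts, and substituting the result in (12) gives the stated expression, which completes the proof.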
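As a cross-check on the closed form, the moment can also be derived from the survival function instead of the density. The sketch below is not the argument of the proof above; it assumes the survival function S(x) = δe^{-λ1 x}/(1 - (1 - δ)e^{-λ2 x}) and uses the standard identity E(X^n) = n∫x^{n-1}S(x)dx together with the integral form of the Hurwitz-Lerch function.

```latex
\begin{align*}
E(X^n) &= n\int_0^\infty x^{n-1} S(x)\,dx
        = n\delta \int_0^\infty \frac{x^{n-1} e^{-\lambda_1 x}}{1-(1-\delta)e^{-\lambda_2 x}}\,dx \\
       &= \frac{n\delta}{\lambda_2^{n}} \int_0^\infty
          \frac{t^{n-1} e^{-(\lambda_1/\lambda_2)\,t}}{1-(1-\delta)e^{-t}}\,dt
          \qquad (t=\lambda_2 x) \\
       &= \frac{n\delta\,\Gamma(n)}{\lambda_2^{n}}\,
          \Phi\!\left(1-\delta,\,n,\,\frac{\lambda_1}{\lambda_2}\right)
        = \frac{\delta\,n!}{\lambda_2^{n}}\,
          \Phi\!\left(1-\delta,\,n,\,\frac{\lambda_1}{\lambda_2}\right).
\end{align*}
```

For δ = 1 the series reduces to its first term, Φ(0, n, a) = a^{-n}, and the expression becomes n!/λ1^n, the nth moment of an exponential distribution with parameter λ1, in agreement with Remark 1.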
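Corollary 1. Let $X$ be a GMOE distribution with parameters $\lambda_1$, $\lambda_2$ and $\delta$. Then,
(i) $E(X) = \dfrac{\delta}{\lambda_2}\, \Phi\!\left(1-\delta, 1, \dfrac{\lambda_1}{\lambda_2}\right)$;
(ii) $\mathrm{Var}(X) = \dfrac{2\delta}{\lambda_2^2}\, \Phi\!\left(1-\delta, 2, \dfrac{\lambda_1}{\lambda_2}\right) - \dfrac{\delta^2}{\lambda_2^2}\, \Phi\!\left(1-\delta, 1, \dfrac{\lambda_1}{\lambda_2}\right)^{2}$;
(iii) For a fixed value of $\delta > 0$, the value of its $k$th moment decreases with $\lambda_1$ and with $\lambda_2$.
Proof. The proof is immediate, and (ii) follows from the identity $E((X - E(X))^2) = E(X^2) - (E(X))^2$.
Using (13)-(15), the coefficient of variation (CV) and the skewness ($\gamma_1$) of $X$ are given by

$$\mathrm{CV} = \frac{\sqrt{\mathrm{Var}(X)}}{E(X)}, \qquad \gamma_1 = \frac{E\left((X - E(X))^3\right)}{\mathrm{Var}(X)^{3/2}}.$$

Observe that in GMOE distributions with a constant $\lambda_1/\lambda_2$ ratio, the CV and $\gamma_1$ only depend on $\delta$.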
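The moment formula and the remark on the CV and skewness can be explored numerically. The sketch below assumes the closed form E(X^n) = δ n! λ2^{-n} Φ(1 - δ, n, λ1/λ2) and evaluates Φ by truncating its defining series, which restricts the illustration to 0 < δ < 2 (so that |1 - δ| < 1). All function names are hypothetical.

```python
import math


def lerch_phi(z, s, a, terms=5000):
    """Truncated series sum_{k>=0} z**k / (k + a)**s; assumes |z| < 1 and a > 0."""
    if abs(z) >= 1.0:
        raise ValueError("the series is only used here for |z| < 1")
    return sum(z ** k / (k + a) ** s for k in range(terms))


def gmoe_moment(n, lam1, lam2, delta):
    """n-th raw moment under the assumed closed form (valid here for 0 < delta < 2)."""
    return delta * math.factorial(n) / lam2 ** n * lerch_phi(1.0 - delta, n, lam1 / lam2)


def gmoe_cv(lam1, lam2, delta):
    """Coefficient of variation from the first two raw moments."""
    m1 = gmoe_moment(1, lam1, lam2, delta)
    m2 = gmoe_moment(2, lam1, lam2, delta)
    return math.sqrt(m2 - m1 ** 2) / m1


def gmoe_skewness(lam1, lam2, delta):
    """Skewness from the first three raw moments."""
    m1 = gmoe_moment(1, lam1, lam2, delta)
    m2 = gmoe_moment(2, lam1, lam2, delta)
    m3 = gmoe_moment(3, lam1, lam2, delta)
    return (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / (m2 - m1 ** 2) ** 1.5


if __name__ == "__main__":
    # Same ratio lam1/lam2 and same delta: CV and skewness should coincide.
    print(gmoe_cv(1.0, 2.0, 0.5), gmoe_cv(3.0, 6.0, 0.5))
    print(gmoe_skewness(1.0, 2.0, 0.5), gmoe_skewness(3.0, 6.0, 0.5))
```

The truncated series converges slowly when |1 - δ| is close to 1; in that region a dedicated implementation of the Hurwitz-Lerch function would be preferable.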

The Hazard Rate: Reliability Properties
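The hazard rate of a GMOE$(\lambda_1, \lambda_2, \delta)$ distribution, $r(x|\lambda_1, \lambda_2, \delta)$, is given by

$$r(x|\lambda_1, \lambda_2, \delta) = \lambda_1 + \frac{(1-\delta) \lambda_2 e^{-\lambda_2 x}}{1 - (1-\delta) e^{-\lambda_2 x}}.$$

As Figure 2 shows, the hazard rate function of the GMOE distribution can take monotonic and quasi-bathtub shapes for different values of the parameters $\lambda_1$, $\lambda_2$ and $\delta$. If we denote by $r_E(x|\lambda)$ the hazard rate of an Exp$(\lambda)$ distribution, the following results can be obtained immediately:
(i) $r(x|\lambda_1, \lambda_2, \delta) > r_E(x|\lambda_1)$ for $0 < \delta < 1$;
(ii) $r(x|\lambda_1, \lambda_2, \delta) < r_E(x|\lambda_1)$ for $\delta > 1$;
(iii) $r(x|\lambda_1, \lambda_2, \delta)$ is a strictly decreasing function for $0 < \delta < 1$, constant for $\delta = 1$ and strictly increasing for $\delta > 1$.
Proof. (i) and (ii) are immediate. For (iii), the result follows by observing that the sign of the first derivative of the hazard rate function with respect to $x$ is the opposite to the sign of $1 - \delta$.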
Furthermore, r(•|λ 1 , λ 2 , δ) is increasing in λ 1 and in δ.Thus, the GMOE distribution is positively ordered with respect to λ 1 according to the hazard rate ordering, and analogously, with respect to δ.
In contrast with the ordinary families of gamma and Weibull distributions, observe that $r(0|\lambda_1, \lambda_2, \delta) = \lambda_1 + (1-\delta)\lambda_2/\delta$, so that at the origin the hazard rate varies continuously with the parameters. Moreover, for the GMOE distribution, $\lim_{x \to \infty} r(x|\lambda_1, \lambda_2, \delta) = \lambda_1$, which is bounded and continuous in the parameters.
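The behaviour of the hazard rate at the origin and in the tail can be checked with a short sketch. It assumes the hazard rate λ1 + (1 - δ)λ2 e^{-λ2 x}/(1 - (1 - δ)e^{-λ2 x}), obtained as minus the derivative of the logarithm of the assumed survival function; the name `gmoe_hazard` is hypothetical.

```python
import math


def gmoe_hazard(x, lam1, lam2, delta):
    """Assumed GMOE hazard rate: -d/dx log S(x)."""
    e2 = math.exp(-lam2 * x)
    return lam1 + (1.0 - delta) * lam2 * e2 / (1.0 - (1.0 - delta) * e2)


if __name__ == "__main__":
    grid = [0.1 * i for i in range(50)]
    decreasing = [gmoe_hazard(x, 1.0, 2.0, 0.5) for x in grid]  # delta < 1
    increasing = [gmoe_hazard(x, 1.0, 2.0, 1.5) for x in grid]  # delta > 1
    print(decreasing[0], decreasing[-1])  # starts at lam1 + (1 - delta)*lam2/delta
    print(increasing[0], increasing[-1])  # both approach lam1 for large x
```

With λ1 = 1 and λ2 = 2, the hazard starts at 3 for δ = 0.5 and decreases towards 1, whereas for δ = 1.5 it starts below 1 and increases towards 1.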
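Finally, the residual life distribution of the random variable $X$, distributed as a GMOE distribution with the parameters $\lambda_1$, $\lambda_2$ and $\delta$, provided there is no failure prior to time $t > 0$, has the survival function

$$S(x|t) = \frac{S(x + t|\lambda_1, \lambda_2, \delta)}{S(t|\lambda_1, \lambda_2, \delta)} = \frac{\beta(t)\, e^{-\lambda_1 x}}{1 - (1 - \beta(t))\, e^{-\lambda_2 x}},$$

where $\beta(t) = 1 - (1 - \delta) e^{-\lambda_2 t}$. Thus, the residual life distribution of a random variable $X$ distributed as GMOE$(\lambda_1, \lambda_2, \delta)$ at time $t$ is another GMOE distribution with the third parameter depending upon time $t$.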
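Hence, from (12) the mean residual life function, i.e., the mean of the residual life distribution, is given by

$$\mu(t) = \frac{\beta(t)}{\lambda_2}\, \Phi\!\left(1 - \beta(t),\, 1,\, \frac{\lambda_1}{\lambda_2}\right).$$

It is then easy to see that $\mu(0) = E(X)$ and $\lim_{t \to \infty} \mu(t) = 1/\lambda_1$.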
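The closure of the family under residual life can be verified numerically. The sketch below assumes the survival function δe^{-λ1 x}/(1 - (1 - δ)e^{-λ2 x}) and compares S(x + t)/S(t) with the same survival function evaluated at the time-dependent third parameter β(t); the mean residual life is approximated by a plain trapezoidal rule rather than by a closed form. All function names and the integration settings are hypothetical choices for this illustration.

```python
import math


def gmoe_survival(x, lam1, lam2, delta):
    """Assumed GMOE survival function."""
    return delta * math.exp(-lam1 * x) / (1.0 - (1.0 - delta) * math.exp(-lam2 * x))


def beta(t, lam2, delta):
    """Time-dependent third parameter of the residual life distribution."""
    return 1.0 - (1.0 - delta) * math.exp(-lam2 * t)


def residual_survival(x, t, lam1, lam2, delta):
    """Pr(X > x + t | X > t)."""
    return gmoe_survival(x + t, lam1, lam2, delta) / gmoe_survival(t, lam1, lam2, delta)


def mean_residual_life(t, lam1, lam2, delta, upper=60.0, steps=60000):
    """Trapezoidal approximation of the integral of the residual survival over [0, upper]."""
    h = upper / steps
    total = 0.5 * (residual_survival(0.0, t, lam1, lam2, delta)
                   + residual_survival(upper, t, lam1, lam2, delta))
    for i in range(1, steps):
        total += residual_survival(i * h, t, lam1, lam2, delta)
    return total * h


if __name__ == "__main__":
    t = 0.8
    print(residual_survival(0.5, t, 1.0, 2.0, 0.5),
          gmoe_survival(0.5, 1.0, 2.0, beta(t, 2.0, 0.5)))
    print(mean_residual_life(0.0, 1.0, 2.0, 0.5), mean_residual_life(20.0, 1.0, 2.0, 0.5))
```

For large t the parameter β(t) approaches 1, so the residual life is approximately exponential with parameter λ1 and the mean residual life approaches 1/λ1.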

The Mode
From Remarks 1 and 2 in Section 2, we now focus on the values $\delta > 0$ with $\delta \neq 1$. The GMOE distribution can present its unique mode either at $M = 0$ or at $M > 0$.
Let us define a function $h(x)$ whose sign coincides with that of the derivative of the pdf in (8). To determine whether there exists a positive mode, we need merely decide whether $h(0) > 0$. Setting $t = \lambda_2/\lambda_1$, we define an auxiliary function $N(t, \delta)$ with the same sign as $h(0)$. Hence, in order to find the mode of the GMOE distribution we need only decide whether $N(t, \delta) > 0$. The answer depends on the range of values of $\delta$.
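One possible form of the auxiliary function can be worked out explicitly. The derivation below is a sketch under the assumption that the pdf is proportional to e^{-λ1 x}[λ1 + (1 - δ)(λ2 - λ1)e^{-λ2 x}]/(1 - (1 - δ)e^{-λ2 x})^2, the density implied by the assumed survival function; the resulting N(t, δ) is determined only up to a positive factor and need not coincide with the normalisation used in the paper.

```latex
\begin{align*}
\frac{d}{dx}\log g(x)\Big|_{x=0}
  &= -\lambda_1
     - \frac{(1-\delta)\lambda_2(\lambda_2-\lambda_1)}{\lambda_1+(1-\delta)(\lambda_2-\lambda_1)}
     - \frac{2(1-\delta)\lambda_2}{\delta}, \\[4pt]
N(t,\delta)
  &= (\delta-1)(2-\delta)\,t^{2} + 2\delta(\delta-1)\,t - \delta^{2},
  \qquad t=\frac{\lambda_2}{\lambda_1},
\end{align*}
```

where N(t, δ) is obtained by dividing the first expression by λ1 and multiplying by the positive quantity δ[1 + (1 - δ)(t - 1)]. Two checks agree with the cases discussed in the text: for 0 < δ < 1 every term of N(t, δ) is negative when t ≥ 0, and for δ = 2 the function reduces to 4(t - 1), which is positive exactly when t > 1.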

(Case a)
If $0 < \delta < 1$, $N(t, \delta) < 0$ for any $t \ge 0$, and therefore the mode of distribution implies one of the following two cases. On the one hand, which is contradictory, and on the other hand, However, for $1 < \delta < 2$, the inequality In summary, we conclude that the mode is reached at $M = 0$ in this case.

(Case c)
If $\delta = 2$, the condition reduces to $t > 1$, or equivalently, $\lambda_2 > \lambda_1$. Notice that, for any $\delta > 2$, so we conclude that, in this case, $M > 0$ if and only if

Order Statistics
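Let $X_1, \ldots, X_n$ be a random sample of size $n$ from the GMOE distribution in (8). Then, the density of the $j$th order statistic $X_{j:n}$, for $j = 1, \ldots, n$, is given by

$$g_{j:n}(x) = \frac{n!}{(j-1)!\,(n-j)!}\; g(x)\, \left[G(x)\right]^{j-1} \left[1 - G(x)\right]^{n-j}, \qquad (22)$$

where $g$ and $G$ denote the pdf (8) and the corresponding cdf of the GMOE distribution. In particular, the sampling distributions of the minimum $X_{1:n}$ and maximum $X_{n:n}$ are easily obtained from (22) by replacing $j$ by 1 and $n$, respectively.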

Estimation
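In this section, we estimate the unknown parameters of the GMOE distribution. Let $x = (x_1, \ldots, x_n)$ be a sample of size $n$ from the GMOE distribution in (8). The log-likelihood function for the parameters $(\lambda_1, \lambda_2, \delta)$ is expressed as

$$\ell(\alpha) = n \log \delta - \lambda_1 \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \log\left[\lambda_1 + (1-\delta)(\lambda_2 - \lambda_1) e^{-\lambda_2 x_i}\right] - 2 \sum_{i=1}^{n} \log\left[1 - (1-\delta) e^{-\lambda_2 x_i}\right],$$

where $\alpha = (\lambda_1, \lambda_2, \delta)$. By differentiating with respect to $\lambda_1$, $\lambda_2$ and $\delta$ and then equating to zero, we obtain the normal equations needed to compute the maximum likelihood estimates.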
These non-linear equations have no closed-form solution and must be solved numerically, for instance with standard software such as Mathematica.
The pdf of the GMOE distribution in (8) satisfies all the regularity conditions, and thus, by the usual large-sample approximation, the MLE $(\hat{\lambda}_1, \hat{\lambda}_2, \hat{\delta})$ can be treated as approximately multivariate normal with mean vector $(\lambda_1, \lambda_2, \delta)$ and variance-covariance matrix $I^{-1}$, the inverse of the Fisher information matrix. The expected values of the second-order derivatives that provide its elements are shown in Appendix A.
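For readers who prefer to maximise the likelihood numerically rather than solve the normal equations, the log-likelihood can be coded directly. The sketch below assumes the density implied by the survival function δe^{-λ1 x}/(1 - (1 - δ)e^{-λ2 x}) and returns minus infinity outside the assumed admissible region (δ - 1)λ2 ≤ δλ1; the function name `gmoe_loglik` is hypothetical, and the data in the usage example are made up for illustration only.

```python
import math


def gmoe_loglik(data, lam1, lam2, delta):
    """Log-likelihood of an i.i.d. sample under the assumed GMOE density."""
    if lam1 <= 0 or lam2 <= 0 or delta <= 0:
        return -math.inf
    if (delta - 1.0) * lam2 > delta * lam1:  # density would be negative near zero
        return -math.inf
    total = len(data) * math.log(delta)
    for x in data:
        e2 = math.exp(-lam2 * x)
        numer = lam1 + (1.0 - delta) * (lam2 - lam1) * e2
        denom = 1.0 - (1.0 - delta) * e2
        if numer <= 0.0:  # boundary of the admissible region
            return -math.inf
        total += -lam1 * x + math.log(numer) - 2.0 * math.log(denom)
    return total


if __name__ == "__main__":
    toy_data = [0.2, 1.5, 0.7]  # illustrative values, not taken from the paper
    print(gmoe_loglik(toy_data, 1.0, 2.0, 0.5))
    # With delta = 1 the value equals the exponential log-likelihood.
    print(gmoe_loglik(toy_data, 2.0, 5.0, 1.0), 3 * math.log(2.0) - 2.0 * sum(toy_data))
```

A function of this kind can be passed to any general-purpose optimiser; as the simulation study below indicates, the result may be sensitive to the starting values.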

Simulation Study
In this section, we evaluate the performance of the MLEs and Bayesian estimators using Monte Carlo simulation, for certain sample sizes and parameter values. The simulation study is repeated N = 1000 times with sample sizes n = 25, 50, 100. Table 1 shows the results obtained for different parameter combinations, together with the estimated bias and root mean squared error (RMSE) for each estimated parameter given a simulated sample of size n, using the common expressions

$$\mathrm{Bias}(\hat{\theta}) = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{\theta}_i - \theta\right), \qquad \mathrm{RMSE}(\hat{\theta}) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{\theta}_i - \theta\right)^2},$$

where $\hat{\theta}_i$ denotes the estimate of the parameter $\theta$ obtained in the $i$th replication.
Table 1 shows that the parameter estimators perform very poorly, mainly due to the nonlinearity and instability of the solutions to Equations (23)-(25), even for large values of n. The three likelihood equations are very complicated, and the Newton-Raphson method is a gradient procedure whose stability depends on the choice of initial values, which are not easy to select for these three equations. An alternative procedure to obtain stable estimates consists of developing a non-informative Bayesian estimation approach. To do so, we employ the MCMC method to generate samples from the posterior distributions of the parameters $\lambda_1$, $\lambda_2$ and $\delta$ under independent, vague uniform priors and then compute the corresponding Bayes estimators using the common squared error loss function. From Table 2 it is clear that MCMC samples can be used to estimate the parameters in GMOE distributions and that this method obtains better results than solving the normal equations directly by maximum likelihood. Simple code implemented using OpenBUGS is given in Appendix B.
The summary statistics shown in Table 2 are based on N = 1000 simulations with 50,000 iterations following a burn-in stage of 5000 iterations.
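The two summary measures used in the tables can be computed with a few lines of code. The sketch below assumes the usual definitions, namely the average deviation of the replicated estimates from the true value and the square root of the average squared deviation; the function names and the toy estimates are hypothetical and are not results from the paper.

```python
import math


def bias(estimates, true_value):
    """Average deviation of the replicated estimates from the true parameter value."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Root mean squared error of the replicated estimates."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


if __name__ == "__main__":
    toy_estimates = [1.1, 0.9, 1.0]  # illustrative replicated estimates of a parameter equal to 1
    print(bias(toy_estimates, 1.0), rmse(toy_estimates, 1.0))
```

In a full study, each element of `estimates` would be the estimate obtained from one simulated sample of size n, and the two functions would be applied separately to each of the three parameters.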

Example 1
Here, we revisit the real dataset from [4] representing the number of revolutions before failure for each of the 23 bearings in the life test described in Table 3. Table 4 shows the fits to the dataset obtained from the three-parameter gamma, Weibull and GMOE models, whose probability density functions are given as follows.

Three-Parameter Weibull Distribution
$$f(x|\alpha, \beta, \mu) = \frac{\alpha}{\beta} \left(\frac{x - \mu}{\beta}\right)^{\alpha - 1} \exp\left\{-\left(\frac{x - \mu}{\beta}\right)^{\alpha}\right\}, \qquad x > \mu.$$

This pdf represents a three-parameter Weibull distribution with shape parameter $\alpha$, scale parameter $\beta$ and location parameter $\mu$, where $\alpha$ and $\beta$ are positive real numbers and $\mu$ is any real number.
For comparative and illustrative purposes, all the usual measures, such as p-value, log-likelihood, AIC and BIC, are used to compare the estimated models. As is well known, a model with a minimum BIC value is to be preferred. From Table 4, the log-likelihood and BIC quantities show that, excluding the four-parameter gamma distribution, the remaining three models are almost identical. Table 4 shows that the GMOE distribution performs well in fitting the data distribution when there is a decreasing hazard rate function, and provides a fit as good as that of the common three-parameter distributions. Thus, GMOE distributions could be included in the catalogue of sampling distributions for this kind of dataset.

Example 2
The data for this example were compiled by the Swedish Committee on the Analysis of Risk Premium in Motor Insurance, summarised in Hallin and Ingenbleek [19] and Andrews and Herzberg [20]. The data correspond to third party automobile insurance claims for the year 1977, and are available at the URL [21]. We consider the sums of payments (the severity), in Swedish krona.
The histogram and the empirical hazard function for the data are shown in Figure 3. Observe that the monotonically decreasing hazard function suggests that a generalised exponential distribution may fit the data well. We fitted three models to these data: exponential (E), Marshall and Olkin exponential (MOE) and GMOE. Table 5 shows the fit of each of these models to the data set. For comparison, log-likelihood, BIC and AIC values are also presented, together with the estimation of the parameters by the maximum likelihood method. In the GMOE model, the stability of the MLE was confirmed by the non-informative Bayesian procedure described in Section 4.1, using vague uniform priors centred at the MLE obtained by solving (23)-(25). For the data set considered, both the BIC and AIC values indicate that the GMOE model is better.

Extensions of the GMO Scheme
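In this section, we show some general properties of the GMO scheme applied to any absolutely continuous distribution. Consider a pair of absolutely continuous distributions, and denote by $F_i(x)$, $S_i(x)$ and $f_i(x)$, for $i = 1, 2$, their respective cdfs, survival functions and pdfs. We assume that each $F_i(x)$ depends on its parameter $\theta_i$, $i = 1, 2$. We then define the survival function of the GMO with respect to $(F_1, F_2)$ as the function

$$S(x|\theta_1, \theta_2, \delta) = \frac{\delta S_1(x)}{1 - (1-\delta) S_2(x)}.$$

The corresponding cdf is then given by

$$G(x|\theta_1, \theta_2, \delta) = \frac{1 - (1-\delta) S_2(x) - \delta S_1(x)}{1 - (1-\delta) S_2(x)},$$

and the corresponding pdf, by

$$g(x|\theta_1, \theta_2, \delta) = \frac{\delta \left\{ f_1(x) \left[1 - (1-\delta) S_2(x)\right] + (1-\delta) S_1(x) f_2(x) \right\}}{\left[1 - (1-\delta) S_2(x)\right]^2}.$$

We require that $g(x|\theta_1, \theta_2, \delta) \ge 0$ for all $x$. Clearly, in the case $0 < \delta \le 1$, this condition is always met. On the other hand, when $\delta > 1$, the required condition reduces to $f_1(x) \ge (\delta - 1)\left(f_2(x) S_1(x) - S_2(x) f_1(x)\right)$.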
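The general scheme can be expressed as a small set of higher-order functions. The sketch below assumes that the GMO survival function is δS1(x)/(1 - (1 - δ)S2(x)), which is the form consistent with the positivity condition stated above; the helper names are hypothetical, and the positivity check is carried out only on a finite grid, so it is a necessary screen rather than a proof.

```python
import math


def gmo_survival(S1, S2, delta):
    """Return the GMO survival function built from survival functions S1 and S2 (assumed form)."""
    def survival(x):
        return delta * S1(x) / (1.0 - (1.0 - delta) * S2(x))
    return survival


def gmo_pdf(f1, S1, f2, S2, delta):
    """Return the GMO density obtained by differentiating the assumed survival function."""
    def pdf(x):
        denom = 1.0 - (1.0 - delta) * S2(x)
        return delta * (f1(x) * denom + (1.0 - delta) * S1(x) * f2(x)) / denom ** 2
    return pdf


def gmo_condition_holds(f1, S1, f2, S2, delta, grid):
    """Check f1 >= (delta - 1) * (f2*S1 - S2*f1) at every point of a finite grid."""
    return all(f1(x) >= (delta - 1.0) * (f2(x) * S1(x) - S2(x) * f1(x)) for x in grid)


if __name__ == "__main__":
    lam1, lam2 = 1.0, 2.0
    S1 = lambda x: math.exp(-lam1 * x)
    f1 = lambda x: lam1 * math.exp(-lam1 * x)
    S2 = lambda x: math.exp(-lam2 * x)
    f2 = lambda x: lam2 * math.exp(-lam2 * x)
    grid = [0.05 * i for i in range(200)]
    print(gmo_survival(S1, S2, 0.5)(0.5), gmo_pdf(f1, S1, f2, S2, 0.5)(0.5))
    print(gmo_condition_holds(f1, S1, f2, S2, 0.5, grid))
```

Taking S1 = S2 recovers the original Marshall-Olkin construction, and taking both to be exponential survival functions gives the GMOE case studied in the earlier sections.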

Conclusions
In this paper, we propose an extension of the Marshall-Olkin procedure to obtain a new three-parameter distribution with a monotone hazard rate function and describe some interesting properties that could be used in reliability scenarios. We show that the proposed distribution can be considered a valid alternative to well-known distributions, such as the Marshall-Olkin exponential and generalised gamma and Weibull distributions, among many others. The cumulative distribution function (cdf) and the hazard rate function present great flexibility. The nth moment is derived and particular values for the mean, variance and kurtosis are easily obtained. The non-linear equations for deriving the MLE and the elements of the observed information matrix are also presented. It can be seen that the maximum-likelihood method can be applied by running an MCMC procedure with non-informative uniform priors. This approach can also be applied to other distributions, although not to exponential ones. An application of the GMOE distribution to two real data sets is provided to demonstrate that this distribution provides a suitable alternative to the standard models.

Figure 3. Ordinary histogram (left panel) and empirical hazard function (right panel) of data in Example 2.

Table 4. Estimated parameters, log-likelihood, K-S statistics, p-values, AIC and BIC for Example 1.