A New Extended Two-Parameter Distribution: Properties, Estimation Methods, and Applications in Medicine and Geology

Abstract: In this paper, a new two-parameter generalized Ramos–Louzada distribution is proposed. The proposed model provides more flexibility in modeling data with increasing, decreasing, J-shaped, and reversed-J shaped hazard rate functions. Several statistical properties of the model were derived. The unknown parameters of the new distribution were explored using eight frequentist estimation approaches. These approaches are important for developing guidelines to choose the best method of estimation for the model parameters, which would be of great interest to practitioners and applied statisticians. Detailed numerical simulations are presented to examine the bias and the mean square error of the proposed estimators. The best estimation method and ordering performance of the estimators were determined using the partial and overall ranks of all estimation methods for various parameter combinations. The performance of the proposed distribution is illustrated using two real datasets from the fields of medicine and geology, and both datasets show that the new model is more appropriate as compared to the Marshall–Olkin exponential, exponentiated exponential, beta exponential, gamma, Poisson–Lomax, Lindley geometric, generalized Lindley, and Lindley distributions, among others.


Introduction
Probability distributions have great importance for modeling data in several areas, such as medicine, engineering, and life testing, among others. Ramos and Louzada [1] recently introduced a one-parameter distribution called the Ramos-Louzada (RL) distribution with survival function (SF) given by
$$S(t; \lambda) = \frac{\lambda^{2} - \lambda + t}{\lambda(\lambda - 1)}\, e^{-t/\lambda}, \quad t > 0, \tag{1}$$
where λ ≥ 2.
The two most common one-parameter distributions are the exponential and Lindley distributions. The important generalizations of the exponential distribution are the Weibull [2] and exponentiated exponential [3] models. In the case of the Lindley distribution, the power Lindley [4] and generalized Lindley [5] models play important roles in survival analysis. These two generalizations are obtained by considering a power parameter in the exponential and Lindley distributions. Ramos and Louzada [1] showed that (1) outperforms the common exponential and Lindley distributions in many situations. Therefore, we propose a new two-parameter extension of the RL distribution by including a power parameter in the baseline model (1). The new proposed model is called a generalized Ramos-Louzada (GRL) distribution.
Let T be a non-negative random variable that follows the GRL model; the SF of the random variable T is given by
$$S(t; \lambda, \alpha) = \frac{\lambda^{2} - \lambda + t^{\alpha}}{\lambda(\lambda - 1)}\, e^{-t^{\alpha}/\lambda}, \quad t > 0, \tag{2}$$
where λ (≥ 2) and α (> 0) are shape parameters. Some mathematical properties, parameter estimation via eight different methods, simulations, and applications are studied in this paper.
We can summarize the motivations for the proposed model as follows: (i) the cumulative distribution function (CDF) and hazard rate function (HRF) of the GRL model have simple closed forms; hence, it can be utilized to analyze censored data; (ii) it can be represented as a mixture of the Weibull distribution and a particular case of the generalized gamma distribution [6] (see Section 2); (iii) the GRL distribution exhibits increasing, decreasing, reversed-J shaped, and J-shaped hazard rates, whereas the RL model exhibits only an increasing hazard rate; and (iv) the GRL distribution outperformed many well-known distributions, namely, the Marshall-Olkin exponential, exponentiated exponential, beta exponential, gamma, Poisson-Lomax, Lindley geometric, generalized Lindley, and Lindley distributions, on two unimodal real datasets from the fields of medicine and geology.
Furthermore, another important goal of this paper is to compare several frequentist estimators of the GRL parameters in order to identify the best parameter estimation method for the proposed model, which should be of great interest to practitioners and applied statisticians. Estimating the parameters of generalized models using classical estimation methods and comparing them based on numerical simulations have been discussed by many authors (see, e.g., [7][8][9]). This paper is organized as follows: Section 2 introduces the GRL distribution and its properties, such as the quantile function, moments, order statistics, and HRF. Section 3 presents the estimators of the unknown GRL parameters based on eight classical estimation methods. The simulation study, which evaluates and compares the behavior of the eight classical estimation methods, is discussed in Section 4. Section 5 illustrates the relevance of the GRL model for two real lifetime datasets. Section 6 summarizes the present study.

The GRL Distribution and Its Properties
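Let T be a random variable that follows the GRL model with SF given in (2); the probability density function (PDF) of the random variable T is given by
$$f(t; \boldsymbol{\varphi}) = \frac{\alpha\, t^{\alpha-1}\left(\lambda^{2} - 2\lambda + t^{\alpha}\right)}{\lambda^{2}(\lambda - 1)}\, e^{-t^{\alpha}/\lambda}, \quad t > 0, \tag{3}$$
where φ = (λ, α). Note that the RL model can be obtained from (3) when α = 1. When λ = 2, we get a special case of the generalized gamma distribution. The CDF of the GRL distribution is given by
$$F(t; \boldsymbol{\varphi}) = 1 - \frac{\lambda^{2} - \lambda + t^{\alpha}}{\lambda(\lambda - 1)}\, e^{-t^{\alpha}/\lambda}. \tag{4}$$
The HRF of T is given by
$$h(t; \boldsymbol{\varphi}) = \frac{\alpha\, t^{\alpha-1}\left(\lambda^{2} - 2\lambda + t^{\alpha}\right)}{\lambda\left(\lambda^{2} - \lambda + t^{\alpha}\right)}. \tag{5}$$
The GRL distribution can be expressed as a two-component mixture
$$f(t; \boldsymbol{\varphi}) = p\, f_{1}(t; \boldsymbol{\varphi}) + (1 - p)\, f_{2}(t; \boldsymbol{\varphi}), \tag{6}$$
where 1 − p = 1/(λ − 1) (or p = (λ − 2)/(λ − 1)) and
$$f_{j}(t; \boldsymbol{\varphi}) = \frac{\alpha}{\lambda^{j}}\, t^{j\alpha - 1}\, e^{-t^{\alpha}/\lambda}, \quad j = 1, 2.$$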
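The relations among the SF, PDF, CDF, HRF, and the mixture form can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the SF has the form S(t) = (λ² − λ + t^α) / (λ(λ − 1)) · exp(−t^α/λ), which is the RL survival function with t replaced by t^α, and the mixture components f_j(t) = (α/λ^j) t^(jα−1) exp(−t^α/λ) stated in the text. All function names are illustrative.

```python
import math


def grl_sf(t, lam, alpha):
    """Assumed GRL survival function; requires lam >= 2 and alpha > 0."""
    u = t ** alpha
    return (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def grl_cdf(t, lam, alpha):
    """CDF as the complement of the survival function."""
    return 1.0 - grl_sf(t, lam, alpha)


def grl_pdf(t, lam, alpha):
    """PDF obtained by differentiating the assumed CDF with respect to t."""
    u = t ** alpha
    return (alpha * t ** (alpha - 1.0) * (lam ** 2 - 2.0 * lam + u)
            * math.exp(-u / lam) / (lam ** 2 * (lam - 1.0)))


def grl_hrf(t, lam, alpha):
    """Hazard rate h(t) = f(t) / S(t) after cancelling the exponential term."""
    u = t ** alpha
    return (alpha * t ** (alpha - 1.0) * (lam ** 2 - 2.0 * lam + u)
            / (lam * (lam ** 2 - lam + u)))


def grl_mixture_pdf(t, lam, alpha):
    """Two-component mixture p*f1 + (1-p)*f2 with p = (lam-2)/(lam-1)."""
    p = (lam - 2.0) / (lam - 1.0)

    def component(j):
        return (alpha / lam ** j * t ** (j * alpha - 1.0)
                * math.exp(-(t ** alpha) / lam))

    return p * component(1) + (1.0 - p) * component(2)
```

For any admissible (λ, α) and t > 0, `grl_pdf` and `grl_mixture_pdf` should agree up to rounding, which is the "after some algebra" step in code form.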
Note that f 1 (·) is a Weibull distribution and f 2 (·) is a particular case of the generalized gamma distribution [6]. Then, after some algebra, Equation (6) reduces to the PDF in (3). Figure 1 displays some possible shapes of HRF of the GRL for some selected values of λ and α. The shape of HRF can follow increasing, decreasing, reversed-J shaped, or J shaped hazard rates.

Shapes
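For λ > 2, the behaviors of the PDF in (3) when t → 0 and t → ∞ are, respectively, given by
$$\lim_{t \to 0} f(t; \boldsymbol{\varphi}) = \begin{cases} \infty, & \alpha < 1, \\ \dfrac{\lambda - 2}{\lambda(\lambda - 1)}, & \alpha = 1, \\ 0, & \alpha > 1, \end{cases} \qquad \lim_{t \to \infty} f(t; \boldsymbol{\varphi}) = 0.$$
In Figure 2, we present the shapes of the PDF for different values of the parameters λ and α. The PDF of the GRL model can be right-skewed or reversed-J shaped.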

Quantile Function
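The quantile function (QF) of the GRL distribution defined in (3), say, Q(p) with 0 < p < 1, can be obtained by solving the equation F(Q(p)) = p in (4) for Q(p) in terms of p, and this implies
$$Q(p) = \left\{ \lambda \left[ 1 - \lambda - W_{-1}\!\left( (p - 1)(\lambda - 1)\, e^{1 - \lambda} \right) \right] \right\}^{1/\alpha}, \tag{8}$$
where $W_{-1}(\cdot)$ is the negative branch of the Lambert function.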
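To illustrate how the quantile function can be used for inverse-transform sampling, the sketch below evaluates Q(p) = {λ[1 − λ − W₋₁((p − 1)(λ − 1)e^(1−λ))]}^(1/α), which follows from solving F(Q(p)) = p under the assumed CDF F(t) = 1 − (λ² − λ + t^α)/(λ(λ − 1)) · exp(−t^α/λ). The Python standard library has no Lambert function, so the negative branch is computed here by plain bisection. That is a substitution made for self-containment, and all names are illustrative.

```python
import math
import random


def lambert_w_minus1(x):
    """Negative real branch W_{-1}(x) for -1/e <= x < 0, found by bisection.

    On (-inf, -1] the map w -> w*exp(w) decreases from 0 to -1/e,
    so the root of w*exp(w) = x is unique there.
    """
    if x >= 0.0 or x < -1.0 / math.e - 1e-12:
        raise ValueError("W_{-1} is real only for -1/e <= x < 0")
    hi = -1.0
    lo = -2.0
    while lo * math.exp(lo) < x:      # move left until the root is bracketed
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < x:   # value too small: root lies to the left
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def grl_quantile(p, lam, alpha):
    """Closed-form quantile under the assumed GRL CDF, for 0 <= p < 1."""
    x = (p - 1.0) * (lam - 1.0) * math.exp(1.0 - lam)
    w = lambert_w_minus1(x)
    # max(..., 0) guards against tiny negative values from rounding near p = 0
    return max(lam * (1.0 - lam - w), 0.0) ** (1.0 / alpha)


def grl_cdf(t, lam, alpha):
    """Assumed GRL CDF, used only to check the round trip F(Q(p)) = p."""
    u = t ** alpha
    return 1.0 - (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def grl_sample(n, lam, alpha, rng=random):
    """Inverse-transform sampling: T = Q(U) with U uniform on [0, 1)."""
    return [grl_quantile(rng.random(), lam, alpha) for _ in range(n)]
```

A library implementation of the Lambert function, where available, could replace `lambert_w_minus1` without changing the rest of the sketch.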

Moments
Moments play an important role in statistical theory, so in this section we provide the r−th moment, the mean, and the variance for the GRL distribution.

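Proposition 1. For the random variable T that follows the GRL distribution, the r-th moment is given by
$$\mu_{r} = E[T^{r}] = \frac{\lambda^{r/\alpha}\, \Gamma\!\left(\frac{r}{\alpha} + 1\right) \left[\alpha(\lambda - 1) + r\right]}{\alpha(\lambda - 1)}. \tag{9}$$
Proof. Note that the r-th moment of the random variable T is given by
$$\mu_{r} = \int_{0}^{\infty} t^{r} f(t; \boldsymbol{\varphi})\, dt.$$
Since the GRL model can be expressed as a two-component mixture, as in (6), we have
$$\mu_{r} = p \int_{0}^{\infty} t^{r} f_{1}(t; \boldsymbol{\varphi})\, dt + (1 - p) \int_{0}^{\infty} t^{r} f_{2}(t; \boldsymbol{\varphi})\, dt = \lambda^{r/\alpha} \left[ p\, \Gamma\!\left(\tfrac{r}{\alpha} + 1\right) + (1 - p)\, \Gamma\!\left(\tfrac{r}{\alpha} + 2\right) \right],$$
which reduces to (9) because Γ(r/α + 2) = (r/α + 1) Γ(r/α + 1).
Proposition 2. If the random variable T follows the GRL distribution, its mean and variance, respectively, are given by
$$\mu = \frac{\lambda^{1/\alpha}\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left[\alpha(\lambda - 1) + 1\right]}{\alpha(\lambda - 1)}, \qquad \sigma^{2} = \frac{\lambda^{2/\alpha}\, \Gamma\!\left(\frac{2}{\alpha} + 1\right) \left[\alpha(\lambda - 1) + 2\right]}{\alpha(\lambda - 1)} - \mu^{2}.$$
Proof. From (9) and considering r = 1, it follows that $\mu_{1} = \mu$. The second result is obtained by using $\sigma^{2} = E[T^{2}] - \mu^{2}$, and with some algebra the proof is completed.
Proposition 3. The r-th central moment for the GRL distribution is given by
$$M_{r} = E[(T - \mu)^{r}] = \sum_{i=0}^{r} \binom{r}{i} (-\mu)^{r - i}\, E[T^{i}].$$
Proof. The result follows directly from the binomial expansion and Proposition 1.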
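The moment results can be turned into a small numerical check. The sketch below is an illustration under stated assumptions rather than the authors' R code: it takes the mixture weights p = (λ − 2)/(λ − 1) and components f_j(t) = (α/λ^j) t^(jα−1) exp(−t^α/λ) from the text, for which t^α follows an exponential law (j = 1) or a gamma law with shape 2 (j = 2), both with scale λ. Raw moments then follow from the gamma function, and skewness and kurtosis from the usual central-moment identities. Function names are illustrative.

```python
import math


def grl_raw_moment(r, lam, alpha):
    """E[T^r] as the mixture-weighted sum of the two component moments."""
    k = r / alpha
    p = (lam - 2.0) / (lam - 1.0)
    return lam ** k * (p * math.gamma(k + 1.0) + (1.0 - p) * math.gamma(k + 2.0))


def grl_summary(lam, alpha):
    """Mean, variance, skewness, and kurtosis computed from raw moments."""
    m1, m2, m3, m4 = (grl_raw_moment(r, lam, alpha) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return {
        "mean": m1,
        "variance": var,
        "skewness": mu3 / var ** 1.5,
        "kurtosis": mu4 / var ** 2,
    }
```

With α = 1 and λ = 2 the model reduces to a gamma distribution with shape 2 and scale 2, so `grl_summary(2.0, 1.0)` gives a skewness of about 1.41421 and a kurtosis of 6, which agree with the lower ends of the RL ranges quoted from Table 1.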
The mean, variance, skewness, and kurtosis of the GRL distribution were computed numerically for different values of the parameters λ and α, using R software. Table 1 displays these numerical values. From Table 1, the skewness of the GRL distribution varies within the interval (−0.68158, 5.17333), whereas the skewness of the RL distribution ranges only within the interval (1.41421, 1.85648) when the parameter λ takes the values 2, 3.1, 4, and 5.5. Furthermore, the kurtosis of the GRL distribution has a much wider spread, ranging from 2.69447 to 52.6597, whereas the kurtosis of the RL distribution varies only from 6.00 to 8.04 for the same values of λ. The GRL model can be left skewed or right skewed. Hence, the GRL distribution is a flexible distribution which can be used in modeling skewed data.

Order Statistics
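Let $T_{1}, T_{2}, \ldots, T_{n}$ be a random sample from (3) and let $T_{1:n} \le T_{2:n} \le \cdots \le T_{n:n}$ denote the corresponding order statistics. It is well known that the PDF and the CDF of the r-th order statistic, say, $T_{r:n}$ with 1 ≤ r ≤ n, respectively, are given by
$$f_{r:n}(t) = \frac{n!}{(r - 1)!\,(n - r)!}\, f(t)\, [F(t)]^{r - 1}\, [1 - F(t)]^{n - r} \tag{13}$$
and
$$F_{r:n}(t) = \sum_{k=r}^{n} \binom{n}{k} [F(t)]^{k}\, [1 - F(t)]^{n - k}. \tag{14}$$
It follows from (13) and (14) that the PDF and CDF of the r-th order statistic of the GRL distribution are obtained by substituting the PDF (3), the CDF (4), and the SF (2), so that $[F(t)]^{k} = [1 - S(t; \lambda, \alpha)]^{k}$ and $[1 - F(t)]^{n - k} = [S(t; \lambda, \alpha)]^{n - k}$.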

Maximum Likelihood Estimators
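In this sub-section we present the MLEs of the parameters λ and α of the GRL distribution. Let $t_{1}, \ldots, t_{n}$ be a sample from the GRL distribution given in (3). In this case, for φ = (λ, α), the likelihood function from (3) is given by
$$L(\boldsymbol{\varphi}; \boldsymbol{t}) = \frac{\alpha^{n}}{\lambda^{2n}(\lambda - 1)^{n}} \left[ \prod_{i=1}^{n} t_{i}^{\alpha - 1} \left(\lambda^{2} - 2\lambda + t_{i}^{\alpha}\right) \right] \exp\!\left( -\frac{1}{\lambda} \sum_{i=1}^{n} t_{i}^{\alpha} \right).$$
The log-likelihood function $\ell(\boldsymbol{\varphi}; \boldsymbol{t}) = \log L(\boldsymbol{\varphi}; \boldsymbol{t})$ is given by
$$\ell(\boldsymbol{\varphi}; \boldsymbol{t}) = n \log \alpha - 2n \log \lambda - n \log(\lambda - 1) + (\alpha - 1) \sum_{i=1}^{n} \log t_{i} + \sum_{i=1}^{n} \log\!\left(\lambda^{2} - 2\lambda + t_{i}^{\alpha}\right) - \frac{1}{\lambda} \sum_{i=1}^{n} t_{i}^{\alpha}.$$
From the expressions $\partial \ell / \partial \lambda = 0$ and $\partial \ell / \partial \alpha = 0$, we get the likelihood equations
$$-\frac{2n}{\lambda} - \frac{n}{\lambda - 1} + \sum_{i=1}^{n} \frac{2\lambda - 2}{\lambda^{2} - 2\lambda + t_{i}^{\alpha}} + \frac{1}{\lambda^{2}} \sum_{i=1}^{n} t_{i}^{\alpha} = 0$$
and
$$\frac{n}{\alpha} + \sum_{i=1}^{n} \log t_{i} + \sum_{i=1}^{n} \frac{t_{i}^{\alpha} \log t_{i}}{\lambda^{2} - 2\lambda + t_{i}^{\alpha}} - \frac{1}{\lambda} \sum_{i=1}^{n} t_{i}^{\alpha} \log t_{i} = 0.$$
Under mild conditions [10], the ML estimates are asymptotically normally distributed, with a bivariate normal distribution given by
$$(\hat{\lambda}, \hat{\alpha}) \sim N_{2}\!\left[ (\lambda, \alpha),\; H^{-1}(\lambda, \alpha) \right] \quad \text{as } n \to \infty,$$
where the elements of the observed Fisher information matrix H(λ, α) are the negatives of the second-order partial derivatives of $\ell(\boldsymbol{\varphi}; \boldsymbol{t})$ with respect to λ and α. The MLEs can also be obtained by maximizing the log-likelihood directly with different programs, namely, R (optim function) and SAS (PROC NLMIXED), or by solving the nonlinear likelihood equations numerically.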
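As a hedged illustration of the likelihood computations, the sketch below codes the log-likelihood and its two partial derivatives, assuming the PDF f(t) = α t^(α−1)(λ² − 2λ + t^α) exp(−t^α/λ) / (λ²(λ − 1)). The function names are illustrative and no particular optimizer is implied. A finite-difference comparison of `grl_score` against `grl_loglik` is a convenient way to confirm that the likelihood equations were differentiated correctly.

```python
import math


def grl_loglik(lam, alpha, data):
    """Log-likelihood of an i.i.d. sample under the assumed GRL PDF."""
    n = len(data)
    total = n * (math.log(alpha) - 2.0 * math.log(lam) - math.log(lam - 1.0))
    for t in data:
        u = t ** alpha
        total += ((alpha - 1.0) * math.log(t)
                  + math.log(lam ** 2 - 2.0 * lam + u)
                  - u / lam)
    return total


def grl_score(lam, alpha, data):
    """Partial derivatives of the log-likelihood with respect to lam and alpha."""
    n = len(data)
    d_lam = -2.0 * n / lam - n / (lam - 1.0)
    d_alpha = n / alpha
    for t in data:
        u = t ** alpha
        g = lam ** 2 - 2.0 * lam + u
        log_t = math.log(t)
        d_lam += (2.0 * lam - 2.0) / g + u / lam ** 2
        d_alpha += log_t + u * log_t / g - u * log_t / lam
    return d_lam, d_alpha
```

In practice the negative of `grl_loglik` would be passed to a numerical optimizer subject to λ ≥ 2 and α > 0, in the same spirit as the R and SAS routines mentioned above.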

Ordinary and Weighted Least-Squares Estimators
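Let $t_{(1)}, t_{(2)}, \ldots, t_{(n)}$ be the order statistics of a sample of size n from F(t; λ, α) in (4). Following [11], the OLSEs $\hat{\lambda}_{LSE}$ and $\hat{\alpha}_{LSE}$ can be obtained by minimizing
$$V(\lambda, \alpha) = \sum_{i=1}^{n} \left[ F(t_{(i)} \mid \lambda, \alpha) - \frac{i}{n + 1} \right]^{2}$$
with respect to λ and α. Equivalently, the OLSEs follow by solving the non-linear equations
$$\sum_{i=1}^{n} \left[ F(t_{(i)} \mid \lambda, \alpha) - \frac{i}{n + 1} \right] \Delta_{s}(t_{(i)} \mid \lambda, \alpha) = 0, \quad s = 1, 2,$$
where
$$\Delta_{1}(t_{(i)} \mid \lambda, \alpha) = \frac{\partial}{\partial \lambda} F(t_{(i)} \mid \lambda, \alpha), \qquad \Delta_{2}(t_{(i)} \mid \lambda, \alpha) = \frac{\partial}{\partial \alpha} F(t_{(i)} \mid \lambda, \alpha). \tag{17}$$
Note that the solution of these equations for s = 1, 2 can be obtained numerically. The WLSEs [11] $\hat{\lambda}_{WLSE}$ and $\hat{\alpha}_{WLSE}$ can be obtained by minimizing the following equation:
$$W(\lambda, \alpha) = \sum_{i=1}^{n} \frac{(n + 1)^{2}(n + 2)}{i\,(n - i + 1)} \left[ F(t_{(i)} \mid \lambda, \alpha) - \frac{i}{n + 1} \right]^{2}.$$
Further, the WLSEs can also be derived by solving the non-linear equations
$$\sum_{i=1}^{n} \frac{(n + 1)^{2}(n + 2)}{i\,(n - i + 1)} \left[ F(t_{(i)} \mid \lambda, \alpha) - \frac{i}{n + 1} \right] \Delta_{s}(t_{(i)} \mid \lambda, \alpha) = 0, \quad s = 1, 2,$$
where $\Delta_{1}(\cdot \mid \lambda, \alpha)$ and $\Delta_{2}(\cdot \mid \lambda, \alpha)$ are provided in (17).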
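The two least-squares criteria can be sketched as plain objective functions to be handed to a numerical minimizer. The code below assumes the standard forms from the least-squares literature: squared distances between F(t_(i)) and i/(n + 1), with weights (n + 1)²(n + 2) / (i(n − i + 1)) in the weighted case. It also assumes the CDF F(t) = 1 − (λ² − λ + t^α)/(λ(λ − 1)) · exp(−t^α/λ). Function names are illustrative.

```python
import math


def grl_cdf(t, lam, alpha):
    """Assumed GRL CDF; requires lam >= 2 and alpha > 0."""
    u = t ** alpha
    return 1.0 - (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def ols_objective(lam, alpha, data):
    """Ordinary least-squares criterion over the ordered sample."""
    ts = sorted(data)
    n = len(ts)
    return sum((grl_cdf(t, lam, alpha) - i / (n + 1)) ** 2
               for i, t in enumerate(ts, start=1))


def wls_objective(lam, alpha, data):
    """Weighted criterion; weights are inverse variances of uniform order statistics."""
    ts = sorted(data)
    n = len(ts)
    total = 0.0
    for i, t in enumerate(ts, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (grl_cdf(t, lam, alpha) - i / (n + 1)) ** 2
    return total
```

Minimizing either function over λ ≥ 2 and α > 0 gives the corresponding estimates; the optimizer itself is left open here.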

Maximum Product of Spacing Estimators
The maximum product of the spacings method [12][13][14], as an approximation of the Kullback-Leibler information measure, is a good alternative to the maximum likelihood method.
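A minimal sketch of the spacing criterion follows, assuming the usual definition: with F(t_(0)) = 0 and F(t_(n+1)) = 1, the spacings are D_i = F(t_(i)) − F(t_(i−1)) for i = 1, ..., n + 1, and the estimates maximize the mean of log D_i (equivalently, the geometric mean of the spacings). The CDF form F(t) = 1 − (λ² − λ + t^α)/(λ(λ − 1)) · exp(−t^α/λ) and the function names are assumptions made for illustration.

```python
import math


def grl_cdf(t, lam, alpha):
    """Assumed GRL CDF; requires lam >= 2 and alpha > 0."""
    u = t ** alpha
    return 1.0 - (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def mps_objective(lam, alpha, data):
    """Mean log-spacing; larger is better.

    Tied observations give a zero spacing, for which math.log raises
    ValueError, so ties need separate handling before calling this.
    """
    ts = sorted(data)
    cdf_values = [0.0] + [grl_cdf(t, lam, alpha) for t in ts] + [1.0]
    spacings = [b - a for a, b in zip(cdf_values, cdf_values[1:])]
    return sum(math.log(d) for d in spacings) / len(spacings)
```

Because the spacings sum to one, the objective can never exceed log(1/(n + 1)), which is a useful sanity check when wiring it into an optimizer.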

The Cramér-von Mises Minimum Distance Estimators
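The CVME, as a type of minimum distance estimator, has less bias than the other minimum distance estimators [15]. The CVMEs are obtained based on the difference between the estimates of the CDF and the empirical distribution function [16]. The CVMEs of the GRL parameters are obtained by minimizing
$$C(\lambda, \alpha) = \frac{1}{12n} + \sum_{i=1}^{n} \left[ F(t_{(i)} \mid \lambda, \alpha) - \frac{2i - 1}{2n} \right]^{2}$$
with respect to λ and α. Further, the CVMEs follow by solving the non-linear equations
$$\sum_{i=1}^{n} \left[ F(t_{(i)} \mid \lambda, \alpha) - \frac{2i - 1}{2n} \right] \Delta_{s}(t_{(i)} \mid \lambda, \alpha) = 0, \quad s = 1, 2,$$
where $\Delta_{1}(\cdot \mid \lambda, \alpha)$ and $\Delta_{2}(\cdot \mid \lambda, \alpha)$ are provided in (17).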
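The Cramér-von Mises criterion can be written as a short objective function. This sketch assumes the standard form 1/(12n) + Σ [F(t_(i)) − (2i − 1)/(2n)]² and the CDF F(t) = 1 − (λ² − λ + t^α)/(λ(λ − 1)) · exp(−t^α/λ); the function names are illustrative.

```python
import math


def grl_cdf(t, lam, alpha):
    """Assumed GRL CDF; requires lam >= 2 and alpha > 0."""
    u = t ** alpha
    return 1.0 - (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def cvm_objective(lam, alpha, data):
    """Cramér-von Mises distance between the fitted and empirical CDFs."""
    ts = sorted(data)
    n = len(ts)
    total = 1.0 / (12.0 * n)
    for i, t in enumerate(ts, start=1):
        total += (grl_cdf(t, lam, alpha) - (2.0 * i - 1.0) / (2.0 * n)) ** 2
    return total
```

The constant 1/(12n) does not affect the minimizer, but keeping it makes the returned value comparable with the usual Cramér-von Mises statistic.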

The Anderson-Darling and Right-Tail Anderson-Darling Estimators
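The Anderson-Darling estimator (ADE) is another type of minimum distance estimator. The ADEs of the GRL parameters are obtained by minimizing
$$A(\lambda, \alpha) = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \log F(t_{(i)} \mid \lambda, \alpha) + \log S(t_{(n + 1 - i)} \mid \lambda, \alpha) \right]$$
with respect to λ and α. The ADEs can also be obtained by solving the non-linear equations
$$\sum_{i=1}^{n} (2i - 1) \left[ \frac{\Delta_{s}(t_{(i)} \mid \lambda, \alpha)}{F(t_{(i)} \mid \lambda, \alpha)} - \frac{\Delta_{s}(t_{(n + 1 - i)} \mid \lambda, \alpha)}{S(t_{(n + 1 - i)} \mid \lambda, \alpha)} \right] = 0, \quad s = 1, 2.$$
The RADEs of the GRL parameters are obtained by minimizing
$$R(\lambda, \alpha) = \frac{n}{2} - 2 \sum_{i=1}^{n} F(t_{(i)} \mid \lambda, \alpha) - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \log S(t_{(n + 1 - i)} \mid \lambda, \alpha)$$
with respect to λ and α. The RADEs can also be obtained by solving the non-linear equations
$$-2 \sum_{i=1}^{n} \Delta_{s}(t_{(i)} \mid \lambda, \alpha) + \frac{1}{n} \sum_{i=1}^{n} (2i - 1)\, \frac{\Delta_{s}(t_{(n + 1 - i)} \mid \lambda, \alpha)}{S(t_{(n + 1 - i)} \mid \lambda, \alpha)} = 0, \quad s = 1, 2,$$
where $\Delta_{1}(\cdot \mid \lambda, \alpha)$ and $\Delta_{2}(\cdot \mid \lambda, \alpha)$ are defined in Equation (17).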
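Both Anderson-Darling criteria can be sketched as objective functions. The code assumes their standard forms: A = −n − (1/n) Σ (2i − 1)[log F(t_(i)) + log S(t_(n+1−i))] and R = n/2 − 2 Σ F(t_(i)) − (1/n) Σ (2i − 1) log S(t_(n+1−i)). It further assumes the SF S(t) = (λ² − λ + t^α)/(λ(λ − 1)) · exp(−t^α/λ). Function names are illustrative.

```python
import math


def grl_sf(t, lam, alpha):
    """Assumed GRL survival function; requires lam >= 2 and alpha > 0."""
    u = t ** alpha
    return (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def ad_objective(lam, alpha, data):
    """Anderson-Darling distance; weights both tails of the distribution."""
    ts = sorted(data)
    n = len(ts)
    total = 0.0
    for i in range(1, n + 1):
        log_cdf = math.log(1.0 - grl_sf(ts[i - 1], lam, alpha))
        log_sf = math.log(grl_sf(ts[n - i], lam, alpha))  # t_(n+1-i), 0-based index n-i
        total += (2 * i - 1) * (log_cdf + log_sf)
    return -n - total / n


def rad_objective(lam, alpha, data):
    """Right-tail Anderson-Darling distance; emphasizes the upper tail."""
    ts = sorted(data)
    n = len(ts)
    cdf_sum = sum(1.0 - grl_sf(t, lam, alpha) for t in ts)
    tail = sum((2 * i - 1) * math.log(grl_sf(ts[n - i], lam, alpha))
               for i in range(1, n + 1))
    return n / 2.0 - 2.0 * cdf_sum - tail / n
```

The only structural difference between the two is that the right-tail version drops the log F term, so misfit in the lower tail is penalized less.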

Percentile Estimators
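This method was originally suggested by [17,18]. Let $u_{i} = i/(n + 1)$ be an unbiased estimator of $F(t_{(i)} \mid \lambda, \alpha)$. Then, the PCEs of the parameters of the GRL distribution are obtained by minimizing the following function with respect to λ and α:
$$P(\lambda, \alpha) = \sum_{i=1}^{n} \left[ t_{(i)} - Q(u_{i}) \right]^{2}, \qquad Q(u_{i}) = \left\{ \lambda \left[ 1 - \lambda - W_{-1}\!\left( (u_{i} - 1)(\lambda - 1)\, e^{1 - \lambda} \right) \right] \right\}^{1/\alpha},$$
where $W_{-1}(\cdot)$ is the negative branch of the Lambert function.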
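The percentile criterion compares each ordered observation with the model quantile at u_i = i/(n + 1). In the sketch below the quantile is obtained by numerically inverting the assumed CDF F(t) = 1 − (λ² − λ + t^α)/(λ(λ − 1)) · exp(−t^α/λ) with bisection. This is a deliberate substitution for the closed-form Lambert expression, chosen so that the example needs only the standard library. Function names are illustrative.

```python
import math


def grl_cdf(t, lam, alpha):
    """Assumed GRL CDF; requires lam >= 2 and alpha > 0."""
    u = t ** alpha
    return 1.0 - (lam ** 2 - lam + u) / (lam * (lam - 1.0)) * math.exp(-u / lam)


def grl_quantile_bisect(p, lam, alpha):
    """Solve F(q) = p for q by bisection, for 0 < p < 1."""
    lo, hi = 0.0, 1.0
    while grl_cdf(hi, lam, alpha) < p:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if grl_cdf(mid, lam, alpha) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pce_objective(lam, alpha, data):
    """Sum of squared gaps between ordered data and model quantiles at i/(n+1)."""
    ts = sorted(data)
    n = len(ts)
    return sum((t - grl_quantile_bisect(i / (n + 1), lam, alpha)) ** 2
               for i, t in enumerate(ts, start=1))
```

If the data happen to equal the model quantiles at u_1, ..., u_n, the objective is zero at the generating parameters, which gives a simple self-check.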

Simulation Analysis
A simulation study was conducted to explore and compare the behavior of the estimates with respect to their average absolute biases (|Bias|), average mean square errors (MSEs), and average mean relative errors (MREs), defined for a parameter φ with estimates $\hat{\varphi}_{1}, \ldots, \hat{\varphi}_{N}$ as
$$|\mathrm{Bias}(\hat{\varphi})| = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{\varphi}_{i} - \varphi \right|, \qquad \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{\varphi}_{i} - \varphi \right)^{2}, \qquad \mathrm{MRE} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| \hat{\varphi}_{i} - \varphi \right|}{\varphi}.$$
We generated N = 5000 random samples of sizes n = 30, 50, 80, 100, and 200 from the GRL model by using Equation (8) [19]. For each parameter combination and each sample, we estimated the GRL parameters λ and α using eight frequentist estimators: WLSE, OLSE, MLE, MPSE, CVME, ADE, RADE, and PCE. Then, the MSEs and MREs of the parameter estimates were computed. Simulated outcomes are listed in Tables 2-9. Furthermore, these tables show the rank of each of the estimators among all the estimators in each row; the superscripts are the indicators, and ∑ Ranks is the partial sum of the ranks for each column in a certain sample size. Table 10 shows the partial and overall ranks of the estimators.
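The three summary measures can be computed from a list of replicated estimates as follows. This is a minimal sketch assuming the usual definitions: averages over the N replications of the absolute error, squared error, and relative absolute error. The function name and return order are illustrative.

```python
def simulation_metrics(estimates, true_value):
    """Return (average |bias|, MSE, MRE) of replicated estimates of one parameter."""
    n_rep = len(estimates)
    abs_bias = sum(abs(e - true_value) for e in estimates) / n_rep
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    mre = sum(abs(e - true_value) / true_value for e in estimates) / n_rep
    return abs_bias, mse, mre
```

For example, two replicated estimates 1.0 and 3.0 of a true value 2.0 give an average absolute bias of 1.0, an MSE of 1.0, and an MRE of 0.5.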
From Tables 2-9, we can observe that all estimation methods, except the weighted least-squares method, show the property of consistency, i.e., the MSEs and MREs decrease as the sample size increases, for all parameter combinations.
From Table 10, and for the parameter combinations considered, we can conclude that the MPSE outperforms all the other estimators with an overall score of 62. Therefore, based on our study, we can confirm the superiority of MPSE and ADE for the GRL distribution.
Table 9. Simulation results for φ = (λ = 3.1, α = 3.5).

Real Data Analysis
In this section, we illustrate the importance of the GRL distribution in modeling skewed data using two real datasets from the medicine and geology fields. The first dataset represents the survival times, in weeks, of 33 patients suffering from acute myelogenous leukemia [20]. This dataset had already been analyzed by [21][22][23]. The second dataset was used to evaluate the risks associated with earthquakes occurring close to the central site of a nuclear power plant. This dataset refers to the distances, in miles, to the nuclear power plant of the most recent eight earthquakes of intensity larger than a given value [24], and it consists of 60 observations. Unimodality of both datasets was assessed with Hartigans' dip test, using the function dip.test from the R package diptest [25]. The null hypothesis is that the data have a unimodal distribution. The p-value (PV) was 0.8238 for the first dataset and 0.1507 for the second; hence, we failed to reject the null hypothesis in both cases at the 5% significance level, and both datasets are treated as unimodal.
The fit of the GRL distribution is compared with those of the competitive models listed in Table 11.
The numerical values of $-\hat{\ell}$ (the negative maximized log-likelihood), $W^{*}$, $A^{*}$, KS, and bootstrapped PV, along with the MLEs and their corresponding standard errors (SEs) (given in parentheses) of the fitted models, are listed in Tables 12 and 13 for the two datasets, respectively. The values in these tables show that the GRL distribution has the lowest values for all goodness-of-fit statistics among all fitted models.
Tables 14 and 15 display the parameter estimates under various estimation methods and the goodness-of-fit statistics for the two datasets, respectively. From Tables 14 and 15, and based on the KS bootstrapped PV, we recommend using the MPSE to estimate the parameters of the GRL distribution for the leukemia data, while the OLS method is recommended to estimate the GRL parameters for the epicenter data.
The histograms of the two datasets, together with the PDFs and CDFs of the fitted GRL distribution and the other fitted distributions, are displayed in Figures 3 and 4 for the leukemia and epicenter data, respectively. The HRF plot of the GRL distribution and the TTT plot of the leukemia data are displayed in Figure 5, whereas the HRF plot of the GRL distribution and the TTT plot of the epicenter data are displayed in Figure 6. The HRF is decreasing for the leukemia data, whereas it is increasing for the epicenter data. Furthermore, the scaled TTT plot for the leukemia data is convex, which indicates a decreasing HRF, and it is concave for the epicenter data, which indicates an increasing HRF. Thus, the GRL distribution is suitable for modeling the leukemia and epicenter data.

Concluding Remarks
In this paper, we introduced a new two-parameter distribution called the generalized Ramos-Louzada (GRL) distribution. Further, the mathematical properties of the GRL model were studied in detail. The GRL parameters were estimated by eight estimation methods, namely, the weighted least-squares, ordinary least-squares, maximum likelihood, maximum product of spacing, Cramér-von Mises, Anderson-Darling, right-tail Anderson-Darling, and percentile-based estimators. The simulation study illustrated that the maximum product of spacing method outperforms all other estimation methods. Therefore, based on our study, we can confirm the superiority of the maximum product of spacing method for the GRL distribution. Finally, the practical importance of the GRL model was demonstrated in two real applications. The goodness-of-fit results for the two datasets showed that our model provides better fits than other well-known distributions. Further, the two real data applications showed that the maximum product of spacing estimator for the leukemia data and the least-squares estimator for the epicenter data return the best estimates for the parameters of the GRL distribution.