Slashed Lomax Distribution and Regression Model

In this article, the slashed Lomax distribution is introduced. It is an asymmetric distribution that can be used for fitting thick-tailed datasets. Various properties are explored, such as the density function, hazard rate function, Rényi entropy, r-th moments, and the coefficients of skewness and kurtosis. Some useful characterizations of this distribution are obtained. Furthermore, we study a slashed Lomax regression model and use the expectation conditional maximization (ECM) algorithm to estimate the model parameters. Simulation studies are conducted to evaluate the performance of the proposed method. Finally, two real datasets are analyzed to illustrate the usefulness of the slashed Lomax distribution.


Introduction
The Lomax distribution, introduced by Lomax [1], can be regarded as a mixture of the exponential distribution and the gamma distribution. It is a heavy-tailed probability distribution, often used in business, economics, and actuarial modeling. Let X be a non-negative random variable with a Lomax distribution; then its probability density function (pdf) is given by
$$f(x; \alpha, \beta) = \frac{\alpha}{\beta}\left(1 + \frac{x}{\beta}\right)^{-(\alpha+1)}, \quad x \ge 0,$$
where α > 0 and β > 0, and it is denoted by X ∼ Lomax (α, β). The Lomax distribution has a monotone decreasing failure rate, α/(β + x), and has been regarded as a lifetime distribution. It has been applied broadly in practice. For example: Myhre and Saunders [2] applied the Lomax distribution to right-censored data; Balakrishnan and Ahsanullah [3] discussed some important statistical properties of the Lomax distribution; Childs et al. [4] studied the properties of right-truncated Lomax distributions and discussed some practical applications; and Howlader and Hossain [5] considered estimating the survival function of the Lomax distribution with the Bayesian method.
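The mixture interpretation mentioned above can be checked numerically. The following is a minimal sketch, assuming the shape-scale parameterization of Lomax (α, β) and a gamma mixing law with shape α and rate β for the exponential rate; the function names are illustrative and do not come from the paper.

```python
import random


def lomax_pdf(x, alpha, beta):
    """Lomax density, assuming shape alpha and scale beta."""
    if x < 0:
        return 0.0
    return (alpha / beta) * (1.0 + x / beta) ** (-(alpha + 1.0))


def lomax_survival(x, alpha, beta):
    """P(X > x) for X ~ Lomax(alpha, beta) under the same assumption."""
    if x < 0:
        return 1.0
    return (1.0 + x / beta) ** (-alpha)


def sample_lomax_mixture(alpha, beta, rng):
    """Draw X as an exponential variable whose rate is gamma distributed."""
    # random.gammavariate takes (shape, scale); scale 1/beta means rate beta.
    rate = rng.gammavariate(alpha, 1.0 / beta)
    return rng.expovariate(rate)


def empirical_survival(samples, x):
    """Fraction of the sample that exceeds x."""
    return sum(s > x for s in samples) / len(samples)


rng = random.Random(0)
draws = [sample_lomax_mixture(2.0, 1.0, rng) for _ in range(20000)]
# The empirical tail probability should be close to the Lomax survival function.
print(empirical_survival(draws, 1.0), lomax_survival(1.0, 2.0, 1.0))
```

Averaging the exponential survival function exp(−Λx) over the gamma law of Λ gives (1 + x/β)^(−α), which is why the two printed values should agree up to Monte Carlo error.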
The slashed distribution was proposed by Rogers and Tukey [6] to model bell-shaped data with heavy tails; it is symmetric, and its tails are thicker than those of the normal distribution. It has the stochastic representation $ZU^{-1}$ [...] the modified slashed Birnbaum-Saunders distribution and concluded that it has greater kurtosis values than the usual BS distribution. Following a similar methodology, the slashed Lindley-Weibull distribution was introduced by Reyes et al. [17]; the slashed power Lindley distribution was studied by Iriarte et al. [18]; and the modified slashed generalized exponential distribution was introduced by Astorga et al. [19].
Regression models are among the most widely used statistical tools, but the normality assumption on the error term is restrictive. Recently, error terms following more flexible distributions have been studied: Gómez [20] analyzed the regression model of the slashed half-normal distribution; Jamal [21] studied the properties of the Topp-Leone-Weibull Lomax distribution and its regression model; and Hamedani et al. [22] analyzed the regression model of the Burr XII distribution.
Based on the previous research, we propose the slashed Lomax distribution, which has a thicker tail than the Lomax distribution and more flexibility in kurtosis. Furthermore, the construction of the slashed Lomax distribution makes it possible to estimate the parameters in a variety of ways. Therefore, the proposed distribution can be used not only for fitting thick-tailed datasets, but also for analyzing some real-life phenomena. The rest of the paper is organized as follows. In Section 2, we introduce the slashed Lomax distribution and obtain some of its properties. In Section 3, the ECM algorithm for parameter estimation is proposed and simulation studies are reported. The slashed Lomax regression model is studied in Section 4. Two applications to real data are investigated in Section 5. Some conclusions are offered in Section 6.

Slashed Lomax Distribution
In this section, some basic properties of the slashed Lomax distribution are described.
Proposition 1. Let Y ∼ Slomax (α, β, λ), then its pdf can be written as:
Proof. Using the stochastic representation in Equation (1), we obtain the joint density function of (Y, V), and the pdf of Y follows by marginalizing this joint density with respect to V.

Remark 1.
For λ = 1, Y ∼ Slomax (α, β, λ) is reduced to the canonical slashed Lomax distribution, and the pdf is:
To illustrate the new family of slashed Lomax distributions, we draw the density curves for different values of α, β, and λ. With two of the three parameters fixed, the slashed Lomax distribution has a heavier tail as β increases (the same holds for λ, and the opposite for α).
Proposition 2. Let Y ∼ Slomax (α, β, λ). Then:
(i) The reliability (survival) function of Y is given by:
(ii) The hazard rate function of Y is given by:
(iii) The reversed hazard rate of Y is given by:
Figure 1 shows the probability density curves of the slashed Lomax distribution for several parameter values. Figure 2 shows the hazard rate curves under different parameters. With α and β fixed, the hazard rate curves for different values of λ differ only slightly as y increases.
Next, the Rényi entropy and the moments of the slashed Lomax distribution are derived. In addition, the coefficients of skewness and kurtosis are calculated.
Proposition 3. Let Y ∼ Slomax (α, β, λ); the Rényi entropy of order δ for Y is given by, where δ > 0 and δ ≠ 1.

Proof.
Rényi entropy is an important diversity index in ecology and statistics, which is defined as $\frac{1}{1-\delta}\log\int_0^{\infty} f(y)^{\delta}\,dy$. Therefore, we have:
Proposition 4. Let Y ∼ Slomax (α, β, λ); the r-th moment of Y, for r = 1, 2, · · · and r < 2λ, is given by: where Γ(·) is the gamma function.
Proof. Using Equation (1) together with the properties of the beta distribution and the Lomax distribution, the result is obtained.
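The factorization behind this proof can be written out as a short worked derivation. This is only a sketch under stated assumptions: Equation (1) is not reproduced in this text, so the representation $Y = X V^{-1/2}$ with independent $X \sim \mathrm{Lomax}(\alpha, \beta)$ and $V \sim \mathrm{Beta}(\lambda, 1)$ is assumed here because it is consistent with the condition r < 2λ; the shape-scale form of the Lomax density and the additional condition r < α are likewise assumptions.

```latex
\begin{align*}
E(Y^r) &= E(X^r)\, E\!\left(V^{-r/2}\right)
  && \text{(independence of $X$ and $V$)} \\
E\!\left(V^{-r/2}\right) &= \int_0^1 v^{-r/2}\, \lambda v^{\lambda - 1}\, dv
  = \frac{2\lambda}{2\lambda - r},
  && r < 2\lambda \\
E(X^r) &= \int_0^\infty x^r\, \frac{\alpha}{\beta}
  \left(1 + \frac{x}{\beta}\right)^{-(\alpha + 1)} dx
  = \frac{\beta^r\, \Gamma(r + 1)\, \Gamma(\alpha - r)}{\Gamma(\alpha)},
  && r < \alpha \\
E(Y^r) &= \frac{2\lambda}{2\lambda - r} \cdot
  \frac{\beta^r\, \Gamma(r + 1)\, \Gamma(\alpha - r)}{\Gamma(\alpha)}
\end{align*}
```

Under these assumptions, the beta factor accounts for the restriction r < 2λ, while the Lomax factor is finite only when r < α.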
Let Y ∼ Slomax (α, β, λ); the mean and variance of Y are given by:
Let Y ∼ Slomax (α, β, λ); the coefficients of skewness and kurtosis for Y, γ_1 and γ_2, are given by:
Thus, the results are obtained directly.
In the following propositions, we consider the linear transformation of the slashed Lomax distribution and discuss its scale mixture representation.
Proposition 5. Let Y ∼ Slomax(α, β, λ) and the scalar a > 0, then aY ∼ Slomax(α, aβ, λ).
Proof. The result can be obtained by Equation (1) and Proposition 1.

ECM Algorithm for Parameter Estimation
In this section, we consider the maximum likelihood (ML) estimation of the parameters θ = (α, β, λ) of the slashed Lomax distribution. It is well known that the EM algorithm is an important tool for ML estimation when some data are missing or latent variables are unobserved. The E-step computes the conditional expectation of the complete-data log-likelihood given the observed values, and the M-step maximizes this expectation. The M-step is often difficult when the complete-data likelihood is complex. Meng and Rubin [23] proposed the ECM algorithm, an extension of the EM algorithm that retains the properties of the EM algorithm. The basic idea of the ECM algorithm is to decompose the M-step of the EM algorithm into k conditional maximization (CM) steps. In the following, an ECM algorithm is proposed for obtaining the estimator of θ for the slashed Lomax distribution.
Let y_1, · · · , y_n be a random sample from Slomax (α, β, λ). According to Equation (1), we have, Let y = (y_1, ..., y_n) be the observed data and v = (v_1, ..., v_n) the missing data; then (y_i, v_i), i = 1, · · · , n, form the complete dataset. Therefore, the log-likelihood function for the complete data (y_i, v_i), i = 1, · · · , n, is given by:
The iterations are repeated until a suitable convergence rule is satisfied, say $\|\theta^{(k+1)} - \theta^{(k)}\| < \varepsilon$, where || · || is the Euclidean norm and $\varepsilon$ is sufficiently small.
In the following, we conduct simulation studies to assess the efficiency of the estimation procedure discussed above. Random values following the slashed Lomax distribution can be generated by Equation (1). The parameters are set as θ = (2, 1, 0.5), (2, 1, 1), (2, 1, 1.5), (1, 0.5, 1), (1, 1, 1), (1, 2, 1), (0.5, 2, 2), (2, 2, 1), (4, 2, 1). The sample sizes are n = 50, 100, and 200. The following procedure generates a random sample of size n from Slomax(α, β, λ):
• set α, β, λ, and n;
• simulate U ∼ Uniform(0, 1);
• simulate V ∼ beta(λ, 1);
• compute Y from the simulated values through Equation (1).
For each scenario, we repeat the process N = 1000 times. The ECM algorithm is applied, and the results are computed using the software R. All runs use the same conditions (same initial values and $\varepsilon = 10^{-4}$). The estimators are obtained by applying the nleqslv function to Equations (4)-(6). The mean values of the parameter estimates and the corresponding standard deviations (SD) are shown in Table 1. The empirical mean and SD are calculated over the 1000 replicates for each parameter of interest ξ, that is, α, β, and λ. From Table 1, it can be seen that as the sample size increases, the mean values of the estimators come closer to the true values, and the SD decreases in all cases.
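The sampling procedure and the replicate summaries described above can be sketched as follows. This is an illustrative sketch, not the authors' R code: Equation (1) is not reproduced in this text, so the form Y = X / V^power with power = 0.5 is an assumption, as are the shape-scale Lomax parameterization, the n − 1 divisor in the SD, and all function names.

```python
import math
import random


def sample_slashed_lomax(n, alpha, beta, lam, rng, power=0.5):
    """Draw n values of Y = X / V**power.

    X ~ Lomax(alpha, beta) and V ~ Beta(lam, 1) are independent.
    The exponent `power` is an assumed stand-in for Equation (1).
    """
    ys = []
    for _ in range(n):
        u = rng.random()  # U ~ Uniform(0, 1)
        # Inverse-CDF draw from the Lomax distribution (shape-scale form).
        x = beta * ((1.0 - u) ** (-1.0 / alpha) - 1.0)
        # Beta(lam, 1) has CDF v**lam, so V = W**(1/lam) with W uniform on (0, 1].
        v = (1.0 - rng.random()) ** (1.0 / lam)
        ys.append(x / v ** power)
    return ys


def mean_and_sd(estimates):
    """Empirical mean and SD of replicate estimates (n - 1 divisor assumed)."""
    n = len(estimates)
    mean = sum(estimates) / n
    var = sum((e - mean) ** 2 for e in estimates) / (n - 1)
    return mean, math.sqrt(var)


rng = random.Random(1)
sample = sample_slashed_lomax(200, alpha=2.0, beta=1.0, lam=1.0, rng=rng)
print(len(sample), min(sample), max(sample))
```

In a full study, `mean_and_sd` would be applied to the N = 1000 estimates of each parameter returned by the fitting routine, which is not reproduced here.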

Slashed Lomax Regression Model
In this section, we study the slashed Lomax regression model and propose the ECM algorithm to estimate the parameters.
According to Equation (2), the density function of y is given by:
Let y = (y_1, y_2, ..., y_n) come from the slashed Lomax regression model with the given design matrix X. The log-likelihood function of δ = (α, β, λ, γ) can be written as:
Now, given v = (v_1, ..., v_n), we obtain the complete log-likelihood function and the conditional expectation of l(δ|y, v). In the following, we describe the steps of the ECM algorithm for estimating the parameters in the regression model.
CM-step I: α^(k) is updated as:
CM-step II: β^(k) is updated as:
CM-step III: λ^(k) is updated as:
CM-step IV: γ^(k) is updated as:
The iterations are repeated until a suitable convergence rule is satisfied, say |δ^(k+1) − δ^(k)| is sufficiently small.
On the basis of this theory, a simulation experiment is carried out. We take a two-dimensional regression model for the numerical simulation. The values of the parameters (α, β, λ) are chosen as ( [...]
From the simulation results, we can see that, as the sample size increases, the standard deviations of the estimated parameters decrease, and the estimated values are close to the true values.

Application
Next, we use real data to assess the practicability of the slashed Lomax distribution. We compare the performance of the slashed Lomax distribution to that of the beta Lomax (Blomax) distribution (Lemonte and Cordeiro [24]), the exponentiated standard Lomax (ESlomax) distribution, and the Gumbel Lomax (Gulomax) distribution (Tahir et al. [25]). Two goodness-of-fit measures, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), are computed to compare the fitted distributions and the regression models.
Data 1: This dataset has 214 observations and describes the successive failures of the air conditioning systems in a fleet of 13 Boeing 720 jet airplanes. It was studied by Kus [26], Tahir et al. [25], and many others.
The parameter estimates, AIC, and BIC for all fitted distributions are shown in Table 5. Both criteria provide evidence in favor of the slashed Lomax distribution for this dataset, which suggests that the slashed Lomax distribution is a competitive distribution of practical interest. Figure 3 displays the fitted models for the dataset. The left panel of Figure 3 shows the histogram of the data with the fitted densities, and the right panel displays the empirical distribution function of the data with the estimated distribution functions. Both panels indicate that the slashed Lomax distribution provides an adequate fit for the dataset.
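For reference, the two criteria can be computed from a maximized log-likelihood as in the sketch below. It uses the standard definitions AIC = 2k − 2 log L and BIC = k log n − 2 log L; the Lomax log-likelihood assumes the shape-scale parameterization, and the data and parameter values are hypothetical, not taken from Table 5.

```python
import math


def lomax_log_likelihood(data, alpha, beta):
    """Log-likelihood of a Lomax(alpha, beta) sample (shape-scale form assumed)."""
    return sum(
        math.log(alpha) - math.log(beta) - (alpha + 1.0) * math.log1p(x / beta)
        for x in data
    )


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k log n - 2 log L (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical toy data and parameter values, for illustration only.
toy_data = [0.3, 1.2, 0.7, 5.4, 2.1, 0.1, 9.8, 0.9]
loglik = lomax_log_likelihood(toy_data, alpha=2.0, beta=3.0)
print(aic(loglik, n_params=2), bic(loglik, n_params=2, n_obs=len(toy_data)))
```

When comparing models fitted to the same data, the model with the smaller AIC or BIC is preferred; BIC penalizes additional parameters more heavily once n exceeds about 7.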

Data 2:
This dataset concerns the lifespan of 96 patients with heart failure; detailed information was provided by Tanvir Ahmad [27].
The numerical variables considered here are ejection fraction (x_1), serum creatinine (x_2), serum sodium (x_3), and age (x_4), as potential variables that explain cardiovascular mortality (y). We apply the slashed Lomax regression model developed in Section 4 to this dataset. The resulting estimates, together with those of the other regression models, are given in Table 6. As can be noted from Table 6, the slashed Lomax regression model has the lowest AIC and BIC values among the regression models considered, which indicates that it provides the best fit to the data. In addition, an increase in serum creatinine shortens the lifespan of patients and increases the risk of death, which is consistent with the conclusions of Tanvir Ahmad [27].

Discussion
In this article, the slashed Lomax distribution and its statistical properties are introduced. Maximum likelihood estimation through the ECM algorithm is discussed and assessed by simulation studies. In addition, the slashed Lomax regression model is studied. Applications to real data illustrate the usefulness of the proposed distribution. In the future, we will consider extending the proposed model to the multivariate case and studying its extremal properties, with the aim of broadening its practical use.