An Extension of the Truncated-Exponential Skew-Normal Distribution

Abstract: In this paper, we present an extension of the truncated-exponential skew-normal (TESN) distribution. This distribution is defined as the quotient of two independent random variables whose distributions are the TESN distribution and the beta distribution with shape parameters q and 1, respectively. The resulting distribution has a more flexible coefficient of kurtosis. We study the general probability density function (pdf) of this distribution, its survival and hazard functions, some of its properties, its moments, and inference by the maximum likelihood (ML) method. We carry out a simulation study and apply the methodology to a real dataset.
Author Contributions: Conceptualization, P.A.R. and D.I.G.; methodology, H.W.G.; software, D.I.G.; validation, P.A.R., D.I.G., and H.W.G.; formal analysis, O.V.; investigation, O.V. and M.B.; writing - original draft preparation, P.A.R.; writing - review and editing, O.V. and M.B.; funding


Introduction
The Slash (S) distribution is a generalization of the normal model. Its stochastic representation is given by
$$ S = \frac{Z}{U^{1/q}}, $$
where Z ∼ N(0, 1) is independent of U ∼ U(0, 1) and q > 0. The case q = 1 gives the canonical Slash model, and the standard normal model is obtained as q → ∞. The pdf of the canonical S distribution is
$$ p(x) = \begin{cases} \dfrac{\phi(0) - \phi(x)}{x^{2}}, & x \neq 0, \\[2mm] \dfrac{\phi(0)}{2}, & x = 0, \end{cases} $$
where φ(·) represents the pdf of the standard normal model (see Johnson et al. [1]). This distribution has heavier tails than the normal distribution, i.e., it has greater kurtosis. Properties of the S distribution are discussed by Rogers and Tukey [2] and Mosteller and Tukey [3]. ML estimation of the location and scale parameters in the S model is discussed in Kafadar [4]. Wang et al. [5] studied a multivariate version of the S distribution and a multivariate skew version. Gómez et al. [6] extended the S distribution using the family of univariate and multivariate elliptical distributions; a further extension based on the S model is given in Gómez et al. [7].
Nadarajah et al. [8] proposed the idea of constructing skew distributions, motivated by Azzalini [9], by introducing asymmetry into symmetric models. A unified approach to the construction of models of this kind is given in Ferreira and Steel [10]. If X is a random variable symmetric around zero with pdf f_X(·) and cumulative distribution function (cdf) F_X(·), a new random variable Y is defined with pdf given by
$$ f_Y(y) = f_X(y)\, \omega\big(F_X(y)\big), \quad y \in \mathbb{R}, \tag{2} $$
with ω(·) denoting a pdf on the interval (0, 1). Then Y is a skew version of the variable X.
The most commonly used versions of (2) are the skew distributions proposed by Azzalini [9], which have the form
$$ f_Y(y) = 2 f_X(y) F_X(\lambda y), \quad y \in \mathbb{R},\ \lambda \in \mathbb{R}. $$
In the present paper, we extend the TESN model introduced by Nadarajah et al. [8], based on the Slash methodology. The pdf of the TESN distribution is given by
$$ f_X(x; \lambda) = \frac{\lambda}{1 - e^{-\lambda}}\, \phi(x)\, e^{-\lambda \Phi(x)}, $$
where x, λ ∈ R and Φ(·) denotes the cdf of the standard normal model. Hereafter, we use the notation X ∼ TESN(λ) to indicate that X is a random variable following a TESN distribution. According to Barreto-Souza and Simas [11], the distribution behaves differently for large |λ|, suggesting that this is a rich class of distributions. Furthermore, the parameter λ can be interpreted as a concentration parameter. Figure 1 shows the TESN pdf for several values of the parameter λ.
The extension of this model is based on the quotient of two independent random variables, one with a TESN distribution and the other a power of a U(0, 1) random variable. The result is a distribution with a more flexible coefficient of kurtosis, and hence a suitable model for fitting data. In practical terms, this generalization is motivated by the search for more flexible distributions that may provide a better fit than the TESN distribution. For example, see the works by Gomes et al. [12], Maurya and Nadarajah [13] and the references therein.
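As a quick numerical illustration of the TESN density, the sketch below evaluates it in plain Python and checks that it integrates to one. It assumes the density has the form λ/(1 − e^{−λ}) φ(x) exp{−λΦ(x)}; the function names (`tesn_pdf`, `trapezoid`) and the integration range are illustrative choices, not part of the original work.

```python
import math


def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cdf Phi(x), written with the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def tesn_pdf(x, lam):
    """Assumed TESN(lam) density; lam = 0 is treated as the N(0, 1) limit."""
    if lam == 0.0:
        return norm_pdf(x)
    norm_const = lam / (-math.expm1(-lam))  # lam / (1 - exp(-lam)), positive for lam != 0
    return norm_const * norm_pdf(x) * math.exp(-lam * norm_cdf(x))


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n sub-intervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# The density should integrate to (approximately) one for any lam.
area = trapezoid(lambda x: tesn_pdf(x, 2.0), -10.0, 10.0, 4000)
print(area)
```

Positive values of λ shift mass toward negative x and negative values toward positive x, which can be seen by comparing `tesn_pdf(-1.0, lam)` and `tesn_pdf(1.0, lam)` for both signs of `lam`.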
The article is organized as follows. In Section 2 we study the representation, pdf, properties, and graphs. In Section 3, we present a Monte Carlo simulation experiment to evaluate the maximum likelihood estimates of the model parameters in Section 2. In Section 4, we provide an application of the proposed distribution. In Section 5, we conclude with some final comments.

Incorporating Kurtosis
In this section, we introduce a new extension of the TESN distribution. We study its pdf, its survival and hazard functions, and its moments; we then add location and scale parameters and derive the corresponding log-likelihood equations.

Representation
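Following the representation of the Slash distribution, the new distribution is defined as follows.
Definition 1. A random variable Z follows a Slash TESN (STESN) distribution, denoted Z ∼ STESN(λ, q), if it can be represented as
$$ Z = \frac{X}{Y^{1/q}}, \tag{5} $$
where X ∼ TESN(λ) and Y ∼ U(0, 1) are independent variables, λ ∈ R, and q > 0.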

Probability Density Function
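The following proposition gives the pdf of the STESN distribution, obtained from the stochastic representation (5) by the Jacobian method for transforming random variables.
Proposition 1. Let Z ∼ STESN(λ, q). Then the pdf of Z is
$$ f_Z(z; \lambda, q) = \frac{q \lambda}{1 - e^{-\lambda}}\, R(z; \lambda, q), $$
where z ∈ R, λ ∈ R, q > 0 and
$$ R(z; \lambda, q) = \int_0^1 w^{q}\, \phi(zw)\, e^{-\lambda \Phi(zw)}\, dw. $$
Proof. Let W = Y^{1/q}, which has pdf f_W(w) = q w^{q−1} for 0 < w < 1. By (5), X = Z · W. Applying the Jacobian method to the transformation from (X, W) to (Z, W), the Jacobian is
$$ J = \begin{vmatrix} \partial x / \partial z & \partial x / \partial w \\ \partial w / \partial z & \partial w / \partial w \end{vmatrix} = \begin{vmatrix} w & z \\ 0 & 1 \end{vmatrix} = w. $$
Replacing this in the joint pdf f_{Z,W} gives
$$ f_{Z,W}(z, w) = |J|\, f_X(zw)\, f_W(w) = \frac{q \lambda}{1 - e^{-\lambda}}\, w^{q}\, \phi(zw)\, e^{-\lambda \Phi(zw)}, $$
where z ∈ R, 0 < w < 1, and λ ∈ R. Hence, marginalizing with respect to the variable W, we obtain
$$ f_Z(z; \lambda, q) = \frac{q \lambda}{1 - e^{-\lambda}} \int_0^1 w^{q}\, \phi(zw)\, e^{-\lambda \Phi(zw)}\, dw = \frac{q \lambda}{1 - e^{-\lambda}}\, R(z; \lambda, q), $$
where z ∈ R, λ ∈ R, and q > 0.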

Remark 1.
The above proposition implies that if q → ∞ then the pdf of the STESN distribution approaches the pdf of a TESN distribution.
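The limiting behavior in Remark 1 can be illustrated numerically. The sketch below assumes the STESN pdf has the form f_Z(z; λ, q) = qλ/(1 − e^{−λ}) · R(z; λ, q), with R(z; λ, q) the integral of w^q φ(zw) exp{−λΦ(zw)} over (0, 1), and approximates R with a composite Simpson rule. At z = 0 the integral has a closed form, so that f_Z(0) = q/(q + 1) · f_X(0), where f_X is the TESN pdf; the ratio q/(q + 1) tends to 1 as q grows. All function names and the number of quadrature nodes are illustrative assumptions.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def tesn_pdf(x, lam):
    """Assumed TESN(lam) density, lam != 0."""
    return lam / (-math.expm1(-lam)) * norm_pdf(x) * math.exp(-lam * norm_cdf(x))


def r_integral(z, lam, q, n=200):
    """Composite Simpson approximation of R(z; lam, q) on (0, 1); n must be even."""
    h = 1.0 / n
    total = 0.0
    for i in range(n + 1):
        w = i * h
        g = (w ** q) * norm_pdf(z * w) * math.exp(-lam * norm_cdf(z * w))
        coef = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += coef * g
    return total * h / 3.0


def stesn_pdf(z, lam, q):
    """Assumed STESN(lam, q) density built from the quadrature above."""
    return q * lam / (-math.expm1(-lam)) * r_integral(z, lam, q)


# Ratio of the STESN to the TESN density at z = 0 for growing q.
for q in (1.0, 5.0, 20.0):
    print(q, stesn_pdf(0.0, 1.5, q) / tesn_pdf(0.0, 1.5))
```

For very large q the integrand concentrates near w = 1, so a finer grid (or an adaptive quadrature routine) would be needed to keep the approximation accurate.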

Reliability Analysis
For the study of failure times, we need to consider a time y ≥ 0. Therefore, in our model we study positive random variables of the form Y = exp(Z), where Z ∼ STESN(λ, q). Transforming the model accordingly, we obtain the following pdf:
$$ f_Y(y; \lambda, q) = \frac{q \lambda}{(1 - e^{-\lambda})\, y}\, R(\log(y); \lambda, q), $$
where y > 0, q > 0, λ ∈ R, and
$$ R(\log(y); \lambda, q) = \int_0^1 w^{q}\, \phi(w \log(y))\, e^{-\lambda \Phi(w \log(y))}\, dw. $$
Figure 2c shows the graphs of the pdf for different parameter values. Once the transformation is complete, we obtain the survival and hazard functions. The survival function is the probability that a subject does not experience the event of interest before time t, that is, S(t) = P(Y > t) = 1 − ∫_0^t f_Y(a) da. In our model it is expressed through κ(t) = (q·λ/t) · R(log(t); λ, q), which satisfies f_Y(t) = κ(t)/(1 − e^{−λ}). The hazard function is the instantaneous rate of failure at time t given survival up to t, h(t) = f_Y(t)/S(t), and in our model it is likewise expressed through κ(a) = (q·λ/a) · R(log(a); λ, q). Figure 3a,b show the graphs of the survival and hazard functions, respectively, for different parameter values.

Moments
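Let Z be a r.v. with Z ∼ STESN(λ, q). The r-th moment of Z is given by the following proposition.
Proposition 3. Using the representation (5), the r-th moment of the r.v. Z is
$$ E(Z^{r}) = \frac{q}{q - r}\, E(X^{r}), \tag{10} $$
where q > r, X ∼ TESN(λ), and E(X^r) is expressed in terms of X_{k:n}, the k-th order statistic of a sample of size n from the N(0, 1) distribution.
Proof. By (5) and the independence of X and Y, the r-th moment of Z can be calculated as
$$ E(Z^{r}) = E(X^{r})\, E\big(Y^{-r/q}\big) = \frac{q}{q - r}\, E(X^{r}), \quad q > r, $$
where E(X^r) is the r-th moment of the model proposed by Nadarajah et al. [8], which is given in terms of X_{k:n}, the k-th order statistic of a sample from the N(0, 1) distribution. Substituting that expression gives the r-th moment of Z.
Using this proposition, the first four moments of the r.v. Z are given in the following corollary.
Corollary 1. From the r-th moment of the r.v. Z ∼ STESN(λ, q) represented by (10), we obtain the first four moments of the variable:
$$ E(Z) = \frac{q}{q - 1} E(X),\ q > 1; \quad E(Z^{2}) = \frac{q}{q - 2} E(X^{2}),\ q > 2; \quad E(Z^{3}) = \frac{q}{q - 3} E(X^{3}),\ q > 3; \quad E(Z^{4}) = \frac{q}{q - 4} E(X^{4}),\ q > 4. $$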
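The two ingredients of the moment formula can be written out as follows. This is a sketch, assuming the TESN pdf has the form λ/(1 − e^{−λ}) φ(x) exp{−λΦ(x)}; the series is obtained by expanding the exponential term by term, and its indexing may differ from the expression given in Nadarajah et al. [8]. Here X_{k+1:k+1} denotes the maximum of k + 1 independent N(0, 1) variables, whose pdf is (k + 1) φ(x) Φ(x)^k.

```latex
% Moment of the uniform factor (finite only for q > r):
E\big(Y^{-r/q}\big) = \int_0^1 y^{-r/q}\, dy = \frac{q}{q - r}, \qquad q > r.

% Moment of the TESN factor via the series e^{-\lambda u} = \sum_k (-\lambda u)^k / k! :
E(X^{r})
  = \frac{\lambda}{1 - e^{-\lambda}} \int_{-\infty}^{\infty} x^{r} \phi(x)\, e^{-\lambda \Phi(x)}\, dx
  = \frac{\lambda}{1 - e^{-\lambda}} \sum_{k=0}^{\infty} \frac{(-\lambda)^{k}}{k!}
      \int_{-\infty}^{\infty} x^{r} \phi(x)\, \Phi(x)^{k}\, dx
  = \frac{\lambda}{1 - e^{-\lambda}} \sum_{k=0}^{\infty} \frac{(-\lambda)^{k}}{(k+1)!}\,
      E\big(X_{k+1:k+1}^{r}\big).
```

The first identity makes the restriction q > r explicit: the mean of Z exists only for q > 1, the variance only for q > 2, and the kurtosis coefficient only for q > 4.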

Incorporation of Parameters
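To produce a more flexible distribution, we extend this model by adding a location parameter µ and a scale parameter σ through X = µ + σZ, where Z ∼ STESN(λ, q). We write X ∼ STESN(λ, q, µ, σ) and obtain the following result: the pdf of X is
$$ f_X(x; \lambda, q, \mu, \sigma) = \frac{q \lambda}{\sigma (1 - e^{-\lambda})}\, R\!\left(\frac{x - \mu}{\sigma}; \lambda, q\right), \tag{15} $$
where x ∈ R, µ ∈ R, σ > 0, λ ∈ R, and q > 0.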

Log Likelihood Equations
Let x_1, ..., x_n be a random sample of the r.v. X with STESN(λ, q, µ, σ) distribution. The log-likelihood function can be written as
$$ \ell(\theta) = n \log(q) + n \log\!\left(\frac{\lambda}{1 - e^{-\lambda}}\right) - n \log(\sigma) + \sum_{i=1}^{n} \log R(z_i; \lambda, q), $$
where θ = (λ, q, µ, σ) and z_i = (x_i − µ)/σ. The likelihood equations are obtained by setting the partial derivative of ℓ(θ) with respect to each parameter equal to zero. This system can only be solved by iterative procedures such as Newton-Raphson. Alternatively, it is possible to use the optim routine implemented in R software [15]. Standard errors for the parameter estimates can be obtained from the Hessian matrix of the log-likelihood function, which can be computed numerically, for instance, using the pracma package (see Borchers [16]).
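To make the structure of the log-likelihood concrete, the sketch below evaluates it in plain Python. It assumes the location-scale pdf qλ/(σ(1 − e^{−λ})) · R((x − µ)/σ; λ, q) and approximates R by a composite Simpson rule. The function names, the number of quadrature nodes, and the handling of invalid parameters are illustrative assumptions; the original analysis used the optim routine in R, not this code.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def r_integral(z, lam, q, n=400):
    """Composite Simpson approximation of R(z; lam, q) on (0, 1); n must be even."""
    h = 1.0 / n
    total = 0.0
    for i in range(n + 1):
        w = i * h
        g = (w ** q) * norm_pdf(z * w) * math.exp(-lam * norm_cdf(z * w))
        coef = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += coef * g
    return total * h / 3.0


def stesn_loglik(theta, data):
    """Log-likelihood of theta = (lam, q, mu, sigma) under the assumed STESN pdf."""
    lam, q, mu, sigma = theta
    if q <= 0.0 or sigma <= 0.0:
        return -math.inf  # outside the parameter space
    # lam / (1 - exp(-lam)) is positive for every lam != 0 and tends to 1 as lam -> 0
    ratio = 1.0 if lam == 0.0 else lam / (-math.expm1(-lam))
    const = math.log(q) + math.log(ratio) - math.log(sigma)
    total = 0.0
    for x in data:
        r = r_integral((x - mu) / sigma, lam, q)
        if r <= 0.0:
            return -math.inf  # numerical underflow far in the tails
        total += const + math.log(r)
    return total


# Toy values used only to show the call; they are not data from the paper.
print(stesn_loglik((1.0, 3.0, 0.0, 1.0), [-0.8, -0.1, 0.4, 1.3]))
```

A function of this kind could be passed, with its sign flipped, to any general-purpose minimizer; derivative-free methods avoid having to code the likelihood equations explicitly.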

STESN or TESN Model?
To decide between the STESN and TESN models, we can use the traditional Akaike (AIC, Akaike [17]) and Schwarz (BIC, Schwarz [18]) criteria. As the TESN model corresponds to the STESN model with q → +∞, we can also use the likelihood ratio test (LRT) to decide between the two models, considering H_0: q = +∞ (TESN model) versus H_1: q < ∞ (STESN model). In this problem the null hypothesis lies exactly on the boundary of the parameter space. Problems of this kind were first discussed in Chernoff [19]. They also arise, for instance, in a random effects model when we are interested in testing whether the variance of the random effects is zero (see Stram and Lee [20] and Gallardo et al. [21]), or in a cure rate model when we are interested in testing for the presence of cured individuals in the population (see Maller and Zhou [22]). In this case, the LRT statistic, say d_n, does not converge asymptotically to the usual χ²(1) distribution, i.e., the chi-squared distribution with 1 degree of freedom, but to ½ χ²(0) + ½ χ²(1), i.e., a 50-50 mixture of a point mass at 0, denoted χ²(0), and a χ²(1) distribution.
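Under the 50-50 mixture, the p-value for an observed statistic d_n > 0 is half the usual χ²(1) tail probability. A minimal sketch using only the standard library follows; it relies on the identity P(χ²(1) > d) = erfc(√(d/2)), and the function name is an illustrative choice.

```python
import math


def lrt_boundary_pvalue(d_n):
    """p-value of the LRT under the 0.5*chi2(0) + 0.5*chi2(1) limiting mixture."""
    if d_n <= 0.0:
        return 1.0  # the point mass at zero covers this case
    chi2_1_tail = math.erfc(math.sqrt(d_n / 2.0))  # P(chi2(1) > d_n)
    return 0.5 * chi2_1_tail


# With the mixture, the 5% critical value is the 0.90 quantile of chi2(1), about 2.7055,
# rather than the usual 3.8415.
print(lrt_boundary_pvalue(2.7055))
print(lrt_boundary_pvalue(3.8415))
```

Using the ordinary χ²(1) reference distribution here would double the p-value and make the test conservative.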

Simulation Study
In this section, we study the behavior of the ML estimators in finite samples, verifying empirically whether they have desirable properties (unbiasedness, asymptotic efficiency, and asymptotic normality).
Random variates from the TESN distribution and the Beta distribution were generated and combined to obtain values of the new variable with the pdf shown in (15). The initial values for the optimization were chosen from a sequence of candidate values as those that maximized the likelihood function. In this sequence, λ takes values between −3 and 3, q between 1 and 5, µ between −2 and 2, and σ between 2 and 10. This process was repeated 5000 times with sample sizes n = 25, 50, 100, and 200 for different combinations of parameters. Table 1 presents the empirical bias, the standard error (SE), the root mean squared error (RMSE), and the 95% coverage probability (CP) for the estimators of the parameters of the STESN distribution under different combinations of parameters and sample sizes. From this table, we note that the bias, SE, and RMSE decrease as the sample size increases, suggesting that the estimators are consistent. Furthermore, the empirical CP of the asymptotic confidence intervals differs from the nominal value, especially when the sample size is small; however, it converges to the nominal value as the sample size increases. Figure 4 shows the estimated pdf of the ML estimators of µ, σ, λ, and q for two combinations of parameters, showing graphically that the skewness of the estimators disappears progressively as the sample size increases. We also note that the distributions of the estimators of λ and q are more asymmetric than those of the estimators of µ and σ, especially for small sample sizes.
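One possible way to generate STESN variates is sketched below; the text does not spell out the generator actually used, so this is an assumption. It relies on two facts: if U has the truncated-exponential pdf λe^{−λu}/(1 − e^{−λ}) on (0, 1), then X = Φ^{−1}(U) has the assumed TESN pdf; and Y^{1/q} with Y ∼ U(0, 1) has a Beta(q, 1) distribution. The function names `rtesn` and `rstesn` are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # N(0, 1), used for its quantile function


def rtesn(lam, rng):
    """Draw one TESN(lam) variate by inverting the truncated-exponential cdf."""
    v = rng.random()
    if lam == 0.0:
        u = v  # lam = 0 reduces to the standard normal case
    else:
        # Solve (1 - exp(-lam*u)) / (1 - exp(-lam)) = v for u
        u = -math.log1p(v * math.expm1(-lam)) / lam
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return _STD_NORMAL.inv_cdf(u)


def rstesn(n, lam, q, mu=0.0, sigma=1.0, seed=None):
    """Draw n variates of mu + sigma * X / Y**(1/q), X ~ TESN(lam), Y ~ U(0, 1)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rtesn(lam, rng)
        w = (1.0 - rng.random()) ** (1.0 / q)  # Beta(q, 1) variate in (0, 1]
        sample.append(mu + sigma * x / w)
    return sample


sample = rstesn(20000, lam=1.0, q=5.0, seed=1)
prop_nonpositive = sum(1 for z in sample if z <= 0.0) / len(sample)
print(prop_nonpositive)
```

Because the divisor is positive, Z ≤ 0 exactly when X ≤ 0, so the printed proportion should be close to (1 − e^{−λ/2})/(1 − e^{−λ}), which gives a simple sanity check on the generator.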

Application to a Data Set
In this section, we present a real data application to illustrate the STESN model in comparison with other models discussed in the literature. The data were presented by Barlow et al. [23] and represent the fatigue fracture life of Kevlar 373/epoxy specimens subjected to constant pressure at the 90% stress level until all had failed. The parameter estimates were obtained with the optim command and their standard errors were calculated from the Hessian matrix, both in R software. Codes are available as supplementary material. Table 2 shows a summary of the dataset, including the sample size n, the sample mean X̄, the standard deviation S, the asymmetry coefficient √b_1, the kurtosis coefficient b_2, the minimum min(X), and the maximum max(X). A high kurtosis value is observed.
Table 3 shows the results of the fit; the TESN distribution was compared with the STESN distribution using the AIC and BIC criteria. The STESN distribution achieves the best fit for this dataset, since it has the lower value of both criteria. Furthermore, Table 3 provides the Kolmogorov-Smirnov statistic (KSS), a formal goodness-of-fit measure used to verify which distribution gives a better fit for these data. Smaller values of this statistic suggest a better fit. Thus, according to the Kolmogorov-Smirnov test, the STESN model fits the current data better than the TESN model.
Figure 5a,b present a histogram of the data with the fitted densities, and Figure 6a,b present the QQ-plots of the fitted models, showing the good fit given by the new distribution.
In our problem, the observed statistic d_n for the LRT to decide between the TESN and STESN models, discussed in Section 2.7, is d_n = 23.59, with an associated p-value < 0.001. Therefore, H_0 is rejected at any usual significance level and the STESN model is preferred over the TESN model.
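The information criteria and the LRT statistic are linked, which the following sketch illustrates. It assumes the standard definitions AIC = 2k − 2ℓ, BIC = k log(n) − 2ℓ, and d_n = 2(ℓ_STESN − ℓ_TESN), and that the fitted TESN and STESN models have 3 and 4 parameters, respectively (location and scale included). The log-likelihood value used below is an arbitrary placeholder, not a figure from the paper; only d_n = 23.59 is taken from the text.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Schwarz (Bayesian) information criterion for sample size n."""
    return k * math.log(n) - 2.0 * loglik


d_n = 23.59                              # LRT statistic reported in the text
loglik_tesn = -100.0                     # arbitrary placeholder, NOT from the paper
loglik_stesn = loglik_tesn + d_n / 2.0   # implied by d_n = 2 * (l_STESN - l_TESN)

# The AIC gap depends only on d_n and the number of extra parameters: d_n - 2 * 1.
aic_gap = aic(loglik_tesn, 3) - aic(loglik_stesn, 4)
print(aic_gap)
```

By the same argument, the BIC gap equals d_n − log(n), so with d_n = 23.59 the STESN model would be favored by BIC for any realistic sample size.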

Final Comments
In this paper, we introduced an extension of the TESN distribution that shows greater flexibility in the coefficient of kurtosis. Some mathematical properties of the new distribution were studied. Note that the derived formulae are easily implemented in different software packages. Inference was implemented based on the ML approach, and its performance was assessed by Monte Carlo simulations using R software. An application to a real dataset showed that the new model produced a better fit than the TESN model. This application demonstrated the practical importance of the new model and the advantage of the STESN model over the TESN model. We hope this new distribution may attract wider applications.