Generalized Truncation Positive Normal Distribution

In this article we study the properties, inference, and statistical applications of a parametric generalization of the truncated positive normal distribution, introducing a new parameter so as to increase the flexibility of the new model. For certain combinations of parameters, the model includes both symmetric and asymmetric shapes. We study the model's basic properties, maximum likelihood estimators, and Fisher information matrix. Finally, we apply it to two real data sets to show the model's good performance compared to other models with positive support: the first, related to the height of the drum of a roller, and the second, related to daily cholesterol consumption.


Introduction
The half-normal (HN) distribution is a very important model in the statistical literature. Its density function has a closed form and its cumulative distribution function (cdf) depends on the cdf of the standard normal model (or the error function), which is implemented in practically all mathematical and statistical software. Pewsey [1,2] provides the maximum likelihood (ML) estimation for the general location-scale HN distribution and its asymptotic properties. Wiper et al. [3] and Khan and Islam [4] perform analysis and applications for the HN model from a Bayesian framework. Moral et al. [5] also present the hnp R package, which produces half-normal plots with simulated envelopes using different diagnostics from a range of different fitted models. The HN model is also present in the stochastic representation of the skew-normal distribution in Azzalini [6,7] and Henze [8]. In recent years this distribution has been used to model positive data, and it is becoming an important model in reliability theory despite the fact that it accommodates only monotone hazard rates. Some of the generalizations of this distribution can be found in Cooray and Ananda [9], Olmos et al. [10], Cordeiro et al. [11], and Gómez and Bolfarine [12], among others.
In particular, we focus on the extension of Cooray and Ananda [9]. The authors provided a motivation related to static fatigue life to consider the transformation Z = σY^{1/α}, where Y ∼ HN(1). This model was named the generalized half-normal (GHN) distribution. An alternative way to extend the HN model was introduced by Gómez et al. [13], considering a normal distribution with mean µ and standard deviation σ truncated to the interval (0, +∞), and considering the reparametrization λ = µ/σ. This model was named the truncated positive normal (TPN) distribution, with density function given by

f(z; σ, λ) = φ(z/σ − λ) / (σ Φ(λ)), z > 0, (1)

where φ(·) and Φ(·) denote the density and cdf of the standard normal model, respectively. We use TPN(σ, λ) to refer to a random variable (r.v.) with density function as in Equation (1). Note that TPN(σ, 0) ≡ HN(σ).
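A minimal numerical sketch of the TPN density above may be useful; the snippet assumes the form f(z; σ, λ) = φ(z/σ − λ)/(σΦ(λ)) for z > 0 obtained by truncating a N(µ, σ²) law to (0, +∞) with λ = µ/σ, and the function name `tpn_pdf` is illustrative:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def tpn_pdf(z, sigma, lam):
    """TPN(sigma, lam) density: phi(z/sigma - lam) / (sigma * Phi(lam)), z > 0."""
    z = np.asarray(z, dtype=float)
    out = norm.pdf(z / sigma - lam) / (sigma * norm.cdf(lam))
    return np.where(z > 0, out, 0.0)

area, _ = quad(tpn_pdf, 0, np.inf, args=(2.0, 1.5))   # total probability mass
hn_at_1 = (2.0 / 2.0) * norm.pdf(1.0 / 2.0)           # HN(2) density at z = 1
print(round(area, 6), bool(np.isclose(tpn_pdf(1.0, 2.0, 0.0), hn_at_1)))
```

The two checks confirm that the density integrates to one and that TPN(σ, 0) recovers HN(σ).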
In this work we consider an idea similar to that used in Cooray and Ananda [9] to extend the TPN model, including the transformation Y = σX^{1/α}, where X ∼ TPN(1, λ). We will refer to this distribution as the generalized truncation positive normal (GTPN) distribution.
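The stochastic representation can be sketched directly, since TPN(1, λ) is a N(λ, 1) law truncated to (0, +∞); the sampler below (the name `rgtpn` is illustrative) uses scipy's `truncnorm` and checks the α = 1 case against the known truncated-normal mean λ + φ(λ)/Φ(λ):

```python
import numpy as np
from scipy.stats import norm, truncnorm

def rgtpn(n, sigma, lam, alpha, rng):
    """Draw from GTPN(sigma, lam, alpha) via Z = sigma * Y**(1/alpha),
    Y ~ TPN(1, lam), i.e., N(lam, 1) truncated to (0, inf)."""
    y = truncnorm.rvs(a=-lam, b=np.inf, loc=lam, scale=1.0, size=n, random_state=rng)
    return sigma * y ** (1.0 / alpha)

rng = np.random.default_rng(1)
z = rgtpn(50_000, 1.0, 2.0, 1.0, rng)
# With alpha = 1 this is TPN(1, 2), whose mean is lam + phi(lam)/Phi(lam):
print(bool(np.isclose(z.mean(), 2.0 + norm.pdf(2.0) / norm.cdf(2.0), atol=0.02)))
```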
The rest of the manuscript is organized as follows. Section 2 is devoted to the study of some important properties of the model, such as its moments, quantile and hazard functions, and its entropy. In Section 3 we perform inference and present the Fisher information matrix for the proposed model. Section 4 discusses model selection between the GTPN distribution and nested and non-nested alternatives. In Section 5 we carry out a simulation study of the properties of the ML estimators in finite samples. Section 6 presents two applications to real data sets to illustrate that the proposed model is competitive with other common models for positive data in the literature. Finally, in Section 7, we present some concluding remarks.

Model Properties
In this section we introduce the main properties of the GTPN model such as density, quantile and hazard functions, moments, among others.

Stochastic Representation and Particular Cases
As mentioned previously, we say that a r.v. Z has a GTPN(σ, λ, α) distribution if Z = σY^{1/α}, where Y ∼ TPN(1, λ). From this representation, the density function of Z is

f(z; σ, λ, α) = (α z^{α−1} / σ^α) φ((z/σ)^α − λ) / Φ(λ), z > 0. (2)

By construction, the following models are particular cases of the GTPN distribution:
• α = 1: the TPN(σ, λ) model;
• λ = 0: the GHN(σ, α) model;
• α = 1 and λ = 0: the HN(σ) model.
Figure 1 summarizes the relationships among the GTPN and its particular cases. We highlight that λ = 0 and α = 1 lie within the parametric space (not on its boundary). Therefore, to decide between the GTPN versus the TPN, GHN, or HN distributions we can use classical hypothesis tests such as the likelihood ratio test (LRT), score test (ST), or gradient test (GT).

Proposition 2.
For Z ∼ GTPN(σ, λ, α), the cdf and hazard function are given by

F(z; σ, λ, α) = [Φ((z/σ)^α − λ) − Φ(−λ)] / Φ(λ) and h(z; σ, λ, α) = α z^{α−1} φ((z/σ)^α − λ) / (σ^α [1 − Φ((z/σ)^α − λ)]),

respectively, for all z ≥ 0. Figure 2 shows the density and hazard functions for the GTPN(σ = 1, λ, α) model, considering some combinations of (λ, α). Note that the GTPN model can assume decreasing and unimodal shapes for the density function and decreasing or increasing shapes for the hazard function.
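A quick numerical check of Proposition 2 can be made by comparing the closed-form cdf with direct integration of the density implied by the transformation Z = σY^{1/α}; the function names below are illustrative:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def gtpn_pdf(z, s, l, a):
    # density implied by Z = s * Y**(1/a), Y ~ TPN(1, l)
    w = (z / s) ** a
    return a * z ** (a - 1) / s ** a * norm.pdf(w - l) / norm.cdf(l)

def gtpn_cdf(z, s, l, a):
    # F(z) = [Phi((z/s)^a - l) - Phi(-l)] / Phi(l)
    return (norm.cdf((z / s) ** a - l) - norm.cdf(-l)) / norm.cdf(l)

def gtpn_hazard(z, s, l, a):
    return gtpn_pdf(z, s, l, a) / (1.0 - gtpn_cdf(z, s, l, a))

num, _ = quad(gtpn_pdf, 0, 1.3, args=(1.0, 0.5, 2.0))
print(bool(np.isclose(num, gtpn_cdf(1.3, 1.0, 0.5, 2.0))))
```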

Mode
Proposition 3. The mode of the GTPN(σ, λ, α) model is attained at z = σ w₊^{1/α}, where w₊ is the largest root of αw² − αλw − (α − 1) = 0, whenever this root exists and is positive; otherwise the mode is attained at zero. Proof. Let l = log(f), where f is the density function defined in (2). By a direct computation we have that l_z vanishes when

α(z/σ)^{2α} − αλ(z/σ)^α − (α − 1) = 0.

Using the auxiliary variable w = (z/σ)^α, the last equation is rewritten as follows:

αw² − αλw − (α − 1) = 0. (3)

In the rest of the proof we use the discriminant of the quadratic Equation (3), which is given by ∆ = α²λ² + 4α(α − 1) = α²[λ² + 4(1 − 1/α)], and its zeros, given by

w± = (αλ ± √∆) / (2α).

In consequence, whenever w₊ > 0 the mode is attained at z = σ w₊^{1/α}. Here, two cases may occur. The first is when 0 < ∆ < α²λ² (that is, α < 1): if λ > 0, the mode is attained at z = σ w₊^{1/α}, since l_zz < 0 there, while if λ < 0, the zeros of Equation (3) are negative, implying that the function l is strictly decreasing; its mode is therefore attained at zero. The other case is when ∆ < 0; then αw² − αλw − (α − 1) > 0 for all w ≥ 0, implying that l_z < 0 for all z ≥ 0. Therefore, l is strictly decreasing and thus its mode is zero.
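The root of the quadratic in Proposition 3 can be verified against a direct numerical maximization of the density; the helper `gtpn_mode` below is an illustrative implementation of the proposition:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def gtpn_mode(s, l, a):
    """Mode from the larger root w+ of a*w^2 - a*l*w - (a - 1) = 0,
    mode = s * w+**(1/a); zero when no positive root exists."""
    disc = (a * l) ** 2 + 4 * a * (a - 1)
    if disc < 0:
        return 0.0
    w = (a * l + np.sqrt(disc)) / (2 * a)
    return s * w ** (1.0 / a) if w > 0 else 0.0

# Direct maximization of the density for sigma = 1, lam = 1, alpha = 2:
pdf = lambda z: 2 * z * norm.pdf(z ** 2 - 1.0) / norm.cdf(1.0)
res = minimize_scalar(lambda z: -pdf(z), bounds=(1e-6, 5), method="bounded")
print(bool(np.isclose(gtpn_mode(1.0, 1.0, 2.0), res.x, atol=1e-4)))
```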

Remark 1.
Note that α ≥ 1 or λ > 0 implies that the mode of the GTPN model is attained at a positive value.

Quantiles
Proposition 4. The quantile function for the GTPN(σ, λ, α) distribution is given by

Q(p) = σ [λ + Φ⁻¹(1 − (1 − p)Φ(λ))]^{1/α}, 0 < p < 1.

Proof. Follows from a direct computation, applying the definition of the quantile function to the cdf in Proposition 2.
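The quantile function stated above can be validated with a round-trip check F(Q(p)) = p; both helper names are illustrative:

```python
import numpy as np
from scipy.stats import norm

def gtpn_quantile(p, s, l, a):
    # Q(p) = s * [l + Phi^{-1}(1 - (1 - p) * Phi(l))]**(1/a)
    return s * (l + norm.ppf(1 - (1 - p) * norm.cdf(l))) ** (1.0 / a)

def gtpn_cdf(z, s, l, a):
    return (norm.cdf((z / s) ** a - l) - norm.cdf(-l)) / norm.cdf(l)

p = np.array([0.25, 0.5, 0.75])
q = gtpn_quantile(p, 2.0, 1.5, 0.8)
print(bool(np.allclose(gtpn_cdf(q, 2.0, 1.5, 0.8), p)))
```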

Corollary 1.
The quartiles of the GTPN distribution are obtained by evaluating Q(p) = σ[λ + Φ⁻¹(1 − (1 − p)Φ(λ))]^{1/α} at p = 1/4 (first quartile), p = 1/2 (median), and p = 3/4 (third quartile).

Proof.
Considering the stochastic representation of the GTPN model in Section 2.1, it is immediate that µ_r = E(Z^r) = σ^r E(Y^{r/α}), where Y ∼ TPN(1, λ). This expected value can be computed using Proposition 2.2 in Gómez et al. [13].
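The moment representation µ_r = σ^r E(Y^{r/α}) with Y ∼ TPN(1, λ) lends itself to quadrature; the sketch below checks the α = 1 case against the truncated-normal mean λ + φ(λ)/Φ(λ), a known closed form:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def gtpn_moment(r, s, l, a):
    """mu_r = E(Z^r) = s**r * E(Y**(r/a)), Y ~ TPN(1, l), by quadrature."""
    g = lambda y: y ** (r / a) * norm.pdf(y - l) / norm.cdf(l)
    val, _ = quad(g, 0, np.inf)
    return s ** r * val

m1 = gtpn_moment(1, 1.0, 2.0, 1.0)
print(bool(np.isclose(m1, 2.0 + norm.pdf(2.0) / norm.cdf(2.0))))
```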

Remark 2.
When r/α ∈ N, closed forms can be obtained for µ_r. Figure 3 illustrates the mean, variance, skewness, and kurtosis coefficients for the GTPN(σ = 1, λ, α) model for some combinations of its parameters.

Bonferroni and Lorenz Curves
In this subsection we present the Bonferroni and Lorenz curves (see Bonferroni [14]). These curves have applications not only in economics, to study income and poverty, but also in medicine, reliability, etc. The Bonferroni curve is defined as

B(p) = (1/(p µ)) ∫₀^{Q(p)} x f(x) dx, 0 < p < 1,

where µ = E(Z) and Q(·) is the quantile function. The Lorenz curve is obtained by the relation L(p) = pB(p). In particular, for the GTPN model the Bonferroni curve is obtained by substituting the GTPN density and quantile functions into this definition.
These curves serve as graphical methods for the analysis and comparison of, e.g., the inequality of non-negative distributions; see [15] for a more detailed discussion. Figure 4 shows the Bonferroni curve for the GTPN(σ = 1, λ, α) model, considering different values for λ and α.
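The Bonferroni curve for the GTPN model can be evaluated by quadrature from its definition; the snippet below is an illustrative sketch (function names are assumptions), using the density and quantile forms implied by the transformation Z = σY^{1/α}:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def gtpn_pdf(z, s, l, a):
    return a * z ** (a - 1) / s ** a * norm.pdf((z / s) ** a - l) / norm.cdf(l)

def gtpn_quantile(p, s, l, a):
    return s * (l + norm.ppf(1 - (1 - p) * norm.cdf(l))) ** (1.0 / a)

def bonferroni(p, s, l, a):
    """B(p) = (1/(p*mu)) * integral_0^{Q(p)} x f(x) dx, by quadrature."""
    mu, _ = quad(lambda x: x * gtpn_pdf(x, s, l, a), 0, np.inf)
    part, _ = quad(lambda x: x * gtpn_pdf(x, s, l, a), 0, gtpn_quantile(p, s, l, a))
    return part / (p * mu)

b = bonferroni(0.5, 1.0, 1.0, 2.0)
lorenz = 0.5 * b                  # L(p) = p * B(p)
print(0.0 < b < 1.0, 0.0 < lorenz <= b)
```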

Shannon Entropy
Shannon entropy (see Shannon [16]) measures the amount of uncertainty in a random variable. It is defined as

S(Z) = −E[log f(Z)] = −∫₀^∞ log(f(z; σ, λ, α)) f(z; σ, λ, α) dz.

Therefore, it can be checked that the Shannon entropy for the GTPN model is

S(Z) = log(σ^α √(2π) Φ(λ)/α) + (1/2)[1 − λφ(λ)/Φ(λ)] − (α − 1)E(log(Z)),

where E(log(Z)) = ∫₀^∞ log(z) f(z; σ, λ, α) dz. For α = 1, the Shannon entropy is reduced to

S(Z) = log(σ √(2π) Φ(λ)) + (1/2)[1 − λφ(λ)/Φ(λ)],

which corresponds to the Shannon entropy for the TPN model; and for α = 1 and λ = 0, S(Z) is reduced to

S(Z) = (1/2) log(πσ²/2) + 1/2,

which corresponds to the Shannon entropy for the HN distribution. Figure 5 shows the entropy curve for the GTPN(σ = 1.5, λ, α) model, considering different values for λ and α. We note that this function is increasing in λ and α.

Inference
In this section we discuss the ML method for parameter estimation in the GTPN model.

Maximum Likelihood Estimators
For a random sample z₁, . . . , zₙ from the GTPN(σ, λ, α) model, the log-likelihood function for θ = (σ, λ, α) is, up to an additive constant,

l(θ) = n log α − nα log σ − n log Φ(λ) + (α − 1) Σ log z_i − (1/2) Σ ((z_i/σ)^α − λ)²,

where all sums run over i = 1, . . . , n. Therefore, the score vector S(θ) = (S_σ(θ), S_λ(θ), S_α(θ)) assumes the form

S_σ(θ) = −nα/σ + (α/σ) Σ ((z_i/σ)^α − λ)(z_i/σ)^α,
S_λ(θ) = Σ ((z_i/σ)^α − λ) + n ξ(λ),
S_α(θ) = n/α − n log σ + Σ log z_i − Σ ((z_i/σ)^α − λ)(z_i/σ)^α log(z_i/σ),

where ξ(λ) = −φ(λ)/Φ(λ) is the negative of the inverse Mills ratio. The ML estimators are obtained by solving the equation S(θ) = 0₃, where 0_p denotes a vector of zeros with dimension p. The equation S_σ(θ) = 0 has the following solution for λ:

λ = (Σ (z_i/σ)^{2α} − n) / Σ (z_i/σ)^α. (4)

Replacing Equation (4) in S_λ(θ) = 0 and S_α(θ) = 0, the problem is reduced to two equations in (σ, α). This problem needs to be solved by numerical methods such as Newton-Raphson. Below we discuss initial values for the vector θ to initialize the algorithm.
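In practice, the same maximization can be carried out by direct numerical optimization of the log-likelihood; the sketch below (assuming scipy's Nelder-Mead and a naive starting point (σ̃, 0, 1), both assumptions rather than the paper's exact algorithm) fits a simulated GTPN sample:

```python
import numpy as np
from scipy.stats import norm, truncnorm
from scipy.optimize import minimize

def negloglik(theta, z):
    """Negative GTPN log-likelihood; the Gaussian constant enters via logpdf."""
    s, l, a = theta
    if s <= 0 or a <= 0:
        return np.inf
    w = (z / s) ** a
    return -(len(z) * (np.log(a) - a * np.log(s) - np.log(norm.cdf(l)))
             + (a - 1) * np.log(z).sum() + norm.logpdf(w - l).sum())

# Simulated GTPN(sigma=1, lambda=1, alpha=2) sample via Z = sigma * Y**(1/alpha):
rng = np.random.default_rng(7)
y = truncnorm.rvs(a=-1.0, b=np.inf, loc=1.0, scale=1.0, size=3000, random_state=rng)
z = y ** 0.5
fit = minimize(negloglik, x0=[np.sqrt(np.mean(z ** 2)), 0.0, 1.0], args=(z,),
               method="Nelder-Mead", options={"maxiter": 5000})
print("estimate:", np.round(fit.x, 2))
```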

Initial Point to Obtain the Maximum Likelihood Estimators
In this subsection, we discuss the initial points for the iterative methods to find the ML estimators in the GTPN distribution.

A Naive Point Based on the HN Model
In Section 2 we discussed that GTPN(σ, λ = 0, α = 1) ≡ HN(σ). Based on this fact, and considering that the ML estimator for σ in the HN distribution has the closed form σ̃ = (n⁻¹ Σ z_i²)^{1/2}, we can consider as an initial point θ_naive = (σ̃, 0, 1).
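This naive point amounts to one line of code; the sample below is a hypothetical illustration:

```python
import numpy as np

z = np.array([3.1, 2.8, 4.0, 3.5, 3.3])   # hypothetical positive sample
sigma0 = np.sqrt(np.mean(z ** 2))          # closed-form HN ML estimator
theta_naive = (sigma0, 0.0, 1.0)
print(round(float(sigma0), 3))             # 3.364
```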

An Initial Point Based on Centiles
Let q_t, t = 1, . . . , 99, denote the t-th sample centile based on z₁, . . . , zₙ. An initial point for θ can be obtained by matching q_u, q₅₀, and q_{100−u}, with u ∈ {1, 2, . . . , 48, 49}, with their respective theoretical counterparts. Defining p = u/100, the resulting matching equations provide σ and α, whereas λ is obtained from a non-linear equation. Therefore, the initial point based on this method is given by θ_cent = (σ̄, λ̄, ᾱ).
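Since the closed-form solutions are not reproduced here, a numerical version of the centile matching can serve as a sketch: solve the three matching equations jointly by least squares (the helper names and the log-parametrization are assumptions, not the paper's derivation):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import least_squares

def gtpn_quantile(p, s, l, a):
    # Q(p) = s * [l + Phi^{-1}(1 - (1 - p) * Phi(l))]**(1/a)
    return s * (l + norm.ppf(1 - (1 - p) * norm.cdf(l))) ** (1.0 / a)

def centile_start(z, u=25):
    """Match q_u, q_50 and q_{100-u} to their theoretical counterparts."""
    ps = np.array([u, 50, 100 - u]) / 100.0
    qs = np.quantile(z, ps)
    resid = lambda t: gtpn_quantile(ps, np.exp(t[0]), t[1], np.exp(t[2])) - qs
    t = least_squares(resid, x0=[np.log(np.median(z)), 0.0, 0.0]).x
    return np.exp(t[0]), t[1], np.exp(t[2])

rng = np.random.default_rng(3)
z = np.exp(rng.normal(1.0, 0.3, size=500))    # any positive sample
s0, l0, a0 = centile_start(z)
print(s0 > 0 and a0 > 0)
```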

An Initial Point Based on the Method of Moments
A more robust initial point can be obtained using the method of moments. The equations to solve are µ_r = m_r, r = 1, 2, 3, where m_r = n⁻¹ Σ z_i^r denotes the r-th sample moment. The solution for σ follows directly, whereas the solutions for λ and α (say λ* and α*, respectively) are obtained from non-linear equations. Therefore, the initial point based on this method is given by θ_mom = (σ*, λ*, α*).
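A fully numerical variant of the moment matching is sketched below, solving µ_r(θ) = m_r for r = 1, 2, 3 with a root finder; the quadrature-based moments and the starting values are assumptions for illustration:

```python
import numpy as np
from scipy.stats import norm, truncnorm
from scipy.integrate import quad
from scipy.optimize import fsolve

def gtpn_moment(r, s, l, a):
    # mu_r = s**r * E(Y**(r/a)), Y ~ TPN(1, l), by quadrature
    g = lambda y: y ** (r / a) * norm.pdf(y - l) / norm.cdf(l)
    return s ** r * quad(g, 0, np.inf)[0]

def moment_start(z):
    """Solve mu_r(theta) = m_r for r = 1, 2, 3 numerically."""
    m = [np.mean(z ** r) for r in (1, 2, 3)]
    eqs = lambda t: [gtpn_moment(r, np.exp(t[0]), t[1], np.exp(t[2])) - m[r - 1]
                     for r in (1, 2, 3)]
    t = fsolve(eqs, x0=[0.5 * np.log(np.mean(z ** 2)), 0.0, 0.0])
    return np.exp(t[0]), t[1], np.exp(t[2])

rng = np.random.default_rng(5)
y = truncnorm.rvs(a=-1.5, b=np.inf, loc=1.5, scale=1.0, size=4000, random_state=rng)
z = y ** (1 / 1.2)                    # GTPN(1, 1.5, 1.2) sample
theta0 = moment_start(z)
print(all(np.isfinite(theta0)))
```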

Fisher Information Matrix
The Fisher information (FI) matrix for the GTPN distribution is given by I_F(θ) = (I_{ab}), with a, b ∈ {σ, λ, α}.

Model Discrimination
In this section we discuss some techniques to discriminate between the GTPN distribution and other models.

GTPN versus Submodels
An interesting problem to solve is the discrimination between the GTPN and the three submodels represented in Figure 1. In other words, we are interested in testing the following hypotheses:
• H₀⁽¹⁾: α = 1 (TPN model) versus H₁⁽¹⁾: α ≠ 1;
• H₀⁽²⁾: λ = 0 (GHN model) versus H₁⁽²⁾: λ ≠ 0;
• H₀⁽³⁾: (α, λ) = (1, 0) (HN model) versus H₁⁽³⁾: (α, λ) ≠ (1, 0).
The three hypotheses can be tested considering the LRT, ST, and GT. Below we present the statistics for the three tests considered and for the three hypotheses of interest.

Likelihood Ratio Test
The statistic for the LRT (say SLR) to test H₀⁽ʲ⁾, j = 1, 2, 3, is defined as

SLR_j = 2[l(θ̂) − l(θ̂₀ⱼ)],

where θ̂₀ⱼ = (σ̂₀ⱼ, λ̂₀ⱼ, α̂₀ⱼ) denotes the ML estimator for θ restricted to H₀⁽ʲ⁾. Under H₀⁽¹⁾ and H₀⁽²⁾ the ML estimators under the null hypotheses need to be computed numerically. However, in both cases the problem is reduced to a unidimensional maximization. For details see Gómez et al. [13] and Cooray and Ananda [9], respectively.
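The LRT can be sketched numerically by fitting the restricted and full models and referring SLR to its chi-squared limit; the Nelder-Mead fits and helper names below are illustrative choices, not the paper's exact algorithm:

```python
import numpy as np
from scipy.stats import norm, truncnorm, chi2
from scipy.optimize import minimize

def negloglik(theta, z):
    s, l, a = theta
    if s <= 0 or a <= 0:
        return np.inf
    w = (z / s) ** a
    return -(len(z) * (np.log(a) - a * np.log(s) - np.log(norm.cdf(l)))
             + (a - 1) * np.log(z).sum() + norm.logpdf(w - l).sum())

rng = np.random.default_rng(11)
# TPN(1, 2) data, so the null hypothesis alpha = 1 is true:
z = truncnorm.rvs(a=-2.0, b=np.inf, loc=2.0, scale=1.0, size=500, random_state=rng)

# Restricted fit (alpha fixed at 1), then full fit started at the restricted optimum:
rest = minimize(lambda t: negloglik([t[0], t[1], 1.0], z),
                x0=[np.sqrt(np.mean(z ** 2)), 0.0], method="Nelder-Mead",
                options={"maxiter": 5000})
full = minimize(negloglik, x0=[rest.x[0], rest.x[1], 1.0], args=(z,),
                method="Nelder-Mead", options={"maxiter": 5000})
slr = 2.0 * (rest.fun - full.fun)    # SLR = 2 * [l(theta_hat) - l(theta_hat_0)]
pval = chi2.sf(slr, df=1)            # one restriction under H0
print(slr >= 0.0, 0.0 <= pval <= 1.0)
```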

Score Test
The statistic for the ST (say SR) to test H₀⁽ʲ⁾, j = 1, 2, 3, is defined as

SR_j = S(θ̂₀ⱼ)ᵀ I_F(θ̂₀ⱼ)⁻¹ S(θ̂₀ⱼ),

where S(·) and I_F(·) are the score vector and the FI matrix, respectively, evaluated at the ML estimator restricted to H₀⁽ʲ⁾.

Gradient Test
The statistic for the GT (say SG) to test H₀⁽ʲ⁾, j = 1, 2, 3, is defined as

SG_j = S(θ̂₀ⱼ)ᵀ (θ̂ − θ̂₀ⱼ).

After some algebraic manipulations, a simplified expression is obtained for each hypothesis.

Non-Nested Models
The comparison of non-nested models can be performed based on the AIC criterion (Akaike [17]), where the model with the lower AIC is preferred. However, in practice we can have a set of inappropriate models for a certain data set. For this reason, we also need to perform a goodness-of-fit validation. This can be performed, for instance, based on the quantile residuals (QR); for more details see Dunn and Smyth [18]. These residuals are defined as

r_i = Φ⁻¹(G(z_i; ψ̂)), i = 1, . . . , n,

where G(·; ψ̂) is the cdf of the specified distribution evaluated at the estimator of ψ. If the model is correctly specified, such residuals are a random sample from the standard normal distribution. This can be assessed using, for instance, the Anderson-Darling (AD), Cramér-von Mises (CVM), and Shapiro-Wilk (SW) tests. A discussion of these tests can be seen in Yazici and Yocalan [19].
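The QR recipe is model-agnostic, so a sketch with any known positive distribution illustrates it; here a gamma sample plays the role of the data and its true cdf stands in for G(·; ψ̂), an assumption made purely for illustration:

```python
import numpy as np
from scipy.stats import norm, gamma, shapiro, cramervonmises

rng = np.random.default_rng(2)
z = rng.gamma(shape=5.0, scale=0.4, size=300)       # data from a known model

# Quantile residuals r_i = Phi^{-1}(G(z_i; psi_hat)); with the correct G,
# the residuals behave as a standard normal sample.
r = norm.ppf(gamma.cdf(z, a=5.0, scale=0.4))
print("SW p-value:", round(shapiro(r).pvalue, 3),
      "CVM p-value:", round(cramervonmises(r, "norm").pvalue, 3))
```

Large p-values here indicate no evidence against normality of the residuals, i.e., no evidence of misspecification.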

Simulation
In this section we present a Monte Carlo (MC) simulation study in order to illustrate the behavior of the ML estimators. We consider three sample sizes: n = 50, 150, and 300; two values for σ: 1 and 2; two values for λ: 3 and 4; and three values for α: 0.8, 1, and 2. For each combination of n, σ, λ, and α, we draw 10,000 samples of size n from the GTPN(σ, λ, α) model. To simulate a value from this distribution, we consider the following scheme:

1. Simulate U ∼ Uniform(0, 1).
2. Compute Z = σ[λ + Φ⁻¹(1 − (1 − U)Φ(λ))]^{1/α} ∼ GTPN(σ, λ, α).

For each sample generated, ML estimators were computed numerically using the Newton-Raphson algorithm. Table 1 presents means and standard deviations for each parameter in each case. Notice that bias and standard deviations are reduced as the sample size increases, suggesting that the ML estimators are consistent.
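The two-step inverse-transform scheme above can be sketched as follows (the function name `rgtpn_inv` is illustrative), with a check that the empirical cdf at the theoretical median is close to 1/2:

```python
import numpy as np
from scipy.stats import norm

def rgtpn_inv(n, s, l, a, rng):
    """Inverse-transform sampler: U ~ U(0, 1), then Z = Q(U)."""
    u = rng.uniform(size=n)
    return s * (l + norm.ppf(1 - (1 - u) * norm.cdf(l))) ** (1.0 / a)

rng = np.random.default_rng(9)
z = rgtpn_inv(100_000, 1.0, 3.0, 2.0, rng)
med = 1.0 * (3.0 + norm.ppf(1 - 0.5 * norm.cdf(3.0))) ** 0.5   # theoretical median
print(bool(np.isclose((z <= med).mean(), 0.5, atol=0.01)))
```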

Applications
In this section we present two real data applications to illustrate the better performance of the GTPN model over other well-known models in the literature. For these comparisons we also consider the Weibull (WEI) and the generalized Lindley (GL, Zakerzadeh [20]) models. The density function of the Weibull distribution is given by

f(x; σ, λ) = (λ/σ)(x/σ)^{λ−1} exp(−(x/σ)^λ),

with x > 0, σ > 0, and λ > 0, whereas the density of the GL model, with x > 0, θ > 0, α > 0, and γ > 0, can be found in Zakerzadeh [20].

Application 1
The data set was taken from Laslett [21], and consisted of n = 115 heights measured at 1 micron intervals along the drum of a roller (i.e., parallel to the axis of the roller). This was part of an extensive study of the surface roughness of the rollers. A statistical summary of the data set is presented in Table 2.
The heights measured give n = 115, mean 3.48, standard deviation 0.52, skewness −1.24, and kurtosis 6.30. Initially, we calculate the centile, naive, and moment estimators for the GTPN distribution, which are θ_cent = (2.326, 2.741, 2.376), θ_naive = (3.557, 0, 1), and θ_mom = (3.075, 1.572, 3.203). We used these estimates as initial values in computing the ML estimators for the GTPN model. Results are presented in Table 3. Note the high estimated standard error for the γ parameter in the GL model. In addition, note that, based on the AIC and BIC [22] criteria, the GTPN model is preferred (among the fitted models) for this data set. Figure 6 shows the estimated density for each model in this data set, where the GTPN model appears to provide a better fit. Finally, Figure 7 presents the qq-plots of the QR for the same models and the p-values for the three normality tests discussed in Section 4.2. Results suggest that the GTPN model is an appropriate model for this data set, while the rest of the models are not.

Application 2
The data set to be investigated was taken from Nierenberg et al. [23], and is related to a study on plasma retinol and betacarotene levels from a sample of n = 315 subjects. More specifically, the response variable observed is grams of cholesterol consumed per day. Descriptive statistics for the variable are provided in Table 4.
Initially, we calculate the centile, naive, and moment estimators for the GTPN distribution, which are θ_cent = (2.326, 2.741, 2.376), θ_naive = (275.960, 0, 1), and θ_mom = (0.003, −86.024, 1.854). We used these estimates as initial values in computing the ML estimators for the GTPN model. Table 5 summarizes the fit for this data set. As in the last application, we note the high estimated standard error for the γ parameter in the GL model. Again, based on the AIC and BIC criteria, the preferred model is the GTPN. Figure 8 shows the estimated density for each model in the cholesterol data set. The GTPN model appears to provide a better fit. For this data set, we also present the hypothesis tests for the particular models of the GTPN distribution discussed in Section 4.1. Specifically, for H₀: α = 1 we obtained p-values < 0.0001, 0.0157, and < 0.0001 for the LRT, ST, and GT, respectively. Therefore, with a significance level of 5% we prefer the GTPN over the TPN model. For H₀: (α, λ) = (1, 0) we obtained p-values < 0.0001 for the three tests. Therefore, with a significance level of 5% we prefer the GTPN over the HN model. Finally, Figure 9 presents the qq-plots of the QR for the fitted models and the p-values for the three normality tests. Results suggest that the GTPN model is appropriate for this data set, while the rest of the models are not.

Conclusions
In this work we introduce a new distribution for positive data named the GTPN model. This new distribution includes as particular cases three models well-known in the literature: the TPN, GHN, and HN models. The basic properties of the model and ML estimation were studied. We performed a simulation study in finite samples and two real data applications, showing the good performance of the model compared with other usual models in the literature.

Conflicts of Interest:
The authors declare no conflict of interest.