1. Introduction
The half-normal (HN) distribution is suitable for fitting positive data. For this reason, it is of interest in reliability and survival analysis as a lifetime model. The HN model also has many useful theoretical properties; for instance, it can be obtained as a particular case of the folded normal, the truncated normal, or the central chi distribution with one degree of freedom; details can be seen in Johnson et al. [1]. Recall that a random variable (rv) $Z$ follows an HN distribution with scale parameter $\sigma$, $Z \sim HN(\sigma)$, if its probability density function (pdf) is given by
$$f(z;\sigma) = \frac{2}{\sigma}\,\phi\!\left(\frac{z}{\sigma}\right), \quad z > 0,$$
where $\sigma > 0$ and $\phi(\cdot)$ denotes the pdf of a $N(0,1)$ distribution.
Properties of the HN distribution and first applications can be seen in the papers by Rogers and Tukey [2] and Mosteller and Tukey [3]. Pewsey [4,5] introduced the general location-scale HN distribution and studied asymptotic inference based on maximum likelihood (ML) estimators. Later, Wiper et al. [6] obtained Bayesian results for the general HN and half-t distributions. Cooray and Ananda [7] proposed the generalized half-normal (GHN) distribution as a lifetime model useful for items subject to static fatigue. Ahmadi and Yousefzadeh [8] obtained results for the GHN under type I interval censoring. Olmos et al. [9,10] used the slash methodology to extend the HN and GHN distributions, proposing models with more kurtosis than their predecessors.
On the other hand, Gómez and Bolfarine [11] introduced the two-parameter power half-normal (PHN) distribution. This model is useful for fitting positive data: its shape parameter makes the pdf, survival, and hazard rate functions more flexible than those of the HN distribution. They also showed that the PHN model is a competitor of the GHN model, and therefore, it can be used as a static fatigue lifetime model. Due to its good properties, the PHN will be the starting point to introduce our proposal. Our aim is to obtain the slashed version of the PHN model.
Next, we recall the main features of the PHN model (see [11]). An rv $X$ is said to follow a PHN distribution, $X \sim PHN(\sigma, \alpha)$, if its pdf is given by
$$f(x;\sigma,\alpha) = \frac{2\alpha}{\sigma}\,\phi\!\left(\frac{x}{\sigma}\right)\left[2\Phi\!\left(\frac{x}{\sigma}\right) - 1\right]^{\alpha-1}, \quad x > 0,$$
where $\sigma > 0$ and $\alpha > 0$ are scale and shape parameters, respectively, and $\Phi(\cdot)$ denotes the cumulative distribution function (cdf) of a $N(0,1)$ distribution.
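The PHN density and its cdf are easy to evaluate numerically. The following is a minimal sketch using only the Python standard library; it assumes the PHN pdf $(2\alpha/\sigma)\,\phi(x/\sigma)[2\Phi(x/\sigma)-1]^{\alpha-1}$ and the cdf $[2\Phi(x/\sigma)-1]^{\alpha}$ obtained by integrating it. The function names are ours and purely illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def phn_pdf(x: float, sigma: float, alpha: float) -> float:
    """Assumed PHN pdf: (2*alpha/sigma) * phi(x/sigma) * (2*Phi(x/sigma) - 1)**(alpha - 1)."""
    if x <= 0:
        return 0.0
    z = x / sigma
    # Note: for alpha < 1 and extremely small z this base can round to 0.
    base = 2.0 * _STD_NORMAL.cdf(z) - 1.0
    return (2.0 * alpha / sigma) * _STD_NORMAL.pdf(z) * base ** (alpha - 1.0)


def phn_cdf(x: float, sigma: float, alpha: float) -> float:
    """Assumed PHN cdf: (2*Phi(x/sigma) - 1)**alpha, the integral of phn_pdf."""
    if x <= 0:
        return 0.0
    return (2.0 * _STD_NORMAL.cdf(x / sigma) - 1.0) ** alpha


def midpoint_integral(f, a: float, b: float, n: int = 20000) -> float:
    """Composite midpoint rule, used here only as a sanity check."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))
```

With $\alpha = 1$ the pdf reduces to the HN density, and integrating `phn_pdf` over $(0, x)$ with `midpoint_integral` should reproduce `phn_cdf(x, ...)` up to quadrature error.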
Lemma 1 (Properties of PHN distribution, [
11]).
Let . Then- 1.
- 2.
- 3.
In particular
- (a)
.
- (b)
.
- (c)
Skewness coefficient, defined as , is - (d)
Kurtosis coefficient, defined as , is
It can be seen in [11] that the skewness and kurtosis coefficients of the PHN model are decreasing functions of $\alpha$. The aim of this paper is to propose an extension of the PHN model whose kurtosis coefficient covers a wider range of values than that of the PHN model, so that it can be used to accommodate outlying observations.
In this sense, it is well known that slash models have heavier tails than other classical distributions, such as the normal one. Relevant papers illustrating the main properties of slash models are Segovia et al. [12], Wang et al. [13] and Iriarte et al. [14].
The outline of this paper is as follows. In Section 2, the stochastic representation of the slash power half-normal (SPHN) model is given, and its pdf, cdf, properties, relationships with and approximations to other models, expression as a mixture, moments, asymmetry and kurtosis coefficients, and stochastic ordering properties are studied. In Section 3, given a random sample from the SPHN model, inference for the unknown parameters is carried out by using moment and maximum likelihood methods. In Section 4, a simulation study is carried out: an algorithm to generate random values from the SPHN model is proposed, and the consistency of the ML estimators is analyzed. In Section 5, two real applications dealing with survival and fatigue fracture data are given. In this section, our model is compared to other competing models, such as the PHN, GHN, Slash Power Maxwell (see [12]), LogNormal and slash generalized half-normal (SGHN) (see [10]) models. It is shown that our proposal provides a better fit than these competitors. Finally, a brief discussion, some conclusions and future tasks are given in Section 6.
2. The Slashed Power Half-Normal Distribution
In this section, the new model is introduced, and its theoretical properties are studied. First, the stochastic representation of the SPHN model is given; that is, a continuous, non-negative rv $T$ follows an SPHN distribution, $T \sim SPHN(\sigma, \alpha, q)$, if $T$ is obtained as
$$T = \frac{X}{U^{1/q}}, \qquad (6)$$
where $X$ and $U$ are independent, $X \sim PHN(\sigma, \alpha)$, $U \sim U(0,1)$, and $q > 0$.
In (6), $\sigma$ is a scale parameter, whereas $\alpha$ and $q$ are shape parameters. It will be seen in Section 2.3 that $q$ increases the range of possible values for the kurtosis coefficient in the SPHN model with respect to the PHN distribution. In the next proposition, the pdf of (6) is obtained.
Proposition 1. Let . Then, the pdf of T is given bywhere , , , and Proof. By using (
6), the Jacobian technique, and marginalizing, the pdf of
T is given by
Making the change of variable
, we have
Finally, by considering the change of variable
, (
7) is obtained. □
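The marginalization step in the proof can be checked numerically. The sketch below does not use the closed form (7); it assumes the representation $T = X/U^{1/q}$ and integrates over $W = U^{1/q}$, whose pdf is $q w^{q-1}$ on $(0,1)$, which gives $f_T(t) = q\int_0^1 w^{q} f_X(tw)\,dw$. Function names are ours, and the PHN pdf is the assumed form $(2\alpha/\sigma)\,\phi(x/\sigma)[2\Phi(x/\sigma)-1]^{\alpha-1}$.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def phn_pdf(x: float, sigma: float, alpha: float) -> float:
    """Assumed PHN pdf, for x > 0."""
    if x <= 0:
        return 0.0
    z = x / sigma
    base = 2.0 * _STD_NORMAL.cdf(z) - 1.0
    return (2.0 * alpha / sigma) * _STD_NORMAL.pdf(z) * base ** (alpha - 1.0)


def sphn_pdf(t: float, sigma: float, alpha: float, q: float, n: int = 400) -> float:
    """Numerical SPHN pdf: q * integral_0^1 w**q * phn_pdf(t*w) dw (midpoint rule)."""
    if t <= 0:
        return 0.0
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h  # midpoint of the i-th subinterval of (0, 1)
        total += w ** q * phn_pdf(t * w, sigma, alpha)
    return q * h * total
```

For $\sigma = \alpha = q = 1$ the integral has the elementary value $2[\phi(0) - \phi(t)]/t^2$, which gives a convenient check of the quadrature.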
Figure 1 shows the pdf of the SPHN model for fixed values of $\sigma$ and $\alpha$, and several values of the parameter $q$. This plot suggests that the right tail of this model becomes heavier as $q$ becomes smaller.
Moreover,
Table 1 compares the right tail in the PHN and SPHN distributions for several values of
q,
. Note that for a fixed
$t$ value, the closer to zero $q$ is, the greater the right-tail probability $P(T > t)$ is. These observations agree with the fact that $q$ is mainly related to the kurtosis in this new model, as will be seen in Section 2.3.
Remark 1. For completeness, plots of the pdf of the SPHN model for and (), and increasing values of q are given in Appendix A.1 and Appendix A.2, respectively. In this way, all the possible shapes of the SPHN pdf are displayed.
2.1. Properties
Next, the cdf, survival and hazard rate functions are obtained. Their relationships with the corresponding functions of the PHN model are included.
Proposition 2. Let . Then, the cdf of T is given bywith the cdf of given in (1) and introduced in (8). Proof. Combining (
7) and (
8), we write
, where
Integrating by parts
I with
and
, and using (
1), (
9) is obtained. □
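A numerical cross-check of the cdf and survival function is straightforward. This sketch does not implement the closed form (9); it assumes $T = X/U^{1/q}$, so that $F_T(t) = P(X \le t\,U^{1/q}) = \int_0^1 F_X(t\,u^{1/q})\,du$, and it assumes the PHN cdf $F_X(x) = [2\Phi(x/\sigma)-1]^{\alpha}$. Function names are ours.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def phn_cdf(x: float, sigma: float, alpha: float) -> float:
    """Assumed PHN cdf: (2*Phi(x/sigma) - 1)**alpha for x > 0."""
    if x <= 0:
        return 0.0
    return (2.0 * _STD_NORMAL.cdf(x / sigma) - 1.0) ** alpha


def sphn_cdf(t: float, sigma: float, alpha: float, q: float, n: int = 4000) -> float:
    """Numerical SPHN cdf: integral_0^1 phn_cdf(t * u**(1/q)) du (midpoint rule)."""
    if t <= 0:
        return 0.0
    h = 1.0 / n
    return h * sum(
        phn_cdf(t * ((i + 0.5) * h) ** (1.0 / q), sigma, alpha) for i in range(n)
    )


def sphn_survival(t: float, sigma: float, alpha: float, q: float, n: int = 4000) -> float:
    """Survival function S_T(t) = 1 - F_T(t)."""
    return 1.0 - sphn_cdf(t, sigma, alpha, q, n)
```

Since $u^{1/q} \le 1$, the computed `sphn_cdf` never exceeds `phn_cdf` at the same point, which is the heavier right tail of the SPHN model seen from the cdf side.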
Corollary 1. Let . Then, the survival function, , and the hazard function, , of T are given by
with , , , and given in (8).
From Corollary 1, the next relationship between the survival function of and the model follows.
Corollary 2. Let . Then, the survival function, , can be expressed as
with the survival function of .
Plots of the cdf, survival and hazard function of
are given in
Figure 2 for
and
fixed and several values of
q,
(
corresponds to the
). On the other hand, plots for the cdf, survival and hazard rate function of SPHN model, taken
, for
(
), and
, by considering increasing values of
q are given in
Appendix A.1 and
Appendix A.2, respectively.
These plots suggest that:
- (1) For increasing values of $q$, the SPHN distribution approaches the PHN distribution (proven in Proposition 5).
- (2) For $\sigma$ and $\alpha$ fixed, these models are stochastically ordered with respect to $q$ (proven in Section 2.4).
Proposition 3. Let . Then
- 1.
For , the mode of T is at zero.
- 2.
For , the mode of T can be obtained as the solution for ofwhere , , , was introduced in (8), and denotes the pdf of a model.
Proof. 1. It follows from the fact that for , the pdf of T is a strictly decreasing function of t.
2. For
, let us consider
, i.e.,
Thus,
which is equivalent to (
10). □
Remark 2. Equation (10) must be solved numerically.
Next, it is proven that the SPHN model can be expressed as a scale mixture of distributions.
Proposition 4. Let and . Then, .
Proof. Note that the marginal pdf of
T can be obtained as
Making the change of variable , the proposed result is obtained. □
By applying the method proposed in Barranco-Chamorro et al. [15], the convergence in law of the SPHN model, as $q \to \infty$, to a PHN distribution is established next. To highlight the fact that we are taking the limit for $q \to \infty$, the subindex $q$ is used to refer to $T$, that is, $T_q$.
Proposition 5. Let $T_q \sim SPHN(\sigma, \alpha, q)$. If $q \to \infty$, then $T_q$ converges in distribution to $X \sim PHN(\sigma, \alpha)$.
Note that Proposition 5 states that, for large values of $q$, the SPHN model can be approximated by a PHN distribution.
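The approximation for large $q$ can be illustrated numerically. The sketch below is not the proof technique of [15]; it simply measures the largest gap between the two cdfs on a grid, assuming $T = X/U^{1/q}$ (so $F_T(t) = \int_0^1 F_X(t\,u^{1/q})\,du$) and the PHN cdf $[2\Phi(x/\sigma)-1]^{\alpha}$. The function names and the grid are ours.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def phn_cdf(x: float, sigma: float, alpha: float) -> float:
    """Assumed PHN cdf: (2*Phi(x/sigma) - 1)**alpha for x > 0."""
    if x <= 0:
        return 0.0
    return (2.0 * _STD_NORMAL.cdf(x / sigma) - 1.0) ** alpha


def sphn_cdf(t: float, sigma: float, alpha: float, q: float, n: int = 500) -> float:
    """Numerical SPHN cdf via the midpoint rule over u in (0, 1)."""
    h = 1.0 / n
    return h * sum(
        phn_cdf(t * ((i + 0.5) * h) ** (1.0 / q), sigma, alpha) for i in range(n)
    )


def max_cdf_gap(sigma: float, alpha: float, q: float, grid) -> float:
    """Largest |F_X(t) - F_T(t)| over the grid, a rough proxy for the sup distance."""
    return max(
        abs(phn_cdf(t, sigma, alpha) - sphn_cdf(t, sigma, alpha, q)) for t in grid
    )


grid = [0.25 * k for k in range(1, 17)]  # t = 0.25, 0.50, ..., 4.00
gaps = [max_cdf_gap(1.0, 2.0, q, grid) for q in (1.0, 5.0, 25.0, 125.0)]
```

The entries of `gaps` shrink as $q$ grows, in line with the convergence in distribution stated in Proposition 5.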
2.2. Relationships among Distributions
In the following, we will see special cases that are associated with the SPHN distribution.
According to Proposition 5, if $q \to \infty$, then $T$ converges in distribution to $X$, where $X \sim PHN(\sigma, \alpha)$. That is, the SPHN model contains the PHN model as a limit case.
If
, then
with
, where
Y follows a slash half-normal (SHN) distribution introduced in Olmos et al. [
9].
If and , then , where M follows an distribution.
These relationships among distributions are summarized in
Figure 3.
2.3. Moments
The next proposition gives the expression of the noncentral moments of the SPHN distribution. The expected value, variance, skewness and kurtosis coefficients follow in a straightforward way.
Proposition 6. Let . Then, for and , the rth noncentral moment of T exists and is given by
where
Proof. By using the stochastic representation for the SPHN distribution given in (6), we have that
On the one hand, we have that
exists for
and
On the other hand,
is the rth-moment of a
model given in (
2).
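The factor contributed by the uniform variable can be worked out directly. The following short derivation assumes the representation $T = X/U^{1/q}$ with $X$ and $U \sim U(0,1)$ independent (our reading of (6)); the closed form of $E[X^r]$ is the one referred to in (2) and is not reproduced here.

```latex
E[T^r] = E[X^r]\,E\!\left[U^{-r/q}\right], \qquad
E\!\left[U^{-r/q}\right] = \int_0^1 u^{-r/q}\,du
  = \left[\frac{u^{\,1-r/q}}{1-r/q}\right]_0^1
  = \frac{q}{q-r}, \quad q > r .
```

The integral converges only when $r/q < 1$, which is where the condition $q > r$ comes from.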
Remark 3. Note that from Proposition 6, given , for , is infinity.
From (2) and (11), the following relationship between the moments of the SPHN and PHN models follows.
Corollary 3. Let and . Then
Corollary 4. Let . Then
where
For illustrative purposes, the expected value, variance and mode for different values of the parameters in the SPHN model are given in Table 2. We observe that the expected value, variance and mode decrease as q increases for the values of the parameters under consideration.
Corollary 5. Let . Then, the skewness, , and kurtosis, , coefficients are, for ,
and for ,
where .
Remark 4. The skewness and kurtosis coefficients were obtained by using
Figures 4 and 5 provide plots of the skewness and kurtosis coefficients in the SPHN distribution. Both coefficients depend on the parameters $\alpha$ and $q$, but not on $\sigma$, since $\sigma$ is a scale parameter.
2.4. Stochastic Ordering
Proposition 7. Let , , and with (σ, fixed). Then, X is stochastically smaller than , , and is stochastically smaller than , . So, as a summary, we can write
Proof. From Corollary 2, and the fact that
, defined in (
8), is a decreasing function of
q, we can write the following relationship among the survival functions of
X,
and
It can be seen in [
16] that this is the definition of stochastic order, and therefore, (
15) follows. □
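The monotonicity behind this ordering can also be seen directly from the stochastic representation. The sketch below assumes $T_q = X/U^{1/q}$ with $X \sim PHN(\sigma,\alpha)$ independent of $U \sim U(0,1)$, and writes $F_X$ and $S_X$ for the cdf and survival function of $X$; the notation $T_q$, $S_{T_q}$ is ours and the argument is an illustration, not the proof given above.

```latex
S_{T_q}(t) = P\!\left(X > t\,U^{1/q}\right)
           = 1 - \int_0^1 F_X\!\left(t\,u^{1/q}\right)du , \qquad t > 0 .
% For 0 < u < 1, u^{1/q} increases towards 1 as q grows, and F_X is non-decreasing, hence
q_1 > q_2 \;\Longrightarrow\;
S_X(t) \;\le\; S_{T_{q_1}}(t) \;\le\; S_{T_{q_2}}(t), \qquad t > 0 .
```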
Corollary 6. Let , , and with (σ, fixed). Then
Proof. Since for
,
exists
, then, it can be seen in [
16] that from (
15) follows (
16). □
In addition, some relationships can be given for the order statistics of these distributions.
Proposition 8. Let be a random sample of , and let us denote by the jth order statistic in this sample, . Similarly, let us consider a random sample of , with , and let denote the jth order statistic for the sample of . Then, the jth order statistics are also stochastically ordered; explicitly,
Proof. It can be seen in [
16] that this result is a consequence of (
15). □
4. Simulation Study
In this section, a simulation study is conducted to investigate the performance of ML estimation for the parameters $\sigma$, $\alpha$ and $q$ in the SPHN model. Specifically, 1000 random samples of sizes $n = 50$, 100 and 200 were generated under the SPHN model by using the algorithm given below. A summary of the results obtained in this study is given in Table 3. The empirical means correspond to the means of the estimated parameters over the 1000 simulated samples. The SE given in
Table 3 is the average of the standard errors obtained in every simulation,
, which were calculated as the square root of the corresponding diagonal element in the inverse of the observed information matrix. Moreover, Asymptotic Confidence Intervals (ACIs) at confidence level
,
, have been built based on the asymptotic normality of MLEs. Specifically,
The confidence level is 0.95. To assess the performance of these intervals, the empirical coverage probability (CP) of (25) has been included in Table 3, that is, the proportion of ACIs that contain the true value of the parameter.
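The interval and coverage computations described here can be sketched as follows. This assumes the usual Wald form, estimate $\pm\, z \times$ SE, which is what asymptotic normality of the MLEs gives; the function names and the toy inputs are hypothetical, and no value is taken from Table 3.

```python
from statistics import NormalDist


def wald_interval(estimate: float, std_error: float, level: float = 0.95):
    """Asymptotic confidence interval estimate +/- z * SE at the given level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. about 1.96 for level 0.95
    return estimate - z * std_error, estimate + z * std_error


def coverage_probability(estimates, std_errors, true_value: float, level: float = 0.95) -> float:
    """Proportion of intervals, one per simulated sample, that contain the true value."""
    hits = 0
    for est, se in zip(estimates, std_errors):
        lower, upper = wald_interval(est, se, level)
        if lower <= true_value <= upper:
            hits += 1
    return hits / len(estimates)


# Toy illustration with made-up estimates and standard errors (not simulation output).
toy_cp = coverage_probability([1.0, 1.1, 3.0], [0.1, 0.1, 0.1], true_value=1.0)
```

In a study such as the one above, `estimates` and `std_errors` would hold the ML estimate and its estimated standard error for each of the simulated samples.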
In
Table 3, RMSE denotes the square root of the empirical mean squared error: for instance, for
, it is calculated as
and so on.
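As a small illustration of this summary, the empirical RMSE of a set of estimates around the true parameter value can be computed as below. The function name and the inputs are ours; the list is meant to hold the estimates of one parameter over the simulated samples.

```python
import math


def rmse(estimates, true_value: float) -> float:
    """Square root of the empirical mean squared error around the true value."""
    mse = sum((est - true_value) ** 2 for est in estimates) / len(estimates)
    return math.sqrt(mse)
```

Note that the RMSE combines both the bias and the variability of the estimator, so it can be larger than the average standard error when the estimator is biased in small samples.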
Next, the algorithm used to generate samples from the SPHN model is introduced. Algorithm 1 is based on (6) and the inversion of the cdf given in (1).
Algorithm 1: Generating samples from .
- 1: Simulate .
- 2: Compute .
- 3: Simulate .
- 4: Compute .
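The four steps of Algorithm 1 can be sketched in Python as follows. Since the formulas inside the steps are not reproduced above, this sketch assumes that (6) is $T = X/U^{1/q}$ and that the cdf in (1) is the PHN cdf $F(x) = [2\Phi(x/\sigma)-1]^{\alpha}$, whose inverse gives $X = \sigma\,\Phi^{-1}\!\big((1 + W^{1/\alpha})/2\big)$ with $W \sim U(0,1)$. The function names are ours.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def rphn(sigma: float, alpha: float, rng: random.Random) -> float:
    """One PHN draw by inverting the assumed cdf (2*Phi(x/sigma) - 1)**alpha."""
    w = rng.random()                          # step 1: W ~ U(0, 1)
    p = (1.0 + w ** (1.0 / alpha)) / 2.0      # step 2: probability fed to Phi^{-1}
    p = min(p, 1.0 - 2.0 ** -53)              # keep p strictly below 1 for inv_cdf
    return sigma * _STD_NORMAL.inv_cdf(p)


def rsphn(n: int, sigma: float, alpha: float, q: float, seed=None):
    """n SPHN draws via the assumed representation T = X / U**(1/q)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rphn(sigma, alpha, rng)
        u = 1.0 - rng.random()                # step 3: U in (0, 1], avoids division by zero
        sample.append(x / u ** (1.0 / q))     # step 4: T = X / U^{1/q}
    return sample
```

Passing a fixed `seed` makes the generated sample reproducible, which is convenient when comparing estimation methods on the same simulated data.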
As conclusions of this simulation study, we highlight that, as the sample size increases, the estimates become closer to the true parameter values, and the estimated standard errors and RMSE become smaller. This behavior agrees with the consistency of the proposed estimators. As for the ACIs, the results are satisfactory: their empirical CP approaches the nominal 0.95 confidence level as n increases.
Following reviewers’ recommendations, plots similar to the ones proposed in [19] have been produced to illustrate the results in Table 3. So, the empirical coverage probabilities obtained for the asymptotic confidence intervals at 95% for
,
and
q for the sample sizes
have been plotted in
Figure 6. The columns correspond to the cases
,
and
in
Table 3 and the panels by rows to
. It can be appreciated there that, in all cases, the empirical coverage probability approaches the confidence level 0.95 as the sample size increases. This plot also suggests that the approximate confidence intervals for
q and
perform better than those for
.
5. Applications
In this section, two real data sets with high kurtosis levels are considered. The PHN, GHN, Slash Power Maxwell (SPM) introduced in Segovia et al. [12], SGHN introduced in Olmos et al. [10] and SPHN distributions are fitted to these data sets. Details about these models can be seen in Appendix C.
The parameters are estimated by ML. The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), histograms and Q-Q plots are used to compare these models.
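For reference, both criteria are simple functions of the maximized log-likelihood. A minimal sketch, using the standard definitions AIC $= 2k - 2\ell$ and BIC $= k\log n - 2\ell$, where $k$ is the number of parameters and $n$ the sample size; the function names and the log-likelihood value in the example are hypothetical placeholders, not results from the applications.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2*loglik (smaller is better)."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k*log(n) - 2*loglik (smaller is better)."""
    return k * math.log(n) - 2.0 * loglik


# Hypothetical example: a three-parameter model fitted to n = 101 observations
# with a made-up maximized log-likelihood of -100.
example_aic = aic(-100.0, 3)
example_bic = bic(-100.0, 3, 101)
```

Because BIC penalizes each parameter by $\log n$ rather than 2, it favors more parsimonious models than AIC whenever $n > e^2 \approx 7.4$.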
5.1. Application 1
Let us consider the Kevlar 49/epoxy data set, which corresponds to fatigue fracture under constant pressure at the 90% stress level until failure occurred. This data set has been previously analyzed by Andrews and Herzberg [20], Barlow et al. [21] and Olmos et al. [9,10], among others. The data set consists of 101 observations and contains outliers. The data are given in Table 4.
In Table 5, the descriptive analysis is provided. We can see that this data set exhibits a high sample kurtosis coefficient, so it is of interest to see how our model performs here.
For the SPHN model, the moment estimates are , and . These estimates are used as starting values to get the ML estimates by using numerical methods.
Table 6 shows the estimated parameters for each model under consideration. According to the AIC and BIC criteria, the SPHN distribution is preferred over the GHN, PHN, SPM and SGHN distributions, since its AIC and BIC are the smallest.
Figure 7 shows the histogram of the fatigue fracture data set, along with the distributions fitted by ML under the SPHN, GHN and SGHN models. The Q-Q plot is also included to assess the fit provided by the SPHN model to this data set.
5.2. Application 2
Here, the data set previously analyzed by Gómez and Bolfarine [11] is considered. This data set corresponds to 72 survival times of guinea pigs injected with different doses of tubercle bacilli, which are given in Table 7.
The moment estimates for the parameters in the SPHN model are: , and . Again, these estimates are used as initial values to get the ML estimates by using numerical methods.
In Table 8, the descriptive analysis is given. In view of the sample kurtosis coefficient reported there, it is also of interest to see whether the SPHN distribution can provide a good fit to this data set.
Table 9 shows the estimated parameters for each distribution. If we apply the statistical information criteria then, in all cases, both criteria choose the SPHN model over the GHN, PHN, SPM and SGHN distributions.
Figure 8 shows the histograms for the guinea pigs survival time data along with the fitted distributions: SPHN, LogNormal, and SGHN, whose parameters were estimated by ML. The QQ-plot is also included for the proposed SPHN model, which provides the best fit to this data set.
6. Conclusions
This paper introduces the SPHN distribution, which is built from the PHN distribution by using the slash methodology given in (6). In this way, a model with higher kurtosis than the PHN is obtained. The SPHN is a three-parameter model whose right tail is heavier for smaller values of the kurtosis parameter q. Relevant results of interest in reliability are discussed, such as the cdf, survival and hazard rate functions and stochastic orderings. The convergence in distribution to the PHN model as the kurtosis parameter q increases is studied, along with the relationships with the PHN, SHN and HN models. All these relationships are summarized in Figure 3 and enhance the interest of our model. It is shown that the SPHN can be expressed as a scale mixture of a PHN and a uniform distribution. This property allows us to propose an algorithm to generate random values from the SPHN model. The unknown parameters in the model are estimated via ML. A simulation study is given in which the good properties of the ML estimators can be seen. As applications, two real data sets with high (Application 1) and moderate (Application 2) kurtosis are considered. Several common models are considered as competitors of the SPHN. By applying information criteria (AIC and BIC), it is shown that our proposal provides the best fit to these data sets among the models considered. For this reason, it is of interest to promote the use and applications of this model.