
Symmetry 2019, 11(11), 1410; https://doi.org/10.3390/sym11111410

Article
Modified Power-Symmetric Distribution
by
¹ Department of Quantitative Methods in Economics and TIDES Institute, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
² Departamento de Matemáticas, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
³ Centre for Actuarial Studies, Department of Economics, The University of Melbourne, Melbourne, VIC 3010, Australia
\* Author to whom correspondence should be addressed.
Received: 2 October 2019 / Accepted: 13 November 2019 / Published: 15 November 2019

## Abstract

In this paper, a general class of modified power-symmetric distributions is introduced. By choosing the normal distribution as the symmetric model, the modified power-normal distribution is obtained. For the latter model, some of its more relevant statistical properties are examined. Parameter estimation is carried out by using the method of moments and maximum likelihood. A simulation analysis is conducted to study the performance of the maximum likelihood estimators. Finally, we compare the efficiency of the modified power-normal distribution with that of other existing distributions in the literature by using a real dataset.
Keywords:
maximum likelihood; kurtosis; power-normal distribution

## 1. Introduction

Over the last few years, the search for flexible probabilistic families capable of modeling different levels of skewness and kurtosis has been an issue of great interest in the field of distribution theory. This interest was mainly motivated by the seminal work of Azzalini [1], in which the probability density function (pdf) of a skew-symmetric distribution was introduced. The expression of this density is given by
$g(z; \lambda) = 2 f(z)\,G(\lambda z), \quad z, \lambda \in \mathbb{R}, \tag{1}$
where $f ( · )$ is a symmetric pdf about zero; $G ( · )$ is an absolutely continuous distribution function, which is also symmetric about zero; and $λ$ is a parameter of asymmetry. For the case where $f ( · )$ is the standard normal density (from now on, we reserve the symbol $ϕ$ for this function), and $G ( · )$ is the standard normal cumulative distribution function (henceforth, denoted by $Φ$), the so-called skew-normal ($SN$) distribution with density
$\phi_Z(z; \lambda) = 2\,\phi(z)\,\Phi(\lambda z), \quad z, \lambda \in \mathbb{R}, \tag{2}$
is obtained. We use the notation $Z \sim SN(\lambda)$ to denote a random variable Z with the pdf given by Equation (2). A generalization of the $SN$ distribution was introduced by Arellano-Valle et al. [2], and Arellano-Valle et al. [3] studied Fisher’s information matrix of this generalization. For further details about the $SN$ distribution, the reader is referred to Azzalini [4]. Martínez-Flórez et al. [5] used generalizations of the $SN$ distribution to extend the Birnbaum–Saunders model, and Contreras-Reyes and Arellano-Valle [6] utilized the Kullback–Leibler divergence measure to compare the multivariate normal distribution with the skew-multivariate normal.
One of the main limitations of working with the family given by Equation (1) is that the information matrix can be singular for some of its particular models (see Azzalini [1]). This might lead to difficulties in estimation, due to the asymptotic convergence of the maximum likelihood (ML) estimators. To overcome this issue, some authors (see Chiogna [7] or Arellano-Valle and Azzalini [8]) have used a reparametrization of the $SN$ model to obtain a nonsingular information matrix. However, this methodology cannot be extended to all types of skew-symmetric models that suffer from this convergence problem. On the other hand, the family of power-symmetric ($PS$) distributions does not have this problem of singularity in the information matrix (see Pewsey et al. [9]). The pdf of this family of distributions is given by
$\varphi_F(z; \alpha) = \alpha\, f(z)\,\{F(z)\}^{\alpha-1}, \quad z \in \mathbb{R},\ \alpha \in \mathbb{R}^+, \tag{3}$
where $F ( · )$ is itself a cumulative distribution function (cdf) and $α$ is the shape parameter. For the particular case that $F ( · ) = Φ ( · )$, the power-normal ($PN$) distribution is obtained, with density given by
$f(z; \alpha) = \alpha\,\phi(z)\,\{\Phi(z)\}^{\alpha-1}, \quad z \in \mathbb{R},\ \alpha \in \mathbb{R}^+. \tag{4}$
For some references where this family is discussed, the reader is referred to Lehmann [10], Durrans [11], Gupta and Gupta [12], and Pewsey et al. [9], among others. Other extensions of this model are given in Martínez-Flórez et al. [13], where a multivariate version of the model is introduced; Martínez-Flórez et al. [14] carried out applications using regression models; Martínez-Flórez et al. [15] examined the exponential transformation of the model; and Martínez-Flórez et al. [16] examined a doubly censored version of the model with inflation in a regression context. Truncations of the $PN$ distribution were considered by Castillo et al. [17].
In this paper, a modification of the pdf of the $PS$ probabilistic family is implemented to increase the degree of kurtosis, so that the resulting model can accommodate datasets that include atypical observations. Usually, such additional flexibility is achieved by increasing the number of parameters in the model; the modification proposed here does not add any.
The paper is organized as follows. In Section 2, we first introduce the modified power-symmetric distribution; then, the particular case of the modified power-normal distribution is derived, and some of its most relevant statistical properties, including moments and the kurtosis coefficient, are presented. In Section 3, some methods of estimation are discussed, and a simulation study is provided to illustrate the behavior of the maximum likelihood estimators. A numerical application in which the modified power-normal distribution is compared with the $SN$ and $PN$ distributions is given in Section 4. Finally, Section 5 concludes the paper.

## 2. Genesis and Properties of Modified Power-Normal Distribution

In this section, we introduce a new family of probability distributions. The idea is to apply a transformation to a given probability density, as the skew-symmetric and power-symmetric families do. As there exists a certain resemblance between our formula (Equation (6)) and that of the power-symmetric distributions (Equation (3)), we name these new distributions modified power-symmetric $(MPS)$ distributions. From the standard normal distribution, we obtain the so-called modified power-normal $(MPN)$ distribution. The main parameters and properties of this particular distribution are studied throughout this work.

#### 2.1. Probability Density Function

Definition 1.
Let Z be a continuous and symmetric random variable with cdf $G(z; \eta)$ and pdf $g(z; \eta)$, where $\eta$ denotes a vector of parameters. We say that a random variable X follows a $MPS$ distribution, denoted by $X \sim MPS(\eta, \alpha)$, if its cdf is given by
$F(x; \eta, \alpha) = \frac{\left[1 + G(x; \eta)\right]^{\alpha} - 1}{2^{\alpha} - 1}, \tag{5}$
and its pdf is given by
$f(x; \eta, \alpha) = \frac{\alpha}{2^{\alpha} - 1}\, g(x; \eta)\left[1 + G(x; \eta)\right]^{\alpha - 1}, \tag{6}$
where $x \in \mathbb{R}$ and $\alpha > 0$.
Remark 1.
In the case $\alpha = 1$, the transformation given by Equation (6) is the identity; that is, for $\alpha = 1$ the $MPS$ distribution always coincides with the input probability density function.
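As a quick numerical illustration, the MPS cdf and pdf defined above can be coded directly. The function names and the choice of the standard logistic as the symmetric base model G are ours; the sketch checks that the cdf runs from 0 to 1 and that $\alpha = 1$ returns the base density, as noted in Remark 1:

```python
import numpy as np
from scipy.stats import logistic

# MPS cdf and pdf as defined above; function names are ours, and the
# standard logistic is chosen here as the symmetric base model G.
def mps_cdf(x, alpha, G=logistic.cdf):
    return ((1.0 + G(x)) ** alpha - 1.0) / (2.0 ** alpha - 1.0)

def mps_pdf(x, alpha, G=logistic.cdf, g=logistic.pdf):
    return alpha / (2.0 ** alpha - 1.0) * g(x) * (1.0 + G(x)) ** (alpha - 1.0)

print(mps_cdf(-40.0, 2.5), mps_cdf(40.0, 2.5))       # ~0 and ~1
print(np.isclose(mps_pdf(0.7, 1.0), logistic.pdf(0.7)))  # True
```

Any symmetric base with an absolutely continuous cdf can be swapped in for the logistic here.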
Henceforth, we proceed to examine the $MPN$ distribution, whose cdf is provided by
$F(x; \mu, \sigma, \alpha) = \frac{\left[1 + \Phi\left(\frac{x-\mu}{\sigma}\right)\right]^{\alpha} - 1}{2^{\alpha} - 1}, \tag{7}$
and whose pdf is given by
$f(x; \mu, \sigma, \alpha) = \frac{\alpha}{(2^{\alpha} - 1)\,\sigma}\,\phi\left(\frac{x-\mu}{\sigma}\right)\left[1 + \Phi\left(\frac{x-\mu}{\sigma}\right)\right]^{\alpha - 1}, \tag{8}$
where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$ is the location parameter, $\sigma > 0$ is the scale parameter, and $\alpha > 0$ is the shape parameter. Hereafter, this will be denoted as $X \sim MPN(\mu, \sigma, \alpha)$. Figure 1 depicts some different shapes of the pdf of this model for selected values of the parameter $\alpha$, with $\mu = 0$ and $\sigma = 1$. The $MPN$ class of distributions is applicable to the change point problem, due to its favorable properties (see Maciak et al. [18]); moreover, the $MPN$ model can be utilized in calibration (see Pešta [19]).
Remark 2.
Here, $\mu \in \mathbb{R}$ and $\sigma > 0$ are location and scale parameters of the $MPN$ distribution, respectively. For the particular case $\alpha = 1$, they are not only location and scale parameters but also the mean and standard deviation of the resulting normal distribution.
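A minimal sketch of the MPN density and distribution function above (the function names are ours), checking that the density integrates to one and that its partial integral reproduces the cdf:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# MPN pdf and cdf as defined above; function names are ours.
def mpn_pdf(x, mu, sigma, alpha):
    z = (x - mu) / sigma
    return alpha / ((2.0 ** alpha - 1.0) * sigma) * norm.pdf(z) \
        * (1.0 + norm.cdf(z)) ** (alpha - 1.0)

def mpn_cdf(x, mu, sigma, alpha):
    z = (x - mu) / sigma
    return ((1.0 + norm.cdf(z)) ** alpha - 1.0) / (2.0 ** alpha - 1.0)

# The density integrates to one, and its partial integral matches the cdf.
total, _ = quad(mpn_pdf, -np.inf, np.inf, args=(-1.0, 2.0, 4.0))
left, _ = quad(mpn_pdf, -np.inf, 1.5, args=(-1.0, 2.0, 4.0))
print(total, left - mpn_cdf(1.5, -1.0, 2.0, 4.0))  # ~1.0 and ~0.0
```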

#### 2.2.1. Shape of the Density

The $MPN$ distribution exhibits a bell-shaped form, which can be symmetric or positively or negatively skewed depending on the value of the parameter $α$. Now, we derive some analytical expressions that are useful to obtain approximations of modal values and inflection points of this model. In the following, it will be assumed that $μ = 0$ and $σ = 1$.
Proposition 1.
The pdf of $X \sim MPN(0, 1, \alpha)$ has a local maximum at $(x_1, f(x_1; \alpha))$ and two inflection points at $(x_2, f(x_2; \alpha))$ and $(x_3, f(x_3; \alpha))$, where $x_1$ is the root of the equation
$x = (\alpha - 1)\,\frac{\phi(x)}{1 + \Phi(x)},$
and $x_2$ and $x_3$ are the two solutions of the equation
$1 = \left[-x + (\alpha - 1)\,\frac{\phi(x)}{1 + \Phi(x)}\right]^2 - (\alpha - 1)\,\frac{\phi(x)}{1 + \Phi(x)}\left[x + \frac{\phi(x)}{1 + \Phi(x)}\right].$
Proof.
The proof consists of taking simple derivatives of the function f. From Equation (8), we calculate
$\frac{\partial}{\partial x} f(x; \alpha) = \frac{\alpha}{2^{\alpha}-1}\,\phi(x)\,[1+\Phi(x)]^{\alpha-1}\left[-x + (\alpha-1)\,\frac{\phi(x)}{1+\Phi(x)}\right], \tag{9}$
$\frac{\partial^2}{\partial x^2} f(x; \alpha) = \frac{\alpha}{2^{\alpha}-1}\,\phi(x)\,[1+\Phi(x)]^{\alpha-1}\left\{\left[-x + (\alpha-1)\,\frac{\phi(x)}{1+\Phi(x)}\right]^2 - 1 - (\alpha-1)\,\frac{\phi(x)}{1+\Phi(x)}\left[x + \frac{\phi(x)}{1+\Phi(x)}\right]\right\}. \tag{10}$
By setting Equations (9) and (10) to be equal to zero, the results are obtained after some algebra. Figure 2 displays the graph of the first derivative of $f ( · )$, where it is observed that the maximum exists and it is unique. Therefore, the $MPN$ distribution is unimodal. □
Remark 3.
The solutions of Equations (9) and (10) can be numerically obtained by using the built-in function “uniroot” in the software package R. Table 1 below illustrates some approximations of the roots $x 1$, $x 2$ , and $x 3$ , and the corresponding figures of the pdf evaluated at these values.

#### 2.2.2. Moments

Proposition 2.
The rth moment of $X \sim MPN(0, 1, \alpha)$, for $r = 1, 2, 3, \ldots$, is given by
$E(X^r) = \frac{\alpha}{2^{\alpha}-1}\,a_r(\alpha), \tag{11}$
where $a_r(\alpha)$ is defined as
$a_r(\alpha) = \int_0^1 \left[\Phi^{-1}(u)\right]^r (1+u)^{\alpha-1}\,du. \tag{12}$
Here, $\Phi^{-1}(\cdot)$ is the quantile function of the standard normal distribution.
Proof.
By using the change of variable $u = \Phi(x)$, it follows that
$E(X^r) = \int_{-\infty}^{\infty} x^r\,\frac{\alpha}{2^{\alpha}-1}\,\phi(x)\,[1+\Phi(x)]^{\alpha-1}\,dx = \frac{\alpha}{2^{\alpha}-1}\int_0^1 \left[\Phi^{-1}(u)\right]^r (1+u)^{\alpha-1}\,du = \frac{\alpha}{2^{\alpha}-1}\,a_r(\alpha).$ □
Corollary 1.
The mean and variance of X are given by
$E(X) = \frac{\alpha}{2^{\alpha}-1}\,a_1(\alpha) \quad \text{and} \quad Var(X) = \frac{\alpha}{2^{\alpha}-1}\left[a_2(\alpha) - \frac{\alpha}{2^{\alpha}-1}\,a_1^2(\alpha)\right],$
respectively.
Corollary 2.
The skewness ($β 1$) and kurtosis ($β 2$) coefficients are, respectively, given by
$\beta_1 = \dfrac{a_3(\alpha) - \dfrac{3\alpha}{2^{\alpha}-1}\,a_1(\alpha)\,a_2(\alpha) + \dfrac{2\alpha^2}{(2^{\alpha}-1)^2}\,a_1^3(\alpha)}{\left[\dfrac{\alpha}{2^{\alpha}-1}\right]^{1/2}\left[a_2(\alpha) - \dfrac{\alpha}{2^{\alpha}-1}\,a_1^2(\alpha)\right]^{3/2}} \quad \text{and} \quad \beta_2 = \dfrac{a_4(\alpha) - \dfrac{4\alpha}{2^{\alpha}-1}\,a_1(\alpha)\,a_3(\alpha) + \dfrac{6\alpha^2}{(2^{\alpha}-1)^2}\,a_1^2(\alpha)\,a_2(\alpha) - \dfrac{3\alpha^3}{(2^{\alpha}-1)^3}\,a_1^4(\alpha)}{\dfrac{\alpha}{2^{\alpha}-1}\left[a_2(\alpha) - \dfrac{\alpha}{2^{\alpha}-1}\,a_1^2(\alpha)\right]^{2}}.$
Remark 4.
Observe that the integral in Equation (12) can be numerically approximated by using the built-in function “integrate” available in the software package R. Below, in Table 2, some approximations of the mean and variance of the $MPN$ distribution for different values of α are displayed. Figure 3 illustrates the behavior of $E(X)$ and $Var(X)$ of the $MPN$ distribution for different values of α. It can be observed that, as α grows, the mean increases and the variance decreases.
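A sketch of the same computation using SciPy's `quad` instead of R's “integrate” (the helper names are ours), reproducing the $\alpha = 5$ row of Table 2:

```python
from scipy.stats import norm
from scipy.integrate import quad

def a(r, alpha):
    # a_r(alpha) of Equation (12); quad handles the integrable
    # singularities of the normal quantile at u = 0 and u = 1.
    val, _ = quad(lambda u: norm.ppf(u) ** r * (1.0 + u) ** (alpha - 1.0), 0.0, 1.0)
    return val

def mean_var(alpha):
    c = alpha / (2.0 ** alpha - 1.0)
    m = c * a(1, alpha)
    v = c * (a(2, alpha) - c * a(1, alpha) ** 2)
    return m, v

print(mean_var(5.0))  # ~(0.659, 0.770), matching Table 2
```

For $\alpha = 1$ the same routine returns mean 0 and variance 1, as expected for the standard normal case.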
Figure 4 displays the curves associated with the coefficients of skewness (left panel) and kurtosis (right panel) of the $MPN$ and $PN$ distributions. It is shown that, depending on the value of α, the $MPN$ distribution exhibits equal, greater, or smaller values of these coefficients compared to the $PN$ model. In general, the $MPN$ distribution has a smaller range of skewness than the $PN$ distribution. On the other hand, when $\alpha < 13.05$, the $MPN$ distribution has a greater kurtosis coefficient than the $PN$ model.

#### 2.2.3. Stochastic Ordering

Stochastic ordering is an important tool for comparing continuous random variables. It is well known that a random variable $X_1$ is smaller than a random variable $X_2$ in the stochastic order ($X_1 \leq_{st} X_2$) if $F_{X_1}(x) \geq F_{X_2}(x)$ for all x, and in the likelihood ratio order ($X_1 \leq_{lr} X_2$) if $f_{X_1}(x)/f_{X_2}(x)$ decreases in x. Using Theorems 1.C.1 and 2.A.1 of Shaked and Shanthikumar [20], these stochastic orders satisfy the implication
$X_1 \leq_{lr} X_2 \;\Longrightarrow\; X_1 \leq_{st} X_2. \tag{13}$
The following proposition shows that the members of the $MPN$ family can be stochastically ordered according to their parameter values.
Proposition 3.
Let $X_1 \sim MPN(0, 1, \alpha_1)$ and $X_2 \sim MPN(0, 1, \alpha_2)$. If $\alpha_1 > \alpha_2$, then $X_2 \leq_{lr} X_1$ and, therefore, $X_2 \leq_{st} X_1$.
Proof.
From the quotient of both densities, it follows that
$\frac{f_{X_2}(x; \alpha_2)}{f_{X_1}(x; \alpha_1)} = \frac{\alpha_2}{\alpha_1}\cdot\frac{2^{\alpha_1}-1}{2^{\alpha_2}-1}\,[1+\Phi(x)]^{\alpha_2-\alpha_1},$
which is non-decreasing if and only if $\mu'(x) \geq 0$ for $x \in (-\infty, \infty)$, where
$\mu(x) = [1+\Phi(x)]^{\alpha_2-\alpha_1}.$
After some calculations, it is shown that
$\mu'(x) = (\alpha_2-\alpha_1)\,\phi(x)\,[1+\Phi(x)]^{\alpha_2-\alpha_1-1}.$
It is straightforward that, for $\alpha_1 > \alpha_2$, $\mu'(x) < 0$ for $x \in (-\infty, \infty)$. Therefore, $f_{X_2}(x; \alpha_2)/f_{X_1}(x; \alpha_1)$ is decreasing in x; consequently, $X_2 \leq_{lr} X_1$, and the stochastic order follows immediately from (13). □
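A numerical check of this ordering (the function name is ours): comparing the cdfs of $MPN(0, 1, \alpha)$ pointwise shows that the member with the larger shape parameter has the smaller cdf, i.e., it is the stochastically larger one:

```python
import numpy as np
from scipy.stats import norm

# cdf of MPN(0, 1, alpha); the name F is ours.
def F(x, alpha):
    return ((1.0 + norm.cdf(x)) ** alpha - 1.0) / (2.0 ** alpha - 1.0)

x = np.linspace(-6.0, 6.0, 1001)
print(np.all(F(x, 5.0) <= F(x, 2.0)))  # True: alpha = 5 dominates alpha = 2
```

This agrees with Table 2, where the mean of the distribution increases with $\alpha$.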

## 3. Inference

In this section, parameter estimation for the $MPN$ distribution is discussed by using the method of moments and ML estimation. Additionally, a simulation analysis is carried out to illustrate the behavior of the ML estimators.

#### 3.1. Method of Moments

The following proposition illustrates the derivation of the moment estimates of the $MPN$ distribution.
Proposition 4.
Let $x_1, \ldots, x_n$ be a random sample obtained from the random variable $X \sim MPN(\mu, \sigma, \alpha)$. Then, the moment estimates $\hat{\theta}_M = (\hat{\mu}_M, \hat{\sigma}_M, \hat{\alpha}_M)$ of $\theta = (\mu, \sigma, \alpha)$ are given by
$\hat{\sigma}_M = \dfrac{S_x}{\sqrt{\dfrac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\left[a_2(\hat{\alpha}_M) - \dfrac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\,a_1^2(\hat{\alpha}_M)\right]}}, \qquad \hat{\mu}_M = \bar{x} - \hat{\sigma}_M\,\dfrac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\,a_1(\hat{\alpha}_M), \tag{14}$
where $\hat{\alpha}_M$ is the solution of
$\dfrac{a_3(\hat{\alpha}_M) - \dfrac{3\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\,a_1(\hat{\alpha}_M)\,a_2(\hat{\alpha}_M) + \dfrac{2\hat{\alpha}_M^2}{(2^{\hat{\alpha}_M}-1)^2}\,a_1^3(\hat{\alpha}_M)}{\left[\dfrac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\right]^{1/2}\left[a_2(\hat{\alpha}_M) - \dfrac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\,a_1^2(\hat{\alpha}_M)\right]^{3/2}} - A_x = 0, \tag{15}$
and $\bar{x}$, $S_x$, and $A_x$ denote the sample mean, sample standard deviation, and sample Fisher's skewness coefficient, respectively.
Proof.
As $\mu$ and $\sigma$ are location and scale parameters, respectively, the skewness coefficient does not depend on them. Thus, Equation (15) is obtained directly by matching the sample skewness coefficient with its population counterpart given in Corollary 2. In addition, by writing $X = \sigma Z + \mu$, where $Z \sim MPN(0, 1, \alpha)$, and equating the sample mean and sample variance to their population counterparts, it follows that
$\bar{x} = \hat{\sigma}_M\,E(Z) + \hat{\mu}_M = \hat{\sigma}_M\,\frac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\,a_1(\hat{\alpha}_M) + \hat{\mu}_M$
and
$S_x^2 = \hat{\sigma}_M^2\,Var(Z) = \hat{\sigma}_M^2\,\frac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\left[a_2(\hat{\alpha}_M) - \frac{\hat{\alpha}_M}{2^{\hat{\alpha}_M}-1}\,a_1^2(\hat{\alpha}_M)\right],$
where $\hat{\alpha}_M$ satisfies Equation (15). Then, Equation (14) is obtained by solving these two equations for $\hat{\mu}_M$ and $\hat{\sigma}_M$, respectively. □

#### 3.2. Maximum Likelihood Estimation

For a random sample $x_1, \ldots, x_n$ from the $MPN(\mu, \sigma, \alpha)$ distribution, the log-likelihood function can be written as
$\ell(\mu, \sigma, \alpha) = n\,c(\sigma, \alpha) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 + (\alpha-1)\sum_{i=1}^{n}\log\left[1+\Phi\left(\frac{x_i-\mu}{\sigma}\right)\right], \tag{16}$
where $c(\sigma, \alpha) = \log(\alpha) - \log(2^{\alpha}-1) - \log(\sigma) - \frac{1}{2}\log(2\pi)$. The score equations are given by
$n\mu + \sigma(\alpha-1)\sum_{i=1}^{n}\kappa(x_i) = n\bar{x}, \tag{17}$
$n\sigma^2 + \sigma(\alpha-1)\sum_{i=1}^{n}(x_i-\mu)\,\kappa(x_i) = \sum_{i=1}^{n}(x_i-\mu)^2, \tag{18}$
$\frac{n}{\alpha} + \sum_{i=1}^{n}\log\left[1+\Phi\left(\frac{x_i-\mu}{\sigma}\right)\right] = \frac{n\,2^{\alpha}\log(2)}{2^{\alpha}-1}, \tag{19}$
where
$\kappa(w) = \kappa(w; \mu, \sigma) = \frac{\phi\left(\frac{w-\mu}{\sigma}\right)}{1+\Phi\left(\frac{w-\mu}{\sigma}\right)}.$
Solutions of Equations (17)–(19) can be obtained by numerical procedures such as the Newton–Raphson algorithm. Alternatively, these estimates can be found by directly maximizing the log-likelihood surface given by Equation (16) using the subroutine “optim” in the R software [21].
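A sketch of the direct-maximization route in Python rather than R (the function names, the log-parametrization, and the synthetic data are ours); since $MPN(\mu, \sigma, 1)$ is just $N(\mu, \sigma)$, a normal sample makes a convenient test case:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def nll(theta, x):
    # Negative log-likelihood of Equation (16); sigma and alpha are
    # kept positive through a log-parametrization.
    mu, sigma, alpha = theta[0], np.exp(theta[1]), np.exp(theta[2])
    z = (x - mu) / sigma
    c = np.log(alpha) - np.log(2.0 ** alpha - 1.0) - np.log(sigma) \
        - 0.5 * np.log(2.0 * np.pi)
    ll = x.size * c - 0.5 * np.sum(z ** 2) \
        + (alpha - 1.0) * np.sum(np.log1p(norm.cdf(z)))
    return -ll

rng = np.random.default_rng(2019)
data = rng.normal(0.0, 1.0, 500)              # MPN(0, 1, 1) data
res = minimize(nll, x0=[0.0, 0.0, 0.0], args=(data,), method="Nelder-Mead")
mu_hat, sigma_hat, alpha_hat = res.x[0], np.exp(res.x[1]), np.exp(res.x[2])
print(mu_hat, sigma_hat, alpha_hat)
```

The log-parametrization simply avoids boundary constraints; the score equations above could equally be solved by Newton–Raphson.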

#### 3.3. Simulation Study

To examine the behavior of the proposed approach, a simulation study is carried out to assess the performance of the estimation procedure for the parameters $μ$, $σ$, and $α$ in the $MPN$ model. The simulation analysis is conducted by considering 1000 generated samples of sizes $n = 50 ,$ 100, and 200 from the $MPN$ distribution. The goal of this simulation is to study the behavior of the ML estimators of the parameters by using our proposed procedure. To generate $X ∼ MPN ( μ , σ , α )$, the following algorithm is used,
• Step 1: Generate $W \sim Uniform(0, 1)$.
• Step 2: Compute $X = \mu + \sigma\,\Phi^{-1}\left(\left[(2^{\alpha}-1)\,W + 1\right]^{1/\alpha} - 1\right),$
where $\mu \in \mathbb{R}$, $\sigma > 0$, $\alpha > 0$, and $\Phi^{-1}(\cdot)$ is the quantile function of the standard normal distribution. For each generated sample of the $MPN$ distribution, the ML estimates and the corresponding standard deviation (SD) were computed for each parameter. As can be seen in Table 3, the performance of the estimates improves as n and $\alpha$ increase.
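Steps 1 and 2 above vectorize directly (the function name is ours); for $\alpha = 1$ the sampler must return standard normal draws, which gives a quick sanity check:

```python
import numpy as np
from scipy.stats import norm

def rmpn(n, mu, sigma, alpha, rng):
    w = rng.uniform(size=n)                                        # Step 1
    u = ((2.0 ** alpha - 1.0) * w + 1.0) ** (1.0 / alpha) - 1.0
    return mu + sigma * norm.ppf(u)                                # Step 2

rng = np.random.default_rng(11)
x = rmpn(100_000, 0.0, 1.0, 1.0, rng)   # alpha = 1 reduces to N(0, 1)
print(round(x.mean(), 2), round(x.std(), 2))
```

The same sampler with $\alpha = 5$ produces an empirical mean close to the value 0.659 reported in Table 2.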

#### 3.4. Fisher’s Information Matrix

Let us now consider $X \sim MPN(\mu, \sigma, \alpha)$ and $Z = \frac{X-\mu}{\sigma} \sim MPN(0, 1, \alpha)$. For a single observation x of X, the log-likelihood function for $\theta = (\mu, \sigma, \alpha)$ is given by
$\ell(\theta) = \log f_X(x; \theta) = c(\sigma, \alpha) - \frac{(x-\mu)^2}{2\sigma^2} + (\alpha-1)\log\left[1+\Phi\left(\frac{x-\mu}{\sigma}\right)\right].$
The corresponding first and second partial derivatives of the log-likelihood function are derived in the Appendix A. It can be shown that the Fisher’s information matrix for the $MPN$ distribution is provided by
$I_F(\theta) = \begin{pmatrix} I_{\mu\mu} & I_{\mu\sigma} & I_{\mu\alpha} \\ \cdot & I_{\sigma\sigma} & I_{\sigma\alpha} \\ \cdot & \cdot & I_{\alpha\alpha} \end{pmatrix}$
with the following entries,
$I_{\mu\mu} = \frac{1}{\sigma^2} + \frac{\alpha-1}{\sigma^2}\,(b_{11} + b_{02}), \qquad I_{\mu\sigma} = \frac{2}{\sigma^2}\,E(Z) - \frac{\alpha-1}{\sigma^2}\,(b_{01} - b_{21} - b_{12}), \qquad I_{\mu\alpha} = \frac{1}{\sigma}\,b_{01},$
$I_{\sigma\sigma} = -\frac{1}{\sigma^2} + \frac{3}{\sigma^2}\,E(Z^2) - \frac{\alpha-1}{\sigma^2}\,(2b_{11} - b_{31} - b_{22}), \qquad I_{\sigma\alpha} = \frac{1}{\sigma}\,b_{11}, \qquad I_{\alpha\alpha} = \frac{1}{\alpha^2} - \frac{2^{\alpha}\log^2(2)}{(2^{\alpha}-1)^2},$
where $b i j = E Z i κ j ( Z ; 0 , 1 )$ must be numerically computed.
The Fisher (expected) information matrix is obtained by computing the expected values of the second-derivative expressions given in Appendix A. By setting $\alpha = 1$ in this matrix, we have that $X \sim N(\mu, \sigma)$ and
$I_F(\mu, \sigma, \alpha = 1) = \begin{pmatrix} \frac{1}{\sigma^2} & 0 & \frac{1}{\sigma}\,d_{02} \\ 0 & \frac{2}{\sigma^2} & \frac{1}{\sigma}\,d_{12} \\ \frac{1}{\sigma}\,d_{02} & \frac{1}{\sigma}\,d_{12} & 1 - 2\log^2(2) \end{pmatrix},$
where $d_{ij} = \int_{-\infty}^{\infty} z^i\,\frac{\phi^j(z)}{1+\Phi(z)}\,dz$ must be numerically obtained.
The determinant of $I_F(\mu, \sigma, \alpha = 1)$ is $\left(2 - 4\log^2(2) - d_{12}^2 - 2d_{02}^2\right)/\sigma^4 = 0.003357435/\sigma^4 \neq 0$; consequently, Fisher's information matrix is nonsingular at $\alpha = 1$.
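The nonsingularity claim can be verified numerically (the helper name is ours):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def d(i, j):
    # d_ij = integral of z^i * phi(z)^j / (1 + Phi(z)) over the real line
    f = lambda z: z ** i * norm.pdf(z) ** j / (1.0 + norm.cdf(z))
    return quad(f, -np.inf, np.inf)[0]

det_sigma4 = 2.0 - 4.0 * np.log(2.0) ** 2 - d(1, 2) ** 2 - 2.0 * d(0, 2) ** 2
print(det_sigma4)  # ~0.0033574, nonzero: the matrix is nonsingular at alpha = 1
```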
Therefore, for large samples, the ML estimators, $θ ^$, of $θ$ are asymptotically normal, that is,
$\sqrt{n}\left(\hat{\theta} - \theta\right) \stackrel{L}{\longrightarrow} N_3\!\left(0,\, I(\theta)^{-1}\right),$
resulting in the asymptotic variance of the ML estimators $θ ^$ being the inverse of Fisher’s information matrix $I ( θ ) .$ As the parameters are unknown, the observed information matrix is usually considered, where the unknown parameters are estimated by ML.

## 4. Application

In this section, a numerical illustration based on a real dataset is presented. The goal of this application is to provide empirical evidence that the $MPN$ distribution yields a better fit to the data than the $PN$, $SN$, and Student's t ($TS$, with $\alpha$ degrees of freedom) distributions. For this purpose, we consider a set of 3848 observations of the variable “density” included in the dataset “POLLEN5.DA”, available at http://lib.stat.cmu.edu/datasets/pollen.data. This variable measures a geometric characteristic of a specific type of pollen. This dataset was previously used by Pewsey et al. [9] to compare the $PN$ and $SN$ distributions. A summary of some descriptive statistics is displayed in Table 4 below.
By using the results derived in Proposition 4, we have computed the moment estimates for the parameters $(\mu, \sigma, \alpha)$ of the $MPN$ distribution, obtaining $(-5.609, 4.576, 11.857)$. Then, by taking these numbers as initial values, the ML estimates are derived. Table 5 reports the ML estimates for the parameters of the $MPN$, $PN$, $SN$, and $TS$ distributions. The figures in brackets are the asymptotic standard errors of the estimates, obtained by inverting the Fisher information matrices of the four models evaluated at their respective ML estimates. Additionally, for each model, the value of the maximum of the log-likelihood function ($\ell_{max}$) is reported. The $MPN$ distribution attains the largest value and consequently provides a better fit to the data.
To compare the fit achieved by each distribution, the values of several model selection measures, i.e., Akaike's information criterion (AIC) (see Akaike [22]) and the Bayesian information criterion (BIC) (see Schwarz [23]), are reported in Table 6. A model with lower values of these measures is preferable, and it can be seen that the $MPN$ is preferred in terms of both. In addition, the Kolmogorov–Smirnov test statistics and the corresponding p-values have been included in this table for all the models considered. It can be observed that none of the models is rejected at the usual significance levels; however, the $MPN$ distribution has the highest p-value and is therefore rejected later than the other three models. Alternative methods of model selection to the Kolmogorov–Smirnov test that could be applied here can be found in Jäntschi and Bolboacă [24] and Jäntschi [25]. Furthermore, the histogram associated with the empirical distribution of the variable “density” in the pollen dataset is illustrated on the left-hand side of Figure 5, with the densities of the $TS$, $SN$, $PN$, and $MPN$ models, evaluated at the maximum likelihood estimates of their parameters, superimposed. Similarly, on the right-hand side of Figure 5, the fit in both tails is shown. It is observable that, for this dataset, the $MPN$ has thicker tails than the other three distributions. The QQ-plots for each distribution considered are illustrated in Figure 6; note that the $MPN$ distribution exhibits an almost perfect alignment with the 45° line, and therefore it provides a better fit for extreme quantiles. Finally, Figure 7 displays the profile log-likelihood of $\mu$, $\sigma$, and $\alpha$ of the $MPN$ distribution; it is noticeable that the estimates are unique.
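The AIC and BIC columns of Table 6 follow directly from the $\ell_{max}$ values in Table 5, via AIC $= 2k - 2\ell_{max}$ and BIC $= k\log n - 2\ell_{max}$, with $k = 3$ parameters and $n = 3848$ observations:

```python
import math

# Maximized log-likelihoods from Table 5; each model has k = 3 parameters.
n, k = 3848, 3
lmax = {"TS": -9864.99, "SN": -9863.42, "PN": -9863.37, "MPN": -9861.98}
for name, l in lmax.items():
    aic = 2 * k - 2 * l
    bic = k * math.log(n) - 2 * l
    print(f"{name}: AIC = {aic:.2f}, BIC = {bic:.2f}")
```

The printed values reproduce Table 6 up to rounding of the reported $\ell_{max}$.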

## 5. Concluding Remarks

In this paper, a modification of the continuous power-symmetric distribution has been introduced. For the particular case of the normal distribution, the $MPN$ distribution has been examined in detail. This distribution arises by modifying the distribution function of the power-symmetric family. After carrying out this modification, a more flexible family of probability distributions is obtained, allowing the kurtosis coefficient to take a wider range of values. For this model, its basic properties, different methods of estimation, and Fisher's information matrix were studied. By using a real dataset, we showed that the $MPN$ distribution provides a better fit than other existing models in the literature, such as the $TS$, $SN$, and $PN$ distributions.

## Author Contributions

The authors contributed equally to this work.

## Funding

This work was partially completed while Héctor W. Gómez visited the Universidad de Las Palmas de Gran Canaria, supported by MINEDUC-UA project, code ANT1855. This research was also funded by (EGD) [Ministerio de Economía y Competitividad, Spain] grant number [ECO2013–47092]; (EGD)[Ministerio de Economía, Industria y Competitividad. Agencia Estatal de Investigación] grant number [ECO2017–85577–P].

## Acknowledgments

We acknowledge the referee’s suggestions, which helped us to improve this work.

## Conflicts of Interest

The authors declare no conflicts of interest.

## Appendix A

The first derivatives of $ℓ ( θ )$ are given by
$\frac{\partial \ell(\theta)}{\partial \mu} = \frac{1}{\sigma}\left[\frac{x-\mu}{\sigma} - (\alpha-1)\,\kappa(x)\right],$
$\frac{\partial \ell(\theta)}{\partial \sigma} = -\frac{1}{\sigma}\left[1 - \left(\frac{x-\mu}{\sigma}\right)^2 + (\alpha-1)\,\frac{x-\mu}{\sigma}\,\kappa(x)\right],$
$\frac{\partial \ell(\theta)}{\partial \alpha} = \frac{1}{\alpha} - \frac{2^{\alpha}\log(2)}{2^{\alpha}-1} + \log\left[1+\Phi\left(\frac{x-\mu}{\sigma}\right)\right].$
The second derivatives of $\ell(\theta)$ are
$\frac{\partial^2 \ell(\theta)}{\partial \mu^2} = -\frac{1}{\sigma^2} - \frac{\alpha-1}{\sigma^2}\left[\frac{x-\mu}{\sigma}\,\kappa(x) + \kappa^2(x)\right],$
$\frac{\partial^2 \ell(\theta)}{\partial \mu\,\partial \sigma} = -\frac{2}{\sigma^2}\left(\frac{x-\mu}{\sigma}\right) + \frac{\alpha-1}{\sigma^2}\,\kappa(x)\left[1 - \left(\frac{x-\mu}{\sigma}\right)^2 - \frac{x-\mu}{\sigma}\,\kappa(x)\right],$
$\frac{\partial^2 \ell(\theta)}{\partial \mu\,\partial \alpha} = -\frac{\kappa(x)}{\sigma},$
$\frac{\partial^2 \ell(\theta)}{\partial \sigma^2} = \frac{1}{\sigma^2} - \frac{3}{\sigma^2}\left(\frac{x-\mu}{\sigma}\right)^2 + \frac{\alpha-1}{\sigma^2}\cdot\frac{x-\mu}{\sigma}\,\kappa(x)\left[2 - \left(\frac{x-\mu}{\sigma}\right)^2 - \frac{x-\mu}{\sigma}\,\kappa(x)\right],$
$\frac{\partial^2 \ell(\theta)}{\partial \sigma\,\partial \alpha} = -\frac{x-\mu}{\sigma^2}\,\kappa(x),$
$\frac{\partial^2 \ell(\theta)}{\partial \alpha^2} = -\frac{1}{\alpha^2} + \frac{2^{\alpha}\log^2(2)}{(2^{\alpha}-1)^2}.$

## References

1. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat. 1985, 12, 171–178. [Google Scholar]
2. Arellano-Valle, R.B.; Gómez, H.W.; Quintana, F.A. A New Class of Skew-Normal Distributions. Commun. Stat. Theory Methods 2004, 33, 1465–1480. [Google Scholar] [CrossRef]
3. Arellano-Valle, R.B.; Gómez, H.W.; Salinas, H.S. A note on the Fisher information matrix for the skew-generalized-normal model. Stat. Oper. Res. Trans. 2013, 37, 19–28. [Google Scholar]
4. Azzalini, A. The Skew-Normal and Related Families; IMS monographs; Cambridge University Press: New York, NY, USA, 2014. [Google Scholar]
5. Martínez-Flórez, G.; Barranco-Chamorro, I.; Bolfarine, H.; Gómez, H.W. Flexible Birnbaum–Saunders Distribution. Symmetry 2019, 11, 1305. [Google Scholar] [CrossRef]
6. Contreras-Reyes, J.E.; Arellano-Valle, R.B. Kullback–Leibler Divergence Measure for Multivariate Skew-Normal Distributions. Entropy 2012, 14, 1606–1626. [Google Scholar] [CrossRef]
7. Chiogna, M. A note on the asymptotic distribution of the maximum likelihood estimator for the scalar skew-normal distribution. Stat. Methods Appl. 2005, 14, 331–341. [Google Scholar] [CrossRef]
8. Arellano-Valle, R.B.; Azzalini, A. The centred parametrization for the multivariate skew-normal distribution. J. Multivar. Anal. 2008, 99, 1362–1382. [Google Scholar] [CrossRef]
9. Pewsey, A.; Gómez, H.W.; Bolfarine, H. Likelihood-based inference for power distributions. Test 2012, 21, 775–789. [Google Scholar] [CrossRef]
10. Lehmann, E.L. The power of rank tests. Ann. Math. Statist. 1953, 24, 23–43. [Google Scholar] [CrossRef]
11. Durrans, S.R. Distributions of fractional order statistics in hydrology. Water Resour. Res. 1992, 28, 1649–1655. [Google Scholar] [CrossRef]
12. Gupta, D.; Gupta, R.C. Analyzing skewed data by power normal model. Test 2008, 17, 197–210. [Google Scholar] [CrossRef]
13. Martínez-Flórez, G.; Arnold, B.C.; Bolfarine, H.; Gómez, H.W. The Multivariate Alpha-power Model. J. Stat. Plan. Inference 2013, 143, 1236–1247. [Google Scholar] [CrossRef]
14. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. Asymmetric regression models with limited responses with an application to antibody response to vaccine. Biom. J. 2013, 55, 156–172. [Google Scholar] [CrossRef] [PubMed]
15. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. The log alpha-power asymmetric distribution with application to air pollution. Environmetrics 2014, 25, 44–56. [Google Scholar] [CrossRef]
16. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. Doubly censored power-normal regression models with inflation. Test 2015, 24, 265–286. [Google Scholar] [CrossRef]
17. Castillo, N.O.; Gallardo, D.I.; Bolfarine, H.; Gómez, H.W. Truncated Power-Normal Distribution with Application to Non-Negative Measurements. Entropy 2018, 20, 433. [Google Scholar] [CrossRef]
18. Maciak, M.; Peštová, B.; Pešta, M. Structural breaks in dependent, heteroscedastic, and extremal panel data. Kybernetika 2018, 54, 1106–1121. [Google Scholar] [CrossRef]
19. Pešta, M. Total least squares and bootstrapping with application in calibration. Statistics 2013, 47, 966–991. [Google Scholar] [CrossRef]
20. Shaked, M.; Shanthikumar, J.G. Stochastic Orders; Springer Science & Business Media: New York, NY, USA, 2007. [Google Scholar]
21. R Development Core Team. A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019. [Google Scholar]
22. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723. [Google Scholar] [CrossRef]
23. Schwarz, G. Estimating the dimension of a model. Ann. Statist. 1978, 6, 461–464. [Google Scholar] [CrossRef]
24. Jäntschi, L.; Bolboacă, S.D. Computation of Probability Associated with Anderson–Darling Statistic. Mathematics 2018, 6, 88. [Google Scholar] [CrossRef]
25. Jäntschi, L. A Test Detecting the Outliers for Continuous Distributions Based on the Cumulative Distribution Function of the Data Being Tested. Symmetry 2019, 11, 835. [Google Scholar] [CrossRef]
Figure 1. Plot of the pdf of $MPN$ distribution for selected values of the parameters.
Figure 2. Plot of the first derivative of $MPN$ distribution for selected values of the parameters.
Figure 3. Plot of the $E ( X )$ and $V a r ( X )$ of the $MPN$ distribution.
Figure 4. Graphs of the skewness and kurtosis coefficients for the $MPN$ and $PN$ distributions.
Figure 5. Left panel: Histogram of the empirical distribution and fitted densities by ML superimposed for pollen dataset. Right panel: Plots of the tails for the four models.
Figure 6. QQ-plots: (a) $MPN$ model; (b) $PN$ model; (c) $SN$ model; (d) $TS$ model.
Figure 7. Profile log-likelihood of $μ$, $σ$ and $α$ for the $MPN$ distribution.
Table 1. Approximations of the roots of Equations (9) and (10) for some values of $α$, and the corresponding figures of the pdf of the $MPN$ evaluated at these roots.
| $\alpha$ | $x_1$ | $x_2$ | $x_3$ | $f(x_1; \alpha)$ | $f(x_2; \alpha)$ | $f(x_3; \alpha)$ |
|---|---|---|---|---|---|---|
| 0.5 | −0.136 | −1.135 | 0.886 | 0.397 | 0.239 | 0.241 |
| 1.0 | 0.000 | −1.000 | 1.000 | 0.399 | 0.242 | 0.242 |
| 2.0 | 0.243 | −0.691 | 1.173 | 0.412 | 0.261 | 0.251 |
| 3.0 | 0.435 | −0.414 | 1.299 | 0.433 | 0.282 | 0.266 |
| 4.0 | 0.586 | −0.203 | 1.396 | 0.457 | 0.298 | 0.284 |
| 5.0 | 0.706 | −0.041 | 1.475 | 0.481 | 0.316 | 0.301 |
Table 2. Approximations of $E ( X )$ and $V a r ( X )$ of the $MPN$ distribution for different values of $α$.
| $\alpha$ | $E(X)$ | $Var(X)$ |
|---|---|---|
| 0.5 | −0.097 | 1.006 |
| 1.0 | 0.000 | 1.000 |
| 5.0 | 0.659 | 0.770 |
| 10.0 | 1.119 | 0.521 |
| 100.0 | 2.247 | 0.218 |
Table 3. Maximum likelihood (ML) estimates and standard deviation (SD) for the parameters $μ$, $σ$ and $α$ of the $MPN$ model for different generated samples of sizes $n = 50 ,$ 100, and 200.
**$n = 50$**

| $\mu$ | $\sigma$ | $\alpha$ | $\hat{\mu}$ (SD) | $\hat{\sigma}$ (SD) | $\hat{\alpha}$ (SD) |
|---|---|---|---|---|---|
| 0 | 1 | 0.1 | −0.352478 (0.149214) | 0.994441 (0.091321) | 0.190243 (0.175202) |
| | | 0.5 | −0.19534 (0.14501) | 0.990622 (0.094550) | 0.613052 (0.272096) |
| | | 0.8 | −0.083183 (0.144587) | 0.990286 (0.098669) | 0.854338 (0.164924) |
| | | 1 | −0.009586 (0.141691) | 0.995312 (0.0997256) | 1.007328 (0.122688) |
| | | 5 | 0.004225 (0.100001) | 0.997408 (0.088254) | 5.030272 (0.229064) |
| | | 10 | 0.001108 (0.066610) | 0.999124 (0.068611) | 10.060478 (0.475019) |
| | | 100 | 0.002171 (0.017362) | 1.001152 (0.029604) | 100.437990 (2.668190) |

**$n = 100$**

| $\mu$ | $\sigma$ | $\alpha$ | $\hat{\mu}$ (SD) | $\hat{\sigma}$ (SD) | $\hat{\alpha}$ (SD) |
|---|---|---|---|---|---|
| 0 | 1 | 0.1 | −0.351446 (0.104552) | 0.998513 (0.070831) | 0.180054 (0.111930) |
| | | 0.5 | −0.19268 (0.101786) | 0.997622 (0.068806) | 0.576957 (0.223378) |
| | | 0.8 | −0.08140 (0.099360) | 0.997674 (0.069451) | 0.830318 (0.152995) |
| | | 1 | 0.002786 (0.097411) | 0.996444 (0.069648) | 1.002200 (0.088749) |
| | | 5 | 0.002014 (0.099305) | 0.996788 (0.085987) | 5.023032 (0.221756) |
| | | 10 | 0.002897 (0.046109) | 1.000515 (0.050192) | 10.032857 (0.339106) |
| | | 100 | 0.000623 (0.012137) | 1.000185 (0.019759) | 100.168752 (1.866302) |

**$n = 200$**

| $\mu$ | $\sigma$ | $\alpha$ | $\hat{\mu}$ (SD) | $\hat{\sigma}$ (SD) | $\hat{\alpha}$ (SD) |
|---|---|---|---|---|---|
| 0 | 1 | 0.1 | −0.348177 (0.072732) | 0.999433 (0.047548) | 0.170978 (0.076165) |
| | | 0.5 | −0.196617 (0.072015) | 0.999142 (0.047896) | 0.562935 (0.218890) |
| | | 0.8 | −0.076657 (0.069510) | 0.997719 (0.050718) | 0.824700 (0.127661) |
| | | 1 | 0.001158 (0.06877) | 0.998408 (0.050586) | 1.003651 (0.058344) |
| | | 5 | −0.000165 (0.053006) | 1.000623 (0.044182) | 5.005130 (0.115719) |
| | | 10 | −0.000239 (0.033615) | 1.000017 (0.035902) | 10.014958 (0.246652) |
| | | 100 | 0.000514 (0.008452) | 1.000491 (0.014599) | 100.104380 (1.295144) |
Table 4. Summary of descriptive statistics for the pollen density dataset.
| Mean | Median | Variance | Skewness | Kurtosis |
|---|---|---|---|---|
| 0.000 | −0.030 | 9.887 | 0.109 | 3.193 |
Table 5. Parameter estimates; standard errors (SE); and maximum of the log-likelihood function, $ℓ m a x$, for the $TS$, $SN$, $PN$, and $MPN$ corresponding to the pollen density dataset.
| Parameters | $TS$ (SE) | $SN$ (SE) | $PN$ (SE) | $MPN$ (SE) |
|---|---|---|---|---|
| $\mu$ | −0.010 (0.05) | −2.04 (0.24) | −1.74 (0.68) | −5.73 (0.43) |
| $\sigma$ | 3.037 (0.05) | 3.75 (0.14) | 3.69 (0.21) | 4.62 (0.14) |
| $\alpha$ | 29.995 (13.01) | 0.93 (0.14) | 1.77 (0.37) | 12.13 (1.21) |
| $\ell_{max}$ | −9864.99 | −9863.42 | −9863.37 | −9861.98 |
Table 6. Akaike’s information criterion (AIC), Bayesian information criterion (BIC), Kolmogorov–Smirnov (KSS) test statistics, and the corresponding p-values for all the models considered.
| Criteria | $TS$ | $SN$ | $PN$ | $MPN$ |
|---|---|---|---|---|
| AIC | 19,735.98 | 19,732.84 | 19,732.74 | 19,729.96 |
| BIC | 19,754.74 | 19,751.61 | 19,751.50 | 19,748.72 |
| KSS (p-value) | 0.014 (0.516) | 0.013 (0.559) | 0.012 (0.627) | 0.010 (0.820) |