# An Asymmetric Distribution with Heavy Tails and Its Expectation–Maximization (EM) Algorithm Implementation

## Abstract

## 1. Introduction

## 2. An Asymmetric Distribution

#### 2.1. New Distribution

#### 2.2. Density Function

**Proposition 1.**

**Proof.**

#### 2.3. Properties

**Proposition 2.**

**Proof.**

**Proposition 3.**

**Proof.**

**Proposition 4.**

**Proof.**

**Remark 1.**

#### 2.4. Moments

**Proposition 5.**

**Proof.**

**Corollary 1.**

**Corollary 2.**

**Remark 2.**

## 3. Inference

**Proposition 6.**

**Proof.**

## 4. EM Algorithm

**E-step:** Given $\mathbf{\theta}={\widehat{\mathbf{\theta}}}^{(k)}={({\widehat{\sigma}}^{(k)},{\widehat{q}}^{(k)})}^{\top}$, calculate ${\widehat{w}}_{i}^{(k)}$ for $i=1,\dots ,n$.

**CM-step I:** Update ${\widehat{\sigma}}^{(k)}$ by
$${\widehat{\sigma}}^{2(k+1)}=\frac{{S}_{u}^{(k)}}{2},$$
where ${S}_{u}^{(k)}=\frac{1}{n}\sum _{i=1}^{n}{\widehat{w}}_{i}^{(k)}\,{t}_{i}$.

**CM-step II:** Fix $\sigma ={\widehat{\sigma}}^{(k+1)}$ and update ${\widehat{q}}^{(k)}$ by
$${\widehat{q}}^{(k+1)}=\arg\max_{q}\,Q({\widehat{\sigma}}^{(k+1)},q\mid {\widehat{\mathbf{\theta}}}^{(k)}).$$

**Remark 3.**

- i. For $q\to \infty $, $\widehat{\sigma}$ in the M-step reduces to the estimator obtained when the HN distribution is used;
- ii. An alternative to CM-step II is obtained by following the idea in Lin et al. ([12], Section 3), using the following step:

  **CML-step:** Update ${\widehat{q}}^{(k)}$ by maximizing the constrained actual log-likelihood function, i.e.,
  $${\widehat{q}}^{(k+1)}=\arg\max_{q}\,\ell ({\widehat{\sigma}}^{(k+1)},q).$$
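
The CML-step above can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' implementation. The density is not quoted from Section 2; it is rebuilt from the representation $T=\left|X\right|/{Y}^{1/q}$ of Section 5 by averaging the half-normal density with scale $\sigma /{Y}^{1/q}$ over $Y$ with a midpoint rule. The exponential variable is parameterized by its mean (`exp_scale`), which is one possible reading of Exp(2) and has to be supplied by the caller. The golden-section search, the bracket `[q_lo, q_hi]` and all function names are hypothetical choices made for this example.

```python
import math


def _nodes(q, exp_scale, m):
    """Midpoint-rule nodes w_j = y_j**(1/q), with y_j quantiles of the exponential."""
    return [(-exp_scale * math.log(1.0 - (j + 0.5) / m)) ** (1.0 / q)
            for j in range(m)]


def mshn_pdf(t, sigma, q, exp_scale, m=400):
    """Approximate density of T = |X| / Y**(1/q) at t >= 0.

    Conditional on W = Y**(1/q) = w, T is half-normal with scale sigma / w,
    so the density is the average of (2w / (sigma*sqrt(2*pi))) * exp(-(t*w/sigma)**2 / 2).
    """
    c = 2.0 / (sigma * math.sqrt(2.0 * math.pi))
    nodes = _nodes(q, exp_scale, m)
    return c * sum(w * math.exp(-0.5 * (t * w / sigma) ** 2) for w in nodes) / m


def mshn_loglik(data, sigma, q, exp_scale, m=400):
    """Log-likelihood of (sigma, q) under the approximate density."""
    c = 2.0 / (sigma * math.sqrt(2.0 * math.pi))
    nodes = _nodes(q, exp_scale, m)  # computed once per value of q
    total = 0.0
    for t in data:
        dens = c * sum(w * math.exp(-0.5 * (t * w / sigma) ** 2) for w in nodes) / m
        total += math.log(max(dens, 1e-300))  # guard against numerical underflow
    return total


def cml_step(data, sigma_next, exp_scale, q_lo=0.2, q_hi=10.0, tol=1e-2):
    """Maximize the log-likelihood over q with sigma held at sigma_next.

    Uses a golden-section search, which assumes a single maximum in the bracket.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = q_lo, q_hi
    x1, x2 = b - inv_phi * (b - a), a + inv_phi * (b - a)
    f1 = mshn_loglik(data, sigma_next, x1, exp_scale)
    f2 = mshn_loglik(data, sigma_next, x2, exp_scale)
    while b - a > tol:
        if f1 < f2:  # the maximum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = mshn_loglik(data, sigma_next, x2, exp_scale)
        else:        # the maximum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = mshn_loglik(data, sigma_next, x1, exp_scale)
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Illustrative values only; they are not data from the paper.
    sample = [0.3, 1.2, 0.8, 2.5, 0.1, 4.0, 0.6, 1.9]
    print(cml_step(sample, sigma_next=1.0, exp_scale=2.0))
```

A derivative-free search is used here because only the log-likelihood itself is evaluated; any one-dimensional optimizer could take its place.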

## 5. Simulation

- Simulate $X\sim N(0,{\sigma}^{2})$ and $Y\sim Exp\left(2\right)$.
- Compute $T=\frac{\left|X\right|}{{Y}^{1/q}}$.
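
The two steps above translate directly into a sampler. The sketch below uses only Python's standard library. The function name `simulate_mshn` and the argument `exp_scale` are assumptions made for this example. In particular, the mean of the exponential variable is passed explicitly: Appendix A writes the exponential density with $\beta $ as a scale, under which Exp(2) would have mean 2 (`exp_scale=2.0`), whereas a rate reading of Exp(2) would correspond to `exp_scale=0.5`.

```python
import random


def simulate_mshn(n, sigma, q, exp_scale, seed=None):
    """Draw n values of T = |X| / Y**(1/q).

    X ~ N(0, sigma^2) and Y is exponential with mean exp_scale,
    drawn independently of each other.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        x = rng.gauss(0.0, sigma)  # step 1: X ~ N(0, sigma^2)
        y = 0.0
        while y == 0.0:            # avoid a zero denominator (vanishingly rare)
            y = rng.expovariate(1.0 / exp_scale)  # expovariate takes the rate 1/mean
        draws.append(abs(x) / y ** (1.0 / q))     # step 2: T = |X| / Y^(1/q)
    return draws


if __name__ == "__main__":
    sample = simulate_mshn(5, sigma=1.0, q=2.0, exp_scale=2.0, seed=123)
    print(sample)
```

For very large `q` the denominator is close to 1, so the draws behave like half-normal values, in line with the limiting case noted in Remark 3.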

## 6. Applications

#### 6.1. Application 1

#### 6.2. Application 2

#### 6.3. Application 3

## 7. Conclusions

- The MSHN distribution has a greater kurtosis than the SHN distribution, as is clearly reflected in Table 1.
- The proposed model has a closed-form expression and presents more flexible asymmetry and kurtosis coefficients than those of the HN model.
- Two stochastic representations for the MSHN model are presented. One is defined as the quotient of two independent random variables: an HN in the numerator and a power of an Exp(2) in the denominator. The other shows that the MSHN distribution is a scale mixture of an HN and a Weibull (Wei) distribution.
- Using the scale-mixture representation, the EM algorithm was implemented to calculate the ML estimates.
- Results from a simulation study indicate that with a reasonable sample size, an acceptable bias is obtained.
- Three illustrations using real data show that the MSHN model achieves a better fit in terms of the AIC and BIC criteria.

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

- Gamma distribution: $$f(x;\alpha ,\beta )=\frac{{\beta}^{\alpha}}{\mathsf{\Gamma}\left(\alpha \right)}{x}^{\alpha -1}{e}^{-\beta x},$$
- Exponential distribution: $$f(x;\beta )=\frac{1}{\beta}{e}^{-x/\beta},$$
- Weibull distribution: $$f(x;\gamma ,\beta )=\frac{\gamma}{\beta}{x}^{\gamma -1}{e}^{-{x}^{\gamma}/\beta},$$
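
The exponential and Weibull forms listed above are linked by a power transformation, which is what connects the quotient representation of Section 5 to a scale mixture with a Weibull variable. The short derivation below is a sketch that uses only the densities of this appendix; the symbol $W$ is introduced here for illustration. Let $Y$ have the exponential density $f(y;\beta )$ above and set $W={Y}^{1/q}$ with $q>0$. Then, for $w>0$,

$$P(W>w)=P(Y>{w}^{q})={e}^{-{w}^{q}/\beta},$$

and differentiating gives

$${f}_{W}(w)=-\frac{d}{dw}{e}^{-{w}^{q}/\beta}=\frac{q}{\beta}{w}^{q-1}{e}^{-{w}^{q}/\beta},$$

which is the Weibull density $f(w;\gamma ,\beta )$ above with $\gamma =q$ and the same $\beta $. Conditional on $W=w$, the quotient $\left|X\right|/w$ with $X\sim N(0,{\sigma}^{2})$ is half-normal with scale $\sigma /w$.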

**Lemma A1.**

## References

1. Cooray, K.; Ananda, M.M.A. A Generalization of the Half-Normal Distribution with Applications to Lifetime Data. Commun. Stat. Theory Methods **2008**, 37, 1323–1337.
2. Cordeiro, G.M.; Pescim, R.R.; Ortega, E.M.M. The Kumaraswamy Generalized Half-Normal Distribution for Skewed Positive Data. J. Data Sci. **2012**, 10, 195–224.
3. Gómez, Y.M.; Bolfarine, H. Likelihood-based inference for power half-normal distribution. J. Stat. Theory Appl. **2015**, 14, 383–398.
4. Gómez, Y.M.; Vidal, I. A generalization of the half-normal distribution. Appl. Math. J. Chin. Univ. **2016**, 31, 409–424.
5. Olmos, N.M.; Varela, H.; Gómez, H.W.; Bolfarine, H. An extension of the half-normal distribution. Stat. Pap. **2012**, 53, 875–886.
6. Rogers, W.H.; Tukey, J.W. Understanding some long-tailed symmetrical distributions. Stat. Neerl. **1972**, 26, 211–226.
7. Mosteller, F.; Tukey, J.W. Data Analysis And Regression; Addison-Wesley: Reading, MA, USA, 1977.
8. Reyes, J.; Gómez, H.W.; Bolfarine, H. Modified slash distribution. Statistics **2013**, 47, 929–941.
9. Lehmann, E.L. Elements of Large-Sample Theory; Springer: New York, NY, USA, 1999.
10. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. Ser. B **1977**, 39, 1–38.
11. Lee, S.Y.; Xu, L. Influence analyses of nonlinear mixed-effects models. Comput. Stat. Data Anal. **2004**, 45, 321–341.
12. Lin, T.I.; Lee, J.C.; Yen, S.Y. Finite mixture modeling using the skew-normal distribution. Stat. Sin. **2007**, 17, 909–927.
13. Lyu, M. Handbook of Software Reliability Engineering; IEEE Computer Society Press: Washington, DC, USA, 1996.
14. Akaike, H. A new look at the statistical model identification. IEEE Trans. Auto. Contr. **1974**, 19, 716–723.
15. Schwarz, G. Estimating the dimension of a model. Ann. Stat. **1978**, 6, 461–464.
16. Von Alven, W.H. Reliability Engineering by ARINC; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 1964.
17. Linhart, H.; Zucchini, W. Model Selection; Wiley Series in Probability and Statistics; Wiley: Hoboken, NJ, USA, 1986.
18. Prudnikov, A.P.; Brychkov, Y.A.; Marichev, O.I. Integrals and Series; Gordon & Breach Science Publishers: Amsterdam, The Netherlands, 1986.

**Figure 1.** The density function for different values of parameter q and $\sigma =1$ in the MSHN distribution.

**Figure 3.** Histogram fitted with the HN, SHN and MSHN distributions provided with the ML estimations.

**Figure 5.** Histogram fitted with the HN, SHN and MSHN distributions provided with the ML estimations.

**Figure 6.** Histogram fitted with the HN, SHN and MSHN distributions provided with the ML estimations.

**Table 1.** Tails comparison for different slashed half-normal (SHN) and modified slashed half-normal (MSHN) distributions.

Distribution | $P(T>3)$ | $P(T>4)$ | $P(T>5)$ | $P(T>6)$ | $P(T>7)$ |
---|---|---|---|---|---|
SHN(1, 0.5) | 0.3781 | 0.3497 | 0.3239 | 0.3009 | 0.2805 |
MSHN(1, 0.5) | 0.5304 | 0.48289 | 0.4466 | 0.4176 | 0.3936 |
SHN(1, 1) | 0.1777 | 0.1570 | 0.1385 | 0.1224 | 0.1086 |
MSHN(1, 1) | 0.3678 | 0.2992 | 0.2519 | 0.2173 | 0.19102 |
SHN(1, 3) | 0.0350 | 0.0205 | 0.0120 | 0.0044 | 0.0034 |
MSHN(1, 3) | 0.0901 | 0.0438 | 0.0238 | 0.0142 | 0.0091 |

**Table 2.** Maximum likelihood (ML) estimations for parameters $\sigma $ and q of the MSHN distribution. Standard error (SE), root of the mean squared error (RMSE).

True $\sigma $ | True q | Estimator | Bias ($n=30$) | SE ($n=30$) | RMSE ($n=30$) | Bias ($n=50$) | SE ($n=50$) | RMSE ($n=50$) | Bias ($n=100$) | SE ($n=100$) | RMSE ($n=100$) |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | $\sigma $ | 0.178 | 0.430 | 0.528 | 0.122 | 0.320 | 0.378 | 0.089 | 0.206 | 0.279 |
1 | 1 | q | 0.199 | 0.422 | 0.668 | 0.097 | 0.219 | 0.263 | 0.059 | 0.138 | 0.163 |
1 | 2 | $\sigma $ | 0.111 | 0.355 | 0.407 | 0.078 | 0.258 | 0.295 | 0.042 | 0.172 | 0.186 |
1 | 2 | q | 1.006 | 2.500 | 2.603 | 0.480 | 1.105 | 1.519 | 0.182 | 0.458 | 0.562 |
1 | 5 | $\sigma $ | 0.026 | 0.277 | 0.239 | 0.033 | 0.222 | 0.189 | 0.023 | 0.159 | 0.149 |
1 | 5 | q | 2.227 | 8.833 | 3.871 | 2.012 | 6.743 | 3.550 | 1.333 | 4.092 | 2.905 |
2 | 1 | $\sigma $ | 0.284 | 0.835 | 0.973 | 0.192 | 0.617 | 0.665 | 0.104 | 0.414 | 0.481 |
2 | 1 | q | 0.168 | 0.356 | 0.571 | 0.094 | 0.215 | 0.263 | 0.058 | 0.141 | 0.166 |
2 | 2 | $\sigma $ | 0.294 | 0.727 | 0.815 | 0.122 | 0.507 | 0.572 | 0.074 | 0.343 | 0.383 |
2 | 2 | q | 1.210 | 2.821 | 2.835 | 0.465 | 1.067 | 1.534 | 0.174 | 0.454 | 0.623 |
2 | 5 | $\sigma $ | 0.057 | 0.544 | 0.454 | 0.066 | 0.441 | 0.371 | 0.044 | 0.305 | 0.290 |
2 | 5 | q | 2.456 | 8.991 | 3.934 | 2.089 | 6.712 | 3.615 | 1.545 | 4.150 | 3.075 |
5 | 1 | $\sigma $ | 0.834 | 2.111 | 2.548 | 0.494 | 1.527 | 1.826 | 0.386 | 1.038 | 1.233 |
5 | 1 | q | 0.217 | 0.414 | 0.740 | 0.119 | 0.225 | 0.287 | 0.083 | 0.144 | 0.174 |
5 | 2 | $\sigma $ | 0.658 | 1.782 | 2.065 | 0.293 | 1.285 | 1.475 | 0.209 | 0.872 | 0.966 |
5 | 2 | q | 1.218 | 2.872 | 2.836 | 0.413 | 1.018 | 1.414 | 0.188 | 0.489 | 0.694 |
5 | 5 | $\sigma $ | 0.094 | 1.379 | 1.160 | 0.146 | 1.096 | 0.950 | 0.123 | 0.779 | 0.731 |
5 | 5 | q | 2.266 | 8.894 | 3.880 | 1.952 | 6.526 | 3.557 | 1.370 | 4.118 | 2.948 |

Parameters | HN (SE) | SHN (SE) | MSHN (SE) |
---|---|---|---|
$\widehat{\sigma}$ | 285.191 (19.774) | 20.977 (5.674) | 19.874 (4.867) |
$\widehat{q}$ | - | 0.687 (0.118) | 0.872 (0.115) |
Log-likelihood | −663.411 | −605.102 | −600.876 |

**Table 4.** The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) for each model fitted.

Criterion | HN | SHN | MSHN |
---|---|---|---|
AIC | 1328.822 | 1214.204 | 1205.752 |
BIC | 1331.466 | 1219.493 | 1211.041 |
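
The entries of Table 4 follow from the log-likelihoods in the parameter table that precedes it through the standard definitions AIC $=2k-2\ell $ and BIC $=k\mathrm{log}n-2\ell $, where $k$ is the number of estimated parameters. The sketch below recomputes them. The function names are chosen for this example, and the sample size $n=104$ is an assumption: it is not stated in the surviving text and was back-solved here from the reported BIC values.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


if __name__ == "__main__":
    n = 104  # assumed: inferred from the reported BIC values, not given in the text
    # (log-likelihood, number of parameters) taken from the table above
    fits = {"HN": (-663.411, 1), "SHN": (-605.102, 2), "MSHN": (-600.876, 2)}
    for name, (loglik, k) in fits.items():
        print(name, round(aic(loglik, k), 3), round(bic(loglik, k, n), 3))
```

Because SHN and MSHN both have two parameters, their AIC and BIC differ by the same amount, namely twice the difference in log-likelihood.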

Parameters | HN (SE) | SHN (SE) | MSHN (SE) |
---|---|---|---|
$\widehat{\sigma}$ | 6.07 (0.6335) | 1.6251 (0.4777) | 1.5108 (0.3179) |
$\widehat{q}$ | - | 1.3539 (0.4347) | 1.6365 (0.3425) |
Log-likelihood | −116.3881 | −103.1834 | −102.65 |

Criterion | HN | SHN | MSHN |
---|---|---|---|
AIC | 234.7762 | 210.3668 | 209.302 |
BIC | 236.6048 | 214.0241 | 212.9573 |

Parameters | HN (SE) | SHN (SE) | MSHN (SE) |
---|---|---|---|
$\widehat{\sigma}$ | 89.616 (11.381) | 13.785 (6.047) | 16.148 (5.128) |
$\widehat{q}$ | - | 0.859 (0.285) | 1.233 (0.251) |
Log-likelihood | −161.861 | −154.857 | −153.954 |

Criterion | HN | SHN | MSHN |
---|---|---|---|
AIC | 325.7224 | 313.715 | 311.908 |
BIC | 327.1564 | 316.583 | 314.776 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Olmos, N.M.; Venegas, O.; Gómez, Y.M.; Iriarte, Y.A.
An Asymmetric Distribution with Heavy Tails and Its Expectation–Maximization (EM) Algorithm Implementation. *Symmetry* **2019**, *11*, 1150.
https://doi.org/10.3390/sym11091150
