# Quantile-Based Estimation of the Finite Cauchy Mixture Model

## Abstract

## 1. Introduction

## 2. Preliminaries

#### 2.1. Mixture Models

#### 2.2. Cauchy Distribution

## 3. An EM-Type Algorithm for the Cauchy Mixture Model

**The initializing step.** Find initial parameter values, $\widehat{\Theta}=\Theta_{0}$, as follows.

- Set all prior component weight estimates to the uniform distribution, $\widehat{p}_{1}=\cdots =\widehat{p}_{K}=\frac{1}{K}$;
- For the location parameters $\widehat{\alpha}_{1},\dots ,\widehat{\alpha}_{K}$, do one of the following:
  - (i) find the empirical $j/(K+1)$ quantiles of the data, $j=1,\dots ,K$;
  - (ii) draw a random sample of size $K$ from the data;
  - (iii) place the values of $\widehat{\alpha}_{j}$, $j=1,\dots ,K$, symmetrically around the mean, in multiples of the standard deviation of the data $D$;
- Set all component scale parameter estimates $\widehat{\gamma}_{j}$ to the half-interquartile range of the dataset, that is, $\widehat{\gamma}_{1}=\cdots =\widehat{\gamma}_{K}=\frac{1}{2}\,\mathrm{IQR}(x_{1},\dots ,x_{n})$.
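The initialization options above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `initialize`, the `strategy` labels, and the one-standard-deviation spacing used for option (iii) are assumptions made for illustration, and the quantile rule (`method="inclusive"`) is only one of several common conventions.

```python
import random
import statistics


def initialize(data, K, strategy="quantiles", seed=0):
    """Return starting values (weights, locations, scales) for K components."""
    # Uniform prior component weights: p_1 = ... = p_K = 1/K.
    weights = [1.0 / K] * K

    if strategy == "quantiles":
        # Option (i): empirical j/(K+1) quantiles, j = 1, ..., K.
        locations = statistics.quantiles(data, n=K + 1, method="inclusive")
    elif strategy == "sample":
        # Option (ii): a random sample of size K drawn from the data.
        locations = sorted(random.Random(seed).sample(list(data), K))
    elif strategy == "symmetric":
        # Option (iii): points placed symmetrically around the mean.
        # ASSUMPTION: neighbouring points are one standard deviation apart.
        mean = statistics.fmean(data)
        sd = statistics.stdev(data)
        locations = [mean + (j - (K - 1) / 2) * sd for j in range(K)]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Common scale for all components: half the interquartile range.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    scales = [(q3 - q1) / 2] * K
    return weights, locations, scales
```

Option (i) is deterministic, which makes runs reproducible; option (ii) gives different starting points per seed and may be useful for multiple restarts.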

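**E-step.** Using the current estimates $\widehat{\Theta}$, compute the membership weights $w_{ij}$ of observations $x_{i}$ as

$$w_{ij}=\frac{\widehat{p}_{j}\,f(x_{i};\widehat{\alpha}_{j},\widehat{\gamma}_{j})}{\sum_{l=1}^{K}\widehat{p}_{l}\,f(x_{i};\widehat{\alpha}_{l},\widehat{\gamma}_{l})},\qquad i=1,\dots ,n,\quad j=1,\dots ,K,$$

where $f(x;\alpha ,\gamma )=\left\{\pi \gamma \left[1+\left((x-\alpha )/\gamma \right)^{2}\right]\right\}^{-1}$ denotes the Cauchy density with location $\alpha$ and scale $\gamma$.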
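As a hedged illustration of the E-step, the sketch below computes membership weights as posterior component probabilities under the standard Cauchy density $f(x;\alpha ,\gamma )=1/\{\pi \gamma [1+((x-\alpha )/\gamma )^{2}]\}$. The function names `cauchy_pdf` and `e_step` are hypothetical, and the weights follow the usual mixture-model form (component weight times density, normalized over components), which is assumed here rather than quoted from the paper.

```python
import math


def cauchy_pdf(x, alpha, gamma):
    """Density of the Cauchy distribution with location alpha and scale gamma."""
    z = (x - alpha) / gamma
    return 1.0 / (math.pi * gamma * (1.0 + z * z))


def e_step(data, weights, locations, scales):
    """Return an n-by-K list of membership weights w[i][j]."""
    W = []
    for x in data:
        # Unnormalized posterior: p_j * f(x_i; alpha_j, gamma_j).
        numer = [p * cauchy_pdf(x, a, g)
                 for p, a, g in zip(weights, locations, scales)]
        total = sum(numer)
        # Normalize so that each row sums to one.
        W.append([v / total for v in numer])
    return W
```

Because the Cauchy density has polynomial tails, the denominator does not underflow to zero for ordinary data ranges, so no log-sum-exp trick is used in this sketch.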

**M-step.** Given the membership weights $w_{ij}$ from the E-step, compute updated parameter values from the data. Let $N_{j}=\sum_{i=1}^{n}w_{ij}$ denote the sum of the membership weights for the $j$th component. Then the new estimate of the mixture weights is

$$\widehat{p}_{j}=\frac{N_{j}}{n},\qquad j=1,\dots ,K.$$
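The mixture-weight update, together with one possible quantile-based update for the locations and scales, might look as follows. Only the standard weight update $N_{j}/n$ is taken as given; the weighted median for $\widehat{\alpha}_{j}$ and half the weighted interquartile range for $\widehat{\gamma}_{j}$ are assumptions chosen to mirror the quantile-based initialization, and `weighted_quantile` and `m_step` are hypothetical names.

```python
def weighted_quantile(xs, ws, q):
    """Smallest x whose cumulative weight reaches a fraction q of the total."""
    pairs = sorted(zip(xs, ws))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for x, w in pairs:
        cum += w
        if cum >= q * total:
            return x
    return pairs[-1][0]


def m_step(data, W):
    """Update (weights, locations, scales) from membership weights W[i][j]."""
    n, K = len(data), len(W[0])
    weights, locations, scales = [], [], []
    for j in range(K):
        wj = [row[j] for row in W]
        Nj = sum(wj)                      # N_j = sum_i w_ij
        weights.append(Nj / n)            # p_j = N_j / n
        # ASSUMPTION: location = weighted median, scale = half weighted IQR.
        locations.append(weighted_quantile(data, wj, 0.5))
        q1 = weighted_quantile(data, wj, 0.25)
        q3 = weighted_quantile(data, wj, 0.75)
        scales.append((q3 - q1) / 2)
    return weights, locations, scales
```

Using a step-function weighted quantile keeps the sketch short; an interpolating definition would give slightly different values for small samples.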

## 4. Simulation Study

- (A) a two-component Cauchy mixture model with mixing parameters equal to $\underline{p}=(0.5,0.5)$, location parameters $\underline{\alpha}=(-200,200)$ and scale parameters $\underline{\gamma}=(1,1)$;
- (B) a two-component Cauchy mixture model with mixing parameters equal to $\underline{p}=(1/3,2/3)$, location parameters $\underline{\alpha}=(-200,200)$ and scale parameters $\underline{\gamma}=(1,4)$; and
- (C) a four-component Cauchy mixture model with $\underline{p}=(0.1,0.3,0.3,0.3)$, $\underline{\alpha}=(-200,200,400,600)$ and $\underline{\gamma}=(1,1,1,5)$.
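Data from the three scenarios can be generated by first drawing a component label and then sampling from that component via the Cauchy quantile function, $x=\alpha +\gamma \tan \{\pi (u-\tfrac{1}{2})\}$ with $u$ uniform on $(0,1)$. The sketch below is one way to do this; the names `SCENARIOS` and `sample_cauchy_mixture` and the seeding scheme are illustrative assumptions, not part of the original study.

```python
import math
import random

# Parameter settings of Scenarios (A), (B) and (C).
SCENARIOS = {
    "A": {"p": [0.5, 0.5], "alpha": [-200, 200], "gamma": [1, 1]},
    "B": {"p": [1 / 3, 2 / 3], "alpha": [-200, 200], "gamma": [1, 4]},
    "C": {"p": [0.1, 0.3, 0.3, 0.3],
          "alpha": [-200, 200, 400, 600],
          "gamma": [1, 1, 1, 5]},
}


def sample_cauchy_mixture(n, p, alpha, gamma, seed=None):
    """Draw n observations from a finite Cauchy mixture; return (data, labels)."""
    rng = random.Random(seed)
    # Component membership of each observation.
    labels = rng.choices(range(len(p)), weights=p, k=n)
    # Inverse-CDF sampling from the selected Cauchy component.
    data = [alpha[j] + gamma[j] * math.tan(math.pi * (rng.random() - 0.5))
            for j in labels]
    return data, labels


data, labels = sample_cauchy_mixture(2000, **SCENARIOS["A"], seed=1)
```

Returning the labels alongside the data makes it easy to compare estimated membership weights with the true component assignments.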

## 5. Real Data Examples

#### 5.1. Adler Data

#### 5.2. Melter Data

#### 5.3. Analysis of Image Greyscales

## 6. Discussion and Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

**Figure 1.** Simulated data: Scenario A (**left**); and Scenario B (**right**). The plotted curves correspond to the fitted component densities weighted by the estimated component probabilities at $K=2$. Note that the densities are truncated at the top.

**Figure 2.** Simulated data, Scenario C: The plotted curves correspond to the fitted component densities weighted by the estimated component probabilities.

**Figure 3.** Simulated data, Scenario A: Box plots of estimated Cauchy mixture parameters at sample sizes $n=200$, $500$, $1000$ and $2000$ (when $p_{1}=p_{2}=0.5$, $\alpha_{1}=-200$, $\alpha_{2}=200$ and $\gamma_{1}=\gamma_{2}=1$).

**Figure 4.** Simulated data, Scenario B: Box plots of estimated Cauchy mixture parameters at sample sizes $n=200$, $500$, $1000$ and $2000$ (when $p_{1}=1/3$, $p_{2}=2/3$, $\alpha_{1}=-200$, $\alpha_{2}=200$, $\gamma_{1}=1$ and $\gamma_{2}=4$).

**Figure 5.** Simulated data, Scenario C: Box plots of estimated Cauchy mixture mixing parameters at sample sizes $n=200$, $500$, $1000$ and $2000$ (when $\underline{p}=(0.1,0.3,0.3,0.3)$, $\underline{\alpha}=(-200,200,400,600)$ and $\underline{\gamma}=(1,1,1,5)$).

**Figure 6.** Simulated data, Scenario C: Box plots of estimated Cauchy mixture location and scale parameters at sample sizes $n=200$, $500$, $1000$ and $2000$ (when $\underline{p}=(0.1,0.3,0.3,0.3)$, $\underline{\alpha}=(-200,200,400,600)$ and $\underline{\gamma}=(1,1,1,5)$).

**Figure 9.** Fitted Cauchy mixtures for average ratings from Adler data: (**left**) $K=3$; and (**right**) $K=4$.

**Figure 11.** Fitted Cauchy (**left**); and Gaussian (**right**) mixtures to melter temperature data. The displayed curves correspond to the fitted component densities weighted by the estimated component probabilities.

**Figure 12.** (**Left**) Image of tiled mosaic; and (**right**) histogram of grey scale distribution of image (1 = white, 0 = black).

**Figure 13.** Fitted Cauchy mixtures for grey scales from the tiled mosaic: (**left**) $K=3$; and (**right**) $K=4$.

**Table 1.** Means of 100,000 Cauchy parameter estimates using robust quantile-based estimators (Approach (I)) or numerical ML (Approach (II)). Standard deviations are provided in brackets. Columns are grouped by simulation scenario (i) to (iv).

Estimation scenario | (i) $\widehat{\alpha}$ | (ii) $\widehat{\alpha}$ | (ii) $\widehat{\gamma}$ | (iii) $\widehat{\alpha}$ | (iii) $\widehat{\gamma}$ | (iv) $\widehat{\alpha}$ | (iv) $\widehat{\gamma}$ |
---|---|---|---|---|---|---|---|
(I) | 1.9995 (0.1591) | 1.9996 (0.1594) | 1.0071 (0.1616) | 2.0001 (0.0158) | 0.1005 (0.0161) | 1.9968 (1.5892) | 10.067 (1.606) |
(II) | 2.0000 (0.1433) | 1.9997 (0.1440) | 1.0008 (0.1441) | 2.0001 (0.0143) | 0.0999 (0.0143) | 1.9975 (1.4373) | 10.006 (1.434) |
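A small Monte Carlo experiment of the kind summarized in Table 1 can be sketched as below for the quantile-based approach. Several details are assumptions, since they are not spelled out in the surviving text: the location is estimated by the sample median and the scale by half the sample interquartile range, the true values $\alpha =2$ and $\gamma =1$ are read off the table means, and the sample size `n=100` and replication count `reps=2000` are illustrative choices only.

```python
import math
import random
import statistics


def quantile_estimates(sample):
    """Quantile-based estimates: (median, half interquartile range)."""
    q1, med, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return med, (q3 - q1) / 2


def simulate(alpha=2.0, gamma=1.0, n=100, reps=2000, seed=42):
    """Return mean and sd of the location and scale estimates over reps samples."""
    rng = random.Random(seed)
    a_hat, g_hat = [], []
    for _ in range(reps):
        # Inverse-CDF sampling from Cauchy(alpha, gamma).
        sample = [alpha + gamma * math.tan(math.pi * (rng.random() - 0.5))
                  for _ in range(n)]
        a, g = quantile_estimates(sample)
        a_hat.append(a)
        g_hat.append(g)
    return (statistics.fmean(a_hat), statistics.stdev(a_hat),
            statistics.fmean(g_hat), statistics.stdev(g_hat))


mean_a, sd_a, mean_g, sd_g = simulate()
```

The exact values depend on the quantile convention and the seed, so the output should be compared with Table 1 only qualitatively.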

Number of components ($K$) | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
$-2\log L$ | 916.56 | 910.41 | 867.37 | 862.76 | 857.58 |
$BIC$ | 925.93 | 933.83 | 904.83 | 914.26 | 923.13 |
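The relation between the two rows of this table can be checked with a short calculation. The sketch assumes the usual Schwarz criterion, $BIC=-2\log L+d\log n$, with $d=3K-1$ free parameters for a $K$-component Cauchy mixture ($K$ locations, $K$ scales and $K-1$ free weights). The sample size $n$ is not stated alongside the table, so the code recovers the implied $\log n$ from the tabulated values instead of assuming it.

```python
# Values copied from the table above.
disparity = {1: 916.56, 2: 910.41, 3: 867.37, 4: 862.76, 5: 857.58}
bic = {1: 925.93, 2: 933.83, 3: 904.83, 4: 914.26, 5: 923.13}


def n_params(K):
    """Free parameters: K locations + K scales + (K - 1) mixing weights."""
    return 3 * K - 1


# ASSUMPTION: BIC = disparity + n_params(K) * log(n).
# If this holds, the ratio below is (up to rounding) the same for every K.
implied_log_n = {K: (bic[K] - disparity[K]) / n_params(K) for K in bic}

# Model choice: the number of components with the smallest BIC.
best_K = min(bic, key=bic.get)

for K in sorted(bic):
    print(K, n_params(K), round(implied_log_n[K], 3))
print("smallest BIC at K =", best_K)
```

The implied $\log n$ comes out at roughly 4.68 for every $K$, which supports the assumed penalty, and the smallest $BIC$ is attained at $K=3$.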

**Table 3.** Melter temperature data: Estimated Cauchy mixture parameters for the temperature sensor data.

$K$ | $\widehat{p}_{1}$ | $\widehat{p}_{2}$ | $\widehat{p}_{3}$ | $\widehat{\alpha}_{1}$ | $\widehat{\alpha}_{2}$ | $\widehat{\alpha}_{3}$ | $\widehat{\gamma}_{1}$ | $\widehat{\gamma}_{2}$ | $\widehat{\gamma}_{3}$ |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | | | 1086.0 | | | 83.2 | | |
2 | 0.198 | 0.802 | | 878.1 | 1105.0 | | 27.5 | 33.0 | |
3 | 0.225 | 0.203 | 0.572 | 877.5 | 1055.0 | 1119.0 | 31.2 | 14.1 | 18.2 |

**Table 4.** Melter temperature data: Comparison of model disparity ($-2\log L$) and $BIC$ between Cauchy mixture and Gaussian mixture fits. For the Cauchy fits with $K=2$ and $K=3$, the minimum disparity was achieved after four and seven iterations, respectively.

$K$ | Cauchy mixture: $-2\log L$ | Cauchy mixture: $BIC$ | Gaussian mixture: $-2\log L$ | Gaussian mixture: $BIC$ |
---|---|---|---|---|
1 | 1,044,617 | 1,044,640 | 1,095,249 | 1,095,272 |
2 | 988,145 | 988,202 | 994,756 | 994,812 |
3 | 970,329 | 970,419 | 961,530 | 961,620 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kalantan, Z.I.; Einbeck, J.
Quantile-Based Estimation of the Finite Cauchy Mixture Model. *Symmetry* **2019**, *11*, 1186.
https://doi.org/10.3390/sym11091186
