# Quantile-Based Estimation of the Finite Cauchy Mixture Model

## Abstract

## 1. Introduction

## 2. Preliminaries

#### 2.1. Mixture Models

#### 2.2. Cauchy Distribution

## 3. An EM-Type Algorithm for the Cauchy Mixture Model

**The initializing step.** Find initial parameter values, $\widehat{\Theta}={\Theta}_{0}$, as follows.

- Set all prior component weight parameter estimates to the uniform distribution, ${\widehat{p}}_{1}=\cdots ={\widehat{p}}_{K}=\frac{1}{K}$;
- For the location parameters ${\widehat{\alpha}}_{1},\dots ,{\widehat{\alpha}}_{K}$, do one of the following:
  - (i) find the empirical $j/(K+1)$ quantiles of the data, $j=1,\dots ,K$;
  - (ii) draw a random sample of size $K$ from the data;
  - (iii) place the values of ${\widehat{\alpha}}_{j}$, $j=1,\dots ,K$, symmetrically around the mean, in multiples of the standard deviation of the data $D$;
- Set all component scale parameter estimates ${\widehat{\gamma}}_{j}$ to the half-interquartile range of the dataset, that is, ${\widehat{\gamma}}_{1}=\cdots ={\widehat{\gamma}}_{K}=\frac{1}{2}\,\mathrm{IQR}({x}_{1},\dots ,{x}_{n})$.
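The initializing step above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `initialise`, the `method` labels, and the one-standard-deviation spacing used for option (iii) are assumptions made for the example.

```python
import numpy as np

def initialise(x, K, method="quantile", rng=None):
    """Return starting values (p, alpha, gamma) for a K-component Cauchy mixture."""
    x = np.asarray(x, dtype=float)
    p = np.full(K, 1.0 / K)                       # uniform mixing weights
    if method == "quantile":                      # option (i): j/(K+1) quantiles
        alpha = np.quantile(x, np.arange(1, K + 1) / (K + 1))
    elif method == "sample":                      # option (ii): random data points
        rng = np.random.default_rng() if rng is None else rng
        alpha = np.sort(rng.choice(x, size=K, replace=False))
    elif method == "sd":                          # option (iii): symmetric around the mean
        # Assumed spacing: neighbouring locations are one standard deviation apart.
        offsets = np.arange(K) - (K - 1) / 2.0
        alpha = x.mean() + offsets * x.std(ddof=1)
    else:
        raise ValueError("method must be 'quantile', 'sample' or 'sd'")
    q1, q3 = np.quantile(x, [0.25, 0.75])
    gamma = np.full(K, 0.5 * (q3 - q1))           # half-interquartile range for every component
    return p, alpha, gamma
```

Note that `np.quantile` uses linear interpolation by default; other empirical quantile definitions would give slightly different starting values.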

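**E-step.** Using the current estimates $\widehat{\Theta}$, compute the membership weights ${w}_{ij}$ of observations ${x}_{i}$ as

$${w}_{ij}=\frac{{\widehat{p}}_{j}\,f({x}_{i};{\widehat{\alpha}}_{j},{\widehat{\gamma}}_{j})}{{\sum}_{l=1}^{K}{\widehat{p}}_{l}\,f({x}_{i};{\widehat{\alpha}}_{l},{\widehat{\gamma}}_{l})},\qquad i=1,\dots ,n,\quad j=1,\dots ,K,$$

where $f(x;\alpha ,\gamma )=\frac{1}{\pi \gamma \left[1+{\left(\frac{x-\alpha}{\gamma}\right)}^{2}\right]}$ denotes the Cauchy density with location $\alpha$ and scale $\gamma$.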
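As an illustration of the E-step, the following sketch computes membership weights as posterior component probabilities under the usual Cauchy density with location $\alpha$ and scale $\gamma$. The function names `cauchy_pdf` and `e_step` are hypothetical and chosen for this example only.

```python
import numpy as np

def cauchy_pdf(x, alpha, gamma):
    """Cauchy density with location alpha and scale gamma."""
    return 1.0 / (np.pi * gamma * (1.0 + ((x - alpha) / gamma) ** 2))

def e_step(x, p, alpha, gamma):
    """Return the (n, K) matrix of membership weights w[i, j]."""
    x = np.asarray(x, dtype=float)[:, None]       # shape (n, 1) for broadcasting
    p, alpha, gamma = (np.asarray(a, dtype=float) for a in (p, alpha, gamma))
    dens = p * cauchy_pdf(x, alpha, gamma)        # weighted component densities, shape (n, K)
    return dens / dens.sum(axis=1, keepdims=True) # normalise each row to sum to one
```

Each row of the returned matrix sums to one, so `w[i, j]` can be read as the estimated probability that observation `x[i]` belongs to component `j`.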

**M-step.** Given the membership weights ${w}_{ij}$ from the E-step, we can use the data points to compute updated parameter values. Let ${N}_{j}={\sum}_{i=1}^{n}{w}_{ij}$ denote the sum of the membership weights for the $j$th component. Then, the new estimate of the mixture weights is

$${\widehat{p}}_{j}=\frac{{N}_{j}}{n},\qquad j=1,\dots ,K.$$
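A possible M-step is sketched below. Only the weight update $N_j/n$ is taken from the text. The location and scale updates (a weighted median and half a weighted interquartile range) are assumptions suggested by the quantile-based theme of the paper and by the initializing step, and `weighted_quantile` is a hypothetical helper using one simple definition of a weighted quantile.

```python
import numpy as np

def weighted_quantile(x, w, q):
    """Smallest data value whose normalised cumulative weight reaches q."""
    order = np.argsort(x)
    x_sorted, w_sorted = x[order], w[order]
    cum = np.cumsum(w_sorted) / w_sorted.sum()
    idx = np.searchsorted(cum, q, side="left")    # first index with cum >= q
    return x_sorted[min(idx, len(x_sorted) - 1)]

def m_step(x, w):
    """Update (p, alpha, gamma) from data x and membership weights w of shape (n, K)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n, K = w.shape
    N = w.sum(axis=0)                             # N_j: total membership weight per component
    p = N / n                                     # mixture-weight update stated in the text
    # Assumed updates: weighted median and half the weighted interquartile range.
    alpha = np.array([weighted_quantile(x, w[:, j], 0.5) for j in range(K)])
    gamma = np.array([0.5 * (weighted_quantile(x, w[:, j], 0.75)
                             - weighted_quantile(x, w[:, j], 0.25)) for j in range(K)])
    return p, alpha, gamma
```

Alternating an E-step with this `m_step` until the estimates stabilise gives one EM-type iteration scheme; interpolated weighted quantiles would be an equally reasonable choice.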

## 4. Simulation Study

- (A) a two-component Cauchy mixture model with mixing parameters equal to $\underline{p}=(0.5,0.5)$, location parameters $\underline{\alpha}=(-200,200)$ and scale parameters $\underline{\gamma}=(1,1)$;
- (B) a two-component Cauchy mixture model with mixing parameters equal to $\underline{p}=(1/3,2/3)$, location parameters $\underline{\alpha}=(-200,200)$ and scale parameters $\underline{\gamma}=(1,4)$; and
- (C) a four-component Cauchy mixture model with mixing parameters equal to $\underline{p}=(0.1,0.3,0.3,0.3)$, location parameters $\underline{\alpha}=(-200,200,400,600)$ and scale parameters $\underline{\gamma}=(1,1,1,5)$.
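Data from the three scenarios could be generated along the following lines. This is a sketch under the assumption that each observation is drawn by first sampling a component label and then a Cauchy variate from that component; the names `SCENARIOS` and `simulate` are hypothetical.

```python
import numpy as np

# Parameter values as listed for Scenarios (A), (B) and (C).
SCENARIOS = {
    "A": dict(p=[0.5, 0.5], alpha=[-200, 200], gamma=[1, 1]),
    "B": dict(p=[1 / 3, 2 / 3], alpha=[-200, 200], gamma=[1, 4]),
    "C": dict(p=[0.1, 0.3, 0.3, 0.3], alpha=[-200, 200, 400, 600], gamma=[1, 1, 1, 5]),
}

def simulate(n, p, alpha, gamma, rng=None):
    """Draw n observations from a Cauchy mixture; also return the component labels."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = np.asarray(alpha, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    z = rng.choice(len(p), size=n, p=p)                # latent component labels
    x = alpha[z] + gamma[z] * rng.standard_cauchy(n)   # location-scale transform
    return x, z

# Example: one sample of size n = 2000 from Scenario A.
x, z = simulate(2000, **SCENARIOS["A"], rng=np.random.default_rng(0))
```

The labels `z` are returned only so that simulated data can be checked against the true component memberships; an estimation routine would see `x` alone.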

## 5. Real Data Examples

#### 5.1. Adler Data

#### 5.2. Melter Data

#### 5.3. Analysis of Image Greyscales

## 6. Discussion and Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

**Figure 1.** Simulated data: Scenario A (**left**); and Scenario B (**right**). The plotted curves correspond to the fitted component densities weighted by the estimated component probabilities at $K=2$. Note that the densities are truncated at the top.

**Figure 2.** Simulated data, Scenario C: The plotted curves correspond to the fitted component densities weighted by the estimated component probabilities.

**Figure 3.** Simulated data, Scenario A: Box plots of estimated Cauchy mixture parameters at different sample sizes of $n=200$, 500, 1000, and 2000 (when ${p}_{1}={p}_{2}=0.5$, ${\alpha}_{1}=-200$, ${\alpha}_{2}=200$, and ${\gamma}_{1}={\gamma}_{2}=1$).

**Figure 4.** Simulated data, Scenario B: Box plots of estimated Cauchy mixture parameters at different sample sizes of $n=200$, 500, 1000, and 2000 (when ${p}_{1}=1/3$, ${p}_{2}=2/3$, ${\alpha}_{1}=-200$, ${\alpha}_{2}=200$, ${\gamma}_{1}=1$ and ${\gamma}_{2}=4$).

**Figure 5.** Simulated data, Scenario C: Box plots of estimated Cauchy mixture mixing parameters at different sample sizes of $n=200$, 500, 1000, and 2000 (when $\underline{p}=(0.1,0.3,0.3,0.3)$, $\underline{\alpha}=(-200,200,400,600)$ and $\underline{\gamma}=(1,1,1,5)$).

**Figure 6.** Simulated data, Scenario C: Box plots of estimated Cauchy mixture location and scale parameters at different sample sizes of $n=200$, 500, 1000, and 2000 (when $\underline{p}=(0.1,0.3,0.3,0.3)$, $\underline{\alpha}=(-200,200,400,600)$ and $\underline{\gamma}=(1,1,1,5)$).

**Figure 9.** Fitted Cauchy mixtures for average ratings from Adler data: (**left**) $K=3$; and (**right**) $K=4$.

**Figure 11.** Fitted Cauchy (**left**); and Gaussian (**right**) mixtures to melter temperature data. The displayed curves correspond to the fitted component densities weighted by the estimated component probabilities.

**Figure 12.** (**Left**) Image of tiled mosaic; and (**right**) histogram of grey scale distribution of image (1 = white, 0 = black).

**Figure 13.** Fitted Cauchy mixtures for grey scales from the tiled mosaic: (**left**) $K=3$; and (**right**) $K=4$.

**Table 1.** Means of 100,000 Cauchy parameter estimates using robust quantile-based estimators (Approach (I)) or numerical ML (Approach (II)). Standard deviations are provided in brackets.

Estimation scenario | Scenario (i): $\widehat{\alpha}$ | Scenario (ii): $\widehat{\alpha}$ | Scenario (ii): $\widehat{\gamma}$ | Scenario (iii): $\widehat{\alpha}$ | Scenario (iii): $\widehat{\gamma}$ | Scenario (iv): $\widehat{\alpha}$ | Scenario (iv): $\widehat{\gamma}$ |
---|---|---|---|---|---|---|---|
(I) | 1.9995 (0.1591) | 1.9996 (0.1594) | 1.0071 (0.1616) | 2.0001 (0.0158) | 0.1005 (0.0161) | 1.9968 (1.5892) | 10.067 (1.606) |
(II) | 2.0000 (0.1433) | 1.9997 (0.1440) | 1.0008 (0.1441) | 2.0001 (0.0143) | 0.0999 (0.0143) | 1.9975 (1.4373) | 10.006 (1.434) |

Number of components ($K$) | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
$-2\log L$ | 916.56 | 910.41 | 867.37 | 862.76 | 857.58 |
$BIC$ | 925.93 | 933.83 | 904.83 | 914.26 | 923.13 |
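The relation between the two rows of this table can be checked with a short calculation. The sketch below assumes the usual definition $BIC=-2\log L+(3K-1)\log n$, where $3K-1$ counts $K$ locations, $K$ scales and $K-1$ free mixing weights; this definition is an assumption, as it is not stated alongside the table. If it holds, the penalty per parameter, $(BIC+2\log L)/(3K-1)$, should be roughly constant across $K$.

```python
import numpy as np

# Values copied from the table above.
K = np.array([1, 2, 3, 4, 5])
disparity = np.array([916.56, 910.41, 867.37, 862.76, 857.58])   # -2 log L
bic = np.array([925.93, 933.83, 904.83, 914.26, 923.13])

# Assumed parameter count: K locations + K scales + (K - 1) free mixing weights.
n_params = 3 * K - 1

# Under the assumed BIC definition this ratio estimates log(n) for every K.
penalty_per_param = (bic - disparity) / n_params

# The model preferred by BIC is the one with the smallest value.
best_K = int(K[np.argmin(bic)])

print(np.round(penalty_per_param, 3), best_K)
```

The disparity decreases monotonically with $K$, as expected for nested fits, whereas the BIC is smallest at $K=3$.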

**Table 3.** Melter temperature data: Estimated Cauchy mixture parameters for the temperature sensor data.

$K$ | ${\widehat{p}}_{1}$ | ${\widehat{p}}_{2}$ | ${\widehat{p}}_{3}$ | ${\widehat{\alpha}}_{1}$ | ${\widehat{\alpha}}_{2}$ | ${\widehat{\alpha}}_{3}$ | ${\widehat{\gamma}}_{1}$ | ${\widehat{\gamma}}_{2}$ | ${\widehat{\gamma}}_{3}$ |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | | | 1086.0 | | | 83.2 | | |
2 | 0.198 | 0.802 | | 878.1 | 1105.0 | | 27.5 | 33.0 | |
3 | 0.225 | 0.203 | 0.572 | 877.5 | 1055.0 | 1119.0 | 31.2 | 14.1 | 18.2 |

**Table 4.** Melter temperature data: Comparison of model disparity ($-2\log L$) and $BIC$ between Cauchy mixture and Gaussian mixture fits. For the Cauchy fits with $K=2$ and 3, the minimum disparity was achieved after four and seven iterations, respectively.

$K$ | Cauchy mixture: $-2\log L$ | Cauchy mixture: $BIC$ | Gaussian mixture: $-2\log L$ | Gaussian mixture: $BIC$ |
---|---|---|---|---|
1 | 1,044,617 | 1,044,640 | 1,095,249 | 1,095,272 |
2 | 988,145 | 988,202 | 994,756 | 994,812 |
3 | 970,329 | 970,419 | 961,530 | 961,620 |

