# Maximum Entropy Gibbs Density Modeling for Pattern Classification

## Abstract

## 1. Introduction

## 2. Density Estimation by Constrained Maximum Entropy

## 3. Gibbs Pattern Density Modeling by a Kohonen Network

#### 3.1. The Traditional Kohonen Network

#### 3.2. The Gibbsian Kohonen Network

- Initialize the potential functions $\lambda_{j}^{(\beta)}$ to small random values, $\beta = 1, \ldots, K$.
- Initialize the patterns $\Gamma_{j}^{syn}$ in nodes $j \in [1, J]$.
- For $n = 1 \to n_{iter}$, do
  - For each training pattern $\Gamma_{i}$, do
    - Compute the representation vectors $H(\Gamma_{i})$ and $H(\Gamma_{j}^{syn})$, $j \in [1, J]$.
    - Determine the winning node $j^{*} = \arg\min_{j} d(H(\Gamma_{i}), H(\Gamma_{j}^{syn}))$.
    - Update the potential functions:

      $$\lambda_{j}^{(\beta)}(n+1) = \lambda_{j}^{(\beta)}(n) + \epsilon_{n}\, g_{n}^{j,j^{*}} \left( H^{(\beta)}(\Gamma_{j}^{syn}) - H^{(\beta)}(\Gamma_{i}) \right), \quad \beta = 1, \ldots, K,$$

      where

      $$\epsilon_{n} = \epsilon_{i} \left( \frac{\epsilon_{f}}{\epsilon_{i}} \right)^{\frac{n}{n_{max}}}, \quad g_{n}^{j,j^{*}} = \exp\left( -\frac{\| j - j^{*} \|^{2}}{2 \sigma_{n}^{2}} \right), \quad \sigma_{n} = \sigma_{i} \left( \frac{\sigma_{f}}{\sigma_{i}} \right)^{\frac{n}{n_{max}}}.$$
  - Synthesize new patterns $\Gamma_{j}^{syn}$ by MCMC using the potential functions.
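The update step of the training procedure above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation. It assumes that $d$ is the squared Euclidean distance (the text does not specify it), that nodes are indexed by grid coordinates `(row, col)`, and that the schedule endpoints `eps_i`, `eps_f`, `sigma_i`, `sigma_f` take the illustrative default values shown. All function names are hypothetical.

```python
import math

def decay(initial, final, n, n_max):
    """Exponential schedule: initial * (final / initial) ** (n / n_max)."""
    return initial * (final / initial) ** (n / n_max)

def squared_distance(u, v):
    """Assumed choice for d(.,.): squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def neighborhood(j, j_star, sigma):
    """Gaussian neighborhood g = exp(-||j - j*||^2 / (2 sigma^2))."""
    return math.exp(-squared_distance(j, j_star) / (2.0 * sigma ** 2))

def update_potentials(lam, h_syn, h_train, n, n_max,
                      eps_i=0.5, eps_f=0.01, sigma_i=2.0, sigma_f=0.5):
    """Apply one potential update for a single training pattern.

    lam     : dict, node coordinate -> list of K potentials (updated in place)
    h_syn   : dict, node coordinate -> H(Gamma_j^syn) as a list of K floats
    h_train : H(Gamma_i) as a list of K floats
    Returns the winning node j*.
    """
    eps = decay(eps_i, eps_f, n, n_max)
    sigma = decay(sigma_i, sigma_f, n, n_max)
    # Winning node: synthesized pattern closest to the training pattern.
    j_star = min(h_syn, key=lambda j: squared_distance(h_train, h_syn[j]))
    for j in lam:
        g = neighborhood(j, j_star, sigma)
        # lambda <- lambda + eps * g * (H(syn) - H(train)), per component beta.
        lam[j] = [l + eps * g * (hs - ht)
                  for l, hs, ht in zip(lam[j], h_syn[j], h_train)]
    return j_star
```

Note the sign of the update: the potentials move along $H(\Gamma_j^{syn}) - H(\Gamma_i)$, so they stop changing for a node once its synthesized pattern matches the training statistics.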

#### 3.3. The Main Differences and Similarities

## 4. An Experimental Validation

#### 4.1. The Database

#### 4.2. Feature Extraction

#### 4.3. Evaluation

**Table 2.** 3-fold cross-validation recognition rates using $5\times 5$ Gibbsian and Kohonen networks for each character category.

Fold | Gibbsian network | Kohonen network | Difference (percentage points) |
---|---|---|---|
1 | 95.1 % | 94.4 % | +0.7 |
2 | 94.0 % | 92.1 % | +1.9 |
3 | 94.7 % | 92.4 % | +2.3 |
Mean | 94.6 % | 92.9 % | +1.7 |

#### 4.4. Test of Statistical Significance

## 5. Conclusions

## Appendix

## A. MCMC Shape Sampling Algorithm

Given a shape $\Gamma = ((x_{1}, y_{1}), (x_{2}, y_{2}), \ldots, (x_{n_{point}}, y_{n_{point}}))$:

- For $s = 1 \to s_{max}$, with $s_{max}$ being the maximum number of sweeps
  - For $p = 1 \to n_{point}$, with $n_{point}$ being the number of points on the shape
    - For $d = 1 \to n_{direction}$, with $n_{direction}$ being the number of directions
      - Move ${A}_{l}$ according to the $d$ directions to obtain a set of new shapes ${\Gamma}_{d}$, $l = 1, 2, \ldots, d$.
      - Compute $p({\Gamma}_{d}; \Lambda)$ by Equation (6).
    - Keep the shape $\widehat{\Gamma}$ that maximizes $p({\Gamma}_{d}; \Lambda)$ and assign it to ${\Gamma}^{syn}$.
    - Compute $\eta(\Gamma \to {\Gamma}^{syn}) = \min\left( \frac{p({\Gamma}^{syn}; \Lambda)}{p(\Gamma; \Lambda)}, 1 \right)$.
    - Draw a random number $r \in [0, 1]$ from a uniform distribution.
    - If $r < \eta$, replace $\Gamma$ by ${\Gamma}^{syn}$.
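The sampling steps above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' code. The density of Equation (6) is not reproduced here, so the sketch takes any callable `density(shape)` that returns an unnormalized $p(\Gamma; \Lambda)$. The move set of eight unit steps is also an assumption, since the text only names $n_{direction}$. The names `mcmc_shape_sample` and `DIRECTIONS` are hypothetical.

```python
import random

# Assumed move set: unit steps in 8 directions (the text only names n_direction).
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]

def mcmc_shape_sample(shape, density, s_max, rng=None, directions=DIRECTIONS):
    """Return a new shape after s_max sweeps; the input list is not modified.

    shape   : list of (x, y) points
    density : callable returning an unnormalized p(shape; Lambda)
    """
    rng = rng or random.Random()
    shape = list(shape)
    for _ in range(s_max):                      # sweeps
        for p in range(len(shape)):             # points on the shape
            x, y = shape[p]
            candidates = []
            for dx, dy in directions:           # candidate moves of point p
                cand = list(shape)
                cand[p] = (x + dx, y + dy)
                candidates.append(cand)
            # Keep the candidate with the highest density as Gamma^syn.
            best = max(candidates, key=density)
            p_cur = density(shape)
            # Acceptance ratio eta = min(p(Gamma^syn) / p(Gamma), 1).
            eta = 1.0 if p_cur == 0 else min(density(best) / p_cur, 1.0)
            if rng.random() < eta:
                shape = best
    return shape
```

For example, with a toy density such as `lambda s: math.exp(-sum(x * x + y * y for x, y in s))`, each sweep pulls points toward the origin, because an improving move has $\eta = 1$ and is always accepted.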

