# Coincidences and Estimation of Entropies of Random Variables with Large Cardinalities

## Abstract


**MSC:** 94A17; 62F12; 62F15

## 1. Introduction

## 2. Summary of the NSB Method

## 3. Saddle Point Analysis

#### Numerical Implementation

- The saddle point (the maximum of $\rho(\xi|\mathbf{n})$) is found numerically by:
  - (a) evaluating an approximation for $\kappa_0$ using the first few terms of the series, Equation (18);
  - (b) using the approximate value as a starting point for the Newton-Raphson iterative algorithm to solve for $\kappa_0$ from Equation (13);
  - (c) plugging the solution into the series expansion for the saddle $\kappa^*$, Equation (12);
  - (d) and, finally, using the latter solution as a starting point for the Newton-Raphson search for a more accurate value of $\kappa^*$ in Equation (11).
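The two Newton-Raphson refinements in steps (b) and (d) follow the same pattern: iterate from a good series-based starting point until the step size falls below tolerance. A minimal sketch is below; since Equations (11)–(13) and (18) are not reproduced here, the root equation and starting point are hypothetical stand-ins, not the paper's actual saddle-point conditions.

```python
import math

def newton_raphson(f, fprime, x0, tol=1e-12, max_iter=100):
    # Generic Newton-Raphson root finder, as used in steps (b) and (d):
    # iterate x <- x - f(x)/f'(x) from a good starting point until the
    # step size drops below the requested relative tolerance.
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol * max(1.0, abs(x)):
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Toy stand-in for the root equation: solve x = exp(-x), i.e.
# f(x) = x - exp(-x) = 0, starting from a crude approximation x0 = 0.5
# (playing the role of the truncated-series estimate in step (a)).
kappa0 = newton_raphson(lambda x: x - math.exp(-x),
                        lambda x: 1.0 + math.exp(-x),
                        x0=0.5)
```

The same helper is then reused in step (d), seeded with the series expansion from step (c), which is what makes the overall scheme robust: Newton-Raphson converges quadratically once started close enough to the root.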

- Each of the integrands in Equation (7) is divided by the value of $\rho(\xi|\mathbf{n})$ at the saddle point, so that the maximum of each integrand is $O(1)$.
- The curvature around the saddle point (and hence the posterior variance) is evaluated numerically.
- The integrals are evaluated numerically over a range spanning a few standard deviations on either side of the saddle point; the range is controlled by the user-specified desired accuracy.
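The normalized integration above can be sketched as follows, here with a simple trapezoidal rule. The function name and the Gaussian test density are illustrative, not the actual $\rho(\xi|\mathbf{n})$; working with the log of the integrand and subtracting its value at the saddle is what keeps the exponentials $O(1)$ and avoids overflow.

```python
import math

def integrate_around_saddle(log_rho, saddle, sigma, n_std=5, n_points=2001):
    # Trapezoidal integral of rho(x) / rho(saddle) over the window
    # [saddle - n_std*sigma, saddle + n_std*sigma].  Dividing by the value
    # at the saddle (in log space) keeps the integrand O(1).
    log_rho_max = log_rho(saddle)
    a = saddle - n_std * sigma
    h = 2.0 * n_std * sigma / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        x = a + i * h
        w = 0.5 if i in (0, n_points - 1) else 1.0  # trapezoid endpoint weights
        total += w * math.exp(log_rho(x) - log_rho_max)
    return h * total  # integral of rho / rho(saddle)

# Sanity check with a Gaussian log-density, log rho(x) = -x^2/2:
# the normalized integral should be approximately sqrt(2*pi).
val = integrate_around_saddle(lambda x: -0.5 * x * x, saddle=0.0, sigma=1.0)
```

Widening `n_std` or refining `n_points` trades runtime for accuracy, which is the knob the user-specified accuracy setting controls in the description above.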

## 4. Choosing a Value for K?

## 5. Unknown or Infinite K

## 6. Conclusions

## Acknowledgments

## References


© 2011 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Nemenman, I. Coincidences and Estimation of Entropies of Random Variables with Large Cardinalities. *Entropy* **2011**, *13*, 2013-2023. https://doi.org/10.3390/e13122013
