Estimation Bias in Maximum Entropy Models

1 Gatsby Computational Neuroscience Unit, UCL, London WC1N 3AR, UK
2 Max Planck Institute for Biological Cybernetics, Bernstein Center for Computational Neuroscience and Werner Reichardt Centre for Integrative Neuroscience, 72076 Tübingen, Germany
3 School for Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
Received: 25 June 2013; in revised form: 25 July 2013 / Accepted: 29 July 2013 / Published: 2 August 2013
Abstract: Maximum entropy models have become popular statistical models in neuroscience and other areas in biology and can be useful tools for obtaining estimates of mutual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e., the true entropy of the data can be severely underestimated. Here, we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We focus on pairwise binary models, which are used extensively to model neural population activity. We show that if the data are well described by a pairwise model, the bias is equal to the number of parameters divided by twice the number of observations. If, however, the higher order correlations in the data deviate from those predicted by the model, the bias can be larger. Using a phenomenological model of neural population recordings, we find that this additional bias is highest for small firing probabilities, strong correlations and large population sizes; for the parameters we tested, the bias is a factor of about four higher than when the data are well described by the model. We derive guidelines for how long a neurophysiological experiment needs to be in order to ensure that the bias is less than a specified criterion. Finally, we show how a modified plug-in estimate of the entropy can be used for bias correction.
Keywords: maximum entropy; sampling bias; asymptotic bias; model misspecification; neurophysiology; neural population coding; Ising model; Dichotomized Gaussian
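
The rule stated in the abstract (bias equal to the number of parameters divided by twice the number of observations) can be illustrated with a minimal sketch. It assumes that the model is well specified, that the entropy is measured in nats, and that a pairwise binary model on n units has n first-order and n(n-1)/2 second-order parameters. The function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def n_pairwise_params(n_units: int) -> int:
    """Parameter count of a pairwise binary model (assumed: n means plus
    n*(n-1)/2 pairwise terms, i.e. n*(n+1)/2 in total)."""
    return n_units * (n_units + 1) // 2


def asymptotic_bias(n_params: int, n_obs: int) -> float:
    """Magnitude of the asymptotic entropy bias for a well-specified model:
    number of parameters divided by twice the number of observations.
    Units are assumed to be nats; the estimate lies below the true entropy."""
    return n_params / (2.0 * n_obs)


def min_observations(n_params: int, max_bias: float) -> int:
    """Smallest number of observations for which the asymptotic bias
    magnitude does not exceed max_bias (same units as asymptotic_bias)."""
    return math.ceil(n_params / (2.0 * max_bias))


if __name__ == "__main__":
    m = n_pairwise_params(10)  # hypothetical population of 10 units
    print(m, asymptotic_bias(m, 1000), min_observations(m, 0.25))
```

When the higher order correlations in the data deviate from those predicted by the model, the abstract notes that the bias can be larger, so a sample size obtained this way is best read as a lower bound on the required recording length.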

