# Large Deviations Properties of Maximum Entropy Markov Chains from Spike Trains


## Abstract


## 1. Introduction

## 2. Preliminaries

#### 2.1. Data Binarization and Spike Trains

#### 2.2. Features

#### 2.3. Statistical Structure

- (i) the random variation in the ionic flux of charges crossing the cellular membrane per unit time at the post synaptic button due to the binding of neurotransmitter;
- (ii)
- (iii) noise coming from electrical synapses [39].

#### 2.4. Empirical Averages

## 3. Inference of the Statistical Model with the MEP

#### 3.1. Fundamentals of the MEP

- I.
- II. Using the available data ${\mathit{x}}_{0,T-1}$, compute the empirical average of each feature, ${A}_{T}\left({f}_{k}\right):={c}_{k}$.
- III. Assuming stationarity, define the space of statistical models $\mathcal{M}({c}_{1},\dots ,{c}_{K})\subset \mathcal{M}$ given by
  $$\mathcal{M}({c}_{1},\dots ,{c}_{K})=\{g\in \mathcal{M}\;|\;{\mathbb{E}}_{g}\left\{{f}_{1}\right\}={c}_{1},\dots ,{\mathbb{E}}_{g}\left\{{f}_{K}\right\}={c}_{K}\}.$$
- IV. Defining the entropy rate of the stochastic process as
  $$\mathcal{S}\left\{p\right\}=\lim_{t\to \infty}\frac{1}{t}\sum _{{\mathit{x}}_{0,t-1}\in {\mathcal{A}}_{t}^{N}}{p}_{t}\left\{{\mathit{x}}_{0,t-1}\right\}\log \frac{1}{{p}_{t}\left\{{\mathit{x}}_{0,t-1}\right\}},$$
  select the model that maximizes it over this space:
  $$\widehat{p}=\underset{q\,\in \,\mathcal{M}({c}_{1},\dots ,{c}_{K})}{\arg \max}\;\mathcal{S}\left\{q\right\}.$$
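Step II can be illustrated with a short Python sketch. It assumes the spike train is stored as a binary NumPy array of shape (T, N) and that each feature is a product of spike events, encoded as a list of (neuron, time lag) pairs. The function name `empirical_average`, this encoding, and the choice of averaging over the T − R + 1 blocks of length R are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def empirical_average(raster, feature):
    """Empirical average of a monomial feature over a binary raster.

    raster  : array of shape (T, N); raster[n, k] = 1 if neuron k spikes at time n
    feature : list of (neuron, lag) pairs; the feature is the product of the
              corresponding spike events (hypothetical encoding)
    """
    raster = np.asarray(raster)
    T = raster.shape[0]
    R = 1 + max(lag for _, lag in feature)   # range of the feature
    n_windows = T - R + 1                    # number of blocks of length R
    prod = np.ones(n_windows)
    for neuron, lag in feature:
        prod *= raster[lag:lag + n_windows, neuron]
    return prod.mean()

# Toy raster: 4 time bins, 2 neurons (made-up data).
raster = np.array([[1, 0],
                   [1, 1],
                   [0, 1],
                   [1, 1]])

c_rate = empirical_average(raster, [(0, 0)])           # firing rate of neuron 0
c_sync = empirical_average(raster, [(0, 0), (1, 0)])   # synchronous pair
c_lag = empirical_average(raster, [(0, 0), (1, 1)])    # neuron 0 at n, neuron 1 at n+1
```

The resulting numbers play the role of the constraints $c_k$ in step III; features with a nonzero lag are the ones that make the constraints non-synchronous.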

#### 3.2. Time-Independent Constraints

#### 3.3. Non-Synchronous Constraints

#### 3.3.1. Transfer Matrix Method

#### 3.3.2. Thermodynamic Formalism

## 4. Large Deviations and Applications in MEMC

#### 4.1. Preliminary Considerations

#### 4.1.1. Central Limit Theorem

#### 4.1.2. Large Deviations

#### 4.2. Large Deviations for Features of MEMC

#### 4.3. Large Deviations for the Entropy Production

#### 4.4. Large Deviations and MEMC Distinguishability

## 5. Illustrative Examples

- Choose the features and build the energy function (Equation (7)).
- Build the transfer matrix (Equation (10)).
- Compute the free energy and find the maximum entropy parameters using Equation (17).
- Build the Markov transition matrix using Equation (12).
- Choose the observable to examine and build the tilted transition matrix using Equation (22).
- Compute the Legendre transform of the log maximum eigenvalue of the tilted transition matrix to obtain the rate function (Equation (24)).
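The steps above can be sketched in Python for a single neuron and a feature of range two. The referenced equations are not reproduced in this section, so the sketch assumes the standard constructions: a transfer matrix with entries $e^{\mathcal{H}}$, a transition matrix obtained by normalizing it with its Perron eigenvalue and right eigenvector, a tilted matrix $P(x,y)\,e^{k f(x,y)}$, and a Legendre transform evaluated on a finite grid of $k$. The parameter values and all names (`perron`, `scgf`, `rate_function`) are illustrative, and the parameter fitting of step 3 is skipped by taking the parameters as given.

```python
import numpy as np

def perron(M):
    """Largest eigenvalue and normalized right eigenvector of a positive matrix."""
    vals, vecs = np.linalg.eig(M)
    i = np.argmax(vals.real)
    v = vecs[:, i].real
    return vals[i].real, v / v.sum()

# Steps 1-2: energy function of range two and its transfer matrix.
beta_rate, beta_pair = -1.0, 1.0                    # assumed, not fitted to data
states = np.array([0, 1])
x, y = np.meshgrid(states, states, indexing="ij")   # x: pattern at n, y: pattern at n+1
f = (x * y).astype(float)                           # observable: spikes at n and n+1
H = beta_rate * y + beta_pair * f
L = np.exp(H)

# Steps 3-4: free energy and Markov transition matrix.
rho, r = perron(L)
free_energy = np.log(rho)
P = L * r[None, :] / (rho * r[:, None])

# Step 5: tilted matrix and scaled cumulant generating function (SCGF).
def scgf(k):
    lam, _ = perron(P * np.exp(k * f))
    return np.log(lam)

# Step 6: rate function as a Legendre transform over a grid of k values.
ks = np.linspace(-5.0, 5.0, 2001)
lams = np.array([scgf(k) for k in ks])

def rate_function(s):
    return np.max(ks * s - lams)

# Stationary mean of the observable, where the rate function should vanish.
_, pi = perron(P.T)
mean_f = np.sum(pi[:, None] * P * f)
```

Evaluating `rate_function(mean_f)` should give a value close to zero, and values of `s` away from `mean_f` should give positive values. The finite grid of `k` limits the accuracy in the tails of the rate function.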

#### 5.1. First Example: Maximum Entropy Model of a Range Two Feature

#### 5.2. Second Example: Maximum Entropy Model With Only Synchronous Constraints

#### 5.3. Third Example: Past Independent and Markov Maximum Entropy Measures

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
MEP | Maximum entropy principle |
MEMC | Maximum entropy Markov chain |
SCGF | Scaled cumulant generating function |
CLT | Central limit theorem |
LLN | Law of large numbers |
LDP | Large deviation principle |
IEP | Information entropy production |
KSE | Kolmogorov–Sinai entropy |
NESS | Non-equilibrium steady states |

**Symbol list**

Symbol | Description |
---|---|
${x}_{n}^{k}$ | Spiking state of neuron k at time n. |
${\mathit{x}}_{n}$ | Spike pattern at time n. |
${\mathit{x}}_{{t}_{1},{t}_{2}}$ | Spike block from time ${t}_{1}$ to ${t}_{2}$. |
${A}_{T}\left(f\right)$ | Empirical average value of the feature f considering T spike patterns. |
${\mathcal{A}}_{R}^{N}$ | Set of spike blocks of N neurons and length R. |
$\mathcal{S}\left[\,p\,\right]$ | Entropy of the probability measure p. |
$\mathcal{H}$ | Energy function. |
$\mathcal{P}\left[\,\mathcal{H}\,\right]$ | Free energy or topological pressure. |
${\lambda}_{f}\left(k\right)$ | Scaled cumulant generating function of f. |
${I}_{f}\left(s\right)$ | Rate function of f. |

## Appendix A. Discrete-Time Markov Chains and Spike Train Statistics

## Appendix B. Cumulant Generating Function

## Appendix C. Linear Response

## Appendix D. Time Correlations from Topological Pressure

## Appendix E. Gallavotti–Cohen Fluctuation Theorem

## References

- Rieke, F.; Warland, D.; de Ruyter van Steveninck, R.; Bialek, W. Spikes, Exploring the Neural Code; MIT Press: Cambridge, MA, USA, 1996.
- Friston, K.J. Functional and effective connectivity: A review. Brain Connect. **2011**, 1, 13–36.
- Okatan, M.; Wilson, M.A.; Brown, E.N. Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity. Neural Comput. **2005**, 17, 1927–1961.
- Ganmor, E.; Segev, R.; Schneidman, E. The architecture of functional interaction networks in the retina. J. Neurosci. **2011**, 31, 3044–3054.
- Ferrea, E.; Maccione, A.; Medrihan, L.; Nieus, T.; Ghezzi, D.; Baldelli, P.; Benfenati, F.; Berdondini, L. Large-scale, high-resolution electrophysiological imaging of field potentials in brain slices with microelectronic multielectrode arrays. Front. Neural Circuits **2012**, 6, 80.
- Schneidman, E.; Berry, M.J., II; Segev, R.; Bialek, W. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature **2006**, 440, 1007–1012.
- Tkačik, G.; Marre, O.; Mora, T.; Amodei, D.; Berry, M.J., II; Bialek, W. The simplest maximum entropy model for collective behavior in a neural network. J. Stat. Mech. **2013**, 2013, P03011.
- Vasquez, J.C.; Marre, O.; Palacios, G.A.; Berry, M.J., II; Cessac, B. Gibbs distribution analysis of temporal correlation structure on multicell spike trains from retina ganglion cells. J. Physiol. Paris **2012**, 106, 120–127.
- Marre, O.; El Boustani, S.; Frégnac, Y.; Destexhe, A. Prediction of spatiotemporal patterns of neural activity from pairwise correlations. Phys. Rev. Lett. **2009**, 102, 138101.
- Croner, L.J.; Purpura, K.; Kaplan, E. Response variability in retinal ganglion cells of primates. Proc. Natl. Acad. Sci. USA **1993**, 90, 8128–8130.
- Shadlen, M.N.; Newsome, W.T. The variable discharge of cortical neurons: Implications for connectivity, computation, and information coding. J. Neurosci. **1998**, 18, 3870–3896.
- Pillow, J.W.; Shlens, J.; Paninski, L.; Sher, A.; Litke, A.M.; Chichilnisky, E.J.; Simoncelli, E.P. Spatio-temporal correlations and visual signaling in a complete neuronal population. Nature **2008**, 454, 995–999.
- Tkačik, G.; Marre, O.; Amodei, D.; Schneidman, E.; Bialek, W.; Berry, M.J., II. Searching for collective behavior in a large network of sensory neurons. PLoS Comput. Biol. **2013**, 10, e1003408.
- Tkačik, G.; Mora, T.; Marre, O.; Amodei, D.; Berry, M.J., II; Bialek, W. Thermodynamics for a network of neurons: Signatures of criticality. Proc. Natl. Acad. Sci. USA **2015**, 112, 11508–11513.
- Tang, A.; Jackson, D.; Hobbs, J.; Chen, W.; Smith, J.; Patel, H.; Prieto, A.; Petrusca, D.; Grivich, M.; Sher, A.; et al. A maximum entropy model applied to spatial and temporal correlations from cortical networks in vitro. J. Neurosci. **2008**, 28, 505–518.
- Schrödinger, E. What Is Life? The Physical Aspect of the Living Cell; Cambridge University Press: Cambridge, UK, 1983.
- Deem, M. Mathematical adventures in biology. Phys. Today **2007**, 60, 42–47.
- Prigogine, I. Nonequilibrium Statistical Mechanics; Monographs in Statistical Physics; Interscience Publishers: Geneva, Switzerland, 1962; Volume 1.
- Filyukov, A.A.; Karpov, V.Y. Description of steady transport processes by the method of the most probable path of evolution. J. Eng. Phys. **1967**, 13, 624–630.
- Filyukov, A.A.; Karpov, V.Y. Method of the most probable path of evolution in the theory of stationary irreversible processes. J. Eng. Phys. Thermophys. **1967**, 13, 416–419.
- Favretti, M. The maximum entropy rate description of a thermodynamic system in a stationary non-equilibrium state. Entropy **2009**, 4, 675–687.
- Monthus, C. Non-equilibrium steady states: Maximization of the Shannon entropy associated with the distribution of dynamical trajectories in the presence of constraints. J. Stat. Mech. Theor. Exp. **2011**, 3, P03008.
- Shi, P.; Qian, H. Chapter Irreversible Stochastic Processes, Coupled Diffusions and Systems Biochemistry. In Frontiers in Computational and Systems Biology; Feng, J., Fu, W., Sun, F., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 175–201.
- Cofré, R.; Cessac, B. Exact computation of the maximum entropy potential of spiking neural networks models. Phys. Rev. E **2014**, 89, 052117.
- Cofré, R.; Maldonado, C. Information entropy production of maximum entropy Markov chains from spike trains. Entropy **2018**, 20, 34.
- Mora, T.; Bialek, W. Are biological systems poised at criticality? J. Stat. Phys. **2011**, 144, 268–302.
- Ellis, R. Entropy, Large Deviations and Statistical Mechanics; Springer: Berlin/Heidelberg, Germany, 1985.
- Dembo, A.; Zeitouni, O. Large Deviations Techniques and Applications; Springer: Berlin/Heidelberg, Germany, 2010.
- Georgii, H. Probabilistic Aspects of Entropy; Greven, A., Keller, G., Warnecke, G., Eds.; Princeton University Press: Princeton, NJ, USA, 2003.
- Balasubramanian, V. Statistical inference, Occam's Razor, and statistical mechanics on the space of probability distributions. Neural Comput. **1997**, 9, 349–368.
- Mastromatteo, I.; Marsili, M. On the criticality of inferred models. J. Stat. Mech. **2011**, 2011, P10012.
- Macke, J.; Murray, I.; Latham, P. Estimation bias in maximum entropy models. Entropy **2013**, 15, 3109–3129.
- Marsili, M.; Mastromatteo, I.; Roudi, Y. On sampling and modeling complex systems. J. Stat. Mech. **2013**, 2013, P09003.
- Quiroga, R.Q.; Nadasdy, Z.; Ben-Shaul, Y. Unsupervised spike sorting with wavelets and superparamagnetic clustering. Neural Comput. **2004**, 16, 1661–1678.
- Hill, D.N.; Mehta, S.B.; Kleinfeld, D. Quality metrics to accompany spike sorting of extracellular signals. J. Neurosci. **2011**, 31, 8699–8705.
- Gerstner, W.; Kistler, W. Spiking Neuron Models; Cambridge University Press: Cambridge, UK, 2002.
- Schwalger, T.; Fisch, K.; Benda, J.; Lindner, B. How noisy adaptation of neurons shapes interspike interval histograms and correlations. PLoS Comput. Biol. **2010**, 6, e1001026.
- Linaro, D.; Storace, M.; Giugliano, M. Accurate and fast simulation of channel noise in conductance-based model neurons by diffusion approximation. PLoS Comput. Biol. **2011**, 7, e1001102.
- Cofré, R.; Cessac, B. Dynamics and spike trains statistics in conductance-based Integrate-and-Fire neural networks with chemical and electric synapses. Chaos Soliton. Fract. **2013**, 50, 13–31.
- Jaynes, E. Information theory and statistical mechanics. Phys. Rev. **1957**, 106, 620–630.
- Jaynes, E. Probability Theory: The Logic of Science; Cambridge University Press: Cambridge, UK, 2003.
- Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern. Anal. Mach. **2005**, 27, 1226–1238.
- Brown, G.; Pocock, A.; Zhao, M.; Luján, M. Conditional likelihood maximisation: A unifying framework for information theoretic feature selection. J. Mach. Learn. Res. **2012**, 13, 27–66.
- Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
- Nasser, H.; Cessac, B. Parameter estimation for spatio-temporal maximum entropy distributions: Application to neural spike trains. Entropy **2014**, 16, 2244–2277.
- Tkačik, G.; Prentice, J.S.; Balasubramanian, V.; Schneidman, E. Optimal population coding by noisy spiking neurons. Proc. Natl. Acad. Sci. USA **2010**, 107, 14419–14424.
- Sinai, Y.G. Gibbs measures in ergodic theory. Russ. Math. Surv. **1972**, 27, 21–69.
- Chliamovitch, G.; Dupuis, A.; Chopard, B. Maximum entropy rate reconstruction of Markov dynamics. Entropy **2015**, 17, 3738–3751.
- Seneta, E. Non-Negative Matrices and Markov Chains; Springer: Berlin/Heidelberg, Germany, 2006.
- Bowen, R. Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms; Second Revised Version; Springer: Berlin/Heidelberg, Germany, 2008.
- Levin, D.A.; Peres, Y.; Wilmer, E.L. Markov Chains and Mixing Times; American Mathematical Society: Providence, RI, USA, 2009.
- Jones, G.L. On the Markov chain central limit theorem. Probab. Surv. **2004**, 1, 299–320.
- Touchette, H. The large deviation approach to statistical mechanics. Phys. Rep. **2009**, 478, 1–69.
- Ellis, R.S. The theory of large deviations and applications to statistical mechanics. In Long-Range Interacting Systems; Oxford University Press: Oxford, UK, 2010.
- Maes, C. The fluctuation theorem as a Gibbs property. J. Stat. Phys. **1999**, 95, 367–392.
- Gaspard, P. Time-reversed dynamical entropy and irreversibility in Markovian random processes. J. Stat. Phys. **2004**, 117, 599–615.
- Jiang, D.Q.; Qian, M.; Qian, M.P. Mathematical Theory of Nonequilibrium Steady States; Springer: Berlin/Heidelberg, Germany, 2004.
- Amari, S. Information Geometry of Multiple Spike Trains. In Analysis of Parallel Spike Trains; Springer Series in Computational Neuroscience; Grün, S., Rotter, S., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; Volume 7, pp. 221–253.
- Gaspard, P. Random paths and current fluctuations in nonequilibrium statistical mechanics. J. Math. Phys. **2014**, 55, 075208.

**Figure 1.** (**A**) Each bar indicates a spike of a neuron indexed from 1 to 4 in continuous time. (**B**) After binning with bin size $\mathsf{\Delta}{t}_{b}$, the spiking activity is transformed into binary patterns in discrete time. We illustrate the notation used throughout this paper.
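The binning described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: spike times are given as one array per neuron, a bin is set to 1 when the neuron fires at least once in it, and the function name `binarize` and its arguments are hypothetical rather than taken from the paper.

```python
import numpy as np

def binarize(spike_times, t_start, t_stop, bin_size):
    """Turn per-neuron spike times into a binary array of shape (n_bins, n_neurons).

    spike_times : list of 1D arrays; spike_times[k] holds the spike times of neuron k
    An entry is 1 if the neuron fired at least once in the bin, 0 otherwise.
    """
    n_neurons = len(spike_times)
    n_bins = int(np.ceil((t_stop - t_start) / bin_size))
    edges = t_start + bin_size * np.arange(n_bins + 1)
    raster = np.zeros((n_bins, n_neurons), dtype=int)
    for k in range(n_neurons):
        counts, _ = np.histogram(spike_times[k], bins=edges)
        raster[:, k] = counts > 0        # several spikes in one bin still give 1
    return raster

# Made-up spike times (arbitrary time units) for two neurons.
spike_times = [np.array([0.5, 1.2, 1.4]), np.array([2.7])]
raster = binarize(spike_times, t_start=0.0, t_stop=3.0, bin_size=1.0)
```

Each row of `raster` is then one spike pattern ${\mathit{x}}_{n}$, and consecutive rows form spike blocks. Note that the two spikes of the first neuron falling in the same bin are merged into a single 1, which is the information lost by binarization.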

**Figure 2.** Algorithmic view of the maximum entropy Markov chains (MEMC): Inputs are the spike trains ${\left\{{x}_{i}\right\}}_{i=1}^{T}$ and the average values of a set of features. The output is the MEMC transition matrix P.

**Figure 3.** Algorithmic view of the method: Inputs are the maximum entropy Markov transition matrix and a feature. From the inputs, the tilted transition matrix can be built. The rate function of the feature is obtained as the Legendre transform of the log maximum eigenvalue of the tilted transition matrix. Using this framework, we can estimate the large deviations of the average values of the features.

**Figure 4.** (**A**) Scaled cumulant generating function (SCGF) (Equation (24)) for the feature ${f}_{1}$ of the first example computed at the values provided by the table above. (**B**) Rate function for the same feature computed at the same parameter values as the SCGF. Each of these functions is obtained by taking the Legendre transform of the respective SCGF in (**A**).

**Figure 5.** Gallavotti–Cohen symmetry relationship for the information entropy production (IEP) for the values in Table 1. Left: SCGF ${\lambda}_{W}\left(k\right)$. Right: rate function ${I}_{W}\left(s\right)$ of the IEP feature $W$.
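A symmetry of this kind can be checked numerically with a short Python sketch. Two assumptions are made here and are not taken from the text: the IEP observable on a transition $x\to y$ is $W(x,y)=\log\left[\pi(x)P(x,y)/\left(\pi(y)P(y,x)\right)\right]$, with $\pi$ the stationary distribution, and the symmetry takes the form ${\lambda}_{W}(k)={\lambda}_{W}(-k-1)$. The sign convention used in Appendix E may differ. The transition matrix below is an arbitrary positive matrix, not one of the models of the examples.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((4, 4)) + 0.1            # arbitrary positive matrix (illustrative)
P /= P.sum(axis=1, keepdims=True)       # normalize rows to get a transition matrix

# Stationary distribution: left Perron eigenvector of P, normalized to sum to 1.
vals, vecs = np.linalg.eig(P.T)
pi = vecs[:, np.argmax(vals.real)].real
pi /= pi.sum()

# Assumed IEP observable on transitions x -> y.
W = np.log((pi[:, None] * P) / (pi[None, :] * P.T))

def scgf_W(k):
    """Log of the largest eigenvalue of the tilted matrix P(x, y) * exp(k * W(x, y))."""
    tilted = P * np.exp(k * W)
    return np.log(np.max(np.linalg.eigvals(tilted).real))

ks = np.linspace(-2.0, 1.0, 7)
lhs = np.array([scgf_W(k) for k in ks])
rhs = np.array([scgf_W(-k - 1.0) for k in ks])
max_gap = np.max(np.abs(lhs - rhs))     # should vanish up to rounding errors
```

Under these assumptions the identity holds because the tilted matrix at $-k-1$ is, up to a diagonal similarity transformation, the transpose of the tilted matrix at $k$, so both share the same spectrum.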

**Figure 7.** (**A**) Rate functions of the synchronous feature ${x}_{0}^{1}{x}_{0}^{2}$ for both energy functions. (**B**) Moving averages computed from a sample of length 20,000 of the Markov chain with transition matrix P.

**Figure 8.** The fluctuations of the synchronous feature around the mean computed from the sample of the Markov chain are indicated with the bars. The Gaussian fluctuations around the mean predicted by the large deviations rate functions of both models are plotted. The curve predicted by the Markov model obtained from $\mathcal{H}$ is in green and the curve predicted by the model obtained from $\tilde{\mathcal{H}}$ is in orange.

c | ${\beta}_{1}$ | IEP |
---|---|---|
0.043 | −2 | 0.176 |
0.11 | −1 | 0.056 |
0.25 | 0 | 0 |
0.475 | 1 | 0.0525 |
0.711 | 2 | 0.1184 |

**Table 2.** Set of values used in Figure 6.

${A}_{T}\left({f}_{k}\right)$ | ${c}_{k}$ | ${\beta}_{k}$ | $\delta {\beta}_{k}$ | ${\tilde{c}}_{k}$ |
---|---|---|---|---|
${A}_{T}\left({x}^{1}\right)$ | 0.3 | −1.0436 | 0 | 0.30350016 |
${A}_{T}\left({x}^{2}\right)$ | 0.2 | −1.6727 | 0 | 0.20127414 |
${A}_{T}\left({x}^{3}\right)$ | 0.1 | −2.8163 | 0 | 0.10450018 |
${A}_{T}\left({x}^{1}{x}^{2}\right)$ | 0.08 | 0.4590 | 0 | 0.08187418 |
${A}_{T}\left({x}^{1}{x}^{3}\right)$ | 0.05 | 0.8604 | 0.1 | 0.05475019 |
${A}_{T}\left({x}^{2}{x}^{3}\right)$ | 0.04 | 1.0325 | 0 | 0.04207419 |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Cofré, R.; Maldonado, C.; Rosas, F.
Large Deviations Properties of Maximum Entropy Markov Chains from Spike Trains. *Entropy* **2018**, *20*, 573.
https://doi.org/10.3390/e20080573
