*Entropy* **2015**, *17*(7), 4701-4743; doi:10.3390/e17074701

## Abstract

We study the asymptotic law of a network of interacting neurons when the number of neurons becomes infinite. Given a completely connected network of neurons in which the synaptic weights are Gaussian correlated random variables, we describe the asymptotic law of the network when the number of neurons goes to infinity. We introduce the process-level empirical measure of the trajectories of the solutions to the equations of the finite network of neurons and the averaged law (with respect to the synaptic weights) of the trajectories of the solutions to the equations of the network of neurons. The main result of this article is that the image law through the empirical measure satisfies a large deviation principle with a good rate function which is shown to have a unique global minimum. Our analysis of the rate function allows us also to characterize the limit measure as the image of a stationary Gaussian measure defined on a transformed set of trajectories.

## 1. Introduction

The goal of this paper is to study the asymptotic behavior and large deviations of a network of interacting neurons when the number of neurons becomes infinite. Our network may be thought of as a network of weakly-interacting diffusions: thus, before we begin, we briefly overview other asymptotic analyses of such systems. In particular, a lot of work has been done on spin glass dynamics, including Ben Arous and Guionnet on the mathematical side [1–4] and Sompolinsky and his co-workers on the theoretical physics side [5–8]. Furthermore, the large deviations of weakly interacting diffusions have been extensively studied by Dawson and Gärtner [9,10] and, more recently, by Budhiraja, Dupuis and Fischer [11,12]. More references to previous work on this particular subject can be found in these references.

Because the dynamics of spin glasses is not too far from that of networks of interacting neurons, Sompolinsky also successfully explored this particular topic [13] for fully connected networks of rate neurons, i.e., neurons represented by the time variation of their firing rates (the number of spikes they emit per unit of time), as opposed to spiking neurons, i.e., neurons represented by the time variation of their membrane potential (including the individual spikes). For an introduction to these notions, the interested reader is referred to such textbooks as [14–16]. In their study of the continuous time dynamics of networks of rate neurons, Sompolinsky and his colleagues assumed, as in the work on spin glasses, that the coupling coefficients, called the synaptic weights in neuroscience, were random variables independent and identically distributed with zero mean Gaussian laws. The main result obtained by Ben Arous and Guionnet for spin glass networks using a large deviations approach (respectively by Sompolinsky and his colleagues for networks of rate neurons using the local chaos hypothesis) under the previous hypotheses is that the averaged law of Langevin spin glass (respectively rate neurons) dynamics is chaotic in the sense that the averaged law of a finite number of spins (respectively neurons) converges to a product measure as the system gets very large.

The next theoretical efforts in the direction of understanding the averaged law of rate neurons are those of Cessac, Moynot and Samuelides [17–21]. From the technical viewpoint, the study of the collective dynamics is done in discrete time, assuming no leak (this term is explained below) in the individual dynamics of each of the rate neurons. Moynot and Samuelides obtained a large deviation principle and were able to describe in detail the limit averaged law that had been obtained by Cessac using the local chaos hypothesis and to prove rigorously the propagation of chaos property. Moynot extended these results to the more general case where the neurons can belong to two populations, the synaptic weights are non-Gaussian (with some restrictions) but still independent and identically distributed, and the network is not fully connected (with some restrictions) [18].

The common thread to all of the above approaches is that, in the large network limit, the neurons are (probabilistically) independent of each other. This independence is desirable because it facilitates a reduction to the macroscopic level, since the net activity of the network can be accurately represented by the mean activity of any particular neuron. However, as our results further below demonstrate, complete independence between the neurons is not the only situation in which one may obtain an accurate reduction to the macroscopic level. We are therefore motivated to incorporate in the network model the fact that the synaptic weights are not independent and in effect often highly correlated. One of the reasons for this is the plasticity processes at work at the levels of the synaptic connections between neurons; see for example [22] for a biological viewpoint, and [14,16,23] for a more computational and mathematical account of these phenomena.

Our results imply that there are system-wide correlations between the neurons, even in the asymptotic limit. The key reason why we do not have propagation of chaos is that the Radon-Nikodym derivative $\frac{d{Q}^{N}}{d{P}^{N}}$ of the average laws in Proposition 8 cannot be tensored into N independent and identically distributed processes; whereas the simpler assumptions on the weight function Λ in Moynot and Samuelides allow the Radon-Nikodym derivative to be tensored. We remind the reader that the Radon-Nikodym derivative of a measure with respect to another measure is an extension to more general spaces of the following simple result from differential calculus: given two differentiable functions F(x) and G(x) defined on ℝ with derivatives f(x) and g(x), the ratio of the differentials dF(x) and dG(x) is equal to f(x)/g(x) whenever g(x) ≠ 0. In this example, the first measure is f(x) dx and the second g(x) dx. The interested reader may look at standard textbooks on real and complex analysis such as [24].
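The calculus analogy above can be written out as a short derivation; a minimal sketch in the notation of the paragraph:

```latex
% If dF = f(x)dx and dG = g(x)dx on the real line, then the
% Radon-Nikodym derivative dF/dG is the ratio of the densities.
\[
  F(x) = \int_{-\infty}^{x} f(y)\,dy, \qquad
  G(x) = \int_{-\infty}^{x} g(y)\,dy,
\]
\[
  \frac{dF}{dG}(x)
  = \lim_{h \to 0} \frac{F(x+h) - F(x)}{G(x+h) - G(x)}
  = \frac{f(x)}{g(x)}
  \quad \text{whenever } g(x) \neq 0.
\]
```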

A very important implication of our result is that the mean-field behavior is insufficient to characterize the behavior of a population. Our limit process µ_{e} is system-wide and ergodic. Our work challenges the assumption held by some that one cannot have a “concise” macroscopic description of a neural network without an assumption of asynchronicity at the local population level.

In more detail, the problem we solve in this paper is the following. Given a completely connected network of firing rate neurons in which the synaptic weights are Gaussian correlated random variables, we describe the asymptotic behavior of the network when the number of neurons goes to infinity. As in [18,19] we study a discrete time dynamics, but unlike these authors we cope with more complex intrinsic dynamics of the neurons; in particular we allow for a leak (to be explained in more detail below). In the large-size limit, the neurons are highly correlated. The probabilistic law is ergodic, which basically means that it is invariant under a shift of the indices. Despite the non-trivial correlations, we are able to obtain a macroscopic process µ_{e} which describes the large-size behavior of the system. Furthermore we are able to obtain various “reductions” to the macroscopic level, as outlined in Section 6.

To be complete, let us mention the fact that this problem has already been partially explored in physics by Sompolinsky and Zippelius [5,6] and in mathematics by Alice Guionnet [4], who analyzed symmetric spin glass dynamics, i.e., the case where the matrix of the coupling coefficients (the synaptic weights in our case) is symmetric. This is a very special case of correlation. The work in [25] is also an important step forward in the direction of understanding the spin glass dynamics when more general correlations are present.

Let us also mention very briefly another class of approaches toward the description of very large populations of neurons where the individual spikes generated by the neurons are considered. The model for individual neurons is usually of the class of Integrate and Fire (IF) neurons [26] and the underlying mathematical tools are those of the theory of point-processes [27]. Important results have been obtained in this framework by Gerstner and his collaborators, e.g., [28,29] in the case of deterministic synaptic weights. Related to this approach but from a more mathematical viewpoint, important results on the solutions of the mean-field equations have been obtained in [30]. In the case of spiking neurons but with a continuous dynamics (unlike that of IF neurons), the first author and collaborators have recently obtained some limit equations that describe the asymptotic dynamics of fully connected networks of neurons [31] with independent synaptic weights.

Because of the correlation of the synaptic weights, the natural space to work in is the infinite dimensional space of the trajectories, noted ${\mathcal{T}}^{\mathbb{Z}}$, of a countably-infinite set of neurons and the set of stationary probability measures defined on this set, noted ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$.

We introduce the process-level empirical measure, noted
${\widehat{\mu}}_{N}$, of the N trajectories of the solutions to the equations of the network of N neurons and the averaged (with respect to the synaptic weights) law Q^{N} of the N trajectories of the solutions to the equations of the network of N neurons. The first result of this article (Theorem 1) is that the image law Π^{N} of Q^{N} through
${\widehat{\mu}}_{N}$ satisfies a large deviation principle (LDP) with a good rate function H which is shown to have a unique global minimum, µ_{e}. We remind the reader that the notion of an image law is simply an extension to more complicated objects than functions, i.e., probability measures, of the usual notion of a change of variables. The interested reader is referred to, e.g., the textbook [32]. Thus, with respect to the measure Π^{N} on ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, if the set X contains the measure
${\delta}_{{\mu}_{e}}$, then Π^{N}(X) → 1 as N → ∞, whereas if
${\delta}_{{\mu}_{e}}$ is not in the closure of X, Π^{N}(X) → 0 as N → ∞ exponentially fast and the constant in the exponential rate is determined by the rate function. Our analysis of the rate function allows us also, and this is our second result (Theorem 3), to characterize the limit measure µ_{e} as the image of a stationary Gaussian measure
$\underline{{\mu}_{e}}$ defined on a transformed set of trajectories
${\mathcal{T}}^{\mathbb{Z}}$. This is potentially very useful for applications since
$\underline{{\mu}_{e}}$ can be completely characterized by its mean and spectral density. Furthermore the rate function allows us to quantify the probability of finite-size effects. Theorems 1 and 3 allow us to characterize the average (over the synaptic weights) behavior of the network. We also derive, and this is our third result, some properties of the infinite-size network that are true for almost all realizations of the synaptic weights (Theorems 4 and 6).

The paper is organized as follows. In Section 2 we describe the equations of our network of neurons, the type of correlation between the synaptic weights, define the proper state spaces and introduce the different probability measures that are necessary for establishing our results, in particular the process-level empirical measure,
${\widehat{\mu}}_{N}$, Π^{N} and the image R^{N} through
${\widehat{\mu}}_{N}$ of the law of the uncoupled neurons. We state the principle result of this paper in Theorem 1.

In Section 3 we introduce a certain Gaussian process attached to a given measure in
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ and
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)$ and motivate this introduction by showing that the Radon-Nikodym derivative of Q^{N} with respect to the law of the uncoupled neurons can be expressed by the Gaussian process corresponding to the empirical measure
${\widehat{\mu}}_{N}$. This allows us to compute the Radon-Nikodym derivative of Π^{N} with respect to R^{N} for any measure in
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$. Using these results, Section 4 is dedicated to the proof of the existence of a strong LDP for the measure Π^{N}. In Section 5 we show that the good rate function obtained in the previous section has a unique global minimum and we characterize it as the image of a stationary Gaussian measure. Section 6 is dedicated to drawing some important consequences of our first main theorem, in particular some quenched results. Section 7 explores some possible extensions of our work and we conclude with Section 8.

## 2. The Neural Network Model

We consider a fully connected network of N neurons. Not all sets of neurons are fully connected but many are, e.g., within the same cortical column. One of the major aims of this article is to quantify how quickly the system converges to its limit, so the rate function gives us a means of assessing whether the number of neurons in a cortical column is sufficiently high for the mean field equations to be accurate. For simplicity but without loss of generality, we assume N odd [33] and write N = 2n + 1, n ≥ 0. The state of the neurons is described by the variables $\left({U}_{t}^{j}\right),j=-n,\cdots ,n,t=0,\cdots ,T$ which represent the values of the neurons' membrane potentials.

#### 2.1. The Model Equations

The equation describing the time variation of the membrane potential U^{j} of the jth neuron writes

$${U}_{t}^{j}=\gamma {U}_{t-1}^{j}+\sum _{i=-n}^{n}{J}_{ji}^{N}f\left({U}_{t-1}^{i}\right)+{B}_{t-1}^{j},\quad j=-n,\cdots ,n,\quad t=1,\cdots ,T.\qquad (1)$$
$f:\mathbb{R}\to \left]0,1\right[$ is a monotonically increasing bijection which we assume to be Lipschitz continuous. Its Lipschitz constant is noted k_{f}. We could for example employ f(x) = (1 + tanh(gx))/2, where the parameter g can be used to control the slope of the “sigmoid” f at the origin x = 0.
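As an illustration, the monotonicity and the Lipschitz constant of this particular choice of f can be checked numerically; a minimal sketch (the grid and parameter values are ours):

```python
import numpy as np

def f(x, g=1.0):
    """Sigmoid f(x) = (1 + tanh(g*x)) / 2, a monotone bijection R -> (0, 1)."""
    return 0.5 * (1.0 + np.tanh(g * x))

g = 2.0
x = np.linspace(-4.0, 4.0, 8001)
y = f(x, g)

# f is increasing and takes values strictly inside (0, 1)
assert np.all(np.diff(y) > 0) and 0 < y.min() and y.max() < 1

# f'(x) = (g/2) * sech(g*x)^2 is maximal at x = 0, so the Lipschitz
# constant is k_f = g/2; every secant slope must respect this bound
slopes = np.diff(y) / np.diff(x)
assert slopes.max() <= g / 2 + 1e-9
```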

This equation involves the parameters $\gamma ,\phantom{\rule{0.2em}{0ex}}{J}_{ij}^{N}$, and the time processes ${B}_{t}^{j},\phantom{\rule{0.2em}{0ex}}i,j=-n,\dots ,n,t=0,\dots ,T-1$. The initial conditions are discussed at the beginning of Section 2.2.2.

γ is in [0, 1) and determines the time scale of the intrinsic dynamics, i.e., without interactions, of the neurons. If γ = 0 the dynamics is said to have no leak.

The ${B}_{t}^{j}\mathrm{s}$ represent random fluctuations of the membrane potential of neuron j. They are independent random processes with the same law. We assume that at each time instant t, the ${B}_{t}^{j}\mathrm{s}$ are independent and identically distributed random variables distributed as ${\mathcal{N}}_{1}\left(0,{\sigma}^{2}\right)$ [34].

The ${J}_{ij}^{N}\mathrm{s}$ are the synaptic weights. ${J}_{ij}^{N}$ represents the strength with which the “presynaptic” neuron j influences the “postsynaptic” neuron i. They are Gaussian random variables, independent of the membrane fluctuations, whose mean is given by

We note J^{N} the N × N matrix of the synaptic weights,
${J}^{N}={\left({J}_{ij}^{N}\right)}_{i,j=-n,\cdots ,n}$. Their covariance is assumed to satisfy the following shift invariance property:

$$\mathrm{cov}\left({J}_{ij}^{N},{J}_{kl}^{N}\right)=\mathrm{cov}\left({J}_{i+m,j+m}^{N},{J}_{k+m,l+m}^{N}\right)\quad \text{for all}\phantom{\rule{0.2em}{0ex}}m\in \mathbb{Z},$$

where the indexes are taken modulo N.
**Remark 1.** This shift invariance property is technically useful since it allows us to use the tools of Fourier analysis. In terms of the neural population it means that the neurons “live” on a circle. Therefore, unlike in the uncorrelated case studied in the papers cited in the introduction, we have to indirectly introduce a notion of space.
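The ingredients described so far (leak γ, random synaptic weights J^{N}, sigmoid f, noise B) can be combined in a small simulation; a minimal sketch, assuming dynamics of the standard rate-neuron form ${U}_{t+1}^{j}=\gamma {U}_{t}^{j}+{\sum}_{i}{J}_{ji}^{N}f({U}_{t}^{i})+{B}_{t}^{j}$ and, for simplicity only, independent weights (all sizes and parameter values are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and parameters (not taken from the paper)
n, T = 10, 50
N = 2 * n + 1                  # number of neurons, odd by convention
gamma, sigma, g = 0.5, 1.0, 1.0

def f(x):
    return 0.5 * (1.0 + np.tanh(g * x))

# Gaussian synaptic weights; taken independent here for simplicity,
# with the 1/N variance scaling described in the text
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))

# One sample path of the assumed dynamics
# U_{t+1}^j = gamma*U_t^j + sum_i J_{ji} f(U_t^i) + B_t^j
U = np.zeros((T + 1, N))
U[0] = rng.normal(0.0, 1.0, N)        # initial condition mu_I
for t in range(T):
    B = rng.normal(0.0, sigma, N)     # membrane fluctuations
    U[t + 1] = gamma * U[t] + J @ f(U[t]) + B

print(U.shape)   # (T+1, N) array of trajectories
```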

We stipulate the covariances through a covariance function $\mathrm{\Lambda}:{\mathbb{Z}}^{2}\to \mathbb{R}$ and assume that they scale as 1/N. We write

$$\mathrm{cov}\left({J}_{ij}^{N},{J}_{kl}^{N}\right)=\frac{1}{N}\mathrm{\Lambda}\left(i-k,j-l\right).$$
The function Λ is even:

$$\mathrm{\Lambda}(-k,-l)=\mathrm{\Lambda}(k,l)\quad \text{for all}\phantom{\rule{0.2em}{0ex}}k,l\in \mathbb{Z}.$$
This implies that the two-dimensional Fourier transform of Λ (also called the power spectral density) is positive, see also Proposition 1 below. Furthermore, for any $k,l\in {\mathbb{Z}}^{+},\mathrm{\Lambda}(0,0)\ge \left|\mathrm{\Lambda}(k,l)\right|$.

We must make further assumptions on Λ to ensure that the system is well-behaved as the number of neurons N tends to infinity. We assume that the series ${(\mathrm{\Lambda}(k,l))}_{k,l\in \mathbb{Z}}$ is absolutely convergent, i.e.,

$${\mathrm{\Lambda}}^{\mathrm{sum}}:=\sum _{k,l\in \mathbb{Z}}\left|\mathrm{\Lambda}(k,l)\right|<\infty .\qquad (4)$$
We let Λ^{N} be the restriction of Λ to [−n, n]^{2}, i.e., Λ^{N}(i, j) = Λ(i, j) for −n ≤ i, j ≤ n.

We next introduce the spectral properties of Λ that are crucial for the results in this paper. We use throughout the paper the notation that if x is some quantity,
$\tilde{x}$ represents its Fourier transform in a sense that depends on the particular space where x is defined. For example
$\tilde{\mathrm{\Lambda}}$ is the 2π doubly periodic Fourier transform of the function Λ whose properties are described in the next proposition. Similarly,
${\tilde{\mathrm{\Lambda}}}^{N}$ is the two-dimensional Discrete Fourier Transform (DFT) of the doubly periodic sequence Λ^{N}. The proof of the following proposition is obvious.

**Proposition 1.** The sum $\tilde{\mathrm{\Lambda}}\left({\theta}_{1},{\theta}_{2}\right)$ of the absolutely convergent series ${\left(\mathrm{\Lambda}\left(k,l\right){e}^{-i(k{\theta}_{1}+l{\theta}_{2})}\right)}_{k,l\in \mathbb{Z}}$ is continuous on [−π, π[^{2} and positive. The covariance function Λ is recovered from the inverse Fourier transform of $\tilde{\mathrm{\Lambda}}$:

$$\mathrm{\Lambda}(k,l)=\frac{1}{4{\pi}^{2}}{\int}_{{\left[-\pi ,\pi \right[}^{2}}{e}^{i(k{\theta}_{1}+l{\theta}_{2})}\tilde{\mathrm{\Lambda}}\left({\theta}_{1},{\theta}_{2}\right)\phantom{\rule{0.2em}{0ex}}d{\theta}_{1}\phantom{\rule{0.1em}{0ex}}d{\theta}_{2}.$$

Moreover there exists ${\tilde{\mathrm{\Lambda}}}^{\mathrm{min}}>0$ such that

$$\tilde{\mathrm{\Lambda}}\left({\theta}_{1},{\theta}_{2}\right)\ge {\tilde{\mathrm{\Lambda}}}^{\mathrm{min}}\quad \text{for all}\phantom{\rule{0.2em}{0ex}}\left({\theta}_{1},{\theta}_{2}\right)\in {\left[-\pi ,\pi \right[}^{2}.$$
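These spectral properties are easy to check numerically for a concrete covariance function; a minimal sketch with the hypothetical, separable choice Λ(k, l) = a^{|k|+|l|}, 0 < a < 1 (our choice, not the paper's):

```python
import numpy as np

a = 0.5
K = 30                                   # truncation order of the series
ks = np.arange(-K, K + 1)

def Lam(k, l):
    """Hypothetical covariance function Lambda(k, l) = a^(|k|+|l|)."""
    return a ** (abs(k) + abs(l))

# Lambda is even and Lambda(0,0) dominates |Lambda(k,l)|
assert Lam(-2, -3) == Lam(2, 3)
assert all(Lam(0, 0) >= abs(Lam(k, l)) for k in ks for l in ks)

# One-dimensional factor P(theta) = sum_k a^|k| e^{-ik theta}; by
# separability the truncated transform is Lambda~(th1,th2) = P(th1)*P(th2)
thetas = np.linspace(-np.pi, np.pi, 101)
P = np.array([np.sum(a ** np.abs(ks) * np.exp(-1j * ks * t)).real
              for t in thetas])
vals = np.outer(P, P)                    # Lambda~ on the grid

# Positivity, with a strictly positive lower bound Lambda~_min
assert vals.min() > 0
# For this choice the exact minimum is ((1-a)/(1+a))^2, attained at (pi, pi)
assert abs(vals.min() - ((1 - a) / (1 + a)) ** 2) < 1e-6
```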

#### 2.2. The Laws of the Uncoupled and Coupled Processes

#### 2.2.1. Preliminaries

Sets of Trajectories, Temporal and Spatial Projections

The time evolution of one membrane potential is represented by the set
${\mathbb{R}}^{\left[0\cdots T\right]}:=\mathcal{T}$ of finite sequences (u_{t})_{t=0,⋯,T} of length T + 1 of real numbers.
${\mathcal{T}}^{N}$ is the set of sequences (u^{−n},⋯,u^{n}) (N = 2n + 1) of elements of
$\mathcal{T}$ that we use to describe the solutions to Equation (1). Similarly we note
${\mathcal{T}}^{\mathbb{Z}}$ the set of doubly infinite sequences of elements of
$\mathcal{T}$. If u is in
${\mathcal{T}}^{\mathbb{Z}}$ we note
${u}^{i},i\in \mathbb{Z}$, its ith coordinate, an element of
$\mathcal{T}$. Hence
$u={\left({u}^{i}\right)}_{i=-\infty \cdots \infty}$.

Given the integers s and t, 0 ≤ s ≤ t ≤ T, we define ${\mathcal{T}}_{s,t}:={\mathbb{R}}^{\left[s\cdots t\right]}$, the set of finite sequences of length t − s + 1 of real numbers, and the temporal projection
${\pi}_{s,t}:\mathcal{T}\to {\mathcal{T}}_{s,t}$ such that
${\pi}_{s,t}(u)={\left({u}_{r}\right)}_{r=s\cdots t}:={u}_{s,t}$. When s = t we note π_{t} and
${\mathcal{T}}_{t}$ rather than π_{t,t} and
${\mathcal{T}}_{t,t}$. The temporal projection π_{s,t} extends in a natural way to
${\mathcal{T}}^{N}$ and
${\mathcal{T}}^{\mathbb{Z}}$: for example π_{s,t} maps
${\mathcal{T}}^{N}$ to
${\mathcal{T}}_{s,t}^{N}$. We define the spatial projection
${\pi}^{N}:{\mathcal{T}}^{\mathbb{Z}}\to {\mathcal{T}}^{N}(N=2n+1)$ to be
${\pi}^{N}(u)=\left({u}^{-n},\dots ,{u}^{n}\right)$. Temporal and spatial projections commute, i.e.,
${\pi}^{N}\circ {\pi}_{s,t}={\pi}_{s,t}\circ {\pi}^{N}$.
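The commutation of the two projections is easy to visualize on finite arrays; a minimal sketch (the array layout and function names are ours):

```python
import numpy as np

# A finite block of trajectories: N = 2n+1 neurons, times 0..T
n, T = 2, 5
N = 2 * n + 1
rng = np.random.default_rng(1)
u = rng.normal(size=(N, T + 1))     # row i holds the trajectory of neuron i-n

def pi_temporal(u, s, t):
    """Keep the time window [s..t] of every trajectory."""
    return u[..., s:t + 1]

def pi_spatial(u, n_small):
    """Keep the 2*n_small+1 central neurons (indices -n_small..n_small)."""
    c = (u.shape[0] - 1) // 2
    return u[c - n_small:c + n_small + 1]

# Temporal and spatial projections commute
lhs = pi_spatial(pi_temporal(u, 1, 3), 1)
rhs = pi_temporal(pi_spatial(u, 1), 1, 3)
assert np.array_equal(lhs, rhs)
```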

The shift operator $\mathcal{S}:{\mathcal{T}}^{\mathbb{Z}}\to {\mathcal{T}}^{\mathbb{Z}}$ is defined by

$${\left(\mathcal{S}u\right)}^{i}={u}^{i+1},\quad i\in \mathbb{Z}.\qquad (6)$$
Given the element u = (u^{−n}, …, u^{n}) of
${\mathcal{T}}^{N}$ we form the doubly infinite N-periodic sequence u_{N} defined by

$${u}_{N}^{i}={u}^{i\phantom{\rule{0.2em}{0ex}}\mathrm{mod}\phantom{\rule{0.2em}{0ex}}N},\quad i\in \mathbb{Z},\qquad (8)$$

where the representative of i mod N is taken in [−n, n].
Topologies on the Sets of Trajectories

We equip
${\mathcal{T}}^{\mathbb{Z}}$ with the projective topology, i.e., the topology generated by the following metric. For
$u,v\in {\mathcal{T}}^{N}$, we define their distance d_{N}(u, υ) to be

This allows us to define the following metric over ${\mathcal{T}}^{\mathbb{Z}}$, whereby if $u,v\in {\mathcal{T}}^{\mathbb{Z}}$ then

The metric d generates the Borelian σ-algebra $\mathcal{B}\left({\mathcal{T}}^{\mathbb{Z}}\right):=\mathcal{F}$. It is generated by the coordinate functions ${\left({u}_{t}^{i}\right)}_{i\in \mathbb{Z},t=0\cdots T}$. The spatial and temporal projections defined above can be used to define the corresponding σ-algebras on the sets ${\mathcal{T}}_{s,t}^{N}$, e.g., ${\mathcal{F}}_{s,t}^{N}={\pi}^{N}\left({\pi}_{s,t}(\mathcal{F})\right),0\le s\le t\le T$.

Probability Measures on the Sets of Trajectories

We note ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ (respectively ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$) the set of probability measures on $({\mathcal{T}}^{\mathbb{Z}},\mathcal{F})$ (respectively $({\mathcal{T}}^{N},{\mathcal{F}}^{N})$).

For $\mu \in {\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, we denote its marginal distribution at time t by ${\mu}_{t}=\mu \circ {\pi}_{t}^{-1}$. Similarly, ${\mu}_{s,t}^{N}$ is its N-dimensional spatial, (t − s + 1)-dimensional temporal marginal $\mu \circ {\left({\pi}^{N}\right)}^{-1}\circ {\pi}_{s,t}^{-1}$.

We denote the conditional probability distribution of µ, given ${U}_{0}^{j}={u}_{0}^{j}$ (for all j), by ${\mu}_{{u}_{0}}$. This is understood to be a probability measure over $\mathcal{B}\left({\mathcal{T}}_{1,T}^{\mathbb{Z}}\right):={\mathcal{F}}_{1,T}$.

We note
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ the set of stationary probability measures on
${\mathcal{T}}^{\mathbb{Z}}$. Given a random variable u with values in ${\mathcal{T}}^{\mathbb{Z}}$ governed by µ in
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, the random variable
$\mathcal{S}u$, with
$\mathcal{S}$ the shift operator defined by Equation (6), is governed by µ as well $\left(\text{equivalently}\phantom{\rule{0.2em}{0ex}}\mu \phantom{\rule{0.2em}{0ex}}\circ \phantom{\rule{0.2em}{0ex}}{\mathcal{S}}^{-1}=\mu \right)$. With a slight abuse of notation, we define
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)$ to be the set of all
${\mu}^{N} \in {\mathcal{M}}_{1}^{+}({\mathcal{T}}^{N})$ satisfying the following property: if (u^{−n}, …, u^{n}) are random variables governed by µ^{N}, then for all |m| ≤ n, (u^{m−n}, …, u^{m+n}) has the same law as (u^{−n}, …, u^{n}) (recall that the indexing is taken modulo N), or equivalently
${\mu}^{N}\circ \phantom{\rule{0.2em}{0ex}}{\mathcal{S}}^{-1}={\mu}^{N}$ (remember Equation (8)).
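The modulo-N indexing and the shift acting on a periodized configuration can be illustrated concretely; a minimal sketch, assuming the shift acts as $(\mathcal{S}u)^{i}={u}^{i+1}$ (all names are ours):

```python
# N-periodic extension of a finite configuration and the shift operator
n = 2
N = 2 * n + 1
u = list(range(-n, n + 1))     # stand-in trajectories u^{-n}, ..., u^{n}

def periodic(u, i):
    """Coordinate i of the N-periodic extension; u[0] holds u^{-n}."""
    n = (len(u) - 1) // 2
    return u[(i + n) % len(u)]

def shifted(u, i):
    """(S u)^i = u^{i+1} on the periodic extension (assumed form)."""
    return periodic(u, i + 1)

# Periodicity and wrap-around under the modulo-N identification
assert all(periodic(u, i) == periodic(u, i + N) for i in range(-10, 10))
assert periodic(u, n + 1) == u[0]      # wraps around to u^{-n}

# Applying the shift N times is the identity on one period
cfg = u
for _ in range(N):
    cfg = [periodic(cfg, i + 1) for i in range(-n, n + 1)]
assert cfg == u
```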

**Remark 2.** Note that the stationarity discussed here is a spatial stationarity.

Process-Level Empirical Measure

We next introduce the following process-level empirical measure, see e.g., [35]. Given an element u = (u^{−n}, …, u^{n}) in
${\mathcal{T}}^{N}$ we associate with it the measure, noted
${\widehat{\mu}}_{N}\left({u}^{-n},\dots ,{u}^{n}\right)$, in
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ defined by

$${\widehat{\mu}}_{N}\left({u}^{-n},\dots ,{u}^{n}\right)=\frac{1}{N}\sum _{i=-n}^{n}{\delta}_{{\mathcal{S}}^{i}{u}_{N}},\qquad (10)$$

where u_{N} is the N-periodic sequence associated with u and $\mathcal{S}$ is the shift operator defined in Equation (6).

**Remark 3.** This is a significant difference with previous work dealing with uncorrelated weights (e.g., [19]) where the N processes are coupled through the “usual” empirical measure$d{\widehat{\mu}}_{N}\left({u}^{-n},\cdots ,{u}^{n}\right)(y)=\frac{1}{N}{\displaystyle \sum {}_{i=-n}^{n}{\delta}_{{u}^{i}}(y)}$ which is a measure on$\mathcal{T}$. In our case, because of the correlations and as shown in Section 3.4 the processes are coupled through the process-level empirical measure Equation (10) which is a probability measure on${\mathcal{T}}^{\mathbb{Z}}$. This makes our analysis more biologically realistic, since we know that correlations between the synaptic weights do exist, but technically more involved.
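The contrast drawn in Remark 3 can be made concrete on a toy configuration; a minimal sketch, representing each empirical measure by the multiset of its atoms and assuming the process-level measure averages Dirac masses over the N spatial shifts of the periodized configuration (names and values are ours):

```python
from collections import Counter

n = 2
N = 2 * n + 1
# Stand-in "trajectories": one number per neuron for readability
u = (10, 20, 30, 40, 50)            # u^{-n}, ..., u^{n}

def rotate(x, k):
    """Spatial shift of the N-periodic configuration, shown on one period."""
    k %= len(x)
    return x[k:] + x[:k]

# Process-level empirical measure: uniform over the N shifts S^i u_N
mu_hat = Counter(rotate(u, i) for i in range(N))

# It is stationary: applying a further shift leaves it unchanged
mu_hat_shifted = Counter(rotate(x, 1) for x in mu_hat.elements())
assert mu_hat == mu_hat_shifted

# The "usual" empirical measure of Remark 3 instead places Dirac masses
# at the individual coordinates u^i
usual = Counter(u)
assert sum(usual.values()) == N
```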

Topology on Sets of Measures

We next equip
${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ with the topology of weak convergence, as follows. For
${\mu}^{N},{v}^{N}\in {\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$, we note ${D}_{N}\left({\mu}^{N},{v}^{N}\right)$ the Wasserstein distance induced by the metric k_{f}d_{N}(u, υ) ∧ 1,

$${D}_{N}\left({\mu}^{N},{v}^{N}\right)=\underset{\xi \in \mathcal{J}}{\mathrm{inf}}{\int}_{{\mathcal{T}}^{N}\times {\mathcal{T}}^{N}}\left({k}_{f}{d}_{N}(u,v)\wedge 1\right)\phantom{\rule{0.2em}{0ex}}\xi (du,dv),\qquad (11)$$

where k_{f} is the positive constant defined at the start of Section 2.1 and $\mathcal{J}$ is the set of all measures in ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\times {\mathcal{T}}^{N}\right)$ with N-dimensional marginals µ^{N} and υ^{N}.

**Remark 4.** The use of k_{f} in Equation (11) is technical and used to simplify the proof of Proposition 5.

For $\mu ,v\in {\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, we define

$$D(\mu ,v)=\sum _{n=0}^{\infty}{\kappa}_{n}{D}_{N}\left({\mu}^{N},{v}^{N}\right),\quad N=2n+1,$$

which converges since ${D}_{N}\left({\mu}^{N},{v}^{N}\right)\le 1$ and $\sum _{n=0}^{\infty}{\kappa}_{n}<\infty $. It can be shown that ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ equipped with this metric is Polish.

#### 2.2.2. Coupled and Uncoupled Processes

We specify the initial conditions for Equation (1) as N independent and identically distributed random variables
${\left({U}_{0}^{j}\right)}_{j=-n,\cdots ,n}$. Let µ_{I} be the individual law on
$\mathbb{R}$ of
${U}_{0}^{j}$; it follows that the joint law of the variables is
${\mu}_{I}^{\otimes N}$ on
${\mathbb{R}}^{N}$. We note P the law of the solution to one of the uncoupled equations (1) where we take
${J}_{ij}^{N}=0,i,j=-n,\cdots ,n$. P is the law of the solution to the following stochastic difference equation:

$${U}_{t+1}=\gamma {U}_{t}+{B}_{t},\quad t=0,\cdots ,T-1,\qquad (13)$$

where the law of U_{0} is µ_{I}. This process can be characterized exactly, as follows.

Let $\mathrm{\Psi}:\mathcal{T}\to \mathcal{T}$ be the following bicontinuous bijection. Writing υ = Ψ(u), we define

$${v}_{0}={u}_{0},\qquad {v}_{t}={u}_{t}-\gamma {u}_{t-1},\quad t=1,\cdots ,T.\qquad (14)$$
The following proposition is evident from Equations (13) and (14).

**Proposition 2.** The law P of the solution to Equation (13) writes

$$P=\left({\mu}_{I}\otimes {\mathcal{N}}_{T}\left({0}_{T},{\sigma}^{2}{\mathrm{Id}}_{T}\right)\right)\circ \mathrm{\Psi},$$

where 0_{T} is the T-dimensional vector with all coordinates equal to 0 and Id_{T} is the T-dimensional identity matrix.

We later employ the convention that if
$u=\left({u}^{-n},\dots ,{u}^{n}\right)\in {\mathcal{T}}^{N}$ then Ψ(u) = (Ψ(u^{−n}), …, Ψ(u^{n})). A similar convention applies if
$u\in {\mathcal{T}}^{\mathbb{Z}}$. We also use the notation Ψ_{1,T} for the mapping
$\mathcal{T}\to {\mathcal{T}}_{1,T}$ such that Ψ_{1,T} = π_{1,T} ∘ Ψ.
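Assuming Ψ takes the leak-removing form v_{0} = u_{0}, v_{t} = u_{t} − γu_{t−1} (consistent with the uncoupled dynamics discussed above), its bijectivity and its action on a solution path can be checked numerically; a minimal sketch:

```python
import numpy as np

gamma = 0.5
T = 6

def Psi(u):
    """v_0 = u_0, v_t = u_t - gamma*u_{t-1}: removes the leak (assumed form)."""
    v = u.copy()
    v[1:] = u[1:] - gamma * u[:-1]
    return v

def Psi_inv(v):
    """Inverse: rebuild u by re-accumulating the leak."""
    u = v.copy()
    for t in range(1, len(v)):
        u[t] = v[t] + gamma * u[t - 1]
    return u

rng = np.random.default_rng(2)
u = rng.normal(size=T + 1)
assert np.allclose(Psi_inv(Psi(u)), u)      # Psi is a bijection

# On a solution of U_{t+1} = gamma*U_t + B_t, Psi returns
# (U_0, B_0, ..., B_{T-1}): the image law is mu_I x N_T(0, sigma^2 Id)
B = rng.normal(size=T)
U = np.zeros(T + 1)
U[0] = rng.normal()
for t in range(T):
    U[t + 1] = gamma * U[t] + B[t]
assert np.allclose(Psi(U)[1:], B)
```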

Reintroducing the coupling between the neurons, we note Q^{N}(J^{N}) the element of
${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$ which is the law of the solution to Equation (1) conditioned on J^{N}. We let
${Q}^{N}={\mathbb{E}}^{J}[{Q}^{N}({J}^{N})]$ be the law averaged with respect to the weights. The reason for this is as follows. We want to study the empirical measure
${\widehat{\mu}}_{N}$ on path space. There is no reason for this to be a simple problem since for a fixed interaction J^{N}, the variables (U^{−n}, ⋯, U^{n}) are not exchangeable. So we first study the law of
${\widehat{\mu}}_{N}$ averaged over the interaction before we prove in Section 6 some almost sure properties of this law. Q^{N} is a common construction in the physics of interacting particle systems and is known as the annealed law [36].

We may thus infer the following.

**Lemma 1.** P^{⊗N}, Q^{N} and${\widehat{\mu}}_{N}^{N}$ (the N-dimensional marginal of${\widehat{\mu}}_{N}$) are in${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)$.

Since the map Ψ defined in Equation (14) plays a central role in the sequel, we introduce the following definition.

**Definition 1.** For each measure
$\mu \in {\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$ or
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ we define
$\underline{\mu}$ to be µ ∘ Ψ^{−1}.

In particular, note that

$$\underline{P}={\mu}_{I}\otimes {\mathcal{N}}_{T}\left({0}_{T},{\sigma}^{2}{\mathrm{Id}}_{T}\right).$$
Finally we introduce the image laws in terms of which the principal results of this paper are formulated.

**Definition 2.** Let Π^{N} (respectively R^{N}) be the image law of Q^{N} (respectively P^{⊗N}) through the function
${\widehat{\mu}}_{N}:{\mathcal{T}}^{N}\to {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ defined by Equation (10).

The central result of this paper is in the next theorem.

**Theorem 1.** Π^{N} is governed by a large deviation principle (LDP) with a good rate function H (to be found in Definition 5). That is, if F is a closed set in${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, then

$$\overline{\underset{N\to \infty}{\mathrm{lim}}}\phantom{\rule{0.2em}{0ex}}\frac{1}{N}\mathrm{log}\phantom{\rule{0.2em}{0ex}}{\mathrm{\Pi}}^{N}(F)\le -\underset{\mu \in F}{\mathrm{inf}}H(\mu ).\qquad (16)$$

Conversely, for all open sets O in${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$,

$$\underset{N\to \infty}{\underline{\mathrm{lim}}}\phantom{\rule{0.2em}{0ex}}\frac{1}{N}\mathrm{log}\phantom{\rule{0.2em}{0ex}}{\mathrm{\Pi}}^{N}(O)\ge -\underset{\mu \in O}{\mathrm{inf}}H(\mu ).\qquad (17)$$

Here $\overline{\mathrm{lim}}$ denotes the limit superior and $\underline{\mathrm{lim}}$ the limit inferior. By “good rate function”, we mean that for all a ≥ 0, the following set is compact:

$$\left\{\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right):H(\mu )\le a\right\}.$$

**Remark 5.** We recall that the above LDP is also called a strong LDP.

Our proof of Theorem 1 will occur in several steps. We prove in Sections 4.1 and 4.3 that Π^{N} satisfies a weak LDP, i.e., that it satisfies Equation (16) when F is compact and Equation (17) for all open O. We also prove in Section 4.2 that {Π^{N}} is exponentially tight, and we prove in Section 4.4 that H is a good rate function. It directly follows from these results that Π^{N} satisfies a strong LDP with good rate function H [38]. Finally, in Section 5 we prove that H has a unique minimum µ_{e}, to which
${\widehat{\mu}}_{N}$ converges weakly as N → ∞. This minimum is a (stationary) Gaussian measure which we describe in detail in Theorem 3.

## 3. The Good Rate Function

In the sections to follow we will obtain an LDP for the process with correlations (Π^{N}) via the (simpler) process without correlations (R^{N}). However, to do this we require an expression for the Radon-Nikodym derivative of Π^{N} with respect to R^{N}, which is the main result of this section. The derivative will be expressed in terms of a function
${\mathrm{\Gamma}}_{[N]}:{\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)\to \mathbb{R}$. We first define Γ_{[N]}(µ), demonstrating that it may be expressed in terms of a Gaussian process
${G}_{[N]}^{\mu}$ (to be defined below), and then use this to determine the Radon-Nikodym derivative of Π^{N} with respect to R^{N}.

#### 3.1. Gaussian Processes

Given µ in
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ we define a stationary Gaussian process G^{µ} with values in
${\mathcal{T}}_{1,T}^{\mathbb{Z}}$. For all i, the mean of
${G}_{t}^{\mu ,i}$ is given by
${c}_{t}^{\mu}$, where

We now define the covariance of G^{µ}. We first define the following matrix-valued process.

**Definition 3.** Let M^{µ,k}, k ∈ ℤ be the T × T matrix defined by (for s, t ∈ [1, T ]),

These matrices satisfy

$${M}^{\mu ,-k}={}^{\u2020}{M}^{\mu ,k},\qquad (20)$$

where † denotes the transpose. Furthermore, they feature a spectral representation, i.e., there exists a T × T matrix-valued measure ${\tilde{M}}^{\mu}={({\tilde{M}}_{st}^{\mu})}_{s,t=1,\cdots ,T}$ with the following properties. Each ${\tilde{M}}_{st}^{\mu}$ is a complex measure on [−π, π[ of finite total variation and such that

$${M}_{st}^{\mu ,k}={\int}_{-\pi}^{\pi}{e}^{ik\theta}\phantom{\rule{0.1em}{0ex}}{\tilde{M}}_{st}^{\mu}(d\theta ).\qquad (21)$$

Relations (20) and (21) imply the following relations, for all Borelian sets $\mathcal{A}\subset \left[-\pi ,\pi \right[$:

$${\tilde{M}}_{st}^{\mu}(-\mathcal{A})={\left({\tilde{M}}_{st}^{\mu}(\mathcal{A})\right)}^{*}\quad \text{and}\quad {\tilde{M}}_{ts}^{\mu}(\mathcal{A})={\left({\tilde{M}}_{st}^{\mu}(\mathcal{A})\right)}^{*},$$

where * indicates complex conjugation. We may infer from this that ${\tilde{M}}^{\mu}$ is Hermitian-valued. The spectral representation means that for all vectors $W\in {\mathbb{R}}^{T}$, ${}^{\u2020}W{\tilde{M}}^{\mu}(d\theta )W$ is a positive measure on [−π, π[.

The covariance between the Gaussian vectors G^{µ,i} and G^{µ,i+k} is defined to be

$${K}^{\mu ,k}=\sum _{l\in \mathbb{Z}}\mathrm{\Lambda}(k,l){M}^{\mu ,l}.\qquad (23)$$
We note that the above summation converges for all k ∈ ℤ since the series (Λ(k, l))_{k,l}_{∈ℤ} is absolutely convergent and the elements of M^{µ,l} are bounded by 1 for all l ∈ ℤ.

It follows immediately from the definition that for $\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ and k ∈ ℤ we have

$${K}^{\mu ,-k}={}^{\u2020}{K}^{\mu ,k}.\qquad (24)$$
This is necessary for the covariance function to be well-defined. The following proposition may be easily proved from the above definitions.

**Proposition 3.** The sequence (K^{µ,k})_{k∈ℤ} has spectral density${\tilde{K}}^{\mu}$ given by

$${\tilde{K}}^{\mu}(\theta )=\sum _{k\in \mathbb{Z}}{K}^{\mu ,k}{e}^{-ik\theta}.\qquad (25)$$

That is, ${\tilde{K}}^{\mu}$ is Hermitian positive and satisfies${\tilde{K}}^{\mu}(-\theta ){=}^{\u2020}{\tilde{K}}^{\mu}\left(\theta \right)$ and${K}^{\mu ,k}=\frac{1}{2\pi}{\displaystyle {\int}_{-\pi}^{\pi}{e}^{ik\theta}{\tilde{K}}^{\mu}(\theta )d\theta}$.

**Proof.** The proof essentially consists of demonstrating that the matrix function ${\tilde{K}}^{\mu}(\theta )$ is well-defined and enjoys the stated properties.
From Equation (23) we obtain that, for all s, t ∈ [1 ⋯ T],

Since, by Equation (4), the series (Λ(k, l))_{k,l∈ℤ} is absolutely convergent, this shows that
${\tilde{K}}^{\mu}(\theta )$ is well-defined on [−π, π[. The fact that
${\tilde{K}}^{\mu}(\theta )$ is Hermitian follows from Equations (24) and (25).

Combining Equations (21), (23) and (25) we write

This can be rewritten in terms of the spectral density $\tilde{\mathrm{\Lambda}}$ of Λ

We note that
${\tilde{K}}^{\mu}(\theta )$ is positive, because for all vectors W of ℝ^{T}

We may also define the N-dimensional Gaussian process ${G}_{[N]}^{\mu}$ with values in ${\mathcal{T}}_{1,T}^{N}$ as follows. The mean of ${G}_{[N]}^{\mu ,i},i=-n,\cdots ,n$ is given by Equation (18) (or rather its finite dimensional analog) and the covariance between ${G}_{[N]}^{\mu ,i}$ and ${G}_{[N]}^{\mu ,i+k}$ is given by

#### 3.2. Convergence of Gaussian Processes

The finite-dimensional system “converges” to the infinite-dimensional system in the following sense. In what follows, and throughout the paper, we use the Frobenius norm on the T × T matrices. We write ${\tilde{K}}_{[N]}^{\mu}(\theta )={\displaystyle \sum {}_{k=-n}^{n}{K}_{[N]}^{\mu ,k}\mathrm{exp}(-ik\theta )}$. Note that for $\left|j\right|\le n,{\tilde{K}}_{[N]}^{\mu}(2\pi j/N)={\tilde{K}}_{[N]}^{\mu ,j}$. The lemma below follows directly from the absolute convergence of $\sum {}_{j,k}\left|\mathrm{\Lambda}(j,k)\right|$.

**Lemma 2.** Fix$\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$. For all ε > 0, there exists an N such that for all M > N and all j such that$2\left|j\right|+1\le M,\Vert {K}_{[M]}^{\mu ,j}-{K}^{\mu ,j}\Vert <\epsilon $ and for all$\theta \in \left[-\pi ,\pi \right[,\phantom{\rule{0.2em}{0ex}}\Vert {\tilde{K}}_{[M]}^{\mu}(\theta )-{\tilde{K}}^{\mu}(\theta )\Vert \le \epsilon $.
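The mechanism behind Lemma 2 is that partial sums of an absolutely convergent (matrix-valued) Fourier series converge uniformly in θ, with the tail sum as an explicit error bound. A minimal sketch, using an arbitrary geometrically decaying block sequence of our own choosing as a stand-in for (K^{µ,k}):

```python
import numpy as np

# An absolutely summable sequence of 2x2 blocks (geometric decay), a stand-in
# for (K^{mu,k}); the "true" spectral density is approximated by the full sum.
T, kmax = 2, 50
K = {k: np.exp(-abs(k)) * np.eye(T) for k in range(-kmax, kmax + 1)}

def partial_sum(m, theta):
    """Partial Fourier sum sum_{|k| <= m} K^k exp(-i k theta)."""
    return sum(K[k] * np.exp(-1j * k * theta) for k in range(-m, m + 1))

thetas = np.linspace(-np.pi, np.pi, 200, endpoint=False)
full = [partial_sum(kmax, th) for th in thetas]

# sup_theta ||partial_sum(m) - full|| is bounded by the tail sum_{|k|>m} ||K^k||,
# so convergence is uniform in theta
for m in (5, 10, 20):
    tail = sum(np.linalg.norm(K[k]) for k in K if abs(k) > m)
    err = max(np.linalg.norm(partial_sum(m, th) - f)
              for th, f in zip(thetas, full))
    assert err <= tail + 1e-12
```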

**Lemma 3.** The eigenvalues of${\tilde{K}}_{[N]}^{\mu ,l}$ and${\tilde{K}}^{\mu}(\theta )$ are upperbounded by${\rho}_{K}\stackrel{\text{def}}{=}T{\mathrm{\Lambda}}^{\text{sum}}$, where Λ^{sum} is defined in Equation (4).

**Proof.** Let W ∈ ℝ^{T}. We find from Proposition 3 and Equation (4) that

The eigenvalues of M^{µ,0} are all nonnegative (since it is a correlation matrix), so each eigenvalue is upperbounded by the trace, which in turn is upperbounded by T. The proof in the finite-dimensional case follows similarly. □
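The trace argument at the end of this proof is easy to check numerically. The sketch below (sample size and dimension are our own arbitrary choices) builds an empirical correlation matrix as a stand-in for M^{µ,0} and verifies that its eigenvalues are nonnegative and each bounded by its trace, which equals T:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 4
X = rng.standard_normal((500, T))          # arbitrary sample (our choice)
M0 = np.corrcoef(X, rowvar=False)          # a T x T correlation matrix
ev = np.linalg.eigvalsh(M0)

assert ev.min() >= -1e-10                  # PSD: eigenvalues are nonnegative
assert abs(np.trace(M0) - T) < 1e-10       # unit diagonal: trace equals T
assert ev.max() <= np.trace(M0) + 1e-10    # hence every eigenvalue is <= T
```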

We denote by ${K}_{[N]}^{\mu}$ the (NT × NT) covariance matrix of the sequence of Gaussian random variables $\left({G}_{[N]}^{\mu ,-n},\dots ,{G}_{[N]}^{\mu ,n}\right)$. Because of the properties of the matrices ${K}_{[N]}^{\mu ,k},k=-n,\cdots ,n$, this is a symmetric block circulant matrix. It is also positive semidefinite, being a covariance matrix.

We let ${A}_{[N]}^{\mu}={K}_{[N]}^{\mu}{\left({\sigma}^{2}{\mathrm{Id}}_{NT}+{K}_{[N]}^{\mu}\right)}^{-1}$. This is well-defined because ${K}_{[N]}^{\mu}$ is diagonalizable (being real and symmetric) and has nonnegative eigenvalues (being a covariance matrix), so that ${\sigma}^{2}{\mathrm{Id}}_{NT}+{K}_{[N]}^{\mu}$ is invertible. It follows from Lemma 20 in Appendix A that ${A}_{[N]}^{\mu}$ is a symmetric block circulant matrix, with blocks ${A}_{[N]}^{\mu ,k}\ (k=-n,\cdots ,n)$ such that

In the limit N → ∞ we may define

The Fourier series of Ã^{µ} is absolutely convergent as a consequence of Wiener’s Theorem. We thus find that, for l ∈ ℤ,

**Lemma 4.** The map $B\to B{({\sigma}^{2}{\mathrm{Id}}_{T}+B)}^{-1}$ is Lipschitz continuous over the set$\mathrm{\Delta}=\{{\tilde{K}}_{[N]}^{\mu}(\theta ),{\tilde{K}}^{\mu}(\theta ):\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}),N>0,\theta \in [-\pi ,\pi ]\}$.

**Proof.** The proof is straightforward using the boundedness of the eigenvalues of the matrices in Δ. □
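The block-circulant structure of ${A}_{[N]}^{\mu}$ and the blockwise spectral relation underlying Lemmas 4 and 5 can be illustrated on a small example. The sketch below is entirely our own construction (sizes, σ² and the random blocks are arbitrary; Lemma 20 itself is in Appendix A and is not reproduced here): it assembles a symmetric block circulant matrix K from blocks C^{k} with C^{−k} = ^{†}C^{k}, forms A = K(σ²Id + K)^{−1}, and checks that A is again block circulant with Fourier blocks Ã(θ_j) = K̃(θ_j)(σ²Id_T + K̃(θ_j))^{−1}.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 2, 5                      # small illustrative sizes (our choice)
n = N // 2
sigma2 = 1.0

# Blocks C[k] with C[-k] = transpose(C[k]); C[0] is shifted to keep K well-behaved
C = {}
M0 = rng.standard_normal((T, T))
C[0] = M0 @ M0.T + 3.0 * np.eye(T)
for k in range(1, n + 1):
    C[k] = 0.1 * rng.standard_normal((T, T))
    C[-k] = C[k].T

# Symmetric block circulant NT x NT matrix: block (i, j) = C[(j - i) mod N]
Kc = np.block([[C[((j - i + n) % N) - n] for j in range(N)] for i in range(N)])
assert np.allclose(Kc, Kc.T)

A = Kc @ np.linalg.inv(sigma2 * np.eye(N * T) + Kc)

def blk(X, i, j):
    return X[i * T:(i + 1) * T, j * T:(j + 1) * T]

# A is block circulant: block (i, j) only depends on (j - i) mod N
for i in range(N):
    for j in range(N):
        assert np.allclose(blk(A, i, j), blk(A, 0, (j - i) % N), atol=1e-10)

# Fourier blocks of A satisfy A~(th_m) = K~(th_m)(sigma^2 Id_T + K~(th_m))^{-1},
# since evaluating the symbol at th_m is an algebra homomorphism
for m in range(N):
    th = 2.0 * np.pi * m / N
    Kt = sum(C[k] * np.exp(-1j * k * th) for k in range(-n, n + 1))
    At = sum(blk(A, 0, k % N) * np.exp(-1j * k * th) for k in range(-n, n + 1))
    assert np.allclose(At, Kt @ np.linalg.inv(sigma2 * np.eye(T) + Kt), atol=1e-8)
```

The last loop is the finite-N analogue of the relation between ${\tilde{A}}_{[N]}^{\mu}$ and ${\tilde{K}}_{[N]}^{\mu}$ used throughout this section.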

The following lemma is a consequence of Lemmas 2 and 4.

**Lemma 5.** Fix$\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. For all ε > 0, there exists an N such that for all M > N and all θ ∈ [−π, π[,
$\Vert {\tilde{A}}_{[M]}^{\mu}(\theta )-{\tilde{A}}^{\mu}(\theta )\Vert \le \epsilon $.

The above-defined matrices have the following “uniform convergence” properties.

**Proposition 4.** Fix$\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. For all ε > 0, there exists an open neighbourhood V_{ε}(ν) of ν such that for all μ ∈ V_{ε}(ν), all s, t ∈ [1, T] and all θ ∈ [−π, π[,

**Proof.** The proof is found in Appendix B. □

Before we close this section we define a subset of ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ which arises naturally: it is the subset on which the rate function (to be defined below) is finite; see Section 3.3.2 and Lemma 8.

**Definition 4.** Let $\mathcal{E}$_{2} be the subset of
${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ defined by

For this set of measures, we may define the stationary process ${({v}^{k})}_{k\in \mathbb{Z}}$ in
${\mathcal{T}}_{1,T}^{\mathbb{Z}}$, where
${v}_{s}^{k}={\mathrm{\Psi}}_{s}({u}^{k})$, s = 1, ⋯, T. This process has a finite mean ${\mathbb{E}}^{{\underset{\xaf}{\mu}}_{1,T}}[{v}^{0}]$, denoted
${\overline{v}}^{\mu}$, and admits a spectral density measure, denoted
${\tilde{v}}^{\mu}$, such that

#### 3.3. Definition of the Functional Γ

In this section we define and study a functional Γ_{[N]} = Γ_{[N],1} + Γ_{[N],2}, which will be used to characterize the Radon-Nikodym derivative of Π^{N} with respect to R^{N}. Let
$\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, and let (μ^{N})_{N≥1} be the N-dimensional marginals of μ (for N = 2n + 1 odd).

#### 3.3.1. Γ_{1}

We define

Because of the remarks after Lemma 3, the spectrum of
${K}_{[N]}^{\mu}$ is nonnegative, so that the spectrum of
${\mathrm{Id}}_{NT}+\frac{1}{{\sigma}^{2}}{K}_{[N]}^{\mu}$ is strictly positive (in fact, every eigenvalue is at least 1) and the above expression is well-defined. Moreover, Γ_{[N],1}(µ) ≤ 0.

We now define Γ_{1}(µ) = lim_{N→∞} Γ_{[N],1}(µ). The following lemma indicates that this limit is well-defined.

**Lemma 6.** When N goes to infinity the limit of Equation (38) is given by

**Proof.** Through Lemma 20 in Appendix A, we have that

**Proposition 5.** Γ_{[N],1} and Γ_{1} are bounded below and continuous on${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$.

**Proof.** Applying Lemma 19 in the case of
$Z=\phantom{\rule{0.2em}{0ex}}({G}_{[N]}^{\mu ,-n}-{c}^{\mu},\cdots ,{G}_{[N]}^{\mu ,n}-{c}^{\mu})$, a = 0, b = σ^{−2}, we write

Using Jensen’s inequality we have

By definition of ${K}_{[N]}^{\mu ,0}$, the right-hand side is equal to $-\frac{1}{2{\sigma}^{2}}\text{Trace}({K}_{[N]}^{\mu ,0})$. From Equation (28), we find that

It follows from Equation (19) that 0 ≤ Trace(M^{μ,m}) ≤ T. Hence Γ_{[N],1}(μ) ≥ −β_{1}, where

It follows from Lemma 6 that −β_{1} is a lower bound for Γ_{1}(μ) as well.

The continuity of both Γ_{[N],1} and Γ_{1} follows from the expressions (38) and (39), the continuity of the maps
$\mu \to {\tilde{K}}_{[N]}^{\mu}$ and
$\mu \to {\tilde{K}}^{\mu}$ (Proposition 4) and the continuity of the determinant. □

#### 3.3.2. Γ_{2}

For $\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ we define

Γ_{[N],2}(μ) is finite on the subset $\mathcal{E}$_{2} of
${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ defined in Definition 4. If μ ∉ $\mathcal{E}$_{2}, then we set Γ_{[N],2}(μ) = ∞.

We define Γ_{2}(μ) = lim_{N→∞} Γ_{[N],2}(μ). The following proposition indicates that Γ_{2}(μ) is well-defined.

**Proposition 6.** If the measure μ is in $\mathcal{E}$_{2}, i.e., if${\mathbb{E}}^{{\underset{\xaf}{\mu}}_{1,T}}[\Vert {v}^{0}{\Vert}^{2}]<\infty $, then Γ_{2}(μ) is finite and writes

The “:” symbol indicates the double contraction on the indices. One also has

**Proof.** Using Equations (37) and (42), the stationarity of μ, and the fact that
${\sum}_{k=-n}^{n}{A}_{[N]}^{\mu ,k}={\tilde{A}}_{[N]}^{\mu}(0)$, we have

From the spectral representation of ${A}_{[N]}^{\mu}$ we find that

Since (according to Lemma 5)
${\tilde{A}}_{[N]}^{\mu}(\theta )$ converges uniformly to Ã^{μ}(θ) as N → ∞, it follows by dominated convergence that Γ_{[N],2}(μ) converges to the expression in the proposition.

The second expression for Γ_{2}(μ) follows analogously, although this time we make use of the fact that the partial sums of the Fourier series of Ã^{μ} converge uniformly to Ã^{μ} (because the Fourier series is absolutely convergent). □

We next obtain more information about the eigenvalues of the matrices
${\tilde{A}}_{[N]}^{\mu ,k}={\tilde{A}}_{[N]}^{\mu}\left(\frac{2k\pi}{N}\right)$ (where k = −n, …, n) and Ã^{μ}(θ).

**Lemma 7.** There exists 0 < α < 1, such that for all N and μ, the eigenvalues of${\tilde{A}}_{[N]}^{\mu ,k}$, Ã^{μ}(θ) and${A}_{[N]}^{\mu}$ are less than or equal to α.

**Proof.** By Lemma 3, the eigenvalues of
${\tilde{K}}^{\mu}(\theta )$ are positive and upperbounded by ρ_{K}. Since
${\tilde{K}}^{\mu}(\theta )$ and
${\left({\sigma}^{2}{\mathrm{Id}}_{T}+{\tilde{K}}^{\mu}(\theta )\right)}^{-1}$ are coaxial (because
${\tilde{K}}^{\mu}$ is Hermitian and therefore diagonalisable), we may take

This upperbound also holds for ${\tilde{A}}_{[N]}^{\mu ,k}$, and for the eigenvalues of ${A}_{[N]}^{\mu}$, because of Lemma 20. □
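The bound of Lemma 7 can be sanity-checked numerically: if K is a covariance matrix with largest eigenvalue ρ, then A = K(σ²Id + K)^{−1} has eigenvalues λ/(σ² + λ), all bounded by α = ρ/(σ² + ρ) < 1. A minimal sketch (the dimension and σ² below are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
d, sigma2 = 6, 0.5                 # arbitrary size and noise variance
M = rng.standard_normal((d, d))
K = M @ M.T                        # a covariance (PSD) matrix
A = K @ np.linalg.inv(sigma2 * np.eye(d) + K)

lam = np.linalg.eigvalsh(K)        # eigenvalues of K, all >= 0
alpha = lam.max() / (sigma2 + lam.max())
eigA = np.linalg.eigvalsh((A + A.T) / 2.0)

# each eigenvalue of A equals lambda/(sigma^2 + lambda), hence <= alpha < 1
assert np.allclose(np.sort(eigA), np.sort(lam / (sigma2 + lam)), atol=1e-8)
assert eigA.max() <= alpha + 1e-10 and alpha < 1.0
```

The map λ → λ/(σ² + λ) is increasing, which is why the bound is attained at the largest eigenvalue of K.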

We wish to prove that Γ_{[N],2}(μ) is lower semicontinuous. A consequence of this will be that Γ_{[N],2}(μ) is measurable with respect to
$\mathcal{B}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$. In effect, we prove in Appendix C that ϕ^{N}(μ, ν) defined by Equation (43) satisfies

with β_{2} defined in Equation (87) in Appendix C.

We then have the following proposition.

**Proposition 7.** Γ_{[N],2}(μ) is lower semicontinuous.

**Proof.** We define
${\varphi}^{N,M}(\mu ,\nu )={1}_{{B}_{M}}(\nu )({\varphi}^{N}(\mu ,\nu )+{\beta}_{2})$, where
${1}_{{B}_{M}}$ is the indicator of B_{M} and ν ∈ B_{M} if
${N}^{-1}{\displaystyle {\sum}_{j=-n}^{n}{\Vert {\nu}^{j}\Vert}^{2}}\le M$. We have just seen that ϕ^{N,M} ≥ 0. We also define

Suppose that ν_{n} → μ with respect to the weak topology in
${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. Observe that

We may infer from the above expression that
${\mathrm{\Gamma}}_{[N],2}^{M}(\mu )$ is continuous (with respect to μ) for the following reasons. The first term on the right-hand side converges to zero because ϕ^{N,M} is continuous and bounded (with respect to ν). The second term converges to zero because ϕ^{N,M}(μ, ν) is a continuous function of μ, see Proposition 4.

Since ${\mathrm{\Gamma}}_{[N],2}^{M}(\mu )$ grows to Γ_{[N],2}(μ) as M → ∞, we may conclude that Γ_{[N],2}(μ) is lower semicontinuous with respect to μ. □

We define Γ_{[N]}(μ) = Γ_{[N],1}(μ) + Γ_{[N],2}(μ). We may conclude from Propositions 5 and 7 that Γ_{[N]} is measurable.

#### 3.4. The Radon-Nikodym Derivative

In this section we determine the Radon-Nikodym derivative of Π^{N} with respect to R^{N}. However, in order for us to do this, we must first compute the Radon-Nikodym derivative of Q^{N} with respect to P^{⊗}^{N}. We do this in the next proposition.

**Proposition 8.** The Radon-Nikodym derivative of Q^{N} with respect to P^{⊗}^{N} is given by the following expression.

with the Gaussian processes (G^{i}), i = −n, ⋯, n, given by

**Proof.** For fixed J^{N}, we let
${R}_{{J}^{N}}:{\mathbb{R}}^{N(T+1)}\to {\mathbb{R}}^{N(T+1)}$ be the mapping u → y, i.e.,
${R}_{{J}^{N}}({u}^{-n},\cdots ,{u}^{n})=({y}^{-n},\cdots ,{y}^{n})$, where for j = −n, ⋯, n,

The determinant of the Jacobian of
${R}_{{J}^{N}}$ is 1 for the following reasons. Since
$\frac{d{y}_{s}^{j}}{d{u}_{t}^{k}}=0$ if t > s, the determinant is
${\prod}_{s=0}^{T}{D}_{s}$, where D_{s} is the Jacobian of the map
$({u}_{s}^{-n},\dots ,{u}_{s}^{n})\to ({y}_{s}^{-n},\dots ,{y}_{s}^{n})$ induced by
${R}_{{J}^{N}}$. However, D_{s} is evidently 1. Similar reasoning implies that
${R}_{{J}^{N}}$ is a bijection.
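The change-of-variables argument above (a map that is "lower triangular in time" with unit diagonal has Jacobian determinant 1 and is a bijection) can be illustrated with a linear toy version of our own, where the dependence on the past is an arbitrary strictly lower triangular matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 5
L = np.tril(rng.standard_normal((T, T)), k=-1)  # strictly lower triangular

# y_t = u_t + sum_{s<t} L[t, s] u_s: y_t depends only on the past of u,
# so the Jacobian dy/du = Id + L is lower triangular with unit diagonal
J = np.eye(T) + L
assert abs(np.linalg.det(J) - 1.0) < 1e-9

# such a map is also a bijection: Id + L is invertible
u = rng.standard_normal(T)
y = J @ u
assert np.allclose(np.linalg.solve(J, y), u)
```

In the proposition the map ${R}_{{J}^{N}}$ is nonlinear in u, but its Jacobian matrix has exactly this unit-diagonal triangular structure at every point.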

It may be seen that the random vector $Y={R}_{{J}^{N}}(U)$, U solution of Equation (1), is such that ${Y}_{0}^{j}={U}_{0}^{j}$ and ${Y}_{t}^{j}={B}_{t-1}^{j}$ where |j| ≤ n and t = 1, ⋯, T. Therefore

Since the determinant of the Jacobian of
${R}_{{J}^{N}}$ is one, we obtain the law of Q^{N}(J^{N}) by applying the inverse of
${R}_{{J}^{N}}$ to the above distribution, i.e.,

Note that, exceptionally, ‖ ‖ is the Euclidean norm in ${\mathbb{R}}^{N(T+1)}$ or ${\mathcal{T}}^{N}$.

Recalling that P^{⊗}^{N} = Q^{N}(0), we therefore find that

Taking the expectation of this with respect to J^{N} yields the result. □

In fact, as stated in the proposition below, the Gaussian system ${({G}_{s}^{i})}_{i=-n,\dots ,n,s=1,\dots ,T}$ has the same law as the system ${G}_{[N]}^{{\widehat{\mu}}_{N}}$, as defined in Equation (28) and afterwards.

**Proposition 9.** Fix$u\in {\mathcal{T}}^{N}$. The covariance of the Gaussian system$({G}_{s}^{i})$, where i = −n, …, n and s = 1, …, T writes${K}_{[N]}^{{\widehat{\mu}}_{N}(u)}$. For each i, the mean of G^{i} is${c}^{{\widehat{\mu}}_{N}(u)}$.

The proof of this proposition is an easy verification left to the reader. We obtain an alternative expression for the Radon-Nikodym derivative in Equation (46) by applying Lemma 19 in Appendix A. That is, we substitute Z = (G^{−}^{n}, ⋯, G^{n}),
$a=\frac{1}{{\sigma}^{2}}({\nu}^{-n},\cdots ,{\nu}^{n})$, and
$b=\frac{1}{{\sigma}^{2}}$ into the formula in Lemma 19. After noting Proposition 9 we thus find that

**Proposition 10.** The Radon-Nikodym derivatives write as

Here$\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, Γ_{[N]}(μ) = Γ_{[N],1}(μ) + Γ_{[N],2}(μ), and the expressions for Γ_{[N],1} and Γ_{[N],2} have been defined in Equations (38) and (42).

The second expression in the above proposition follows from the first one because Γ_{[N]} is measurable.

**Remark 6.** Proposition 10 shows that the processes solving Equation (1) are coupled through the process-level empirical measure, unlike the case of independent weights, where they are coupled through the usual empirical measure. As mentioned in Remark 3, this significantly complicates the mathematical analysis.

## 4. The Large Deviation Principle

In this section we prove the principal result of this paper (Theorem 1): the image laws Π^{N} satisfy an LDP with good rate function H (to be defined below). We do this by first establishing an LDP for the image law with uncoupled weights (R^{N}), see Definition 2, and then using the Radon-Nikodym derivative of Proposition 10 to establish the full LDP for Π^{N}. Therefore our first task is to write the LDP governing R^{N}.

Let μ, ν be probability measures over a Polish space Ω equipped with its Borel σ-algebra. The Kullback-Leibler divergence of μ relative to ν (also called the relative entropy) is

and I^{(2)}(μ, ν) = ∞ otherwise. It is a standard result that

For
$\mu \in {M}_{1,S}^{+}({\Omega}^{\mathbb{Z}})$ and
$\nu \in {\mathcal{M}}_{1}^{+}(\Omega )$, the process-level entropy of μ with respect to ν^{ℤ} is defined to be

See Lemma IX.2.4 in [35] for a proof that this (possibly infinite) limit always exists (the superadditivity of the sequence N^{−1}I^{(2)}(μ^{N}) follows from Equation (49)).
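For intuition, the Kullback-Leibler divergence and its variational representation (we take the elided Equation (49) to be the usual Donsker-Varadhan form, KL(μ‖ν) = sup_f {E^{μ}[f] − log E^{ν}[e^{f}]}; this reading is an assumption on our part) can be checked on a finite state space:

```python
import numpy as np

rng = np.random.default_rng(5)
mu = rng.dirichlet(np.ones(5))     # two probability vectors on 5 points
nu = rng.dirichlet(np.ones(5))     # (strictly positive almost surely)

kl = float(np.sum(mu * np.log(mu / nu)))
assert kl >= 0.0                   # relative entropy is nonnegative

# Variational lower bound: any test function f gives E_mu[f] - log E_nu[e^f] <= KL
f = rng.standard_normal(5)
assert mu @ f - np.log(nu @ np.exp(f)) <= kl + 1e-12

# The supremum is attained at f = log(mu/nu), since E_nu[mu/nu] = 1
fstar = np.log(mu / nu)
assert abs((mu @ fstar - np.log(nu @ np.exp(fstar))) - kl) < 1e-12
```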

**Theorem 2.** R^{N} is governed by a large deviation principle with good rate function [39,40]

for u_{0} in ℝ^{ℤ}. In addition, the set of measures {R^{N}} is exponentially tight.

**Proof.** R^{N} satisfies an LDP with good rate function I^{(3)}(μ, P^{ℤ}) [35]. In turn, a sequence of probability measures (such as {R^{N}}) over a Polish Space satisfying a large deviations upper bound with a good rate function is exponentially tight [38].

The following identity is established in [41]:

It follows directly from the variational expression (49) that

Note that although our convention throughout this paper is for N to be odd, the limit in Equation (50) exists for any sequence of integers going to ∞. We divide Equation (52) by N and consider the subsequence of all N of the form N = 2^{k} for k ∈ ℤ^{+}. It follows from Equation (53) that
${N}^{-1}{I}^{(2)}({\mu}_{{u}_{0}}^{N},{P}_{{u}_{0}}^{\otimes N})$ is nondecreasing as N = 2^{k} → ∞ (for all u_{0}), so that Equation (51) follows by the monotone convergence theorem.

Because Ψ is bijective and bicontinuous, it may be easily shown that

Before we move to a statement of the LDP governing Π^{N}, we prove the following relationship between the set $\mathcal{E}$_{2} (see Definition 4) and the set of stationary measures which have a finite Kullback-Leibler divergence or process-level entropy with respect to P^{ℤ}.

**Lemma 8.**

See Lemma 10 in [42] for a proof. We are now in a position to define what will be the rate function of the LDP governing Π^{N}.

**Definition 5.** Let H be the function
${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\to \mathbb{R}\cup \{+\infty \}$ defined by

Here Γ(μ) = Γ_{1}(μ) + Γ_{2}(μ) and the expressions for Γ_{1} and Γ_{2} have been defined in Lemma 6 and Proposition 6. Note that because of Proposition 6 and Lemma 8, whenever I^{(3)} (μ, P^{ℤ}) is finite, so is Γ(μ).

#### 4.1. Lower Bound on the Open Sets

We prove the second half of Proposition 1.

**Lemma 9.** For all open sets O, Equation (17) holds, i.e.,

**Proof.** From the expression for the Radon-Nikodym derivative in Proposition 10 we have

If μ ∈ O is such that I^{(3)}(μ, P^{ℤ}) = ∞, then H(μ) = ∞ and evidently Equation (17) holds. We now prove Equation (17) for all μ ∈ O such that I^{(3)}(μ, P^{ℤ}) < ∞. Let ε > 0 and
${Z}_{\epsilon}^{N}(\mu )\subset O$ be an open neighbourhood containing μ such that
${\mathrm{inf}}_{\nu \in {Z}_{\epsilon}^{N}(\mu )}{\mathrm{\Gamma}}_{[N]}(\nu )\ge {\mathrm{\Gamma}}_{[N]}(\mu )-\epsilon $. Such
$\{{Z}_{\epsilon}^{N}(\mu )\}$ exist for all N because of the lower semicontinuity of Γ_{[N]}(μ) (see Propositions 5 and 7). Then

The last equality follows from Lemma 6 and Proposition 6. Since ε is arbitrary, we may take the limit as ε → 0 to obtain Equation (17). Since Equation (17) is true for all μ ∈ O, the lemma is proved. □

#### 4.2. Exponential Tightness of Π^{N}

We begin with the following technical lemma, the proof of which can be found in Appendix D.

**Lemma 10.** There exist positive constants c > 0 and a > 1 such that, for all N,

where ϕ^{N} is defined in Equation (43).

This lemma allows us to prove the exponential tightness:

**Proposition 11.** The family {Π^{N}} is exponentially tight.

**Proof.** Let
$B\in \mathcal{B}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$. We have from Proposition 10

Through Hölder’s Inequality, we find that for any a > 1:

Now it may be observed that

Since Γ_{1} ≤ 0, it follows from Lemma 10 that

By the exponential tightness of {R^{N}} (as stated in Theorem 2), for each L > 0, there exists a compact set K_{L} such that
$\overline{\underset{N\to \infty}{\mathrm{lim}}}\phantom{\rule{0.2em}{0ex}}{N}^{-1}\mathrm{log}({R}^{N}({K}_{L}^{c}))\le -L$. Thus if we choose the compact set
${K}_{\mathrm{\Pi},L}={K}_{\frac{a}{a-1}(L+\frac{c}{a})}$, then for all L > 0,
$\overline{\underset{N\to \infty}{\mathrm{lim}}}\phantom{\rule{0.2em}{0ex}}{N}^{-1}\mathrm{log}({\mathrm{\Pi}}^{N}({K}_{\mathrm{\Pi},L}^{c}))\le -L$. □

#### 4.3. Upper Bound on the Compact Sets

In this section we obtain an upper bound on the compact sets, i.e., the first half of Theorem 1 for F compact. Our method is to obtain an LDP for a simplified Gaussian system (with fixed A^{ν} and c^{ν}), and then prove that this converges to the required bound as ν → μ.

#### 4.3.1. An LDP for a Gaussian Measure

We linearise Γ in the following manner. Fix $\nu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ and assume for the moment that μ ∈ $\mathcal{E}$_{2}. Let

where ${\varphi}_{\infty}^{N}:{\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\times {\mathcal{T}}_{1,T}^{N}\to \mathbb{R}$ is defined by

**Remark 7.** Note the subtle difference with the definition of Φ^{N} in Equation (43): we use A^{ν,k} instead of${A}_{\left[N\right]}^{\nu ,k}$. When turning to spectral representations of${\Phi}_{\infty}^{N}$, this will bring in the matrices Ã^{ν,N,l}, l = −n, ⋯, n, defined at the beginning of Section 4.3.2.

Let us also define

where K^{ν,N} is the NT × NT matrix with T × T blocks noted K^{ν,N,l}. We define

${\mathrm{\Gamma}}_{2}^{\nu}(\mu )=\phantom{\rule{0.2em}{0ex}}{\mathrm{lim}}_{N\to \infty}{\mathrm{\Gamma}}_{[N],2}^{\nu}(\mu )$. We find, using the first identity in Proposition 6, that

where ${\overline{v}}^{\mu}={\mathbb{E}}^{{\underset{\xaf}{\mu}}_{1,T}}[{v}^{0}]$ and ${\tilde{v}}^{\mu}$ is the spectral measure defined in Equation (37). We recall that : denotes double contraction on the indices.

Similarly to Lemma 6, we find that

For μ ∈ $\mathcal{E}$_{2}, we define H^{ν}(μ) = I^{(3)}(μ, P^{ℤ}) − Γ^{ν}(μ); for μ ∉ $\mathcal{E}$_{2}, we define
${\mathrm{\Gamma}}_{2}^{\nu}(\mu )={\mathrm{\Gamma}}^{\nu}(\mu )=\infty $ and H^{ν} (μ) = ∞.

**Definition 6.** Let
${\underset{\xaf}{Q}}^{\nu}\in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ with N-dimensional marginals
${\underset{\xaf}{Q}}^{\nu ,N}$ given by

where $B\in \mathcal{B}\left({\mathcal{T}}^{N}\right)$. This defines a law ${Q}^{\nu}\in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ according to the correspondence in Definition 1.

We have the following lemma.

**Lemma 11.**${\underset{\xaf}{Q}}_{1,T}^{\nu}$ is a stationary Gaussian process of mean c^{ν}. Its N-dimensional spatial and T-dimensional temporal marginals${\underset{\xaf}{Q}}_{1,T}^{\nu ,N}$ are in${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}_{1,T}^{N}\right)$ and have covariance σ^{2}Id_{NT} + K^{ν,N}. The spectral density of${\underset{\xaf}{Q}}_{1,T}^{\nu}$ is${\sigma}^{2}{\mathrm{Id}}_{T}+{\tilde{K}}^{\nu}(\theta )$ and in addition,

**Proof.** In effect we find that

where c^{ν,N} denotes the NT-dimensional vector obtained by concatenating N times the vector c^{ν}. We also have that

Thus, through Proposition 2, we find that

It is seen that
${\underset{\xaf}{Q}}_{1,T}^{\nu ,N}$ is an NT-dimensional Gaussian measure with mean c^{ν,N}, inverse covariance matrix
$\frac{1}{{\sigma}^{2}}\left({\mathrm{Id}}_{NT}-{A}^{\nu ,N}\right)$, and covariance matrix σ^{2}Id_{NT}+K^{ν,N}. Hence
${\underset{\xaf}{Q}}_{1,T}^{\nu ,N}$ is in
${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}_{1,T}^{N}\right)$, and

${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)$. It follows also that the spectral density of ${\underset{\xaf}{Q}}_{1,T}^{\nu}$ is ${\sigma}^{2}{\mathrm{Id}}_{T}+{\tilde{K}}^{\nu}\left(\theta \right)$. □
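The matrix identity behind this proof, namely that the inverse covariance $\frac{1}{{\sigma}^{2}}\left({\mathrm{Id}}-A\right)$ with A = K(σ²Id + K)^{−1} corresponds exactly to the covariance σ²Id + K, follows from $\frac{1}{{\sigma}^{2}}(\mathrm{Id}-A)=\frac{1}{{\sigma}^{2}}({\sigma}^{2}\mathrm{Id}+K-K){({\sigma}^{2}\mathrm{Id}+K)}^{-1}={({\sigma}^{2}\mathrm{Id}+K)}^{-1}$. A quick numerical check (sizes and values are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(6)
d, sigma2 = 5, 0.7                 # arbitrary size and noise variance
M = rng.standard_normal((d, d))
K = M @ M.T                        # a covariance matrix
I = np.eye(d)

A = K @ np.linalg.inv(sigma2 * I + K)
# (1/sigma^2)(I - A) = (1/sigma^2)(sigma^2 I + K - K)(sigma^2 I + K)^{-1}
#                    = (sigma^2 I + K)^{-1}
assert np.allclose((I - A) / sigma2, np.linalg.inv(sigma2 * I + K))
```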

We may thus define the measure ${\underset{\xaf}{Q}}^{\nu}$ of a stationary process over the variables ${\left\{{v}_{s}^{j}\right\}}_{j\in \mathbb{Z},s=0,\dots ,T}$, with N-dimensional marginals given by Equations (67) and (68).

**Definition 7.** Let
${\underset{\xaf}{\mathrm{\Pi}}}^{\nu ,N}$ be the image law of
${\underset{\xaf}{Q}}^{\nu ,N}$ under
${\widehat{\underset{\xaf}{\mu}}}_{N}$, i.e., for
$B\in \mathcal{B}\left({\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)\right)$,

The point is that it can be shown that the image
${\underset{\xaf}{\mathrm{\Pi}}}^{\nu ,N}$ of the measure
${\underset{\xaf}{Q}}^{\nu ,N}$ satisfies a strong LDP (see next lemma) and that this LDP can be transferred to Π^{N}, see Proposition 12. We begin with the following lemma which is a generalization of the result in [39].

**Lemma 12.** The image law${\underset{\xaf}{\mathrm{\Pi}}}^{\nu ,N}$ satisfies a strong LDP (in the manner of Theorem 1) with good rate function

**Proof.** We have found an LDP for a Gaussian process in [42]. Since Q^{ν} may be separated in the manner of Equation (68), we may use the expression in Theorem 2 to obtain the result. □

For ${B}_{N}\in \mathcal{B}\left({\mathcal{T}}_{1,T}^{N}\right)$, we define the image law

It follows from the contraction principle that if we write ${H}^{\nu}(\mu ):={\underset{\xaf}{H}}^{\nu}(\underset{\xaf}{\mu})$, then

**Corollary 1.** The image law Π^{ν,N} satisfies a strong LDP with good rate function

#### 4.3.2. An Upper Bound for Π^{N} over Compact Sets

In this section we derive an upper bound for Π^{N} over compact sets using the LDP of the previous section. Before we do this, we require two lemmas governing the ‘distance’ between Γ^{ν} and Γ. Let
${\tilde{K}}^{\mu ,N}$ be the DFT of
${\left({K}^{\mu ,j}\right)}_{j=-n}^{n}$, and similarly Ã^{µ,N} is the DFT of
${\left({A}^{\mu ,j}\right)}_{j=-n}^{n}$. We define

**Lemma 13.** For all$\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$,
${C}_{N}^{\nu}$ is finite and

**Proof.** We recall from Proposition 4 that
${\tilde{K}}_{[M],st}^{\nu}(\theta )$ converges uniformly (in θ) to
${\tilde{K}}_{st}^{\nu}(\theta )$. The same holds for
${\tilde{K}}_{st}^{\nu ,M,l}$, because this represents the partial summation of an absolutely convergent Fourier series. That is, with θ = 2πl_{M}/M fixed,
${\tilde{K}}_{st}^{\nu ,M,{l}_{M}}\to {\tilde{K}}_{st}^{\nu}(\theta )$ as M → ∞. The result then follows from the equivalence of matrix norms. The proof for Ã^{ν} is analogous. □

The second lemma, the proof of which can be found in Appendix E, goes as follows.

**Lemma 14.** There exists a constant C_{0} such that for all ν in${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, all ε > 0 and all μ ∈ V_{ε}(ν) ∩ $\mathcal{E}$_{2},

Here V_{ε}(ν) is the open neighbourhood defined in Proposition 4, and $\underset{\xaf}{\mu}$ is given in Definition 1.

We are now ready to begin the proof of the upper bound on compact sets for which we follow the ideas in [4].

**Proposition 12.** Let$\mathcal{K}$ be a compact subset of${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. Then

**Proof**. Fix ε > 0. Let V_{ε}(ν) be the open neighbourhood of ν defined in Proposition 4, and let
${\overline{V}}_{\epsilon}(\nu )$ be its closure. Since
$\mathcal{K}$ is compact and
${\{{V}_{\epsilon}(\nu )\}}_{\nu \in \mathcal{K}\phantom{\rule{0.2em}{0ex}}}$ is an open cover, there exists an r and
${\left\{{\nu}_{i}\right\}}_{i=1}^{r}$ such that
$\mathcal{K}\subset {\displaystyle {\cup}_{i=1}^{r}{V}_{\epsilon}({\nu}_{i})}$. We find that

It follows from the fact that ${\widehat{\mu}}_{N}\in {\mathcal{E}}_{2}$, Proposition 10 and Lemma 14 that

From the definition of Q^{ν,N} in Equation (64) and Hölder’s Inequality, for p, q such that
$\frac{1}{p}+\frac{1}{q}=1$, we have

We note from Lemma 3 that the eigenvalues of the covariance of
${\underset{\xaf}{Q}}_{1,T}^{{\nu}_{i},N}$ are upperbounded by σ^{2} + ρ_{K}. Thus for this integral to converge it is sufficient that

This condition will always be satisfied for sufficiently small ε and sufficiently large N (since ${C}_{N}^{{\nu}_{i}}\to 0$ as N → ∞ by Lemma 13). Considering Equation (73), by Corollary 1,

We next find an upper bound for the integral appearing in the definition of the quantity D. We apply Lemma 19 in Appendix A to find

where ${\mathrm{Id}}_{T}\otimes {1}_{N}$ is the NT × T block matrix with each block equal to ${\mathrm{Id}}_{T}$, and ${B}^{k}$, k = −n, ⋯, n, are the T × T blocks of B. We have

where ${({\tilde{B}}^{k})}_{k=-n,\cdots ,n}$ is the DFT of ${({B}^{k})}_{k=-n,\cdots ,n}$. Let v_{m} be the largest eigenvalue of B. Since (by Lemma 20) the eigenvalues of ${\tilde{B}}^{0}$ are a subset of the eigenvalues of B, we have

From the definition of B and through Lemma 3 we have ${v}_{m}\le \frac{{\sigma}^{2}+{\rho}_{K}}{1-2{C}_{0}q(\epsilon +{C}_{N}^{{\nu}_{i}})({\sigma}^{2}+{\rho}_{K})}$. Hence, since ${\Vert {c}^{{\nu}_{i}}\Vert}^{2}\le T{\overline{J}}^{2}$, we have

Since the determinant is the product of the eigenvalues, we similarly find that

Upon collecting the above inequalities, and noting that ${\Vert {c}^{\nu}\Vert}^{2}\le T{\overline{J}}^{2}$, we find that

We let $s(q,\epsilon )=\overline{\underset{N\to \infty}{\mathrm{lim}}}{s}_{N}^{{\nu}_{i}}(q,\epsilon )$, and find through Lemma 13 that

Notice that s(q, ε) is independent of ν_{i} and that s(q, ε) → 0 as ε → 0. Using Equations (73), (75) and (76) we thus find that

Recall that H^{ν}(μ) = ∞ for all μ ∉ $\mathcal{E}$_{2}. Thus if
$\mathcal{K}\cap {\mathcal{E}}_{2}=\varnothing $, we may infer that
$\overline{\underset{N\to \infty}{\mathrm{lim}}}{N}^{-1}\mathrm{log}\left({\mathrm{\Pi}}^{N}(\mathcal{K})\right)=-\infty $ and the proposition is evident. Thus we may assume without loss of generality that
${\mathrm{inf}}_{\mu \in \mathcal{K}}{H}^{{\nu}_{i}}(\mu )={\mathrm{inf}}_{\mu \in \mathcal{K}\cap {\mathcal{E}}_{2}}{H}^{{\nu}_{i}}(\mu )$. Furthermore it follows from Proposition 13 (below) that there exists a constant C_{I} such that for all
$\mu \in {\overline{V}}_{\epsilon}({\nu}_{i})\cap {\mathcal{E}}_{2}$,

We thus find that

We take ε → 0 and find, through the use of Lemma 15, that

The proof may thus be completed by taking p → 1. □

**Proposition 13.** There exists a positive constant C_{I} such that, for all ν in${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\cap {\mathcal{E}}_{2}$, all ε > 0 and all$\mu \in {\overline{V}}_{\epsilon}(\nu )\cap {\mathcal{E}}_{2}$ (where${\overline{V}}_{\epsilon}(\nu )$ is the neighbourhood defined in Proposition 4),

The proof is very similar to that of Lemma 14 and we leave it to the reader. We end this section with Lemma 15, whose proof can be found in Appendix D.

**Lemma 15.** There exist constants a > 1 and c > 0 such that for all$\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\cap {\mathcal{E}}_{2}$,

#### 4.4. End of the Proof of Theorem 1

**Lemma 16.** H(μ) is lower semicontinuous.

The proof is very similar to that in [42]. Because {Π^{N}} is exponentially tight and satisfies the weak LDP with rate function H(μ), the following corollary is immediate (Lemma 2.1.5 in [37]).

**Corollary 2.** H(μ) is a good rate function, i.e., the sets {μ: H(μ) ≤ δ} are compact for all δ ∈ ℝ^{+} and it satisfies the first condition of Theorem 1.

This allows us to complete the proof of Theorem 1:

**Proof**. By combining Lemmas 16 and 9, Proposition 11, and Corollary 2, we complete the proof of Theorem 1.

## 5. Characterization of the Unique Minimum of the Rate Function

We prove that there exists a unique minimum µ_{e} of the rate function, and provide explicit equations for µ_{e} which facilitate its numerical computation. We start with the following lemma.

**Lemma 17.** For$\mu ,\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, H^{ν}(µ) = 0 if and only if µ = Q^{ν}.

**Proof**. This is a straightforward consequence of Theorem 1 in [42] and Theorem 2. □

**Proposition 14.** There is a unique distribution${\mu}_{e}\in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ which minimizes H. This distribution satisfies H(µ_{e}) = 0.

**Proof**. By the previous lemma, it suffices to prove that there is a unique µ_{e} such that

We define the mapping $L:{\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\to {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ by

It follows from Equation (65) that

It may be inferred from the definitions in Section 3.1 that the marginal of L(µ) = Q^{µ} over
${\mathcal{F}}_{0,t}$ only depends upon the marginal of µ over
${\mathcal{F}}_{0,t-1},t\ge 1$. This follows from the fact that
${\underset{\xaf}{Q}}_{1,t}^{\mu}$ (which determines
${Q}_{0,t}^{\mu}$) is completely determined by the means
$\{{c}_{s}^{\mu};s=1,\dots ,t\}$ and covariances
$\{{K}_{uv}^{\mu ,j};j\in \mathbb{Z},u,v\in [1,t]\}$. In turn, it may be observed from Equations (18) and (23) that these variables are determined by µ_{0,t−1}. Thus for any
$\mu ,\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ and t ∈ [1, T], if

then

It follows from repeated application of the above identity that for any ν satisfying

Defining

µ_{e} satisfies Equation (78).

Conversely if µ = L(µ) for some µ, then we have that µ = L^{2}(ν) for any ν such that ν_{0,T−2} = µ_{0,T−2}. Continuing this reasoning, we find that µ = L^{T}(ν) for any ν such that ν_{0} = µ_{0}. But by Equation (79), since Q^{µ} = µ, we have
${\mu}_{0}={\mu}_{I}^{\mathbb{Z}}$. We have just seen that any µ satisfying µ = L^{T}(ν), where
${\nu}_{0}={\mu}_{I}^{\mathbb{Z}}$, is uniquely defined by Equation (81), which means that µ = µ_{e}. □

We may use the proof of Proposition 14 to characterize the unique measure µ_{e} such that
${\mu}_{e}={Q}^{{\mu}_{e}}$ in terms of its image
$\underset{\xaf}{{\mu}_{e}}$. This characterization allows one to directly numerically calculate µ_{e}. We characterize
$\underset{\xaf}{{\mu}_{e}}$ recursively (in time), by providing a method of determining
${\underset{\xaf}{\mu}}_{{e}_{0,t}}$ in terms of
${\underset{\xaf}{\mu}}_{{e}_{0,t-1}}$. However, we must first explicitly outline the bijective correspondence between ${\mu}_{{e}_{0,t}}$ and
${\underset{\xaf}{\mu}}_{{e}_{0,t}}$, as follows. For
$v\in \mathcal{T}$, we write Ψ^{−1}(v) = (Ψ^{−1}(v)_{0}, ⋯, Ψ^{−1}(v)_{T}). We recall from Equation (14) that Ψ^{−1}(v)_{0} = v_{0}. The coordinate Ψ^{−1}(v)_{t} is the affine function of v_{s}, s = 0, ⋯, t obtained from Equation (14)

Let ${K}_{(t-1,s-1)}^{{\mu}_{e},l}$ be the (t − 1) × (s − 1) submatrix of ${K}^{{\mu}_{e},l}$ composed of the rows from times 1 to (t − 1) and the columns from times 1 to (s − 1), and

Let the measures ${\underset{\xaf}{{\mu}_{e}}}_{0,t}^{1}$ and ${\underset{\xaf}{{\mu}_{e}}}_{t,s}^{(0,l)}$ be given by

The lemma below is evident from the definitions above.

**Lemma 18.** For any t ∈ [1, T], the variables $\{{c}_{s}^{{\mu}_{e}},{K}_{rs}^{{\mu}_{e},j}:1\le r,s\le t,j\in \mathbb{Z}\}$ are necessary and sufficient to completely characterize the measures $\{{\underset{\xaf}{{\mu}_{e}}}_{0,t}^{1},{\underset{\xaf}{{\mu}_{e}}}_{(0,t)}^{(0,l)}:l\in \mathbb{Z}\}$. In turn, these measures are necessary and sufficient to characterize ${\underset{\xaf}{\mu}}_{{e}_{0,t}}$.

The inductive method for calculating $\underset{\xaf}{{\mu}_{e}}$ is outlined in the theorem below.

**Theorem 3.** We may characterize $\underset{\xaf}{{\mu}_{e}}$ inductively as follows. Initially, ${\underset{\xaf}{{\mu}_{e}}}_{0}={\mu}_{I}^{\mathbb{Z}}$. Given that we have a complete characterization of

For 1 ≤ r, s ≤ t, ${K}_{rs}^{{\mu}_{e},k}={\displaystyle {\sum}_{l=-\infty}^{\infty}\mathrm{\Lambda}(k,l)}{M}_{rs}^{\mu ,l}$. Here, for p = max(r − 1, s − 1),

Of course the measure µ_{e} may be determined from
$\underset{\xaf}{{\mu}_{e}}$ since
${\mu}_{e}=\underset{\xaf}{{\mu}_{e}}\circ \mathrm{\Psi}$.

## 6. Some Important Consequences of Theorem 1

We state some important consequences of our results, including some which are valid J-almost surely (quenched results). We recall that Q^{N}(J^{N}) is the conditional law of the N neurons for a given J^{N}.

**Theorem 4**. Π^{N} converges weakly to ${\delta}_{{\mu}_{e}}$, i.e., for all $\Phi \in {\mathcal{C}}_{b}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$,

Similarly,

**Proof**. The proof of the first result follows directly from the existence of an LDP for the measure Π^{N} (see Theorem 1), and is a straightforward adaptation of Theorem 2.5.1 in [18]. The proof of the second result uses the same method, making use of Theorem 5 below.

We can in fact obtain the following quenched convergence analogue of Equation (16).

**Theorem 5**. For each closed set F of ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, and for almost all J,

**Proof**. The proof is a combination of Tchebyshev’s inequality and the Borel-Cantelli lemma and is a straightforward adaptation of Theorem 2.5.4 and Corollary 2.5.6 in [18].

We define ${\stackrel{\u2323}{Q}}^{N}({J}^{N})=\frac{1}{N}{\displaystyle {\sum}_{j=-n}^{n}{Q}^{N}({J}^{N})}\circ {S}^{-j}$ where we recall the shift operator S defined by Equation (8). Clearly ${\stackrel{\u2323}{Q}}^{N}({J}^{N})$ is in ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{N})$.

**Corollary 3.** Fix M and let N > M. For almost every J and all $h\in {\mathcal{C}}_{b}({\mathcal{T}}^{M})$,

That is, the M^{th} marginals${\stackrel{\u2323}{Q}}^{N,M}({J}^{N})$ and Q^{N,M} converge weakly to${\mu}_{e}^{M}$ as N → ∞.

**Proof**. It is sufficient to apply Theorem 4 in the case where Φ in
${\mathcal{C}}_{b}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$ is defined by

noting that, for each N, ${\stackrel{\u2323}{Q}}^{N}(J)\in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{N})$ (Lemma 1 and the above remark).

We now prove the following ergodic-type theorem. We may represent the ambient probability space by
$\mathfrak{W}$, where
$\omega \in \mathfrak{W}$ is such that
$\omega =({J}_{ij},{B}_{t}^{j},{\mu}_{0}^{j})$, where i, j ∈ ℤ and 0 ≤ t ≤ T − 1, recall Equation (1). We denote the probability measure governing ω by
$\mathfrak{P}$. Let
${u}^{(N)}(\omega )\in {\mathcal{T}}^{N}$ be defined by Equation (1). As an aside, we may then understand Q^{N}(J^{N}) to be the conditional law of
$\mathfrak{P}$ on u^{(N)}(ω), for given J^{N}.

**Theorem 6.** Fix M > 0 and let $h\in {C}_{b}({\mathcal{T}}^{M})$. Let ${u}^{(N)}(\omega )\in {\mathcal{T}}^{N}$ (where N > M) and |j| ≤ n. Then $\mathfrak{P}$ almost surely,

Furthermore, ${\widehat{\mu}}_{N}({u}^{(N)}(\omega ))$ converges weakly to µ_{e}.

**Proof**. Our proof is an adaptation of [18]. We may suppose without loss of generality that
${\displaystyle {\int}_{{\mathcal{T}}^{M}}h(u)\,d{\mu}_{e}^{M}(u)=0}$. For p > 1 let

Since µ_{e} ∉ F_{p} and µ_{e} is the unique zero of H, it follows that
${\mathrm{inf}}_{{F}_{p}}H=m>0$. Thus by Theorem 1 there exists an N_{0}, such that for all N > N_{0},

However

Thus

We may thus conclude from the Borel-Cantelli Lemma that, $\mathfrak{P}$ almost surely, for every
$\omega \in \mathfrak{W}$, there exists N_{p} such that for all N ≥ N_{p},

This yields Equation (82) because p is arbitrary. The convergence of
${\widehat{\mu}}_{N}({u}^{(N)}(\omega ))$ is a direct consequence of Equation (82), since this means that each of the M^{th} marginals converges.

## 7. Possible Extensions

Our results hold true if we assume that Equation (1) is replaced by the more general equation

The additional random variables appearing in this equation, indexed by j, are independent and identically distributed, independent of the synaptic weights J and of the random processes B^{j}; they can be thought of as external stimuli imposed on the neurons. This equation accounts for a more complicated “intrinsic” dynamics of the neurons, i.e., when they are uncoupled. The parameters γ_{k}, k = 1, ⋯, l, must satisfy some conditions to ensure the stability of the uncoupled dynamics.

This result can be straightforwardly extended to the case where the noise is correlated but stationary Gaussian, that is, $\mathrm{cov}({B}_{s}^{j},{B}_{t}^{k})$ is some function of s, t and (k − j). It can also be easily extended to the case where the initial distribution is correlated but mixing, using the Large Deviation Principle in [43].
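As a toy illustration of such spatially stationary correlations (a sketch in numpy, not taken from the paper; the correlation profile is an assumption chosen for illustration), the snippet below builds a covariance on a ring of N neurons that depends only on (k − j), checks that it is circulant, and samples from it exactly through a Cholesky factor:

```python
import numpy as np

# Toy stationary spatial correlation on a ring of N neurons:
# cov(B^j, B^k) = rho((k - j) mod N), with rho an assumed geometric profile.
N = 9
dist = np.minimum(np.arange(N), N - np.arange(N))   # circular distances
rho = 0.5 ** dist                                   # assumed correlation profile
C = np.array([[rho[(k - j) % N] for k in range(N)] for j in range(N)])

# C depends on j, k only through (k - j): it is circulant, hence stationary.
assert np.allclose(C, np.roll(np.roll(C, 1, axis=0), 1, axis=1))

L = np.linalg.cholesky(C)        # requires C positive definite
rng = np.random.default_rng(2)
B = L @ rng.standard_normal(N)   # one time-slice of correlated noise

# The sampler's exact covariance is L @ L.T == C by construction.
assert np.allclose(L @ L.T, C)
```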

The hypothesis that the synaptic weights are Gaussian is somewhat unrealistic from the biological viewpoint. In his PhD thesis [18], Moynot has obtained some preliminary results in the case of uncorrelated weights. We think that this is also a promising avenue.

In his thesis, Moynot also extended the uncorrelated-weights case to include two populations with different (Gaussian) statistics for each population. This is also an important practical problem in neuroscience. Extending Moynot’s result to the correlated case is probably low-hanging fruit.

Last but not least, the equations derived in Section 5 for the mean and covariance operator of the measure minimizing the rate function are very much worth investigating, both analytically and numerically, and their predictions should be confronted with biological measurements.

## 8. Conclusions

In recent years there has been a lot of effort to mathematically justify neural-field models through some sort of asymptotic analysis of finite-size neural networks. Many, if not most, of these models assume or prove some sort of thermodynamic limit, whereby if one isolates a particular population of neurons in a localized area of space, they are found to fire increasingly asynchronously as the size of the population tends to infinity [44]. Indeed, this was the result of Moynot and Samuelides. However, our results imply that there are system-wide correlations between the neurons, even in the asymptotic limit. The key reason why we do not have propagation of chaos is that the Radon-Nikodym derivative
$\frac{d{Q}^{N}}{d{P}^{N}}$ of the averaged laws in Proposition 8 cannot be tensored into N independent and identically distributed processes, whereas the simpler assumptions on the weight function Λ in Moynot and Samuelides allow the Radon-Nikodym derivative to be tensored. A very important implication of our result is that the mean-field behavior is insufficient to characterize the behavior of a population. Our limit process µ_{e} is system-wide and ergodic. Our work challenges the assumption, held by some, that one cannot have a “concise” macroscopic description of a neural network without an assumption of asynchronicity at the local population level.

It would be of interest to compare our LDP with other analyses of the rate of convergence of neural networks to their limits as the size tends to infinity. This includes the system-size expansion of Bressloff [45], the path-integral formulation of Buice and Cowan [46] and the systematic expansion of the moments by (amongst others) [47–49].

## Appendix

## A. Two Useful Lemmas

The following lemma from Gaussian calculus [18,50], which we recall for completeness, is used several times throughout the paper.

**Lemma 19.** Let Z be a Gaussian vector of ℝ^{p} with mean c and covariance matrix K. If a ∈ ℝ^{p} and b ∈ ℝ are such that for all eigenvalues α of K the relation αb > − 1 holds, we have
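The displayed formula has not survived the conversion of this version. For reference, a standard form of the identity consistent with the stated eigenvalue condition is $\mathbb{E}\left[\mathrm{exp}\left({}^{t}aZ-\frac{b}{2}{}^{t}ZZ\right)\right]={\mathrm{det}(I+bK)}^{-1/2}\mathrm{exp}\left(\frac{1}{2}{}^{t}(a+{K}^{-1}c)K{(I+bK)}^{-1}(a+{K}^{-1}c)-\frac{1}{2}{}^{t}c{K}^{-1}c\right)$; the exact formulation in [18,50] may differ. The sketch below checks the one-dimensional case against a seeded Monte Carlo estimate:

```python
import numpy as np

# 1-D instance of the Gaussian identity: Z ~ N(c, K); the expectation
# E[exp(a*Z - (b/2)*Z^2)] is finite and explicit when 1 + b*K > 0
# (this is the condition "alpha*b > -1" on the eigenvalues alpha of K).
c, K, a, b = 0.7, 1.3, 0.4, 0.5
assert 1 + b * K > 0

# Completing the square gives the closed form
# (1 + bK)^{-1/2} * exp( (a + c/K)^2 * K / (2(1 + bK)) - c^2 / (2K) ).
closed = (1 + b * K) ** (-0.5) * np.exp(
    (a + c / K) ** 2 * K / (2 * (1 + b * K)) - c ** 2 / (2 * K)
)

# Deterministic (seeded) Monte Carlo estimate of the same expectation.
rng = np.random.default_rng(0)
z = rng.normal(c, np.sqrt(K), size=2_000_000)
mc = np.exp(a * z - 0.5 * b * z ** 2).mean()
assert abs(mc - closed) < 1e-2
```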

Block-circulant matrices may be diagonalised using DFTs as follows.

**Lemma 20.** Let B be a symmetric block-circulant matrix with the (j, k) T × T block given by (B^{(}^{j}^{−}^{k}^{) mod} ^{N}), j, k = −n, ⋯, n. Let W^{(N)} be the N × N unitary matrix with elements ${W}_{jk}^{(N)}=\frac{1}{\sqrt{N}}\mathrm{exp}\left(\frac{2\pi ijk}{N}\right)$, j, k = −n, ⋯, n. Then B may be ‘block’-diagonalised in the following manner (where ⊗ is the Kronecker product and ^{∗} the complex conjugate),

Here ${\tilde{B}}^{j}$ is a T × T Hermitian matrix and is the DFT defined in Equation (88). We observe also that λ is an eigenvalue of B if and only if λ is an eigenvalue of ${\tilde{B}}^{k}$ for some k.
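This block-diagonalisation is easy to check numerically. The following sketch (numpy; it uses one common sign convention for the DFT, which may differ from Equation (88) by a sign in the exponent) assembles a symmetric block-circulant matrix and verifies that its spectrum is the union of the spectra of the T × T DFT blocks:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 3  # N blocks of size T x T

# Random blocks C[m], m = 0..N-1, with C[m]^T = C[N-m] so that B is symmetric.
C = [rng.standard_normal((T, T)) for _ in range(N)]
C[0] = C[0] + C[0].T
for m in range(1, (N + 1) // 2):
    C[N - m] = C[m].T

# Assemble the symmetric block-circulant matrix B with (j,k) block C[(j-k) % N].
B = np.block([[C[(j - k) % N] for k in range(N)] for j in range(N)])
assert np.allclose(B, B.T)

# 'Block'-DFT: B_tilde[j] = sum_m C[m] * exp(-2*pi*i*j*m/N) is Hermitian,
# and the spectrum of B is the union of the spectra of the B_tilde[j].
omega = np.exp(-2j * np.pi / N)
B_tilde = [sum(C[m] * omega ** (j * m) for m in range(N)) for j in range(N)]

ev_blocks = np.sort(np.concatenate([np.linalg.eigvalsh(Bt) for Bt in B_tilde]))
ev_full = np.sort(np.linalg.eigvalsh(B))
assert np.allclose(ev_blocks, ev_full)
```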

## B. Proof of Proposition 4

We first recall Proposition 4:

**Proposition 4.** Fix$\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. For all ε > 0, there exists an open neighborhood V_{ε}(ν) such that for all µ ∈ V_{ε}(ν), all s, t ∈ [1, T] and all θ ∈ [−π, π[,

**Proof**. Let µ be in
${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ and θ ∈ [−π, π[. We have

Using Equation (23) we have

µ^{L} and ν^{L}. Since $\left|f\left({u}_{s-1}^{0}\right)f\left({u}_{t-1}^{l}\right)-f\left({v}_{s-1}^{0}\right)f\left({v}_{t-1}^{l}\right)\right|\le 2({k}_{f}{d}_{L}({\pi}_{L}u,{\pi}_{L}v)\wedge 1)$, where ${k}_{f}$ is the Lipschitz constant of the function f, we find (through Equation (12)) that

Thus for Equation (32) to be satisfied, it suffices for us to stipulate that V_{ε}(ν) is a ball of radius less than
$\frac{\epsilon}{2}$ (with respect to the distance metric in Equation (12)). Similar reasoning dictates that Equation (35) is satisfied too.

However, in light of Lemma 4, it is evident that we may take the radius of V_{ε}(ν) to be sufficiently small that Equations (32), (35) and (36) are satisfied. In fact Equation (33) is also satisfied, as it may be obtained by taking the limit as N → ∞ of Equation (36). Since c^{µ} is determined by the one-dimensional spatial marginal of µ, it follows from the definition of the metric in Equation (12) that we may take the radius of V_{ε}(ν) to be sufficiently small that Equation (34) is satisfied too.

## C. Existence of a Lower Bound for Φ^{N}(µ, v)

In order to prove that φ^{N}(µ, v) defined in Equation (42) possesses a lower bound, we use a spectral representation and let
${\tilde{w}}^{j}={\tilde{v}}^{j}$ for all j, except that
${\tilde{w}}^{0}={\tilde{v}}^{0}-N{c}^{\mu}$. We may then write

Thus, in order that the integrand possesses a lower bound, it suffices to prove, since the matrices ${\tilde{A}}_{[N]}^{\mu ,l}$ are Hermitian positive, that there exists a lower bound for

We have made use of the fact that ${\tilde{w}}^{0}$ and ${\tilde{A}}_{[N]}^{\mu ,0}$ are real (since they are each a sum of real variables). Let ${\tilde{K}}_{[N]}^{\mu ,0}={O}_{[N]}^{\mu}{D}_{[N]}^{\mu}{}^{\dagger}{O}_{[N]}^{\mu}$, where ${D}_{[N]}^{\mu}$ is diagonal and ${O}_{[N]}^{\mu}$ is orthogonal. We define $X={}^{\dagger}{O}_{[N]}^{\mu}{\tilde{w}}^{0}$, so that Equation (84) is equal to

**Lemma 21.** For each 1 ≤ t ≤ T,

**Proof**. If $\overline{J}=0$ the conclusion is evident, thus we assume throughout this proof that $\overline{J}\ne 0$. Since ${D}_{[N],tt}^{\mu}={}^{\dagger}{\overline{O}}_{[N],t}^{\mu}{\tilde{K}}_{[N]}^{\mu ,0}{O}_{[N],t}^{\mu}$, we find from the definition that

We introduce the matrices (L^{µ,k}), k ∈ ℤ, where for 1 ≤ s, t ≤ T,

These matrices have the same properties as the matrices M^{µ,k}; in particular, the discrete Fourier transform
${\left({\tilde{L}}^{{\mu}^{N},l}\right)}_{l=-n,\dots ,n}$ is Hermitian positive. Using this spectral representation we write

We may use the previous lemma to obtain a lower bound for the quadratic form (85). We recall the easily-proved identity from the calculus of quadratics that, for all x ∈ ℝ,
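The corresponding display is missing from this version; presumably it is the standard completing-the-square bound, stated here as a reconstruction under that assumption:

```latex
\frac{\gamma}{2}\,x^{2} - \beta x \;\ge\; -\,\frac{\beta^{2}}{2\gamma},
\qquad \text{for all } x \in \mathbb{R},\ \gamma > 0,
```

with equality at $x = \beta /\gamma$.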

We therefore find, through Lemma 21, that Equation (85) is greater than or equal to

We have already noted in the proof of Proposition 5 that
$\text{Trace}({\tilde{K}}^{\mu ,0})\le T{\mathrm{\Lambda}}^{sum}$. Thus, pulling these results together, we find that ϕ^{N}(µ, v) is greater than −β_{2}, where

## D. Proof of Lemmas 10 and 15

For technical reasons, we need the following definition, which is also used in Appendix E. The motivation is that when we analyze the function Φ^{N}(µ, v) defined in Equation (43) we are led to use spectral representations and to introduce the Fourier transform
$\tilde{v}$ of v. Since
$\tilde{v}\in {(\mathbb{C}^{T})}^{N}$, the correspondence
$v\to \tilde{v}$ from
${\mathcal{T}}_{1,T}^{N}$ to ${(\mathbb{C}^{T})}^{N}$ is not a bijection. We need to take into account the symmetries of
$\tilde{v}$, hence the following definition.

**Definition 8.** For
$v={\left({v}^{j}\right)}_{j=-n\cdots n}\in {\mathcal{T}}_{1,T}^{N}$, we denote
${\mathscr{H}}^{N}(v)={v}_{\diamond}=\left({v}_{\diamond}^{-n},\dots ,{v}_{\diamond}^{n}\right)\in {\mathcal{T}}_{1,T}^{N}$, where v_{⋄} is defined from the discrete Fourier transform
$\tilde{v}=\left({\tilde{v}}^{-n},\cdots ,{\tilde{v}}^{n}\right)$ of v as follows

The inverse transform is given by ${v}^{j}=\frac{1}{N}{\displaystyle {\sum}_{k=-n}^{n}{\tilde{v}}^{k}\mathrm{exp}\left(-\frac{2\pi ijk}{N}\right)}$.
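As a quick numerical sanity check of this convention (a sketch in numpy, not part of the paper), the following verifies the forward/inverse pair, the even/odd symmetry of the real and imaginary parts for a real sequence v, and the Parseval relation used later in Appendix E:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 7                       # N = 2n + 1 odd; indices run over -n..n
n = N // 2
v = rng.standard_normal(N)  # real sequence (v^{-n}, ..., v^{n})

idx = np.arange(-n, n + 1)
# Forward DFT with the convention v_tilde^k = sum_j v^j exp(2*pi*i*j*k / N)
v_tilde = np.array([np.sum(v * np.exp(2j * np.pi * idx * k / N)) for k in idx])

# For real v the real part of the DFT is even and the imaginary part is odd.
for k in range(n + 1):
    assert np.isclose(v_tilde[n - k].real, v_tilde[n + k].real)
    assert np.isclose(v_tilde[n - k].imag, -v_tilde[n + k].imag)

# Inverse transform: v^j = (1/N) * sum_k v_tilde^k exp(-2*pi*i*j*k / N)
v_back = np.array(
    [np.sum(v_tilde * np.exp(-2j * np.pi * idx * jj / N)) / N for jj in idx]
)
assert np.allclose(v_back.real, v) and np.allclose(v_back.imag, 0)

# Parseval for this convention: sum_k |v_tilde^k|^2 = N * sum_j |v^j|^2
assert np.isclose(np.sum(np.abs(v_tilde) ** 2), N * np.sum(v ** 2))
```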

Because v is in ${\mathcal{T}}_{1,T}^{N}$ the real part of its DFT is even $(\mathrm{Re}\left({\tilde{v}}^{-k}\right)=\mathrm{Re}\left({\tilde{v}}^{k}\right),k=-n,\cdots ,n)$ and similarly its imaginary part is odd. As a consequence we define

It is easily verified that the mapping v → v_{⋄} = $\mathscr{H}$^{N}(v) is a bijection from
${\mathcal{T}}_{1,T}^{N}$ to itself, the inverse being given by

For a probability measure ${\mu}^{N}\in {\mathrm{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$, we define ${\mu}_{\diamond}^{N}={\mu}_{1,T}^{N}\circ {({\mathscr{H}}^{N})}^{-1}$ to be the image law.

We also denote by ${\underset{\xaf}{\mu}}_{\diamond}^{N}$ the measure ${\underset{\xaf}{\mu}}_{1,T}^{N}\circ {({\mathscr{H}}^{N})}^{-1}$ (where ${\underset{\xaf}{\mu}}^{N}$ is given in Definition 1). We note that

We notice that ${\mathrm{\Gamma}}_{[N],2}(\mu )={\displaystyle {\int}_{{\mathcal{T}}_{1,T}^{N}}{\varphi}_{\diamond}^{N}\left(\mu ,{v}_{\diamond}\right){\underset{\xaf}{\mu}}_{\diamond}^{N}(d{v}_{\diamond})}$, where

**Lemma 10.** There exist positive constants c > 0 and a > 1 such that, for all N,

where ${\varphi}^{N}$ is defined in Equation (43).

**Proof.** We have from Equation (83) that
${\varphi}^{N}(\mu ,v)={\underset{\xaf}{\varphi}}_{\diamond}^{N}(\mu ,{w}_{\diamond})$, where
${w}_{\diamond}^{j}={v}_{\diamond}^{j}$ for all j, except that
${w}_{\diamond}^{0}={v}_{\diamond}^{0}-N{c}^{\mu}$. Since (by Equation (90)) the distribution of the variables
${v}_{\diamond}$ under
${\underset{\xaf}{P}}_{\diamond}^{\otimes N}$ is
${\mathcal{N}}_{T}{\left({0}_{T},N{\sigma}^{2}{\mathrm{Id}}_{T}\right)}^{\otimes N}$, the distribution of
${w}_{\diamond}$ under
${\underset{\xaf}{P}}_{\diamond}^{\otimes N}$ is
${\mathcal{N}}_{T}{\left(N{c}^{\mu},N{\sigma}^{2}{\mathrm{Id}}_{T}\right)}^{\otimes N}$. By Lemma 7, the eigenvalues of
${\tilde{A}}_{[N]}^{\mu ,j}$ are upperbounded by 0 < α < 1, for all j. Thus

Hence we find that

We note the dependency of
${\mathcal{G}}_{1}$ on (y^{j}) (for all |j| ≠ n) via
${c}^{\widehat{\mu}N}$. After diagonalisation, we find that

We assume that a > 1 is such that 1 − aα > 0. To bound this expression, we use the fact that if $\mathcal{A}:\mathbb{R}\to \mathbb{R}$ satisfies
$\left|\mathcal{A}\right|\le \mathcal{B}>0$ and γ_{c} > 0, then

Since $\left|{c}_{s}^{\widehat{\mu}N}\right|\le \left|\overline{J}\right|$ for s = 1, ⋯, T, and hence $\Vert {c}^{\widehat{\mu}N}{\Vert}^{2}\le T{\overline{J}}^{2}$, we find that ${\mathcal{G}}_{1}\le {\mathcal{G}}_{1}^{c}$, where

Thus

We include the proof of Lemma 15, which is used in the proof of the upper bound on compact sets in Section 4.3.2.

**Lemma 15.** There exist constants a > 1 and c > 0 such that for all$\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)\cap {\epsilon}_{2}$,

**Proof.** The constant a > 1 is chosen as in the proof of Lemma 10. We have (from Equation (50)) that

We recall that I^{(2)} may be expressed using the variational expression (49) as

where ${\varsigma}^{N}$ is a continuous, bounded function on ${\mathcal{T}}^{N}$. We let ${\varsigma}_{M}^{N}=a{1}_{{B}_{M}}{\varsigma}_{*}^{N}$, where ${\varsigma}_{*}^{N}(u)=N\left({\varphi}^{N}\left({\mu}^{N},{\mathrm{\Psi}}_{1,T}(u)\right)+{\mathrm{\Gamma}}_{[N],1}(\mu )\right)$, and u ∈ B_{M} only if either ||Ψ(u)|| ≤ NM or $\left({\varphi}^{N}\left({\mu}^{N},{\mathrm{\Psi}}_{1,T}(u)\right)+{\mathrm{\Gamma}}_{[N],1}(\mu )\right)\le 0$. We proved in Section 3.3.2 that ${\varphi}^{N}\left({\mu}^{N},{\mathrm{\Psi}}_{1,T}(u)\right)$ possesses a lower bound, which means that ${\varsigma}_{M}^{N}$ is continuous and bounded. Furthermore ${\varsigma}_{M}^{N}$ grows to ${\varsigma}_{*}^{N}$, so that after substituting ${\varsigma}_{M}^{N}$ into Equation (93) and taking M → ∞ (i.e., applying the dominated convergence theorem), we obtain

It can be easily shown, similarly to Lemma 10, that $\mathrm{log}{\displaystyle {\int}_{{\mathcal{T}}^{N}}\mathrm{exp}\left(a{\varsigma}_{*}^{N}(u)\right){P}^{\otimes N}(du)}\le Nc$. We may thus divide both sides by aN and let N → ∞ to obtain the required result.

## E. Proof of Lemma 14

We prove Lemma 14.

**Lemma 14.** There exists a constant C_{0} such that for all ν in${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, all ε > 0 and all µ ∈ V_{ε}(ν)∩ε_{2},

Here V_{ε}(ν) is the open neighborhood defined in Proposition 4, and ${\underset{\xaf}{\mu}}$ is given in Definition 1.

**Proof.** We first bound Γ_{1}. From Equations (60) and (61) and Lemma 20 we have

It thus follows from Proposition 4 and Lemma 13 that

We define
${\varphi}_{\infty ,\diamond}^{N}\left(\nu ,{v}_{\diamond}\right)={\varphi}_{\infty}^{N}\left(\nu ,{\left({\mathscr{H}}^{N}\right)}^{-1}({v}_{\diamond})\right)$, where ${\mathscr{H}}^{N}$ is given in Definition 8 and
${\varphi}_{\infty}^{N}$ is given in Equation (59), and find that

This means that

Upon expansion of the above expression, we find that


The lemma now follows after consideration of the fact that ${\displaystyle {\int}_{{\mathcal{T}}_{T}^{\mathbb{Z}}}\Vert {v}^{k}{\Vert}^{2}{\underset{\xaf}{\mu}}_{1,T}(dv)}={\mathbb{E}}^{{\underset{\xaf}{\mu}}_{1,T}}\left[\Vert {v}^{0}{\Vert}^{2}\right]$, $\Vert {\tilde{v}}^{0}{\Vert}^{2}\le N{\displaystyle {\sum}_{k=-n}^{n}\Vert {v}^{k}{\Vert}^{2}}$ (Cauchy-Schwarz) and, because of the properties of the DFT, ${\displaystyle {\sum}_{k=-n}^{n}\Vert {\tilde{v}}^{k}{\Vert}^{2}}=N{\displaystyle {\sum}_{l=-n}^{n}\Vert {v}^{l}{\Vert}^{2}}$. □

## Acknowledgments

Many thanks to Bruno Cessac whose suggestion to look at process-level empirical measures and entropies has been very useful and whose insights into the physical interpretations of our results have been very stimulating.

This work was partially supported by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 269921 (BrainScaleS), no. 318723 (Mathemacs), and by the ERC advanced grant NerVi no. 227747.

This work was supported by INRIA FRM, ERC-NERVI number 227747, European Union Project # FP7-269921 (BrainScales), and Mathemacs # FP7-ICT-2011.9.7.

## Author Contributions

Both authors contributed to all parts of the article. All authors have read and approved the final manuscript.

## Conflicts of Interest

The authors declare no conflict of interest.

## References and Notes

1. Guionnet, A. Dynamique de Langevin d’un verre de spins. Ph.D. Thesis, Université de Paris Sud, Orsay, France, March 1995.
2. Ben-Arous, G.; Guionnet, A. Large deviations for Langevin spin glass dynamics. Probab. Theory Relat. Fields **1995**, 102, 455–509.
3. Ben-Arous, G.; Guionnet, A. Symmetric Langevin Spin Glass Dynamics. Ann. Probab. **1997**, 25, 1367–1422.
4. Guionnet, A. Averaged and quenched propagation of chaos for spin glass dynamics. Probab. Theory Relat. Fields **1997**, 109, 183–215.
5. Sompolinsky, H.; Zippelius, A. Dynamic theory of the spin-glass phase. Phys. Rev. Lett. **1981**, 47, 359–362.
6. Sompolinsky, H.; Zippelius, A. Relaxational dynamics of the Edwards-Anderson model and the mean-field theory of spin-glasses. Phys. Rev. B **1982**, 25, 6860–6875.
7. Crisanti, A.; Sompolinsky, H. Dynamics of spin systems with randomly asymmetric bonds: Langevin dynamics and a spherical model. Phys. Rev. A **1987**, 36, 4922–4939.
8. Crisanti, A.; Sompolinsky, H. Dynamics of spin systems with randomly asymmetric bonds: Ising spins and Glauber dynamics. Phys. Rev. A **1987**, 37, 4865–4874.
9. Dawson, D.; Gartner, J. Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics **1987**, 20, 247–308.
10. Dawson, D.; Gartner, J. Multilevel large deviations and interacting diffusions. Probab. Theory Relat. Fields **1994**, 98, 423–487.
11. Budhiraja, A.; Dupuis, P.; Fischer, M. Large deviation properties of weakly interacting processes via weak convergence methods. Ann. Probab. **2012**, 40, 74–102.
12. Fischer, M. On the form of the large deviation rate function for the empirical measures of weakly interacting systems. Bernoulli **2014**, 20, 1765–1801.
13. Sompolinsky, H.; Crisanti, A.; Sommers, H. Chaos in Random Neural Networks. Phys. Rev. Lett. **1988**, 61, 259–262.
14. Gerstner, W.; Kistler, W. Spiking Neuron Models; Cambridge University Press: Cambridge, UK, 2002.
15. Izhikevich, E. Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting; MIT Press: Cambridge, MA, USA, 2007.
16. Ermentrout, G.B.; Terman, D. Foundations of Mathematical Neuroscience; Interdisciplinary Applied Mathematics; Springer: New York, NY, USA, 2010.
17. Cessac, B. Increase in complexity in random neural networks. J. Phys. I **1995**, 5, 409–432.
18. Moynot, O. Etude mathématique de la dynamique des réseaux neuronaux aléatoires récurrents. Ph.D. Thesis, Université Paul Sabatier, Toulouse, France, January 2000.
19. Moynot, O.; Samuelides, M. Large deviations and mean-field theory for asymmetric random recurrent neural networks. Probab. Theory Relat. Fields **2002**, 123, 41–75.
20. Cessac, B.; Samuelides, M. From neuron to neural networks dynamics. Eur. Phys. J. Spec. Top. **2007**, 142, 7–88.
21. Samuelides, M.; Cessac, B. Random Recurrent Neural Networks. Eur. Phys. J. Spec. Top. **2007**, 142, 7–88.
22. Kandel, E.; Schwartz, J.; Jessel, T. Principles of Neural Science, 4th ed.; McGraw-Hill: New York, NY, USA, 2000.
23. Dayan, P.; Abbott, L. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems; MIT Press: Cambridge, MA, USA, 2001.
24. Rudin, W. Real and Complex Analysis; McGraw-Hill: New York, NY, USA, 1987.
25. Cugliandolo, L.F.; Kurchan, J.; Le Doussal, P.; Peliti, L. Glassy behaviour in disordered systems with nonrelaxational dynamics. Phys. Rev. Lett. **1997**, 78, 350–353.
26. Lapicque, L. Recherches quantitatives sur l’excitation des nerfs traitée comme une polarisation. J. Physiol. Paris **1907**, 9, 620–635.
27. Daley, D.; Vere-Jones, D. An Introduction to the Theory of Point Processes: General Theory and Structure; Springer: New York, NY, USA, 2007; Volume 2.
28. Gerstner, W.; van Hemmen, J. Coherence and incoherence in a globally coupled ensemble of pulse-emitting units. Phys. Rev. Lett. **1993**, 71, 312–315.
29. Gerstner, W. Time structure of the activity in neural network models. Phys. Rev. E **1995**, 51, 738–758.
30. Cáceres, M.J.; Carillo, J.A.; Perthame, B. Analysis of nonlinear noisy integrate and fire neuron models: Blow-up and steady states. J. Math. Neurosci. **2011**, 1.
31. Baladron, J.; Fasoli, D.; Faugeras, O.; Touboul, J. Mean-field description and propagation of chaos in networks of Hodgkin-Huxley and FitzHugh-Nagumo neurons. J. Math. Neurosci. **2012**, 2.
32. Bogachev, V. Measure Theory, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2007; Volume 1.
33. When N is even the formulae are slightly more complicated, but all the results we prove below in the case N odd are still valid.
34. We denote by ${\mathcal{N}}_{p}(m,\Sigma )$ the law of the p-dimensional Gaussian variable with mean m and covariance matrix Σ.
35. Ellis, R. Entropy, Large Deviations and Statistical Mechanics; Springer: New York, NY, USA, 1985.
36. Liggett, T.M. Interacting Particle Systems; Springer: Berlin/Heidelberg, Germany, 2005.
37. Deuschel, J.D.; Stroock, D.W. Large Deviations; Pure and Applied Mathematics; Academic Press: Waltham, MA, USA, 1989; Volume 137.
38. Dembo, A.; Zeitouni, O. Large Deviations Techniques, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 1997.
39. Donsker, M.; Varadhan, S. Large deviations for stationary Gaussian processes. Commun. Math. Phys. **1985**, 97, 187–210.
40. Baxter, J.R.; Jain, N.C. An Approximation Condition for Large Deviations and Some Applications. In Convergence in Ergodic Theory and Probability; Bergelson, V., Ed.; De Gruyter: Boston, MA, USA, 1993.
41. Donsker, M.; Varadhan, S. Asymptotic Evaluation of Certain Markov Process Expectations for Large Time, IV. Commun. Pure Appl. Math. **1983**, XXXVI, 183–212.
42. Faugeras, O.; MacLaurin, J. A Large Deviation Principle and an Analytical Expression of the Rate Function for a Discrete Stationary Gaussian Process. arXiv **2013**, arXiv:1311.4400.
43. Chiyonobu, T.; Kusuoka, S. The Large Deviation Principle for Hypermixing Processes. Probab. Theory Relat. Fields **1988**, 78, 627–649.
44. We noted in the introduction that this is termed propagation of chaos by some.
45. Bressloff, P. Stochastic neural field theory and the system-size expansion. SIAM J. Appl. Math. **2009**, 70, 1488–1521.
46. Buice, M.; Cowan, J. Field-theoretic approach to fluctuation effects in neural networks. Phys. Rev. E **2007**, 75.
47. Ginzburg, I.; Sompolinsky, H. Theory of correlations in stochastic neural networks. Phys. Rev. E **1994**, 50, 3171–3191.
48. ElBoustani, S.; Destexhe, A. A master equation formalism for macroscopic modeling of asynchronous irregular activity states. Neural Comput. **2009**, 21, 46–100.
49. Buice, M.; Cowan, J.; Chow, C. Systematic fluctuation expansion for neural network activity equations. Neural Comput. **2010**, 22, 377–426.
50. Neveu, J. Processus aléatoires gaussiens; Presses de l’Université de Montréal: Montréal, QC, Canada, 1968; Volume 34.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).