# Asymptotic Description of Neural Networks with Correlated Synaptic Weights


## Abstract


## 1. Introduction

The limit measure µ_{e} is system-wide and ergodic. Our work challenges the assumption held by some that one cannot have a “concise” macroscopic description of a neural network without an assumption of asynchronicity at the local population level.

We obtain a limit measure µ_{e} which describes the large-size behavior of the system. Furthermore, we are able to obtain various “reductions” to the macroscopic level, as outlined in Section 6.

We consider the empirical measure ${\widehat{\mu}}^{N}$ of the N trajectories of the solutions to the equations of the network of N neurons. The first result of this article (Theorem 1) is that the image law Π^{N} of Q^{N} through ${\widehat{\mu}}^{N}$ satisfies a large deviation principle (LDP) with a good rate function H which is shown to have a unique global minimum, µ_{e}. We remind the reader that the notion of an image law is simply an extension to more complicated objects than functions, i.e., probability measures, of the usual notion of a change of variables. The interested reader is referred to, e.g., the textbook [32]. Thus, with respect to the measure Π^{N} on ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, if the set X contains the measure ${\delta}_{{\mu}_{e}}$, then Π^{N}(X) → 1 as N → ∞, whereas if ${\delta}_{{\mu}_{e}}$ is not in the closure of X, Π^{N}(X) → 0 as N → ∞ exponentially fast, and the constant in the exponential rate is determined by the rate function. Our analysis of the rate function allows us also, and this is our second result (Theorem 3), to characterize the limit measure µ_{e} as the image of a stationary Gaussian measure ${\underline{\mu}}_{e}$ defined on a transformed set of trajectories ${\mathcal{T}}^{\mathbb{Z}}$. This is potentially very useful for applications since ${\underline{\mu}}_{e}$ can be completely characterized by its mean and spectral density. Furthermore, the rate function allows us to quantify the probability of finite-size effects. Theorems 1 and 3 allow us to characterize the average (over the synaptic weights) behavior of the network. We also derive, and this is our third result, some properties of the infinite-size network that are true for almost all realizations of the synaptic weights (Theorems 4 and 6).

Section 2 introduces the model and the main notations, in particular the law Q^{N} and the image law R^{N} through ${\widehat{\mu}}_{N}$ of the law of the uncoupled neurons. We state the principal result of this paper in Theorem 1. In Section 3 we show that the Radon-Nikodym derivative of Q^{N} with respect to the law of the uncoupled neurons can be expressed through the Gaussian process corresponding to the empirical measure ${\widehat{\mu}}_{N}$. This allows us to compute the Radon-Nikodym derivative of Π^{N} with respect to R^{N} for any measure in ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$. Using these results, Section 4 is dedicated to the proof of the existence of a strong LDP for the measure Π^{N}. In Section 5 we show that the good rate function obtained in the previous section has a unique global minimum and we characterize it as the image of a stationary Gaussian measure. Section 6 is dedicated to drawing some important consequences of our first main theorem, in particular some quenched results. Section 7 explores some possible extensions of our work and we conclude with Section 8.

## 2. The Neural Network Model

#### 2.1. The Model Equations

The equation describing the time variation of the membrane potential U^{j} of the jth neuron writes

where f is a sigmoidal function, assumed to be Lipschitz continuous with Lipschitz constant k_{f}. We could for example employ f(x) = (1 + tanh(gx))/2, where the parameter g can be used to control the slope of the “sigmoid” f at the origin x = 0.
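As a concrete check (our own illustration, not from the paper): for this choice of f, the slope at the origin is g/2, so g directly scales the gain of the transfer function.

```python
import numpy as np

def f(x, g=1.0):
    """Example sigmoid from the text: f(x) = (1 + tanh(g*x)) / 2."""
    return (1.0 + np.tanh(g * x)) / 2.0

# f maps R onto (0, 1) with f(0) = 1/2; its slope at the origin is g/2.
g = 3.0
h = 1e-6
slope_at_origin = (f(h, g) - f(-h, g)) / (2 * h)
print(f(0.0, g), slope_at_origin)   # 0.5 and approximately g/2 = 1.5
```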

We note J^{N} the N × N matrix of the synaptic weights, ${J}^{N}={\left({J}_{ij}^{N}\right)}_{i,j=-n,\cdots ,n}$. Their covariance is assumed to satisfy the following shift invariance property:

**Remark 1.** This shift invariance property is technically useful since it allows us to use the tools of Fourier analysis. In terms of the neural population it means that the neurons “live” on a circle. Therefore, unlike in the uncorrelated case studied in the papers cited in the introduction, we have to indirectly introduce a notion of space.

Let Λ^{N} be the restriction of Λ to [−n, n]^{2}, i.e., Λ^{N}(i, j) = Λ(i, j) for −n ≤ i, j ≤ n. We note $\tilde{\mathrm{\Lambda}}$ the Fourier transform of Λ and ${\tilde{\mathrm{\Lambda}}}^{N}$ that of Λ^{N}. The proof of the following proposition is obvious.

**Proposition 1.** The sum $\tilde{\mathrm{\Lambda}}\left({\theta}_{1},{\theta}_{2}\right)$ of the absolutely convergent series ${\left(\mathrm{\Lambda}\left(k,l\right){e}^{-i(k{\theta}_{1}+l{\theta}_{2})}\right)}_{k,l\in \mathbb{Z}}$ is continuous on [−π, π[^{2} and positive. The covariance function Λ is recovered from the inverse Fourier transform of $\tilde{\mathrm{\Lambda}}$:
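Proposition 1 is what makes shift-invariant weights practical to simulate: when the spectral density is nonnegative, a correlated Gaussian field can be generated by spectral filtering of white noise. The sketch below is our own illustration, not the paper's procedure; the covariance Λ(k, l) = ρ^{|k|+|l|}, wrapped onto the N-torus, is an assumed example, and the check at the end is deterministic (no Monte Carlo).

```python
import numpy as np

N, rho = 8, 0.5
rng = np.random.default_rng(0)

# Assumed example covariance Lambda(k, l) = rho^(|k| + |l|), wrapped onto the
# N x N torus: dist[i] = min(i, N - i) is the distance on the circle of N neurons.
dist = np.minimum(np.arange(N), N - np.arange(N))
lam = rho ** (dist[:, None] + dist[None, :])

# Discrete analogue of Lambda-tilde: it must be real and nonnegative.
S = np.fft.fft2(lam).real

# Spectral sampling: filter white noise so its covariance becomes lam.
h = np.fft.ifft2(np.sqrt(np.maximum(S, 0.0))).real        # filter kernel
Z = rng.standard_normal((N, N))                           # white noise
J = np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(Z)).real    # circular convolution h * Z

# Deterministic check: the circular autocorrelation of h equals lam,
# hence Cov(J_x, J_y) = lam[y - x] exactly.
acorr = np.array([[np.sum(h * np.roll(np.roll(h, -di, 0), -dj, 1))
                   for dj in range(N)] for di in range(N)])
print(S.min() >= 0, np.allclose(acorr, lam))   # True True
```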

#### 2.2. The Laws of the Uncoupled and Coupled Processes

#### 2.2.1. Preliminaries

Let $\mathcal{T}$ denote the set of sequences (u_{t})_{t=0,⋯,T} of length T + 1 of real numbers. ${\mathcal{T}}^{N}$ is the set of sequences (u^{−n},⋯,u^{n}) (N = 2n + 1) of elements of $\mathcal{T}$ that we use to describe the solutions to Equation (1). Similarly we note ${\mathcal{T}}^{\mathbb{Z}}$ the set of doubly infinite sequences of elements of $\mathcal{T}$. If u is in ${\mathcal{T}}^{\mathbb{Z}}$ we note ${u}^{i},i\in \mathbb{Z}$, its ith coordinate, an element of $\mathcal{T}$. Hence $u={\left({u}^{i}\right)}_{i=-\infty \cdots \infty}$.

We write π_{t} and ${\mathcal{T}}_{t}$ rather than π_{t,t} and ${\mathcal{T}}_{t,t}$. The temporal projection π_{s,t} extends in a natural way to ${\mathcal{T}}^{N}$ and ${\mathcal{T}}^{\mathbb{Z}}$: for example π_{s,t} maps ${\mathcal{T}}^{N}$ to ${\mathcal{T}}_{s,t}^{N}$. We define the spatial projection ${\pi}^{N}:{\mathcal{T}}^{\mathbb{Z}}\to {\mathcal{T}}^{N}(N=2n+1)$ to be ${\pi}^{N}(u)=\left({u}^{-n},\dots ,{u}^{n}\right)$. Temporal and spatial projections commute, i.e., ${\pi}^{N}\circ {\pi}_{s,t}={\pi}_{s,t}\circ {\pi}^{N}$.

Given an element u = (u^{−n}, …, u^{n}) of ${\mathcal{T}}^{N}$ we form the doubly infinite periodic sequence of period N obtained by repeating u. For u, υ in ${\mathcal{T}}^{N}$, we define the distance d_{N}(u, υ) to be

Given a measure $\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, we denote its marginal distribution at time t by ${\mu}_{t}=\mu \circ {\pi}_{t}^{-1}$. Similarly, ${\mu}_{s,t}^{N}$ is its N-dimensional spatial, t − s + 1-dimensional time marginal $\mu \circ {\left({\pi}^{N}\right)}^{-1}\circ {\pi}_{s,t}^{-1}$.

A measure ${\mu}^{N}\in {\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$ is stationary if, whenever (u^{−n}, …, u^{n}) are random variables governed by µ^{N}, then for all |m| ≤ n, (u^{m−n}, …, u^{m+n}) has the same law as (u^{−n}, …, u^{n}) (recall that the indexing is taken modulo N), or equivalently ${\mu}^{N}\circ {\mathcal{S}}^{-1}={\mu}^{N}$ (remember Equation (8)).

**Remark 2.** Note that the stationarity discussed here is a spatial stationarity.

Given (u^{−n}, …, u^{n}) in ${\mathcal{T}}^{N}$, we associate with it the measure, noted ${\widehat{\mu}}_{N}\left({u}^{-n},\dots ,{u}^{n}\right)$, in ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ defined by

**Remark 3.** This is a significant difference from previous work dealing with uncorrelated weights (e.g., [19]), where the N processes are coupled through the “usual” empirical measure $d{\widehat{\mu}}_{N}\left({u}^{-n},\cdots ,{u}^{n}\right)(y)=\frac{1}{N}{\displaystyle \sum {}_{i=-n}^{n}{\delta}_{{u}^{i}}(y)}$, which is a measure on $\mathcal{T}$. In our case, because of the correlations, and as shown in Section 3.4, the processes are coupled through the process-level empirical measure of Equation (10), which is a probability measure on ${\mathcal{T}}^{\mathbb{Z}}$. This makes our analysis more biologically realistic, since we know that correlations between the synaptic weights do exist, but technically more involved.
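A toy rendering of the process-level construction may help (our illustration, assuming the periodization convention of Section 2.2.1): the measure of Equation (10) puts mass 1/N on each spatially shifted copy of the periodized configuration, and is therefore shift-invariant by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 2, 3
N = 2 * n + 1
u = rng.standard_normal((N, T + 1))   # N trajectories of length T + 1

def shifted_copy(u, k):
    """Window of width N of the k-shifted N-periodic extension of u."""
    return np.stack([u[(i + k) % N] for i in range(N)])

# The process-level empirical measure: mass 1/N on each of the N shifts.
atoms = [shifted_copy(u, k) for k in range(N)]

# Shifting every atom once more permutes the same set of atoms (invariance).
shifted = [shifted_copy(u, k + 1) for k in range(N)]
invariant = all(any(np.allclose(a, b) for b in atoms) for a in shifted)
print(len(atoms), invariant)   # 5 True
```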

k_{f} d_{N}(u, υ) ∧ 1, where k_{f} is a positive constant defined at the start of Section 2.1 and $\mathcal{J}$ is the set of all measures in ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\times {\mathcal{T}}^{N}\right)$ with N-dimensional marginals µ^{N} and υ^{N}.

**Remark 4.** The use of k_{f} in Equation (11) is technical and serves to simplify the proof of Proposition 5.

Note that d_{N}(μ^{N}, υ^{N}) ≤ 1 and $\sum {}_{n=0}^{\infty}{\kappa}_{n}<\infty$. It can be shown that ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ equipped with this metric is Polish.

#### 2.2.2. Coupled and Uncoupled Processes

Let µ_{I} be the individual law on $\mathbb{R}$ of ${U}_{0}^{j}$; it follows that the joint law of the variables is ${\mu}_{I}^{\otimes N}$ on ${\mathbb{R}}^{N}$. We note P the law of the solution to one of the uncoupled equations (1), where we take ${J}_{ij}^{N}=0,i,j=-n,\cdots ,n$. P is the law of the solution to the following stochastic difference equation:

with initial condition distributed according to µ_{I}. This process can be characterized exactly, as follows.

**Proposition 2.** The law P of the solution to Equation (13) writes

where 0_{T} is the T-dimensional vector with all coordinates equal to 0 and Id_{T} is the T-dimensional identity matrix.

If $u\in {\mathcal{T}}^{N}$, we write Ψ(u) = (Ψ(u^{−n}), …, Ψ(u^{n})). A similar convention applies if $u\in {\mathcal{T}}^{\mathbb{Z}}$. We also use the notation Ψ_{1,T} for the mapping $\mathcal{T}\to {\mathcal{T}}_{1,T}$ such that Ψ_{1,T} = π_{1,T} ∘ Ψ.

We note Q^{N}(J^{N}) the element of ${\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$ which is the law of the solution to Equation (1) conditioned on J^{N}. We let ${Q}^{N}={\mathbb{E}}^{J}[{Q}^{N}({J}^{N})]$ be the law averaged with respect to the weights. The reason for this is as follows. We want to study the empirical measure ${\widehat{\mu}}_{N}$ on path space. There is no reason for this to be a simple problem since, for a fixed interaction J^{N}, the variables (U^{−n}, ⋯, U^{n}) are not exchangeable. So we first study the law of ${\widehat{\mu}}_{N}$ averaged over the interaction before we prove, in Section 6, some almost sure properties of this law. Q^{N} is a common construction in the physics of interacting particle systems and is known as the annealed law [36].

**Lemma 1.** P^{⊗N}, Q^{N} and ${\widehat{\mu}}_{N}^{N}$ (the N-dimensional marginal of ${\widehat{\mu}}_{N}$) are in ${\mathcal{M}}_{1,\mathcal{S}}^{+}\left({\mathcal{T}}^{N}\right)$.

**Definition 1.** For each measure $\mu \in {\mathcal{M}}_{1}^{+}\left({\mathcal{T}}^{N}\right)$ or ${\mathcal{M}}_{1,\mathcal{S}}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$ we define $\underline{\mu}$ to be μ ∘ Ψ^{−1}.

**Definition 2.** Let Π^{N} (respectively R^{N}) be the image law of Q^{N} (respectively P^{⊗N}) through the function ${\widehat{\mu}}_{N}:{\mathcal{T}}^{N}\to {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ defined by Equation (10).

**Theorem 1.** Π^{N} is governed by a large deviation principle (LDP) with a good rate function H (to be found in Definition 5). That is, if F is a closed set in ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, then

**Remark 5.** We recall that the above LDP is also called a strong LDP.

We prove below that Π^{N} satisfies a weak LDP, i.e., that it satisfies Equation (16) when F is compact and Equation (17) for all open O. We also prove in Section 4.2 that {Π^{N}} is exponentially tight, and we prove in Section 4.4 that H is a good rate function. It directly follows from these results that Π^{N} satisfies a strong LDP with good rate function H [38]. Finally, in Section 5 we prove that H has a unique minimum µ_{e} to which ${\widehat{\mu}}_{N}$ converges weakly as N → ∞. This minimum is a (stationary) Gaussian measure which we describe in detail in Theorem 3.

## 3. The Good Rate Function

The idea is to study the coupled process (Π^{N}) via the (simpler) process without correlations (R^{N}). However, to do this we require an expression for the Radon-Nikodym derivative of Π^{N} with respect to R^{N}, which is the main result of this section. The derivative will be expressed in terms of a function ${\mathrm{\Gamma}}_{[N]}:{\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)\to \mathbb{R}$. We will firstly define Γ_{[N]}(µ), demonstrating that it may be expressed in terms of a Gaussian process ${G}_{[N]}^{\mu}$ (to be defined below), and then use this to determine the Radon-Nikodym derivative of Π^{N} with respect to R^{N}.

#### 3.1. Gaussian Processes

We define a stationary Gaussian process G^{µ} with values in ${\mathcal{T}}_{1,T}^{\mathbb{Z}}$. For all i, the mean of ${G}_{t}^{\mu ,i}$ is given by ${c}_{t}^{\mu}$, where

We now define the covariance of G^{µ}. We first define the following matrix-valued process.

**Definition 3.** Let M^{µ,k}, k ∈ ℤ, be the T × T matrix defined by (for s, t ∈ [1, T]):

where ^{†} denotes the transpose. Furthermore, these matrices feature a spectral representation, i.e., there exists a T × T matrix-valued measure ${\tilde{M}}^{\mu}={({\tilde{M}}_{st}^{\mu})}_{s,t=1,\cdots ,T}$ with the following properties. Each ${\tilde{M}}_{st}^{\mu}$ is a complex measure on [−π, π[ of finite total variation and such that

where ^{*} indicates complex conjugation. We may infer from this that ${\tilde{M}}^{\mu}$ is Hermitian-valued. The spectral representation means that for all vectors $W\in {\mathbb{R}}^{T}$, ${}^{\dagger}W{\tilde{M}}^{\mu}(d\theta )W$ is a positive measure on [−π, π[.

The covariance between G^{µ,i} and G^{µ,i+k} is defined to be

This is well-defined because the series indexed by k, l ∈ ℤ is absolutely convergent and the elements of M^{µ,l} are bounded by 1 for all l ∈ ℤ.

**Proposition 3.** The sequence (K^{µ,k})_{k∈ℤ} has spectral density ${\tilde{K}}^{\mu}$ given by

**Proof.** The proof essentially consists of demonstrating that the matrix function above is a spectral density. Since the series indexed by k, l ∈ ℤ is absolutely convergent, ${\tilde{K}}^{\mu}(\theta )$ is well-defined on [−π, π]. The fact that ${\tilde{K}}^{\mu}(\theta )$ is Hermitian follows from Equations (25) and (24). □
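The Hermitian property invoked here can be checked numerically (our illustration: the matrix sequence below, with geometric decay and the symmetry K^{−k} = ^{†}(K^{k}), stands in for the covariance sequence of Definition 3; the particular values are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 3, 20

# Summable T x T matrix sequence with K^{-k} = transpose(K^k).
K = {}
B0 = rng.standard_normal((T, T))
K[0] = B0 @ B0.T                                  # symmetric at lag 0
for k in range(1, n + 1):
    M = 0.5 ** k * rng.standard_normal((T, T))    # geometric decay => summable
    K[k], K[-k] = M, M.T

def spectral_density(theta):
    """Truncated K_tilde(theta) = sum_k K^k exp(-i k theta)."""
    return sum(K[k] * np.exp(-1j * k * theta) for k in range(-n, n + 1))

# The symmetry K^{-k} = transpose(K^k) forces K_tilde(theta) to be Hermitian.
Kt = spectral_density(0.7)
print(np.allclose(Kt, Kt.conj().T))   # True
```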

#### 3.2. Convergence of Gaussian Processes

**Lemma 2.** Fix $\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$. For all ε > 0, there exists an N such that for all M > N and all j such that $2\left|j\right|+1\le M$, $\Vert {K}_{[M]}^{\mu ,j}-{K}^{\mu ,j}\Vert <\epsilon $, and for all $\theta \in \left[-\pi ,\pi \right[$, $\Vert {\tilde{K}}_{[M]}^{\mu}(\theta )-{\tilde{K}}^{\mu}(\theta )\Vert \le \epsilon $.

**Lemma 3.** The eigenvalues of ${\tilde{K}}_{[N]}^{\mu ,l}$ and ${\tilde{K}}^{\mu}(\theta )$ are upperbounded by ${\rho}_{K}\overset{\mathrm{def}}{=}T{\mathrm{\Lambda}}^{sum}$, where Λ^{sum} is defined in Equation (4).

**Proof.** Let W ∈ ℝ^{T}. We find from Proposition 3 and Equation (4) that

The diagonal elements of M^{µ,0} are all positive (since it is a correlation matrix), which means that each eigenvalue is upperbounded by the trace, which in turn is upperbounded by T. The proof in the finite-dimensional case follows similarly. □

The Fourier series of Ã^{µ} is absolutely convergent as a consequence of Wiener’s Theorem. We thus find that, for l ∈ ℤ,

**Lemma 4.** The map B → B(σ^{2}Id_{T} + B)^{−1} is Lipschitz continuous over the set $\mathrm{\Delta}=\{{\tilde{K}}_{[N]}^{\mu}(\theta ),{\tilde{K}}^{\mu}(\theta ):\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}),N>0,\theta \in [-\pi ,\pi ]\}$.

**Proof.** The proof is straightforward using the boundedness of the eigenvalues of the matrices in Δ. □

**Lemma 5.** Fix $\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. For all ε > 0, there exists an N such that for all M > N and all θ ∈ [−π, π[, $\Vert {\tilde{A}}_{[M]}^{\mu}(\theta )-{\tilde{A}}^{\mu}(\theta )\Vert \le \epsilon $.

**Proposition 4.** Fix $\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. For all ε > 0, there exists an open neighbourhood V_{ε}(ν) such that for all μ ∈ V_{ε}(ν), all s, t ∈ [1, T] and all θ ∈ [−π, π[,

**Proof.** The proof is found in Appendix B. □

**Definition 4.** Let ${\mathcal{E}}_{2}$ be the subset of ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ defined by

Consider the sequence (v^{k})_{k∈ℤ} in ${\mathcal{T}}_{1,T}^{\mathbb{Z}}$, where ${v}_{s}^{k}={\mathrm{\Psi}}_{s}({u}^{k})$, s = 1, ⋯, T. This has a finite mean ${\mathbb{E}}^{{\underline{\mu}}_{1,T}}[{v}^{0}]$, noted ${\overline{v}}^{\mu}$. It admits the following spectral density measure, noted ${\tilde{v}}^{\mu}$, such that

#### 3.3. Definition of the Functional Γ

We now define the functional Γ_{[N]} = Γ_{[N],1} + Γ_{[N],2}, which will be used to characterize the Radon-Nikodym derivative of Π^{N} with respect to R^{N}. Let $\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, and let (μ^{N})_{N≥1} be the N-dimensional marginals of μ (for N = 2n + 1 odd).

#### 3.3.1. Γ_{1}

Γ_{[N],1}(µ), defined in Equation (38), satisfies Γ_{[N],1}(µ) ≤ 0.

We define Γ_{1}(µ) = lim_{N→∞} Γ_{[N],1}(µ). The following lemma indicates that this is well-defined.

**Lemma 6.** When N goes to infinity, the limit of Equation (38) is given by

**Proof.** Through Lemma 20 in Appendix A, we have that

**Proposition 5.** Γ_{[N],1} and Γ_{1} are bounded below and continuous on ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$.

**Proof.** Applying Lemma 19 in the case of $Z=({G}_{[N]}^{\mu ,-n}-{c}^{\mu},\cdots ,{G}_{[N]}^{\mu ,n}-{c}^{\mu})$, a = 0, b = σ^{−2}, we write

since trace(K^{μ,m}) ≤ T. Hence Γ_{[N],1}(μ) ≥ −β_{1}, where

and β_{1} is a lower bound for Γ_{1}(μ) as well. □

#### 3.3.2. Γ_{2}

Γ_{[N],2}(μ), defined in Equation (42), is finite on the subset ${\mathcal{E}}_{2}$ of ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ given in Definition 4. If μ ∉ ${\mathcal{E}}_{2}$, then we set Γ_{[N],2}(μ) = ∞.

We define Γ_{2}(μ) = lim_{N→∞} Γ_{[N],2}(μ). The following proposition indicates that Γ_{2}(μ) is well-defined.

**Proposition 6.** If the measure μ is in ${\mathcal{E}}_{2}$, i.e., if ${\mathbb{E}}^{{\underline{\mu}}_{1,T}}[\Vert {v}^{0}{\Vert}^{2}]<\infty $, then Γ_{2}(μ) is finite and writes

**Proof.** Using Equations (37) and (42), the stationarity of μ, and the fact that ${\sum}_{k=-n}^{n}{A}_{[N]}^{\mu ,k}={\tilde{A}}_{[N]}^{\mu}(0)$, we have

Since ${\tilde{A}}_{[N]}^{\mu}(\theta )$ converges uniformly to Ã^{μ}(θ) as N → ∞, it follows by dominated convergence that Γ_{[N],2}(μ) converges to the expression in the proposition.

The convergence of Γ_{2}(μ) follows analogously, although this time we make use of the fact that the partial sums of the Fourier series of Ã^{μ} converge uniformly to Ã^{μ} (because the Fourier series is absolutely convergent). □

**Lemma 7.** There exists 0 < α < 1 such that, for all N and μ, the eigenvalues of ${\tilde{A}}_{[N]}^{\mu ,k}$, Ã^{μ}(θ) and ${A}_{[N]}^{\mu}$ are less than or equal to α.

**Proof.** By Lemma 3, the eigenvalues of ${\tilde{K}}^{\mu}(\theta )$ are positive and upperbounded by ρ_{K}. Since ${\tilde{K}}^{\mu}(\theta )$ and ${\left({\sigma}^{2}{\mathrm{Id}}_{T}+{\tilde{K}}^{\mu}(\theta )\right)}^{-1}$ are coaxial (because ${\tilde{K}}^{\mu}$ is Hermitian and therefore diagonalisable), we may take
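A short derivation consistent with Lemmas 3 and 4 (our sketch; it assumes the relation ${\tilde{A}}^{\mu}(\theta )={\tilde{K}}^{\mu}(\theta ){\left({\sigma}^{2}{\mathrm{Id}}_{T}+{\tilde{K}}^{\mu}(\theta )\right)}^{-1}$): each eigenvalue of Ã^{μ}(θ) is then of the form λ/(σ^{2} + λ) with 0 ≤ λ ≤ ρ_{K}, and since x ↦ x/(σ^{2} + x) is increasing, one may take

```latex
\alpha \;\overset{\mathrm{def}}{=}\; \frac{\rho_K}{\sigma^2 + \rho_K} \;<\; 1 .
```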

We now prove that Γ_{[N],2}(μ) is lower semicontinuous. A consequence of this will be that Γ_{[N],2}(μ) is measurable with respect to $\mathcal{B}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$. In effect, we prove in Appendix C that ϕ^{N}(μ, ν) defined by Equation (43) satisfies

where β_{2} is defined in Equation (87) in Appendix C.

**Proposition 7.** Γ_{[N],2}(μ) is lower semicontinuous.

**Proof.** We define ${\varphi}^{N,M}(\mu ,\nu )={1}_{{B}_{M}}(\nu )({\varphi}^{N}(\mu ,\nu )+{\beta}_{2})$, where ${1}_{{B}_{M}}$ is the indicator of B_{M}, and ν ∈ B_{M} if ${N}^{-1}{\sum}_{j=-n}^{n}{\Vert {\nu}^{j}\Vert}^{2}\le M$. We have just seen that ϕ^{N,M} ≥ 0. We also define

Let μ_{n} → μ with respect to the weak topology in ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. Observe that

The first term converges to zero because ϕ^{N,M} is continuous and bounded (with respect to ν). The second term converges to zero because ϕ^{N,M}(μ, ν) is a continuous function of μ, see Proposition 4.

Since this converges to Γ_{[N],2}(μ) as M → ∞, we may conclude that Γ_{[N],2}(μ) is lower semicontinuous with respect to μ. □

We define Γ_{[N]}(μ) = Γ_{[N],1}(μ) + Γ_{[N],2}(μ). We may conclude from Propositions 5 and 7 that Γ_{[N]} is measurable.

#### 3.4. The Radon-Nikodym Derivative

The purpose of this section is to determine the Radon-Nikodym derivative of Π^{N} with respect to R^{N}. However, in order for us to do this, we must first compute the Radon-Nikodym derivative of Q^{N} with respect to P^{⊗N}. We do this in the next proposition.

**Proposition 8.** The Radon-Nikodym derivative of Q^{N} with respect to P^{⊗N} is given by the following expression,

with (G^{i}), i = −n, ⋯, n given by

**Proof.** For fixed J^{N}, we let ${R}_{{J}^{N}}:{\mathbb{R}}^{N(T+1)}\to {\mathbb{R}}^{N(T+1)}$ be the mapping u → y, i.e., ${R}_{{J}^{N}}({u}^{-n},\cdots ,{u}^{n})=({y}^{-n},\cdots ,{y}^{n})$, where for j = −n, ⋯, n,

where D_{s} is the Jacobian of the map $({u}_{s}^{-n},\dots ,{u}_{s}^{n})\to ({y}_{s}^{-n},\dots ,{y}_{s}^{n})$ induced by ${R}_{{J}^{N}}$. However, D_{s} is evidently 1. Similar reasoning implies that ${R}_{{J}^{N}}$ is a bijection.

We obtain the law Q^{N}(J^{N}) by applying the inverse of ${R}_{{J}^{N}}$ to the above distribution, i.e.,

Since P^{⊗N} = Q^{N}(0), we therefore find that

Taking the expectation with respect to J^{N} yields the result. □

**Proposition 9.** Fix $u\in {\mathcal{T}}^{N}$. The covariance of the Gaussian system $({G}_{s}^{i})$, where i = −n, …, n and s = 1, …, T, writes ${K}_{[N]}^{{\widehat{\mu}}_{N}(u)}$. For each i, the mean of G^{i} is ${c}^{{\widehat{\mu}}_{N}(u)}$.

We substitute Z = (G^{−n}, ⋯, G^{n}), $a=\frac{1}{{\sigma}^{2}}({\nu}^{-n},\cdots ,{\nu}^{n})$, and $b=\frac{1}{{\sigma}^{2}}$ into the formula in Lemma 19. After noting Proposition 9, we thus find that

**Proposition 10.** The Radon-Nikodym derivatives write as

where Γ_{[N]}(μ) = Γ_{[N],1}(μ) + Γ_{[N],2}(μ) and the expressions for Γ_{[N],1} and Γ_{[N],2} have been defined in Equations (38) and (42).

This is well-defined because we proved above that Γ_{[N]} is measurable.

**Remark 6.** Proposition 10 shows that the process solutions of Equation (1) are coupled through the process-level empirical measure, unlike in the case of independent weights where they are coupled through the usual empirical measure. As mentioned in Remark 3, this significantly complicates the mathematical analysis.

## 4. The Large Deviation Principle

In this section we prove that the laws Π^{N} satisfy an LDP with good rate function H (to be defined below). We do this by firstly establishing an LDP for the image law with uncoupled weights (R^{N}), see Definition 2, and then using the Radon-Nikodym derivative of Proposition 10 to establish the full LDP for Π^{N}. Therefore our first task is to write the LDP governing R^{N}.

and I^{(2)}(μ, ν) = ∞ otherwise. It is a standard result that

where P^{ℤ} is defined to be

(the expression ${N}^{-1}{I}^{(2)}({\mu}^{N},{P}^{\otimes N})$ follows from Equation (49)).

**Theorem 2.** R^{N} is governed by a large deviation principle with good rate function [39,40]

where u_{0} is in ℝ^{ℤ}. In addition, the set of measures {R^{N}} is exponentially tight.

**Proof.** R^{N} satisfies an LDP with good rate function I^{(3)}(μ, P^{ℤ}) [35]. In turn, a sequence of probability measures (such as {R^{N}}) over a Polish space satisfying a large deviations upper bound with a good rate function is exponentially tight [38]. □

Consider N = 2^{k} for k ∈ ℤ^{+}. It follows from Equation (53) that ${N}^{-1}{I}^{(2)}({\mu}_{{u}_{0}}^{N},{P}_{{u}_{0}}^{\otimes N})$ is strictly nondecreasing as N = 2^{k} → ∞ (for all u_{0}), so that Equation (51) follows by the monotone convergence theorem.

Before turning to Π^{N}, we prove the following relationship between the set ${\mathcal{E}}_{2}$ (see Definition 4) and the set of stationary measures which have a finite Kullback-Leibler divergence, or process-level entropy, with respect to P^{ℤ}.

**Lemma 8.**

**Definition 5.** Let H be the function ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\to \mathbb{R}\cup \{+\infty \}$ defined by

where Γ(μ) = Γ_{1}(μ) + Γ_{2}(μ) and the expressions for Γ_{1} and Γ_{2} have been defined in Lemma 6 and Proposition 6. Note that because of Proposition 6 and Lemma 8, whenever I^{(3)}(μ, P^{ℤ}) is finite, so is Γ(μ).
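For orientation, a form of H consistent with the surrounding definitions (our reading, by analogy with the quantity H^{ν}(μ) = I^{(3)}(μ, P^{ℤ}) − Γ^{ν}(μ) introduced in Section 4.3.1; not a quotation of the original display) would be

```latex
H(\mu) \;=\;
\begin{cases}
I^{(3)}(\mu, P^{\mathbb{Z}}) - \Gamma(\mu) & \text{if } I^{(3)}(\mu, P^{\mathbb{Z}}) < \infty ,\\
\infty & \text{otherwise.}
\end{cases}
```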

#### 4.1. Lower Bound on the Open Sets

**Lemma 9.** For all open sets O, Equation (17) holds.

**Proof.** From the expression for the Radon-Nikodym derivative in Proposition 10 we have

If I^{(3)}(μ, P^{ℤ}) = ∞, then H(μ) = ∞ and evidently Equation (17) holds. We now prove Equation (17) for all μ ∈ O such that I^{(3)}(μ, P^{ℤ}) < ∞. Let ε > 0 and ${Z}_{\epsilon}^{N}(\mu )\subset O$ be an open neighbourhood containing μ such that ${\mathrm{inf}}_{\nu \in {Z}_{\epsilon}^{N}(\mu )}{\mathrm{\Gamma}}_{[N]}(\nu )\ge {\mathrm{\Gamma}}_{[N]}(\mu )-\epsilon $. Such $\{{Z}_{\epsilon}^{N}(\mu )\}$ exist for all N because of the lower semicontinuity of Γ_{[N]}(μ) (see Propositions 5 and 7). Then

#### 4.2. Exponential Tightness of Π^{N}

**Lemma 10.** There exist positive constants c > 0 and a > 1 such that, for all N,

where ϕ^{N} is defined in Equation (43).

**Proposition 11.** The family {Π^{N}} is exponentially tight.

**Proof.** Let $B\in \mathcal{B}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$. We have from Proposition 10

Since Γ_{[N],1} ≤ 0, it follows from Lemma 10 that

By the exponential tightness of {R^{N}} (as stated in Theorem 2), for each L > 0, there exists a compact set K_{L} such that $\overline{\underset{N\to \infty}{\mathrm{lim}}}\phantom{\rule{0.2em}{0ex}}{N}^{-1}\mathrm{log}({R}^{N}({K}_{L}^{c}))\le -L$. Thus if we choose the compact set ${K}_{\mathrm{\Pi},L}={K}_{\frac{a}{a-1}(L+\frac{c}{a})}$, then for all L > 0, $\overline{\underset{N\to \infty}{\mathrm{lim}}}\phantom{\rule{0.2em}{0ex}}{N}^{-1}\mathrm{log}({\mathrm{\Pi}}^{N}({K}_{\mathrm{\Pi},L}^{c}))\le -L$. □

#### 4.3. Upper Bound on the Compact Sets

The strategy is firstly to obtain an upper bound for a Gaussian measure with fixed mean and covariance (determined by A^{ν} and c^{ν}), and then to prove that this converges to the required bound as ν → μ.

#### 4.3.1. An LDP for a Gaussian Measure

Fix $\nu \in {\mathcal{E}}_{2}$. Let

**Remark 7.** Note the subtle difference with the definition of Φ^{N} in Equation (43): we use A^{ν,k} instead of ${A}_{\left[N\right]}^{\nu ,k}$. When turning to spectral representations of ${\Phi}_{\infty}^{N}$ this will bring in the matrices Ã^{ν,N,l}, l = −n, …, n, defined at the beginning of Section 4.3.2.

where K^{ν,N} is the NT × NT matrix with T × T blocks noted K^{ν,N,l}. We define

For μ ∈ ${\mathcal{E}}_{2}$, we define H^{ν}(μ) = I^{(3)}(μ, P^{ℤ}) − Γ^{ν}(μ); for μ ∉ ${\mathcal{E}}_{2}$, we define ${\mathrm{\Gamma}}_{2}^{\nu}(\mu )={\mathrm{\Gamma}}^{\nu}(\mu )=\infty $ and H^{ν}(μ) = ∞.

**Definition 6.** Let ${\underline{Q}}^{\nu}\in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ with N-dimensional marginals ${\underline{Q}}^{\nu ,N}$ given by

**Lemma 11.** ${\underline{Q}}_{1,T}^{\nu}$ is a stationary Gaussian process of mean c^{ν}. Its N-dimensional spatial, T-dimensional temporal marginals ${\underline{Q}}_{1,T}^{\nu ,N}$ are in ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}_{1,T}^{N}\right)$ and have covariance ${\sigma}^{2}{\mathrm{Id}}_{NT}+{K}^{\nu ,N}$. The spectral density of ${\underline{Q}}_{1,T}^{\nu}$ is ${\sigma}^{2}{\mathrm{Id}}_{T}+{\tilde{K}}^{\nu}(\theta )$ and in addition,

**Proof.** In effect we find that

where we denote by c^{ν,N} the NT-dimensional vector obtained by concatenating N times the vector c^{ν}. We also have that

Thus ${\underline{Q}}_{1,T}^{\nu ,N}$ is Gaussian with mean c^{ν,N}, inverse covariance matrix $\frac{1}{{\sigma}^{2}}\left({\mathrm{Id}}_{NT}-{A}^{\nu ,N}\right)$, and covariance matrix ${\sigma}^{2}{\mathrm{Id}}_{NT}+{K}^{\nu ,N}$. Hence ${\underline{Q}}_{1,T}^{\nu ,N}$ is in ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}_{1,T}^{N}\right)$, and

**Definition 7.** Let ${\underline{\mathrm{\Pi}}}^{\nu ,N}$ be the image law of ${\underline{Q}}^{\nu ,N}$ under ${\widehat{\underline{\mu}}}_{N}$, i.e., for $B\in \mathcal{B}\left({\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)\right)$,

Our aim is to obtain an upper bound for Π^{N}, see Proposition 12. We begin with the following lemma, which is a generalization of the result in [39].

**Lemma 12.** The image law ${\underline{\mathrm{\Pi}}}^{\nu ,N}$ satisfies a strong LDP (in the manner of Theorem 1) with good rate function

**Proof.** We have found an LDP for a Gaussian process in [42]. Since ${\underline{Q}}^{\nu}$ may be separated in the manner of Equation (68), we may use the expression in Theorem 2 to obtain the result. □

**Corollary 1.** The image law Π^{ν,N} satisfies a strong LDP with good rate function

#### 4.3.2. An Upper Bound for Π^{N} over Compact Sets

We now obtain an upper bound for Π^{N} over compact sets using the LDP of the previous section. Before we do this, we require two lemmas governing the ‘distance’ between Γ^{ν} and Γ. Let ${\tilde{K}}^{\mu ,N}$ be the DFT of ${\left({K}^{\mu ,j}\right)}_{j=-n}^{n}$, and similarly Ã^{µ,N} is the DFT of ${\left({A}^{\mu ,j}\right)}_{j=-n}^{n}$. We define

**Lemma 13.** For all $\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, ${C}_{N}^{\nu}$ is finite and

**Proof.** We recall from Proposition 4 that ${\tilde{K}}_{[M],st}^{\nu}(\theta )$ converges uniformly (in θ) to ${\tilde{K}}_{st}^{\nu}(\theta )$. The same holds for ${\tilde{K}}_{st}^{\nu ,M,l}$, because this represents the partial summation of an absolutely converging Fourier series. That is, for fixed θ = 2πl_{M}/M, ${\tilde{K}}_{st}^{\nu ,M,{l}_{M}}\to {\tilde{K}}_{st}^{\nu}(\theta )$ as M → ∞. The result then follows from the equivalence of matrix norms. The proof for Ã^{ν} is analogous. □

**Lemma 14.** There exists a constant C_{0} such that for all ν in ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, all ε > 0 and all $\mu \in {V}_{\epsilon}(\nu )\cap {\mathcal{E}}_{2}$,

where V_{ε}(ν) is the open neighbourhood defined in Proposition 4, and $\underline{\mu}$ is given in Definition 1.

**Proposition 12.** Let $\mathcal{K}$ be a compact subset of ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. Then

**Proof.** Fix ε > 0. Let V_{ε}(ν) be the open neighbourhood of ν defined in Proposition 4, and let ${\overline{V}}_{\epsilon}(\nu )$ be its closure. Since $\mathcal{K}$ is compact and ${\{{V}_{\epsilon}(\nu )\}}_{\nu \in \mathcal{K}}$ is an open cover, there exists an r and ${\left\{{\nu}_{i}\right\}}_{i=1}^{r}$ such that $\mathcal{K}\subset {\cup}_{i=1}^{r}{V}_{\epsilon}({\nu}_{i})$. We find that

Using the definition of ${\underline{Q}}^{\nu ,N}$ in Equation (64) and Hölder’s inequality, for p, q such that $\frac{1}{p}+\frac{1}{q}=1$, we have

where the relevant eigenvalues are upperbounded by σ^{2} + ρ_{K}. Thus for this integral to converge it is sufficient that

where Id_{T} ⊗ 1_{N} is the NT × T block matrix with each block Id_{T}, and B^{k}, k = −n, ⋯, n, are its T × T blocks. We have

where B is the matrix with blocks (B^{k})_{k=−n,⋯,n}. Let v_{m} be the largest eigenvalue of B. Since (by Lemma 20) the eigenvalues of ${\tilde{B}}^{0}$ are a subset of the eigenvalues of B, we have

for each ν_{i}, and that s(q, ε) → 0 as ε → 0. Using Equations (73), (75) and (76) we thus find that

Recall that H^{ν}(μ) = ∞ for all μ ∉ ${\mathcal{E}}_{2}$. Thus if $\mathcal{K}\cap {\mathcal{E}}_{2}=\varnothing $, we may infer that $\overline{\underset{N\to \infty}{\mathrm{lim}}}{N}^{-1}\mathrm{log}\left({\mathrm{\Pi}}^{N}(\mathcal{K})\right)=-\infty $ and the proposition is evident. Thus we may assume without loss of generality that ${\mathrm{inf}}_{\mu \in \mathcal{K}}{H}^{{\nu}_{i}}(\mu )={\mathrm{inf}}_{\mu \in \mathcal{K}\cap {\mathcal{E}}_{2}}{H}^{{\nu}_{i}}(\mu )$. Furthermore it follows from Proposition 13 (below) that there exists a constant C_{I} such that for all $\mu \in {\overline{V}}_{\epsilon}({\nu}_{i})\cap {\mathcal{E}}_{2}$,

**Proposition 13.** There exists a positive constant C_{I} such that, for all ν in ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\cap {\mathcal{E}}_{2}$, all ε > 0 and all $\mu \in {\overline{V}}_{\epsilon}(\nu )\cap {\mathcal{E}}_{2}$ (where ${\overline{V}}_{\epsilon}(\nu )$ is the neighbourhood defined in Proposition 4),

**Lemma 15.** There exist constants a > 1 and c > 0 such that for all $\mu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})\cap {\mathcal{E}}_{2}$,

#### 4.4. End of the Proof of Theorem 1

**Lemma 16.** H(μ) is lower semicontinuous.

Since {Π^{N}} is exponentially tight and satisfies the weak LDP with rate function H(μ), the following corollary is immediate (Lemma 2.1.5 in [37]).

**Corollary 2.** H(μ) is a good rate function, i.e., the sets {μ: H(μ) ≤ δ} are compact for all δ ∈ ℝ^{+}, and it satisfies the first condition of Theorem 1.

**Proof.** By combining Lemmas 16 and 9, Proposition 11, and Corollary 2, we complete the proof of Theorem 1. □

## 5. Characterization of the Unique Minimum of the Rate Function

In this section we characterize the unique minimum µ_{e} of the rate function and provide explicit equations for µ_{e} which would facilitate its numerical simulation. We start with the following lemma.

**Lemma 17.** For $\mu ,\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$, H^{ν}(µ) = 0 if and only if µ = Q^{ν}.

**Proof**. This is a straightforward consequence of Theorem 1 in [42] and Theorem 2.

**Proposition 14.** There is a unique distribution ${\mu}_{e}\in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ which minimizes H. This distribution satisfies H(µ_{e}) = 0.

**Proof**. By the previous lemma, it suffices to prove that there is a unique µ_{e} such that

The marginal of Q^{µ} over ${\mathcal{F}}_{0,t}$ only depends upon the marginal of µ over ${\mathcal{F}}_{0,t-1},t\ge 1$. This follows from the fact that ${\underset{\xaf}{Q}}_{1,t}^{\mu}$ (which determines ${Q}_{0,t}^{\mu}$) is completely determined by the means $\{{c}_{s}^{\mu};s=1,\dots ,t\}$ and covariances $\{{K}_{uv}^{\mu ,j};j\in \mathbb{Z},u,v\in [1,t]\}$. In turn, it may be observed from Equations (18) and (23) that these variables are determined by µ_{0,t−1}. Thus for any $\mu ,\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ and t ∈ [1, T], if

µ_{e} satisfies Equation (78).

µ = L^{2}(ν) for any ν such that ν_{0,T−2} = µ_{0,T−2}. Continuing this reasoning, we find that µ = L^{T}(ν) for any ν such that ν_{0} = µ_{0}. But by Equation (79), since Q^{µ} = µ, we have ${\mu}_{0}={\mu}_{I}^{\mathbb{Z}}$. But we have just seen that any µ satisfying µ = L^{T}(ν), where ${\nu}_{0}={\mu}_{I}^{\mathbb{Z}}$, is uniquely defined by Equation (81), which means that µ = µ_{e}.

We now characterize the measure µ_{e} such that ${\mu}_{e}={Q}^{{\mu}_{e}}$ in terms of its image $\underset{\xaf}{{\mu}_{e}}$. This characterization allows one to directly numerically calculate µ_{e}. We characterize $\underset{\xaf}{{\mu}_{e}}$ recursively (in time), by providing a method of determining ${\underset{\xaf}{\mu}}_{{e}_{0,t}}$ in terms of ${\underset{\xaf}{\mu}}_{{e}_{0,t-1}}$. However, we must firstly outline explicitly the bijective correspondence between ${\mu}_{{e}_{0,t}}$ and ${\underset{\xaf}{\mu}}_{{e}_{0,t}}$, as follows. For $v\in \mathcal{T}$, we write Ψ^{−1}(v) = (Ψ^{−1}(v)_{0}, ⋯, Ψ^{−1}(v)_{T}). We recall from Equation (14) that Ψ^{−1}(v)_{0} = v_{0}. The coordinate Ψ^{−1}(v)_{t} is the affine function of v_{s}, s = 0, ⋯, t obtained from Equation (14).

**Lemma 18.** For any t ∈ [1, T], the variables $\{{c}_{s}^{{\mu}_{e}},{K}_{rs}^{{\mu}_{e},j}:1\le r,s\le t,j\in \mathbb{Z}\}$ are necessary and sufficient to completely characterize the measures $\{{\underset{\xaf}{{\mu}_{e}}}_{0,t}^{1},{\underset{\xaf}{{\mu}_{e}}}_{(0,t)}^{(0,l)}:l\in \mathbb{Z}\}$. In turn, the measures $\{{\underset{\xaf}{{\mu}_{e}}}_{0,t}^{1},{\underset{\xaf}{{\mu}_{e}}}_{(0,t)}^{(0,l)}:l\in \mathbb{Z}\}$ are necessary and sufficient to characterize ${\underset{\xaf}{\mu}}_{{e}_{0,t}}$.

**Theorem 3.**We may characterize$\underset{\xaf}{{\mu}_{e}}$ inductively as follows. Initially${\underset{\xaf}{{\mu}_{e}}}_{0}={\mu}_{I}^{\mathbb{Z}}$. Given that we have a complete characterization of

Finally, µ_{e} may be determined from $\underset{\xaf}{{\mu}_{e}}$ since ${\mu}_{e}=\underset{\xaf}{{\mu}_{e}}\circ \mathrm{\Psi}$.
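Theorem 3 makes $\underset{\xaf}{{\mu}_{e}}$ computable in practice: at each time step, the Gaussian law is determined by a finite set of means and covariances computed from the previous marginal. The sketch below is only an illustration of this kind of recursion, not the paper's Equations (18) and (23) (which are not reproduced here): it propagates a mean and a single diagonal covariance entry with hypothetical update maps, evaluating the Gaussian expectations of f by Gauss-Hermite quadrature. All parameter names (`Jbar`, `sigJ`, `sig`) are illustrative assumptions.

```python
import numpy as np

def gauss_expect(f, mean, var, n=64):
    """E[f(X)] for scalar X ~ N(mean, var), via Gauss-Hermite quadrature."""
    # probabilists' Hermite nodes/weights, for the weight exp(-x^2/2)
    x, w = np.polynomial.hermite_e.hermegauss(n)
    return np.dot(w, f(mean + np.sqrt(var) * x)) / np.sqrt(2.0 * np.pi)

def recurse_moments(f, T, c0=0.0, k0=1.0, Jbar=0.5, sigJ=0.5, sig=1.0):
    """Propagate a mean c_t and a variance K_tt forward in time.
    The update maps below are HYPOTHETICAL stand-ins for the paper's
    Equations (18) and (23): they track only the diagonal covariance."""
    c, K = [c0], [k0]
    for _ in range(T):
        m1 = gauss_expect(f, c[-1], K[-1])                    # E[f(U_{t-1})]
        m2 = gauss_expect(lambda u: f(u) ** 2, c[-1], K[-1])  # E[f(U_{t-1})^2]
        c.append(Jbar * m1)                  # illustrative mean update
        K.append(sig ** 2 + sigJ ** 2 * m2)  # illustrative variance update
    return np.array(c), np.array(K)
```

The key structural point, faithful to Theorem 3, is that the law at time t is a deterministic function of finitely many moments of the law at time t − 1, so the whole stationary Gaussian measure can be built step by step.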

## 6. Some Important Consequences of Theorem 1

Here ${\stackrel{\u2323}{Q}}^{N}({J}^{N})$ is the conditional law of the N neurons for given J^{N}.

**Theorem 4**. Π^{N} converges weakly to ${\delta}_{{\mu}_{e}}$, i.e., for all $\Phi \in {\mathcal{C}}_{b}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$,

**Proof**. The proof of the first result follows directly from the existence of an LDP for the measure Π^{N} (see Theorem 1), and is a straightforward adaptation of Theorem 2.5.1 in [18]. The proof of the second result uses the same method, making use of Theorem 5 below.

**Theorem 5**. For each closed set F of${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ and for almost all J

**Proof**. The proof is a combination of Tchebyshev’s inequality and the Borel-Cantelli lemma and is a straightforward adaptation of Theorem 2.5.4 and Corollary 2.5.6 in [18].

**Corollary 3.** Fix M and let N > M. For almost every J and all $h\in {\mathcal{C}}_{b}({\mathcal{T}}^{M})$,

the M^{th} marginals ${\stackrel{\u2323}{Q}}^{N,M}({J}^{N})$ and Q^{N,M} converge weakly to ${\mu}_{e}^{M}$ as N → ∞.

**Proof**. It is sufficient to apply Theorem 4 in the case where Φ in ${\mathcal{C}}_{b}({\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}}))$ is defined by

J^{N}, ${\stackrel{\u2323}{Q}}^{N}(J)\in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{N}\right)$ (Lemma 1 and the remark above).

We define ${\stackrel{\u2323}{Q}}^{N}({J}^{N})$ to be the conditional law of $\mathfrak{P}$ on ${u}^{(N)}(\omega )$, for given J^{N}.

**Theorem 6.** Fix M > 0 and let $h\in {C}_{b}({\mathcal{T}}^{M})$. For ${u}^{(N)}(\omega )\in {\mathcal{T}}^{N}$ (where N > M) and |j| ≤ n, $\mathfrak{P}$ almost surely,

_{e}.

**Proof**. Our proof is an adaptation of [18]. We may suppose without loss of generality that ${\int}_{{\mathcal{T}}^{M}}h(u)d{\mu}_{e}^{M}(u)=0$. For p > 1 let

Since µ_{e} ∉ F_{p}, but it is the unique zero of H, it follows that ${\mathrm{inf}}_{{F}_{p}}H=m>0$. Thus by Theorem 1 there exists an N_{0}, such that for all N > N_{0},

N_{p} such that for all N ≥ N_{p},

the M^{th} marginals converge.

## 7. Possible Extensions

_{j}’s are independent and identically distributed random variables, independent of the synaptic weights J and the random processes B^{j}. They can be thought of as external stimuli imposed on the neurons. This equation accounts for a more complicated “intrinsic” dynamics of the neurons, i.e., when they are uncoupled. The parameters γ_{k}, k = 1, ⋯, l, must satisfy some conditions to ensure the stability of the uncoupled dynamics.
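The stability conditions on the γ_{k} can be checked numerically. Assuming the uncoupled dynamics is the linear recursion u_{t} = γ_{1}u_{t−1} + ⋯ + γ_{l}u_{t−l} + noise (an assumption; the paper's exact equation is not reproduced here), stability is equivalent to all roots of the characteristic polynomial z^{l} − γ_{1}z^{l−1} − ⋯ − γ_{l} lying strictly inside the unit disk:

```python
import numpy as np

def is_stable(gamma):
    """Stability check for the linear recursion
    u_t = gamma[0]*u_{t-1} + ... + gamma[l-1]*u_{t-l} + noise.
    Stable iff every root of z^l - gamma_1 z^{l-1} - ... - gamma_l
    lies strictly inside the unit disk."""
    gamma = np.asarray(gamma, dtype=float)
    coeffs = np.concatenate(([1.0], -gamma))  # highest degree first
    return bool(np.all(np.abs(np.roots(coeffs)) < 1.0))
```

For instance, a single-lag recursion is stable precisely when |γ_{1}| < 1, which the root criterion reproduces.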

## 8. Conclusions

_{e} is system-wide and ergodic. Our work challenges the assumption held by some that one cannot have a “concise” macroscopic description of a neural network without an assumption of asynchronicity at the local population level.

## Appendix

## A. Two Useful Lemmas

**Lemma 19.** Let Z be a Gaussian vector of ℝ^{p} with mean c and covariance matrix K. If a ∈ ℝ^{p} and b ∈ ℝ are such that for all eigenvalues α of K the relation αb > − 1 holds, we have
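The right-hand side of Lemma 19 is not reproduced above. As a sanity check of the role of the hypothesis αb > −1 (which makes I + bK positive definite), the following sketch verifies by Monte Carlo one standard identity of this family, E[exp(−(b/2)|Z|²)] = det(I + bK)^{−1/2} exp(−(b/2) c^{T}(I + bK)^{−1}c); this particular closed form is an assumption, one common instance of such lemmas, not a quotation of the paper's statement.

```python
import numpy as np

rng = np.random.default_rng(0)
p, b = 3, 0.4
c = np.array([0.5, -0.3, 0.2])
A = rng.normal(size=(p, p))
K = A @ A.T / p + 0.5 * np.eye(p)               # positive-definite covariance
assert np.all(b * np.linalg.eigvalsh(K) > -1)   # hypothesis of Lemma 19

# Candidate closed form (assumed, see lead-in):
# det(I + bK)^{-1/2} * exp(-(b/2) c^T (I + bK)^{-1} c)
M = np.eye(p) + b * K
closed = np.linalg.det(M) ** -0.5 * np.exp(-0.5 * b * c @ np.linalg.solve(M, c))

# Monte Carlo estimate of E[exp(-(b/2)|Z|^2)] for Z ~ N(c, K)
Z = rng.multivariate_normal(c, K, size=200_000)
mc = np.exp(-0.5 * b * np.einsum('ij,ij->i', Z, Z)).mean()
```

The condition αb > −1 is exactly what keeps the integrand integrable: it guarantees that K^{−1} + bI remains positive definite, so the Gaussian integral converges.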

**Lemma 20.** Let B be a symmetric block-circulant matrix with the (j, k) T × T block given by (B^{(j−k) mod N}), j, k = −n, ⋯, n. Let W^{(N)} be the N × N unitary matrix with elements ${W}_{jk}^{(N)}=\frac{1}{\sqrt{N}}\mathrm{exp}\left(\frac{2\pi ijk}{N}\right)$, j, k = −n, ⋯, n. Then B may be ‘block’-diagonalised in the following manner (where ⊗ is the Kronecker product and ^{∗} the complex conjugate),
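The block-diagonalisation in Lemma 20 can be checked numerically. Since the lemma's display formula is not reproduced above, the sketch below verifies only its structural content: conjugating a symmetric block-circulant B by W^{(N)} ⊗ Id_{T} produces a block-diagonal matrix (indices run over 0, …, N − 1 here rather than −n, …, n, which only permutes the blocks).

```python
import numpy as np

def block_circulant(blocks):
    """Matrix whose (j, k) T x T block is blocks[(j - k) % N]."""
    N, T = len(blocks), blocks[0].shape[0]
    B = np.zeros((N * T, N * T))
    for j in range(N):
        for k in range(N):
            B[j*T:(j+1)*T, k*T:(k+1)*T] = blocks[(j - k) % N]
    return B

N, T = 5, 3  # N odd, as in the paper
rng = np.random.default_rng(1)
blocks = [rng.normal(size=(T, T)) for _ in range(N)]
blocks[0] = blocks[0] + blocks[0].T       # B^0 symmetric
for m in range(1, N // 2 + 1):
    blocks[N - m] = blocks[m].T           # (B^m)^T = B^{-m mod N} => B symmetric
B = block_circulant(blocks)

# W^(N): unitary DFT matrix; conjugation by W ⊗ Id_T block-diagonalises B
F = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / np.sqrt(N)
U = np.kron(F, np.eye(T))
D = U.conj().T @ B @ U                    # block-diagonal up to rounding error
```

This works because B = Σ_{m} P^{m} ⊗ B^{m}, where P is the cyclic shift, and the DFT matrix diagonalises every power of P simultaneously; each diagonal T × T block of D is then a DFT of the blocks B^{m}.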

## B. Proof of Proposition 4

**Proposition 4.** Fix $\nu \in {\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$. For all ε > 0, there exists an open neighborhood V_{ε}(ν) such that for all µ ∈ V_{ε}(ν), all s, t ∈ [1, T] and all θ ∈ [−π, π[,

**Proof**. Let µ be in ${\mathcal{M}}_{1,S}^{+}({\mathcal{T}}^{\mathbb{Z}})$ and θ ∈ [−π, π[. We have

µ^{L} and ν^{L}. Since $\left|f\left({u}_{s-1}^{0}\right)f\left({u}_{t-1}^{l}\right)-f\left({v}_{s-1}^{0}\right)f\left({v}_{t-1}^{l}\right)\right|\le 2({k}_{f}{d}_{L}({\pi}_{L}u,{\pi}_{L}v)\wedge 1)$, where k_{f} is the Lipschitz constant of the function f, we find (through Equation (12)) that

V_{ε}(ν) is a ball of radius less than $\frac{\epsilon}{2}$ (with respect to the distance metric in Equation (12)). Similar reasoning dictates that Equation (35) is satisfied too.

We take the radius of V_{ε}(ν) to be sufficiently small that Equations (32), (35) and (36) are satisfied. In fact Equation (33) is also satisfied, as it may be obtained by taking the limit as N → ∞ of Equation (36). Since c^{µ} is determined by the one-dimensional spatial marginal of µ, it follows from the definition of the metric in Equation (12) that we may take the radius of V_{ε}(ν) to be sufficiently small that Equation (34) is satisfied too.

## C. Existence of a Lower Bound for Φ^{N}(µ, v)

To prove that Φ^{N}(µ, v) defined in Equation (42) possesses a lower bound, we use a spectral representation and let ${\tilde{w}}^{j}={\tilde{v}}^{j}$ for all j, except that ${\tilde{w}}^{0}={\tilde{v}}^{0}-N{c}^{\mu}$. We may then write

**Lemma 21.**For each 1 ≤ t ≤ T,

**Proof**. If $\overline{J}=0$ the conclusion is evident, thus we assume throughout this proof that $\overline{J}\ne 0$. Since ${D}_{[N],tt}^{\mu}={}^{\dagger}{\overline{O}}_{[N],t}^{\mu}{\tilde{K}}_{[N]}^{\mu ,0}{O}_{[N],t}^{\mu}$, we find from the definition that

^{µ,k}), k ∈ ℤ, where for 1 ≤ s, t ≤ T,

^{μk}, in particular the discrete Fourier Transform ${\left({\tilde{L}}^{{\mu}^{N},l}\right)}_{l=-n,\dots ,n}$ is Hermitian positive. Using this spectral representation we write

Φ^{N}(µ, v) is greater than −β_{2}, where

## D. Proof of Lemmas 10 and 15

To study φ^{N}(µ, v) defined in Equation (43) we are led to use spectral representations and to introduce the Fourier transform $\tilde{v}$ of v. Since $\tilde{v}\in {({\u2102}^{T})}^{N}$ the correspondence $v\to \tilde{v}$ from ${\mathcal{T}}_{1,T}^{N}$ to (ℂ^{T})^{N} is not one-to-one. We need to take into account the symmetries of $\tilde{v}$, hence the following definition.

**Definition 8.** For $v={\left({v}^{j}\right)}_{j=-n\cdots n}\in {\mathcal{T}}_{1,T}^{N}$, we note ${\mathscr{H}}^{N}(v)={v}_{\diamond}=\left({v}_{\diamond}^{-n},\dots ,{v}_{\diamond}^{n}\right)\in {\mathcal{T}}_{1,T}^{N}$, where v_{⋄} is defined from the Discrete Fourier Transform $\tilde{v}=\left({\tilde{v}}^{-n},\cdots ,{\tilde{v}}^{n}\right)$ of v as follows

The map ${v}_{\diamond}={\mathscr{H}}^{N}(v)$ is a bijection from ${\mathcal{T}}_{1,T}^{N}$ to itself, the inverse being given by
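The need for this symmetrisation can be seen concretely: the DFT of real data satisfies the Hermitian relation ${\tilde{v}}^{-j}={\left({\tilde{v}}^{j}\right)}^{*}$, so the map $v\to \tilde{v}$ cannot be onto (ℂ^{T})^{N}. A minimal numerical illustration (indexing j = 0, …, N − 1 rather than −n, …, n):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 7, 4                    # N odd, as assumed in the paper
v = rng.normal(size=(N, T))    # real "trajectories", one row per index j
vt = np.fft.fft(v, axis=0)     # DFT over the index j

# Hermitian redundancy of the DFT of real data: component N - j is the
# complex conjugate of component j, so only about half of the complex
# coordinates are free; the inverse DFT recovers v exactly.
redundant = all(np.allclose(vt[N - j], np.conj(vt[j])) for j in range(1, N))
recovered = np.fft.ifft(vt, axis=0).real
```

Definition 8 resolves exactly this redundancy by replacing the complex DFT coordinates with an equivalent set of real ones, making $\mathscr{H}^{N}$ a bijection of ${\mathcal{T}}_{1,T}^{N}$ onto itself.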

**Lemma 10.** There exist positive constants c > 0 and a > 1 such that, for all N,

where φ^{N} is defined in Equation (43).

**Proof.** We have from Equation (83) that ${\varphi}^{N}(\mu ,v)={\underset{\xaf}{\varphi}}_{\diamond}^{N}(\mu ,{w}_{\diamond})$, where ${w}_{\diamond}^{j}={v}_{\diamond}^{j}$ for all j, except that ${w}_{\diamond}^{0}={v}_{\diamond}^{0}-N{c}^{\mu}$. Since (by Equation (90)) the distribution of the variables ${v}_{\diamond}$ under ${\underset{\xaf}{P}}_{\diamond}^{\otimes N}$ is ${\mathcal{N}}_{T}{\left({0}_{T},N{\sigma}^{2}{\mathrm{Id}}_{T}\right)}^{\otimes N}$, the distribution of ${w}_{\diamond}$ under ${\underset{\xaf}{P}}_{\diamond}^{\otimes N}$ is ${\mathcal{N}}_{T}{\left(N{c}^{\mu},N{\sigma}^{2}{\mathrm{Id}}_{T}\right)}^{\otimes N}$. By Lemma 7, the eigenvalues of ${\tilde{A}}_{[N]}^{\mu ,j}$ are upper-bounded by 0 < α < 1, for all j. Thus

^{j}) (for all |j| ≠ n) via ${c}^{{\widehat{\mu}}^{N}}$. After diagonalisation, we find that

_{c}> 0, then

**Lemma 15.**There exist constants a > 1 and c > 0 such that for all$\mu \in {\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)\cap {\epsilon}_{2}$,

**Proof.** a > 1 is chosen as in the proof of Lemma 10. We have (from Equation (50)) that

^{(2)} may be expressed using the variational expression (49) as

^{N} is a continuous, bounded function on ${\mathcal{T}}^{N}$. We let ${\varsigma}_{M}^{N}=a{1}_{{B}_{M}}{\varsigma}_{*}^{N}$, where ${\varsigma}_{*}^{N}(u)=N\left({\varphi}^{N}\left({\mu}^{N},{\mathrm{\Psi}}_{1,T}(u)\right)+{\mathrm{\Gamma}}_{[N],1}(\mu )\right)$, and u ∈ B_{M} only if either ||Ψ(u)|| ≤ NM or (ϕ^{N}(µ^{N}, Ψ_{1,T}(u)) + Γ_{[N],1}(µ)) ≤ 0. We proved in Section 3.3.2 that ϕ^{N}(µ^{N}, Ψ_{1,T}(u)) possesses a lower bound, which means that ${\varsigma}_{M}^{N}$ is continuous and bounded. Furthermore ${\varsigma}_{M}^{N}$ grows to ${\varsigma}_{*}^{N}$, so that after substituting ${\varsigma}_{M}^{N}$ into Equation (93) and taking M → ∞ (i.e., applying the dominated convergence theorem), we obtain

## E. Proof of Lemma 14

**Lemma 14.** There exists a constant C_{0} such that for all ν in ${\mathcal{M}}_{1,S}^{+}\left({\mathcal{T}}^{\mathbb{Z}}\right)$, all ε > 0 and all µ ∈ V_{ε}(ν) ∩ ε_{2},

where V_{ε}(ν) is the open neighborhood defined in Proposition 4, and µ is given in Definition 1.

**Proof.** We firstly bound Γ_{1}. From Equations (60) and (61) and Lemma 20 we have

where ${\mathscr{H}}^{N}$ is given in Definition 8 and ${\varphi}_{\infty}^{N}$ is given in Equation (59), and find that

^{∗}.

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References and Notes

- Guionnet, A. Dynamique de Langevin d’un verre de spins. Ph.D. Thesis, Université de Paris Sud, Orsay, France, March 1995.
- Ben-Arous, G.; Guionnet, A. Large deviations for Langevin spin glass dynamics. Probab. Theory Relat. Fields. **1995**, 102, 455–509.
- Ben-Arous, G.; Guionnet, A. Symmetric Langevin Spin Glass Dynamics. Ann. Probab. **1997**, 25, 1367–1422.
- Guionnet, A. Averaged and quenched propagation of chaos for spin glass dynamics. Probab. Theory Relat. Fields. **1997**, 109, 183–215.
- Sompolinsky, H.; Zippelius, A. Dynamic theory of the spin-glass phase. Phys. Rev. Lett. **1981**, 47, 359–362.
- Sompolinsky, H.; Zippelius, A. Relaxational dynamics of the Edwards-Anderson model and the mean-field theory of spin-glasses. Phys. Rev. B **1982**, 25, 6860–6875.
- Crisanti, A.; Sompolinsky, H. Dynamics of spin systems with randomly asymmetric bonds: Langevin dynamics and a spherical model. Phys. Rev. A **1987**, 36, 4922–4939.
- Crisanti, A.; Sompolinsky, H. Dynamics of spin systems with randomly asymmetric bonds: Ising spins and Glauber dynamics. Phys. Rev. A **1987**, 37, 4865–4874.
- Dawson, D.; Gartner, J. Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics **1987**, 20, 247–308.
- Dawson, D.; Gartner, J. Multilevel large deviations and interacting diffusions. Probab. Theory Relat. Fields. **1994**, 98, 423–487.
- Budhiraja, A.; Dupuis, P.; Markus, F. Large deviation properties of weakly interacting processes via weak convergence methods. Ann. Probab. **2012**, 40, 74–102.
- Fischer, M. On the form of the large deviation rate function for the empirical measures of weakly interacting systems. Bernoulli **2014**, 20, 1765–1801.
- Sompolinsky, H.; Crisanti, A.; Sommers, H. Chaos in Random Neural Networks. Phys. Rev. Lett. **1988**, 61, 259–262.
- Gerstner, W.; Kistler, W. Spiking Neuron Models; Cambridge University Press: Cambridge, UK, 2002.
- Izhikevich, E. Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting; MIT Press: Cambridge, MA, USA, 2007.
- Ermentrout, G.B.; Terman, D. Foundations of Mathematical Neuroscience; Interdisciplinary Applied Mathematics; Springer: New York, NY, USA, 2010.
- Cessac, B. Increase in complexity in random neural networks. J. Phys. I **1995**, 5, 409–432.
- Moynot, O. Etude mathématique de la dynamique des réseaux neuronaux aléatoires récurrents. Ph.D. Thesis, Université Paul Sabatier, Toulouse, France, January 2000.
- Moynot, O.; Samuelides, M. Large deviations and mean-field theory for asymmetric random recurrent neural networks. Probab. Theory Relat. Fields. **2002**, 123, 41–75.
- Cessac, B.; Samuelides, M. From neuron to neural networks dynamics. Eur. Phys. J. Spec. Top. **2007**, 142, 7–88.
- Samuelides, M.; Cessac, B. Random Recurrent Neural Networks. Eur. Phys. J. Spec. Top. **2007**, 142, 7–88.
- Kandel, E.; Schwartz, J.; Jessel, T. Principles of Neural Science, 4th ed.; McGraw-Hill: New York, NY, USA, 2000.
- Dayan, P.; Abbott, L. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems; MIT Press: Cambridge, MA, USA, 2001.
- Rudin, W. Real and Complex Analysis; McGraw-Hill: New York, NY, USA, 1987.
- Cugliandolo, L.F.; Kurchan, J.; Le Doussal, P.; Peliti, L. Glassy behaviour in disordered systems with nonrelaxational dynamics. Phys. Rev. Lett. **1997**, 78, 350–353.
- Lapicque, L. Recherches quantitatives sur l’excitation des nerfs traitée comme une polarisation. J. Physiol. Paris. **1907**, 9, 620–635.
- Daley, D.; Vere-Jones, D. An Introduction to the Theory of Point Processes: General Theory and Structure; Springer: New York, NY, USA, 2007; Volume 2.
- Gerstner, W.; van Hemmen, J. Coherence and incoherence in a globally coupled ensemble of pulse-emitting units. Phys. Rev. Lett. **1993**, 71, 312–315.
- Gerstner, W. Time structure of the activity in neural network models. Phys. Rev. E **1995**, 51, 738–758.
- Cáceres, M.J.; Carrillo, J.A.; Perthame, B. Analysis of nonlinear noisy integrate and fire neuron models: Blow-up and steady states. J. Math. Neurosci. **2011**, 1.
- Baladron, J.; Fasoli, D.; Faugeras, O.; Touboul, J. Mean-field description and propagation of chaos in networks of Hodgkin-Huxley and FitzHugh-Nagumo neurons. J. Math. Neurosci. **2012**, 2.
- Bogachev, V. Measure Theory, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2007; Volume 1.
- When N is even the formulae are slightly more complicated but all the results we prove below in the case N odd are still valid.
- We note ${\mathcal{N}}_{p}(m,\Sigma )$ the law of the p-dimensional Gaussian variable with mean m and covariance matrix Σ.
- Ellis, R. Entropy, Large Deviations and Statistical Mechanics; Springer: New York, NY, USA, 1985.
- Liggett, T.M. Interacting Particle Systems; Springer: Berlin/Heidelberg, Germany, 2005.
- Deuschel, J.D.; Stroock, D.W. Large Deviations; Pure and Applied Mathematics; Academic Press: Waltham, MA, USA, 1989; Volume 137.
- Dembo, A.; Zeitouni, O. Large Deviations Techniques, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 1997.
- Donsker, M.; Varadhan, S. Large deviations for stationary Gaussian processes. Commun. Math. Phys. **1985**, 97, 187–210.
- Baxter, J.R.; Jain, N.C. An Approximation Condition for Large Deviations and Some Applications. In Convergence in Ergodic Theory and Probability; Bergelson, V., Ed.; De Gruyter: Boston, MA, USA, 1993.
- Donsker, M.; Varadhan, S. Asymptotic Evaluation of Certain Markov Process Expectations for Large Time, IV. Commun. Pure Appl. Math. **1983**, XXXVI, 183–212.
- Faugeras, O.; MacLaurin, J. A Large Deviation Principle and an Analytical Expression of the Rate Function for a Discrete Stationary Gaussian Process. arXiv **2013**, arXiv:1311.4400.
- Chiyonobu, T.; Kusuoka, S. The Large Deviation Principle for Hypermixing Processes. Probab. Theory Relat. Fields. **1988**, 78, 627–649.
- We noted in the introduction that this is termed propagation of chaos by some.
- Bressloff, P. Stochastic neural field theory and the system-size expansion. SIAM J. Appl. Math. **2009**, 70, 1488–1521.
- Buice, M.; Cowan, J. Field-theoretic approach to fluctuation effects in neural networks. Phys. Rev. E **2007**, 75.
- Ginzburg, I.; Sompolinsky, H. Theory of correlations in stochastic neural networks. Phys. Rev. E **1994**, 50, 3171–3191.
- ElBoustani, S.; Destexhe, A. A master equation formalism for macroscopic modeling of asynchronous irregular activity states. Neural Comput. **2009**, 21, 46–100.
- Buice, M.; Cowan, J.; Chow, C. Systematic fluctuation expansion for neural network activity equations. Neural Comput. **2010**, 22, 377–426.
- Neveu, J. Processus aléatoires gaussiens; Presses de l’Université de Montréal: Montréal, QC, Canada, 1968; Volume 34.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Faugeras, O.; MacLaurin, J.
Asymptotic Description of Neural Networks with Correlated Synaptic Weights. *Entropy* **2015**, *17*, 4701-4743.
https://doi.org/10.3390/e17074701
