A Large Deviation Principle and an Expression of the Rate Function for a Discrete Stationary Gaussian Process

We prove a large deviation principle for a stationary Gaussian process over $\mathbb{R}^b$, indexed by $\mathbb{Z}^d$ (for some positive integers $d$ and $b$), with positive definite spectral density, and provide an expression of the corresponding rate function in terms of the mean of the process and its spectral density. This result is useful in applications where such an expression is needed.


Introduction
In this paper, we prove a large deviation principle (LDP) for a spatially stationary Gaussian process, indexed by $\mathbb{Z}^d$ and taking values in $\mathbb{R}^b$, and obtain an expression for the rate function. Our work in mathematical neuroscience involves the search for asymptotic descriptions of large ensembles of neurons [1][2][3]. Since there are many sources of noise in the brains of mammals [4], the mathematician interested in modeling certain aspects of brain functioning is often led to consider spatial Gaussian processes as models of these noise sources. This motivates us to use large deviation techniques. Being also interested in formulating predictions that can be experimentally tested in neuroscience laboratories, we strive to obtain analytical results, i.e., effective results from which, for example, numerical simulations can be developed. This is why we determine a more tractable expression for the rate function in this article.
Our result concerns the large deviations of ergodic phenomena, the literature of which we now briefly survey. Donsker and Varadhan obtained a large deviation estimate for the law governing the empirical process generated by a Markov process [5]. They then determined a large deviation principle for a $\mathbb{Z}$-indexed stationary Gaussian process, obtaining a particularly elegant expression for the rate function using spectral theory. Chiyonobu et al. [6][7][8] obtain a large deviation estimate for the empirical measure generated by ergodic processes satisfying certain mixing conditions. Baxter et al. [9] obtain a variety of results for the large deviations of ergodic phenomena, including one for the large deviations of $\mathbb{Z}$-indexed, $\mathbb{R}^b$-valued stationary Gaussian processes. Steinberg et al. [10] have proven an LDP for a stationary $\mathbb{Z}^d$-indexed Gaussian process over $\mathbb{R}$, and [11] obtain an LDP for an $\mathbb{R}$-indexed, $\mathbb{R}$-valued stationary Gaussian process.
In the work we are developing [12], we need large deviation results for spatially ergodic Ornstein-Uhlenbeck processes. This requires Theorem 1 of this paper.
In the first section, we make some preliminary definitions and state the chief theorem, Theorem 1, for zero-mean processes. In the second section, we prove the theorem. In Appendix A, we state and prove several identities involving the relative entropy, which are necessary for the proof of Theorem 1. In Appendix B, we prove a general result for the large deviations of exponential approximations. In Appendix C, we prove Corollary 2, which extends the result in Theorem 1 to non-zero-mean processes.

Preliminary Definitions
For some topological space $\Omega$ equipped with its Borelian $\sigma$-algebra $\mathcal{B}(\Omega)$, we denote the set of all probability measures by $\mathcal{M}(\Omega)$. We equip $\mathcal{M}(\Omega)$ with the topology of weak convergence. Our process is indexed by $\mathbb{Z}^d$. For $j \in \mathbb{Z}^d$, we write $j = (j(1), \ldots, j(d))$. For some positive integer $n$, we let $V_n = \{j \in \mathbb{Z}^d : |j(\delta)| \le n \text{ for all } 1 \le \delta \le d\}$. Let $T = \mathbb{R}^b$, for some positive integer $b$. We equip $T$ with the Euclidean topology and $T^{\mathbb{Z}^d}$ with the cylindrical topology, and we denote the Borelian $\sigma$-algebra generated by this topology by $\mathcal{B}(T^{\mathbb{Z}^d})$. For some $\mu \in \mathcal{M}(T^{\mathbb{Z}^d})$ governing a process $(X_j)_{j \in \mathbb{Z}^d}$, we let $\mu_{V_n} \in \mathcal{M}(T^{V_n})$ denote the marginal governing $(X_j)_{j \in V_n}$. For some $j \in \mathbb{Z}^d$, let the shift operator $S^j : T^{\mathbb{Z}^d} \to T^{\mathbb{Z}^d}$ be $(S^j \omega)_k = \omega_{j+k}$. We let $\mathcal{M}_s(T^{\mathbb{Z}^d})$ be the set of all stationary probability measures $\mu$ on $(T^{\mathbb{Z}^d}, \mathcal{B}(T^{\mathbb{Z}^d}))$, that is, those such that, for all $j \in \mathbb{Z}^d$, $\mu \circ (S^j)^{-1} = \mu$. We use Herglotz's theorem to characterize the law $Q \in \mathcal{M}_s(T^{\mathbb{Z}^d})$ governing a stationary process $(X_j)$ in the following manner. We assume that $E[X_j] = 0$ and:

Our convention throughout this paper is to denote the transpose of $X$ as ${}^\dagger X$ and the spectral density with a tilde.
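where we consider $[-\pi, \pi]^d$ to have the topology of a torus. In addition, $\tilde{G}(-\theta) = {}^\dagger\tilde{G}(\theta) = \overline{\tilde{G}(\theta)}$ ($\bar{x}$ indicates the complex conjugate of $x$). We assume that for all $\theta \in [-\pi, \pi]^d$, $\det \tilde{G}(\theta) > \tilde{G}_{\min}$ for some $\tilde{G}_{\min} > 0$, from which it follows that for each $\theta$, $\tilde{G}(\theta)$ is Hermitian positive definite. If $x \in \mathbb{C}^b$, then ${}^\dagger x X_j$ is a stationary sequence with spectral density ${}^\dagger x \tilde{G} x$. We employ the operator norm over $\mathbb{C}^{b \times b}$.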
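For orientation, a spectral representation of the kind that Herglotz's theorem provides, when a spectral density exists, typically takes the form below. This is only a sketch: the choice of $X_0$ as reference point, the sign in the exponent and the $(2\pi)^{-d}$ normalization are assumptions made here for illustration and may differ from the convention used in this paper.

```latex
% Assumed convention: covariance as the Fourier coefficient of the spectral density.
E\left[ X_0 \, {}^\dagger X_k \right]
  = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d}
      \exp\left( i \langle k, \theta \rangle \right) \tilde{G}(\theta) \, d\theta ,
\qquad k \in \mathbb{Z}^d .
```

Under such a convention, a real-valued process forces $\tilde{G}(-\theta) = \overline{\tilde{G}(\theta)}$, which agrees with the symmetry stated above.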
Let $p_n : T^{\mathbb{Z}^d} \to T^{\mathbb{Z}^d}$ be such that $p_n(\omega)_k = \omega_{k \bmod V_n}$. Here, and throughout the paper, we take $k \bmod V_n$ to be the element $l \in V_n$ such that, for all $1 \le \gamma \le d$, $l(\gamma) \equiv k(\gamma) \pmod{2n+1}$. Define the process-level empirical measure $\hat{\mu}_n : T^{\mathbb{Z}^d} \to \mathcal{M}_s(T^{\mathbb{Z}^d})$ as:

Let $\Pi^n \in \mathcal{M}(\mathcal{M}_s(T^{\mathbb{Z}^d}))$ be the image law of $Q$ under $\hat{\mu}_n$. We note that in this definition, we need not have chosen $V_n$ to have an odd side length $(2n + 1)$: this choice is for notational convenience, and these results could easily be reproduced in the case that $V_n$ has side length $n$.
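The index arithmetic behind $k \bmod V_n$, the periodic extension $p_n$ and the shift $S^j$ can be sketched in a few lines. This is a minimal illustration, not part of the paper: the function names `mod_Vn`, `periodize` and `shift` and the toy configuration are hypothetical, and a configuration is represented as a dictionary over $V_n$.

```python
from itertools import product


def mod_Vn(k, n):
    """Map a lattice point k in Z^d to its representative in V_n = {-n, ..., n}^d."""
    side = 2 * n + 1
    # Shift by n so that Python's non-negative % lands in [0, 2n], then shift back.
    return tuple(((c + n) % side) - n for c in k)


def periodize(omega, n):
    """Periodic extension p_n(omega): k -> omega[k mod V_n]."""
    return lambda k: omega[mod_Vn(k, n)]


def shift(x, j):
    """Shift operator: (S^j x)_k = x_{j+k}."""
    return lambda k: x(tuple(a + b for a, b in zip(j, k)))


# Toy configuration on V_1 in dimension d = 2 (the values are arbitrary labels).
n, d = 1, 2
omega = {k: 10 * k[0] + k[1] for k in product(range(-n, n + 1), repeat=d)}
p = periodize(omega, n)
```

Because `p` is periodic with period $2n + 1$ in every coordinate, shifting it by any multiple of the side length leaves it unchanged. This periodicity is what allows an average over the shifts $S^k$, $k \in V_n$, of the periodized configuration to be a stationary measure.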
In the context of mathematical neuroscience, $(X_j)$ could correspond to a model of interacting neurons on a lattice ($d = 1$, $2$ or $3$), as in [12,13]. We note that the large deviation principle of this paper may be used to obtain an LDP for other processes using standard methods, such as the contraction principle or Lemma 13.

Definition 1. Let $(\Omega, \mathcal{H})$ be a measurable space and $\mu$, $\nu$ probability measures.
where $B$ is the set of all bounded measurable functions. If $\Omega$ is Polish and $\mathcal{H} = \mathcal{B}(\Omega)$, then we only need to take the supremum over the set of all continuous bounded functions.
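A supremum over bounded test functions is characteristic of the variational (Donsker-Varadhan) form of the relative entropy, $\sup_f \{ E_\mu[f] - \log E_\nu[e^f] \}$. Assuming that this is the form intended in Definition 1, the sketch below checks on a three-point space that the supremum is attained at $f = \log(d\mu/d\nu)$ and equals the usual sum $\sum_i \mu_i \log(\mu_i / \nu_i)$. The two distributions and the function names are hypothetical.

```python
import math
import random


def kl_divergence(mu, nu):
    """Relative entropy sum_i mu_i log(mu_i / nu_i), for strictly positive nu."""
    return sum(m * math.log(m / v) for m, v in zip(mu, nu) if m > 0)


def dv_functional(f, mu, nu):
    """Value of E_mu[f] - log E_nu[exp(f)] for a test function f."""
    mean_f = sum(m * fi for m, fi in zip(mu, f))
    log_mgf = math.log(sum(v * math.exp(fi) for v, fi in zip(nu, f)))
    return mean_f - log_mgf


mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.2, 0.6]

# The supremum is attained at f = log(d mu / d nu), up to an additive constant.
f_star = [math.log(m / v) for m, v in zip(mu, nu)]
kl = kl_divergence(mu, nu)
best = dv_functional(f_star, mu, nu)

# Any other bounded f gives a value no larger than the relative entropy.
rng = random.Random(0)
trials = [dv_functional([rng.uniform(-3, 3) for _ in mu], mu, nu) for _ in range(1000)]
```

Here `best` coincides with `kl` up to rounding, and every entry of `trials` is bounded above by `kl`.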
Let $(Y_j)$ be a stationary Gaussian process on $T$, such that each $Y_j$ is governed by a law $P$, and we write the governing law in $\mathcal{M}_s(T^{\mathbb{Z}^d})$ as $P^{\mathbb{Z}^d}$. It is clear that the governing law over $V_n$ may be written as $P^{\otimes V_n}$ (that is, the product measure of $P$, indexed over $V_n$).

Definition 2. Let $\mathcal{E}_2$ be the subset of $\mathcal{M}_s(T^{\mathbb{Z}^d})$ defined by:

We define the process-level entropy to be, for $\mu \in \mathcal{M}_s(T^{\mathbb{Z}^d})$:

For further discussion of this rate function and a proof that $I^{(3)}$ is well-defined, see [8].

Definition 3. A sequence of probability laws $(\Gamma^n)$ on some topological space $\Omega$ equipped with its Borelian $\sigma$-algebra is said to satisfy a strong large deviation principle (LDP) with rate function $I$ :

and for all closed sets $F$:

If, furthermore, the set $\{x : I(x) \le \alpha\}$ is compact for all $\alpha \ge 0$, we say that $I$ is a good rate function.
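For reference, the standard textbook form of a large deviation principle with speed $a_n$ is recalled below. The speed is an assumption on our part: in the present lattice setting the natural choice would be $a_n = |V_n| = (2n+1)^d$, but the normalization used in Definition 3 should be taken from the original statement.

```latex
% Standard LDP with speed a_n (assumed here: a_n = (2n+1)^d) and
% lower semicontinuous rate function I : \Omega \to [0, \infty].
\text{for all open sets } O: \quad
  -\inf_{x \in O} I(x) \;\le\; \liminf_{n \to \infty} \frac{1}{a_n} \log \Gamma^n(O),
\\[4pt]
\text{for all closed sets } F: \quad
  \limsup_{n \to \infty} \frac{1}{a_n} \log \Gamma^n(F) \;\le\; -\inf_{x \in F} I(x).
```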
(which exists due to Herglotz's theorem) by $\tilde{\mu}$. We have:

$\tilde{G}(\theta)^{-1/2}$ be the Hermitian positive definite square root of $\tilde{G}(\theta)^{-1}$ and:

The $b \times b$ matrices $H_j$ are the coefficients of the absolutely convergent Fourier series (due to Wiener's theorem) of $\tilde{G}^{-1/2}$. Define $\beta : T^{\mathbb{Z}^d} \to T^{\mathbb{Z}^d}$ as follows:

The theorem below is the chief result of this paper.
Theorem 1. $(\Pi^n)$ satisfies a strong LDP with good rate function . Here:

where:

Here:

Finally, the rate function uniquely vanishes at $\mu = Q$.
Corollary 2. Suppose that the underlying process $Q$ is defined as previously, except with mean $E^Q[\omega_j] = c$ for all $j \in \mathbb{Z}^d$ and some constant $c \in \mathbb{R}^b$. If we denote the image law of the empirical measure by $\Pi^n_c$, then $(\Pi^n_c)$ satisfies a strong LDP with a good rate function (for $\mu \in \mathcal{E}_2$):

Here, if $\mu \notin \mathcal{E}_2$, then the rate function is infinite. The rate function has a unique minimum. We prove this in Appendix C.

Proof of Theorem 1
In this proof, we essentially adapt the methods of [9,14]. We introduce the following metric over $T^{\mathbb{Z}^d}$.
where the above is the Euclidean norm. Let $d_{\lambda,\mathcal{M}}$ be the induced Prokhorov metric over $\mathcal{M}(T^{\mathbb{Z}^d})$. These metrics are compatible with their respective topologies.
For $\theta \in [-\pi, \pi]^d$, let $\tilde{F}(\theta) = \tilde{G}(\theta)^{1/2}$ be the Hermitian positive square root of $\tilde{G}$ and:

The $b \times b$ matrices $F_j$ are the coefficients of the absolutely convergent Fourier series of the positive square root. Define $\tau$ :

We note that $\tau = \beta^{-1}$ (on a suitable domain, where the series are convergent) and that $\tau_{(n)}$ is a continuous map, but $\tau$ is not continuous (in general). We note (using Lemma 6) that $P^{\mathbb{Z}^d} \circ \tau^{-1}$ has spectral density $\tilde{G}$. We define:

We write $R^n \in \mathcal{M}(\mathcal{M}_s(T^{\mathbb{Z}^d}))$ for the law governing $\hat{\mu}_n(Y)$, where we recall that the stationary process $(Y_j)$ is defined just below Definition 1. Let $\Pi^n_{(m)} \in \mathcal{M}(\mathcal{M}_s(T^{\mathbb{Z}^d}))$ be the law governing $\hat{\mu}_n(\tau_{(m)}(Y))$.

Lemma 3. $(R^n)$ satisfies a large deviation principle with good rate function

Proof. The first statement is proven in [8]. The last statement follows from Lemma 11 below.
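The relation $\tau = \beta^{-1}$ has a simple counterpart at the level of Fourier coefficients: since $\tilde{G}^{1/2} \tilde{G}^{-1/2}$ is the identity, the coefficient sequences $(F_j)$ and $(H_j)$ are inverse to each other under convolution. The sketch below checks this numerically in the scalar case $d = b = 1$. It assumes that $\beta$ and $\tau$ act as convolutions with $(H_j)$ and $(F_j)$, which is how we read the definitions, and it uses a hypothetical spectral density $G(\theta) = 1 + 2a\cos\theta + a^2$ chosen only because it is bounded away from zero. The helper names are also hypothetical.

```python
import cmath
import math


def fourier_coeffs(func, max_order, grid=512):
    """Approximate c_j = (1/2pi) * integral of func(theta) e^{-i j theta} over
    [-pi, pi] by a uniform Riemann sum, for |j| <= max_order."""
    thetas = [-math.pi + 2 * math.pi * r / grid for r in range(grid)]
    values = [func(t) for t in thetas]
    return {
        j: sum(v * cmath.exp(-1j * j * t) for v, t in zip(values, thetas)) / grid
        for j in range(-max_order, max_order + 1)
    }


def convolve_at(a, b, j):
    """(a * b)_j = sum_k a_k b_{j-k}, restricted to the stored orders."""
    return sum(a[k] * b[j - k] for k in a if (j - k) in b)


# Hypothetical scalar spectral density (d = b = 1), bounded away from zero.
A = 0.5


def G(theta):
    return 1 + 2 * A * math.cos(theta) + A * A


F = fourier_coeffs(lambda t: math.sqrt(G(t)), max_order=30)      # G^{1/2}
H = fourier_coeffs(lambda t: 1 / math.sqrt(G(t)), max_order=30)  # G^{-1/2}

identity_0 = convolve_at(F, H, 0)  # close to 1
identity_1 = convolve_at(F, H, 1)  # close to 0
```

The truncation at order 30 is harmless in this example because both coefficient sequences decay geometrically. For a general density one would need to check the decay before truncating.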

Lemma 4. $(\Pi^n_{(m)})$ satisfies a strong LDP with good rate function given by, for $\mu \in \mathcal{E}_2$:

Proof. The sequence of laws governing $\hat{\mu}_n(Y) \circ \tau_{(m)}^{-1}$ (as $n \to \infty$, with $m$ fixed) satisfies a strong LDP with good rate function as a consequence of an application of the contraction principle to Lemma 3 (since $\tau_{(m)}$ is continuous). Now, through the same reasoning as in Lemma 2.1, Theorem 2.3 in [14], it follows from this that $(\Pi^n_{(m)})$ satisfies a strong LDP with the same rate function as that of $\hat{\mu}_n(Y) \circ \tau_{(m)}^{-1}$. The last identification in (11) follows from Lemmas 6 and 9 in Appendix A. We only need to take the infimum over $\mathcal{E}_2$, because by Lemma 3,

, then for all $n > 0$:

The proof is almost identical to that in Lemma 2.4 in [14]. We are now ready to prove Theorem 1.
Proof. We make use of the following standard Lemma from [15], to which the reader is referred for the definition of an orthogonal stochastic measure. Let $(U_j) \in \mathbb{R}^b$ be a zero-mean stationary sequence governed by $\xi \in \mathcal{E}_2$. Then, there exists an orthogonal $\mathbb{R}^b$-valued stochastic measure such that for every $j \in \mathbb{Z}^d$ ($\xi$-a.s.):

Conversely, any orthogonal stochastic measure defines a zero-mean stationary sequence through (20). It may be inferred from this representation that:

The proof that this is well-defined makes use of the fact that $K^{1/2}$ and $K^{-1/2}$ are uniformly continuous, since their Fourier series each converge uniformly. This gives us the lemma. We note for future reference that, if $\xi$ has spectral measure $d\tilde{\xi}(\theta)$, then the spectral density of $\xi \circ \Delta^{-1}$ is:

It remains for us to determine a specific expression for $I^{(3)}(\xi \circ \Upsilon^{-1})$.
Proof. We see from (21) that $\xi$

We thus find that:

Proof. We assume for now that there exists a $q$, such that $S_j = 0$ for all $|j| \ge q$, denoting the corresponding map by $\Delta_q$. Let $N^m_q : T^{V_m} \to T^{V_m}$ be the following linear operator. For $j \in V_m$, let $(N^m_q \omega)_j = \sum_{k \in V_m} S_{(k-j) \bmod V_m} \, \omega_k$. Let $\xi_{q,n} = \xi_{V_n} \circ (N^n_q)^{-1}$. It follows from this assumption that the $V_l$ marginals of $\xi_{q,n}$ and $\xi \circ \Delta_q^{-1}$ are the same, as long as $l \le n - q$. Thus:

This last inequality follows from a property of the relative entropy $I^{(2)}$, namely that it is nondecreasing as we take a 'finer' $\sigma$-algebra (it is a direct consequence of Lemma 2.3 in [5]). If $\xi_{V_n}$ does not have a density for some $n$, then $I^{(3)}(\xi)$ is infinite and the Lemma is trivial. Otherwise, we may readily evaluate $I^{(2)}(\xi_{q,n} \| P^{\otimes V_n})$ using a change of variable to find that:

We divide (24) by $(2l + 1)^d$, substitute the above result and, finally, take $l \to \infty$ (while fixing $n = l + q$) to find that:

Here, $\Gamma_{\Delta_q}(\xi)$ is equal to $\Gamma_{\Delta}(\xi)$, as defined above, subject to the above assumption that $S_j = 0$ for $|j| > q$. On taking $q \to \infty$, it may be readily seen that $\Gamma_{\Delta_q} \to \Gamma_{\Delta}$ pointwise. Furthermore, the lower semicontinuity of $I^{(3)}$ dictates that:

Proof. We find, similarly to the previous Lemma, that if $\gamma \in \mathcal{M}_s(T^{\mathbb{Z}^d})$, then:

We substitute $\gamma = \xi \circ \Delta^{-1}$ into the above and, after noting Lemma 6, we find that:

The result now follows from the previous Lemma and (26).
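The periodized operator $N^m_q$ has a circulant structure, which is what makes the change of variable above tractable: its eigenvectors are discrete Fourier modes and its eigenvalues are samples of the finite Fourier sum of the coefficients. The sketch below verifies this in the scalar case $d = b = 1$. The coefficients `S`, the sign convention in `symbol` and the function names are hypothetical choices made for illustration only.

```python
import cmath


def build_periodic_operator(S, m):
    """Matrix of (N w)_j = sum_{k in V_m} S[(k - j) mod V_m] * w_k for d = b = 1.

    S maps integer offsets to real coefficients; missing offsets count as zero.
    Rows and columns are ordered as j = -m, ..., m.
    """
    side = 2 * m + 1

    def wrap(k):
        return ((k + m) % side) - m

    idx = range(-m, m + 1)
    return [[S.get(wrap(k - j), 0.0) for k in idx] for j in idx]


def symbol(S, theta):
    """Finite Fourier sum sum_l S_l e^{i l theta}."""
    return sum(c * cmath.exp(1j * l * theta) for l, c in S.items())


m = 4
S = {-1: 0.25, 0: 1.0, 1: 0.5}  # hypothetical coefficients, zero for |j| >= 2
N = build_periodic_operator(S, m)
side = 2 * m + 1


def eigen_residual(p):
    """Largest entry of N v - lambda v for the mode v_k = e^{2 pi i p k / side}."""
    theta = 2 * cmath.pi * p / side
    v = [cmath.exp(1j * theta * k) for k in range(-m, m + 1)]
    lam = symbol(S, theta)
    Nv = [sum(a * x for a, x in zip(row, v)) for row in N]
    return max(abs(y - lam * x) for y, x in zip(Nv, v))


residuals = [eigen_residual(p) for p in range(side)]
```

Since the determinant is the product of these eigenvalues, $(2m+1)^{-1} \log |\det N|$ is a Riemann sum for $(2\pi)^{-1} \int_{-\pi}^{\pi} \log |\sum_l S_l e^{il\theta}| \, d\theta$. A Jacobian term of this kind is what one would expect the change of variable to produce in the limit.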
Upon taking $M \to \infty$ and applying the dominated convergence theorem, we obtain:

We have used the stationarity of $\mu$. By taking the limit $n \to \infty$, we obtain the first inequality (29).

C. Proof of Corollary 2
We now prove Corollary 2.
We divide by $(2n + 1)^d$ and take $n$ to infinity to obtain:

If $\mu_{V_n}$ does not possess a density for some $n$, then both sides of the above equation are infinite. It may be verified that the spectral density of $\nu$ is given by:

On substituting this into the expression in Theorem 1, we find that:

We have used the fact that $\tilde{G}(0)$ is symmetric. We thus obtain (5). The minimum of the rate function remains unique because of the bijectivity of $\Psi^{\mathbb{Z}^d}$.