A Large Deviation Principle and an Expression of the Rate Function for a Discrete Stationary Gaussian Process

Olivier Faugeras; James MacLaurin

doi:10.3390/e16126722


INRIA Sophia Antipolis Méditerranée, 2004 Route des Lucioles, Sophia Antipolis, France


Author to whom correspondence should be addressed.

Entropy 2014, 16(12), 6722-6738; https://doi.org/10.3390/e16126722

This article belongs to the Section Statistical Physics


Abstract

We prove a large deviation principle for a stationary Gaussian process over ℝ^b indexed by ℤ^d (for some positive integers d and b), with positive definite spectral density, and provide an expression of the corresponding rate function in terms of the mean of the process and its spectral density. This result is useful in applications where such an expression is needed.

Keywords:

stationary Gaussian process; large deviation principle; spectral density

1. Introduction

In this paper, we prove a large deviation principle (LDP) for a spatially stationary Gaussian process over ℝ^b indexed by ℤ^d and obtain an expression for the rate function. Our work in mathematical neuroscience involves the search for asymptotic descriptions of large ensembles of neurons [–]. Since there are many sources of noise in the brains of mammals [], the mathematician interested in modeling certain aspects of brain functioning is often led to consider spatial Gaussian processes as models of these noise sources. This motivates us to use large deviation techniques. Being also interested in formulating predictions that can be experimentally tested in neuroscience laboratories, we strive to obtain analytical results, i.e., effective results from which, for example, numerical simulations can be developed. This is why we determine a more tractable expression for the rate function in this article.

Our result concerns the large deviations of ergodic phenomena, the literature of which we now briefly survey. Donsker and Varadhan obtained a large deviation estimate for the law governing the empirical process generated by a Markov process []. They then determined a large deviation principle for a ℤ-indexed stationary Gaussian process, obtaining a particularly elegant expression for the rate function using spectral theory. Chiyonobu et al. [–] obtain a large deviation estimate for the empirical measure generated by ergodic processes satisfying certain mixing conditions. Baxter et al. [] obtain a variety of results for the large deviations of ergodic phenomena, including one for the large deviations of ℤ-indexed ℝ^b-valued stationary Gaussian processes. Steinberg et al. [] have proven an LDP for a stationary ℤ^d-indexed Gaussian process over ℝ, and [] obtain an LDP for an ℝ-indexed, ℝ-valued stationary Gaussian process.

In the work we are developing [], we need large deviation results for spatially-ergodic Ornstein–Uhlenbeck processes. This requires Theorem 1 of this paper.

In Section 2, we make some preliminary definitions and state the chief theorem, Theorem 1, for zero-mean processes. In Section 3, we prove the theorem. In Appendix A, we state and prove several identities involving the relative entropy, which are necessary for the proof of Theorem 1. In Appendix B, we prove a general result for the large deviations of exponential approximations. In Appendix C, we prove Corollary 2, which extends the result in Theorem 1 to non-zero mean processes.

2. Preliminary Definitions

For some topological space Ω equipped with its Borelian σ-algebra B(Ω), we denote the set of all probability measures by ℳ(Ω). We equip ℳ(Ω) with the topology of weak convergence. Our process is indexed by ℤ^d. For j ∈ ℤ^d, we write j = (j(1),…,j(d)). For some positive integer n, we let V_n = {j ∈ ℤ^d: |j(δ)| ≤ n for all 1 ≤ δ ≤ d}. Let

T = ℝ^{b}

, for some positive integer b. We equip

T

with the Euclidean topology and

T^{ℤ^{d}}

with the cylindrical topology, and we denote the Borelian σ-algebra generated by this topology by

B (T^{ℤ^{d}})

. For some

μ \in ℳ (T^{ℤ^{d}})

governing a process

{(X^{j})}_{j \in ℤ^{d}}

, we let

μ^{V_{n}} \in ℳ (T^{V_{n}})

denote the marginal governing

{(X^{j})}_{j \in V_{n}}

. For some j ∈ ℤ^d, let the shift operator

S^{j} : T^{ℤ^{d}} \to T^{ℤ^{d}}

be defined by S^j(ω)^k = ω^{j+k}. We let

ℳ_{s} (T^{ℤ^{d}})

be the set of all stationary probability measures μ on

(T^{ℤ^{d}}, B (T^{ℤ^{d}}))

, such that, for all j ∈ ℤ^d, μ ○ (S^j)⁻¹ = μ. We use Herglotz’s theorem to characterize the law

Q \in ℳ_{s} (T^{ℤ^{d}})

governing a stationary process (X^j) in the following manner. We assume that E[X^j] = 0 and:

E [X^{0} \, {}^{†}X^{k}] = \frac{1}{(2 π)^{d}} \int_{[- π, π]^{d}} \exp (i ⟨ k, θ ⟩) \tilde{G} (θ) \, d θ .

(1)

Our convention throughout this paper is to denote the transpose of X as ^†X and the spectral density with a tilde. ⟨⋅,⋅⟩ is the standard inner product on ℝ^d. Here,

\tilde{G} (θ)

is a continuous function [−π, π]^d → ℂ^{b×b}, where we consider [−π, π]^d to have the topology of a torus. In addition,

\tilde{G} (- θ) = {}^{†}\tilde{G} (θ) = \bar{\tilde{G}} (θ)

(

\bar{x}

indicates the complex conjugate of x). We assume that for all θ ∈ [−π, π]^d, det

\tilde{G} (θ) > \tilde{G}_{\min}

for some

\tilde{G}_{\min} > 0

from which it follows that for each θ,

\tilde{G} (θ)

is Hermitian positive definite. If x ∈ ℂ^b, then ^†xX^j is a stationary sequence with spectral density

{}^{†}x \tilde{G} \bar{x}

. We employ the operator norm over ℂ^{b×b}. Let

p_{n} : T^{ℤ^{d}} \to T^{ℤ^{d}}

be such that p_n(ω)^k = ω^{k mod V_n}. Here, and throughout the paper, we take k mod V_n to be the element l ∈ V_n, such that, for all 1 ≤ γ ≤ d, l(γ) = k(γ) mod (2n + 1). Define the process-level empirical measure

{\hat{μ}}_{n} : T^{ℤ^{d}} \to ℳ_{s} (T^{ℤ^{d}})

as:

{\hat{μ}}_{n} (ω) = \frac{1}{{(2 n + 1)}^{d}} \sum_{k \in V_{n}} δ_{S^{k} p_{n} (ω)} .

(2)

Let

Π^{n} \in ℳ (ℳ_{s} (T^{ℤ^{d}}))

be the image law of Q under

{\hat{μ}}_{n}

. We note that in this definition, we need not have chosen V_n to have an odd side length (2n + 1): this choice is for notational convenience, and these results could easily be reproduced in the case that V_n has side length n.
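The periodic shift structure behind (2) can be sketched numerically. The following is a minimal illustration, assuming d = 1 and b = 1; the function name `empirical_average` and the test functions are hypothetical and are not taken from the paper. It integrates a test function against the empirical measure by averaging over cyclic rotations of the box, which is how S^k p_n(ω) acts on V_n.

```python
import numpy as np

def empirical_average(omega_box, f):
    """Integrate a test function f against the empirical measure (2), for d = 1.

    omega_box holds omega^k for k in V_n = {-n, ..., n} (length 2n + 1).
    p_n extends the box periodically and S^k shifts the configuration, so on
    the box S^k p_n(omega) is a cyclic rotation of omega_box.
    """
    size = len(omega_box)                  # 2n + 1 sites
    total = 0.0
    for k in range(size):
        shifted = np.roll(omega_box, -k)   # entry i becomes omega_box[(i + k) % size]
        total += f(shifted)
    return total / size

n = 2
omega = np.arange(5.0)                     # a toy configuration on V_2
mean_at_origin = empirical_average(omega, lambda w: w[n])          # site 0 sits at index n
lag_one_product = empirical_average(omega, lambda w: w[n] * w[n + 1])
```

Because the shifts run over a full period, the average of ω^0 under the empirical measure is the plain box average, and the average of ω^0 ω^1 is the periodized lag-one product.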

In the context of mathematical neuroscience, (X^j) could correspond to a model of interacting neurons on a lattice (d = 1, 2 or 3), as in [,]. We note that the large deviation principle of this paper may be used to obtain an LDP for other processes using standard methods, such as the contraction principle or Lemma 13.

Definition 1. Let (Ω, H) be a measurable space and μ, ν probability measures on it. The relative entropy of μ with respect to ν is:

I^{(2)} (μ ‖ ν) = \sup_{f \in B} \{ E^{μ} [f] - \log E^{ν} [\exp (f)] \},

where B is the set of all bounded measurable functions. If Ω is Polish and H = B(Ω), then we only need to take the supremum over the set of all continuous bounded functions.
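The variational formula in Definition 1 can be checked on a finite space, where the relative entropy is the familiar sum of μ_i log(μ_i / ν_i). The sketch below is only an illustration under that assumption; the vectors `mu` and `nu` and the helper names are hypothetical. It evaluates the functional at f = log(dμ/dν), which attains the supremum, and at random bounded functions, which should not exceed it.

```python
import numpy as np

def kl_divergence(mu, nu):
    """Relative entropy sum_i mu_i log(mu_i / nu_i) for strictly positive vectors."""
    return float(np.sum(mu * np.log(mu / nu)))

def dv_functional(f, mu, nu):
    """E^mu[f] - log E^nu[exp(f)], the quantity maximised in Definition 1."""
    return float(np.dot(mu, f) - np.log(np.dot(nu, np.exp(f))))

mu = np.array([0.5, 0.3, 0.2])             # hypothetical laws on a 3-point space
nu = np.array([0.2, 0.2, 0.6])

f_star = np.log(mu / nu)                   # a maximiser, unique up to an additive constant
best = dv_functional(f_star, mu, nu)
kl = kl_divergence(mu, nu)

rng = np.random.default_rng(0)
trials = [dv_functional(rng.normal(size=3), mu, nu) for _ in range(1000)]
```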

Let (Y^j) be a stationary Gaussian process on

T

, such that E[Y^j] = 0, E[Y^j †Y^k] = 0 for j ≠ k and E[Y^j †Y^j] = Id_b. Each Y^j is governed by a law P, and we write the governing law in

ℳ_{s} (T^{ℤ^{d}})

as

P^{ℤ^{d}}

. It is clear that the governing law over V_n may be written as

P^{\otimes V_{n}}

(that is the product measure of P, indexed over V_n).

Definition 2. Let ε₂ be the subset of

ℳ_{s} (T^{ℤ^{d}})

defined by:

ε_{2} = \{ μ \in ℳ_{s} (T^{ℤ^{d}}) \mid E^{μ} [‖ ω^{0} ‖^{2}] < \infty \} .

We define the process-level entropy to be, for

μ \in ℳ_{s} (T^{ℤ^{d}})

:

I^{(3)} (μ) = \lim_{n \to \infty} \frac{1}{{(2 n + 1)}^{d}} I^{(2)} (μ^{V_{n}} ‖ P^{\otimes V_{n}}) .

It is a consequence of Lemma 11 that if μ ∉ ε₂, then I⁽³⁾(μ) = ∞. For further discussion of this rate function and a proof that I⁽³⁾ is well-defined, see [].

Definition 3. A sequence of probability laws (Γⁿ) on some topological space Ω equipped with its Borelian σ-algebra is said to satisfy a strong large deviation principle (LDP) with rate function I: Ω → ℝ if I is lower semicontinuous, for all open sets O,

\liminf_{n \to \infty} \frac{1}{(2 n + 1)^{d}} \log Γ^{n} (O) \geq - \inf_{x \in O} I (x)

and for all closed sets F:

\limsup_{n \to \infty} \frac{1}{(2 n + 1)^{d}} \log Γ^{n} (F) \leq - \inf_{x \in F} I (x) .

If, furthermore, the set {x: I(x) ≤ α} is compact for all α ≥ 0, we say that I is a good rate function.

Definition 4. For μ ∈ ε₂, we denote the ℂ^{b×b}-valued spectral measure on ([−π, π]^d, B([−π, π]^d)) (which exists due to Herglotz’s theorem) by

\tilde{μ}

. We have:

\frac{1}{{(2 π)}^{d}} \int_{{[- π, π]}^{d}} \exp (i ⟨ k, θ ⟩) d \tilde{μ} (θ) = E^{μ} [ω^{0 †} ω^{k}] .

For θ ∈ [−π, π]^d, let

\tilde{H} (θ) = \tilde{G} {(θ)}^{- \frac{1}{2}}

be the Hermitian positive definite square root of

\tilde{G} {(θ)}^{- 1}

and:

\tilde{H} (θ) = \sum_{j \in ℤ^{d}} H^{j} \exp (- i ⟨ j, θ ⟩) .

(3)

The b × b matrices H^j are the coefficients of the absolutely convergent Fourier series (due to Wiener’s theorem) of

{\tilde{G}}^{- 1 / 2}

. Define

β : T^{ℤ^{d}} \to T^{ℤ^{d}}

as follows:

(β (ω))^{k} = \sum_{j \in ℤ^{d}} {}^{†}H^{j} ω^{k - j} .

The theorem below is the chief result of this paper.

Theorem 1. (Πⁿ) satisfies a strong LDP with good rate function I⁽³⁾ (μ ○ β⁻¹). Here:

I^{(3)} (μ \circ β^{- 1}) = \begin{cases} I^{(3)} (μ) - Γ (μ) & \text{if } μ \in ε_{2} \\ \infty & \text{otherwise,} \end{cases}

(4)

where:

Γ (μ) = \begin{cases} Γ_{1} + Γ_{2} (μ) & \text{if } μ \in ε_{2} \\ 0 & \text{otherwise.} \end{cases}

Here:

\begin{array}{l} Γ_{1} = - \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} \log \det \tilde{G} (θ) \, d θ, \\ Γ_{2} (μ) = \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} tr (({Id}_{b} - \tilde{G} (θ)^{- 1}) \, d \tilde{μ} (θ)) . \end{array}

Finally, the rate function uniquely vanishes at μ = Q.
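As a numerical sanity check of the statement that the rate function vanishes at μ = Q, the following sketch evaluates Γ₁ and Γ₂ in the scalar case d = b = 1. It rests on two assumptions that are not part of the paper: the spectral density is taken to be of AR(1) type with hypothetical parameters `a` and `sigma2`, and I⁽³⁾(Q) is computed from the standard closed form for a Gaussian stationary law relative to unit white noise, (1/4π) ∫ (G̃ − 1 − log G̃) dθ.

```python
import numpy as np

def torus_mean(values):
    """(1 / 2 pi) times the integral over [-pi, pi) of a periodic function on a uniform grid."""
    return float(np.mean(values))

def gamma_terms(G, mu_density):
    """Gamma_1 and Gamma_2(mu) of Theorem 1 for d = b = 1, when mu has a spectral density."""
    gamma1 = -0.5 * torus_mean(np.log(G))
    gamma2 = 0.5 * torus_mean((1.0 - 1.0 / G) * mu_density)
    return gamma1, gamma2

theta = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
a, sigma2 = 0.5, 2.0                        # hypothetical AR(1)-type parameters
G = sigma2 / (1.0 + a**2 - 2.0 * a * np.cos(theta))

gamma1, gamma2 = gamma_terms(G, G)          # evaluate Gamma at mu = Q
entropy_rate = 0.5 * torus_mean(G - 1.0 - np.log(G))   # assumed Gaussian formula for I^(3)(Q)
rate_at_Q = entropy_rate - (gamma1 + gamma2)
```

For this density, Γ₁ reduces to −(1/2) log σ² and Γ₂(Q) to half of the variance minus one, so both terms can be compared with closed forms.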

Corollary 2. Suppose that the underlying process Q is defined as previously, except with mean E^Q[ω^j] = c for all j ∈ ℤ^d and some constant c ∈ ℝ^b. If we denote the image law of the empirical measure by

Π_{c}^{n}

then

(Π_{c}^{n})

satisfies a strong LDP with a good rate function (for μ ∈ ε₂):

I^{(3)} (μ) - Γ (μ) + \frac{1}{2} {}^{†}c \tilde{G} (0)^{- 1} c - {}^{†}c \tilde{G} (0)^{- 1} m^{μ} .

(5)

Here, m^μ = E^{μ(ω)}[ω^j] for all j ∈ ℤ^d. If μ ∉ ε₂, then the rate function is infinite. The rate function has a unique minimum, i.e.,

I^{(3)} (μ) - Γ (μ) + \frac{1}{2} {}^{†}c \tilde{G} (0)^{- 1} c - {}^{†}c \tilde{G} (0)^{- 1} m^{μ} = 0

if and only if μ = Q.

We prove this in Appendix C.

3. Proof of Theorem 1

In this proof, we essentially adapt the methods of [,]. We introduce the following metric over

T^{ℤ^{d}}

. For j ∈ ℤ^d, let

λ^{j} = 3^{- d} \prod_{δ = 1}^{d} 2^{- | j (δ) |}

. Define the metric d^λ as follows,

d^{λ} (x, y) = \sum_{j \in ℤ^{d}} λ^{j} \min (‖ x^{j} - y^{j} ‖, 1),

where ‖⋅‖ is the Euclidean norm. Let d^{λ,ℳ} be the induced Prokhorov metric over

ℳ (T^{ℤ^{d}})

. These metrics are compatible with their respective topologies.

For θ ∈ [−π, π]^d, let

\tilde{F} (θ) = \tilde{G} {(θ)}^{1 / 2}

be the Hermitian positive square root of

\tilde{G}

and:

\tilde{F} (θ) = \sum_{j \in ℤ^{d}} F^{j} \exp (- i ⟨ j, θ ⟩) .

(6)

The b × b matrices F^j are the coefficients of the absolutely convergent Fourier series of the positive square root. Define

τ : T^{ℤ^{d}} \to T^{ℤ^{d}}

and

τ_{(n)} : T^{ℤ^{d}} \to T^{ℤ^{d}}

as follows,

\begin{array}{l} (τ (ω))^{k} = \sum_{j \in ℤ^{d}} {}^{†}F^{j} ω^{k - j}, \\ (τ_{(n)} (ω))^{k} = \sum_{j \in V_{n}} {}^{†}F^{j} ω^{k - j} \prod_{δ = 1}^{d} (1 - \frac{| j (δ) |}{2 n + 1}) . \end{array}

(7)

We note that τ = β⁻¹ (on a suitable domain, where the series are convergent) and that τ_{(n)} is a continuous map, but τ is not continuous (in general).

We note (using Lemma 6) that

P^{ℤ^{d}} \circ τ^{- 1}

has spectral density

\tilde{G}

. We define:

{\tilde{F}}_{(n)} (θ) = \sum_{j \in V_{n}} F^{j} \exp (- i ⟨ j, θ ⟩) \prod_{δ = 1}^{d} (1 - \frac{| j (δ) |}{2 n + 1}) .

(8)

We write

ε (n) = \sup_{θ \in {[- π, π]}^{d}} {‖ \tilde{F} (θ) - {\tilde{F}}_{(n)} (θ) ‖}^{2}

. By Fejér’s theorem, ε(n) → 0 as n → ∞. We define

{\tilde{G}}_{(n)} (θ) = {\tilde{F}}_{(n)} {(θ)}^{2}

, noting that this is the spectral density of

P^{ℤ^{d}} \circ τ_{(n)}^{- 1}

. Let Γ_{(n)}(μ) = Γ_{1,(n)} + Γ_{2,(n)}(μ), where for μ ∈ ε₂,

Γ_{1, (n)} = - \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} \log \det \tilde{G}_{(n)} (θ) \, d θ,

(9)

Γ_{2, (n)} (μ) = \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} tr (({Id}_{b} - \tilde{G}_{(n)}^{- 1} (θ)) \, d \tilde{μ} (θ)) .

(10)

If μ ∉ ε₂, let Γ_{1,(n)} = Γ_{2,(n)}(μ) = 0. Let

R^{n} \in ℳ (ℳ_{s} (T^{ℤ^{d}}))

be the law governing

{\hat{μ}}_{n} (Y)

, where we recall that the stationary process (Y^j) is defined just below Definition 1. Let

Π_{(m)}^{n} \in ℳ (ℳ_{s} (T^{ℤ^{d}}))

be the law governing

{\hat{μ}}_{n} (τ_{(m)} (Y))

.
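The Fejér-weighted truncation in (8) and the error ε(n) can be illustrated numerically. The sketch below assumes d = b = 1 and a hypothetical AR(1)-type spectral density; the function name `fejer_error` and the grid size are illustrative choices. It approximates the Fourier coefficients F^j of (6) with a discrete Fourier transform, applies the weights 1 − |j|/(2n + 1), and measures the squared sup-norm error on the grid.

```python
import numpy as np

def fejer_error(F_samples, n):
    """sup_theta |F(theta) - F_(n)(theta)|^2 for d = b = 1, following (8).

    F_samples holds F on the uniform grid theta_k = 2 pi k / N, k = 0, ..., N - 1,
    and the routine assumes 2n + 1 <= N.
    """
    N = len(F_samples)
    coeffs = np.fft.ifft(F_samples)         # coeffs[j % N] approximates F^j in (6)
    damped = np.zeros(N, dtype=complex)
    for j in range(-n, n + 1):
        weight = 1.0 - abs(j) / (2 * n + 1)
        damped[j % N] = weight * coeffs[j % N]
    F_n = np.fft.fft(damped)                # sum_j damped_j exp(-i j theta_k)
    return float(np.max(np.abs(F_samples - F_n)) ** 2)

N = 2048
theta = 2.0 * np.pi * np.arange(N) / N      # the torus, parametrised by [0, 2 pi)
a = 0.5                                     # hypothetical parameter
G = 1.0 / (1.0 + a**2 - 2.0 * a * np.cos(theta))
F = np.sqrt(G)                              # positive square root of G

errors = [fejer_error(F, n) for n in (5, 20, 80)]
```

In this example the error decreases as n grows, in line with ε(n) → 0; the decay is only of order 1/n² for the squared error because of the Fejér weights.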

Lemma 3. (Rⁿ) satisfies a large deviation principle with good rate function I⁽³⁾(μ). If μ ∉ ε₂, then I⁽³⁾(μ) = ∞.

Proof. The first statement is proven in []. The last statement follows from Lemma 11 below.

Lemma 4.

(Π_{(m)}^{n})

satisfies a strong LDP with good rate function given by, for μ ∈ ε₂:

I_{(m)}^{(3)} (μ) = \inf_{ν \in ε_{2} : μ = ν \circ τ_{(m)}^{- 1}} I^{(3)} (ν) = I^{(3)} (μ) - Γ_{(m)} (μ) .

(11)

If μ ∉ ε₂, then

I_{(m)}^{(3)} (μ) = \infty

.

Proof. The sequence of laws governing

{\hat{μ}}_{n} (Y) \circ τ_{(m)}^{- 1}

(as n → ∞, with m fixed) satisfies a strong LDP with good rate function as a consequence of an application of the contraction principle to Lemma 3 (since τ_{(m)} is continuous). Now, through the same reasoning as in Lemma 2.1, Theorem 2.3 in [], it follows from this that

(Π_{(m)}^{n})

satisfies a strong LDP with the same rate function as that of

{\hat{μ}}_{n} (Y) \circ τ_{(m)}^{- 1}

. The last identification in (11) follows from Lemmas 6 and 9 in Appendix A. We only need to take the infimum over ε₂, because by Lemma 3, I⁽³⁾ (ν) is infinite for ν ∉ ε₂.

Lemma 5. If

0 < b < \frac{1}{2 ε (m)}

, then for all n > 0:

\frac{1}{(2 n + 1)^{d}} \log E^{P^{ℤ^{d}}} [\exp (b \sum_{k \in V_{n}} ‖ τ_{(m)} (ω)^{k} - τ (ω)^{k} ‖^{2})] \leq - \frac{1}{2} \log (1 - 2 b ε (m)) .

The proof is almost identical to that in Lemma 2.4 in []. We are now ready to prove Theorem 1.

Proof. We apply Lemma 13 in the Appendix to the above result. We substitute Y_{(m)} = τ_{(m)}(ω) and W = τ(ω). Taking m → ∞ in the equation in Lemma 5, we find that (38) is satisfied if we stipulate that κ = 0. After noting the LDP governing

(Π_{(m)}^{n})

in (11), we may thus conclude that (Πⁿ) satisfies a strong LDP with good rate function:

\lim_{δ \to 0} \liminf_{m \to \infty} \inf_{γ \in B^{δ} (μ)} I_{(m)}^{(3)} (γ),

(12)

where B^δ(μ) = {γ: d^{λ,ℳ}(μ,γ) ≤ δ}.

It remains for us to identify (12) with the rate function in (4). We claim that for each δ > 0,

\liminf_{m \to \infty} \inf_{γ \in B^{δ} (μ)} (I_{(m)}^{(3)} (γ)) = \inf_{γ \in B^{δ} (μ)} (I^{(3)} (γ) - Γ (γ)) .

(13)

To see this, we have from Lemma 11 that for all m and all γ and constants α₁ < 1, α₃ > 1 and α₂, α₄ ∈ ℝ, (note that if γ ∉ ε₂ the inequalities below are immediate from the definitions):

I^{(3)} (γ) - Γ (γ) \geq (1 - α_{1}) I^{(3)} (γ) - α_{2}

(14)

\begin{array}{l} I_{(m)}^{(3)} (γ) \geq (1 - α_{1}) I^{(3)} (γ) - α_{2}, \\ I^{(3)} (γ) - Γ (γ) \leq (1 + α_{3}) I^{(3)} (γ) - α_{4}, \\ I_{(m)}^{(3)} (γ) \leq (1 + α_{3}) I^{(3)} (γ) - α_{4} . \end{array}

(15)

We thus see that if I⁽³⁾(γ) = ∞ for all γ ∈ B^δ(μ), then (13) is identically infinite on both sides. Otherwise, it may be seen from (14) and (15) that it suffices to establish (13) in the case that

B^{δ} (μ) = B_{l}^{δ} (μ) := \{ γ : d^{λ, ℳ} (μ, γ) \leq δ, \; I^{(3)} (γ) \leq l \}

for some l < ∞. However, it follows from (29) and (34) that for all

γ \in B_{l}^{δ} (μ)

, there exist constants

(α_{5}^{m})

, which converge to zero as m → ∞ and such that

| I_{(m)}^{(3)} (γ) - I^{(3)} (γ) + Γ (γ) | \leq α_{5}^{m}

. We may thus conclude (13). The expression for the rate function in (4) now follows, since I⁽³⁾(γ) – Γ(γ) is lower semicontinuous, by Lemma 12.

For the second statement in the Theorem, if I⁽³⁾ (μ ○ β⁻¹) = 0, then

I^{(2)} ({(μ \circ β^{- 1})}^{V_{n}} ‖ P^{\otimes V_{n}}) = 0

for all n. However, since the relative entropy has a unique zero, this means that

{(μ \circ β^{- 1})}^{V_{n}} = P^{\otimes V_{n}}

for all n. However, this means that

μ \circ β^{- 1} = P^{ℤ^{d}}

, and therefore (using Lemma 6), μ = Q.

Appendix

A. Properties of the Entropy

Let

\tilde{K} : [- π, π]^{d} \to ℂ^{b \times b}

possess an absolutely convergent Fourier series and be such that the eigenvalues of

\tilde{K} (θ)

are strictly greater than zero for all θ. We require that

\tilde{K}

is the density of a stationary sequence, which means that we must also assume that for all θ:

\tilde{K} (- θ) = {}^{†}\tilde{K} (θ) = \bar{\tilde{K}} (θ) .

This means, in particular, that

\tilde{K} (θ)

is Hermitian. We write:

(Δ (ω))^{k} = \sum_{j \in ℤ^{d}} {}^{†}S^{j} ω^{k - j}, where

(16)

\sum_{j \in ℤ^{d}} S^{j} \exp (- i ⟨ j, θ ⟩) = \tilde{K} (θ)^{- \frac{1}{2}} .

(17)

Here,

{\tilde{K}}^{- \frac{1}{2}}

is understood to be the positive Hermitian square root of

{\tilde{K}}^{- 1}

. The Fourier series of

{\tilde{K}}^{- \frac{1}{2}}

is absolutely convergent as a consequence of Wiener’s theorem. In this section, we determine a general expression for I⁽³⁾ (ξ ○ ∆⁻¹). We are generalizing the result for b = 1 with ℤ-indexing given in []. These results are necessary for the proofs in the previous section.

We similarly write that:

(ϒ (ω))^{k} = \sum_{j \in ℤ^{d}} {}^{†}R^{j} ω^{k - j}, where

(18)

\sum_{j \in ℤ^{d}} R^{j} \exp (- i ⟨ j, θ ⟩) = \tilde{K} (θ)^{\frac{1}{2}} .

(19)

As previously,

{\tilde{K}}^{\frac{1}{2}}

is understood to be the positive definite Hermitian square root of

\tilde{K}

. We note that R^{−j} = ^†R^j and S^{−j} = ^†S^j.

Lemma 6. For all ξ ∈ ε₂, ξ ○ ∆⁻¹ and ξ ○ ϒ⁻¹ are in ε₂ and:

ξ \circ Δ^{- 1} \circ ϒ^{- 1} = ξ \circ ϒ^{- 1} \circ Δ^{- 1} = ξ .

Proof. We make use of the following standard Lemma from [], to which the reader is referred for the definition of an orthogonal stochastic measure. Let (U^j) ∈ ℝ^b be a zero-mean stationary sequence governed by ξ ∈ ε₂. Then, there exists an orthogonal ℝ^b-valued stochastic measure Z^ξ = Z^ξ(∆) (∆ ∈ B([−π, π[^d)), such that for every j ∈ ℤ^d (ξ a.s.):

U^{j} = \frac{1}{2 π} \int_{[- π, π]^{d}} \exp (i ⟨ j, θ ⟩) Z^{ξ} (d θ) .

(20)

Conversely, any orthogonal stochastic measure defines a zero-mean stationary sequence through (20). It may be inferred from this representation that:

\begin{array}{l} Z^{ξ \circ Δ^{- 1}} (d θ) = {}^{†}\tilde{K} (θ)^{- \frac{1}{2}} Z^{ξ} (d θ), \\ Z^{ξ \circ ϒ^{- 1}} (d θ) = {}^{†}\tilde{K} (θ)^{\frac{1}{2}} Z^{ξ} (d θ) . \end{array}

The proof that this is well-defined makes use of the fact that

{\tilde{K}}^{\frac{1}{2}}

and

{\tilde{K}}^{- \frac{1}{2}}

are uniformly continuous, since their Fourier series each converge uniformly. This gives us the lemma. We note for future reference that, if ξ has spectral measure

d \tilde{ξ} (θ)

, then the spectral measure of ξ ○ ∆⁻¹ is:

{\tilde{K}}^{- \frac{1}{2}} (θ) d \tilde{ξ} (θ) {\tilde{K}}^{- \frac{1}{2}} (θ) .

(21)

It remains for us to determine a specific expression for I⁽³⁾ (ξ ○ ϒ⁻¹).

Definition 5. If ξ ∈ ε₂, we define:

Γ^{Δ} (ξ) = \frac{1}{2} (E^{ξ} [{‖ ω^{0} ‖}^{2}] - E^{ξ \circ Δ^{- 1}} [{‖ ω^{0} ‖}^{2}]) - \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} \log \det (\tilde{K} (θ)) d θ .

(22)

Otherwise, we define Γ^Δ(ξ) = 0.

Lemma 7. If ξ ∈ ε₂,

Γ^{Δ} (ξ) = \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} tr (({Id}_{b} - \tilde{K} {(θ)}^{- 1}) d \tilde{ξ} (θ)) - \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} \log \det (\tilde{K} (θ)) d θ .

(23)

Proof. We see from (21) that ξ ○ ∆⁻¹ has spectral measure

{\tilde{K}}^{- \frac{1}{2}} (θ) d \tilde{ξ} (θ) {\tilde{K}}^{- \frac{1}{2}} (θ)

. We thus find that:

\begin{array}{l} Γ^{Δ} (ξ) = \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} tr (d \tilde{ξ} (θ)) - \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} tr (\tilde{K}^{- \frac{1}{2}} (θ) \, d \tilde{ξ} (θ) \, \tilde{K}^{- \frac{1}{2}} (θ)) - \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} \log \det (\tilde{K} (θ)) \, d θ \\ = \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} tr (d \tilde{ξ} (θ)) - \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} tr (\tilde{K} (θ)^{- 1} \, d \tilde{ξ} (θ)) - \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} \log \det (\tilde{K} (θ)) \, d θ . \end{array}

The first equality uses (35) applied to ξ and to ξ ○ ∆⁻¹, and the second uses the cyclicity of the trace.

Lemma 8. For all ξ ∈ ε₂,

I^{(3)} (ξ \circ Δ^{- 1}) \leq I^{(3)} (ξ) - Γ^{Δ} (ξ) .

Proof. We assume for now that there exists a q, such that S^j = 0 for all |j| ≥ q, denoting the corresponding map by ∆_q. Let

N_{q}^{m} : T^{V_{m}} \to T^{V_{m}}

be the following linear operator. For j ∈ V_m, let

{(N_{q}^{m} ω)}^{j} = \sum_{k \in V_{m}} S^{(k - j) \mod V_{m}} ω^{k}

. Let

{{}^{‵}ξ}_{q, n} = ξ^{V_{n}} \circ {(N_{q}^{n})}^{- 1}

. It follows from this assumption that the V_l marginals of

{{}^{‵}ξ}_{q, n}

and

ξ \circ Δ_{q}^{- 1}

are the same, as long as l ≤ n – q. Thus:

\begin{matrix} I^{(2)} ({(ξ \circ Δ_{q}^{- 1})}^{V_{l}} ‖ P^{\otimes V_{l}}) = I^{(2)} ({({{}^{‵}ξ}_{q, n})}^{V_{l}} ‖ P^{\otimes V_{l}}) \\ \leq I^{(2)} ({{}^{‵}ξ}_{q, n} ‖ P^{\otimes V_{n}}) . \end{matrix}

(24)

This last inequality follows from a property of the relative entropy I⁽²⁾, namely that it is nondecreasing as we take a ‘finer’ σ-algebra (it is a direct consequence of Lemma 2.3 in []). If

ξ^{V_{n}}

does not have a density for some n, then I⁽³⁾ (ξ) is infinite and the Lemma is trivial. If otherwise, we may readily evaluate

I^{(2)} ({{}^{‵}ξ}_{q, n} ‖ P^{\otimes V_{n}})

using a change of variable to find that:

I^{(2)} ({{}^{‵}ξ}_{q, n} ‖ P^{\otimes V_{n}}) = I^{(2)} (ξ^{V_{n}} ‖ P^{\otimes V_{n}}) + \frac{1}{2} E^{ξ^{V_{n}}} [{‖ N_{q}^{n} ω ‖}^{2} - {‖ ω ‖}^{2}] + \frac{1}{2} \log \det (N_{q}^{n}) .

We divide (24) by (2l + 1)^d, substitute the above result and, finally, take l → ∞ (while fixing n = l + q) to find that:

\begin{array}{l} I^{(3)} (ξ \circ Δ_{q}^{- 1}) \\ \leq I^{(3)} (ξ) + \frac{1}{2} (E^{ξ \circ Δ^{- 1}} [‖ ω^{0} ‖^{2}] - E^{ξ} [‖ ω^{0} ‖^{2}]) + \frac{1}{2 (2 π)^{d}} \int_{[- π, π]^{d}} \log \det (\tilde{K} (θ)) \, d θ \\ = I^{(3)} (ξ) - Γ_{q}^{Δ} (ξ) . \end{array}

Here,

Γ_{q}^{Δ} (ξ)

is equal to Γ^∆(ξ), as defined above, subject to the above assumption that S^j = 0 for |j| > q. On taking q → ∞, it may be readily seen that

Γ_{q}^{Δ} \to Γ^{Δ}

pointwise. Furthermore, the lower semicontinuity of I⁽³⁾ dictates that:

I^{(3)} (ξ \circ Δ^{- 1}) \leq \liminf_{q \to \infty} I^{(3)} (ξ \circ Δ_{q}^{- 1}),

which gives us the Lemma. □

Lemma 9. For all ξ ∈ ε₂, I⁽³⁾ (ξ ○ ∆⁻¹) = I⁽³⁾ (ξ) − Γ^∆ (ξ).

Proof. We find, similarly to the previous Lemma, that if

γ \in ℳ_{s} (T^{ℤ^{d}})

, then:

I^{(3)} (γ \circ ϒ^{- 1}) \leq I^{(3)} (γ) + \frac{1}{2} [E^{γ \circ ϒ^{- 1}} [{‖ ω^{0} ‖}^{2}] - E^{γ} [{‖ ω^{0} ‖}^{2}]] - \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} \log \det \tilde{K} (θ) d θ .

(25)

We substitute γ = ξ ○ ∆⁻¹ into the above and, after noting Lemma 6, we find that:

I^{(3)} (ξ) \leq I^{(3)} (ξ \circ Δ^{- 1}) + \frac{1}{2} (E^{ξ} [{‖ ω^{0} ‖}^{2}] - E^{(ξ \circ Δ^{- 1})} [{‖ ω^{0} ‖}^{2}]) - \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} \log \det \tilde{K} (θ) d θ = I^{(3)} (ξ \circ Δ^{- 1}) + Γ^{Δ} (ξ) .

(26)

The result now follows from the previous Lemma and (26). □

We next prove some matrix identities, which are needed in the proof of Lemma 11.

Lemma 10. If A, B ∈ ℂ^{l×l} (for some positive integer l) are both Hermitian, then:

tr (AB) \leq l ‖ A ‖ ‖ B ‖ \leq \begin{cases} l ‖ A ‖ tr (B) & \text{if } B \text{ is positive} \\ l \, tr (A) ‖ B ‖ & \text{if } A \text{ is positive} \end{cases}

(27)

If A and B are positive and, in addition, A is invertible, then:

tr (B) \leq l {‖ A^{- 1} ‖}^{2} tr (ABA) .

(28)

Proof. The first part of (27) follows from von Neumann’s trace inequality. Given two matrices A and B in ℂ^{l×l}:

| tr (AB) | \leq \sum_{t = 1}^{l} α_{t} β_{t},

where the α_t’s and β_t’s are the singular values of A and B. In the case where A and B are Hermitian, the singular values are the magnitudes of the (real) eigenvalues. By Cauchy–Schwarz,

\sum_{t = 1}^{l} α_{t} β_{t} \leq \sqrt{\sum_{t = 1}^{l} α_{t}^{2}} \times \sqrt{\sum_{t = 1}^{l} β_{t}^{2}} \leq l ‖ A ‖ ‖ B ‖ .

If B is positive, so are its eigenvalues and ||B|| ≤ tr(B), hence (27). If A is invertible, tr(B) = tr(A⁻²ABA). If, moreover, A and B are both Hermitian positive, we obtain the second identity by applying (27) to the Hermitian matrix A⁻² and the Hermitian positive matrix ABA. □
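The inequalities (27) and (28) can be spot-checked on random matrices. The following sketch is only an illustration; the helper names, the matrix size and the number of trials are hypothetical choices, and the operator norm is taken to be the largest singular value.

```python
import numpy as np

def random_hermitian(l, rng):
    M = rng.normal(size=(l, l)) + 1j * rng.normal(size=(l, l))
    return (M + M.conj().T) / 2.0

def random_positive(l, rng):
    M = rng.normal(size=(l, l)) + 1j * rng.normal(size=(l, l))
    return M @ M.conj().T + 0.1 * np.eye(l)    # Hermitian positive definite

def op_norm(A):
    return float(np.linalg.norm(A, 2))         # largest singular value

def check_lemma10(l=4, trials=200, seed=0):
    """Return True if (27) and (28) hold on every random draw."""
    rng = np.random.default_rng(seed)
    tol = 1e-9
    for _ in range(trials):
        A, B = random_hermitian(l, rng), random_positive(l, rng)
        bound = l * op_norm(A) * op_norm(B)
        ok27 = np.trace(A @ B).real <= bound + tol
        ok27_positive = bound <= l * op_norm(A) * np.trace(B).real + tol
        P = random_positive(l, rng)            # plays the role of the invertible A in (28)
        rhs28 = l * op_norm(np.linalg.inv(P)) ** 2 * np.trace(P @ B @ P).real
        ok28 = np.trace(B).real <= rhs28 + tol
        if not (ok27 and ok27_positive and ok28):
            return False
    return True

all_hold = check_lemma10()
```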

Lemma 11. For all

0 < a < \frac{1}{2}

,

a E^{μ} [‖ ω^{0} ‖^{2}] + \frac{b}{2} \log (1 - 2 a) \leq I^{(3)} (μ) .

(29)

For all μ ∈ ε₂, there exist constants α₁ < 1, α₂ ∈ ℝ, α₃ > 1, α₄ ∈ ℝ and

\overset{⌣}{m} > 0

, such that for all

m > \overset{⌣}{m}

,

Γ_{(m)} (μ) \leq α_{1} I^{(3)} (μ) + α_{2},

(30)

Γ (μ) \leq α_{1} I^{(3)} (μ) + α_{2} .

(31)

Γ_{(m)} (μ) \geq - α_{3} I^{(3)} (μ) + α_{4}

(32)

Γ (μ) \geq - α_{3} I^{(3)} (μ) + α_{4}

(33)

There exist constants

α_{3}^{m}, α_{4}^{m} > 0

that converge to zero as m → ∞, such that:

| Γ_{(m)} (μ) - Γ (μ) | \leq α_{3}^{m} E^{μ} [{‖ ω^{0} ‖}^{2}] + α_{4}^{m} .

(34)

Proof. It is a standard result that if

μ \notin ℳ_{s} (T^{ℤ^{d}})

, then I⁽³⁾(μ) = ∞, for which the first result is evident. Let

μ \in ℳ_{s} (T^{ℤ^{d}})

. For

ω \in T^{V_{n}}

, we let

f (ω) = \sum_{k \in V_{n}} ‖ ω^{k} ‖^{2}

. The function f_M(ω) = f(ω) 1_{a f(ω) ≤ M} is bounded, and hence, from Definition 1, we have:

a \int_{T^{V_{n}}} f_{M} d μ^{V_{n}} \leq \log \int_{T^{V_{n}}} \exp (a f_{M}) d P^{\otimes V_{n}} + I^{(2)} (μ^{V_{n}} ‖ P^{\otimes V_{n}}) .

We obtain using an easy Gaussian computation that:

\log \int_{T^{V_{n}}} \exp (a f) \, d P^{\otimes V_{n}} = - \frac{(2 n + 1)^{d} b}{2} \log (1 - 2 a) .

Upon taking M → ∞ and applying the dominated convergence theorem, we obtain:

a E^{μ} [{‖ ω^{0} ‖}^{2}] \leq - \frac{b}{2} \log (1 - 2 a) + \frac{1}{{(2 n + 1)}^{d}} I^{(2)} (μ^{V_{n}} ‖ P^{\otimes V_{n}}) .

We have used the stationarity of μ. By taking the limit n → ∞, we obtain the first inequality (29).
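The Gaussian computation used above reduces to the one-dimensional identity log E[exp(a y²)] = −(1/2) log(1 − 2a) for a standard normal y and a < 1/2; the factor (2n + 1)^d b then counts the independent coordinates. The sketch below checks this identity by a Riemann sum. The function name, the integration window and the grid size are hypothetical choices made for the illustration.

```python
import numpy as np

def log_mgf_square(a, half_width=40.0, points=400001):
    """log E[exp(a * y**2)] for a scalar standard normal y, by a Riemann sum (a < 1/2)."""
    y = np.linspace(-half_width, half_width, points)
    dy = y[1] - y[0]
    integrand = np.exp((a - 0.5) * y**2) / np.sqrt(2.0 * np.pi)
    return float(np.log(np.sum(integrand) * dy))

a = 0.3
numeric = log_mgf_square(a)
closed_form = -0.5 * np.log(1.0 - 2.0 * a)
```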

It follows from the definition that (30)–(33) are true if μ ∉ ε₂. Thus, we may assume that μ ∈ ε₂. We choose

\overset{⌣}{m}

to be such that the eigenvalues of

{\tilde{F}}_{(m)}

(as defined in (8)) are strictly greater than zero for all

m > \overset{⌣}{m}

. It may be easily verified that:

\sum_{s = 1}^{b} {\tilde{μ}}_{s s} ({[- π, π]}^{d}) = {(2 π)}^{d} E^{μ} [{‖ ω^{0} ‖}^{2}] .

(35)

We observe the following upper and lower bounds, which hold for all

m > \overset{⌣}{m}

(and for Γ₁, too),

Γ_{1, (m)} \leq - \frac{1}{2} \inf_{m \geq \overset{⌣}{m}, θ \in {[- π, π]}^{d}} \log \det {\tilde{G}}_{(m)} (θ) < \infty

(36)

Γ_{1, (m)} \geq - \frac{1}{2} \sup_{m \geq \overset{⌣}{m}, θ \in {[- π, π]}^{d}} \log \det {\tilde{G}}_{(m)} (θ) > - \infty

(37)

We recall that, since

\tilde{G}_{(m)} = \tilde{F}_{(m)}^{2}

,

Γ_{2, (m)} (μ) = \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} tr (d \tilde{μ} (θ)) - \frac{1}{2 {(2 π)}^{d}} \int_{{[- π, π]}^{d}} tr ({\tilde{F}}_{(m)}^{- 2} (θ) d \tilde{μ} (θ)) .

Note that

tr ({\tilde{F}}_{(m)}^{- 2} (θ) d \tilde{μ} (θ)) = tr ({\tilde{F}}_{(m)}^{- 1} (θ) ({\tilde{F}}_{(m)}^{- 1} (θ) d \tilde{μ} (θ) {\tilde{F}}_{(m)}^{- 1} (θ)) {\tilde{F}}_{(m)} (θ)) = tr ({\tilde{F}}_{(m)}^{- 1} (θ) d \tilde{μ} (θ) {\tilde{F}}_{(m)}^{- 1} (θ))

and apply Lemma 10, (28) to this, obtaining:

Γ_{2, (m)} (μ) \leq \frac{1}{2} (1 - α_{1}^{*}) E^{μ} [{‖ ω^{0} ‖}^{2}],

where:

α_{1}^{*} = \frac{1}{b} \inf_{θ \in [- π, π]^{d}, m \geq \overset{⌣}{m}} (‖ \tilde{F}_{(m)} (θ) ‖^{- 2}) > 0.

If

α_{1}^{*} \geq 1

, then (30) follows from (36), because I⁽³⁾(μ) ≥ 0 and the above inequality implies that Γ_{2,(m)}(μ) is nonpositive. Otherwise, we may use (29) and (36) to find that:

Γ_{(m)} (μ) \leq \frac{1}{2 a} (1 - α_{1}^{*}) I^{(3)} (μ) - \frac{b (1 - α_{1}^{*})}{4 a} \log (1 - 2 a) - \frac{1}{2} \inf_{m \geq \overset{⌣}{m}, θ \in {[- π, π]}^{d}} \log \det {\tilde{G}}_{(m)} .

We may substitute

a > \frac{1}{2} (1 - α_{1}^{*})

, letting

α_{1} = \frac{1}{2 a} (1 - α_{1}^{*})

, into the above to obtain (30). The second inequality (31) follows by taking m → ∞ in the first.

For the third inequality, we find using (27) that:

Γ_{2, (m)} (μ) \geq \frac{1}{2} (1 - α_{3}^{*}) E^{μ} [{‖ ω^{0} ‖}^{2}],

where:

α_{3}^{*} = b \inf_{θ \in {[- π, π]}^{d}, m \geq \overset{⌣}{m}} ({‖ {\tilde{F}}_{(m)} (θ) ‖}^{2}) .

If

α_{3}^{*} \leq 1

then (32) follows from (37), because I⁽³⁾(μ) ≥ 0 and the above inequality implies that Γ_{2,(m)}(μ) is nonnegative. Otherwise, we may use (29) and (37) to find that:

Γ_{(m)} (μ) \geq \frac{1 - α_{3}^{*}}{2 a} I^{(3)} (μ) + \frac{b (1 - α_{3}^{*})}{4 a} \log (1 - 2 a) - \frac{1}{2} \sup_{m \geq \overset{⌣}{m}, θ \in [- π, π]^{d}} \log \det \tilde{G}_{(m)} (θ) .

This yields (32) on taking

a < \frac{α_{3}^{*} - 1}{2}

. Taking limits as m → ∞ yields (33).

Now, making use of Lemma 10, (27), it may be seen that:

\begin{array}{l} | Γ_{2, (m)} (μ) - Γ_{2} (μ) | = \frac{1}{2 (2 π)^{d}} | \int_{[- π, π]^{d}} tr ((\tilde{G}^{- 1} (θ) - \tilde{G}_{(m)}^{- 1} (θ)) \, d \tilde{μ} (θ)) | \\ \leq \frac{b}{2 (2 π)^{d}} G_{m} \int_{[- π, π]^{d}} tr (d \tilde{μ} (θ)) . \end{array}

Here,

G_{m} = \sup_{θ \in {[- π, π]}^{d}} ‖ {\tilde{G}}^{- 1} (θ) - {\tilde{G}}_{(m)}^{- 1} (θ) ‖

. The convergence of Γ_{1,(m)} to Γ₁ is clear. We thus obtain (34).

Lemma 12. I⁽³⁾(μ) − Γ(μ) is lower-semicontinuous.

Proof. Since I⁽³⁾(μ) − Γ(μ) is infinite for all μ ∉ ε₂, we only need to prove the Lemma for μ ∈ ε₂. We need to prove that if μ^{(j)} → μ, then

\liminf_{j \to \infty} I^{(3)} (μ^{(j)} \circ β^{- 1}) \geq I^{(3)} (μ \circ β^{- 1})

. We may assume, without loss of generality, that:

\liminf_{j \to \infty} I^{(3)} (μ^{(j)} \circ β^{- 1}) = \lim_{j \to \infty} I^{(3)} (μ^{(j)} \circ β^{- 1}) \in ℝ \cup \{ \infty \} .

If

E^{μ^{(j)}} [‖ ω^{0} ‖^{2}] \to \infty

, then by (29) in Lemma 11, lim_{j→∞} I⁽³⁾(μ^{(j)} ○ β⁻¹) = ∞, satisfying the requirements of the Lemma.

Thus, we may assume that there exists a constant l, such that

E^{μ^{(j)}} [‖ ω^{0} ‖^{2}] \leq l

for all j. We therefore have that, for all m, because of (11) and (4),

\liminf_{j \to \infty} I^{(3)} (μ^{(j)} \circ β^{- 1}) = \liminf_{j \to \infty} (I_{(m)}^{(3)} (μ^{(j)}) + Γ_{(m)} (μ^{(j)}) - Γ (μ^{(j)})) .

Making use of (34), we may thus conclude that:

\liminf_{j \to \infty} I^{(3)} (μ^{(j)} \circ β^{- 1}) \geq \liminf_{j \to \infty} (I_{(m)}^{(3)} (μ^{(j)}) - ϵ_{(m)}^{*}),

for some

ϵ_{(m)}^{*}

, which goes to zero as m → ∞. In addition,

\liminf_{j \to \infty} I_{(m)}^{(3)} (μ^{(j)}) \geq I_{(m)}^{(3)} (μ)

due to the lower semi-continuity of

I_{(m)}^{(3)}

. On taking m → ∞, since

I_{(m)}^{(3)} (μ) \to I^{(3)} (μ \circ β^{- 1})

we find that:

\liminf_{j \to \infty} I^{(3)} (μ^{(j)} \circ β^{- 1}) \geq I^{(3)} (μ \circ β^{- 1}) .

B. A Lemma on the Large Deviations of Stationary Random Variables

The following lemma is an adaptation of Theorem 4.9 in [] to ℤ^d. We state it in a general context. Let B be a Banach Space with norm || ⋅ ||. For j ∈ ℤ^d, let

λ_{j} = 3^{- d} \prod_{δ = 1}^{d} 2^{- | j (δ) |}

. We note that

\sum_{k \in ℤ^{d}} λ_{k} = 1

. Define the metric d^λ on

B^{ℤ^{d}}

by:

d^{λ} (x, y) = \sum_{j \in ℤ^{d}} λ_{j} \min (‖ x^{j} - y^{j} ‖, 1) .
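The normalization of the weights λ_j comes from the one-dimensional sum of 2^{−|j|} over ℤ, which equals 3. The sketch below checks this on a truncated box and evaluates a truncated version of d^λ. It is a minimal illustration: the function names and the truncation radius are hypothetical, and the metric helper assumes d = 1 with B = ℝ^b under the Euclidean norm.

```python
import itertools
import numpy as np

def weight(j):
    """lambda_j = 3^{-d} * prod_delta 2^{-|j(delta)|} for a site j in Z^d (given as a tuple)."""
    d = len(j)
    return 3.0 ** (-d) * float(np.prod([2.0 ** (-abs(c)) for c in j]))

def truncated_weight_sum(d, radius):
    """Sum of lambda_j over the box {-radius, ..., radius}^d."""
    sites = itertools.product(range(-radius, radius + 1), repeat=d)
    return float(sum(weight(j) for j in sites))

def d_lambda(x, y, radius):
    """Truncated d^lambda for d = 1; x and y map a site index to a vector in R^b."""
    total = 0.0
    for j in range(-radius, radius + 1):
        gap = np.linalg.norm(np.asarray(x(j)) - np.asarray(y(j)))
        total += weight((j,)) * min(gap, 1.0)
    return total

total_weight = truncated_weight_sum(2, 30)
far_apart = d_lambda(lambda j: [0.0], lambda j: [5.0], 30)   # every term is capped at 1
identical = d_lambda(lambda j: [1.0, 2.0], lambda j: [1.0, 2.0], 30)
```

Since every summand is capped at 1 and the weights sum to 1, d^λ never exceeds 1, which is what makes the series defining the metric converge for arbitrary configurations.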

Let the induced Prokhorov metric on

ℳ (B^{ℤ^{d}})

be d^{λ,ℳ}. For

ω \in B^{ℤ^{d}}

and j ∈ ℤ^d, we define the shift operator as S^j(ω)^k = ω^{j+k}. Let

B^{δ} (A) = \{ x \in B^{ℤ^{d}} : d^{λ} (x, y) \leq δ \text{ for some } y \in A \}

be the closed blowup and B(δ) be the closed blowup of {0}.

Suppose that for m ∈ ℤ₊, Y₍_m₎,

W \in B^{Z^{d}}

are stationary random variables, governed by a probability law ℙ. We suppose that

{\hat{μ}}_{n} (Y_{(m)})

is governed by

\prod_{(m)}^{n}

and

{\hat{μ}}^{n} (W)

is governed by

\prod_{W}^{n}

; these are the empirical process measures, defined analogously to (2). Suppose that for each m,

(\prod_{(m)}^{n})

satisfies an LDP with good rate function J_{(m)}. Suppose that W = Y_{(m)} + Z_{(m)} for some sequence of stationary random variables Z_{(m)} on

B^{Z^{d}}

.

Lemma 13. If there exists a constant κ > 0, such that for all b > 0:

\underset{m \to \infty}{\lim^{¯}} \underset{n \to \infty}{\lim^{¯}} \frac{1}{{(2 n + 1)}^{d}} \log E [\exp (b \sum_{j \in V_{n}} min ({‖ Z_{(m)}^{j} ‖}^{2}, 1))] < κ,

(38)

then

(\prod_{W}^{n})

satisfies an LDP with good rate function:

J (x) = \lim_{ϵ \to 0} \underset{m \to \infty}{\lim_{¯}} \inf_{y \in B^{ϵ} (x)} J_{(m)} (y) = \lim_{ϵ \to 0} \underset{m \to \infty}{\lim^{¯}} \inf_{y \in B^{ϵ} (x)} J_{(m)} (y) .

Proof. It suffices, thanks to Theorem 4.2.16, Exercise 4.2.29 in [], to prove that for all ϵ > 0,

\lim_{m \to \infty} \underset{n \to \infty}{\lim^{¯}} \frac{1}{{(2 n + 1)}^{d}} \log P (d^{λ, M} ({\hat{μ}}^{n} (W), {\hat{μ}}^{n} (Y_{(m)})) > ϵ) = - \infty .

(39)

For

x \in B^{Z^{d}}

, write |x|_λ:= d^λ(x, 0). Let

B \in \mathcal{B} (B^{Z^{d}})

. Then, noting the definition of p_n just above (2),

\begin{array}{l} {\hat{μ}}^{n} (W) (B) = \frac{1}{{(2 n + 1)}^{d}} \sum_{j \in V_{n}} 1_{B} (S^{j} p_{n} (Y_{(m)}) + S^{j} p_{n} (Z_{(m)})) \\ \leq \frac{1}{{(2 n + 1)}^{d}} \sum_{j \in V_{n}} \{1_{B} (S^{j} p_{n} (Y_{(m)}) + S^{j} p_{n} (Z_{(m)})) 1_{B (ϵ)} (S^{j} p_{n} (Z_{(m)})) + 1_{B {(ϵ)}^{c}} (S^{j} p_{n} (Z_{(m)}))\} \\ \leq \frac{1}{{(2 n + 1)}^{d}} \sum_{j \in V_{n}} 1_{B^{ϵ}} (S^{j} p_{n} (Y_{(m)})) + \frac{1}{{(2 n + 1)}^{d}} \# \{j \in V_{n} : | S^{j} p_{n} (Z_{(m)}) |_{λ} > ϵ\} \\ \leq {\hat{μ}}^{n} (Y_{(m)}) (B^{ϵ}) + \frac{1}{{(2 n + 1)}^{d}} \# \{j \in V_{n} : | S^{j} p_{n} (Z_{(m)}) |_{λ} > ϵ\} . \end{array}

Thus:

\begin{array}{l} P (d^{λ, M} ({\hat{μ}}^{n} (W), {\hat{μ}}^{n} (Y_{(m)})) > ϵ) \\ \leq P (\frac{1}{{(2 n + 1)}^{d}} \# \{j \in V_{n} : | S^{j} p_{n} (Z_{(m)}) |_{λ} > ϵ\} > ϵ) \\ \leq P (\sum_{j \in V_{n}} | S^{j} p_{n} (Z_{(m)}) |_{λ}^{2} > {(2 n + 1)}^{d} ϵ^{3}) \\ \leq \exp (- b {(2 n + 1)}^{d} ϵ^{3}) E^{P} [\exp (b \sum_{j \in V_{n}} | S^{j} p_{n} (Z_{(m)}) |_{λ}^{2})] \\ \leq \exp (- b {(2 n + 1)}^{d} ϵ^{3}) E^{P} [\exp (b \sum_{j \in V_{n}, k \in Z^{d}} λ_{k} min ({‖ p_{n} {(Z_{(m)})}^{j + k} ‖}^{2}, 1))] \end{array}

for an arbitrary b > 0. Since

\sum_{k \in Z^{d}} λ_{k} = 1

and the exponential function is convex, by Jensen’s inequality:

\begin{array}{l} \exp (- b {(2 n + 1)}^{d} ϵ^{3}) E^{P} [\exp (b \sum_{j \in V_{n}, k \in Z^{d}} λ_{k} min ({‖ p_{n} {(Z_{(m)})}^{j + k} ‖}^{2}, 1))] \\ \leq \exp (- b {(2 n + 1)}^{d} ϵ^{3}) E^{P} [\sum_{k \in Z^{d}} λ_{k} \exp (b \sum_{j \in V_{n}} min ({‖ p_{n} {(Z_{(m)})}^{j + k} ‖}^{2}, 1))] \\ = \exp (- b {(2 n + 1)}^{d} ϵ^{3}) E^{P} [\exp (b \sum_{j \in V_{n}} min ({‖ (Z_{(m)}^{j}) ‖}^{2}, 1))], \end{array}

by the stationarity of Z_{(m)} and the fact that

\sum_{k \in Z^{d}} λ_{k} = 1

. We may thus infer, using (38), that:

\underset{m \to \infty}{\lim^{¯}} \underset{n \to \infty}{\lim^{¯}} \frac{1}{{(2 n + 1)}^{d}} \log P (d^{λ, M} ({\hat{μ}}^{n} (W), {\hat{μ}}^{n} (Y_{(m)})) > ϵ) \leq - b ϵ^{3} + κ .

(40)

Since b is arbitrary, we may take b → ∞ to obtain (39). □
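Two elementary inequalities carry the chain of estimates in the proof above: a counting step (if more than a fraction ϵ of the sites have |·|_λ > ϵ, then the sum of squares exceeds ϵ³ times the number of sites) and a Jensen step (the square of a λ-weighted average is at most the λ-weighted average of the squares). The sketch below tests both on random data. It is only an illustrative check under assumed inputs: the function names `chain_holds` and `jensen_gap`, the random seed and the sample sizes are hypothetical choices, not taken from the paper.

```python
import random

def chain_holds(a, eps):
    """For values a_j in [0, 1]: #{j : a_j > eps} > eps * N implies sum_j a_j^2 > eps^3 * N."""
    n_sites = len(a)
    count = sum(1 for v in a if v > eps)
    if count <= eps * n_sites:
        return True  # the hypothesis fails, so there is nothing to check
    return sum(v * v for v in a) > eps ** 3 * n_sites

def jensen_gap(weights, values):
    """sum_k w_k t_k^2 - (sum_k w_k t_k)^2, non-negative for probability weights w_k."""
    mean = sum(w * t for w, t in zip(weights, values))
    second = sum(w * t * t for w, t in zip(weights, values))
    return second - mean * mean

rng = random.Random(0)  # fixed seed so the experiment is reproducible
all_ok = True
min_gap = float("inf")
for _ in range(2000):
    n = rng.randint(1, 50)
    a = [rng.random() for _ in range(n)]          # stand-ins for |S^j p_n(Z)|_lambda
    eps = rng.uniform(0.01, 0.99)
    all_ok = all_ok and chain_holds(a, eps)
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    w = [r / total for r in raw]                  # normalised weights, summing to 1
    min_gap = min(min_gap, jensen_gap(w, a))
```

In the proof, the Jensen step is applied with t_k = min(‖x^k‖, 1), for which t_k² = min(‖x^k‖², 1); this is what turns |x|_λ² into the weighted sum appearing in the last line of the estimate.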

C. Proof of Corollary 2

We now prove Corollary 2.

Proof. Let

ϕ : T \to T

be ϕ(ω) = ω + c and

ϕ^{Z^{d}} : T^{Z^{d}} \to T^{Z^{d}}

be

ϕ^{Z^{d}} {(ω)}^{j} = ϕ (ω^{j})

. Let

Ψ^{Z^{d}} : M_{s} (T^{Z^{d}}) \to M_{s} (T^{Z^{d}})

be

Ψ^{Z^{d}} (μ) = μ \circ {(ϕ^{Z^{d}})}^{- 1}

and

Ψ : M_{s} (T) \to M_{s} (T)

be Ψ(ν) = ν ∘ ϕ^{-1}. It is easily checked that these maps are bicontinuous bijections for their respective topologies. Since

\prod_{c}^{n} = \prod^{n} \circ {(Ψ^{Z^{d}})}^{- 1}

, we have, by a contraction principle, Theorem 4.2.1 in [], that

\prod_{c}^{n}

satisfies a strong LDP with good rate function:

I^{(3)} ({(Ψ^{Z^{d}})}^{- 1} (μ)) - Γ ({(Ψ^{Z^{d}})}^{- 1} (μ)) .

(41)

Clearly,

{(Ψ^{Z^{d}})}^{- 1} (μ)

is in ε₂ if and only if μ is in ε₂. Let

ν = {(Ψ^{Z^{d}})}^{- 1} (μ)

. It is well known that if

{({(Ψ^{Z^{d}})}^{- 1} (μ))}^{V_{n}}

is absolutely continuous relative to

P^{\otimes V_{n}}

, then the relative entropy may be written as:

\begin{array}{l} I^{(2)} ({({(Ψ^{Z^{d}})}^{- 1} (μ))}^{V_{n}} ‖ P^{\otimes V_{n}}) = E^{{({(Ψ^{Z^{d}})}^{- 1} (μ))}^{V_{n}} (x)} [\log \frac{d {({(Ψ^{Z^{d}})}^{- 1} (μ))}^{V_{n}}}{d P^{\otimes V_{n}}} (x)] \\ = E^{μ^{V_{n}} (x)} [\log \frac{d μ^{V_{n}}}{d {(Ψ (P))}^{\otimes V_{n}}} (x)] = I^{(2)} (μ^{V_{n}} ‖ {(Ψ (P))}^{\otimes V_{n}}) . \end{array}

Otherwise, the relative entropy is infinite. Thus, if the relative entropy is finite,

{({(Ψ^{Z^{d}})}^{- 1} (μ))}^{V_{n}}

must possess a density, and this means that

μ^{V_{n}}

possesses a density, which we denote by

r : T^{V_{n}} \to ℝ

. We note that the density of

{(Ψ (P))}^{\otimes V_{n}}

is:

ρ^{V_{n}} (x) = {(2 π)}^{- \frac{{(2 n + 1)}^{d} b}{2}} \exp (- \frac{1}{2} \sum_{j \in V_{n}} {‖ x^{j} - c ‖}^{2}) .

(42)

Accordingly, we find that:

\begin{array}{l} I^{(2)} (μ^{V_{n}} ‖ {(Ψ (P))}^{\otimes V_{n}}) \\ = E^{μ^{V_{n}} (x)} [\log (\frac{r (x)}{ρ^{V_{n}} (x)})] \\ = I^{(2)} (μ^{V_{n}} ‖ P^{\otimes V_{n}}) - {(2 n + 1)}^{d} \, {}^{†} m^{μ} c + \frac{{(2 n + 1)}^{d}}{2} {‖ c ‖}^{2} . \end{array}
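The shift identity for the relative entropy can be checked in closed form when the measure is itself Gaussian. The sketch below treats a single site (so the factor (2n + 1)^d equals 1), takes P to be the standard Gaussian on ℝ^b as suggested by the density (42), and takes μ to be a Gaussian with diagonal covariance. These choices, the function name `kl_to_unit_gaussian` and all numerical values are assumptions made for illustration only.

```python
import math

def kl_to_unit_gaussian(mean, variances, center):
    """KL( N(mean, diag(variances)) || N(center, Id) ), using the closed form."""
    return 0.5 * sum(
        v + (m - a) ** 2 - 1.0 - math.log(v)
        for m, v, a in zip(mean, variances, center)
    )

m_mu = [0.4, -1.2, 2.0]    # hypothetical mean of mu
var_mu = [0.5, 2.0, 1.3]   # hypothetical per-coordinate variances of mu
c = [1.0, 0.25, -0.7]      # the shift c
zero = [0.0] * len(c)

# Left-hand side: entropy of mu relative to the shifted reference Psi(P) = N(c, Id).
lhs = kl_to_unit_gaussian(m_mu, var_mu, c)

# Right-hand side: entropy relative to P = N(0, Id), minus <m_mu, c>, plus |c|^2 / 2.
rhs = (
    kl_to_unit_gaussian(m_mu, var_mu, zero)
    - sum(m * ci for m, ci in zip(m_mu, c))
    + 0.5 * sum(ci * ci for ci in c)
)
```

The two quantities agree because the only c-dependent part of the closed form is ½‖m − c‖², and ½‖m − c‖² − ½‖m‖² = −⟨m, c⟩ + ½‖c‖².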

We divide by (2n + 1)^d and take n to infinity to obtain:

I^{(3)} ({(Ψ^{Z^{d}})}^{- 1} (μ)) = I^{(3)} (μ) - {}^{†} m^{μ} c + \frac{1}{2} {‖ c ‖}^{2} .

If

μ^{V_{n}}

does not possess a density for some n, then both sides of the above equation are infinite. It may be verified that the spectral density of ν is given by:

d \tilde{ν} (θ) = d \tilde{μ} (θ) + {(2 π)}^{d} δ (θ) (c^{†} c - m^{μ †} c - c^{†} m^{μ}) .

On substituting this into the expression in Theorem 1, we find that:

\begin{matrix} Γ_{2} (ν) = Γ_{2} (μ) + \frac{1}{2} tr (({Id}_{b} - \tilde{G} {(0)}^{- 1}) c^{†} c) - \frac{1}{2} tr (({Id}_{b} - \tilde{G} {(0)}^{- 1}) (m^{μ †} c + c^{†} m^{μ})) \\ = Γ_{2} (μ) + \frac{1}{2} \, {}^{†} c ({Id}_{b} - \tilde{G} {(0)}^{- 1}) c - {}^{†} c ({Id}_{b} - \tilde{G} {(0)}^{- 1}) m^{μ} . \end{matrix}

We have used the fact that

\tilde{G} (0)

is symmetric. We thus obtain (5). The minimum of the rate function remains unique because of the bijectivity of

(Ψ^{Z^{d}})

. □
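The passage from the trace expressions to the quadratic forms in the last display of the proof rests on two identities for a symmetric matrix A: tr(A c ᵗc) = ᵗc A c and tr(A (m ᵗc + c ᵗm)) = 2 ᵗc A m. The sketch below verifies them numerically. Reading c^† c and m^{μ †} c as b × b outer products is an interpretation adopted for this sketch, and the matrix `A` is a hypothetical symmetric stand-in for Id_b − G̃(0)^{-1}, not a value from the paper.

```python
def matvec(a, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def dot(u, v):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(u, v))

def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[x * y for y in v] for x in u]

def trace_of_product(a, m):
    """tr(a m) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * m[k][i] for i in range(n) for k in range(n))

# Hypothetical symmetric matrix standing in for Id_b - G~(0)^{-1}.
A = [[0.5, 0.2, -0.1],
     [0.2, -0.3, 0.4],
     [-0.1, 0.4, 0.9]]
c = [1.0, -2.0, 0.5]
m_mu = [0.3, 0.7, -1.1]

# tr(A c c^T) versus c^T A c.
quad_via_trace = trace_of_product(A, outer(c, c))
quad_direct = dot(c, matvec(A, c))

# tr(A (m c^T + c m^T)) versus 2 c^T A m, which requires A to be symmetric.
cross = [
    [p + q for p, q in zip(row1, row2)]
    for row1, row2 in zip(outer(m_mu, c), outer(c, m_mu))
]
cross_via_trace = trace_of_product(A, cross)
cross_direct = 2.0 * dot(c, matvec(A, m_mu))
```

The second identity fails for a non-symmetric A, which is why the symmetry of G̃(0) is invoked at this point of the proof.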

Acknowledgments

This work was partially supported by the European Union Seventh Framework Programme (FP7/2007-2013) under Grant agreement No. 269921 (BrainScaleS), No. 318723 (Mathemacs), and by the ERC advanced grant NerVi, No. 227747.

This work was supported by INRIA FRM, ERC-NERVI number 227747, European Union Project # FP7-269921 (BrainScales), and Mathemacs # FP7-ICT-2011.9.7

Author Contributions

Both authors contributed to all parts of the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

Baladron, J.; Fasoli, D.; Faugeras, O.; Touboul, J. Mean-field description and propagation of chaos in networks of Hodgkin-Huxley and Fitzhugh-Nagumo neurons. J. Math. Neurosci. 2012, 2.
Faugeras, O.; MacLaurin, J. Asymptotic description of neural networks with correlated synaptic weights; Rapport de recherche RR-8495; INRIA: Sophia Antipolis, France, March 2014.
Faugeras, O.; Touboul, J.; Cessac, B. A constructive mean field analysis of multi population neural networks with random synaptic weights and stochastic inputs. Front. Comput. Neurosci. 2009, 3.
Rolls, E.T.; Deco, G. The Noisy Brain: Stochastic Dynamics as a Principle of Brain Function; Oxford University Press: Oxford, UK, 2010.
Donsker, M.D.; Varadhan, S.R.S. Asymptotic evaluation of certain Markov process expectations for large time, IV. Comm. Pure Appl. Math. 1983, XXXVI, 183–212.
Chiyonobu, T.; Kusuoka, S. The large deviation principle for hypermixing processes. Probab. Theor. Relat. Field. 1988, 78, 627–649.
Bryc, W.; Dembo, A. Large deviations and strong mixing. In Annales de l’IHP Probabilités et statistiques; Elsevier: Amsterdam, The Netherlands, 1996; Volume 32, pp. 549–569.
Deuschel, J.D.; Stroock, D.W.; Zessin, H. Microcanonical distributions for lattice gases. Comm. Math. Phys. 1991, 139, 83–101.
Baxter, J.R.; Jain, N.C. An approximation condition for large deviations and some applications. In Convergence in Ergodic Theory and Probability; Bergelson, V., Ed.; Ohio State University, Mathematical Research Institute: Columbus, OH, USA, 1993.
Steinberg, Y.; Zeitouni, O. On tests for normality. IEEE Trans. Inf. Theory 1992, 38, 1779–1787.
Bryc, W.; Dembo, A. On large deviations of empirical measures for stationary Gaussian processes. Stoch. Process. Appl. 1995, 58, 23–34.
Faugeras, O.; MacLaurin, J. Asymptotic description of stochastic neural networks. I. Existence of a large deviation principle. Comptes Rendus Mathematiques 2014, 352(10).
Faugeras, O.; MacLaurin, J. Large deviations of a spatially-ergodic neural network with learning. 2014, arXiv:1404.0732.
Donsker, M.D.; Varadhan, S.R.S. Large deviations for stationary Gaussian processes. Commun. Math. Phys. 1985, 97, 187–210.
Shiryaev, A.N. Probability; Springer: Berlin, Germany, 1996.
Dembo, A.; Zeitouni, O. Large Deviations Techniques, 2nd ed.; Springer: Berlin, Germany, 1997.

© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
