An Edgeworth Expansion for the Ratio of Two Functionals of Gaussian Fields and Optimal Berry–Esseen Bounds

Abstract: This paper is concerned with the rate of convergence of the distribution of the sequence {F_n/G_n}, where F_n and G_n are each functionals of infinite-dimensional Gaussian fields. This form appears very frequently in the estimation of parameters occurring in stochastic differential equations (SDEs) and stochastic partial differential equations (SPDEs). We develop a new technique to compute the exact rate of convergence, in the Kolmogorov distance, for the normal approximation of F_n/G_n. As a tool for our work, an Edgeworth expansion for the distribution of F_n/G_n, with an explicitly expressed remainder, will be developed, and this remainder term will be controlled to obtain an optimal bound. As an application, we provide an optimal Berry–Esseen bound for the maximum likelihood estimator (MLE) of an unknown parameter appearing in SDEs and SPDEs.


Introduction
Let X = {X(h), h ∈ H} be an isonormal Gaussian process defined on a probability space (Ω, F, P), where H is a real separable Hilbert space, and let {F_n, n ≥ 1} be a sequence of random variables given by functionals of the infinite-dimensional Gaussian field associated with X. In recent years, there have been many efforts to find necessary and sufficient conditions for a normal approximation. The authors in [1] discovered a surprising central limit theorem, called the "fourth moment theorem", for sequences of random variables belonging to a fixed Wiener chaos associated with X. Afterwards, the authors in [2] obtained an upper bound on the distance between a random variable F_n and a standard Gaussian random variable Z by combining Malliavin calculus (see, e.g., [3][4][5][6]) with Stein's method for normal approximation (see, e.g., [7][8][9]). For instance, in the case of the Kolmogorov distance d_Kol, one obtains

d_Kol(F_n, Z) ≤ √(E[(1 − ⟨DF_n, −DL^{−1}F_n⟩_H)²]), (1)

where DF_n and L^{−1} denote the Malliavin derivative of F_n and the pseudo-inverse of the Ornstein–Uhlenbeck generator, respectively (see Section 2). In the particular case where F_n is an element of the qth Wiener chaos of X with E[F_n²] = 1, the upper bound (1) takes the form

d_Kol(F_n, Z) ≤ √((q − 1)/(3q)) √(E[F_n⁴] − 3), (2)

where E[F_n⁴] − 3 is just the fourth cumulant of F_n. Recall that the bound ϕ(F_n) is optimal for the sequence {F_n} with respect to some distance d if there exist constants 0 < c < C < ∞, independent of n, such that

c ϕ(F_n) ≤ d(F_n, Z) ≤ C ϕ(F_n) for all n ≥ 1. (3)

In [10], the authors prove that, whenever the distance d is defined as the supremum over a class of twice continuously differentiable functions with a bounded second derivative, an optimal rate of convergence is given by the sequence

ϕ(F_n) = max(|E[F_n³]|, E[F_n⁴] − 3) (4)

in the case where {F_n} is a sequence of random variables belonging to a fixed Wiener chaos.
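Although the paper's setting is infinite-dimensional, the fourth moment bound (2) can be illustrated numerically in the simplest second-chaos example F_n = (2n)^{−1/2} Σ_{i=1}^n (Z_i² − 1) with i.i.d. standard Gaussians Z_i, for which E[F_n⁴] − 3 = 12/n. The following self-contained Python sketch (not part of the paper; all names and parameter choices are illustrative) estimates the fourth moment and the Kolmogorov distance by Monte Carlo:

```python
import math
import random

def simulate_F(n, reps, rng):
    """Draw `reps` samples of F_n = (2n)^(-1/2) * sum_i (Z_i^2 - 1),
    a unit-variance element of the 2nd Wiener chaos."""
    out = []
    for _ in range(reps):
        s = sum(rng.gauss(0.0, 1.0) ** 2 - 1.0 for _ in range(n))
        out.append(s / math.sqrt(2.0 * n))
    return out

def Phi(z):
    """Standard Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kolmogorov_distance(samples):
    """sup_z |empirical CDF - Phi(z)|, evaluated at the jump points."""
    xs = sorted(samples)
    N = len(xs)
    return max(max((i + 1) / N - Phi(x), Phi(x) - i / N)
               for i, x in enumerate(xs))

rng = random.Random(1)
n, reps = 100, 20000
F = simulate_F(n, reps, rng)
m4 = sum(f ** 4 for f in F) / reps   # close to 3 + 12/n = 3.12
dk = kolmogorov_distance(F)          # small, of order sqrt(12/n)
print(m4, dk)
```

For larger n both the excess fourth moment and the empirical Kolmogorov distance shrink together, consistent with (2).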
Subsequent work in [11] showed that the optimal rate in the total variation distance coincides exactly with the rate obtained for the distance based on smooth test functions. When we consider the statistical estimation of parameters appearing in SDEs or SPDEs, we quite often encounter statistics of the form F_n/G_n. In these circumstances, the techniques used to derive an upper bound such as (1) cannot be applied directly. The authors of [12] developed a technique to obtain upper and lower rates of convergence in the Kolmogorov distance for a sequence {F_n/G_n}. Using both rates, they obtained an optimal bound on the rate of convergence of the normal approximation for the MLE of a parameter occurring in an SPDE. However, the paper [12] does not provide a complete answer concerning optimal rates of convergence in the Kolmogorov distance, because the rates in the upper and lower bounds derived in [12] may not match each other. The aim of the present work is to give a complete answer for the optimal rate. To this end, a set of sufficient conditions will be derived, ensuring that the upper and lower bounds yield an optimal rate in the Kolmogorov distance. Our methodology for obtaining the optimal bound is based on an Edgeworth expansion of P(F_n/G_n ≤ z), derived from the expansion of P(F_n ≤ z) in [13] (for Edgeworth expansions, see, e.g., Chapter 2 in [14] and Chapter 5 in [15]). In that paper, the Edgeworth expansion of general order J is expressed through quantities g_s(F_n) given in terms of the cumulants of F_n, the Hermite polynomials H_s(z), and the probability density function φ of the standard Gaussian distribution. The newly developed techniques apply to estimation problems for parameters occurring in SDEs or SPDEs more easily and more broadly than existing ones.
Moreover, a set of sufficient conditions derived in this work is satisfied in most of the estimation problems for parameters in SDEs and SPDEs.
The rest of the paper is organized as follows. Section 2 reviews basic notation and results from Malliavin calculus and Stein's method. In Section 3, the Edgeworth expansion of general order will be derived. Using this expansion, an optimal bound in the Kolmogorov distance will be obtained in Section 4. Finally, as an application of our main results, in Section 5 we provide an optimal Berry–Esseen bound for the estimators of parameters in SDEs and SPDEs.
Throughout this paper, c (or C) stands for an absolute constant with possibly different values in different places.

Malliavin Calculus
In this section, we recall some basic facts about Malliavin calculus for Gaussian processes. The reader is referred to [5,6] for a more detailed explanation. Suppose that H is a real separable Hilbert space with scalar product denoted by ⟨·,·⟩_H. Let B = {B(h), h ∈ H} be an isonormal Gaussian process, that is, a centered Gaussian family of random variables such that E[B(h)B(g)] = ⟨h, g⟩_H. For every n ≥ 1, let H_n be the nth Wiener chaos of B, that is, the closed linear subspace of L²(Ω) generated by {H_n(B(h)) : h ∈ H, ‖h‖_H = 1}, where H_n is the nth Hermite polynomial. We define a linear isometric mapping I_n : H^⊙n → H_n by I_n(h^⊗n) = n!H_n(B(h)), where H^⊙n denotes the nth symmetric tensor product of H. It is well known that any square integrable random variable F ∈ L²(Ω, G, P) (G denotes the σ-field generated by B) can be expanded into a series of multiple stochastic integrals,

F = Σ_{q=0}^∞ I_q(f_q),

where f_0 = E[F], the series converges in L², and the functions f_q ∈ H^⊙q are uniquely determined by F. Let {e_l, l ≥ 1} be a complete orthonormal system in H. If f ∈ H^⊙p and g ∈ H^⊙q, the contraction f ⊗_r g, 1 ≤ r ≤ p ∧ q, is the element of H^⊗(p+q−2r) defined by

f ⊗_r g = Σ_{l_1,…,l_r=1}^∞ ⟨f, e_{l_1} ⊗ ⋯ ⊗ e_{l_r}⟩_{H^⊗r} ⊗ ⟨g, e_{l_1} ⊗ ⋯ ⊗ e_{l_r}⟩_{H^⊗r}.
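The orthogonality relation underlying the chaos decomposition, E[H_p(B(h)) H_q(B(g))] = δ_{pq} p! ⟨h, g⟩_H^p for unit vectors h, g, can be checked numerically: (B(h), B(g)) is simply a Gaussian pair with correlation ρ = ⟨h, g⟩_H. The Python sketch below (illustrative only; it uses the standard recursion H_{k+1}(x) = x H_k(x) − k H_{k−1}(x)) verifies the cases p = q = 2 and p = 1, q = 2 with ρ = 0.6 by Monte Carlo:

```python
import math
import random

def hermite(n, x):
    """Probabilists' Hermite polynomial H_n(x) via the recursion
    H_{k+1}(x) = x H_k(x) - k H_{k-1}(x)."""
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

# (B(h), B(g)) for unit vectors h, g is a Gaussian pair with
# correlation rho = <h, g>_H; simulate it directly.
rng = random.Random(7)
rho, reps = 0.6, 400000
acc22 = acc12 = 0.0
for _ in range(reps):
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    acc22 += hermite(2, z1) * hermite(2, z2)
    acc12 += hermite(1, z1) * hermite(2, z2)
m22 = acc22 / reps   # expect 2! * rho^2 = 0.72
m12 = acc12 / reps   # expect 0: distinct chaoses are orthogonal
print(m22, m12)
```

The vanishing cross term is the orthogonality of different Wiener chaoses used throughout the paper.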
The following formula for the product of the multiple stochastic integrals will be frequently used.
Proposition 1. Let f ∈ H^⊙p and g ∈ H^⊙q be two symmetric functions. Then

I_p(f) I_q(g) = Σ_{r=0}^{p∧q} r! (p choose r)(q choose r) I_{p+q−2r}(f ⊗_r g).

Let S be the class of smooth and cylindrical random variables F of the form

F = f(B(ϕ_1), …, B(ϕ_n)),

where n ≥ 1, f ∈ C_b^∞(R^n) and ϕ_i ∈ H, i = 1, …, n. The Malliavin derivative of F with respect to B is the element of L²(Ω, H) defined by

DF = Σ_{i=1}^n (∂f/∂x_i)(B(ϕ_1), …, B(ϕ_n)) ϕ_i.

We denote by D^{l,p} the closure of the class of smooth random variables with respect to the norm

‖F‖_{l,p}^p = E[|F|^p] + Σ_{i=1}^l E[‖D^i F‖_{H^⊗i}^p].

We denote by δ the adjoint of the operator D, also called the divergence operator. The domain of δ, denoted by Dom(δ), is the set of elements u ∈ L²(Ω; H) such that |E[⟨DF, u⟩_H]| ≤ C(E[|F|²])^{1/2} for all F ∈ D^{1,2}.
If u ∈ Dom(δ), then δ(u) is the element of L²(Ω) defined by the duality relationship

E[F δ(u)] = E[⟨DF, u⟩_H] for every F ∈ D^{1,2}.

Let F ∈ L²(Ω) be a square integrable random variable. The operator L is defined through the projection operators J_n, n = 0, 1, 2, …, as LF = Σ_{n=0}^∞ −n J_n F, and is called the infinitesimal generator of the Ornstein–Uhlenbeck semigroup. The relationship between the operators D, δ, and L is given as follows: δDF = −LF; that is, for F ∈ L²(Ω), the statement F ∈ Dom(L) is equivalent to F ∈ Dom(δD) (i.e., F ∈ D^{1,2} and DF ∈ Dom(δ)), and in this case δDF = −LF. We also define the operator L^{−1}, the pseudo-inverse of L, as L^{−1}F = Σ_{n=1}^∞ −(1/n) J_n(F). Note that L^{−1} takes values in D^{2,2} and LL^{−1}F = F − E[F] for all F ∈ L²(Ω).
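In the one-dimensional case H = R, the eigenrelation LH_n = −nH_n behind the definition LF = Σ −n J_n F can be verified exactly, since L acts on polynomials as Lf(x) = f''(x) − x f'(x). The following Python sketch (illustrative only; polynomials are stored as ascending coefficient lists) checks this for the first few Hermite polynomials:

```python
def hermite_coeffs(n):
    """Ascending coefficient list of the probabilists' Hermite
    polynomial H_n, via H_{k+1} = x H_k - k H_{k-1}."""
    h0, h1 = [1.0], [0.0, 1.0]
    if n == 0:
        return h0
    for k in range(1, n):
        shifted = [0.0] + h1                                   # x * H_k
        prev = [-k * c for c in h0] + [0.0] * (len(shifted) - len(h0))
        h0, h1 = h1, [a + b for a, b in zip(shifted, prev)]
    return h1

def ou_generator(p):
    """Apply L f = f'' - x f' to a polynomial given by its ascending
    coefficient list."""
    d1 = [i * c for i, c in enumerate(p)][1:] or [0.0]         # f'
    d2 = [i * c for i, c in enumerate(d1)][1:] or [0.0]        # f''
    xd1 = [0.0] + d1                                           # x * f'
    m = max(len(d2), len(xd1))
    d2 = d2 + [0.0] * (m - len(d2))
    xd1 = xd1 + [0.0] * (m - len(xd1))
    return [a - b for a, b in zip(d2, xd1)]

# L H_n = -n H_n: e.g. H_3(x) = x^3 - 3x has coefficients [0, -3, 0, 1].
print(ou_generator(hermite_coeffs(3)))   # → [0.0, 9.0, 0.0, -3.0] = -3 * H_3
```

Accordingly L^{−1}H_n = −(1/n)H_n for n ≥ 1, and LL^{−1} removes only the mean term n = 0.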

Stein's Method
In this paper, we focus only on the normal approximation of random variables with respect to the Kolmogorov distance, defined as

d_Kol(F, Z) = sup_{z∈R} |P(F ≤ z) − P(Z ≤ z)|,

where Z is a standard Gaussian random variable. For fixed z ∈ R, we consider the Stein equation

f'(w) − w f(w) = 1_{(−∞,z]}(w) − Φ(z), (9)

where Φ(z) = P(Z ≤ z). It is well known (see, e.g., [7]) that for every fixed z ∈ R, the function

f_z(w) = e^{w²/2} ∫_{−∞}^{w} (1_{(−∞,z]}(x) − Φ(z)) e^{−x²/2} dx

is a solution to the Stein Equation (9) satisfying ‖f_z‖_∞ ≤ √(2π)/4 and ‖f'_z‖_∞ ≤ 1. For a given test function h, we denote by U_Z h the corresponding solution of the Stein equation, and write (U_Z 1_{(−∞,z]})(w) = f_z(w). The next lemma gives some properties of the function U_Z h (see, e.g., [5]). We shall denote by H_q the qth Hermite polynomial, defined by H_0(x) = 1 and, for q ≥ 1, H_q(x) = (−1)^q e^{x²/2} (d^q/dx^q) e^{−x²/2}.
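The solution f_z admits the closed form f_z(w) = √(2π) e^{w²/2} Φ(w)(1 − Φ(z)) for w ≤ z and f_z(w) = √(2π) e^{w²/2} Φ(z)(1 − Φ(w)) for w > z, which makes the stated properties easy to check numerically. The Python sketch below (illustrative only) verifies the Stein equation by a central finite difference at a few points away from the discontinuity at w = z, and checks the uniform bound on a grid:

```python
import math

def Phi(x):
    """Standard Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def f_z(z, w):
    """Closed-form solution of the Stein equation
    f'(w) - w f(w) = 1_{(-inf, z]}(w) - Phi(z)."""
    c = math.sqrt(2.0 * math.pi) * math.exp(w * w / 2.0)
    if w <= z:
        return c * Phi(w) * (1.0 - Phi(z))
    return c * Phi(z) * (1.0 - Phi(w))

# Check the Stein equation by a central finite difference,
# away from the discontinuity of the right-hand side at w = z.
z, h = 0.5, 1e-6
errs = []
for w in (-1.3, 0.0, 1.7):
    deriv = (f_z(z, w + h) - f_z(z, w - h)) / (2.0 * h)
    lhs = deriv - w * f_z(z, w)
    rhs = (1.0 if w <= z else 0.0) - Phi(z)
    errs.append(abs(lhs - rhs))

# Check ||f_z||_inf <= sqrt(2*pi)/4 on a grid (for z = 0 the bound
# is attained at w = 0).
fmax = max(abs(f_z(0.0, -4.0 + i / 100.0)) for i in range(801))
print(errs, fmax)
```

The grid maximum sits exactly at the stated constant √(2π)/4 ≈ 0.6267, which is why this bound cannot be improved.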

Edgeworth Expansion
Now we derive the Jth-order Edgeworth expansion of P(F/G ≤ z) for z ∈ R. We begin with a simple result, given in Lemma 2.3 in [10].
Before stating the Edgeworth expansion, we introduce some notation, including the Gamma operators Γ_j (see [16] for a more detailed explanation). For F, G ∈ D^{k,2}_k, we define two sets D(r, k) and D*_{α,β}(r, k), for r = 1, 2, … and k = 1, 2, …, and for α = −1, 1, 2 and β = 1, 2, respectively. For the sake of simplicity, we write Ḡ = G − E[G], and we sometimes abbreviate the argument F in the operators Γ̄_1(F)^m for m = 0, 1 and Γ*_j(F) for j = 2, …. We then define the quantities g_s for s ≥ 2, where ⌈·⌉ denotes the ceiling function. By following the proof of Theorem 2 in [13], we obtain an Edgeworth expansion of F/G.
Moreover, assume that F and G each have an absolutely continuous law with respect to the Lebesgue measure. Then the expansion (15) holds for every z ∈ R, with an explicitly given remainder term. Here, if J is odd, then D(J + 1, (J − 1)/2) = ∅, and in this case the first term in the remainder vanishes. Proof. Since G > 0 a.s., we have (17). For fixed z ∈ R, obviously A_z ∈ D^{1,2} with E[A_z] = 0, and A_z has an absolutely continuous law with respect to the Lebesgue measure. By Stein's equation, we deduce from (17) that (18) holds for every z ∈ R. Applying the iterated method from the proof of Theorem 2 in [13] to the expectation in (18), we obtain the claimed expansion. Hence the proof of this theorem is completed.
Before stating and proving the main results in this section, we need to recall multi-index notation. A multi-index is a vector b = (b_1, …, b_m) of non-negative integers. By convention, we set 0^0 = 1. For m ≥ 1, we consider a vector of real-valued random variables X = (X_1, …, X_m). The cumulant of the vector X with index b is defined as

κ_b(X) = (−i)^{|b|} (∂^{|b|}/∂t^b) log E[e^{i⟨t,X⟩}] |_{t=0},

and the joint cumulant of order m of X is defined as

κ(X_1, …, X_m) = (−i)^m (∂^m/∂t_1 ⋯ ∂t_m) log E[e^{i⟨t,X⟩}] |_{t=0}.
Next, we obtain the Jth-order Edgeworth expansion, J = 1, 2, by expressing g_s, s = 1, 2, in terms of the cumulants of (F, Ḡ). The next result contains the relation between the cumulants and the moments associated with X^[m] (see, e.g., [17]).
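The moment–cumulant relation used here is the classical partition formula κ(X_1, …, X_m) = Σ_π (−1)^{|π|−1}(|π|−1)! Π_{B∈π} E[Π_{i∈B} X_i], the sum running over all set partitions π of {1, …, m}. As a numerical sanity check (illustrative only, not part of the paper), the Python sketch below evaluates third-order joint cumulants of the second-chaos variable F = Z² − 1 and of the pair (F, Z), for which κ(F, F, F) = 8 and κ(F, Z, Z) = 2:

```python
import math
import random

def partitions(items):
    """Yield all set partitions of `items` (a list)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def joint_cumulant(samples):
    """kappa(X_1, ..., X_m) via the moment-partition formula
    sum_pi (-1)^{|pi|-1} (|pi|-1)! prod_{B in pi} E[prod_{i in B} X_i],
    with expectations replaced by sample means over `samples` (tuples)."""
    m = len(samples[0])
    n = len(samples)
    def moment(block):
        return sum(math.prod(s[i] for i in block) for s in samples) / n
    total = 0.0
    for part in partitions(list(range(m))):
        k = len(part)
        total += (-1) ** (k - 1) * math.factorial(k - 1) * \
            math.prod(moment(b) for b in part)
    return total

rng = random.Random(3)
data = []
for _ in range(100000):
    z = rng.gauss(0.0, 1.0)
    data.append((z * z - 1.0, z))
k3 = joint_cumulant([(x, x, x) for x, _ in data])    # kappa_3(Z^2 - 1) = 8
kxyy = joint_cumulant([(x, y, y) for x, y in data])  # kappa(F, Z, Z) = 2
print(k3, kxyy)
```

Both values follow from E[Z⁴] = 3 and E[Z⁶] = 15, e.g. κ_3(Z² − 1) = E[(Z² − 1)³] = 15 − 9 + 3 − 1 = 8.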
The following result can be obtained from Lemma 6 in [13].

Lemma 4.
Let F and G be random variables with F, G ∈ D^{3,2}_3 and E[F] = 0. Then g_s, s = 1, 2, can be expressed in terms of the joint cumulants of (F, Ḡ). It is immediate from Theorem 2 and Lemma 4 that the expansions with an explicit remainder can be given in the cases J = 1, 2. Let us set X = (F, Ḡ), e_1 = (1, 0) and e_2 = (0, 1).

Corollary 1.
Under the assumptions in Theorem 2, we have (a) for J = 1, the expansion (23), with the remainder term R^(2)_z(A_z) given by (24); and (b) for J = 2, the expansion (25), with the remainder term given by (26). Proof. By the homogeneity and multilinearity of the joint cumulant, we get (28). Using the formula (27) then yields (29). By Theorem 2 and Lemma 4, together with (28) and (29), we obtain the expansions (23) and (25).
where the function f_{σ,k} is obtained by identifying x_i and x_j in the argument of f_1 ⊗_0 ⋯ ⊗_0 f_k if and only if i ∼_σ j (this notation means that i and j belong to the same block of σ).

Corollary 2.
Under the assumptions in Theorem 2, we have that (a) for J = 1, the expansion holds with the remainder term R^(2)_z(A_z) given by (24); and (b) for J = 2, the expansion holds with the remainder term R_z(A_z) given by (26).

Optimal Berry-Esseen Bound
In this section, the main result of this paper will be obtained. Theorem 5 below is an interesting result in itself because it is, in some respects, an extension of the optimal bound (4) in the Introduction. We consider sequences F_n = I_q(f_n) and Ḡ_n = I_q(g_n), n ≥ 1, living in the qth Wiener chaos and such that each F_n has unit variance. In this case, we derive the optimal Berry–Esseen bound by using the two-term Edgeworth expansion given in Corollary 1. The following result provides some bounds on the Gamma operators Γ_j.
Proof. By using Proposition 4.3 in [10], the results follow immediately.
By controlling the remainder term in the expansion (15) of Theorem 2, upper and lower bounds on the Kolmogorov distance between F_n/G_n and Z will be obtained.
Theorem 3 (Upper bounds). Let F_n = I_q(f_n) and Ḡ_n = I_q(g_n) for q ≥ 2, where Ḡ_n = G_n − E[G_n], and let X_n = (F_n, Ḡ_n) with E[G_n] = κ_{2e_1}(X_n) < ∞. Assume that there exists a positive sequence {a_n} such that a_n → ∞ and that conditions (i)–(iii) hold. Then there exists a constant C > 0 such that the stated upper bound holds.
Proof. Obviously, it is sufficient to consider z ≥ 0. From (23) of Corollary 1, one can deduce (39). Now the remainder term R^(2)_z(A_{z,n}) in (39) will be estimated. By using Lemma 1 and Proposition 2, the first expectation in (24) can be bounded as in (40); similarly, the second term in the remainder (24) can be estimated as in (41). Combining (40) and (41), one sees that the right-hand side of (39) is bounded as in (42). From (42), it follows that (43) holds for sufficiently large n. On the other hand, it follows from (43) that the difference of interest is at most

P(F_n/G_n ≤ a_n) − P(Z ≤ a_n) + 2Φ̄(a_n).

Combining (43) and (44) completes the proof of this theorem.
Theorem 4 (Lower bounds). Let F_n = I_q(f_n) and Ḡ_n = I_q(g_n) for q ≥ 2, where Ḡ_n = G_n − E[G_n] and E[G_n] = κ_{2e_1}(X_n), and let X_n = (F_n, Ḡ_n). Suppose that the conditions stated below hold. Then there exists a constant C > 0 such that the lower bound (48) holds.
Proof. Since H_2(z) = 0 for z = 1 and z = −1, we obtain (49) and (50) from (25). Subtracting (50) from (49) yields (51). By using Proposition 2, together with (51), we obtain (52). On the other hand, since H_1(z) = 0 for z = 0, one immediately sees, from the case J = 2 of Corollary 1, the identity (53). By an estimate similar to that for (52), we obtain (54). Combining (52) and (54) yields the lower bound (48).
By using Theorems 3 and 4, we get an optimal bound for the normal approximation of F_n/G_n.
Theorem 5 (Optimal bound). Let F_n = I_q(f_n) and Ḡ_n = I_q(g_n) for q ≥ 2, where Ḡ_n = G_n − E[G_n] and E[G_n] = κ_{2e_1}(X_n), and let X_n = (F_n, Ḡ_n). Suppose that the assumptions (i), (ii) and (iii) in Theorems 3 and 4 are satisfied, and moreover that the two additional conditions stated below hold. Then there exist constants 0 < c < C < ∞ such that the corresponding two-sided bound holds for all n.

Applications
In this section, we apply Theorem 5 to the estimation problem of parameters appearing in SDEs and SPDEs considered in previous works [12,19,20].

Stochastic Partial Differential Equation
In [21], the authors investigate asymptotic properties of the MLE for parameters occurring in parabolic SPDEs associated with the operator A_θ ≡ θA_1 + A_0 and driven by a cylindrical Brownian motion W, where A_0 and A_1 are partial differential operators of orders m_0 and m_1, so that {A_θ : θ ∈ Θ} are operators of order 2m = max(m_1, m_0). Problem (58)–(60) is understood in the sense of distributions. Some basic assumptions and definitions are described in Section 2 of the paper [21]; for the sake of completeness, we briefly review them here. The main assumption is that Equation (58) is diagonalizable: a complete orthonormal (in L²(G)) system {h_i}_{i=1}^∞ is a common system of eigenfunctions of the operators A_0 and A_1. For s < 0, H^s is the closure of L²(G) in the norm ‖u‖_s, defined for s < 0 by the same formula as for s > 0. For s ∈ R, H^s is a Hilbert space with respect to the inner product ⟨·,·⟩_s associated with the norm ‖u‖_s, and the functions h^s_i = λ_i^{−s} h_i, i = 1, 2, …, form an orthonormal basis in H^s.
The series u(t, x) = Σ_{i≥1} u^θ_i(t) h_i(x) is the unique solution of problem (58)–(60), where each Fourier coefficient u^θ_i(t) solves a one-dimensional Ornstein–Uhlenbeck equation, and W_i(t), i = 1, 2, …, are independent one-dimensional Wiener processes.
The projection u^N of the solution u(t, x) onto the subspace spanned by {h^{−α}_1, …, h^{−α}_N} is given by the corresponding truncated series. Let B^N_T be the Borel σ-algebra on C([0, T]; R^N), and let P^N_θ be the measure on B^N_T generated by u^{N,θ}. The MLE θ̂_N corresponds to the likelihood ratio dP^N_θ/dP^N_{θ_0}(u^N). Let N be a uniformly elliptic operator of order m with a given principal symbol. In [21], the authors prove asymptotic normality of the MLE θ̂_N.
Theorem 6 (Huebner and Rozovskii). Assume that m_1 ≥ m − d/2 and set τ = (m_1 − m)/d + 1/2. Then the suitably normalized estimation error converges in distribution to a Gaussian random variable with zero mean and unit variance as N ↑ ∞.
Obviously, the Fisher information I_N(θ) is given by (65). Since u_0 ∈ L²(G), we have ⟨u_0, h_i⟩²_0 = λ_i^{2α} ⟨u_0, h^{−α}_i⟩²_{−α} = λ_i^{2α} u_i(0)² → 0 as i → ∞, so the second sum in (65) is the main term of I_N(θ). This implies that we may assume u_0 = 0 in finding the rate of convergence of the distribution of I_N(θ)(θ̂_N − θ_0). Define the corresponding kernels; by Lemma 1 in [12], we can write the normalized estimation error as a ratio of multiple stochastic integrals F_N and Ḡ_N, and we then write X_N = (F_N, Ḡ_N).
Now we derive an optimal Berry-Esseen bound for the MLEθ N .
If we take a_N = N^{k(2m+d)/(2d)} with k < 15/16, the right-hand sides of (75) and (77) meet conditions (ii) and (iii) in Theorem 5. Moreover, the inequality above proves that condition (37), sup_N Φ̄(a_N) M_N < ∞, is satisfied. Thus the proof of this theorem is completed.

Remark 1.
In Theorem 7 of the paper [12], the authors obtain an optimal bound (73) under the assumption m_1 ≥ 3m/2 − d/4. However, in this paper, we only assume that m_1 ≥ 4m/3 − d/3. Note that 4m/3 − d/3 ≤ 3m/2 − d/4 for all m, d > 0, so the present assumption is weaker. This fact shows that the result given in this paper is an extension of the previous result.

Ornstein-Uhlenbeck Process
In this section, we find an optimal rate of convergence of the distribution of the MLE θ̂_T of the unknown parameter θ ∈ Θ ⊆ R_+ based on the observation X = {X_t, 0 ≤ t ≤ T} given by

dX_t = −θX_t dt + dW_t, X_0 = 0, 0 ≤ t ≤ T,

where {W_t, t ≥ 0} is a standard Brownian motion. Define the kernels f_2(s, t) and g_2(s, t) for s, t ∈ [0, T], with f_2(s, t) = e^{−θ|t−s|}. When the process {X_t, 0 ≤ t ≤ T} is observed, we can write the normalized estimation error as a ratio, where F_T and Ḡ_T are double stochastic integrals and the normalization involves the factor (1 − e^{−2θT}).
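This example can also be simulated directly: the MLE has the explicit form θ̂_T = −∫₀ᵀ X_t dX_t / ∫₀ᵀ X_t² dt, and √T(θ̂_T − θ) is asymptotically N(0, 2θ). The following Python sketch (illustrative only; the step size, horizon, and replication count are arbitrary choices, and the Euler–Maruyama discretization is crude) checks both facts numerically:

```python
import math
import random

def ou_mle(theta, T, dt, rng):
    """Euler-Maruyama path of dX = -theta*X dt + dW on [0, T] and the
    discretized MLE  theta_hat = -sum x*(dx) / sum x^2*dt."""
    n = int(T / dt)
    x = 0.0
    num = 0.0   # approximates -int_0^T X dX
    den = 0.0   # approximates  int_0^T X^2 dt
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        x_new = x - theta * x * dt + dw
        num -= x * (x_new - x)
        den += x * x * dt
        x = x_new
    return num / den

rng = random.Random(11)
theta, T, dt, reps = 1.0, 50.0, 0.01, 300
est = [ou_mle(theta, T, dt, rng) for _ in range(reps)]
mean = sum(est) / reps
var = sum((e - mean) ** 2 for e in est) / (reps - 1)
print(mean, T * var)   # mean near theta; T * var near 2 * theta
```

The O(1/T) bias visible for moderate T is exactly the kind of higher-order effect that the Edgeworth expansion of Sections 3 and 4 captures.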