Rates of Convergence in Laplace’s Integrals and Sums and Conditional Central Limit Theorems

Abstract: We obtain exact estimates for the error terms in Laplace's integrals and sums, which imply corresponding estimates for the related laws of large numbers and central limit theorems, including large deviations approximations.


Introduction
The Laplace integrals find applications in numerous problems of mathematics and applied science, and the literature on these integrals is abundant. For example, let us mention the applications in statistical physics, see e.g., [1] or Lecture 5 in [2], in pattern analysis [3], in large deviation theory [4][5][6], where the method is sometimes referred to as the Laplace-Varadhan method, in the analysis of Weibullian chaos [7], in the asymptotic methods for large excursion probabilities [8], in the asymptotic analysis of stochastic processes [9], and in the calculation of tunneling effects in quantum mechanics and quantum field theory, see [10,11]. It can be used to essentially simplify Maslov's derivation of the Gibbs, Bose-Einstein and Pareto distributions [12]. An infinite-dimensional version and a non-commutative version of the Laplace approximation were developed recently in [13,14], respectively.
The majority of research on this topic is devoted to asymptotic expansions, or even, following Varadhan's general approach to large deviations, just to the logarithmic asymptotics, see also [15]. In the present paper, following the recent trend of searching for the best constants in the error terms of central-limit-type results, see [16] and references therein, we are interested in exact estimates for the main error term of the Laplace approximation. This approach to Laplace integrals was initiated by the author in the book [9] (Appendix B), where the stress was on integrals with a complex phase. Here we aim at making these asymptotics more precise for a real phase, including the most general case where both the exponent and the pre-exponential term in the integral depend on the parameter (which is crucial for the applications to conditional laws of large numbers that we have in mind here), and stressing two new applications: to sums instead of integrals (Laplace-Varadhan asymptotics) and to the conditional laws of large numbers (LLN) and central limit theorems (CLT) of large deviations.
The content of the paper is as follows. In Section 2 we obtain the estimates for the error term in the Laplace approximation with the minimum of the phase in the interior of the domain of integration, improving slightly on the estimates from [9], and in Section 3 we derive the resulting LLN and CLT. In Sections 4 and 5 the same program is carried out for the case of phase minima occurring on the boundary of the domain. In Section 6 we derive the analogous results for the case of sums, rather than integrals. In Section 7 we show how our results can be applied to the conditional LLN and CLT of large deviations.

Phase Minimum Inside the Domain of Integration
Here we present the estimates of the remainder in the asymptotic formula for the Laplace integrals with the critical point of the phase lying in the interior of the domain of integration, adapting and streamlining the arguments of [9].
Consider the integral
I(N) = ∫_Ω f(x, N) exp{−N S(x, N)} dx, (1)
where Ω is an open bounded subset of the Euclidean space R^d, equipped with the Euclidean norm |.| and with Euclidean volume |Ω|, and the amplitude f and the phase S are continuous real functions of x ∈ Ω, N ≥ N_0.

Remark 1.
The assumption that Ω is bounded is not essential, but simplifies explicit estimates for the error terms. One should think of Ω as a bounded subset of the full domain of integration containing all minimum points of S(., N). If f is integrable outside Ω, the integral of f (x, N) exp{−NS(x, N)} over R d \ Ω will be exponentially small as compared with Equation (1).
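As an illustration (not part of the paper's argument), the leading-order Laplace approximation and its error can be checked numerically in the simplest one-dimensional setting. The amplitude f(x) = cos x, the phase S(x) = x²/2 (so the minimum point is x_0 = 0 with S′′(0) = 1) and the quadrature routine are our own toy choices; the sketch compares a brute-force evaluation of Equation (1) with the leading term f(x_0)√(2π/N):

```python
import math

def laplace_integral(f, S, N, a=-1.0, b=1.0, n=200_000):
    """Midpoint-rule quadrature of I(N) = int_a^b f(x) exp(-N S(x)) dx."""
    h = (b - a) / n
    return h * sum(
        f(a + (k + 0.5) * h) * math.exp(-N * S(a + (k + 0.5) * h))
        for k in range(n)
    )

f = math.cos                 # amplitude, f(0) = 1
S = lambda x: 0.5 * x * x    # phase with interior minimum at x0 = 0

rel_err = {}
for N in (10, 100, 1000):
    leading = f(0.0) * math.sqrt(2 * math.pi / N)  # f(x0) sqrt(2 pi / (N S''(x0)))
    rel_err[N] = abs(laplace_integral(f, S, N) / leading - 1)
    print(N, rel_err[N])
```

For this even phase the observed relative error decays like 1/(2N), faster than a general O(1/√N) bound, since the odd third-order term of the Taylor expansion integrates to zero.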
Recall that the kth order derivative of a real function φ on R^d can be viewed as a multi-linear map. The second derivative will be written as usual in matrix form. We shall denote by ‖φ^(k)(x)‖ the corresponding norm, defined as the lowest constant for which the estimate |φ^(k)(x)[ξ_1, ..., ξ_k]| ≤ ‖φ^(k)(x)‖ |ξ_1| ⋯ |ξ_k| holds for all ξ_1, ..., ξ_k ∈ R^d.

Remark 2.
This is the standard way to define norms of multi-linear mappings, see e.g., [17]. However, as all norms on finite-dimensional spaces are equivalent, the choice of a norm is not very essential here.
Let us now make the following assumptions on the functions f and S:
(C1) f(x, N) is a continuously differentiable function of x with ‖f(., N)‖ ≤ f_0 and ‖f′(., N)‖ ≤ f_1;
(C2) S(x, N) is a thrice continuously differentiable function in x such that, for all x ∈ Ω, N ≥ N_0, ξ ∈ R^d,
Λ_m |ξ|² ≤ (S′′(x, N)ξ, ξ) ≤ Λ_M |ξ|², ‖S′′′(x, N)‖ ≤ S_3,
with positive constants Λ_m, Λ_M, S_3; the first condition can be concisely written as Λ_m ≤ S′′(x, N) ≤ Λ_M, where the usual ordering on symmetric matrices is used;
(C3) For any N ≥ N_0 there exists a unique point x(N) of global minimum of S(., N) in Ω, and the ball
U(N) = {x : |x − x(N)| ≤ N^{−1/3}} (2)
is contained in Ω.
Let us denote by D_N the matrix of the second derivatives of S at x(N), that is, D_N = S′′(x(N), N). Notice that from the convexity of S in Ω and Assumption (C3) it follows that S(x, N) − S(x(N), N) ≥ Λ_m |x − x(N)|²/2 for all x ∈ Ω. Our approach to the study of the Laplace integral I(N) is based on its decomposition I(N) = I′(N) + I′′(N) into the integral I′(N) over U(N) and the integral I′′(N) over Ω \ U(N).

Remark 3.
In the proof below one can use U(N) = {x : |x − x(N)| < N^{−κ}} instead of Equation (2) with any 1/3 ≤ κ < 1/2, the lower bound on κ coming from the estimate of I_1 below, and the upper bound from the estimate of I_3 below.

Proposition 1.
Under Assumptions (C1)-(C3), where ω(N) is a bounded function depending on Λ_m, f_0, f_1, S_3, d, and ω_exp(N) is exponentially small compared to the main term. Explicitly, Proof. From the Taylor formula for functions on R^d, Consequently, for x ∈ ∂U(N) we have by Assumption (C2) that It follows then from Equation (4) that so that To go further we shall need the Taylor expansion of S up to the third order. Namely, from Equation (9) we deduce the expansion where, due to the equation Turning to I′(N), we further decompose it into the four integrals with It follows from Equation (14) that, for x ∈ U(N), N|σ(x, N)| ≤ S_3/6. Using Equation (14) again and the trivial estimate |e^t − 1| ≤ |t|e^{|t|}, we conclude that, for x ∈ U(N), Consequently, we deduce that Next, or, using Equation (17) with k = 1, Next, passing to polar coordinates, where the constant factor is the area of the unit sphere in R^d. Changing r to z so that z = NΛ_m(r² − N^{−2/3})/2 ⟺ r² = N^{−2/3}(1 + 2z/(Λ_m N^{1/3})), and thus dz = NΛ_m r dr, the last integral becomes so that, using the inequality (1 + ω)^n ≤ 2^n(1 + ω^n), (20) and for d = 2 the same with 2π instead of 2.

Proposition 2. Under (C1)-(C3) assume additionally that S is four times differentiable and f has a Lipschitz continuous first derivative with respect to x with
Then where the exponentially small term ω_exp has exactly the same estimate as in the previous Proposition, and ω(N)

Remark 5.
The key difference in the error term here is the denominator N instead of √ N in Equation (6).
Proof. We again decompose I(N) into the sum I(N) = I′(N) + I′′(N) with I′(N), I′′(N) given by Equation (5), and estimate I′′(N) by Equation (12). The estimation of I′(N) needs a more careful analysis using further terms of the Taylor expansions of S and f. Namely, we decompose it first as From Equation (14) we get Consequently, From Equation (17) with k = 6 we deduce that To evaluate I_2(N) we use the Taylor expansion of S to the fourth order, yielding Consequently, I_2(N) can be represented as Using the estimate for σ̃ we obtain To evaluate J_2 we expand f in a Taylor series, writing Substituting this in J_2 and using the fact that the integral of an odd function over a ball centered at the origin vanishes, we get The first two integrals are estimated as above, that is Finally, J_23(N) is estimated as in Proposition 1, by representing it as the difference between the integral over the whole space R^d and the integral over R^d \ U(N), the first term yielding the main term of the asymptotics and the second one being exponentially small. The exponentially small terms are exactly the same as in the previous Proposition. Summarizing the estimates obtained and slightly simplifying yields Equation (22). Let ξ_N denote an Ω-valued random variable having density

LLN and CLT for Internal Minima
Theorem 1. (i) Then ξ_N converges weakly to x_0. More explicitly, for a smooth g, one has with a constant c_1 depending on f_0, Λ_m, S_3, d and f_m = min_{x∈Ω} f(x), which can be explicitly derived from Equations (7) and (8).
(ii) If additionally S satisfies the conditions of Proposition 2, then with a constant c 2 depending on f 0 , f 1 , Λ m , S 3 , S 4 , d and f m .
Proof. From Propositions 1 and 2 we conclude that and in cases (i) and (ii) respectively. The estimates of Equations (25) and (26) are then obtained from the triangle inequality.
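As a numerical sanity check of the concentration statement (an illustrative sketch with data of our choosing, not the paper's example), one can tabulate the mean of the density proportional to f(x) exp{−N S(x)} on Ω = (−1, 1), taking S(x) = (x − 0.3)²/2 with interior minimum x_0 = 0.3 and f(x) = 2 + sin x:

```python
import math

def mean_xi(N, x0=0.3, a=-1.0, b=1.0, n=200_000):
    """Mean of the density proportional to f(x) exp(-N S(x)) on (a, b),
    with S(x) = (x - x0)^2 / 2 and f(x) = 2 + sin(x), by midpoint quadrature."""
    h = (b - a) / n
    z = m1 = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        w = (2.0 + math.sin(x)) * math.exp(-0.5 * N * (x - x0) ** 2)
        z += w
        m1 += x * w
    return m1 / z

means = {N: mean_xi(N) for N in (10, 100, 1000)}
for N, m in means.items():
    print(N, m)
```

The bias |E ξ_N − x_0| decays like 1/N here, driven by the term f′(x_0)/f(x_0), consistent with the sharper rate in part (ii).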
Next we are interested in the convergence of the normalized fluctuations of ξ_N around x_0, namely, of the random variables To simplify the formulas below we shall assume that f(x, N) = 1, but everything remains valid under a general f satisfying the assumptions above. To analyze the fluctuations, we use their moment generating functions for p ∈ R^d. The numerator in Equation (30) can be written in the form of Equation (1) as where the new phase is To shorten the notation, we shall denote by primes the derivatives of S or S* with respect to the variable x. S* is convex, as S is, and has the same derivatives of order 2 and higher as S. To apply the Laplace method we need to find its point of global minimum, which coincides with its (unique) critical point; we denote it by x* = x*(p, N), and it solves the equation As a preliminary step to proving our CLT, let us perform some elementary analysis of this equation, proving its well-posedness and finding its dependence on N in the first approximation. We shall need the following elementary result. Proof. Injectivity is straightforward from convexity. Let us prove the last statement, that is, that for any y ∈ B_{KΛ_m} there exists z ∈ B_K such that S′(x_0 + z) = y. For any α > 0, this claim is equivalent to the existence of a fixed point of the mapping By the Banach fixed point principle, to show the existence of a fixed point it is sufficient to show that Φ maps B_K to itself, that is, |Φ(z)| ≤ K whenever |z| ≤ K. Let and take α = 1/Λ_M. Then the symmetric matrix Hence the inequality |Φ(z)| ≤ K is fulfilled whenever |y| ≤ Λ_m K, as claimed.
Thus the image of the set U(N) contains the ball of radius Λ m N −1/3 , so that for every y : |y| ≤ Λ m N −1/3 there exists a unique x ∈ U(N) such that S (x) = y.
On the other hand, for any K we can take N_1 = max(N_0, (K/Λ_m)^6), which is such that for all N > N_1 and |p| ≤ K. Consequently, by Lemma 1, for such p and N there exists a unique solution x* = x*(p, N) of Equation (31) in Ω, and x* ∈ U(N), i.e., Next, expanding S′(x, N) in the Taylor series around x(N) (where S′(x(N), N) = 0), we find from Equation (31) that and thus (recall that we denote D_N = S′′(x(N), N)). This allows us to improve the preliminary estimate of Equation (32) and to obtain Hence from Equation (33) we get Finally we conclude that We can now prove a convergence result that can be called the CLT for Laplace integrals.

Theorem 2.
Under the assumptions of Theorem 1 (i), assume additionally that x(N) converges to x_0 quickly enough, that is, with positive constants c, δ. Then the fluctuations η_N = √N(ξ_N − x_0) converge weakly to a centered Gaussian random variable with the moment generating function Proof. We show that the moment generating functions of the fluctuations η_N given by Equation (30) converge, as N → ∞, to the function M(p), the convergence being uniform on bounded sets of p. By the well-known characterization of weak convergence, this implies the weak convergence of the fluctuations η_N. Applying Proposition 1 to the numerator and denominator of the r.h.s. of Equation (30) we get, for N > N_0, where ω is a bounded function, with a bound depending on S_3, Λ_m, p, d, that can be found explicitly from Equation (7).
We have and consequently Therefore, Using Equation (63) we conclude that where the constant c depends on p, S 3 , Λ m , Λ M , d.
Next, from Equation (35) we get with another constant c depending on p, S 3 , Λ m , Λ M , d. Consequently, we deduce from Equation (41) that with some functions c, ω, which are bounded on bounded subsets of p, implying the required convergence of the functions M N (p).
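The mechanism of the proof, namely the convergence of the moment generating functions M_N(p) of the fluctuations, can be observed numerically. In the illustrative one-dimensional setting below (f = 1 and S(x) = (x − 0.3)²/2 on Ω = (−1, 1) are our choices, not the paper's), the limit is M(p) = exp{p²/(2 S′′(x_0))} = e^{p²/2}:

```python
import math

def mgf_fluct(p, N, x0=0.3, a=-1.0, b=1.0, n=200_000):
    """E exp(p * sqrt(N) * (xi_N - x0)) for xi_N with density
    proportional to exp(-N (x - x0)^2 / 2) on (a, b)."""
    h = (b - a) / n
    num = den = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        w = math.exp(-0.5 * N * (x - x0) ** 2)
        num += math.exp(p * math.sqrt(N) * (x - x0)) * w
        den += w
    return num / den

limit = math.exp(0.5)  # M(1) = e^{1/2} since S''(x0) = 1
vals = {N: mgf_fluct(1.0, N) for N in (10, 100, 1000)}
for N, v in vals.items():
    print(N, v, limit)
```

For a purely quadratic phase the discrepancy comes only from the truncation of the Gaussian to Ω; a non-quadratic phase would add the corrections bounded in the proof above.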

Phase Minimum on the Border of the Domain of Integration
Here we present the estimates of the remainder in the asymptotic formula for the Laplace integrals with the critical point of the phase lying on the boundary of the domain of integration.
Let us start with a simple one-dimensional result, which is a version of the well-known Watson lemma. The proof can be performed as above by decomposing the domain of integration [0, a] into the two intervals [0, N^{−1/2}] and [N^{−1/2}, a]. We omit the details of the proof.

Remark 6. One can obtain a similar result by decomposing
for any γ ∈ [1/2, 1), in which case the exponentially small term gets the estimate This also shows that Lemma 2 remains essentially valid for small a of order a = N^{−γ}, γ < 1, which is used in the proof of the next result.
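A minimal numerical sketch of this boundary regime, with toy choices of our own (f ≡ 1, S(x) = x + x²/2, a = 1, so S(0) = 0 and S′(0) = 1): the Watson-type leading term is f(0) e^{−N S(0)}/(N S′(0)) = 1/N, and the relative error decays like 1/N:

```python
import math

def boundary_integral(N, a=1.0, n=200_000):
    """Midpoint quadrature of int_0^a exp(-N S(x)) dx with
    S(x) = x + x^2/2, whose minimum sits on the boundary x = 0."""
    h = a / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += math.exp(-N * (x + 0.5 * x * x))
    return total * h

rel_err = {}
for N in (10, 100, 1000):
    leading = 1.0 / N  # f(0) e^{-N S(0)} / (N S'(0))
    rel_err[N] = abs(boundary_integral(N) / leading - 1)
    print(N, rel_err[N])
```

Note the contrast with the interior case: the leading term here is of order 1/N rather than N^{−1/2}, reflecting the one-sided exponential (rather than Gaussian) profile near the boundary minimum.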
Let us turn to the general case. Namely, assume Ω is a bounded open set in R^{d+1}. The coordinates in R^{d+1} will be denoted (x, y) with x ∈ R, y ∈ R^d. Let with some smooth function ψ. It will be convenient to introduce the sections of Ω as the sets We are interested in the asymptotics of the Laplace integral with continuous functions f and S, referred to as the amplitude and the phase respectively. Let us first discuss the case of Ω_+ with a plane boundary, that is, with ψ(y) = 0, or equivalently with We shall assume the following: (C1') f(x, y, N) is a continuously differentiable function on Ω_+ (up to the boundary) with (C2') S(x, y, N) is a thrice continuously differentiable function of x and y such that (where ≥ is the usual order on symmetric matrices) and ∂S/∂x (x, y, N) ≥ g_m with positive constants Λ_m, g_m, and Remark 7. As was noted above, the norms of higher derivatives in the estimates that we are using are their norms as multi-linear operators. For instance, ‖∂²S/∂x∂y‖ is the minimal constant α such that |∑_{j=1}^d (∂²S(x, y, N)/∂x∂y_j) x y_j| ≤ α|x||y|.
(C3') For any N > N_0 there exists a unique point of global minimum of S in Ω_+; this point lies on the boundary {x = 0}, i.e., it has coordinates (0, y(N)) with some y(N) ∈ R^d, and the box is contained in Ω_+. We shall also use the sections U(x, N) = {y : (x, y) ∈ U(N)}.
Let us denote by D_N the matrix of the second derivatives of S as a function of y at (0, y(N), N), and by g_N the gradient of S as a function of x at (0, y(N), N), that is The approach of our analysis is to decompose the integral I(N) into the sum of two integrals holds for Ω_+ from Equation (47) and N > N_1 = max(N_0, (2S_2/Λ_m)^3), where ω_exp(N) is an exponentially small term and |ω(N)| ≤ 1 (53) Proof. The integral I′′(N) from Equation (50) clearly yields an exponentially small contribution, similar to the integral I′′(N) in Proposition 1, so we omit the details here.
To calculate I(x, N) we have to know the critical points of the phase S(x, y, N) as a function of y, that is, the solutions y*(x, N) of the equation As S is convex in y, the solution is unique, if it exists. Proceeding as in Lemma 1, that is, searching for a fixed point of the mapping we find that there exists a unique solution y*(x, N) of Equation (54) whenever Next, using the Taylor expansion of ∂S/∂y around the point (0, y(N), N), we get that

This implies
which is an essential improvement compared with the initial estimate of Equation (56). It ensures that the distance from y*(x, N) to the boundary of U(x, N) is of order N^{−1/3}, so that Proposition 1 can in fact be applied to the integral I(x, N), leading to
I(x, N) = (2π/N)^{d/2} f(x, y*(x, N), N) (det ∂²S/∂y²(x, y*(x, N), N))^{−1/2} exp{−N S(x, y*(x, N), N)} (1 + ω(x, N)) + ω_exp(x, N),
where ω_exp is exponentially small compared to the main term and In order to apply Lemma 2 we need lower and upper bounds for the quantities But the second term vanishes. Hence Next, differentiating Equation (54) with respect to y we obtain Consequently, using the formula for the differentiation of the determinant of an invertible symmetric matrix, Hence Lemma 2 can be applied to the calculation of I′(N) given by Equation (51), with a similar change in the constants appearing in ω̃(N) and ω̃_exp(N).

LLN and CLT for Minima on the Boundary
The results on weak convergence of random variables with exponential densities given above for the case of a phase with minimum in the interior of the domain can now be recast for the case of a phase with minimum on the boundary of the domain of integration. The following statements are proved by literally the same arguments as Theorems 1 and 2. We omit the details.

Theorem 4. Let Ω be a bounded open set in R^{d+1}_+ with coordinates (x, y), x ∈ R, y ∈ R^d. Let the functions f(x, y, N), S(x, y, N) be continuous functions on Ω_+ × [1, ∞) satisfying Conditions (C1')-(C3') from Theorem 3. Assume moreover that f is bounded below by a positive constant and that the sequence of global minima (0, y(N)) converges, as N → ∞, to a point (0, y_0) belonging to the interior of Ω.
Let (ξ^x_N, ξ^y_N) denote an Ω_+-valued random variable having density φ_N(x, y) proportional to f(x, y, N) exp{−N S(x, y, N)}. Then with a constant c depending only on S (actually on the bounds for the derivatives of S up to the third order).

Theorem 5.
Under the assumptions of Theorem 4, assume additionally that Then the fluctuations (η^x_N, η^y_N) = (N ξ^x_N, √N(ξ^y_N − y_0)) converge weakly to a (d + 1)-dimensional random vector such that its last d coordinates form a centered Gaussian random vector with the moment generating function and the first coordinate is independent of them and is a g_0-exponential random variable. The rates of convergence with all explicit constants are obtained directly from Theorem 3.
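The exponential limit law for the boundary coordinate can be observed in a small one-dimensional computation. The parameters below (g = 2 for the boundary gradient, S(x) = gx + x²/2 on [0, 1]) are illustrative choices, not taken from the text; the density of N ξ^x_N is then proportional to exp{−gz − z²/(2N)}, which tends to the g-exponential density, so E[N ξ^x_N] → 1/g:

```python
import math

def scaled_mean(N, g=2.0, a=1.0, n=200_000):
    """E[N * xi] for xi with density proportional to exp(-N (g x + x^2/2))
    on (0, a); the limit as N -> infinity is 1/g (the g-exponential mean)."""
    h = a / n
    z = m1 = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        w = math.exp(-N * (g * x + 0.5 * x * x))
        z += w
        m1 += x * w
    return N * m1 / z

vals = {N: scaled_mean(N) for N in (10, 100, 1000)}
for N, v in vals.items():
    print(N, v, "-> limit 1/g = 0.5")
```

The bias of the scaled mean decays like 1/N, in line with the explicit rates announced in the theorem.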

Laplace Sums with Error Estimates
It is more or less straightforward to modify the above results to the case of sums rather than integrals. Namely, instead of the integral I(N) from Equation (1), let us consider the sum where Ω is an open polyhedron in the Euclidean space R^d with Euclidean volume |Ω|, and the amplitude f and the phase S are continuous real functions.

Theorem 6.
Under the assumptions of Proposition 1, Proof. We use the well-known (and easy to prove) fact (a simplified version of the Euler-Maclaurin formula) that Consequently, where I(N) is from Equation (1). The first integral on the r.h.s. of Equation (68) is clearly of order 1/N compared with the main term of I(N) given in Proposition 1. The pre-exponential term in the second integral vanishes at the critical point (x(N), N) of S(x, N). Hence the required estimate for the second integral is obtained directly from Proposition 1. Now all LLN and CLT results obtained above for continuous distributions can be reformulated and proved straightforwardly for the case of discrete random variables taking values in the lattice {x_k = k/N ∈ Ω} with probabilities proportional to f(x_k) exp{−N S(x_k, N)}.
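A small numerical experiment (with an illustrative amplitude and phase of our choosing, not from the text) confirms that the lattice sum is very close to the integral it discretizes; Theorem 6 guarantees a relative discrepancy of order 1/N, and for smooth, rapidly decaying integrands it is in practice far smaller:

```python
import math

def g(x, N):
    # integrand f(x) exp(-N S(x)) with illustrative f(x) = 2 + sin(x)
    # and convex phase S(x) = (x - 0.3)^2/2 + (x - 0.3)^4/4 on (-1, 1)
    u = x - 0.3
    return (2.0 + math.sin(x)) * math.exp(-N * (0.5 * u * u + 0.25 * u ** 4))

N = 100

# lattice sum over the points x_k = k/N lying in Omega = (-1, 1)
lattice = sum(g(k / N, N) for k in range(-N + 1, N)) / N

# fine midpoint quadrature of the corresponding integral over (-1, 1)
n = 200_000
h = 2.0 / n
integral = h * sum(g(-1.0 + (k + 0.5) * h, N) for k in range(n))

print(lattice, integral, abs(lattice / integral - 1))
```

For this smooth example the lattice sum agrees with the integral to many digits, so the 1/N correction of Theorem 6 is an upper bound rather than the observed size of the discrepancy.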

Application to LLN and CLT of Large Deviations
Conditional LLN (conditioned on the sums of the corresponding random variables staying in a certain prescribed domain, usually some linear subspace or a convex set) are well developed in probability (see e.g., [2,18] for two different contexts). The results above can be used to supply exact estimates for the error terms in these approximations. To illustrate this in the most transparent way, let us start with the classical multidimensional local theorem of large deviations as given in [4] (which extends earlier results of [6]). Namely, let ξ, ξ_1, ξ_2, ... be a sequence of independent identically distributed R^k-valued random vectors. Assume that the set O of vectors λ ∈ R^k such that the moment generating function v(λ) = E e^{(λ,ξ)} is well defined has a nonempty interior O^0. It is well known (and easy to see) that the functions v and ln v are convex and that the set O^0 and its closure Ō^0 = Ō are convex. The function ψ(α) = inf_λ [ln v(λ) − (α, λ)] is called the entropy, and it is concave. Moreover, the infimum in its definition is attained, so that there exists λ(α) ∈ O such that ψ(α) = ln v(λ(α)) − (α, λ(α)), and the function λ(α) is a diffeomorphism of O^0 onto some open domain Ω in R^k. Assume that the random variable ξ has a bounded probability density p(x), and define the family of distributions P_α with the densities π_α(x) = exp{(λ(α), x − α) − ψ(α)} p(x).
Let p_n(x) be the density of the averaged sum S_n/n = (ξ_1 + ... + ξ_n)/n. Theorem 1 of [4] states (though we formulate it equivalently in terms of the density of S_n/n, rather than of S_n as is done in [4]) that if Φ is any compact set in Ω, then
p_n(α) = [n^{k/2} e^{nψ(α)} / ((2π)^{k/2} det(M(α))^{1/2})] (1 + ∑_{j=1}^{s} c_j(α) n^{−j} + O(n^{−s})), (69)
where s is arbitrary, the estimate is uniform for α ∈ Φ, M(α) is the matrix of the second moments of the distributions P_α, and the coefficients c_j(α) depend only on the first 2j + 2 moments of P_α and are uniformly bounded in Φ.
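The tilted family P_α is easy to explore numerically. The sketch below uses an illustrative choice, not an example from the text: p is the standard exponential density, so v(λ) = 1/(1 − λ) for λ < 1 and λ(α) = 1 − 1/α; it verifies that π_α(x) = e^{(λ(α),x)} p(x)/v(λ(α)) integrates to one and has mean α:

```python
import math

def tilted_moments(alpha, xmax=60.0, n=400_000):
    """Total mass and mean of the exponentially tilted density
    pi_alpha(x) = exp(lam * x) * p(x) / v(lam) for p(x) = e^{-x} (x > 0),
    v(lam) = 1 / (1 - lam), and lam = lambda(alpha) = 1 - 1/alpha."""
    lam = 1.0 - 1.0 / alpha
    h = xmax / n
    mass = mean = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        w = math.exp(lam * x) * (1.0 - lam) * math.exp(-x)  # pi_alpha(x)
        mass += w * h
        mean += x * w * h
    return mass, mean

results = {a: tilted_moments(a) for a in (0.5, 2.0, 5.0)}
for a, (mass, mean) in results.items():
    print(a, mass, mean)
```

In this example the tilted law is again exponential, with rate 1 − λ(α) = 1/α, which is why its mean recovers α exactly in the limit of fine quadrature.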
The densities of Equation (69) are exactly of the type dealt with in our Theorems 1, 2, 4 and 5. Thus these theorems apply directly to finding the rates of convergence in the LLN and CLT for sums of independent variables conditioned on S_n/n belonging to some convex bounded set with a smooth boundary or to a linear subspace. These conditional versions of the LLN may apply even if Eξ is not defined, so that the usual LLN does not hold.
When the random variable ξ has values in a lattice, a version with sums, that is Theorem 6, should be applied to get the rates of convergence in the corresponding laws of large numbers.

Conflicts of Interest:
The author declares no conflict of interest.