2. Main Results
Let
,
, be a sequence of i.i.d. real-valued random variables with a bounded probability density
p, and let
be the corresponding scaled random walk (where we set
for
). We shall denote by
the transition operators of the discrete Markov chain
.
For the density p, we shall assume the following rather standard condition (P) of an asymptotic power tail, with the notable difference that this power tail is required to hold only for large but finite distances.
Condition (P): The probability density
on
is bounded and such that
and
for
, where
is a measurable function on
such that
, with some constants
,
,
such that
. No additional assumptions on the behaviour of
on the interval
are made.
The latter condition is taken for simplicity as being a bit stronger than
which is equivalent to the requirement that
.
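To give a feeling for densities whose power tail holds only up to a finite distance, here is a minimal Python sketch. It is an illustration under our own assumptions, not the exact class described by condition (P): the one-sided truncated Pareto form, the support [1, B], and the names `truncated_pareto_pdf`, `truncated_pareto_sample` and `tail_probability` are all hypothetical choices made for the example.

```python
import random


def truncated_pareto_pdf(y, alpha, B):
    """Assumed toy density: alpha * y^(-1-alpha) / (1 - B^(-alpha)) on [1, B], zero elsewhere."""
    if y < 1.0 or y > B:
        return 0.0
    return alpha * y ** (-1.0 - alpha) / (1.0 - B ** (-alpha))


def tail_probability(y, alpha, B):
    """P(tau > y) for 1 <= y <= B: a pure power tail y^(-alpha), corrected by the cutoff at B."""
    return (y ** (-alpha) - B ** (-alpha)) / (1.0 - B ** (-alpha))


def truncated_pareto_sample(rng, alpha, B):
    """Inverse-CDF sampling: solve F(y) = u with F(y) = 1 - tail_probability(y)."""
    u = rng.random()
    return (1.0 - u * (1.0 - B ** (-alpha))) ** (-1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [truncated_pareto_sample(rng, 0.5, 1e6) for _ in range(5)]
    print(sample)
```

For y well below B the tail probability is close to y^(-alpha), which is the sense in which such a law imitates a distribution from a stable domain of attraction over a finite range.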
The generator of a one-sided stable Lévy process of index
(or, in the language of analysis, fractional derivative operator of order
) is defined by the formulas
Remark 1. In fact, the actual standard generators and derivatives differ from these formulas by a constant multiplier that we omit for simplicity.
Let be the Feller semigroup of the -stable Lévy process in generated by the operator .
Theorem 1. Let and the probability density satisfy the assumption (P).
(i) If and , then which in terms of the Wasserstein 1-distance rewrites as
(ii) If and , then where which in terms of the Wasserstein 1-distance rewrites as
Proof. It is a consequence of the general estimate (A4), where is given by Theorem 4 (i) and is given by Proposition 3, proved in the next two sections. □
Remark 2. 1. It may seem strange at first sight that the range of h depends only on and not on , as one can hardly expect any approximation if, say, , which is not excluded by our conditions. However, all constants depend linearly on , so that, for large , our estimates become essentially void.
2. In [6] we obtained much weaker estimates for the distances between and , and, moreover, only in case and .
Applying (6) and (8) yields the following.
Corollary 1. In cases (i) and (ii), we have the following estimates for the Kolmogorov distances for any : respectively.
As a direct consequence, let us derive the approximation rates for normalised sums, that is, a non-functional (quasi) central limit theorem (CLT) with stable laws. Namely, setting and in the formulas above yields the following.
Corollary 2. Let and the probability density satisfy the assumption (P). (i) If and , then (ii) If and , then
When the upper bound for n is large, we may say that the distribution belongs to the domain of quasi attraction of the -stable law, in the sense that the normalised sums of the corresponding i.i.d. random variables behave in the same way as for distributions from the actual domain of attraction, for all practical purposes. The parameter is the main parameter measuring the level of deviation from the actual domain of attraction.
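The Kolmogorov and Wasserstein 1-distances used in these statements are distances between laws. When only samples are available, one can compute their empirical counterparts, as in the following sketch. The helper names `wasserstein1` and `kolmogorov` are ours, and the empirical quantities are only proxies for the distances in the corollaries: they carry sampling error that the theoretical bounds do not account for.

```python
import bisect


def wasserstein1(xs, ys):
    """Wasserstein 1-distance between the empirical laws of two equal-size samples.

    On the real line the optimal coupling pairs order statistics, so the
    distance is the mean absolute difference of the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def kolmogorov(xs, ys):
    """Kolmogorov distance: sup over t of |F_x(t) - F_y(t)| for the empirical CDFs.

    Both CDFs are right-continuous step functions, so the supremum is
    attained at one of the sample points.
    """
    sx, sy = sorted(xs), sorted(ys)
    best = 0.0
    for t in sx + sy:
        fx = bisect.bisect_right(sx, t) / len(sx)
        fy = bisect.bisect_right(sy, t) / len(sy)
        best = max(best, abs(fx - fy))
    return best


if __name__ == "__main__":
    print(wasserstein1([0.0, 1.0], [1.0, 2.0]))
    print(kolmogorov([0.0, 1.0], [1.0, 2.0]))
```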
Remark 3. The rates obtained for the non-functional approximations (18), (19) are surely far from optimal. The proofs (given below) show the essential flexibility of our approach. In this paper, our main emphasis was on functional approximations; moreover, we aimed to develop and demonstrate a methodology and did not strive for the best estimates. Nevertheless, for non-functional results and for the exact convergence (when ) it can be instructive to compare our rates with those in the literature, which are abundant. It seems that, even in this case, our results are not consequences of any known results but complement them. The closest to ours seem to be the estimates from [9], which also operate with smooth Wasserstein distances and make the same standard assumptions on the densities of τ. However, using Hölder spaces, we managed to obtain estimates of weak convergence in terms of just once differentiable functions for and of twice differentiable functions for (unlike twice and thrice differentiable, respectively, in [9]). Additionally, our approach allows one to further weaken these regularity assumptions (that is, decrease the order of smooth Wasserstein distances). In other papers, most notably [8], the assumptions on τ are made in terms of characteristic functions, which makes a direct comparison with our rates not straightforward.
Let us turn to the case . In this paper, we decided to avoid dealing with several technical complications arising in the case .
Theorem 2. Let , the probability density satisfy assumption (P), and , where coefficients with tilde are defined in (33).
(i) If , then where
(ii) If , then where
Proof. It is a consequence of estimate (A4), where is given by Theorem 4 case (iii) and is given by Proposition 6. □
The inequalities of Theorem 2 estimate the smooth Wasserstein distances . Analogously to the case above, one can obtain an estimate for the corresponding Kolmogorov distances using (7), and for the distances of normalised sums.
Next, let , , be a sequence of i.i.d. -valued random variables with a bounded probability density p, . The random walk and its transition operator are defined as above in case .
For p, we shall assume the condition (Pd), which is a natural extension of the one-dimensional case above.
Condition (Pd): With given constants
,
,
, the probability density
on
is bounded and such that
where
,
is a measurable function on
such that
, and
is a continuous non-negative function on the sphere
such that
The generator of a
d-dimensional stable Lévy process of index
with a spectral measure specified by the density function
is defined by the formulas
One-dimensional results above are presented in a way that they extend straightforwardly to the present d-dimensional case. For instance, for , we find the following.
Theorem 3. Estimates (12) and (14) of Theorem 1 and estimates (20) and (21) of Theorem 2 still hold in the d-dimensional case, for densities p satisfying condition (Pd), where one has to plug in the semigroup generated by (23) in place of .
Let us now provide an example of cascading asymptotics showing different regimes for a large and for a “very large” number of terms.
Let
,
a positive constant,
, and
Aiming at dealing with large , let us assume for definiteness that , so that and thus .
A random variable
with distribution
satisfies the requirement of Theorem 1 (i) with
and
On the other hand, the distribution
has finite moments
Hence, we can apply the Berry–Esseen theorem for the distance of normalised sums of
to the standard law. Combining this theorem with Theorem 1 yields the following result.
Proposition 1. For a sequence of i.i.d. random variables distributed like τ with the distribution given by (24), with , it follows that for . On the other hand, for all n, where C is the Berry–Esseen constant and is a standard normal random variable.
Remark 4. The Berry–Esseen constant C belongs to the interval . We refer to [18,19] for the best-known results on its approximation.
We see that, roughly speaking, in order for estimate (26) to make sense, we must have
. Thus the interval
is the switching region, where the (quasi)
-stable asymptotics is transferred to the normal CLT. Clearly, if
is sufficiently large so that observations for
n beyond the level of
are not available or feasible, the random variable
looks like it belongs to the domain of attraction of the
-stable law and its true asymptotics cannot be revealed. However, the example shows exactly where this quasi attraction actually breaks down and when the true limit becomes visible.
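The role of the cutoff in the normal regime can be explored numerically. The sketch below is a rough illustration under our own assumptions: it does not use the density (24), but a hypothetical truncated Pareto law on [1, B], and it evaluates the classical Berry–Esseen bound C·E|τ − μ|³/(σ³√n). The default value C = 0.4748 is an assumed, commonly quoted upper bound for the constant, and all function names are ours.

```python
import math


def raw_moment(k, alpha, B):
    """E[tau^k] for the assumed density alpha*y^(-1-alpha)/(1-B^(-alpha)) on [1, B]; needs k != alpha."""
    return alpha * (B ** (k - alpha) - 1.0) / ((k - alpha) * (1.0 - B ** (-alpha)))


def abs_central_third_moment(alpha, B, steps=20000):
    """E|tau - mu|^3 computed by the midpoint rule in the variable t = log y."""
    mu = raw_moment(1, alpha, B)
    norm = alpha / (1.0 - B ** (-alpha))
    h = math.log(B) / steps
    total = 0.0
    for i in range(steps):
        y = math.exp((i + 0.5) * h)
        # p(y) dy = norm * y^(-1-alpha) * (y dt) = norm * y^(-alpha) dt
        total += abs(y - mu) ** 3 * norm * y ** (-alpha) * h
    return total


def berry_esseen_bound(n, alpha, B, C=0.4748):
    """Classical bound C * rho / (sigma^3 * sqrt(n)); C is an assumed constant, not taken from the text."""
    mu = raw_moment(1, alpha, B)
    var = raw_moment(2, alpha, B) - mu * mu
    rho = abs_central_third_moment(alpha, B)
    return C * rho / (var ** 1.5 * math.sqrt(n))


def n_for_bound(eps, alpha, B, C=0.4748):
    """Smallest n for which the bound above drops to eps or below."""
    return math.ceil((berry_esseen_bound(1, alpha, B, C) / eps) ** 2)


if __name__ == "__main__":
    for B in (1e2, 1e4, 1e6):
        print(B, n_for_bound(0.1, 0.5, B))
```

In this toy model the number of terms needed before the normal approximation becomes informative grows with the cutoff B (heuristically like a power of B), which is the same qualitative picture as the switching region described above.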
3. Technical Estimates I: Random Walk Approximation for Stable Generators
In this section, we supply the first group of inequalities needed for the application of Proposition A1 in our setting, namely estimates of type (A1).
Theorem 4. Let be a measurable bounded function on satisfying assumption (P).
(i) Let , and Then for any vanishing at zero and any . In particular, for ,
(ii) Let . Then for and any differentiable f vanishing at zero together with its first derivative and such that . In particular, for and .
(iii) Let again and set the first moment of p (which is well-defined due to the assumptions of the theorem). Let us assume (for definiteness) that and Then for and , where
(iv) Finally, let be the symmetrized version of the probability density p. Then for , and . Integrals in this formula are understood in the sense of the principal value.
Remark 5. (i) If , it is natural to choose in Statement (i). We have taken here arbitrary δ, because for small β, the interval can become void even for sufficiently large . In addition, notice that the bound was used only for and is not required whenever .
(ii) Statements (iii) and (iv) are particular cases of a more general situation with different power asymptotics for p on positive and negative half-lines. This general case is dealt with in the next result concerning stable limits in arbitrary dimensions.
(iii) As seen from the proofs below, most of the explicit constants on the r.h.s. of the estimates above can be essentially tightened. We tried to give the simplest versions that, at the same time, clearly indicate the role of all parameters.
(iv) The case with requires certain modifications that we do not address here.
Proof. (i) We shall compare the integrals separately in the domains , and .
For the second term, we have the following estimate:
where we used the inequality
.
Secondly,
and
so that
The latter estimate follows from the inequality
.
Thirdly,
In the last inequality we used the definition of
and the inequality
.
Finally, combining the three estimates above yields (28).
(ii) The proof of (30) is analogous. Firstly, for the second term of (35) we obtain the upper bound
(iii) We have
We are going to apply Statement (ii) to the probability density with parameters (33).
If
, then
and therefore
where
and
because
.
Now, by Statement (ii), we can conclude that differs from by the r.h.s. of (31) with all constants with tilde. Consequently, by (36), this implies (32).
(iv) Changing the integration variable y to in the second inequality of (31) and summing up with the first one yields
The integrals containing vanish, yielding (34). □
Let us now obtain a multidimensional extension of these estimates, reducing attention to .
Recall that for a differentiable function f on we shall denote by the sup-norm of the Euclidean length of the gradient vector , and by the sup-norm of the standard matrix norm of the matrix of the second derivatives of f and .
Theorem 5. Let a density p on satisfy condition (Pd) and .
Let denote the first moment of p (which is well-defined due to the assumptions of the theorem). All estimates below are supposed to hold for (with the corresponding and in case (iii)) and twice continuously differentiable functions f and g.
(i) For a differentiable f vanishing at zero together with its first derivative, it follows that
(iii) If and , then
Proof. Statement (i) is a straightforward extension of the proof of part (ii) of Theorem 4. Statement (ii) is obtained by applying (i) to the function
and noting that the integrals containing
vanish. To prove (iii), we follow the line of arguments of part (iii) of Theorem 4 and start by writing
Then we apply Statement (ii) to this integral with respect to the probability density
. Notice that if
, then
. At the same time, if
, then
, and therefore
implying the required estimate for
. □
4. Technical Estimates II: Stable Generators from Stable Semigroups
In this section, we supply the second group of inequalities needed for the application of Proposition A1 in our setting, namely estimates (A2). Thus the main results are given by Propositions 3, 5 and 7. Preliminary Lemmas 1 and 2 must be essentially known to specialists, but explicit constants for the corresponding estimates are not easy to find in the literature, and we sketch proofs for the convenience of readers.
Lemma 1. Let . Then for any . Furthermore, if and , then and In particular,
Proof. Estimate (41) follows directly by dividing the integral in (10) into two parts, over the interval
and over the rest of
(more details are given below for the analogous case of
).
To prove (42), let us write
We can estimate the first term in magnitude by
and the second term by
Choosing
yields the result required.
Finally, to obtain (43) we use (42) with and (41) with . □
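To make the splitting argument concrete, here is a worked sketch under assumptions that are ours and may differ from the exact setting of Lemma 1: we take α ∈ (0,1), write the generator (up to the constant multiplier mentioned in Remark 1) as Lf(x) = ∫_0^∞ (f(x+y) − f(x)) y^{−1−α} dy, and assume that f is bounded with a bounded derivative. The constants obtained this way need not coincide with those in (41).

```latex
\begin{aligned}
|Lf(x)|
&\le \int_0^{\delta} |f(x+y)-f(x)|\, y^{-1-\alpha}\, dy
   + \int_{\delta}^{\infty} |f(x+y)-f(x)|\, y^{-1-\alpha}\, dy \\
&\le \|f'\| \int_0^{\delta} y^{-\alpha}\, dy
   + 2\|f\| \int_{\delta}^{\infty} y^{-1-\alpha}\, dy \\
&= \frac{\delta^{1-\alpha}}{1-\alpha}\,\|f'\|
   + \frac{2\,\delta^{-\alpha}}{\alpha}\,\|f\| ,
\qquad \delta > 0 .
\end{aligned}
```

Near zero the increment of f is controlled by its derivative, while far from zero only the sup-norm is needed. The free parameter δ can then be tuned to balance the two terms.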
Let be the Feller semigroup of the -stable Lévy process generated by operator .
Proposition 2. (i) If , then
(ii) For any and ,
Proof. (i) Since
estimate (44) is a direct consequence of (41).
(ii) Let
be an even nonnegative smooth function on
with support in
such that
,
is increasing on
and
. For
, let
. For an
, let
If
, then
On the other hand,
and therefore
for any
.
Writing
and estimating
yields
Choosing
yields
□
Varying in (42) and Proposition 2, we can obtain, as direct corollaries, various estimates for the l.h.s. of this inequality. A particular choice of and leads to the following result.
Proposition 3. Under the assumptions of Proposition 2,
Proof. If , then and we can use (43) and (44) to obtain
If , then and we use (43) and (45) to obtain the required estimate. □
Remark 6. Using arbitrary γ from (45), the second line of the r.h.s. of (46) can be substituted by a more general expression Thus the power of h can be made arbitrarily close to , which is bigger than used in (46).
Let us turn to the case .
Lemma 2. Let . Then for . Furthermore, if and , then and
Proof. Let . Then
and
proving (47).
The proof of (48) is analogous to the proof of (42). Namely, we decompose the integral in (11) into two parts, over the interval and the rest of leading to the estimate
Choosing
yields
Since
the numerator in the fraction is bounded by
yielding (48).
To obtain (49) we apply (48) and then (47) with instead of . □
Proposition 4. Let and . Then and
Remark 7. One sees from Proposition 4 that for an effective estimate of one either uses higher regularity of f with better behaviour for small h, or less regularity of f, resulting in a worse estimate in h. Different versions can be used depending on the regularity requirement.
Proof. Estimate (50) is a direct consequence of (47). To prove the second inequality, we argue as in the proof of (45), exploiting the approximation to an arbitrary f.
Writing
and estimating
and, for
,
yields
Choosing
yields
implying (51) by a rough estimate of the term in the bracket.
Alternatively, we can estimate
and, for
,
yields
Choosing again
yields
implying (51). □
As above, for the case of , we obtain the following as a direct corollary.
Proposition 5. Let and . Then If and , then also
Proof. Estimate (53) is obtained by combining estimates (51) and (47). Estimate (54) is obtained by combining estimates (50), (52) and (49). □
Choosing in the first case and in the second (also estimating in the second case) we obtain the following consequence.
Proposition 6. If , then also
All these estimates and their proofs extend automatically to the d-dimensional case, leading to the following result.
Proposition 7. Let be a continuous nonnegative function on and . Then operators (22) and (23) generate Feller semigroups on satisfying estimates of Lemmas 1 and 2, Proposition 2 with an additional multiplier A on the r.h.s. and Proposition 3 with an additional multiplier on the r.h.s.
5. Conclusions
In this paper, we proved various rates of convergence for functional limit theorems with stable laws. In particular, we paid attention to some kind of quasi convergence, where stable approximation holds for large, but not too large n, and in fact, it can vary in different regions of these large n. The method of proof was based essentially on the theory of semigroups.
Let us draw some further perspective.
First of all, our results have more or less straightforward extensions for the convergence of position-dependent random walks to stable-like processes. Unlike the method of Fourier transform, which is tailored to the analysis of constant-coefficient equations, our approach is more robust. To extend our main theorems to variable coefficients, one just has to use the general estimate (A3), rather than its simplified version (A4).
Next, we excluded the case that requires certain additional efforts. Bringing this case to the theory is also connected to working out the best rates available for various and various distances (Kolmogorov, Wasserstein, etc.). As seen from our proofs, several possibilities arise in choosing various intermediate parameters, and our choice here was motivated by simplicity and not by proper consideration of optimality. One can also weaken the assumption on an asymptotic similarity of with an exact power.
Essential improvement of the results of [6] on functional CLT with stable laws (as performed here) would naturally imply improvements in the results of [6] for the convergence of continuous time random walks (CTRW), which we did not touch here at all.
Finally, the author believes that the methods developed here can be successfully applied to many other related models, as described, for instance, in [20].