Article

Self-Normalized Moderate Deviations for Degenerate U-Statistics

1 Division of Arts and Sciences, Mississippi State University at Meridian, Meridian, MS 39307, USA
2 Department of Mathematics, University of Mississippi, University, MS 38677, USA
3 Department of Statistics and Data Science, Shenzhen International Center for Mathematics, Southern University of Science and Technology, Shenzhen 518055, China
* Author to whom correspondence should be addressed.
Entropy 2025, 27(1), 41; https://doi.org/10.3390/e27010041
Submission received: 4 November 2024 / Revised: 18 December 2024 / Accepted: 20 December 2024 / Published: 7 January 2025
(This article belongs to the Special Issue The Random Walk Path of Pál Révész in Probability)

Abstract

In this paper, we study self-normalized moderate deviations for degenerate U-statistics of order 2. Let $\{X_i, i\ge 1\}$ be i.i.d. random variables and consider symmetric and degenerate kernel functions of the form $h(x,y)=\sum_{l=1}^{\infty}\lambda_l g_l(x)g_l(y)$, where $\lambda_l>0$, $\mathbb{E}g_l(X_1)=0$, and $g_l(X_1)$ is in the domain of attraction of a normal law for all $l\ge 1$. Under the condition $\sum_{l=1}^{\infty}\lambda_l<\infty$ and some truncation conditions on $\{g_l(X_1): l\ge 1\}$, we show that $\log P\big(\sum_{1\le i\ne j\le n}h(X_i,X_j)\ge \max_{1\le l<\infty}\lambda_l V_{n,l}^2\, x_n^2\big)\sim -\frac{x_n^2}{2}$ for $x_n\to\infty$ and $x_n=o(\sqrt{n})$, where $V_{n,l}^2=\sum_{i=1}^n g_l^2(X_i)$. As an application, a law of the iterated logarithm is also obtained.

1. Introduction and Main Results

The past three decades have witnessed significant developments in self-normalized limit theory, especially on large deviations, Cramér-type moderate deviations, and the law of the iterated logarithm. Compared with the classical limit theorems, these self-normalized limit theorems usually require much weaker moment assumptions.
Let $X, X_1, X_2, \ldots$ be independent and identically distributed (i.i.d.) random variables. Set
$$S_n=\sum_{i=1}^n X_i \quad\text{and}\quad V_n^2=\sum_{i=1}^n X_i^2.$$
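A defining feature of the ratio $S_n/V_n$ is that it is invariant under rescaling of the observations, which is one reason no moment beyond the domain-of-attraction assumption is needed. A minimal numerical sketch (illustrative only, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_cauchy(1000)          # heavy-tailed sample: no finite mean or variance

def self_normalized_sum(x):
    """S_n / V_n with S_n = sum(x) and V_n^2 = sum(x**2)."""
    return x.sum() / np.sqrt((x ** 2).sum())

# S_n / V_n is unchanged when every observation is multiplied by the same c > 0
assert np.isclose(self_normalized_sum(X), self_normalized_sum(7.5 * X))
```

By the Cauchy–Schwarz inequality, $|S_n/V_n|\le\sqrt{n}$ always holds, so the statistic is automatically bounded on the scale where moderate deviations are studied.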
Griffin and Kuelbs [1] obtained a law of the iterated logarithm (LIL) for the self-normalized sum of i.i.d. random variables with distributions in the domain of attraction of a normal or stable law. They proved that
  • If $\mathbb{E}X=0$ and $X$ is in the domain of attraction of a normal law, then
$$\limsup_{n\to\infty}\frac{S_n}{V_n(2\log\log n)^{1/2}}=1 \quad a.s. \tag{1}$$
  • If $X$ is symmetric and is in the domain of attraction of a stable law, then there exists a positive constant $C$ such that
$$\limsup_{n\to\infty}\frac{S_n}{V_n(2\log\log n)^{1/2}}=C \quad a.s. \tag{2}$$
Shao [2] obtained the following self-normalized moderate deviations and specified the constant $C$ in (2). Let $\{x_n, n\ge 1\}$ be a sequence of positive numbers such that $x_n\to\infty$ and $x_n=o(\sqrt{n})$ as $n\to\infty$.
  • If $\mathbb{E}X=0$ and $X$ is in the domain of attraction of a normal law, then
$$\lim_{n\to\infty}x_n^{-2}\log P\Big(\frac{S_n}{V_n}\ge x_n\Big)=-\frac12. \tag{3}$$
  • If $X$ is in the domain of attraction of a stable law such that $\mathbb{E}X=0$ with index $1<\alpha<2$, or $X$ is symmetric with index $\alpha=1$, then
$$\lim_{n\to\infty}x_n^{-2}\log P\Big(\frac{S_n}{V_n}\ge x_n\Big)=-\beta(\alpha,c_1,c_2),$$
where $\beta(\alpha,c_1,c_2)$ is a constant depending on the tail distribution; see [2] for an explicit definition.
Shao [3] refined (3) and obtained the following Cramér-type moderate deviation theorem under a finite third moment: if $\mathbb{E}X=0$ and $\mathbb{E}|X|^3<\infty$, then
$$\frac{P(S_n/V_n\ge x_n)}{P(Z\ge x_n)}\to 1 \tag{4}$$
for any $x_n\in[0,o(n^{1/6}))$, where $Z$ is a standard normal random variable.
Jing, Shao and Wang [4] further extended (4) to general independent random variables under a Lindeberg-type condition, while Shao and Zhou [5] established the result for self-normalized non-linear statistics, which include U-statistics as a special case.
U-statistics were introduced by Halmos [6] and Hoeffding [7]. The LIL for nondegenerate U-statistics was obtained by Serfling [8]. The LIL for degenerate U-statistics was studied by Dehling, Denker and Philipp ([9,10]), Dehling [11], Arcones and Giné [12], Teicher [13], Giné and Zhang [14], and others. Giné, Kwapień, Latała and Zinn [15] provided necessary and sufficient conditions for the LIL of degenerate U-statistics of order 2, a result that was extended to arbitrary order by Adamczak and Latała [16].
The main purpose of this paper is to study the self-normalized moderate deviations and the LIL for degenerate U-statistics of order 2. Let
$$U_n=\frac{1}{n(n-1)}\sum_{1\le i\ne j\le n}h(X_i,X_j),$$
where
$$h(x,y)=\sum_{l=1}^{\infty}\lambda_l g_l(x)g_l(y). \tag{5}$$
A motivating example for the LIL is the kernel $h(x,y)=xy$. Obviously, $V_n^2/(2V_n^2\log\log n)\to 0$. Then, via (1) and (2), we have
  • If $\mathbb{E}X=0$ and $X$ is in the domain of attraction of a normal law, then
$$\limsup_{n\to\infty}\frac{1}{2V_n^2\log\log n}\Big|\sum_{1\le i\ne j\le n}X_iX_j\Big|=\Big(\limsup_{n\to\infty}\frac{S_n}{V_n(2\log\log n)^{1/2}}\Big)^2=1 \quad a.s. \tag{6}$$
  • If $X$ is symmetric and is in the domain of attraction of a stable law, then there exists a positive constant $C$ such that
$$\limsup_{n\to\infty}\frac{1}{2V_n^2\log\log n}\sum_{1\le i\ne j\le n}X_iX_j=\Big(\limsup_{n\to\infty}\frac{S_n}{V_n(2\log\log n)^{1/2}}\Big)^2=C^2 \quad a.s.$$
For the general degenerate kernel $h$ defined in (5), we have
$$n(n-1)U_n=\sum_{l=1}^{\infty}\lambda_l\sum_{1\le i\ne j\le n}g_l(X_i)g_l(X_j)=\sum_{l=1}^{\infty}\lambda_l\Big[\Big(\sum_{i=1}^n g_l(X_i)\Big)^2-\sum_{i=1}^n g_l^2(X_i)\Big].$$
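This identity is easy to check numerically. In the sketch below (illustrative only; the weights $\lambda_l=l^{-2}$ and the functions $g_l(x)=\sin(lx)$ are hypothetical choices satisfying $\mathbb{E}g_l(X)=0$ for symmetric $X$, with the series truncated at $m$ terms), a brute-force double sum of the kernel is compared with the closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 60, 4                                   # sample size, truncation of the kernel series
lam = 1.0 / np.arange(1, m + 1) ** 2           # hypothetical weights with sum(lam) < infinity
X = rng.standard_normal(n)
G = np.sin(np.outer(np.arange(1, m + 1), X))   # G[l-1, i] = g_l(X_i) with g_l(x) = sin(l x)

# brute-force: sum over i != j of h(X_i, X_j) = sum_l lam_l g_l(X_i) g_l(X_j)
H = np.einsum('l,li,lj->ij', lam, G, G)        # H[i, j] = h(X_i, X_j)
brute = H.sum() - np.trace(H)

# closed form: sum_l lam_l * ((sum_i g_l(X_i))^2 - sum_i g_l(X_i)^2)
closed = np.sum(lam * (G.sum(axis=1) ** 2 - (G ** 2).sum(axis=1)))

assert np.isclose(brute, closed)
```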
Suppose that $g_l(X)$ is in the domain of attraction of a normal law for every $l\ge 1$. Then, $L_l(x):=\mathbb{E}g_l^2(X_1)I(|g_l(X_1)|\le x)$ is a slowly varying function as $x\to\infty$, for every $l\ge 1$. Let $\{x_n, n\ge 1\}$ be a sequence of positive numbers such that $x_n\to\infty$ and $x_n=o(\sqrt{n})$ as $n\to\infty$. For each $l\ge 1$, set
$$b_l=\inf\{x\ge 1: L_l(x)>0\},\qquad z_{n,l}=\inf\Big\{s: s\ge b_l+1,\ \frac{L_l(s)}{s^2}\le\frac{x_n^2}{n}\Big\}. \tag{7}$$
Write
$$W_n=\frac{n(n-1)U_n}{\max_{1\le l<\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)}.$$
We have the following self-normalized moderate deviation:
Theorem 1.
Let $\mathbb{E}g_l(X)=0$ and $\lambda_l\ge 0$ for every $l\ge 1$. Suppose that
$$\sup_{x\in\mathbb{R}}\frac{\sum_{l=m+1}^{\infty}\lambda_l g_l^2(x)}{\sum_{1\le l<\infty}\lambda_l g_l^2(x)}\to 0 \quad as\ m\to\infty \tag{8}$$
and
$$\lim_{n\to\infty}\frac{\mathbb{E}\,g_l(X)I\big(|g_l(X)|\le z_{n,l}\big)\,g_k(X)I\big(|g_k(X)|\le z_{n,k}\big)}{\sqrt{L_l(z_{n,l})L_k(z_{n,k})}}=0 \tag{9}$$
for any $l\ne k$. Then, for $x_n\to\infty$ and $x_n=o(\sqrt{n})$,
$$\lim_{n\to\infty}x_n^{-2}\log P\big(W_n\ge x_n^2\big)=-\frac12.$$
As an application, we have the following self-normalized LIL:
Theorem 2.
Assume that the conditions of Theorem 1 hold, with (8) replaced by the following: for each $l\in[1,\infty)$, there is a constant $c_l>0$ such that
$$\sup_{x\in\mathbb{R}}\frac{\lambda_l g_l^2(x)}{\sum_{l=1}^{\infty}\lambda_l g_l^2(x)}\le c_l \quad and\quad \sum_{l=1}^{\infty}c_l<\infty. \tag{10}$$
Then,
$$\limsup_{n\to\infty}\frac{W_n}{\log\log n}=2 \quad a.s. \tag{11}$$
Remark 1.
We use an example to show that (9) cannot be removed. Let $g_1(x)=x$, $g_2(x)=x^3$, $\lambda_1=\lambda_2=1$, and $\lambda_l=0$ for $l\ge 3$. Let $X$ be a Rademacher random variable, so that $X^3=X$ and (9) fails because $\mathbb{E}\,g_1(X)g_2(X)=\mathbb{E}X^4=1$. Then, $n(n-1)U_n=\sum_{1\le i\ne j\le n}(X_iX_j+X_i^3X_j^3)=2\sum_{1\le i\ne j\le n}X_iX_j$ and $W_n=2\sum_{1\le i\ne j\le n}X_iX_j/\sum_{i=1}^n X_i^2$. By (6), $\limsup_{n\to\infty}W_n/\log\log n=4$ a.s., which contradicts (11).
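The collapse of the kernel in this remark is easy to verify numerically, since $x^3=x$ on $\{-1,+1\}$ (a small illustrative check, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
X = rng.choice([-1.0, 1.0], size=n)            # Rademacher sample: X_i^2 = 1, X_i^3 = X_i

S = X.sum()
off_diag = S ** 2 - n                          # sum_{i != j} X_i X_j, using V_n^2 = n

# the kernel h(x, y) = x*y + x^3*y^3 summed over i != j collapses to twice the product sum
kernel_sum = sum(X[i] * X[j] + X[i] ** 3 * X[j] ** 3
                 for i in range(n) for j in range(n) if i != j)
assert np.isclose(kernel_sum, 2 * off_diag)

W_n = 2 * off_diag / (X ** 2).sum()            # here max_l lambda_l V_{n,l}^2 = n
assert np.isclose(W_n, 2 * (S ** 2 / n - 1))
```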

2. Proofs

In the proofs of the theorems, we will use the following properties of the slowly varying functions $L_l$ (e.g., Bingham et al. [17]). As $x\to\infty$,
$$P\big(|g_l(X)|>x\big)=o\big(L_l(x)/x^2\big), \tag{12}$$
$$\mathbb{E}|g_l(X)|I\big(|g_l(X)|>x\big)=o\big(L_l(x)/x\big), \tag{13}$$
$$\mathbb{E}|g_l(X)|^p I\big(|g_l(X)|\le x\big)=o\big(x^{p-2}L_l(x)\big),\quad p>2. \tag{14}$$
Since $L_l(s)/s^2\to 0$ as $s\to\infty$ and $L_l(x)$ is right continuous, in (7), $z_{n,l}\to\infty$, and for all sufficiently large $n$ we have
$$nL_l(z_{n,l})=x_n^2 z_{n,l}^2. \tag{15}$$
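As a quick illustration of this relation (a special case, not needed in the sequel): when $g_l(X_1)$ has a finite positive variance, the truncation levels take an explicit asymptotic form.

```latex
% Special case: finite variance. Here L_l(x) -> sigma_l^2 := E g_l^2(X_1) as x -> infty,
% so the relation n L_l(z_{n,l}) = x_n^2 z_{n,l}^2 gives
z_{n,l} \;=\; \frac{\sqrt{n\,L_l(z_{n,l})}}{x_n} \;\sim\; \frac{\sigma_l\sqrt{n}}{x_n}\;\to\;\infty,
% and z_{n,l} -> infty holds precisely because x_n = o(sqrt(n)).
```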

2.1. The Upper Bound of Theorem 1

For each $l\ge 1$ and $i\ge 1$, denote the truncated function
$$\bar g_l(X_i)=g_l(X_i)I\big(|g_l(X_i)|\le z_{n,l}\big).$$
Since $\mathbb{E}g_l(X_i)=0$, we have
$$\mathbb{E}\bar g_l(X_i)=o\big(L_l(z_{n,l})/z_{n,l}\big)=o\Big(x_n\sqrt{L_l(z_{n,l})/n}\Big) \tag{16}$$
by (13) and (15). For each $l\ge 1$, write
$$Y_{n,l}=\sum_{i=1}^n g_l(X_i),\qquad \bar Y_{n,l}=\sum_{i=1}^n \bar g_l(X_i) \quad\text{and}\quad V_{n,l}^2=\sum_{i=1}^n g_l^2(X_i).$$
By Condition (8), for each $0<\varepsilon<1$, there exists $1\le m<\infty$ such that
$$m\max_{1\le l<\infty}\lambda_l V_{n,l}^2\ge\sum_{l=1}^m\lambda_l V_{n,l}^2\ge(1-\varepsilon)\sum_{l=1}^{\infty}\lambda_l V_{n,l}^2. \tag{17}$$
Hence, for $x_n\to\infty$,
$$P\big(W_n\ge(1+\varepsilon)x_n^2\big)\le P\Big(\sum_{l=1}^{\infty}\lambda_l Y_{n,l}^2\ge\max_{1\le l<\infty}\lambda_l V_{n,l}^2\, x_n^2\Big). \tag{18}$$
Observe that for any random variables $\{U_l\}_{l=1}^{\infty}$ and $\{Z_l\}_{l=1}^{\infty}$ and any constants $x>0$ and $0<a<1/3$, we have the following via the Cauchy inequality:
$$P\Big(\sum_l(U_l+Z_l)^2\ge x\Big)\le P\Big(\sum_l U_l^2\ge(1-a)^2x\Big)+P\Big(\sum_l Z_l^2\ge a^2x\Big)\le P\Big(\sum_l U_l^2\ge(1-2a)x\Big)+P\Big(\sum_l Z_l^2\ge a^2x\Big). \tag{19}$$
For any $0<\varepsilon<1/3$, by (18) and (19), we have
$$P\big(W_n\ge(1+\varepsilon)x_n^2\big)\le P\bigg(\sum_{l=1}^{\infty}\lambda_l\Big(\sum_{i=1}^n g_l(X_i)I\big(|g_l(X_i)|>z_{n,l}\big)\Big)^2\ge\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\varepsilon^2x_n^2\bigg)+P\Big(\sum_{l=1}^{\infty}\lambda_l\bar Y_{n,l}^2\ge\max_{1\le l<\infty}\lambda_l V_{n,l}^2(1-2\varepsilon)x_n^2\Big). \tag{20}$$
For any integer $m\ge 1$ and any constant $C_1>0$ with $C_1\varepsilon<1$, we have
$$P\Big(\sum_{l=1}^{\infty}\lambda_l\bar Y_{n,l}^2\ge\max_{1\le l<\infty}\lambda_l V_{n,l}^2(1-2\varepsilon)x_n^2\Big)\le P\Big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le\frac{n}{\varepsilon}\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\Big)+P\Big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le(1-\varepsilon)n\max_{1\le l\le m}\lambda_l L_l(z_{n,l})\Big)+P\Big(\sum_{l=m+1}^{\infty}\lambda_l\bar Y_{n,l}^2\ge\frac{n}{\varepsilon}\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\,C_1\varepsilon(1-2\varepsilon)x_n^2\Big)+P\Big(\sum_{l=1}^m\lambda_l\bar Y_{n,l}^2\ge(1-\varepsilon)n\max_{1\le l\le m}\lambda_l L_l(z_{n,l})\,(1-C_1\varepsilon)(1-2\varepsilon)x_n^2\Big). \tag{21}$$
Applying (21) to (20), we have
$$P\big(W_n\ge(1+\varepsilon)x_n^2\big)\le P\Big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le\frac{n}{\varepsilon}\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\Big)+P\Big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le(1-\varepsilon)n\max_{1\le l\le m}\lambda_l L_l(z_{n,l})\Big)+P\bigg(\sum_{l=1}^{\infty}\lambda_l\Big(\sum_{i=1}^n g_l(X_i)I\big(|g_l(X_i)|>z_{n,l}\big)\Big)^2\ge\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\varepsilon^2x_n^2\bigg)+P\Big(\sum_{l=m+1}^{\infty}\lambda_l\bar Y_{n,l}^2\ge n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\,C_1(1-2\varepsilon)x_n^2\Big)+P\Big(\sum_{l=1}^m\lambda_l\bar Y_{n,l}^2\ge n\max_{1\le l\le m}\lambda_l L_l(z_{n,l})\,(1-C_1\varepsilon)(1-2\varepsilon)^2x_n^2\Big):=I_{1,1}+I_{1,2}+I_2+I_3+I_4. \tag{22}$$

2.2. Estimation of $I_{1,1}$ and $I_{1,2}$

Proposition 1.
For $m\ge 1$ sufficiently large,
$$I_{1,1}=P\Big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le\frac{n}{\varepsilon}\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\Big)\le\exp(-2x_n^2), \tag{23}$$
and for any constants $\delta>0$ and $0<\eta<1$,
$$P\Big(2m\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le(1-\eta)n\sum_{1\le l<\infty}\lambda_l L_l(z_{n,l})\Big)\le\exp(-2x_n^2), \tag{24}$$
$$P\Big(\max_{1\le l<\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)I\big(|g_l(X_i)|\le\delta z_{n,l}\big)\le(1-\eta)n\max_{1\le l<\infty}\lambda_l L_l(z_{n,l})\Big)\le\exp(-2x_n^2). \tag{25}$$
In particular,
$$I_{1,2}\le P\Big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le(1-\varepsilon)n\max_{1\le l<\infty}\lambda_l L_l(z_{n,l})\Big)\le\exp(-2x_n^2).$$
Proof. 
We shall apply the following exponential inequality (see, e.g., Theorem 2.19 of de la Peña, Lai and Shao [18]). If $Y_1,\ldots,Y_n$ are independent random variables with $Y_i\ge 0$, $\mu_n=\sum_{i=1}^n\mathbb{E}Y_i$ and $B_n^2=\sum_{i=1}^n\mathbb{E}Y_i^2<\infty$, then for $0<x<\mu_n$,
$$P\Big(\sum_{i=1}^n Y_i\le x\Big)\le\exp\Big(-\frac{(\mu_n-x)^2}{2B_n^2}\Big).$$
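This lower-tail inequality can be exercised on a concrete case. In the sketch below (an illustrative Monte Carlo check with hypothetical parameters, not part of the proof), $Y_i\sim\mathrm{Uniform}(0,1)$, so $\mu_n=n/2$ and $B_n^2=n/3$, and the empirical lower-tail frequency stays below the Gaussian-type bound:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 20000
Y = rng.uniform(0.0, 1.0, size=(reps, n))      # nonnegative Y_i with E Y_i = 1/2, E Y_i^2 = 1/3

mu_n = n * 0.5                                 # sum of the means
B2_n = n / 3.0                                 # sum of the second moments
x = 40.0                                       # lower-tail threshold with 0 < x < mu_n

bound = np.exp(-(mu_n - x) ** 2 / (2 * B2_n))  # exp(-(mu_n - x)^2 / (2 B_n^2))
empirical = (Y.sum(axis=1) <= x).mean()        # Monte Carlo estimate of P(sum Y_i <= x)
assert empirical <= bound
```

At this threshold the true probability is far smaller than the bound (the event sits several standard deviations below the mean), which is exactly the regime in which the inequality is applied in the proofs.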
By (8), $\sum_{l=m+1}^{\infty}\lambda_l V_{n,l}^2/\sum_{l=1}^{\infty}\lambda_l V_{n,l}^2\to 0$ as $m\to\infty$. Then, by (17),
$$\frac{\sum_{l=m+1}^{\infty}\lambda_l V_{n,l}^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2}\to 0 \quad as\ m\to\infty. \tag{26}$$
Hence, for $m$ sufficiently large,
$$\varepsilon\max_{1\le l<\infty}\lambda_l V_{n,l}^2\ge 2\sum_{l=m+1}^{\infty}\lambda_l V_{n,l}^2\ge 2\sum_{l=m+1}^{\infty}\lambda_l\sum_{i=1}^n\bar g_l^2(X_i).$$
Then,
$$I_{1,1}\le P\Big(\sum_{l=m+1}^{\infty}\lambda_l\sum_{i=1}^n\bar g_l^2(X_i)\le\frac{n}{2}\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\Big)\le\exp\bigg(-\frac{\big(n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})/2\big)^2}{2n\,\mathbb{E}\big(\sum_{l=m+1}^{\infty}\lambda_l\bar g_l^2(X_1)\big)^2}\bigg). \tag{27}$$
By Minkowski's inequality, (14) and (15),
$$\mathbb{E}\Big(\sum_{l=m+1}^{\infty}\lambda_l\bar g_l^2(X_1)\Big)^2\le\Big(\sum_{l=m+1}^{\infty}\lambda_l\big(\mathbb{E}\bar g_l^4(X_1)\big)^{1/2}\Big)^2=o\bigg(\frac{n}{x_n^2}\Big(\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\Big)^2\bigg). \tag{28}$$
Therefore, (23) follows from (27) and (28). To show (24), notice that by (17),
$$m\max_{1\le l<\infty}\lambda_l V_{n,l}^2\ge(1-\eta)\sum_{l=1}^{\infty}\lambda_l V_{n,l}^2\ge(1-\eta)\sum_{l=1}^{\infty}\lambda_l\sum_{i=1}^n\bar g_l^2(X_i).$$
Then,
$$P\Big(2m\max_{1\le l<\infty}\lambda_l V_{n,l}^2\le(1-\eta)n\sum_{1\le l<\infty}\lambda_l L_l(z_{n,l})\Big)\le P\Big(\sum_{l=1}^{\infty}\lambda_l\sum_{i=1}^n\bar g_l^2(X_i)\le\frac{n}{2}\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})\Big)\le\exp\bigg(-\frac{\big(\frac{n}{2}\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})\big)^2}{2n\,\mathbb{E}\big(\sum_{l=1}^{\infty}\lambda_l\bar g_l^2(X_1)\big)^2}\bigg).$$
Similar to the proof for $I_{1,1}$ as in (27) and (28), we obtain (24).
To show (25), let
$$l_n=\min\Big\{l:\lambda_l L_l(z_{n,l})=\max_{1\le l<\infty}\lambda_l L_l(z_{n,l})\Big\}.$$
Then,
$$P\Big(\max_{1\le l<\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)I\big(|g_l(X_i)|\le\delta z_{n,l}\big)\le(1-\eta)n\max_{1\le l<\infty}\lambda_l L_l(z_{n,l})\Big)\le P\Big(\sum_{i=1}^n\lambda_{l_n}g_{l_n}^2(X_i)I\big(|g_{l_n}(X_i)|\le\delta z_{n,l_n}\big)\le(1-\eta)n\lambda_{l_n}L_{l_n}(z_{n,l_n})\Big)\le\exp\bigg(-\frac{\big(1-(1-\eta)\big)^2\big(n\lambda_{l_n}L_{l_n}(z_{n,l_n})\big)^2}{2n\lambda_{l_n}^2\,\mathbb{E}g_{l_n}^4(X_1)I\big(|g_{l_n}(X_1)|\le\delta z_{n,l_n}\big)}\bigg)=\exp\bigg(-\frac{\eta^2\big(n\lambda_{l_n}L_{l_n}(z_{n,l_n})\big)^2}{2n\lambda_{l_n}^2\cdot o\big(nL_{l_n}^2(\delta z_{n,l_n})/x_n^2\big)}\bigg).$$
Since $L_l(\delta z_{n,l})/L_l(z_{n,l})\to 1$, (25) follows. □

2.3. Estimation of $I_2$

Proposition 2.
$$I_2=P\bigg(\sum_{l=1}^{\infty}\lambda_l\Big(\sum_{i=1}^n g_l(X_i)I\big(|g_l(X_i)|>z_{n,l}\big)\Big)^2\ge\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\varepsilon^2x_n^2\bigg)\le\exp(-2x_n^2).$$
Proof. 
Via the Cauchy–Schwarz inequality,
$$\frac{\sum_{l=1}^{\infty}\lambda_l\Big(\sum_{i=1}^n|g_l(X_i)|I\big(|g_l(X_i)|>z_{n,l}\big)\Big)^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2}\le\frac{\sum_{l=1}^{\infty}\lambda_l\Big(\sum_{i=1}^n g_l^2(X_i)\Big)\Big(\sum_{i=1}^n I\big(|g_l(X_i)|>z_{n,l}\big)\Big)}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2}.$$
By (17), the sum of the diagonal terms satisfies
$$\frac{\sum_{l=1}^{\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)I\big(|g_l(X_i)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2}\le\frac{\varepsilon^2x_n^2}{2}$$
for all sufficiently large $n$. Hence,
$$I_2\le P\bigg(\frac{\sum_{1\le i\ne j\le n}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)I\big(|g_l(X_j)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)}\ge\frac{\varepsilon^2x_n^2}{2}\bigg)\le P\bigg(\frac{\sum_{1\le i<j\le n}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)I\big(|g_l(X_j)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)}\ge\frac{\varepsilon^2x_n^2}{4}\bigg)+P\bigg(\frac{\sum_{1\le j<i\le n}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)I\big(|g_l(X_j)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{i=1}^n g_l^2(X_i)}\ge\frac{\varepsilon^2x_n^2}{4}\bigg)\le P\bigg(\sum_{2\le j\le n}\frac{\sum_{1\le i<j}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)I\big(|g_l(X_j)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{1\le i<j}g_l^2(X_i)}\ge\frac{\varepsilon^2x_n^2}{4}\bigg)+P\bigg(\sum_{1\le j<n}\frac{\sum_{j<i\le n}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)I\big(|g_l(X_j)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{j<i\le n}g_l^2(X_i)}\ge\frac{\varepsilon^2x_n^2}{4}\bigg):=I_{2,1}+I_{2,2}. \tag{29}$$
Let
$$\phi_j=\frac{\sum_{1\le i<j}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)I\big(|g_l(X_j)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{1\le i<j}g_l^2(X_i)}.$$
Then, for any constant $t>0$,
$$I_{2,1}\le\mathbb{E}e^{t\sum_{j=2}^n\phi_j}\,e^{-t\varepsilon^2x_n^2/4}. \tag{30}$$
Let $\mathbb{E}_j$ denote expectation with respect to $X_j$, for $2\le j\le n$. Then,
$$\mathbb{E}e^{t\sum_{j=2}^n\phi_j}=\mathbb{E}\Big(e^{t\sum_{j=2}^{n-1}\phi_j}\,\mathbb{E}_n e^{t\phi_n}\Big). \tag{31}$$
Since $|e^s-1|\le e^{s\vee 0}|s|$ for any $s\in\mathbb{R}$ and $0\le\phi_n\le m$ for some sufficiently large constant $m$, then
$$\mathbb{E}_n e^{t\phi_n}\le 1+e^{mt}t\,\mathbb{E}_n\phi_n=1+e^{mt}t\,\frac{\sum_{1\le i<n}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)P\big(|g_l(X_n)|>z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l\sum_{1\le i<n}g_l^2(X_i)}.$$
By (12) and (15), we have $P\big(|g_l(X_n)|>z_{n,l}\big)=o(x_n^2/n)$. Then, together with (17),
$$\mathbb{E}_n e^{t\phi_n}=1+o(x_n^2/n)=e^{o(x_n^2/n)}. \tag{32}$$
Applying (32) to (31), we have
$$\mathbb{E}e^{t\sum_{j=2}^n\phi_j}=e^{o(x_n^2/n)}\,\mathbb{E}e^{t\sum_{j=2}^{n-1}\phi_j}.$$
Similarly,
$$\mathbb{E}e^{t\sum_{j=2}^{n-1}\phi_j}=\mathbb{E}\Big(e^{t\sum_{j=2}^{n-2}\phi_j}\,\mathbb{E}_{n-1}e^{t\phi_{n-1}}\Big)=e^{o(x_n^2/n)}\,\mathbb{E}e^{t\sum_{j=2}^{n-2}\phi_j}.$$
Continuing this process from $X_n$ down to $X_1$, we conclude that
$$\mathbb{E}e^{t\sum_{j=2}^n\phi_j}=e^{n\times o(x_n^2/n)}=e^{o(x_n^2)}. \tag{33}$$
Applying (33) to (30) and letting $t=16/\varepsilon^2$, we have
$$I_{2,1}\le\exp(-3x_n^2). \tag{34}$$
By the same argument,
$$I_{2,2}\le\exp(-3x_n^2). \tag{35}$$
Combining (29), (34) and (35), we obtain the proposition. □

2.4. Estimation of $I_3$

Let $Y_1,\ldots,Y_n$ be an independent copy of $X_1,\ldots,X_n$. We will use the following lemma, which is a Bernstein-type exponential inequality for degenerate U-statistics.
Lemma 1
((3.5) of Giné, Latała and Zinn [19]). For bounded degenerate kernels $h_{i,j}(X_i,Y_j)$, let
$$A=\max_{i,j}\big\|h_{i,j}(X_i,Y_j)\big\|_{\infty},\qquad C^2=\sum_{i,j}\mathbb{E}h_{i,j}^2(X_i,Y_j),\qquad B^2=\max\bigg\{\Big\|\sum_i\mathbb{E}h_{i,j}^2(X_i,y)\Big\|_{\infty},\ \Big\|\sum_j\mathbb{E}h_{i,j}^2(x,Y_j)\Big\|_{\infty}\bigg\}.$$
Then, there is a universal constant $K$ such that
$$Pr\Big(\Big|\sum_{i,j}h_{i,j}(X_i,Y_j)\Big|>x\Big)\le K\exp\bigg(-\frac1K\min\Big\{\frac{x}{C},\Big(\frac{x}{B}\Big)^{2/3},\Big(\frac{x}{A}\Big)^{1/2}\Big\}\bigg).$$
Recall (16). Hence, by (19) and the definition of $I_3$ in (22),
$$I_3\le P\Big(\sum_{l=m+1}^{\infty}\lambda_l\big(\bar Y_{n,l}-\mathbb{E}\bar Y_{n,l}\big)^2\ge\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\,C_1n(1-2\varepsilon)^2x_n^2\Big). \tag{36}$$
Let
$$h_m(X_i,Y_j)=\sum_{l=m+1}^{\infty}\lambda_l\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)\big(\bar g_l(Y_j)-\mathbb{E}\bar g_l(Y_j)\big). \tag{37}$$
In addition to the estimate of $I_3$, we include (40) in the following proposition, which will be used in the proof of Theorem 2, where
$$h^{(\beta)}(X_i,Y_j)=\sum_{l=1}^{\infty}\lambda_l\Big(g_l(X_i)I\big(|g_l(X_i)|\le\beta z_{n,l}\big)-\mathbb{E}g_l(X_i)I\big(|g_l(X_i)|\le\beta z_{n,l}\big)\Big)\Big(g_l(Y_j)I\big(|g_l(Y_j)|\le\beta z_{n,l}\big)-\mathbb{E}g_l(Y_j)I\big(|g_l(Y_j)|\le\beta z_{n,l}\big)\Big). \tag{38}$$
Proposition 3.
For a sufficiently large constant $C_2>0$,
$$P\Big(\sum_{1\le i,j\le n}h_m(X_i,Y_j)\ge n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\,C_2x_n^2\Big)\le\exp(-3x_n^2). \tag{39}$$
Then, by (36) and the decoupling inequalities of de la Peña and Montgomery-Smith [20], for a sufficiently large $C_1>0$,
$$I_3\le P\Big(\sum_{1\le i,j\le n}h_m(X_i,X_j)\ge n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\,C_1(1-2\varepsilon)^2x_n^2\Big)\le\exp(-2x_n^2).$$
Suppose that $\lambda_l>0$ for all $1\le l<\infty$ and that $d>0$ is a constant. For sufficiently small constants $0<\alpha,\beta\le 1$,
$$P\Big(\sum_{1\le i,j\le[\alpha n]}h^{(\beta)}(X_i,Y_j)\ge n\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})\,d\,x_n^2\Big)\le\exp(-2x_n^2). \tag{40}$$
Proof. 
We will prove (39) and (40) simultaneously. By (15) and (37),
$$A_n:=\big\|h_m(X_i,Y_j)\big\|_{\infty}\le 4\sum_{l=m+1}^{\infty}\lambda_l z_{n,l}^2=4\sum_{l=m+1}^{\infty}\lambda_l\frac{nL_l(z_{n,l})}{x_n^2}. \tag{41}$$
By (15) and (38),
$$A_{n,(\beta)}:=\big\|h^{(\beta)}(X_i,Y_j)\big\|_{\infty}\le 4\sum_{l=1}^{\infty}\lambda_l\beta^2z_{n,l}^2=4\beta^2n\sum_{l=1}^{\infty}\lambda_l\frac{L_l(z_{n,l})}{x_n^2}. \tag{42}$$
Let
$$B_n^2:=\max\bigg\{\sup_y\sum_{1\le i\le n}\mathbb{E}h_m^2(X_i,y),\ \sup_x\sum_{1\le j\le n}\mathbb{E}h_m^2(x,Y_j)\bigg\}.$$
Since $|\bar g_l(Y_j)|\le z_{n,l}$, then by the Cauchy–Schwarz inequality, (15) and (37),
$$\sum_{1\le i\le n}\mathbb{E}h_m^2(X_i,y)\le n\,\mathbb{E}\Big(2\sum_{l=m+1}^{\infty}\lambda_l\big|\bar g_l(X_1)-\mathbb{E}\bar g_l(X_1)\big|z_{n,l}\Big)^2\le 4n\,\mathbb{E}\Big(\sum_{l=m+1}^{\infty}\lambda_l\big(\bar g_l(X_1)-\mathbb{E}\bar g_l(X_1)\big)^2\Big)\Big(\sum_{l=m+1}^{\infty}\lambda_l z_{n,l}^2\Big)\le 4n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\sum_{l=m+1}^{\infty}\lambda_l\frac{nL_l(z_{n,l})}{x_n^2}.$$
The same bound holds for $\sum_{1\le j\le n}\mathbb{E}h_m^2(x,Y_j)$. Therefore,
$$B_n^2\le\frac{4n^2\big(\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\big)^2}{x_n^2}. \tag{43}$$
Similarly,
$$B_{n,\alpha,(\beta)}^2:=\max\bigg\{\sup_y\sum_{1\le i\le[\alpha n]}\mathbb{E}h_{(\beta)}^2(X_i,y),\ \sup_x\sum_{1\le j\le[\alpha n]}\mathbb{E}h_{(\beta)}^2(x,Y_j)\bigg\}\le 4\alpha n\sum_{l=1}^{\infty}\lambda_l L_l(\beta z_{n,l})\sum_{l=1}^{\infty}\lambda_l\beta^2\frac{nL_l(z_{n,l})}{x_n^2}.$$
Since $0<\beta\le 1$ and $L_l$ is nondecreasing, $L_l(\beta z_{n,l})\le L_l(z_{n,l})$. Hence,
$$B_{n,\alpha,(\beta)}^2\le\frac{4\alpha\beta^2n^2\big(\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})\big)^2}{x_n^2}.$$
By (37) and the Cauchy–Schwarz inequality,
$$C_n^2:=\sum_{1\le i,j\le n}\mathbb{E}h_m^2(X_i,Y_j)\le\sum_{1\le i,j\le n}\Big(\sum_{l=m+1}^{\infty}\lambda_l\,\mathbb{E}\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)^2\Big)\Big(\sum_{l=m+1}^{\infty}\lambda_l\,\mathbb{E}\big(\bar g_l(Y_j)-\mathbb{E}\bar g_l(Y_j)\big)^2\Big)\le n^2\Big(\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})\Big)^2. \tag{44}$$
Similarly,
$$C_{n,\alpha,(\beta)}^2:=\sum_{1\le i,j\le[\alpha n]}\mathbb{E}h_{(\beta)}^2(X_i,Y_j)\le\alpha^2n^2\Big(\sum_{l=1}^{\infty}\lambda_l L_l(\beta z_{n,l})\Big)^2\le\alpha^2n^2\Big(\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})\Big)^2. \tag{45}$$
Now let
$$x=C_2nx_n^2\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l}). \tag{46}$$
By (41) and (46),
$$\Big(\frac{x}{A_n}\Big)^{1/2}\ge\bigg(\frac{C_2nx_n^2\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})}{4n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})/x_n^2}\bigg)^{1/2}=\Big(\frac{C_2}{4}\Big)^{1/2}x_n^2.$$
By (43) and (46),
$$\Big(\frac{x}{B_n}\Big)^{2/3}\ge\bigg(\frac{C_2nx_n^2\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})}{2n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})/x_n}\bigg)^{2/3}=\Big(\frac{C_2}{2}\Big)^{2/3}x_n^2.$$
By (44) and (46),
$$\frac{x}{C_n}\ge\frac{C_2nx_n^2\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})}{n\sum_{l=m+1}^{\infty}\lambda_l L_l(z_{n,l})}=C_2x_n^2.$$
Then, (39) follows from Lemma 1 for a sufficiently large $C_2$. Similarly, let
$$x_d=dnx_n^2\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l}). \tag{47}$$
By (42) and (47),
$$\Big(\frac{x_d}{A_{n,(\beta)}}\Big)^{1/2}\ge\bigg(\frac{dnx_n^2\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})}{4\beta^2n\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})/x_n^2}\bigg)^{1/2}=\frac{\sqrt d}{2\beta}\,x_n^2.$$
By (44) and (47),
$$\Big(\frac{x_d}{B_{n,\alpha,(\beta)}}\Big)^{2/3}\ge\bigg(\frac{dnx_n^2\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})}{2\sqrt\alpha\,\beta n\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})/x_n}\bigg)^{2/3}=\Big(\frac{d}{2\sqrt\alpha\,\beta}\Big)^{2/3}x_n^2.$$
By (45) and (47),
$$\frac{x_d}{C_{n,\alpha,(\beta)}}\ge\frac{dnx_n^2\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})}{\alpha n\sum_{l=1}^{\infty}\lambda_l L_l(z_{n,l})}=\frac{d}{\alpha}\,x_n^2.$$
Therefore, (40) follows from Lemma 1 for $\alpha$ and $\beta$ sufficiently small. □

2.5. Estimation of $I_4$

Lemma 2 below follows Corollary 1(b) of Einmahl [21] and Lemma 4.2 of Lin and Liu [22]. However, we add the condition $\sum_{i=1}^{k_n}\mathbb{E}\|\xi_{n,i}\|^2\le b_n^2$, and our result takes the form of an exponential inequality for independent random vectors. We use the same positive constants $c_{17}$, $c_{20}$ and $c_{22}$ (depending only on the vector dimension $d$) as in Einmahl [21].
Lemma 2.
Let $\xi_{n,1},\ldots,\xi_{n,k_n}$ be independent mean-zero random vectors with values in $\mathbb{R}^d$ such that $\|\xi_{n,i}\|\le A_n$ and $\sum_{i=1}^{k_n}\mathbb{E}\|\xi_{n,i}\|^2\le b_n^2$, where $\|\cdot\|$ denotes the Euclidean norm. Let $S_n=\sum_{i=1}^{k_n}\xi_{n,i}$. Suppose that
$$Cov(S_n)=B_nI_d, \tag{48}$$
where $B_n>0$, $I_d$ is the $d\times d$ identity matrix, and $\alpha_n$ is a positive sequence such that $\alpha_nB_n^{1/2}\to\infty$ and
$$\alpha_n\sum_{i=1}^{k_n}\mathbb{E}\|\xi_{n,i}\|^3\exp\big(\alpha_n\|\xi_{n,i}\|\big)\le B_n. \tag{49}$$
Let
$$\beta_n=B_n^{-3/2}\sum_{i=1}^{k_n}\mathbb{E}\|\xi_{n,i}\|^3\exp\big(\alpha_n\|\xi_{n,i}\|\big). \tag{50}$$
Then, for any $0<\gamma<1$, there exists $n_\gamma$ such that for all $n\ge n_\gamma$,
$$P\big(\|S_n\|\ge x\big)\le\exp\Big(c_{20}\beta_n\big(c_{17}^3\alpha_n^3B_n^{3/2}+1\big)\Big)\times\bigg[\exp\Big(-\frac{(1-\gamma)^6x^2}{2B_n}\Big)+\exp\Big(-\frac{\gamma^3(1-\gamma)^3x^2}{2c_{22}B_n\beta_n^2\log(1/\beta_n)}\Big)\bigg]+2d\exp\Big(-\frac{(1-\gamma)^2c_{17}^2\alpha_n^2B_n^2}{2\big(d^2b_n^2+dc_{17}A_n\alpha_nB_n\big)}\Big)$$
uniformly for $x\in[e_nB_n^{1/2},\,c_{17}\alpha_nB_n]$, where $\{e_n\}_{n\ge 1}$ can be any sequence with $e_n\to\infty$ and $e_n\le c_{17}\alpha_nB_n^{1/2}$.
Proof. 
Let $\eta_{n,i}$, $1\le i\le k_n$, be independent $N\big(0,\sigma^2Cov(\xi_{n,i})\big)$ random vectors, independent of the $\xi_{n,i}$, where
$$\sigma^2=c_{22}\beta_n^2\log(1/\beta_n). \tag{51}$$
By (49) and (50), we have $\beta_n\le\alpha_n^{-1}B_n^{-1/2}\to 0$ as $n\to\infty$. Hence, $\sigma\to 0$ as $n\to\infty$. Let $p_n(y)$ be the probability density of $B_n^{-1/2}\sum_{i=1}^{k_n}(\xi_{n,i}+\eta_{n,i})$ and let $\phi_{(1+\sigma^2)I_d}$ be the density of $N\big(0,(1+\sigma^2)I_d\big)$. By Corollary 1(b) in Einmahl [21] (together with the Remark on page 32), for $\|y\|\le c_{17}\alpha_nB_n^{1/2}$,
$$p_n(y)=\phi_{(1+\sigma^2)I_d}(y)\exp\big(T_n(y)\big)\quad\text{with}\quad|T_n(y)|\le c_{20}\beta_n\big(\|y\|^3+1\big). \tag{52}$$
For any $0<\gamma<1$ and $x\in[e_nB_n^{1/2},\,c_{17}\alpha_nB_n]$,
$$P\big(\|S_n\|\ge x\big)\le P\Big(\Big\|S_n+\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge(1-\gamma)x\Big)+P\Big(\Big\|\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge\gamma x\Big)=P\Big((1-\gamma)x\le\Big\|S_n+\sum_{i=1}^{k_n}\eta_{n,i}\Big\|<c_{17}\alpha_nB_n\Big)+P\Big(\Big\|S_n+\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge c_{17}\alpha_nB_n\Big)+P\Big(\Big\|\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge\gamma x\Big)\le P\Big((1-\gamma)x\le\Big\|S_n+\sum_{i=1}^{k_n}\eta_{n,i}\Big\|<c_{17}\alpha_nB_n\Big)+P\big(\|S_n\|\ge(1-\gamma)c_{17}\alpha_nB_n\big)+P\Big(\Big\|\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge\gamma c_{17}\alpha_nB_n\Big)+P\Big(\Big\|\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge\gamma x\Big)\le P\Big((1-\gamma)x\le\Big\|S_n+\sum_{i=1}^{k_n}\eta_{n,i}\Big\|<c_{17}\alpha_nB_n\Big)+2P\Big(\Big\|\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge\gamma x\Big)+P\big(\|S_n\|\ge(1-\gamma)c_{17}\alpha_nB_n\big):=J_1+J_2+J_3. \tag{53}$$
Let $N$ denote a centered normal random vector with covariance matrix $I_d$. Then, by (52),
$$J_1=\int_{(1-\gamma)x/B_n^{1/2}<\|y\|\le c_{17}\alpha_nB_n^{1/2}}\phi_{(1+\sigma^2)I_d}(y)\exp\big(T_n(y)\big)\,dy\le\exp\Big(c_{20}\beta_n\big(c_{17}^3\alpha_n^3B_n^{3/2}+1\big)\Big)\int_{\|y\|\ge(1-\gamma)x/B_n^{1/2}}\phi_{(1+\sigma^2)I_d}(y)\,dy\le\exp\Big(c_{20}\beta_n\big(c_{17}^3\alpha_n^3B_n^{3/2}+1\big)\Big)\times\Big[P\big(\|N\|\ge(1-\gamma)^2x/B_n^{1/2}\big)+P\big(\sigma\|N\|\ge\gamma(1-\gamma)x/B_n^{1/2}\big)\Big]. \tag{54}$$
Observe that $\|N\|^2$ has a $\chi_d^2$ distribution. It is well known that for a $\chi_d^2$ random variable $Y$, $P(Y>y)\le\big(ye^{1-y/d}/d\big)^{d/2}$ for $y>d$. Hence,
$$P\big(\|N\|\ge(1-\gamma)^2x/B_n^{1/2}\big)\le\bigg(\frac{(1-\gamma)^4x^2}{dB_n}\exp\Big(1-\frac{(1-\gamma)^4x^2}{dB_n}\Big)\bigg)^{d/2}=\frac{(1-\gamma)^{2d}x^d}{d^{d/2}B_n^{d/2}}\exp\Big(\frac d2-\frac{(1-\gamma)^4x^2}{2B_n}\Big)\le\exp\Big(-\frac{(1-\gamma)^6x^2}{2B_n}\Big) \tag{55}$$
since $x^2/B_n\to\infty$. Similarly,
$$P\big(\sigma\|N\|\ge\gamma(1-\gamma)x/B_n^{1/2}\big)\le\exp\Big(-\frac{\gamma^3(1-\gamma)^3x^2}{2B_n\sigma^2}\Big)=\exp\Big(-\frac{\gamma^3(1-\gamma)^3x^2}{2c_{22}B_n\beta_n^2\log(1/\beta_n)}\Big) \tag{56}$$
by (51). Then, by (54)–(56), we have
$$J_1\le\exp\Big(c_{20}\beta_n\big(c_{17}^3\alpha_n^3B_n^{3/2}+1\big)\Big)\times\bigg[\exp\Big(-\frac{(1-\gamma)^6x^2}{2B_n}\Big)+\exp\Big(-\frac{\gamma^3(1-\gamma)^3x^2}{2c_{22}B_n\beta_n^2\log(1/\beta_n)}\Big)\bigg]. \tag{57}$$
Since the distribution of $\eta_{n,i}$ is $N\big(0,\sigma^2Cov(\xi_{n,i})\big)$, the distribution of $\sum_{i=1}^{k_n}\eta_{n,i}$ is $N\big(0,\sigma^2\sum_{i=1}^{k_n}Cov(\xi_{n,i})\big)$. Since the $\xi_{n,i}$ are independent, $\sum_{i=1}^{k_n}Cov(\xi_{n,i})=Cov\big(\sum_{i=1}^{k_n}\xi_{n,i}\big)=B_nI_d$ by (48). Hence, the distribution of $\sum_{i=1}^{k_n}\eta_{n,i}$ is $N\big(0,\sigma^2B_nI_d\big)$. Then, similar to (56),
$$J_2=2P\Big(\Big\|\sum_{i=1}^{k_n}\eta_{n,i}\Big\|\ge\gamma x\Big)=2P\big(\sigma\|N\|\ge\gamma x/B_n^{1/2}\big)\le 2\exp\Big(-\frac{\gamma^3x^2}{2c_{22}B_n\beta_n^2\log(1/\beta_n)}\Big). \tag{58}$$
By (57) and (58), we have
$$J_1+J_2\le\exp\Big(c_{20}\beta_n\big(c_{17}^3\alpha_n^3B_n^{3/2}+1\big)\Big)\times\exp\Big(-\frac{(1-\gamma)^6x^2}{2B_n}\Big)+3\exp\Big(-\frac{\gamma^3(1-\gamma)^3x^2}{2c_{22}B_n\beta_n^2\log(1/\beta_n)}\Big). \tag{59}$$
Next, we estimate $J_3$. For each $1\le i\le k_n$, write $\xi_{n,i}=\big(\xi_{n,i}^{(1)},\ldots,\xi_{n,i}^{(d)}\big)^T$, where $a^T$ denotes the transpose of a vector $a$. Then,
$$\|S_n\|=\Big\|\sum_{i=1}^{k_n}\xi_{n,i}\Big\|=\bigg(\sum_{l=1}^d\Big(\sum_{i=1}^{k_n}\xi_{n,i}^{(l)}\Big)^2\bigg)^{1/2}\le\sum_{l=1}^d\Big|\sum_{i=1}^{k_n}\xi_{n,i}^{(l)}\Big|.$$
Hence,
$$J_3=P\big(\|S_n\|\ge(1-\gamma)c_{17}\alpha_nB_n\big)\le P\Big(\sum_{l=1}^d\Big|\sum_{i=1}^{k_n}\xi_{n,i}^{(l)}\Big|\ge(1-\gamma)c_{17}\alpha_nB_n\Big)\le\sum_{l=1}^dP\Big(\Big|\sum_{i=1}^{k_n}\xi_{n,i}^{(l)}\Big|\ge\frac{(1-\gamma)c_{17}\alpha_nB_n}{d}\Big).$$
Since $\|\xi_{n,i}\|\le A_n$ and $\sum_{i=1}^{k_n}\mathbb{E}\|\xi_{n,i}\|^2\le b_n^2$, we have $|\xi_{n,i}^{(l)}|\le\|\xi_{n,i}\|\le A_n$ and $\sum_{i=1}^{k_n}\mathbb{E}\big(\xi_{n,i}^{(l)}\big)^2\le\sum_{i=1}^{k_n}\mathbb{E}\|\xi_{n,i}\|^2\le b_n^2$ for each $1\le l\le d$. By Bernstein's inequality (e.g., (2.17) of de la Peña, Lai and Shao [18]),
$$J_3\le 2d\exp\Big(-\frac{(1-\gamma)^2c_{17}^2\alpha_n^2B_n^2}{2d^2\big(b_n^2+c_{17}A_n\alpha_nB_n/d\big)}\Big). \tag{60}$$
Then, the lemma follows by applying (59) and (60) to (53). □
Now, we estimate $I_4$ in the following proposition, which uses some ideas from Liu and Shao [23]:
Proposition 4.
$$I_4\le P\bigg(\sum_{l=1}^m\lambda_l\big(\bar Y_{n,l}-\mathbb{E}\bar Y_{n,l}\big)^2\ge(1-C_1\varepsilon)(1-2\varepsilon)^3nx_n^2\max_{1\le l\le m}\lambda_l L_l(z_{n,l})\bigg)\le\exp\Big(-\frac{(1-C_1\varepsilon)(1-2\varepsilon)^4x_n^2}{2(1+\varepsilon)}\Big).$$
Proof. 
For each $1\le i\le n$, let
$$G_{n,i}=\Big(\sqrt{\lambda_1}\big(\bar g_1(X_i)-\mathbb{E}\bar g_1(X_i)\big),\ldots,\sqrt{\lambda_m}\big(\bar g_m(X_i)-\mathbb{E}\bar g_m(X_i)\big)\Big)^T.$$
Let $B_n=n$ and let $\Sigma$ be the covariance matrix of $G_{n,1}$. For $1\le i\le n$, let
$$\xi_{n,i}=\Sigma^{-1/2}G_{n,i}. \tag{61}$$
Then,
$$Cov\big(\xi_{n,1}+\cdots+\xi_{n,n}\big)=\mathbb{E}\Big[\Sigma^{-1/2}\big(G_{n,1}+\cdots+G_{n,n}\big)\big(G_{n,1}+\cdots+G_{n,n}\big)^T\Sigma^{-1/2}\Big]=\Sigma^{-1/2}\Big(\sum_{1\le i,j\le n}\mathbb{E}G_{n,i}G_{n,j}^T\Big)\Sigma^{-1/2}.$$
Since the $X_i$ are independent, then
$$Cov\big(\xi_{n,1}+\cdots+\xi_{n,n}\big)=\Sigma^{-1/2}\Big(\sum_{i=1}^n\mathbb{E}G_{n,i}G_{n,i}^T\Big)\Sigma^{-1/2}=nI_m=B_nI_m.$$
Hence, Condition (48) in Lemma 2 is satisfied. Let
$$\alpha_n=\frac{C_mx_n}{n^{1/2}},$$
where $C_m>0$ is a finite constant depending only on $m$. We shall verify Condition (49). By (61),
$$\|\xi_{n,i}\|^2=\big(\Sigma^{-1/2}G_{n,i}\big)^T\big(\Sigma^{-1/2}G_{n,i}\big)=G_{n,i}^T\Sigma^{-1}G_{n,i}. \tag{62}$$
Observe that $\Sigma$ is positive definite by Assumption (9). Then, by the identity
$$x^TA^{-1}x=\max_{\|\vartheta\|=1}\frac{(x^T\vartheta)^2}{\vartheta^TA\vartheta} \tag{63}$$
for any $m\times m$ positive definite matrix $A$, we have
$$\|\xi_{n,i}\|^2=G_{n,i}^T\Sigma^{-1}G_{n,i}=\max_{\|\vartheta\|=1}\frac{(G_{n,i}^T\vartheta)^2}{\vartheta^T\Sigma\vartheta}.$$
Let $\vartheta^*=(\vartheta_1^*,\ldots,\vartheta_m^*)$ be such that $\|\vartheta^*\|=1$ and $(G_{n,i}^T\vartheta^*)^2=\max_{\|\vartheta\|=1}(G_{n,i}^T\vartheta)^2$. Then, for any $\vartheta=(\vartheta_1,\ldots,\vartheta_m)\in\mathbb{R}^m$, by the Cauchy–Schwarz inequality,
$$(G_{n,i}^T\vartheta)^2=\Big(\sum_{l=1}^m\sqrt{\lambda_l}\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)\vartheta_l\Big)^2=\Big(\sum_{l=1}^m\frac{\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)}{\sqrt{L_l(z_{n,l})}}\,\vartheta_l\sqrt{\lambda_lL_l(z_{n,l})}\Big)^2\le\Big(\sum_{l=1}^m\frac{\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)^2}{L_l(z_{n,l})}\Big)\Big(\sum_{l=1}^m\vartheta_l^2\lambda_lL_l(z_{n,l})\Big). \tag{64}$$
Since $\mathbb{E}g_l(X_1)=0$ for all $l\ge 1$, then $\mathbb{E}\bar g_l(X_1)=o\big(x_n\sqrt{L_l(z_{n,l})/n}\big)$ by (13) and (15). By Assumption (9),
$$\vartheta^T\Sigma\vartheta=\sum_{1\le l,l'\le m}\vartheta_l\vartheta_{l'}\sqrt{\lambda_l\lambda_{l'}}\,\mathbb{E}\big(\bar g_l(X_1)-\mathbb{E}\bar g_l(X_1)\big)\big(\bar g_{l'}(X_1)-\mathbb{E}\bar g_{l'}(X_1)\big)=\sum_{l=1}^m\vartheta_l^2\lambda_l\mathbb{E}\bar g_l^2(X_1)-\sum_{l=1}^m\vartheta_l^2\lambda_l\big(\mathbb{E}\bar g_l(X_1)\big)^2+\sum_{1\le l\ne l'\le m}\vartheta_l\vartheta_{l'}\sqrt{\lambda_l\lambda_{l'}}\,\mathbb{E}\bar g_l(X_1)\bar g_{l'}(X_1)-\sum_{1\le l\ne l'\le m}\vartheta_l\vartheta_{l'}\sqrt{\lambda_l\lambda_{l'}}\,\mathbb{E}\bar g_l(X_1)\,\mathbb{E}\bar g_{l'}(X_1)=\sum_{l=1}^m\vartheta_l^2\lambda_lL_l(z_{n,l})-\sum_{l=1}^m\vartheta_l^2\lambda_l\cdot o\Big(\frac{x_n^2L_l(z_{n,l})}{n}\Big)+o(1)\sum_{1\le l\ne l'\le m}\vartheta_l\vartheta_{l'}\sqrt{\lambda_l\lambda_{l'}L_l(z_{n,l})L_{l'}(z_{n,l'})}-\sum_{1\le l\ne l'\le m}\vartheta_l\vartheta_{l'}\sqrt{\lambda_l\lambda_{l'}}\cdot o\Big(x_n\sqrt{\tfrac{L_l(z_{n,l})}{n}}\Big)\,o\Big(x_n\sqrt{\tfrac{L_{l'}(z_{n,l'})}{n}}\Big). \tag{65}$$
By the Cauchy–Schwarz inequality,
$$\Big|\sum_{1\le l\ne l'\le m}\vartheta_l\vartheta_{l'}\sqrt{\lambda_l\lambda_{l'}L_l(z_{n,l})L_{l'}(z_{n,l'})}\Big|\le m\sum_{l=1}^m\vartheta_l^2\lambda_lL_l(z_{n,l}).$$
Hence,
$$\vartheta^T\Sigma\vartheta=\big(1+o(1)\big)\sum_{l=1}^m\vartheta_l^2\lambda_lL_l(z_{n,l}). \tag{66}$$
Applying (65) and (66) to (64), we have
$$\|\xi_{n,i}\|^2\le 2\sum_{l=1}^m\frac{\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)^2}{L_l(z_{n,l})}. \tag{67}$$
Since $|\bar g_l(X_i)|\le z_{n,l}=\sqrt{nL_l(z_{n,l})}/x_n$ by (15), together with (16), we have
$$\|\xi_{n,i}\|^2\le\frac{4mn}{x_n^2}. \tag{68}$$
By (67),
$$\mathbb{E}\|\xi_{n,i}\|^2\le 2\sum_{l=1}^m\frac{\mathbb{E}\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)^2}{L_l(z_{n,l})}\le 2m \tag{69}$$
and
$$\mathbb{E}\|\xi_{n,i}\|^3\le 2^{3/2}\,\mathbb{E}\Big(\sum_{l=1}^m\frac{\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)^2}{L_l(z_{n,l})}\Big)^{3/2}. \tag{70}$$
By Hölder's inequality,
$$\Big(\sum_{l=1}^m\frac{\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)^2}{L_l(z_{n,l})}\Big)^{3/2}\le m^{1/2}\sum_{l=1}^m\frac{\big|\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big|^3}{L_l^{3/2}(z_{n,l})}. \tag{71}$$
Combining (70) and (71), we have
$$\mathbb{E}\|\xi_{n,i}\|^3\le 2^{3/2}m^{1/2}\sum_{l=1}^m\frac{8\,\mathbb{E}|\bar g_l(X_i)|^3}{L_l^{3/2}(z_{n,l})}=\sum_{l=1}^m\frac{o\big(z_{n,l}L_l(z_{n,l})\big)}{L_l^{3/2}(z_{n,l})}=o\Big(\frac{n^{1/2}}{x_n}\Big) \tag{72}$$
by (14) and (15). Since $B_n=n$ and $\alpha_n=C_mx_n/n^{1/2}$, then by (68) and (72), we have
$$\alpha_n\sum_{i=1}^n\mathbb{E}\|\xi_{n,i}\|^3\exp\big(\alpha_n\|\xi_{n,i}\|\big)\le\frac{C_mx_n}{n^{1/2}}\cdot n\times o\Big(\frac{n^{1/2}}{x_n}\Big)\exp\bigg(\frac{C_mx_n}{n^{1/2}}\Big(\frac{4mn}{x_n^2}\Big)^{1/2}\bigg)=o(n)=o(B_n). \tag{73}$$
Hence, Condition (49) in Lemma 2 is satisfied. Similarly,
$$\beta_n:=B_n^{-3/2}\sum_{i=1}^n\mathbb{E}\|\xi_{n,i}\|^3\exp\big(\alpha_n\|\xi_{n,i}\|\big)\le n^{-3/2}\cdot n\times o\Big(\frac{n^{1/2}}{x_n}\Big)\exp\bigg(\frac{C_mx_n}{n^{1/2}}\Big(\frac{4mn}{x_n^2}\Big)^{1/2}\bigg)=o(1/x_n).$$
Then, $\beta_n^2\log(1/\beta_n)=o(1/x_n)$. By (68), we have $\|\xi_{n,i}\|\le(4mn/x_n^2)^{1/2}:=A_n$. By (69), we have $\sum_{i=1}^n\mathbb{E}\|\xi_{n,i}\|^2\le 2mn:=b_n^2$. Then, by Lemma 2 and (73), with $B_n=n$ and $\alpha_n=C_mx_n/n^{1/2}$ for a sufficiently large $C_m$, we have
$$P\big(\|S_n\|\ge n^{1/2}x_n\big)\le\exp\big(o(x_n^2)\big)\bigg[\exp\Big(-\frac{(1-\gamma)^6nx_n^2}{2n}\Big)+\exp\Big(-\frac{\gamma^3(1-\gamma)^3nx_n^2}{n\times o(1/x_n)}\Big)\bigg]+2m\exp\bigg(-\frac{(1-\gamma)^2c_{17}^2C_m^2x_n^2n}{2\big(2m^3n+m(4mn/x_n^2)^{1/2}c_{17}C_mx_nn^{1/2}\big)}\bigg)\le\exp\big(o(x_n^2)\big)\Big[\exp\Big(-\frac{(1-\gamma)^6x_n^2}{2}\Big)+\exp\big(-4x_n^2\big)\Big]+\exp\big(-4x_n^2\big)\le\exp\Big(-\frac{(1-\gamma)^7x_n^2}{2}\Big).$$
Letting $\gamma=1-(1-\varepsilon)^{1/7}$, we have
$$P\big(\|S_n\|\ge n^{1/2}x_n\big)\le\exp\Big(-\frac{(1-\varepsilon)x_n^2}{2}\Big). \tag{74}$$
Similar to (62),
$$\|S_n\|^2=\Big\|\sum_{i=1}^n\xi_{n,i}\Big\|^2=\Big(\Sigma^{-1/2}\sum_{i=1}^nG_{n,i}\Big)^T\Big(\Sigma^{-1/2}\sum_{i=1}^nG_{n,i}\Big)=\Big(\sum_{i=1}^nG_{n,i}\Big)^T\Sigma^{-1}\Big(\sum_{i=1}^nG_{n,i}\Big). \tag{75}$$
We will use Identity (63) to estimate (75). Let $\vartheta^*=(\vartheta_1^*,\ldots,\vartheta_m^*)$ be such that $\|\vartheta^*\|=1$ and
$$\Big(\sum_{i=1}^nG_{n,i}\Big)^T\vartheta^*=\max_{\|\vartheta\|=1}\Big(\sum_{i=1}^nG_{n,i}\Big)^T\vartheta. \tag{76}$$
Observe that
$$\max_{\|\vartheta\|=1}\Big(\sum_{i=1}^nG_{n,i}\Big)^T\vartheta=\bigg(\sum_{l=1}^m\lambda_l\Big(\sum_{i=1}^n\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)\Big)^2\bigg)^{1/2}. \tag{77}$$
By (66),
$$(\vartheta^*)^T\Sigma\vartheta^*=\big(1+o(1)\big)\sum_{l=1}^m(\vartheta_l^*)^2\lambda_lL_l(z_{n,l})\le\big(1+o(1)\big)\max_{1\le l\le m}\lambda_lL_l(z_{n,l}) \tag{78}$$
because $\|\vartheta^*\|=1$. By Identity (63) and by (76)–(78),
$$\Big(\sum_{i=1}^nG_{n,i}\Big)^T\Sigma^{-1}\Big(\sum_{i=1}^nG_{n,i}\Big)\ge\frac{\sum_{l=1}^m\lambda_l\Big(\sum_{i=1}^n\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)\Big)^2}{(1+\varepsilon)\max_{1\le l\le m}\lambda_lL_l(z_{n,l})}. \tag{79}$$
By (74), (75) and (79), with the application of (16) and (19),
$$I_4\le P\bigg(\sum_{l=1}^m\lambda_l\Big(\sum_{i=1}^n\big(\bar g_l(X_i)-\mathbb{E}\bar g_l(X_i)\big)\Big)^2\ge\max_{1\le l\le m}\lambda_lL_l(z_{n,l})\,(1-C_1\varepsilon)(1-2\varepsilon)^3nx_n^2\bigg)\le P\bigg(\Big(\sum_{i=1}^nG_{n,i}\Big)^T\Sigma^{-1}\Big(\sum_{i=1}^nG_{n,i}\Big)\ge\frac{(1-C_1\varepsilon)(1-2\varepsilon)^3nx_n^2}{1+\varepsilon}\bigg)=P\bigg(\|S_n\|^2\ge\frac{(1-C_1\varepsilon)(1-2\varepsilon)^3nx_n^2}{1+\varepsilon}\bigg)\le\exp\Big(-\frac{(1-C_1\varepsilon)(1-2\varepsilon)^4x_n^2}{2(1+\varepsilon)}\Big).$$
Since $\varepsilon$ is arbitrary, the upper bound of Theorem 1 follows from (22) and the estimates of $I_{1,1}$, $I_{1,2}$, $I_2$, $I_3$ and $I_4$. □

3. The Lower Bound of Theorem 1

Let $0<\varepsilon<1$ be sufficiently small. For $1\le m<\infty$ sufficiently large, by (26), $\max_{1\le l<\infty}\lambda_lV_{n,l}^2=\max_{1\le l\le m}\lambda_lV_{n,l}^2$. Together with (17), we have
$$P\big(W_n\ge(1-\varepsilon)x_n^2\big)=P\bigg(\frac{\sum_{l=1}^{\infty}\lambda_l\Big[\big(\sum_{i=1}^ng_l(X_i)\big)^2-\sum_{i=1}^ng_l^2(X_i)\Big]}{\max_{1\le l<\infty}\lambda_lV_{n,l}^2}\ge(1-\varepsilon)x_n^2\bigg)\ge P\bigg(\frac{\sum_{l=1}^m\lambda_l\big(\sum_{i=1}^ng_l(X_i)\big)^2}{\max_{1\le l\le m}\lambda_lV_{n,l}^2}\ge x_n^2\bigg). \tag{80}$$
Let $\tilde g_l(X_i)$ be a random variable whose distribution is that of $g_l(X_i)$ conditioned on $\{|g_l(X_i)|\le z_{n,l}\}$. Define $\tilde Y_{n,l}=\sum_{i=1}^n\tilde g_l(X_i)$ and $\tilde V_{n,l}^2=\sum_{i=1}^n\tilde g_l^2(X_i)$. By the definition of $L_l(x)$ and (12),
$$\mathbb{E}\tilde g_l^2(X_i)=\frac{\mathbb{E}\bar g_l^2(X_i)}{P\big(|g_l(X_i)|\le z_{n,l}\big)}=\frac{L_l(z_{n,l})}{P\big(|g_l(X_i)|\le z_{n,l}\big)}=L_l(z_{n,l})\big(1+o(1)\big). \tag{81}$$
Notice that (13) implies $\mathbb{E}\tilde g_l(X_1)=o\big(L_l(z_{n,l})/z_{n,l}\big)$. Then, we have
$$\sigma_l^2:=\mathbb{E}\big(\tilde Y_{n,l}-\mathbb{E}\tilde Y_{n,l}\big)^2=n\,\mathbb{E}\big(\tilde g_l(X_1)-\mathbb{E}\tilde g_l(X_1)\big)^2=n\,\mathbb{E}\tilde g_l^2(X_1)\big(1+o(1)\big)=nL_l(z_{n,l})\big(1+o(1)\big)$$
and
$$\mathbb{E}\tilde V_{n,l}^2=nL_l(z_{n,l})\big(1+o(1)\big).$$
Then, for $0<\delta<1$,
$$P\bigg(\frac{\sum_{l=1}^m\lambda_l\big(\sum_{i=1}^ng_l(X_i)\big)^2}{\max_{1\le l\le m}\lambda_lV_{n,l}^2}\ge x_n^2\bigg)\ge P\bigg(\frac{\sum_{l=1}^m\lambda_l\big(\sum_{i=1}^ng_l(X_i)\big)^2}{\max_{1\le l\le m}\lambda_lV_{n,l}^2}\ge x_n^2,\ \max_{1\le i\le n}|g_l(X_i)|\le z_{n,l},\ 1\le l\le m\bigg)=P\bigg(\frac{\sum_{l=1}^m\lambda_l\big(\sum_{i=1}^n\tilde g_l(X_i)\big)^2}{\max_{1\le l\le m}\lambda_l\tilde V_{n,l}^2}\ge x_n^2\bigg)P\Big(\max_{1\le i\le n}|g_l(X_i)|\le z_{n,l},\ 1\le l\le m\Big)\ge P\bigg(\frac{\sum_{l=1}^m\lambda_l\big(\sum_{i=1}^n\tilde g_l(X_i)\big)^2}{\max_{1\le l\le m}\lambda_l\tilde V_{n,l}^2}\ge x_n^2,\ \tilde V_{n,l}^2\le(1+2\delta)\sigma_l^2,\ 1\le l\le m\bigg)\times P\Big(\max_{1\le i\le n}|g_l(X_i)|\le z_{n,l},\ 1\le l\le m\Big)\ge P\bigg(\sum_{l=1}^m\lambda_l\Big(\sum_{i=1}^n\tilde g_l(X_i)\Big)^2\ge\max_{1\le l\le m}\lambda_l\sigma_l^2\,(1+2\delta)x_n^2\bigg)\times P\Big(\max_{1\le i\le n}|g_l(X_i)|\le z_{n,l},\ 1\le l\le m\Big)-\sum_{l=1}^mP\big(\tilde V_{n,l}^2>(1+2\delta)\sigma_l^2\big).$$
Without loss of generality, assume that $\max_{1\le l\le m}\lambda_l\sigma_l^2=\lambda_1\sigma_1^2$. Then,
$$P\bigg(\sum_{l=1}^m\lambda_l\Big(\sum_{i=1}^n\tilde g_l(X_i)\Big)^2\ge\max_{1\le l\le m}\lambda_l\sigma_l^2(1+2\delta)x_n^2\bigg)\ge P\bigg(\lambda_1\Big(\sum_{i=1}^n\tilde g_1(X_i)\Big)^2\ge\lambda_1\sigma_1^2(1+2\delta)x_n^2\bigg)\ge P\bigg(\sum_{i=1}^n\tilde g_1(X_i)\ge(1+2\delta)^{1/2}\sigma_1x_n\bigg).$$
Recall (15) and (81). Taking $c=1/x_n$, we have $|\tilde g_1(X_1)|\le z_{n,1}=c\sigma_1\big(1+o(1)\big)$. Therefore, by Theorem 5.2.2 in Stout [24], for any $\gamma>0$, we have
$$P\bigg(\sum_{i=1}^n\tilde g_1(X_i)\ge(1+2\delta)^{1/2}\sigma_1x_n\bigg)\ge\exp\Big(-\frac{x_n^2}{2}(1+2\delta)(1+\gamma)\Big).$$
On the other hand,
$$P\Big(\max_{1\le i\le n}|g_l(X_i)|\le z_{n,l},\ 1\le l\le m\Big)=\Big[P\big(|g_l(X_1)|\le z_{n,l},\ 1\le l\le m\big)\Big]^n=\Big[1-P\big(|g_l(X_1)|>z_{n,l}\ \text{for some}\ 1\le l\le m\big)\Big]^n\ge\Big[1-\sum_{l=1}^mP\big(|g_l(X_1)|>z_{n,l}\big)\Big]^n\ge\exp\Big(-2n\sum_{l=1}^mP\big(|g_l(X_1)|>z_{n,l}\big)\Big)=\exp\big(-o(x_n^2)\big).$$
For the rest of the proof, we apply the following exponential inequality (Lemma 2.1 of Csörgő, Lin and Shao [25]; see also Pruitt [26] and Griffin and Kuelbs [1]).
Lemma 3.
Let $\xi, \xi_1, \ldots, \xi_n$ be i.i.d. random variables. Then, for any $b, v, s > 0$,
$$P\Big(\sum_{i=1}^n \big(\xi_i I(|\xi_i| \le b) - E\xi_i I(|\xi_i| \le b)\big) \ge \frac{v e^{v}\, n E\xi_i^2 I(|\xi_i| \le b)}{2b} + \frac{sb}{v}\Big) \le 2e^{-s}.$$
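As a sanity check (not part of the proof), the inequality in Lemma 3 can be tested numerically. The sketch below uses Uniform$(-1,1)$ variables with $b = 1$, so the truncation is vacuous and $E\xi^2 I(|\xi| \le b) = 1/3$ exactly; the sample size and the parameters $v, s$ are illustrative choices, not values from the proof:

```python
# Monte Carlo sanity check of the exponential inequality in Lemma 3.
# The bound is conservative, so the empirical tail frequency should
# sit well below 2*exp(-s).
import math
import random

random.seed(0)
n, b, v, s = 200, 1.0, 1.0, 2.0
trials = 2000

second_moment = 1.0 / 3.0  # E[xi^2 I(|xi| <= 1)] for xi ~ Uniform(-1, 1)
threshold = v * math.exp(v) * n * second_moment / (2 * b) + s * b / v

exceed = sum(
    sum(random.uniform(-1, 1) for _ in range(n)) >= threshold
    for _ in range(trials)
)
empirical = exceed / trials  # terms already have mean zero, no centering needed
print(empirical <= 2 * math.exp(-s))
```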
By (14),
$$E\tilde g_1^4(X_i) = o\big(z_{n,1}^2 L_1(z_{n,1})\big).$$
In Lemma 3, we take $\xi_i = \tilde g_1^2(X_i)$, $s = x_n^2$, $b = z_{n,1}^2$ and $v = 1/\delta$. Notice that $sb/v = \delta\sigma_1^2$ and $v e^{v} n E\xi_i^2 I(|\xi_i| \le b)/(2b) = o(\sigma_1^2)$ by (81) and (85). Then,
$$P\big(\tilde V_{n,1}^2 \ge (1+2\delta)\sigma_1^2\big) = P\big(\tilde V_{n,1}^2 - E\tilde V_{n,1}^2 \ge (1+2\delta)\sigma_1^2 - E\tilde V_{n,1}^2\big) \le P\Big(\sum_{i=1}^n \big(\tilde g_1^2(X_i) - E\tilde g_1^2(X_i)\big) \ge \delta(1+\delta)\sigma_1^2\Big) \le 2\exp(-x_n^2).$$
Combining (80), (82), (83), (84) and (86) and letting $\gamma, \delta \to 0$, we have
$$P\big(W_n \ge (1-\varepsilon)x_n^2\big) \ge \exp(-x_n^2/2).$$

4. The Upper Bound of Theorem 2

Lemma 4
(Lemma 2.3 of Giné, Kwapień, Latała, and Zinn [15]). There exists a universal constant $C_3 < \infty$ such that for any kernel $h$ and any two sequences of i.i.d. random variables, we have
$$P\Big(\max_{k\le m,\ l\le n}\Big|\sum_{i\le k,\ j\le l} h(X_i, Y_j)\Big| \ge t\Big) \le C_3\, P\Big(\Big|\sum_{i\le m,\ j\le n} h(X_i, Y_j)\Big| \ge t/C_3\Big)$$
for all $m, n \in \mathbb{N}$ and all $t > 0$.
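Lemma 4 can be illustrated numerically for the product kernel $h(x, y) = xy$, for which the partial bilinear sums factor as $\big(\sum_{i\le k} X_i\big)\big(\sum_{j\le l} Y_j\big)$. The universal constant $C_3$ is not specified by the lemma; $C_3 = 10$ below is an illustrative choice for which the inequality comfortably holds in this simulation:

```python
# Simulation of the maximal inequality in Lemma 4 for h(x, y) = x*y.
import random

random.seed(2)
m = n = 30
t, C3 = 25.0, 10.0
trials = 1000

count_max = count_full = 0
for _ in range(trials):
    px, py = [0.0], [0.0]
    for _ in range(m):
        px.append(px[-1] + random.gauss(0.0, 1.0))
    for _ in range(n):
        py.append(py[-1] + random.gauss(0.0, 1.0))
    # max over rectangles of |sum_{i<=k, j<=l} X_i Y_j| = (max|px_k|)(max|py_l|)
    max_abs = max(abs(a) for a in px[1:]) * max(abs(b) for b in py[1:])
    count_max += max_abs >= t
    count_full += abs(px[-1] * py[-1]) >= t / C3
print(count_max / trials <= C3 * count_full / trials)
```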
Proposition 5.
Under the assumptions of Theorem 1,
$$\limsup_{n\to\infty} \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^n g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n} \le 2 \quad a.s.$$
Consequently,
$$\limsup_{n\to\infty} \frac{W_n}{\log\log n} \le 2 \quad a.s.$$
Proof. 
Let $x_n \to \infty$ as $n \to \infty$. Let $\theta > 1$ with $\theta - 1$ sufficiently small. For any positive integer $k \in (n, \theta n]$, by an argument similar to (19), with $0 < \eta < 1$,
P max n < k θ n l = 1 λ l i = 1 k g l ( X i ) 2 max 1 l < λ l V k , l 2 2 ( 1 + η ) 3 x n 2 P l = 1 λ l i = 1 n g l ( X i ) 2 max 1 l < λ l V n , l 2 2 ( 1 η ) ( 1 + η ) 3 x n 2 + P max n < k θ n l = 1 λ l i = n + 1 k g l ( X i ) 2 max 1 l < λ l V k , l 2 η 2 ( 1 + η ) 3 x n 2 2 : = H 1 + H 2 .
Notice that (10) implies (8). By (17) and the upper bound of Theorem 1,
$$H_1 \le \exp\big(-(1-\eta)^{3/2}(1+\eta)^3 x_n^2\big).$$
Let 0 < δ < 1 be a sufficiently small constant. By (19),
H 2 P 2 m max 1 l < λ l V n , l 2 ( 1 η ) n 1 l < λ l L l ( z n , l ) + P max n < k θ n l = 1 λ l i = n + 1 k g l ( X i ) I | g l ( X i ) | > δ η z n , l 2 max 1 l < λ l V k , l 2 η 4 ( 1 + η ) 3 x n 2 2 + P ( max n < k θ n l = 1 λ l i = n + 1 k g l ( X i ) I ( | g l ( X i ) | δ η z n , l ) 2 n 1 l < λ l L l ( z n , l ) ( 1 η ) ( 1 2 η ) η 2 ( 1 + η ) 3 x n 2 4 m ) : = H 2 , 1 + H 2 , 2 + H 2 , 3 .
By (24) in Proposition 1,
$$H_{2,1} \le \exp(-2x_n^2).$$
By the Cauchy–Schwarz inequality, for each k,
$$\frac{\sum_{l=1}^{\infty}\lambda_l\Big(\sum_{i=n+1}^{k}|g_l(X_i)|\, I\big(|g_l(X_i)| > \delta\eta z_{n,l}\big)\Big)^2}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2} \le \frac{\sum_{l=1}^{\infty}\lambda_l \sum_{i=n+1}^{k} g_l^2(X_i)\, \sum_{i=n+1}^{k} I\big(|g_l(X_i)| > \delta\eta z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2}.$$
By (17), for sufficiently large $m$, the sum of the diagonal terms satisfies
$$\frac{\sum_{l=1}^{\infty}\lambda_l \sum_{i=n+1}^{k} g_l^2(X_i)\, I\big(|g_l(X_i)| > \delta\eta z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2} \le m.$$
By (91) and (92),
H 2 , 2 P max n < k θ n n + 1 i j k l = 1 λ l g l 2 ( X i ) I ( | g l ( X j ) | > δ η z n , l ) max 1 l < λ l i = n + 1 k g l 2 ( X i ) η 5 ( 1 + η ) 3 x n 2 2 P max n < k θ n n + 1 i < j k l = 1 λ l g l 2 ( X i ) I ( | g l ( X j ) | > δ η z n , l ) max 1 l < λ l i = n + 1 k g l 2 ( X i ) η 5 ( 1 + η ) 3 x n 2 4 + P max n < k θ n n + 1 j < i k l = 1 λ l g l 2 ( X i ) I ( | g l ( X j ) | > δ η z n , l ) max 1 l < λ l i = n + 1 k g l 2 ( X i ) η 5 ( 1 + η ) 3 x n 2 4 P max n < k θ n n + 2 j k n + 1 i < j l = 1 λ l g l 2 ( X i ) I ( | g l ( X j ) | > δ η z n , l ) max 1 l < λ l n + 1 i < j g l 2 ( X i ) η 5 ( 1 + η ) 3 x n 2 4 + P max n < k θ n n + 1 j < k j < i k l = 1 λ l g l 2 ( X i ) I ( | g l ( X j ) | > δ η z n , l ) max 1 l < λ l j < i k g l 2 ( X i ) η 5 ( 1 + η ) 3 x n 2 4 = H 2 , 2 , 1 + H 2 , 2 , 2 .
Let
$$\phi_j = \frac{\sum_{n+1\le i<j}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)\, I\big(|g_l(X_j)| > \delta\eta z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l \sum_{n+1\le i<j} g_l^2(X_i)}.$$
Then, for any constant t > 0 ,
$$H_{2,2,1} \le P\Big(\sum_{j=n+2}^{[\theta n]} \phi_j \ge \frac{\eta^5(1+\eta)^3 x_n^2}{4}\Big) \le E e^{t\sum_{j=n+2}^{[\theta n]}\phi_j}\, e^{-t\eta^5(1+\eta)^3 x_n^2/4}.$$
Let $E_j$ denote expectation with respect to $X_j$, for $n+2 \le j \le [\theta n]$. Then,
$$E e^{t\sum_{j=n+2}^{[\theta n]}\phi_j} = E\Big(e^{t\sum_{j=n+2}^{[\theta n]-1}\phi_j}\, E_{[\theta n]} e^{t\phi_{[\theta n]}}\Big).$$
Since $|e^s - 1| \le e^{0\vee s}|s|$ for any $s \in \mathbb{R}$ and $0 \le \phi_{[\theta n]} \le m$ for sufficiently large $m$,
$$E_{[\theta n]} e^{t\phi_{[\theta n]}} - 1 \le e^{mt}\, t\, E_{[\theta n]}\phi_{[\theta n]} = e^{mt}\, t\, \frac{\sum_{n+1\le i<[\theta n]}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)\, P\big(|g_l(X_{[\theta n]})| > \delta\eta z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l \sum_{n+1\le i<[\theta n]} g_l^2(X_i)}.$$
By (12) and (15), we have P ( | g l ( X [ θ n ] ) | > δ η z n , l ) = o ( x n 2 / n ) . Then, together with (17),
$$E_{[\theta n]} e^{t\phi_{[\theta n]}} = 1 + o(x_n^2/n) = e^{o(x_n^2/n)}.$$
Applying (96) to (95), we have
$$E e^{t\sum_{j=n+2}^{[\theta n]}\phi_j} = e^{o(x_n^2/n)}\, E e^{t\sum_{j=n+2}^{[\theta n]-1}\phi_j}.$$
Similarly,
$$E e^{t\sum_{j=n+2}^{[\theta n]-1}\phi_j} = E\Big(e^{t\sum_{j=n+2}^{[\theta n]-2}\phi_j}\, E_{[\theta n]-1} e^{t\phi_{[\theta n]-1}}\Big) = e^{o(x_n^2/n)}\, E e^{t\sum_{j=n+2}^{[\theta n]-2}\phi_j}.$$
Continuing this process from $X_{[\theta n]}$ to $X_{n+1}$, we conclude that
$$E e^{t\sum_{j=n+2}^{[\theta n]}\phi_j} = e^{[(\theta-1)n]\times o(x_n^2/n)} = e^{o(x_n^2)}.$$
Applying (97) to (94) and letting t = 8 / ( η 5 ( 1 + η ) 3 ) , we have
$$H_{2,2,1} \le \exp(-2x_n^2).$$
To estimate H 2 , 2 , 2 , let
$$\psi_{j,k} = \frac{\sum_{j<i\le k}\sum_{l=1}^{\infty}\lambda_l g_l^2(X_i)\, I\big(|g_l(X_j)| > \delta\eta z_{n,l}\big)}{\max_{1\le l<\infty}\lambda_l \sum_{j<i\le k} g_l^2(X_i)}.$$
Then, for any constant t > 0 ,
$$H_{2,2,2} \le P\Big(\sum_{j=n+1}^{[\theta n]-1}\max_{j<k\le\theta n}\psi_{j,k} \ge \frac{\eta^5(1+\eta)^3 x_n^2}{4}\Big) \le E e^{t\sum_{j=n+1}^{[\theta n]-1}\max_{j<k\le\theta n}\psi_{j,k}}\, e^{-t\eta^5(1+\eta)^3 x_n^2/4}.$$
Let $E_j$ denote expectation with respect to $X_j$, for $n+1 \le j \le [\theta n]$. Note $k > j$. Then,
$$E e^{t\sum_{j=n+1}^{[\theta n]-1}\max_{j<k\le\theta n}\psi_{j,k}} = E\Big(e^{t\sum_{j=n+2}^{[\theta n]-1}\max_{j<k\le\theta n}\psi_{j,k}}\, E_{n+1} e^{t\max_{n+1<k\le\theta n}\psi_{n+1,k}}\Big).$$
Observe that
$$\psi_{j,k} = \sum_{l=1}^{\infty} \frac{\lambda_l \sum_{j<i\le k} g_l^2(X_i)}{\max_{1\le l<\infty}\lambda_l \sum_{j<i\le k} g_l^2(X_i)}\, I\big(|g_l(X_j)| > \delta\eta z_{n,l}\big).$$
Then, by (17), $0 \le \psi_{n+1,k} \le m$ for sufficiently large $m$. Since $|e^s - 1| \le e^{0\vee s}|s|$ for any $s \in \mathbb{R}$,
$$E_{n+1} e^{t\max_{n+1<k\le\theta n}\psi_{n+1,k}} - 1 \le e^{mt}\, t\, E_{n+1}\max_{n+1<k\le\theta n}\psi_{n+1,k}.$$
Under Assumption (10), for each l [ 1 , ) ,
$$\lambda_l V_{n,l}^2 = \sum_{i=1}^n \lambda_l g_l^2(X_i) \le c_l \sum_{\nu=1}^{\infty}\sum_{i=1}^n \lambda_\nu g_\nu^2(X_i) = c_l \sum_{\nu=1}^{\infty}\lambda_\nu V_{n,\nu}^2.$$
Recall (17); then, for each $1 \le l < \infty$,
$$\frac{\lambda_l V_{n,l}^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2} \le \frac{m c_l}{1-\varepsilon}.$$
Hence, by (101),
$$\psi_{j,k} \le \frac{m}{1-\varepsilon}\sum_{l=1}^{\infty} c_l\, I\big(|g_l(X_j)| > \delta\eta z_{n,l}\big).$$
Then,
$$E_{n+1} \max_{n+1<k\le\theta n}\psi_{n+1,k} \le \frac{m}{1-\varepsilon}\sum_{l=1}^{\infty} c_l\, P\big(|g_l(X_{n+1})| > \delta\eta z_{n,l}\big).$$
By (12) and (15), we have P ( | g l ( X n + 1 ) | > δ η z n , l ) = o ( x n 2 / n ) . Then, together with (10),
$$E_{n+1} \max_{n+1<k\le\theta n}\psi_{n+1,k} = o(x_n^2/n).$$
Then, by (102) and (103),
$$E_{n+1} e^{t\max_{n+1<k\le\theta n}\psi_{n+1,k}} = 1 + o(x_n^2/n) = e^{o(x_n^2/n)}.$$
Continuing this process from $j = n+2$ to $j = [\theta n] - 1$ and using (100), we obtain
$$E e^{t\sum_{j=n+1}^{[\theta n]-1}\max_{j<k\le\theta n}\psi_{j,k}} = e^{o(x_n^2)}.$$
Applying (104) to (99) and letting t = 8 / ( η 5 ( 1 + η ) 3 ) , we have
$$H_{2,2,2} \le \exp(-2x_n^2).$$
By (93), (98) and (105),
$$H_{2,2} \le 2\exp(-2x_n^2).$$
By the definition of H 2 , 3 in (89), and by Lemma 4, there is a constant 0 < C < such that
H 2 , 3 C P l = 1 λ l i = n + 1 [ θ n ] g l ( X i ) I ( | g l ( X i ) | δ η z n , l ) 2 n l = 1 λ l L l ( z n , l ) ( 1 3 η ) η 2 x n 2 4 m C .
Similar to (36),
H 2 , 3 C P ( l = 1 λ l i = n + 1 [ θ n ] g l ( X i ) I ( | g l ( X i ) | δ η z n , l ) E g l ( X i ) I ( | g l ( X i ) | δ η z n , l ) 2 n l = 1 λ l L l ( z n , l ) ( 1 3 η ) 2 η 2 x n 2 4 m C ) .
By the decoupling version of (40) in Proposition 3,
$$H_{2,3} \le C\exp(-2x_n^2).$$
Combining (89), (90), (106) and (107), we have
$$H_2 \le (3+C)\exp(-2x_n^2).$$
By (87), (88) and (108),
$$P\Big(\max_{n<k\le\theta n} \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^k g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2} \ge 2(1+\eta)^3 x_n^2\Big) \le \exp\big(-(1-\eta)^{7/4}(1+\eta)^3 x_n^2\big).$$
Let n = [ θ j ] for some j N . We have
$$P\Big(\max_{\theta^j<k\le\theta^{j+1}} \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^k g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2} \ge 2(1+\eta)^3 x_{[\theta^j]}^2\Big) \le \exp\big(-(1-\eta)^{7/4}(1+\eta)^3 x_{[\theta^j]}^2\big).$$
Let x n 2 = log log n . Then,
$$\sum_{j=1}^{\infty} P\Big(\max_{\theta^j<k\le\theta^{j+1}} \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^k g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2\,\log\log k} \ge 2(1+\eta)^3\Big) \le \sum_{j=1}^{\infty} P\Big(\max_{\theta^j<k\le\theta^{j+1}} \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^k g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{k,l}^2} \ge 2(1+\eta)^3 x_{[\theta^j]}^2\Big) \le \sum_{j=1}^{\infty}\exp\big(-(1-\eta)^{7/4}(1+\eta)^3 \log\log[\theta^j]\big) \le K \sum_{j=1}^{\infty}\exp\big(-(1-\eta)^2(1+\eta)^3 \log j\big) < \infty.$$
By the Borel–Cantelli lemma,
$$\limsup_{n\to\infty} \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^n g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n} \le 2 \quad a.s.$$

5. The Lower Bound of Theorem 2

Proof. 
By the definition of W n ,
$$\frac{W_n}{\log\log n} = \frac{\sum_{l=1}^{\infty}\lambda_l\Big[\big(\sum_{i=1}^n g_l(X_i)\big)^2 - \sum_{i=1}^n g_l^2(X_i)\Big]}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n} = \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^n g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n} - \frac{\sum_{l=1}^{\infty}\lambda_l V_{n,l}^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n}.$$
By (17), $\sum_{l=1}^{\infty}\lambda_l V_{n,l}^2 / \big(\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n\big) \le m/\big((1-\varepsilon)\log\log n\big) \to 0$ as $n \to \infty$. Hence,
$$\frac{W_n}{\log\log n} \ge \frac{\sum_{l=1}^{\infty}\lambda_l\big(\sum_{i=1}^n g_l(X_i)\big)^2}{\max_{1\le l<\infty}\lambda_l V_{n,l}^2\,\log\log n} - o(1).$$
Then,
$$\frac{W_n}{\log\log n} \ge \sum_{k=1}^{\infty} \frac{\big(\sum_{i=1}^n g_k(X_i)\big)^2}{V_{n,k}^2\,\log\log n}\, I\Big(k = \min\big\{j : \max_{1\le l<\infty}\lambda_l V_{n,l}^2 = \lambda_j V_{n,j}^2\big\}\Big) - o(1).$$
Hence, by (6),
$$\limsup_{n\to\infty} \frac{W_n}{\log\log n} \ge 2 \quad a.s.$$
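The law of the iterated logarithm just established can be visualized with a simulation. The sketch below takes the simplest illustrative case, a single term with $\lambda_1 = 1$ and $g_1(x) = x$ (i.e., $h(x, y) = xy$) and standard normal $X_i$; then $\sum_{1\le i\ne j\le n} h(X_i, X_j) = S_n^2 - V_n^2$ with $S_n = \sum_{i=1}^n X_i$, so $W_n = (S_n^2 - V_n^2)/V_n^2$. A single finite path only hints at the limsup value 2, which is approached along rare times:

```python
# Track W_n / log log n along one path for h(x, y) = x*y with
# standard normal data; the LIL gives limsup = 2 almost surely.
import math
import random

random.seed(1)
N = 100_000
s = v2 = 0.0
running_max = float("-inf")
for n in range(1, N + 1):
    x = random.gauss(0.0, 1.0)
    s += x
    v2 += x * x
    if n >= 10:  # need log log n > 0, i.e. n > e
        w = (s * s - v2) / v2  # self-normalized degenerate U-statistic
        running_max = max(running_max, w / math.log(math.log(n)))
print(f"running max of W_n / log log n up to N={N}: {running_max:.3f}")
```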

Author Contributions

Conceptualization, Q.-M.S.; methodology, L.G., H.S. and Q.-M.S.; formal analysis, L.G., H.S. and Q.-M.S.; investigation, L.G., H.S. and Q.-M.S.; writing—original draft preparation, L.G. and H.S.; writing—review and editing, Q.-M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the Simons Foundation Grant 586789, USA, the National Natural Science Foundation of China (NSFC 12031005), and the Shenzhen Outstanding Talents Training Fund, China.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable; this article describes entirely theoretical research and no new data were generated.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Griffin, P.; Kuelbs, J. Self-normalized laws of the iterated logarithm. Ann. Probab. 1989, 17, 1571–1601. [Google Scholar] [CrossRef]
  2. Shao, Q.M. Self-normalized large deviations. Ann. Probab. 1997, 25, 285–328. [Google Scholar] [CrossRef]
  3. Shao, Q.M. A Cramér type large deviation result for Student’s t-statistic. J. Theoret. Probab. 1999, 12, 385–398. [Google Scholar] [CrossRef]
  4. Jing, B.Y.; Shao, Q.M.; Wang, Q. Self-normalized Cramér type large deviations for independent random variables. Ann. Probab. 2003, 31, 2167–2215. [Google Scholar] [CrossRef]
  5. Shao, Q.M.; Zhou, W.X. Cramér type moderate deviation theorems for self-normalized processes. Bernoulli 2016, 22, 2029–2079. [Google Scholar] [CrossRef]
  6. Halmos, P.R. The theory of unbiased estimation. Ann. Math. Statist. 1946, 17, 34–43. [Google Scholar] [CrossRef]
  7. Hoeffding, W. A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 1948, 19, 293–325. [Google Scholar] [CrossRef]
  8. Serfling, R.J. The law of the iterated logarithm for U-statistics and related von Mises statistics. Ann. Math. Statist. 1971, 42, 1794. [Google Scholar]
  9. Dehling, H.; Denker, M.; Philipp, W. Invariance principles for von Mises and U-statistics. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1984, 67, 139–167. [Google Scholar] [CrossRef]
  10. Dehling, H.; Denker, M.; Philipp, W. A bounded law of the iterated logarithm for Hilbert space valued martingales and its application to U-statistics. Probab. Theory Relat. Fields 1986, 72, 111–131. [Google Scholar] [CrossRef]
  11. Dehling, H. Complete convergence of triangular arrays and the law of the iterated logarithm for degenerate U-statistics. Statist. Probab. Lett. 1989, 7, 319–321. [Google Scholar] [CrossRef]
  12. Arcones, M.; Giné, E. On the law of the iterated logarithm for canonical U-statistics and processes. Stoch. Process. Appl. 1995, 58, 217–245. [Google Scholar] [CrossRef]
  13. Teicher, H. Moments of randomly stopped sums revisited. J. Theoret. Probab. 1995, 8, 779–794. [Google Scholar] [CrossRef]
  14. Giné, E.; Zhang, C.-H. On Integrability in the LIL for Degenerate U-statistics. J. Theoret. Probab. 1996, 9, 385–412. [Google Scholar] [CrossRef]
  15. Giné, E.; Kwapień, S.; Latała, R.; Zinn, J. The LIL for canonical U-statistics of order 2. Ann. Probab. 2001, 29, 520–557. [Google Scholar] [CrossRef]
  16. Adamczak, R.; Latała, R. The LIL for canonical U-statistics. Ann. Probab. 2008, 36, 1023–1068. [Google Scholar] [CrossRef]
  17. Bingham, N.H.; Goldie, C.M.; Teugels, J.L. Regular Variation; Cambridge University Press: Cambridge, UK, 1987. [Google Scholar]
  18. de la Peña, V.H.; Lai, T.L.; Shao, Q.M. Self-Normalized Processes: Limit Theory and Statistical Applications. Probability and Its Applications (New York); Springer: Berlin, Germany, 2009. [Google Scholar]
  19. Giné, E.; Latała, R.; Zinn, J. Exponential and moment inequalities for U-statistics. In High Dimensional Probability II; Birkhäuser: Boston, MA, USA, 2000; pp. 13–38. [Google Scholar]
  20. de la Peña, V.H.; Montgomery-Smith, S.J. Decoupling inequalities for the tail probabilities of multivariate U-statistics. Ann. Probab. 1995, 23, 806–816. [Google Scholar] [CrossRef]
  21. Einmahl, U. Extensions of results of Komlós, Major, and Tusnády to the multivariate case. J. Mult. Anal. 1989, 28, 20–68. [Google Scholar] [CrossRef]
  22. Lin, Z.; Liu, W. On maxima of periodograms of stationary processes. Ann. Statist. 2009, 37, 2676–2695. [Google Scholar] [CrossRef]
  23. Liu, W.; Shao, Q.M. A Cramér moderate deviation theorem for Hotelling’s T2-statistic with applications to global tests. Ann. Statist. 2013, 41, 296–322. [Google Scholar] [CrossRef]
  24. Stout, W.F. Almost Sure Convergence; Academic Press: New York, NY, USA, 1974. [Google Scholar]
  25. Csörgő, M.; Lin, Z.Y.; Shao, Q.M. Studentized increments of partial sums. Sci. China Ser. A 1994, 37, 265–276. [Google Scholar]
  26. Pruitt, W.E. General one-sided laws of the iterated logarithm. Ann. Probab. 1981, 9, 1–48. [Google Scholar] [CrossRef]