Article

Exchangeably Weighted Bootstraps of General Markov U-Process

by Inass Soukarieh † and Salim Bouzebda *,†
Laboratory of Applied Mathematics of Compiègne (LMAC), Université de Technologie de Compiègne, 60200 Compiègne, France
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2022, 10(20), 3745; https://doi.org/10.3390/math10203745
Submission received: 24 August 2022 / Revised: 29 September 2022 / Accepted: 2 October 2022 / Published: 12 October 2022
(This article belongs to the Special Issue Current Developments in Theoretical and Applied Statistics)

Abstract:
We explore an exchangeably weighted bootstrap of the general function-indexed empirical U-processes in the Markov setting, which is a natural higher-order generalization of the weighted bootstrap empirical processes. As a result of our findings, a considerable variety of bootstrap resampling strategies arise. This paper aims to provide theoretical justifications for the exchangeably weighted bootstrap consistency in the Markov setup. General structural conditions on the classes of functions (possibly unbounded) and the underlying distributions are required to establish our results. This paper provides the first general theoretical study of the bootstrap of the empirical U-processes in the Markov setting. Potential applications include the symmetry test, Kendall’s tau and the test of independence.

1. Introduction

U-statistics are a class of estimators, initially explored in connection with unbiased estimation by [1] and formally introduced by [2]. They are defined as follows: let $\{X_i\}_{i=1}^{\infty}$ be a sequence of random variables defined on a measurable space $(E, \mathcal{E})$, and let $h : E^m \to \mathbb{R}$ be a measurable function. The U-statistic of order $m$ and kernel $h$ based on the sequence $\{X_i\}$ is
$$U_n(h) = \binom{n}{m}^{-1} \sum_{(i_1,\dots,i_m) \in I_n^m} h(X_{i_1},\dots,X_{i_m}), \qquad n \ge m,$$
where
$$I_n^m = \big\{(i_1,\dots,i_m) : i_j \in \mathbb{N},\ 1 \le i_j \le n,\ i_j \ne i_k \text{ if } j \ne k\big\}.$$
The empirical variance, Gini's mean difference and Kendall's rank correlation coefficient are common examples of U-statistics, while a classical test based on a U-statistic is Wilcoxon's signed rank test for the hypothesis of location at zero (see, e.g., [3], Example 12.4). The authors in [1,2,4] provided, amongst others, the first asymptotic results for the case in which the underlying random variables are independent and identically distributed. An extensive literature treats the theory of U-statistics; see, for instance, [5,6,7,8]. Complex statistical problems are also amenable to being solved using U-processes; examples include tests for goodness-of-fit, nonparametric regression and density estimation. U-processes are collections of U-statistics indexed by a family of kernels. They may be viewed as infinite-dimensional variants of U-statistics with a single kernel function, or as nonlinear stochastic extensions of empirical processes. Both viewpoints have advantages: first, considering a large class of statistics rather than a single statistic is more interesting statistically; second, we may use ideas from the theory of empirical processes to construct limit or approximation theorems for U-processes. Nevertheless, achieving results for U-processes is not easy. Extending U-statistics to U-processes requires a significant effort and distinct methodologies; generalizing empirical processes to U-processes is quite challenging, especially when the U-processes are considered in the stationary setting. We highlight that U-processes are used often in statistics, for instance when higher order terms arise in von Mises expansions. In particular, the study of estimators (including function estimators) with various degrees of smoothness involves U-statistics. For instance, Ref. [9] applied almost-sure uniform bounds for $P$-canonical U-processes to analyze the product limit estimator for truncated data. Two new tests for normality based on U-processes were presented in [10]; inspired by [11,12,13], the authors developed further tests for normality employing, as test statistics, weighted $L_1$-distances between the standard normal density and local U-statistics based on standardized observations. Estimating the mean of multivariate functions in the case of possibly heavy-tailed distributions was explored by [14], who also presented the median-of-means approach, both explorations being based on U-statistics. Moreover, other researchers have emphasized the importance of U-processes: refs. [15,16,17] used them for testing qualitative features of functions in nonparametric statistics, ref. [18] represented the cross-validation criterion for density estimation via U-statistics, and, in [6,7,19], the authors established limiting distributions of M-estimators. Since then, this discipline has made significant advances, and the results have been broadly extended. Asymptotic behavior has been demonstrated under weak dependence assumptions, for example, in the works of [20,21,22], more recently in [23], and more generally in [24,25]. However, in practice, explicit computation is not always possible, because of the complexity of the limiting distributions of the U-processes or their functionals. We suggest a general bootstrap of the U-processes in the Markov setting to solve this issue, which is a challenging problem.
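To fix ideas, the U-statistic above can be computed by direct enumeration. The following Python sketch (our illustration, not part of the original paper) evaluates $U_n(h)$ for a small sample with the Gini mean-difference kernel $h(x,y) = |x-y|$ mentioned above:

```python
# A minimal sketch: brute-force evaluation of a U-statistic of order m.
import itertools
import numpy as np

def u_statistic(x, h, m=2):
    """U_n(h): average of h over all m-tuples of distinct indices."""
    n = len(x)
    vals = [h(*(x[i] for i in idx))
            for idx in itertools.combinations(range(n), m)]  # symmetric kernel
    return float(np.mean(vals))

rng = np.random.default_rng(0)
sample = rng.normal(size=50)
print(u_statistic(sample, lambda a, b: abs(a - b)))  # Gini mean difference
```

For a symmetric kernel, averaging over the $\binom{n}{m}$ unordered tuples gives the same value as averaging over all ordered tuples of distinct indices.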
The concept of the bootstrap, introduced by [26] in the case of independent and identically distributed (i.i.d.) random variables, is to resample from an original sample $X_1,\dots,X_n$ of observations with an unknown marginal distribution function $F(x)$ a new i.i.d. sample $X_1^*,\dots,X_n^*$ with marginal distribution function $F_n(x)$, the empirical distribution function constructed from the original sample. Moreover, it is commonly known that the bootstrap approach gives a better approximation to the distribution of a statistic, mainly when the sample size is small [27]. Bootstraps for U-statistics of independent observations were studied by [28,29,30,31]. However, the bootstrap technique is not the same for dependent variables, because the dependence structure cannot be conserved in the new sample. For this reason, blockwise bootstrap methods were introduced, aiming to keep the dependence structure. Among those methods, we can cite the circular block bootstrap introduced by [32] and the nonoverlapping block bootstrap introduced by [33]. In [34], the authors proposed a bootstrap method adapted to weakly dependent stationary observations, the stationary bootstrap. The latter can be seen as an extension of the circular block bootstrap in which the block length is random, for instance geometrically distributed. It is important to note that Efron's initial bootstrap formulation (see [26]) has a few flaws: to be more precise, certain observations may be sampled several times while others are not sampled at all. A more general version of the bootstrap, the weighted bootstrap, was developed to get around this issue and was also demonstrated to be computationally more appealing in some applications. This resampling strategy was initially described in [35] and thoroughly investigated by [28], who coined the name "weighted bootstrap". One example is the Bayesian bootstrap, in which the weight vector
$$(\xi_{n1},\dots,\xi_{nn}) = (M_{n1},\dots,M_{nn})$$
is equal in distribution to the vector of $n$ spacings of $n-1$ ordered uniform $(0,1)$ random variables; that is, $(M_{n1},\dots,M_{nn})$ follows a Dirichlet distribution with parameters $(n; 1,\dots,1)$. For more details, see [36]. This diversity of resampling approaches necessitates a unified approach, commonly known as general weighted resampling, which was first described by [37] and has since been developed by [38,39]. In [40], the authors investigated the almost-sure rate of convergence of the strong approximation of the weighted bootstrap process by a sequence of Brownian bridge processes; refer to [41] for the multivariate setting and [42] for recent references. The concept of the generalized bootstrap, introduced by [37], was extended to the class of nondegenerate U-statistics of degree two and the corresponding Studentized U-statistics by [43]; refer also to [44,45]. In [46], the author generalized this theory to higher orders and developed a multiplier inequality for U-processes of i.i.d. random variables. We mention that the theory of multiplier processes is directly and strongly related to the symmetrization inequalities investigated by [6,7].
This paper aims to investigate the exchangeable bootstrap for U-processes in the same spirit as [46], but without the restriction of the independence setting. The previous reference focused on U-processes in an independent framework, whereas this paper considers U-processes in the dependent setting of Markov chains. We believe we are the first to present a successful treatment in this general context. We combine the techniques of the renewal bootstrap with the randomly weighted bootstrap in a nontrivial way. At this point, we mention a connection with the moving-blocks bootstrap and its modification, the matched-block bootstrap: instead of artificially splitting a sample into fixed-size blocks and then resampling them, the latter seeks to match the blocks to create a smoother transition; for more information, see [47]. The main difficulties in proving Theorem 3 are due to the random size of the resampled blocks. This randomness generates a problem with random stopping times, which cannot be removed by replacing a random stopping time with its expectation. In the present setting, the bootstrap random variables are generated by resampling from a random number of blocks. One may think that conditioning arguments can overcome the problem, but the answer is negative. Our proof uses some arguments from [46,47], verifying bootstrap stochastic equicontinuity by comparison with the original process in a similar way as in [48]. However, as we shall see later, integrating concepts from these papers is not enough to solve the problem: to deal with U-processes in the Markov framework, sophisticated mathematical derivations are necessary. We present the first complete theoretical justification of the bootstrap consistency. This justification requires the efficient use of large-sample theoretical tools established for U-empirical processes.
The rest of this paper is organized as follows. Section 2 is devoted to the introduction of the Markov framework, the U-process, the bootstrap weights and the definitions needed in our work. In Section 3, we recall the necessary ingredients for U-statistics and U-processes in the Markov setting; furthermore, we provide some asymptotic results, including the weak convergence of U-processes in Theorem 1. In Section 4, we derive the main results concerning the bootstrap of the U-processes. In Section 5, we collect some examples of weighted U-statistics. Some concluding remarks and possible future developments are relegated to Section 6. To prevent the interruption of the flow of the presentation, all proofs are gathered in Section 7. Appendix A contains a few pertinent technical results and proofs.

2. Notation and Definitions

In what follows, we aim to properly define our settings. For this reason, we have collected the definitions and notation needed.

2.1. Markov Chain

Let $X = (X_n)_{n \in \mathbb{N}}$ be a homogeneous $\psi$-irreducible Markov chain (that is, a chain with stationary transition probabilities) defined on a measurable space $(E, \mathcal{E})$, where $\mathcal{E}$ is a separable $\sigma$-algebra. Let $\pi(x, dy)$ be the transition probability and $\nu$ the initial distribution. We denote by $\mathbb{P}_\nu$, or simply $\mathbb{P}$, the probability measure associated with the pair $(\pi, \nu)$; likewise, $\mathbb{E}_\nu$ denotes integration with respect to $\mathbb{P}_\nu$. In our framework, $\mathbb{P}_x$ is the probability measure under which $X_0 = x$, for $x \in E$, and $\mathbb{E}_x(\cdot)$ is the $\mathbb{P}_x$-expectation. We further assume that the Markov chain is Harris positive recurrent with an atom $A$.
Definition 1
(Harris recurrent). A Markov chain $X = (X_n)_{n\in\mathbb{N}}$ is said to be Harris recurrent if there exists a $\sigma$-finite positive measure $\psi$ on a countably generated measurable space $(E,\mathcal{E})$ with $\psi(E) > 0$, such that, for all $B \in \mathcal{E}$ with $\psi(B) > 0$,
$$\mathbb{P}_x\Big(\sum_{i=1}^{\infty} \mathbb{1}\{X_i \in B\} = \infty\Big) = 1 \quad \text{for any } x \in E.$$
Recall that a chain is positive Harris recurrent and aperiodic if and only if it is ergodic ([49], Proposition 6.3), i.e., there exists a probability measure $\pi$, called the stationary distribution, such that, in total variation distance,
$$\lim_{n\to+\infty} \big\|P^n(x,\cdot) - \pi\big\|_{\mathrm{tv}} = 0.$$
Definition 2
(Small sets). A set $S \in \mathcal{E}$ is said to be $\Psi$-small if there exist $\delta > 0$, a positive probability measure $\Psi$ supported by $S$, and an integer $m \in \mathbb{N}^*$ such that
$$\forall x \in S,\ \forall B \in \mathcal{E}: \quad P^m(x, B) \ge \delta\, \Psi(B).$$
Definition 3.
Let $(X_n)_{n\ge 1}$ be a Markov chain taking values in $(E,\mathcal{E})$. We say that $(X_n)_{n\ge 1}$ is positive recurrent if
1. $(X_n)_{n\ge 1}$ is $(A, p, \nu, m)$-recurrent (or Harris recurrent if $\mathcal{E}$ is countably generated), where $A \in \mathcal{E}$ is a set, $0 < p < 1$, $m$ is an integer and $\nu$ is a probability measure;
2. $\sup_{x\in A} \mathbb{E}_x(T_0) < \infty$, where $T_0$ is the hitting time of $A$ by the $m$-step chain, roughly speaking, $T_0 = \min\{i \ge 1 : X_{im} \in A\}$.
Definition 4.
A $\psi$-irreducible aperiodic chain $X$ is called regenerative or atomic if there exists a measurable set $A$, called an atom, such that $\psi(A) > 0$ and, for all $(x,y) \in A^2$, we have $P(x,\cdot) = P(y,\cdot)$. Roughly speaking, an atom is a set on which the transition probabilities are the same. If the chain visits only a finite number of states or subsets, then any state, or any subset of the states, is actually an atom.
Definition 5
(Aperiodicity). Assuming $\psi$-irreducibility, there exist $d \in \mathbb{N}^*$ and disjoint sets $D_1,\dots,D_d$ (set $D_{d+1} = D_1$), positively weighted by $\psi$, such that
$$\psi\Big(E \setminus \bigcup_{1\le i\le d} D_i\Big) = 0$$
and
$$\forall x \in D_i: \quad P(x, D_{i+1}) = 1.$$
The period of the chain is the greatest common divisor $d$ of such integers; the chain is said to be aperiodic if $d = 1$.
Definition 6
(Irreducibility). The chain is $\psi$-irreducible if there exists a $\sigma$-finite measure $\psi$ such that, for every set $B \in \mathcal{E}$ with $\psi(B) > 0$ and any $x \in E$, there exists $n > 0$ such that $P^n(x, B) > 0$.
One of the most important properties of Harris recurrent Markov chains is the existence of an invariant distribution, which we call $\mu$ (a limiting probability distribution, also called the occupation measure). Furthermore, Harris recurrent Markov chains can always be embedded in a Markov chain on an extended sample space possessing a recurrent atom. The existence of a recurrent atom $A$ immediately allows the construction of a regenerative extension of the chain: the times at which the chain hits the atom (a recurrent state) are viewed as the regeneration times. In [50,51], the authors give the construction of such a regenerative extension, which makes regenerative techniques available for the study of this type of Markov chain. As mentioned above, we assume in this work that the Harris recurrent chain is atomic, i.e., there is a well-defined accessible set, visited infinitely often almost surely, called an atom. By definition, an atom $A$ is a set in $\mathcal{E}$ with $\mu(A) > 0$ such that, for all $x, y \in A$, $\pi(x,\cdot) = \pi(y,\cdot)$. Let $\mathbb{P}_A$ (respectively, $\mathbb{E}_A$) be the probability measure on the underlying space such that $X_0 \in A$ (respectively, the $\mathbb{P}_A$-expectation).
The conditions imposed on the Markov chain ensure that the defined atom $A$ (or the constructed one, in the case of a nonatomic chain) is a recurrent class. Let us now define the following terms.

2.1.1. Hitting Times

Define $T_j : E \to \mathbb{N} \cup \{\infty\}$ by
$$T_0 := \inf\{n \ge 0 : X_n \in A\}, \qquad T_j := \inf\{n > T_{j-1} : X_n \in A\}.$$
A well-known property of the hitting times is that, for all $j \in \mathbb{N}$, $T_j < \infty$, $\mathbb{P}_\nu$-a.s. ([52], Chapter I.14).

2.1.2. Renewal Times

Using the hitting times, we can define the renewal times as
$$\tau_0 := T_0 + 1, \qquad \tau(j) := T_j - T_{j-1}.$$
As for a regenerative process, the sequence of renewal times $\{\tau(j)\}_{j=1}^{\infty}$ is i.i.d. and independent of the choice of the initial probability. Throughout this work, we set $\tau = \tau(1)$ and $\alpha = \mathbb{E}_A(\tau)$.
Definition 7
(Strong Markov property). Let $(X_n)_{n\ge 0}$ be a Markov chain and let $T$ be a stopping time for $(X_n)_{n\ge 0}$. Then, conditionally on $T < \infty$ and $X_T = i$, $(X_{T+n})_{n\ge 0}$ is a Markov chain that is independent of $X_0,\dots,X_T$.

2.1.3. Regenerative Blocks

Let $l_n := \max\{j : \sum_{i=0}^{j} \tau(i) \le n\}$ be the number of visits to the atom $A$. Using the strong Markov property, it is possible to divide the given sample $(X_1,\dots,X_n)$ into a sequence of blocks $\{B_j\}_{j=0}^{l_n}$ such that:
$$B_0 = (X_1,\dots,X_{T_0}), \qquad B_j = (X_{T_{j-1}+1},\dots,X_{T_j}) \in \mathbb{T} = \bigcup_{n=1}^{\infty} E^n \ \text{ for all } j = 1,\dots,l_n - 1, \qquad B_{l_n}^{(n)} = (X_{T_{l_n-1}+1},\dots,X_n),$$
where $l_n$ is the total number of blocks. The length of each block is denoted by
$$l(B_j) := T_j - T_{j-1}.$$
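The block construction above is straightforward to implement. The following Python sketch (our illustration; the finite state space and the atom $\{0\}$ are assumptions made for the example) splits a trajectory at its visits to the atom:

```python
# A minimal sketch: splitting a trajectory into regenerative blocks at an atom.
import numpy as np

def regenerative_blocks(x, atom):
    """Return (B_0, [B_1, ..., B_{l_n - 1}], last incomplete block)."""
    hits = [i for i, v in enumerate(x) if v == atom]   # hitting times T_0, T_1, ...
    if len(hits) < 2:
        return list(x), [], []
    head = list(x[: hits[0] + 1])                      # nonregenerative block B_0
    blocks = [list(x[hits[j - 1] + 1 : hits[j] + 1])   # B_j = (X_{T_{j-1}+1}, ..., X_{T_j})
              for j in range(1, len(hits))]
    tail = list(x[hits[-1] + 1 :])                     # incomplete block B_{l_n}^{(n)}
    return head, blocks, tail

rng = np.random.default_rng(1)
chain = rng.integers(0, 3, size=30)                    # toy chain on {0, 1, 2}
head, blocks, tail = regenerative_blocks(chain, atom=0)
print(len(blocks), "complete regenerative blocks")
```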

2.2. Exchangeable Weights

In what follows, $\xi$ represents a real-valued random variable, and the $\xi_i$ are independent of $(X_i)$. For $1 \le p < \infty$, we denote the $p$-norm by
$$\|\xi\|_p = \big(\mathbb{E}(|\xi|^p)\big)^{1/p}.$$
Definition 8
(Exchangeability). Let $\xi_{n1},\dots,\xi_{nn}$ be a sequence of random variables with joint distribution $P_\xi$, and let $\Sigma(n)$ be the group of all permutations acting on $\{1,\dots,n\}$. We say that $\xi_{n1},\dots,\xi_{nn}$ are exchangeable if, for all $\sigma \in \Sigma(n)$,
$$P_\xi(\xi_{n1},\dots,\xi_{nn}) = P_\xi(\xi_{n\sigma(1)},\dots,\xi_{n\sigma(n)}).$$
We assume the following:
(A1) $(\xi_1,\dots,\xi_n)$ are exchangeable, non-negative and symmetric, and for all $n$,
$$\sum_{i=1}^{n} \xi_i = n.$$
(A2) $\dfrac{1}{n}\max_{1\le i\le n}(\xi_i - 1)^2 \to 0$ in $P_\xi$-probability, which is implied by the moment assumption
$$\sup_n \|\xi_1\|_{2m,1} < \infty.$$
(A3) There exists $c > 0$ such that, in $P_\xi$-probability,
$$\frac{1}{n}\sum_{i=1}^{n}(\xi_i - 1)^2 \to c^2 > 0.$$
(A4) Assume
$$\lim_{\lambda\to\infty}\varlimsup_{t\to\infty} \lambda\, t^2\, P_\xi(\xi_1 \ge t) = 0.$$

2.3. The U-Process Framework

Let $(X_n)_{n\in\mathbb{N}}$ be a sequence of random variables with values in a measurable space $(E,\mathcal{E})$. Let $h : E^m \to \mathbb{R}$ be a measurable function, symmetric in its arguments. The U-statistic of order (or degree) $m$ and kernel $h(\cdot)$ is defined as:
$$U_n(h) = \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m} h(X_{i_1},\dots,X_{i_m}), \quad \text{for } n \ge m.$$
Accordingly, a U-process is the collection $\{U_n(h) : h \in \mathcal{F}\}$, where $\mathcal{F}$ is a class of kernels $h(\cdot)$ of $m$ variables. The decoupling inequality for U-statistics and U-processes plays a central role in the latest developments of the asymptotic theory. The decoupling inequality relates the quantities
$$\mathbb{E}\,\Phi\Big(\Big\|\sum_{I_n^m} h\big(X_{i_1},\dots,X_{i_m}\big)\Big\|\Big) \quad \text{and} \quad \mathbb{E}\,\Phi\Big(\Big\|\sum_{I_n^m} h\big(X_{i_1}^{(1)},\dots,X_{i_m}^{(m)}\big)\Big\|\Big),$$
where $\Phi(\cdot)$ is a non-negative function and $\{X_i^{(k)}\}$, $k = 1,\dots,m$, are independent copies of the original sequence $\{X_i\}$. One useful consequence of decoupling is randomization, which is frequently used in the asymptotic theory of U-statistics and was studied by [6,7]. The main idea of randomization is to compare the tail probabilities or moments of the original U-statistic or U-process, $\sum_{I_n^m} h(X_{i_1},\dots,X_{i_m})$, with those of the statistic
$$\sum_{I_n^m} \varepsilon_{i_1}\cdots\varepsilon_{i_r}\, h(X_{i_1},\dots,X_{i_m}),$$
where the $\varepsilon_i$ are independent Rademacher variables, independent of the $X_i$, $1 \le r \le m$, and $r$ depends on the degree of degeneracy (centering) of the kernel $h(\cdot)$.
Definition 9
([6]). A symmetric $P^m$-integrable kernel $h : E^m \to \mathbb{R}$ is $P$-degenerate of order $r - 1$ if and only if
$$\int h(x_1,\dots,x_m)\, dP^{m-r+1}(x_r,\dots,x_m) = \int h\, dP^m$$
holds for all $x_1,\dots,x_{r-1} \in E$, whereas
$$\int h(x_1,\dots,x_m)\, dP^{m-r}(x_{r+1},\dots,x_m)$$
is not a constant function. If $h$ is furthermore $P^m$-centered, that is, $P^m h = 0$, we write $h \in L_2^{c,r}(P^m)$. For notational simplicity, we usually write $L_2^{c,m}(P^m) = L_2^{c,m}(P)$.
Moreover, $h(\cdot)$ is said to be canonical or completely degenerate if the integral with respect to any one variable is equal to zero, i.e.,
$$\int h(x_1,\dots,x_m)\, dP(x_1) = 0 \quad \text{for all } x_2,\dots,x_m \in E.$$
The fact that the kernel is completely degenerate, together with the condition $P^m h^2 < \infty$, yields the orthogonality of the different terms of the Hoeffding decomposition of the U-statistic.
Definition 10
(Covering number). The covering number $N_p(\varepsilon, Q, \mathcal{F})$ is defined as the minimal number of balls of radius $\varepsilon$ needed to cover a class of functions $\mathcal{F}$ in the norm $L_p(Q)$, where $Q$ is a measure on $E$ with finite support.
We can associate distances $e_{n,p}$ with these covering numbers, where
$$e_{n,p}(f,g) = \big(U_n(|f-g|^p)\big)^{1/p}.$$
In this work, we use the two distances defined below:
$$e_{n,2}(f,g) = \Big(\frac{(n-m)!}{n!}\sum_{0\le i_1<\cdots<i_m\le n}\big((f-g)(X_{i_1},\dots,X_{i_m})\big)^2\Big)^{1/2}.$$
For decoupled statistics, we also associate covering numbers, denoted $\widetilde N(\epsilon, \mathcal{F}, \widetilde e_{n,p})$, and a distance which, for $p = 2$, is defined as:
$$\widetilde e_{n,2}(f,g) = \Big(n^{-1/2}\,\frac{(n-m)!}{n!}\,\mathbb{E}_\varepsilon\Big(\sum_{0\le i_1<\cdots<i_m\le n}\varepsilon_{i_1}(f-g)(X_{i_1},\dots,X_{i_m})\Big)^2\Big)^{1/2}.$$
Definition 11.
A class $\mathcal{F}$ of measurable functions $E \to \mathbb{R}$ is said to be of VC type (or Vapnik–Chervonenkis type) for an envelope $F$ and admissible characteristics $(C, v)$ (positive constants such that $C \ge (3\sqrt{e})^v$ and $v \ge 1$) if, for all probability measures $Q$ on $(E,\mathcal{E})$ with $0 < \|F\|_{L_2(Q)} < \infty$ and every $0 < \epsilon < 1$,
$$N\big(\epsilon\|F\|_{L_2(Q)},\ \mathcal{F},\ \|\cdot\|_{L_2(Q)}\big) \le C\,\epsilon^{-v}.$$
We assume that the class is countable to avoid measurability issues (the noncountable case may be handled similarly by using an outer probability and additional measurability assumptions, see [53]).
Definition 12
(Stochastic equicontinuity, [54]). Let $\{Z_n\}$ be a sequence of stochastic processes. We call $\{Z_n\}$ stochastically equicontinuous at $t_0$ if, for each $\varepsilon > 0$ and $\eta > 0$, there exists a neighborhood $D$ of $t_0$ such that
$$\limsup_n \mathbb{P}\Big(\sup_{t\in D}\big|Z_n(t) - Z_n(t_0)\big| > \eta\Big) < \varepsilon.$$
In the context of the U-process $\{U_n\}$, stochastic equicontinuity at a function $g \in \mathcal{F}$ generally means that $|U_n(h) - U_n(g)|$ is uniformly small with high probability for all $h(\cdot)$ close enough to $g(\cdot)$ and all $n$ large enough.

2.4. Gaussian Chaos Process

Definition 13.
Let $H$ denote a real separable Hilbert space with scalar product $\langle\cdot,\cdot\rangle_H$. We say that a stochastic process $G = \{G_P(h),\ h \in H\}$ defined on a complete probability space $(E,\mathcal{E},P)$ is an isonormal Gaussian process (or a Gaussian process on $H$) if $G_P$ is a centered Gaussian family of random variables such that $\mathbb{E}(G_P(h)G_P(g)) = \langle h, g\rangle_H$ for all $h, g \in H$.
Consider the mapping $h \mapsto G_P(h)$. Under the assumption above, this map is linear and provides a linear isometry of $H$ onto a closed subspace of $L_2(E,\mathcal{E},P)$ whose elements are zero-mean Gaussian random variables. Let $K_P$ be the isonormal Gaussian chaos process associated with $G_P$, determined by:
$$K_P\big(h_m^{\psi}\big) = (m!)^{1/2}\, R_m\big(G_P(\psi),\ \mathbb{E}\psi^2,\ 0,\dots,0\big),$$
where $h_m^{\psi}(x_1,\dots,x_m) = \psi(x_1)\cdots\psi(x_m)$, $\psi \in L_2(P)$, and $R_m$ is a polynomial defined as a sum of monomials of degree $m$; ref. [6] gives a simple expression for this polynomial, extracted from Newton's identity:
$$\sum_{1\le i_1<\cdots<i_m\le n} t_{i_1}\cdots t_{i_m} = R_m\Big(\sum_{i=1}^n t_i,\ \sum_{i=1}^n t_i^2,\ \dots,\ \sum_{i=1}^n t_i^m\Big).$$
Therefore,
$$\sum_{1\le i_1<\cdots<i_m\le n} \psi(x_{i_1})\cdots\psi(x_{i_m}) = R_m\Big(\sum_{i=1}^n \psi(x_i),\ \sum_{i=1}^n \psi(x_i)^2,\ \dots,\ \sum_{i=1}^n \psi(x_i)^m\Big).$$
Hence, by the continuous mapping theorem, the CLT and the LLN give:
$$\Big(\binom{n}{m_1}^{1/2} U_n\big(h_{m_1}^{\psi_1}\big),\dots,\binom{n}{m_r}^{1/2} U_n\big(h_{m_r}^{\psi_r}\big)\Big) \rightsquigarrow \Big((m_1!)^{1/2} R_{m_1}\big(G_P(\psi_1), \mathbb{E}\psi_1^2, 0,\dots,0\big),\dots,(m_r!)^{1/2} R_{m_r}\big(G_P(\psi_r), \mathbb{E}\psi_r^2, 0,\dots,0\big)\Big).$$
By the linearity of the kernel, we only need to show that:
$$\Big\{\binom{n}{m}^{1/2} U_n(f) : f \in \mathcal{F}\Big\} \rightsquigarrow \Big\{K_P(f) = (m!)^{1/2} R_m\big(G_P(\psi), \mathbb{E}\psi^2, 0,\dots,0\big) : f \in \mathcal{F}\Big\} \quad \text{in } \ell^\infty(\mathcal{F}),$$
for the weak convergence to hold. The limit $K_P$ is useful in the case of degenerate U-statistics, and it provides convergence of all moments, which plays a crucial role: by hypercontractivity, the uniform integrability improves. For a thorough discussion of $K_P$, readers are referred to ([6], Chapter 4, Section 4.2).

2.5. Technical Assumptions

For our results, we need the following assumptions.
(C.1) (Block-length assumption) For all $q \ge 1$ and $l \ge 1$,
$$\mathbb{E}_\nu\big(\tau^l\big) < \infty, \qquad \mathbb{E}_A\big(\tau^q\big) < \infty;$$
(C.2) (Nonregenerative blocks) For $l \ge 1$, we have
$$\mathbb{E}_\nu\Big(\sum_{i_1=1}^{T_0}\sum_{i_2=T_0+1}^{T_1}\sum_{i_3=T_1+1}^{T_2}\cdots\sum_{i_m=T_{m-1}+1}^{T_m} \big|h(X_{i_1},\dots,X_{i_m})\big|\Big)^l < \infty$$
and
$$\mathbb{E}_\nu\Big(\sum_{i_1=T_0+1}^{T_1}\sum_{i_2=T_1+1}^{T_2}\cdots\sum_{i_{m-1}=T_{m-2}+1}^{T_{m-1}}\ \sum_{i_m=T_{l_n}+1}^{n} \big|h(X_{i_1},\dots,X_{i_{m-1}},X_{i_m})\big|\Big)^l < \infty;$$
(C.3) (Block-sum moment assumptions) For $l \ge 1$, we have
$$\mathbb{E}_\nu\Big(\sum_{i_1=T_0+1}^{T_1}\sum_{i_2=T_1+1}^{T_2}\cdots\sum_{i_m=T_{m-1}+1}^{T_m} \big|h(X_{i_1},\dots,X_{i_m})\big|\Big)^l < \infty,$$
and
$$\mathbb{E}_A\Big(\sum_{T_0+1\le i_1\le\cdots\le i_m\le T_1} h(X_{i_1},\dots,X_{i_m})\Big)^l < \infty;$$
(C.4) For $l \ge 1$, we have
$$\mathbb{E}_\nu\Big(\sum_{i_1=T_0+1}^{T_1}\sum_{i_2=T_1+1}^{T_2}\cdots\underbrace{\sum_{i_k=T_{k}+1}^{T_{k+1}}\cdots\sum_{i_k=T_{k}+1}^{T_{k+1}}}_{u\ \text{times}}\ \sum_{i_{k+u}=T_{k+u}+1}^{T_{k+u+1}}\cdots\sum_{i_m=T_{m-1}+1}^{T_m} \big|h\big(X_{i_1},\dots,\underbrace{X_{i_k},\dots,X_{i_k}}_{u\ \text{times}},X_{i_{k+u}},\dots,X_{i_m}\big)\big|\Big)^l < \infty;$$
(C.5) (Nondegeneracy) We also suppose that
$$\mathbb{E}_A\Big(\sum_{i=T_0+1}^{T_1} h_1(X_i)\Big)^2 > 0.$$
Remark 1
(Moment assumptions). In practice, we recall that block-moment assumptions for the split Markov chain can generally be checked by establishing drift conditions of Lyapunov's type for the original chain; see Chapter 11 in [55,56]. All these moment conditions are discussed in detail in ([57], Chapters 11 and 17). A key condition in the proof of ergodic theorems in the Markovian context is the fact that $\mathbb{E}_A(\tau_0) < \infty$ for any set $A \in \mathcal{E}$ such that $\psi(A) > 0$. In fact, when there is a finite invariant measure and an atom $A$, this condition is readily verified. We also refer to [58] for an explicit check of such conditions on several important examples, and to §4.1.2 of [59] for sufficient conditions expressed in terms of a uniform return rate to small sets. Finally, as discussed in Chapter 8 of [60], similar conditions can be expressed in terms of potential kernels. Observe that, in the positive recurrent case, the assumptions of (C.1) are not independent when $\nu = \mu$: from basic renewal theory, one has $\mathbb{P}_\mu(\tau = k) = (\mathbb{E}_A\tau)^{-1}\,\mathbb{P}_A(\tau \ge k)$ for all $k \ge 1$. Hence, the conditions $\mathbb{E}_\mu(\tau^l) < \infty$ and $\mathbb{E}_A(\tau^{l+1}) < \infty$ are equivalent.

3. Preliminary Results

A significant issue arises in estimating our parameter of interest using the U-process. The parameter of interest has the form:
$$\mu(h) = \int_{x_1\in E}\cdots\int_{x_k\in E} h(x_1,\dots,x_k)\,\mu(dx_1)\cdots\mu(dx_k),$$
where $h : E^m \to \mathbb{R}$ is a kernel function. This parameter can be estimated by a U-statistic of the form:
$$U_n(h) = \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m} h(X_{i_1},\dots,X_{i_m}), \quad \text{for } n \ge m.$$
Based on Kac's theorem for the occupation measure, $\mu(h)$ can be written in the regeneration setup as follows:
$$\mu(h) = \frac{1}{(\mathbb{E}_A(\tau))^m}\,\mathbb{E}_A\Big(\sum_{i_1=T_0+1}^{T_1}\sum_{i_2=T_1+1}^{T_2}\cdots\sum_{i_m=T_{m-1}+1}^{T_m} h(X_{i_1},\dots,X_{i_m})\Big).$$
In the Markovian context, since the variables are not independent, the approximation based on the i.i.d. blocks of the regenerative case is introduced below:
Definition 14
(Regenerative kernel). Let $h : E^m \to \mathbb{R}$ be a kernel. We define the regenerative kernel $\omega_h : \mathbb{T}^m \to \mathbb{R}$ as follows:
$$\omega_h\big((x_{11},\dots,x_{1n_1}),\dots,(x_{k1},\dots,x_{kn_k})\big) = \sum_{i_1=1}^{n_1}\cdots\sum_{i_k=1}^{n_k} h(x_{1i_1},\dots,x_{ki_k}).$$
The kernel $\omega_h(\cdot)$ need not be symmetric, even when $h(\cdot)$ is. In that case, we can use the symmetrized version $S_m\omega_h$, given by
$$(S_m\omega_h) = (m!)^{-1}\sum_{\sigma}\ \sum_{i_{\sigma(1)}=1}^{n_1}\cdots\sum_{i_{\sigma(m)}=1}^{n_k} h\big(x_{i_{\sigma(1)}},\dots,x_{i_{\sigma(m)}}\big),$$
where the first sum is over all permutations $\sigma$ of $\{1,\dots,m\}$. Next, we consider the U-statistic formed by the regenerative data.
Definition 15
(Regenerative U-statistic). Let $h : E^m \to \mathbb{R}$ be a kernel such that $\mu(|h|) < \infty$, and set $\tilde h(\cdot) = h(\cdot) - \mu(h)$. The regenerative U-statistic associated with the sequence of regenerative blocks $\{B_j\}_{j=1}^{l_n-1}$ generated by the Markov chain is given by
$$R_{l_n}(h) = \binom{l_n - 1}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n-1}^m} \omega_{\tilde h}(B_{i_1},\dots,B_{i_m}).$$
Hence, $R_{l_n}(h)$ is a standard U-statistic with mean zero.
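For concreteness, the following Python sketch (our illustration, not the authors' code) evaluates the regenerative kernel $\omega_h$ and an order-2 regenerative U-statistic over a list of blocks:

```python
# A minimal sketch: the regenerative kernel omega_h and the order-2
# regenerative U-statistic over blocks B_1, ..., B_{l_n - 1}.
import itertools
import numpy as np

def omega_h(block_a, block_b, h):
    """omega_h(B_i, B_j) = sum of h(x, y) over x in B_i and y in B_j."""
    return sum(h(x, y) for x in block_a for y in block_b)

def regenerative_u_stat(blocks, h_tilde):
    """Average of omega_{h~} over all pairs of distinct blocks."""
    vals = [omega_h(blocks[i], blocks[j], h_tilde)
            for i, j in itertools.combinations(range(len(blocks)), 2)]
    return float(np.mean(vals))

blocks = [[0.1, 0.5], [0.3], [0.7, 0.2, 0.9]]          # toy regenerative blocks
print(regenerative_u_stat(blocks, lambda x, y: abs(x - y)))
```

Here `h_tilde` stands for the centered kernel $\tilde h = h - \mu(h)$; in practice, $\mu(h)$ is unknown and must itself be estimated.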
Proposition 1.
Let us define
$$W_n(h) = U_n(h) - \mu(h) - \binom{l_n-1}{m}\binom{n}{m}^{-1} R_{l_n}(h).$$
Then, under conditions (C.1), (C.2), (C.3) and (C.4), we have the following stochastic convergence:
$$W_n(h) \to 0, \quad \mathbb{P}_\nu\text{-a.s.}$$
Before stating the weak convergence in the next theorem, we define the U-processes corresponding to the U-statistic $U_n$ and the regenerative U-statistic $R_{l_n}$, respectively:
$$Z_n := \binom{n}{m}^{1/2}\big(U_n - \mu(h)\big),$$
$$T_{l_n} := \binom{l_n}{m}^{1/2}\big(R_{l_n} - \mathbb{E}(R_{l_n})\big).$$
Theorem 1.
Let $(X_n)_n$ be a positive recurrent Harris Markov chain with an accessible atom $A$, satisfying conditions (C.1) and (C.2) (moment assumptions), (C.3), (C.4) and (C.5) and, for a fixed $\gamma > 0$, $\mathbb{E}(\tau)^{2+\gamma} < \infty$. Let $\mathcal{F}$ be a uniformly bounded class of functions with a square-integrable envelope $H$ such that:
$$\int_0^\infty \big(\log N(\varepsilon, \mathcal{F}, e_{n,2})\big)^{m/2}\, d\varepsilon < \infty.$$
Then, the process $Z_n$ converges weakly in probability under $\mathbb{P}_\nu$ to a Gaussian process $G_P$ indexed by $\mathcal{F}$, whose sample paths are bounded and uniformly continuous with respect to the metric $L_2(\mu)$.

The Bootstrapped U-Processes

To facilitate the use of the bootstrap technique, we write out the detailed steps of the regenerative block construction and the weighted bootstrap method in Algorithm 1:
Algorithm 1 Regenerative block and weighted bootstrap construction.
  • Identify the number of visits $l_n = \sum_{i=0}^{n} \mathbb{1}\{X_i \in A\}$ to the atom $A$.
  • Divide the sample $X^{(n)} = (X_1,\dots,X_n)$ into $(l_n + 1)$ regenerative blocks $B_0,\dots,B_{l_n-1}, B_{l_n}^{(n)} \in \mathbb{T}$, each block $B_i$ having length $l(B_i) \equiv \tau_i$.
  • Drop the first block, and also the last one if $\tau_{l_n} < n$, to avoid bias.
  • Let $\xi = (\xi_{i,l_n},\ i = 1,\dots,n)$ be a triangular array of random variables. Define the weighted bootstrap empirical measure from the data:
    $$P_n^* = \frac{1}{l_n}\sum_{i=1}^{n} \xi_{i,l_n}\,\delta_{B_i}.$$
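A compact implementation sketch of Algorithm 1 follows (our illustration; the Dirichlet weights are just one admissible choice of exchangeable weights satisfying (A1)–(A4)):

```python
# A minimal sketch of Algorithm 1: given the regenerative blocks of a
# trajectory, draw exchangeable weights and form the weighted bootstrap
# empirical measure P_n^* over the blocks.
import numpy as np

def weighted_block_bootstrap(blocks, rng):
    ln = len(blocks)
    # Dirichlet(1, ..., 1) spacings scaled by l_n: exchangeable, non-negative
    # weights summing to l_n, matching assumption (A1).
    xi = rng.dirichlet(np.ones(ln)) * ln
    # P_n^* puts mass xi_i / l_n on block B_i.
    return xi, [(w / ln, b) for w, b in zip(xi, blocks)]

rng = np.random.default_rng(2)
blocks = [[0.1, 0.5], [0.3], [0.7, 0.2, 0.9]]
xi, pn_star = weighted_block_bootstrap(blocks, rng)
print(xi.sum())   # equals l_n, as required by (A1)
```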
In what follows, we denote by $\mathbb{P}^*$ and $\mathbb{E}^*$, respectively, the conditional probability and the conditional expectation given the sample $\{X_1,\dots,X_n\}$. The same notation is used for the sample $\{B_1,\dots,B_{l_n}\}$. Define the bootstrapped U-statistic as
$$U_n^*(h) = \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m} \xi_{i_1,n}\cdots\xi_{i_m,n}\, h(X_{i_1},\dots,X_{i_m})$$
and the regenerative bootstrapped U-statistic as
$$R_{l_n}^*(h) = \binom{l_n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m} \xi_{i_1,l_n}\cdots\xi_{i_m,l_n}\, \omega_h(B_{i_1},\dots,B_{i_m});$$
the corresponding U-processes are:
$$Z_n^* := \Big\{\binom{n}{m}^{1/2}\big(U_n^*(h) - U_n(h)\big)\Big\}_{h\in\mathcal{F}} = \binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I_n^m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\, h(X_{i_1},\dots,X_{i_m})$$
and
$$T_{l_n}^* := \Big\{\binom{l_n}{m}^{1/2}\big(R_{l_n}^*(h) - \mathbb{E}(R_{l_n}^*(h))\big)\Big\}_{h\in\mathcal{F}} = \binom{l_n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\, \omega_h(B_{i_1},\dots,B_{i_m}).$$
Given a real-valued function $\Delta_n$ defined on the product probability space, we say that $\Delta_n$ is of order $o_{P_\xi^o}(1)$ in $\mathbb{P}_\nu^o$-probability if, for any $\varepsilon, \delta > 0$,
$$\mathbb{P}_\nu^o\Big(P_{\xi|X}^o\big(|\Delta_n| > \varepsilon\big) > \delta\Big) \to 0 \quad \text{as } n \to \infty,$$
and that $\Delta_n$ is of order $O_{P_\xi^o}(1)$ in $\mathbb{P}_\nu^o$-probability if, for any $\delta > 0$, there exists $0 < M < \infty$ such that
$$\mathbb{P}_\nu^o\Big(P_{\xi|X}^o\big(|\Delta_n| \ge M\big) > \delta\Big) \to 0 \quad \text{as } n \to \infty,$$
where the superscript $o$ refers to outer probability. We note here that the bootstrap works in probability if
$$d_{BL}\big(T_{l_n}^*, T_{l_n}\big) \to 0 \quad \text{in probability},$$
where
$$d_{BL}\big(T_{l_n}^*, T_{l_n}\big) = \sup_{g\in BL_1(\ell^\infty(\mathcal{F}))} \big|\mathbb{E}\big(g(T_{l_n}^*)\big)^* - \mathbb{E}\, g(T_{l_n})\big|,$$
and
$$BL_1(\ell^\infty(\mathcal{F})) := \big\{g : \ell^\infty(\mathcal{F}) \to \mathbb{R},\ |g(x) - g(y)| \le \|x - y\|_{\mathcal{F}},\ \|g\|_\infty \le 1\big\},$$
and $\big(g(T_{l_n}^*)\big)^*$ is the measurable envelope of $g(T_{l_n}^*)$. In addition, for measurable random elements $Y_n$ and $Y$, the convergence in law of $Y_n$ to $Y$ is understood in the sense of Hoffmann–Jørgensen, which is defined as
$$\mathbb{E}^*\, g(Y_n) \to \mathbb{E}\, g(Y),$$
for every $g$ bounded and continuous. This weak convergence is metrizable by Theorem A1 in Appendix A.
Proposition 2.
Suppose that the bootstrap weights $\xi_1,\dots,\xi_n$ satisfy Assumptions (A1)–(A4). Let
$$W_n^*(h) := U_n^*(h) - \binom{l_n-1}{m}\binom{n}{m}^{-1} R_{l_n}^*(h).$$
Then, we have
$$W_n^*(h) \to 0, \quad \mathbb{P}_\nu\times P_\xi\text{-a.s.}$$
The proof of Proposition 2 is postponed until Section 7.
The following lemma collects some instrumental results needed later.
Lemma 1.
Let $(X_n)_n$ be a Markov chain as defined in Section 2.1. Define $p := \mathbb{P}(X_0 \in A) = \alpha^{-1}$. Then, for any initial probability $\nu$, we have:
(i) For some $\eta > 0$ and $C > 0$:
$$\Big|\frac{\mathbb{E}_\nu(l_n)}{np} - 1\Big| \le \frac{C}{n} \quad \text{and} \quad \sqrt n\Big(\frac{l_n}{np} - 1\Big) \rightsquigarrow N(0, \eta^2).$$
(ii) $\dfrac{n^*}{n} \to 1$ in $\mathbb{P}_\nu \times P_\xi$-probability.
(iii) Let $(X_i)$ be a sequence of random variables. If
$$T_n = \frac{1}{n}\sum_{i=1}^n X_i \to C \quad a.s.,$$
then, for any integer-valued sequence of random variables $t_n \to \infty$,
$$\frac{1}{t_n}\sum_{i=1}^{t_n} X_i \to C \quad \text{in } \mathbb{P}_\nu\text{-probability}.$$
The proof of Lemma 1 is postponed until Section 7.

4. Weighted Bootstrap Weak Convergence

In this section, we extend some existing results concerning the multiplier U-process in order to prove the bootstrap uniform weak convergence. Most of these results can be found in [46], generalizing the empirical-process work of [38] in the i.i.d. setting. The weak convergence is proved for degenerate U-processes, as mentioned before, under the weighted regenerative bootstrap scheme described in Algorithm 1. Before stating the weak convergence theorem, we recall the following important results. The next theorem, proved in [46], is a sharp multiplier inequality, which is essential in the study of the multiplier U-process. These results are based on the decoupled symmetrized U-process, a basic tool in the theory of U-statistics. In [47], the author solved these problems for the empirical process in the Markov setting (multinomial bootstrap), which we generalize to the U-process by considering more general weights, i.e., the exchangeably weighted bootstrap.
Theorem 2
([46]). Let $(\xi_1,\dots,\xi_n)$ be a random vector independent of $(Y_1,\dots,Y_n)$. Then, there exists some measurable function $\psi_n : \mathbb{R}_{\ge 0}^m \to \mathbb{R}_{\ge 0}$ such that the expected supremum of the decoupled U-processes (here, "decoupled" refers to the fact that $\{Y_i^{(k)}\}$, $k \in \mathbb{N}$, are independent copies of $\{Y_i\}$, and $\{\epsilon_i^{(k)}\}$, $k \in \mathbb{N}$, are independent copies of the Rademacher sequence $\{\epsilon_i\}$) satisfies
$$\mathbb{E}\,\Big\|\sum_{1\le i_k\le \ell_k,\ 1\le k\le m} \epsilon_{i_1}^{(1)}\cdots\epsilon_{i_m}^{(m)}\, f\big(Y_{i_1}^{(1)},\dots,Y_{i_m}^{(m)}\big)\Big\|_{\mathcal{F}} \le \psi_n(\ell_1,\dots,\ell_m),$$
for all $1 \le \ell_1,\dots,\ell_m \le n$; consequently,
$$\mathbb{E}\,\Big\|\sum_{1\le i_1,\dots,i_m\le l_n - 1} \xi_{i_1}\cdots\xi_{i_m}\, f\big(Y_{i_1}^{(1)},\dots,Y_{i_m}^{(m)}\big)\Big\|_{\mathcal{F}} \le K\int_{\mathbb{R}_{\ge 0}^m} \mathbb{E}\,\psi_n\Big(\sum_{i=1}^{l_n-1}\mathbb{1}\{|\xi_i| > t_1\},\dots,\sum_{i=1}^{l_n-1}\mathbb{1}\{|\xi_i| > t_m\}\Big)\, dt_1\cdots dt_m.$$
Furthermore, if there exists a concave and nondecreasing function $\bar\psi_n : \mathbb{R} \to \mathbb{R}$ such that $\psi_n(\ell_1,\dots,\ell_m) = \bar\psi_n\big(\prod_{k=1}^m \ell_k\big)$, then
$$\mathbb{E}\,\Big\|\sum_{1\le i_1,\dots,i_m\le n} \xi_{i_1}\cdots\xi_{i_m}\, f\big(Y_{i_1}^{(1)},\dots,Y_{i_m}^{(m)}\big)\Big\|_{\mathcal{F}} \le K\int_{\mathbb{R}_{\ge 0}^m} \bar\psi_n\Big(\sum_{1\le i_1,\dots,i_m\le n}\ \prod_{k=1}^m \mathbb{P}\big(|\xi_{i_k}| > t_k\big)^{1/m}\Big)\, dt_1\cdots dt_m.$$
Here, $K > 0$ is a constant depending on $m$ only; it can be taken as $K = 2^{2m}\prod_{k=2}^m \binom{k}{k-1}$ for $m \ge 2$.
Lemma 2
([46]). Let $\{\mathcal{F}_{(\ell_1,\dots,\ell_m),n} : 1 \le \ell_1,\dots,\ell_m \le n,\ n \in \mathbb{N}\}$ be function classes such that $\mathcal{F}_{(\ell_1,\dots,\ell_m),n} \subset \mathcal{F}_{(n,\dots,n),n}$ for all $1 \le \ell_1,\dots,\ell_m \le n$. Suppose that the $\xi_i$'s have the same marginal distributions with $\|\xi_1\|_{2m,1} < \infty$. Suppose that there exists some bounded measurable function $a : \mathbb{R}_{\ge 0}^m \to \mathbb{R}_{\ge 0}$ with $a(\ell_1,\dots,\ell_m) \to 0$ as $\ell_1\wedge\cdots\wedge\ell_m \to \infty$, such that the expected supremum of the decoupled U-processes satisfies
$$\mathbb{E}\,\Big\|\sum_{1\le i_k\le\ell_k,\ 1\le k\le m} \epsilon_{i_1}^{(1)}\cdots\epsilon_{i_m}^{(m)}\, f\big(Y_{i_1}^{(1)},\dots,Y_{i_m}^{(m)}\big)\Big\|_{\mathcal{F}_{(\ell_1,\dots,\ell_m),n}} \le a(\ell_1,\dots,\ell_m)\prod_{k=1}^m \ell_k^{1/2},$$
for all $1 \le \ell_1,\dots,\ell_m \le n$. Then,
$$n^{-m/2}\,\mathbb{E}\,\Big\|\sum_{1\le i_1,\dots,i_m\le n} \xi_{i_1}\cdots\xi_{i_m}\, f\big(Y_{i_1}^{(1)},\dots,Y_{i_m}^{(m)}\big)\Big\|_{\mathcal{F}_{(n,\dots,n),n}} \to 0, \quad n \to \infty.$$
The main result of this paper is presented in the following theorem. It is worth noting here that it is not easy to prove the stochastic equicontinuity in the present setting, as explained in the introduction.
Definition 16
(Permissible classes of functions). Let $(E,\mathcal{E},P)$ be a measurable space ($\mathcal{E}$ a Borel $\sigma$-field on $E$). Let $\mathcal{F}$ be a class of functions indexed by a parameter $x$ belonging to a set $E$. $\mathcal{F}$ is called permissible if it can be indexed in such a way that:
  • there exists a function $g(x, f) = f(x)$, defined from $E \times \mathcal{F}$ to $\mathbb{R}$, that is $\mathcal{L}\otimes\mathcal{B}(\mathcal{F})$-measurable, where $\mathcal{B}(\mathcal{F})$ is the Borel $\sigma$-algebra generated by the metric on $\mathcal{F}$;
  • $E$ is a Suslin measurable space, meaning that $E$ is an analytic subset of a compact metric space $\bar E$ from which it inherits its metric and Borel $\sigma$-field.
Theorem 3.
Suppose Assumptions (A1) to (A4) and Conditions (C.1)–(C.5) hold. Let $\mathcal{F} \subset L_2^{c,m}(P)$ be permissible and admit a $P^m$-square-integrable envelope $F$ such that
$$\int_0^1 \sup_Q \Big(\log N\big(\epsilon\|F\|_{L_2(Q)}, \mathcal{F}, L_2(Q)\big)\Big)^{m/2}\, d\epsilon < \infty,$$
where the supremum is taken over all discrete probability measures. Then,
$$\sup_{\psi\in\mathrm{BL}}\big|\mathbb{E}_\xi\,\psi\big(Z_n^*(h)\big) - \mathbb{E}\,\psi\big(c\cdot K_P\big)\big| \xrightarrow{\ \mathbb{P}_\nu\ } 0,$$
where $c$ is the constant in (A3), and the convergence in $\mathbb{P}_\nu$-probability is with respect to the outer probability of $P$ defined on $(E,\mathcal{E})$.
The proof of Theorem 3 is postponed until Section 7.

4.1. Bootstrap Weights Examples

Let $(\xi_1,\dots,\xi_n)$ be a vector of real random variables satisfying Assumptions (A1)–(A4). We give some examples of bootstrap weights; for more explanations, refer for instance to [38,61].

4.1.1. Bayesian Resampling Scheme

In this case, $(\xi_1,\dots,\xi_n)$ are positive i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2$. The weights satisfy $\|\xi_1\|_{2,1} < \infty$, and we define
$$\bar\xi_n = \frac{1}{n}\sum_{i=1}^n \xi_i.$$
The Bayesian bootstrap weights can then be defined as:
$$\xi_{ni} = \xi_i / \bar\xi_n,$$
satisfying
$$\|\xi_{n1}\|_{2,1} = \int_0^\infty \sqrt{P_\xi(\xi_{n1} \ge u)}\, du < \infty.$$
For $\xi_i \sim \mathrm{Exponential}(1)$ or $\xi_i \sim \mathrm{Gamma}(4,1)$, the Bayesian weights are distributionally equivalent to Dirichlet weights. For the value of $c^2$, we have:
$$\frac{1}{n}\sum_{i=1}^n (\xi_{ni} - 1)^2 \to \frac{\sigma^2}{\mu^2} := c^2, \quad n \to \infty.$$
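The following Python sketch (our illustration) generates such weights from Exponential(1) draws, for which $c^2 = \sigma^2/\mu^2 = 1$:

```python
# A minimal sketch: Bayesian bootstrap weights from i.i.d. positive draws,
# normalized by their sample mean so that they sum to n (assumption (A1)).
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
w = rng.exponential(scale=1.0, size=n)        # Exponential(1) draws
xi = w / w.mean()                              # xi_{ni} = xi_i / xi_bar_n
print(xi.sum())                                # = n, matching (A1)
print(np.mean((xi - 1.0) ** 2))                # -> sigma^2 / mu^2 = 1 = c^2
```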

4.1.2. Efron’s Resampling Scheme

For Efron's bootstrap, we have
$$(\xi_1,\dots,\xi_n) \sim \mathrm{Multinomial}\big(n;\ n^{-1},\dots,n^{-1}\big).$$
Condition (A1) follows directly, Condition (A3) follows from ([37], Lemma 4.1), and Condition (A2) is detailed in [43].
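A one-line simulation of these weights (our illustration) is:

```python
# A minimal sketch: Efron's multinomial bootstrap weights, i.e., the number
# of times each index is redrawn in a classical with-replacement resample.
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
xi = rng.multinomial(n, np.full(n, 1.0 / n))   # Multinomial(n; 1/n, ..., 1/n)
print(xi.sum())                                # = n, matching (A1)
print(np.mean(xi == 0))                        # ~ (1 - 1/n)^n ~ e^{-1} zeros
```

The last line illustrates the point made in Remark 2 below: on average, a proportion of about $e^{-1}$ of the sample receives zero weight under the multinomial scheme.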

4.1.3. The Delete h-Jackknife

In [62], the authors permute the deterministic weights $w_n$, where
$$w_n = \Big(\underbrace{\frac{n}{n-h},\dots,\frac{n}{n-h}}_{n-h\ \text{times}},\ 0,\dots,0\Big) \quad \text{with}\quad \sum_{i=1}^n w_{ni} = n,$$
in order to build new bootstrap weights, defining $\xi_{nj} := w_{n R_n(j)}$, where $R_n(\cdot)$ is a random permutation uniformly distributed over $\{1,\dots,n\}$. These weights are called the delete-h jackknife weights. In order to satisfy Assumption (A3), we must assume that $h/n \to \alpha \in (0,1)$, since $c^2 = h/(n-h)$ and we need $c > 0$.
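These weights are easily simulated (our illustration):

```python
# A minimal sketch: delete-h jackknife weights, a uniform random permutation
# of a deterministic vector with n - h entries equal to n/(n - h) and h zeros.
import numpy as np

rng = np.random.default_rng(5)
n, h = 10_000, 3_000                               # h/n -> alpha in (0, 1)
w = np.concatenate([np.full(n - h, n / (n - h)), np.zeros(h)])
xi = rng.permutation(w)                            # xi_{nj} = w_{n, R_n(j)}
print(xi.sum())                                    # = n, matching (A1)
print(np.mean((xi - 1.0) ** 2), h / (n - h))       # both equal c^2 = h/(n-h)
```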

4.1.4. The Multivariate Hypergeometric Resampling Scheme

As its name indicates, the bootstrap weights of this scheme follow the multivariate hypergeometric distribution, with density:
$$P\big(\xi_{n1} = \varepsilon_1,\dots,\xi_{nn} = \varepsilon_n\big) = \frac{\binom{K}{\varepsilon_1}\cdots\binom{K}{\varepsilon_n}}{\binom{nK}{n}},$$
where $K$ is a positive integer. Assumption (A3) is satisfied with $c^2 = (K-1)/K$.
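Such weights can be simulated by urn sampling (our illustration): draw $n$ balls without replacement from an urn containing $K$ balls of each of $n$ colors, and count the balls of each color.

```python
# A minimal sketch: multivariate hypergeometric weights via urn sampling.
import numpy as np

rng = np.random.default_rng(6)
n, K = 10_000, 5
urn = np.repeat(np.arange(n), K)                  # K balls of each color
draw = rng.choice(urn, size=n, replace=False)     # n draws without replacement
xi = np.bincount(draw, minlength=n)               # xi_i = count of color i
print(xi.sum())                                   # = n, matching (A1)
print(np.mean((xi - 1.0) ** 2), (K - 1) / K)      # ~ c^2 = (K - 1) / K
```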
Remark 2.
As pointed out in [38], the bootstraps mentioned above are "smoother", in some sense, than the multinomial bootstrap, because they place some (random) weight on all elements of the sample, whereas the multinomial bootstrap places positive weight on a proportion of only about $1 - (1 - n^{-1})^n \approx 1 - e^{-1} \approx 0.6322$ of the sample elements, on average. Notice that, when $\omega_i \sim \mathrm{Gamma}(4,1)$, the $\xi_{ni}/n$ are equivalent to four spacings from a sample of $4n - 1$ Uniform$(0,1)$ random variables. In [63,64], it was noticed that, in addition to being four times more expensive to implement, the choice of four spacings depends on the functional of interest and is not universal.
Remark 3.
It is noteworthy that choosing the bootstrap weights $\xi_{ni}$ properly can yield a smaller limiting variance, that is, a $c^2$ smaller than 1. Typical examples are the multivariate hypergeometric bootstrap ([38], Example 3.4) and the subsample bootstrap ([65], Remark 2.2-(3)). A thorough treatment of the weight selection is undoubtedly outside the scope of the current work; for a review, we refer the reader to [66].
Remark 4.
In the present paper, we considered a renewal type of bootstrap for atomic Markov chains under minimal moment conditions on the renewal times. The atomicity assumption can be dropped by mimicking the ideas of [50,51], introducing an artificial atom and deriving a bootstrap procedure that applies to nonatomic Markov chains. Precisely, in the case of a general irreducible chain $X$ with a transition kernel $\Pi(x, dy)$ satisfying a minorization condition:
$$\forall x \in S, \quad \Pi(x, dy) \ge \delta\,\psi(dy),$$
for an accessible measurable set $S$, a probability measure $\psi$ and $\delta \in (0,1)$ (note that such a minorization condition always holds for $\Pi$ or one of its iterates when the chain is irreducible), an atomic extension $(X, Y)$ of the chain may be explicitly constructed by the Nummelin splitting technique (see [49]) from the parameters $(S, \delta, \psi)$ and the transition probability $\Pi$; see, for instance, [47,67]. From a practical viewpoint, the size of the first block may be large compared to the size $n$ of the whole trajectory, for instance when the expected return time to the (pseudo-)atom, starting from the initial probability distribution, is large. The effective sample size for constructing the data blocks and the corresponding statistic is then dramatically reduced. However, in [68], some simulations were given, together with examples including content-dependent storage systems and general AR models, supporting the method discussed in this work.

5. Applications

Example 1
(Symmetry test). This example gives an application of the bootstrapped U-statistics, inspired by the goodness-of-fit tests in [69], where the symmetry test for the distribution of $X_t$ was considered. Let $\{X_t\}_{t\in\mathbb{N}}$ be a stationary mixing process with Lebesgue density $f_X(\cdot)$. We test the hypotheses:
$$H_0 : f_X(u) = f_X(-u)\ \text{almost everywhere} \qquad \text{vs.} \qquad H_1 : f_X(u) \ne f_X(-u)\ \text{on a set of positive measure}.$$
The estimator of $f_X(u)$ is:
$$\hat f_X(u) = \frac{1}{n h_n}\sum_{i=1}^n K\Big(\frac{u - X_i}{h_n}\Big),$$
where $K(\cdot)$ is a kernel function and $h_n > 0$ is a smoothing parameter, or bandwidth. An appropriate estimator of the integrated squared difference
$$I = \int_{\mathbb{R}}\big(f_X(u) - f_X(-u)\big)^2\, du$$
represents the symmetry test. According to [69], $I$ can be estimated by
$$\hat I_n := \frac{4}{n^2 h_n}\sum_{1\le i<j\le n}\Phi_n(X_i, X_j),$$
where $\Phi_n(X_i, X_j) = K(X_i, X_j) - K(X_i, -X_j)$, with $K(X_i, Y_j) = K\big(\frac{X_i - Y_j}{h_n}\big)$ for $Y_j \in \{X_j, -X_j\}$. Clearly, $\hat I_n$ is a degenerate U-statistic with a kernel varying with the sample size $n$. Thus, the bootstrapped test statistic,
$$\hat I_n^* := \frac{4}{n^2 h_n}\sum_{1\le i<j\le n}\Phi_n(X_i^*, X_j^*),$$
can be shown to have the same limit as $\hat I_n$.
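The statistic $\hat I_n$ is easy to compute directly; the following Python sketch (our illustration, with a Gaussian kernel and an arbitrary bandwidth, both assumptions made for the example) evaluates it:

```python
# A minimal sketch: the symmetry test statistic I_hat_n of Example 1.
import numpy as np

def gauss(u):
    return np.exp(-0.5 * u * u) / np.sqrt(2.0 * np.pi)

def symmetry_stat(x, h):
    n = len(x)
    i, j = np.triu_indices(n, k=1)                  # all pairs i < j
    # Phi_n(X_i, X_j) = K((X_i - X_j)/h) - K((X_i + X_j)/h)
    phi = gauss((x[i] - x[j]) / h) - gauss((x[i] + x[j]) / h)
    return 4.0 / (n * n * h) * phi.sum()

rng = np.random.default_rng(7)
print(symmetry_stat(rng.normal(size=400), h=0.5))   # symmetric law: near zero
```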
Example 2
(Kendall's tau). The covariance matrix quantifies the linear dependence in a random vector; rank correlation is another measure, capturing nonlinear dependence. Two generic vectors $y = (y_1, y_2)$ and $z = (z_1, z_2)$ in $\mathbb{R}^2$ are said to be concordant if $(y_1 - z_1)(y_2 - z_2) > 0$. For $m, k = 1,\dots,p$, define
$$\tau_{mk} = \frac{1}{n(n-1)}\sum_{1\le i\ne j\le n}\mathbb{1}\big\{(X_{im} - X_{jm})(X_{ik} - X_{jk}) > 0\big\}.$$
Then, Kendall's tau rank correlation coefficient matrix $T = (\tau_{mk})_{m,k=1}^{p}$ is a matrix-valued U-statistic with a bounded kernel. It is clear that $\tau_{mk}$ quantifies the monotonic dependence between $(X_{1m}, X_{1k})$ and $(X_{2m}, X_{2k})$, and it is an unbiased estimator of
$$\mathbb{P}\big((X_{1m} - X_{2m})(X_{1k} - X_{2k}) > 0\big),$$
that is, the probability that $(X_{1m}, X_{1k})$ and $(X_{2m}, X_{2k})$ are concordant.
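The matrix $T$ can be computed by direct enumeration of index pairs (our illustration):

```python
# A minimal sketch: Kendall's tau concordance matrix T = (tau_mk) of Example 2.
import numpy as np

def kendall_tau_matrix(X):
    n, p = X.shape
    i, j = np.nonzero(~np.eye(n, dtype=bool))       # all ordered pairs i != j
    D = X[i] - X[j]                                 # pairwise differences
    # T[m, k] = average of 1{(X_im - X_jm)(X_ik - X_jk) > 0} over the pairs
    return (D[:, :, None] * D[:, None, :] > 0).mean(axis=0)

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 3))
print(kendall_tau_matrix(X).round(2))
```

For independent continuous coordinates, the off-diagonal entries concentrate around $1/2$, the concordance probability under independence.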
Example 3
(Test of independence). In [2], the author introduced the parameter
$$\Delta = \int D^2(y_1, y_2)\, dF(y_1, y_2),$$
where $D(y_1, y_2) = F(y_1, y_2) - F(y_1, \infty)F(\infty, y_2)$ and $F(\cdot,\cdot)$ is the joint distribution function of $Y_1$ and $Y_2$. The parameter $\Delta$ has the property that $\Delta = 0$ if and only if $Y_1$ and $Y_2$ are independent. From [8], an alternative expression for $\Delta$ can be developed by introducing the functions
$$\psi(y_1, y_2, y_3) = \begin{cases} 1 & \text{if } y_2 \le y_1 < y_3, \\ 0 & \text{if } y_1 < y_2, y_3 \ \text{or}\ y_1 \ge y_2, y_3, \\ -1 & \text{if } y_3 \le y_1 < y_2, \end{cases}$$
and
$$\varphi\big((y_{1,1}, y_{1,2}),\dots,(y_{5,1}, y_{5,2})\big) = \frac{1}{4}\,\psi\big(y_{1,1}, y_{2,1}, y_{3,1}\big)\,\psi\big(y_{1,1}, y_{4,1}, y_{5,1}\big)\,\psi\big(y_{1,2}, y_{2,2}, y_{3,2}\big)\,\psi\big(y_{1,2}, y_{4,2}, y_{5,2}\big).$$
We have
$$\Delta = \int\cdots\int \varphi\big((y_{1,1}, y_{1,2}),\dots,(y_{5,1}, y_{5,2})\big)\, dF(y_{1,1}, y_{1,2})\cdots dF(y_{5,1}, y_{5,2}).$$
The corresponding U-statistic may be used to test independence.

6. Conclusions

The present paper was concerned with the randomly weighted bootstrap of the U-process in a Markov framework. A large number of bootstrap resampling schemes emerge as special cases of our setting, in particular the multinomial bootstrap, which is the best-known bootstrap scheme, introduced by [26]. One of the main tools was the approximation of the Markov U-process by the corresponding regenerative one. We mimicked this result in Proposition 2, in order to approximate the weighted-bootstrap U-process $U_n^*$ by the regenerative weighted-bootstrap U-process $R_{l_n}^*$. Other technical arguments were given in Lemma 1, extended from the work of [47]. These intricate tools were used to reach the full independence of the regenerative block variables by proving that a deterministic quantity could substitute for the random number of blocks, which was the main obstacle to extending the bootstrap results to the Markov framework. After a lengthy proof to arrive at independence, we used the results of [46]. All the above steps led us to prove the weak convergence of the regenerative-block weighted-bootstrap U-process, which implies the weak convergence of the weighted-bootstrap U-process. It would be of interest to consider the extension of the paper to the semi-Markov setting. A more delicate problem is the setting of incomplete data, such as censored or missing data; to the best of our knowledge, this problem has not been considered even for the original sample (without bootstrap) in the Markov framework. It would also be interesting to extend our work to the case of locally stationary processes, which requires nontrivial mathematics and would go well beyond the scope of the present paper.

7. Mathematical Development

This section is devoted to the proof of our results. The previously defined notations continue to be used in what follows.
Proof of Proposition 2.
We have
$$
\begin{aligned}
U_n^*(h) - \binom{l_n-1}{m}\binom{n}{m}^{-1} R_{l_n}^*(h)
&= \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m}\xi_{i_1,n}\cdots\xi_{i_m,n}\, h(X_{i_1},\dots,X_{i_m}) \\
&\quad - \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m}) \\
&\quad + \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m} h(X_{i_1},\dots,X_{i_m}) - \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m} h(X_{i_1},\dots,X_{i_m}) \\
&\quad + \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}\big(\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}-1\big)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m}) \\
&\quad - \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}\big(\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}-1\big)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m}) \\
&= \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m}\big(\xi_{i_1,n}\cdots\xi_{i_m,n}-1\big)\, h(X_{i_1},\dots,X_{i_m}) \\
&\quad - \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}\big(\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}-1\big)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m}) \\
&\quad + \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}\omega_{\tilde h}(B_{i_1},\dots,B_{i_m}) - \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m} h(X_{i_1},\dots,X_{i_m}).
\end{aligned}
$$
Given $J \subset \{1,\dots,m\}$ ($J = \emptyset$ is not excluded) and $\mathbf{i} = (i_1,\dots,i_m) \in \{1,\dots,n\}^m$, we set $\mathbf{i}_J$ to be the point of $\{1,\dots,n\}^{|J|}$ obtained from $\mathbf{i}$ by deleting the coordinates in the places not in $J$ (e.g., if $\mathbf{i} = (3,4,2,1)$, then $\mathbf{i}_{\{1,3\}} = (3,2)$). Furthermore, $\sum_{\mathbf{i}_J}$ indicates the sum over $1 \le i_j \le n$, $j \in J$; for instance, if $m = 4$ and $J = \{1,3\}$, then
$$\sum_{\mathbf{i}_J} h_{\mathbf{i}} = \sum_{\mathbf{i}_{\{1,3\}}} h_{i_1,i_2,i_3,i_4} = \sum_{1\le i_1, i_3\le n} h_{i_1,i_2,i_3,i_4}\big(X_{i_1}^{(1)},\dots,X_{i_4}^{(4)}\big).$$
By convention, $\sum_{\mathbf{i}_\emptyset} a = a$. Notice that, using the exchangeability of the weights, the constraint $\sum_i \xi_{i,n} = n$ and the independence of the weights from the data,
$$\mathbb{E}\Big(\binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m}\big(\xi_{i_1,n}\cdots\xi_{i_m,n}-1\big)\, h(X_{i_1},\dots,X_{i_m})\Big) = \binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_n^m}\mathbb{E}\big(\xi_{i_1,n}\cdots\xi_{i_m,n}-1\big)\,\mathbb{E}\, h(X_{i_1},\dots,X_{i_m}) = 0.$$
In a similar way, we have
$$\mathbb{E}\Big(\binom{n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}\big(\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}-1\big)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Big) = 0.$$
Making use of Proposition 1 and the law of large numbers, we infer that
$$U_n^*(h) - \binom{l_n-1}{m}\binom{n}{m}^{-1} R_{l_n}^*(h) \to 0, \quad a.s.$$
Hence, the proof is completed. □
Proof of Lemma 1.
The proofs of parts (i) and (iii) follow from ([47], Lemma 3.1 and Lemma 3.2). In order to prove (ii), we need to show that, for every $\epsilon > 0$,
$$\mathbb{P}_\nu\times P_{\xi|X}\Big(\Big|\frac{n^*}{n} - 1\Big| > \epsilon\Big) \to 0,$$
which follows if, conditionally on the sample,
$$P_{\xi|X}\Big(\Big|\frac{n^*}{n} - 1\Big| > \epsilon\Big) \to 0.$$
We have:
$$\frac{n^*}{n} - 1 = \frac{1}{n}\sum_{i=1}^{l_n}\xi_i\tau_i - 1 = \frac{l_n}{n}\cdot\frac{1}{l_n}\sum_{i=1}^{l_n}\big(\xi_i\tau_i + \xi_i\,\mathbb{E}^*(\tau) - \xi_i\,\mathbb{E}^*(\tau)\big) - \frac{n}{n} = \frac{l_n}{n}\cdot\frac{1}{l_n}\sum_{i=1}^{l_n}\xi_i\big(\tau_i - \mathbb{E}^*(\tau)\big) + \frac{l_n}{n}\Big(\frac{1}{l_n}\sum_{i=1}^{l_n}\xi_i\,\mathbb{E}^*(\tau) - \frac{n}{l_n}\Big) =: I + II.$$
We denote by $\mathbb{E}^*$ the expectation conditionally on $X_1,\dots,X_n$. By the fact that the $\tau_i$ are i.i.d. and using Chebyshev's inequality, we have:
$$P_{\xi|X}\big(|I| > \epsilon\big) \le \epsilon^{-2}\Big(\frac{l_n}{n}\Big)^2\frac{1}{l_n}\,\mathbb{E}_{\xi|X}\Big(\xi_{1,l_n}\big(\tau_1 - \mathbb{E}^*(\tau)\big)\Big)^2 \le \frac{2}{\epsilon^2}\,\frac{l_n}{n}\cdot\frac{1}{n}\,\mathbb{E}\big(\xi_{1,l_n}^2\big)\,\frac{1}{l_n}\sum_{i=1}^{l_n}\tau_i^2 \to 0 \quad \text{in probability}.$$
The last convergence follows using (i), which implies that $l_n/n \to p$, and (iii), whereby
$$\frac{1}{l_n}\sum_{i=1}^{l_n}\tau_i^2 \to \mathbb{E}(\tau^2),$$
since $\mathbb{E}(\xi_{1,l_n}^2) < \infty$. For $II$, we have:
$$II = \frac{l_n}{n}\Big(\frac{1}{l_n}\sum_{i=1}^{l_n}\xi_i\,\mathbb{E}^*(\tau) - \frac{n}{l_n}\Big) \overset{\text{(A1)}}{=} \frac{l_n}{n}\Big(\mathbb{E}^*(\tau) - \frac{n}{l_n}\Big) = \frac{l_n}{n}\Big(\frac{1}{l_n}\sum_{i=1}^{l_n}\tau_i - \mathbb{E}(\tau)\Big) + \frac{l_n}{n}\Big(\mathbb{E}(\tau) - \frac{n}{l_n}\Big).$$
The last expression converges to zero by the fact that $n/l_n \to \alpha = \mathbb{E}(\tau)$ and, by (iii),
$$\frac{1}{l_n}\sum_{i=1}^{l_n}\tau_i - \mathbb{E}(\tau) \to 0.$$
This proves Lemma 1. □
Proof of Theorem 3.
For the weak convergence, we need to show the finite-dimensional convergence and the asymptotic equicontinuity. According to Proposition 2 and [6], the finite-dimensional convergence holds if, for every fixed finite collection of functions $\{f_1,\dots,f_k\} \subset \mathcal{F}$,
$$\Big(\binom{n}{m}^{1/2} R_{l_n}^*(f_1),\dots,\binom{n}{m}^{1/2} R_{l_n}^*(f_k)\Big) \rightsquigarrow \big(K_P(f_1),\dots,K_P(f_k)\big),$$
where $K_P$ is the Gaussian chaos process. By the Cramér–Wold device and the countability of $\mathcal{F}$, we only need to show that, for any $f \in L_2^{c,m}(P)$,
$$\sup_{\psi\in\mathrm{BL}}\Big|\mathbb{E}\big[\psi\big(\tbinom{n}{m}^{1/2} R_{l_n}^*(f)\big)\,\big|\,\{B_i\}\big] - \mathbb{E}\,\psi\big(c\cdot K_P(f)\big)\Big| \to 0 \quad a.s.$$
By ([6], Section 4.2) and ([29], Section 2A), any $f \in L_2^{c,m}(P)$ can be expanded in $L_2(P^m)$ as $f = \sum_{q=1}^{\infty} c_q h_m^{\psi_q}$, where $\{c_q\}$ is a sequence of real numbers and
$$h_m^{\psi_q}(x_1,\dots,x_m) \equiv \psi_q(x_1)\cdots\psi_q(x_m)$$
for some bounded $\psi_q \in L_2^{c,1}(P)$. Fix $\epsilon > 0$. Then, there exists $Q_\epsilon \in \mathbb{N}$ such that, with $f_\epsilon \equiv \sum_{q=1}^{Q_\epsilon} c_q h_m^{\psi_q}$,
$$\|f - f_\epsilon\|_{L_2(P^m)} \le \epsilon.$$
The left-hand side of (25) can then be bounded by
$$
\begin{aligned}
&\sup_{\psi\in\mathrm{BL}}\Big|\mathbb{E}\big[\psi\big(\tbinom{n}{m}^{1/2} R_{l_n}^*(f)\big)\,\big|\,\{B_i\}\big] - \mathbb{E}\,\psi\big(c\cdot K_P(f)\big)\Big| \\
&\quad\le \sup_{\psi\in\mathrm{BL}}\Big|\mathbb{E}\big[\psi\big(\tbinom{n}{m}^{1/2} R_{l_n}^*(f)\big)\,\big|\,\{B_i\}\big] - \mathbb{E}\big[\psi\big(\tbinom{n}{m}^{1/2} R_{l_n}^*(f_\epsilon)\big)\,\big|\,\{B_i\}\big]\Big| \\
&\qquad+ \sup_{\psi\in\mathrm{BL}}\Big|\mathbb{E}\big[\psi\big(\tbinom{n}{m}^{1/2} R_{l_n}^*(f_\epsilon)\big)\,\big|\,\{B_i\}\big] - \mathbb{E}\,\psi\big(c\cdot K_P(f_\epsilon)\big)\Big| \\
&\qquad+ \sup_{\psi\in\mathrm{BL}}\Big|\mathbb{E}\,\psi\big(c\cdot K_P(f_\epsilon)\big) - \mathbb{E}\,\psi\big(c\cdot K_P(f)\big)\Big| \\
&\quad\equiv (I) + (II) + (III).
\end{aligned}
$$
Let $\bar f_\epsilon \equiv f - f_\epsilon$; noting that $\psi$ is bounded by one and using Lemma 1, we can replace $l_n$ by $\varphi(n) = \lfloor n/\mathbb{E}_A(\tau)\rfloor$, which is deterministic. In the following, we denote by $\pi$ a random permutation uniformly distributed over $\Sigma(n)$, the set of all permutations of $\{1,\dots,n\}$. We have
$$
\begin{aligned}
(I) &\le 2\,\mathbb{E}^*\Big|\tbinom{n}{m}^{1/2} R_{l_n}^*(\bar f_\epsilon)\Big| \le 2\,\mathbb{E}_{\xi|X}\,\mathbb{E}_R\Big|\frac{1}{n^{m/2}}\sum_{1\le i_1\ne\cdots\ne i_m\le\varphi(n)}\big(\xi_{\pi_{i_1}}-1\big)\cdots\big(\xi_{\pi_{i_m}}-1\big)\,\bar f_\epsilon\big(B_{i_1},\dots,B_{i_m}\big)\Big| \\
&\lesssim \sum_{\substack{\alpha_i\in\{1,2\}:\ \sum_{i=1}^{l}\alpha_i = 2m,\\ \alpha_1\le\cdots\le\alpha_l,\ 1\le l\le m}}\mathbb{E}\Big[\Big(1\vee\frac{1}{n}\sum_{i=1}^n(\xi_i-1)^2\Big)^m\Big(\frac{1}{n^m}\sum_{\substack{i_1\ne\cdots\ne i_m,\ i_1'\ne\cdots\ne i_m',\\ i_j=i_j',\ 1\le j\le\max\{j:\alpha_j=2\}}}\bar f_\epsilon\big(B_{i_1},\dots,B_{i_m}\big)\,\bar f_\epsilon\big(B_{i_1'},\dots,B_{i_m'}\big)\Big)^{1/2}\Big].
\end{aligned}
$$
We have, according to [43], for $(\xi_1,\dots,\xi_n)$ a non-negative sequence of variables such that $\sum_{i=1}^n \xi_i = n$ and for $\pi = (\pi_1,\dots,\pi_n)$ a random permutation of $\{1,\dots,n\}$, for any $l \in \mathbb{N}$ and $\alpha = (\alpha_1,\dots,\alpha_l) \in \mathbb{N}^l$:
$$\mathbb{E}_\pi\Big|\prod_{i=1}^{l}\big(\xi_{\pi_i}-1\big)^{\alpha_i}\Big| \le C_{l,\alpha}\Big(\frac{1}{n}\sum_{i=1}^{l_n}(\xi_i-1)^2\Big)^{\sum_i \alpha_i/2}.$$
Furthermore, according to [70,71], we have:
$$
\begin{aligned}
\frac{1}{n^m}\sum_{\substack{i_1\ne\cdots\ne i_m,\ i_1'\ne\cdots\ne i_m',\\ i_j=i_j',\ 1\le j\le\max\{j:\alpha_j=2\}}}\bar f_\epsilon\big(B_{i_1},\dots,B_{i_m}\big)\,\bar f_\epsilon\big(B_{i_1'},\dots,B_{i_m'}\big)
&\xrightarrow{a.s.} \mathbb{E}_A(\tau)\,\mathbb{E}\big(\bar f_\epsilon(B_1,\dots,B_m)\,\bar f_\epsilon(B_1',\dots,B_m')\big) \\
&\qquad\big(\text{where } B_j = B_j' \text{ for } 1\le j\le\max\{j:\alpha_j=2\} \text{ and } l_n/n \to \mathbb{E}_A(\tau)^{-1}\big) \\
&\le \mathbb{E}_A(\tau)\, P^m\bar f_\epsilon^2 \lesssim \epsilon^2 \quad\big(\text{under Conditions (C.1) and (C.3)}\big).
\end{aligned}
$$
Hence, we have
$$\limsup_n\, (I) \lesssim_{m,\xi} \epsilon, \quad a.s.$$
Now, for the second term, we have:
$$
\begin{aligned}
\binom{n}{m}^{1/2} R_{l_n}^*(f_\epsilon) &= \frac{1}{n^{m/2}}\sum_{q=1}^{Q_\epsilon} c_q\sum_{1\le i_1<\cdots<i_m\le\varphi(n)}\big(\xi_{\pi_{i_1}}-1\big)\cdots\big(\xi_{\pi_{i_m}}-1\big)\,\psi_q(B_{i_1})\cdots\psi_q(B_{i_m})\,(1+o(1)) \\
&= \frac{\varphi(n)^{m/2}}{n^{m/2}}\sum_{q=1}^{Q_\epsilon} c_q\, R_m\Big(\frac{1}{\varphi(n)^{1/2}}\sum_{i=1}^{\varphi(n)}(\xi_{\pi_i}-1)\,\psi_q(B_i),\dots,\frac{1}{\varphi(n)^{m/2}}\sum_{i=1}^{\varphi(n)}(\xi_{\pi_i}-1)^m\,\psi_q^m(B_i)\Big)(1+o(1)) \\
&= (1+o(1))\,(m!)^{1/2}\,\mathbb{E}_A(\tau)^{-m/2}\sum_{q=1}^{Q_\epsilon} c_q\, R_m\big(A_{\varphi(n),q}^{(1)},\dots,A_{\varphi(n),q}^{(m)}\big),
\end{aligned}
$$
where $R_m$ is the polynomial of degree $m$ (see [6], p. 175):
$$\sum_{1\le i_1<\cdots<i_m\le\varphi(n)} t_{i_1}\cdots t_{i_m} = R_m\Big(\sum_{i=1}^{\varphi(n)} t_i,\ \sum_{i=1}^{\varphi(n)} t_i^2,\ \dots,\ \sum_{i=1}^{\varphi(n)} t_i^m\Big).$$
As mentioned before, this polynomial follows from Newton's identity and allows us to express such a sum of monomials through the power sums. All we need now is to examine each argument of this polynomial. □
For $\ell = 1$: We first recall the following lemma from [53].
Lemma 3
([53]). Let $(a_1,\dots,a_n)$ be a vector of constants and $(\xi_1,\dots,\xi_n)$ a vector of exchangeable random variables. Suppose that
$$\bar a_n = \frac{1}{n}\sum_{i=1}^n a_i = 0, \qquad \frac{1}{n}\sum_{i=1}^n a_i^2 \to \sigma^2, \qquad \lim_{M\to\infty}\limsup_n \frac{1}{n}\sum_{i=1}^n a_i^2\,\mathbb{1}\{|a_i| > M\} = 0,$$
and
$$\bar\xi_n = \frac{1}{n}\sum_{i=1}^n \xi_i = 0, \qquad \frac{1}{n}\sum_{i=1}^n \xi_i^2 \xrightarrow{P_\xi} \alpha^2, \qquad \frac{1}{n}\max_{1\le i\le n}\xi_i^2 \xrightarrow{P_\xi} 0.$$
Then,
$$\frac{1}{\sqrt n}\sum_{i=1}^n a_i\,\xi_i \rightsquigarrow N\big(0,\ \sigma^2\alpha^2\big).$$
Applying Lemma 3 with $a_i \equiv \psi_q(B_i) - P_n\psi_q$ and $\xi_i$ replaced by $\xi_{R_i} - 1$, we can see that
$$A_{\varphi(n),q}^{(1)} \rightsquigarrow c\cdot G_P(\psi_q), \quad a.s.,$$
where $G_P$ is a Gaussian process defined on $L_2^{c,1}(P)$ with covariance
$$\mathbb{E}\big(G_P(f)\,G_P(g)\big) = P(fg), \quad \text{for } f, g \in L_2^{c,1}(P).$$
For $\ell = 2$: Note that
$$\mathbb{E}_{\pi,\xi}^*\big(A_{\varphi(n),q}^{(2)}\big) = \frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}(\xi_i-1)^2\cdot\frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}\psi_q^2(B_i) \xrightarrow{\ \mathbb{P}_{\nu,\xi}\ } c^2\,\mathbb{E}\,\psi_q^2(B_1) = c^2\,\mathbb{E}_A\Big(\sum_{i=T_0+1}^{T_1} h_1(X_i)\Big)^2, \quad a.s.$$
Furthermore,
$$
\begin{aligned}
\mathrm{Var}^{*,\xi}\big(A_{\varphi(n),q}^{(2)}\big) &= \mathbb{E}^{*,\xi}\big(A_{\varphi(n),q}^{(2)}\big)^2 - \Big(\mathbb{E}_\pi^{*,\xi}\, A_{\varphi(n),q}^{(2)}\Big)^2 \\
&= \mathbb{E}_\pi^{*,\xi}\Big(\frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}(\xi_i-1)^2\,\psi_q^2(B_{\pi_i})\Big)^2 - \Big(\frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}(\xi_i-1)^2\, P_n\psi_q^2\Big)^2 \\
&= \frac{1}{\varphi(n)^2}\sum_{i,j}(\xi_i-1)^2(\xi_j-1)^2\Big(\mathbb{E}_\pi^*\big[\psi_q^2(B_{\pi_i})\,\psi_q^2(B_{\pi_j})\big] - \big(P_n\psi_q^2\big)^2\Big) \\
&= \frac{1}{\varphi(n)^2}\sum_{i}(\xi_i-1)^4\Big(\mathbb{E}_\pi^*\big[\psi_q^4(B_{\pi_i})\big] - \big(P_n\psi_q^2\big)^2\Big) + \frac{1}{\varphi(n)^2}\sum_{i\ne j}(\xi_i-1)^2(\xi_j-1)^2\Big(\mathbb{E}_\pi^*\big[\psi_q^2(B_{\pi_i})\,\psi_q^2(B_{\pi_j})\big] - \big(P_n\psi_q^2\big)^2\Big) \\
&\le \frac{1}{\varphi(n)^2}\sum_{i}(\xi_i-1)^4\cdot P_n\psi_q^4 + \frac{1}{\varphi(n)^2}\Big(\sum_{i}(\xi_i-1)^2\Big)^2\cdot\frac{1}{\varphi(n)-1}\,P_n\psi_q^4 \\
&\le C\,\frac{1}{\varphi(n)^2}\sum_{i=1}^n(\xi_i-1)^4\cdot P_n\psi_q^4 \le C\,\|\psi_q\|_\infty^4\,\frac{\max_i(\xi_i-1)^2}{\varphi(n)}\cdot\frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}(\xi_i-1)^2 \xrightarrow{P_\xi} 0, \quad a.s.
\end{aligned}
$$
The first inequality in the display above follows, since
$$\mathbb{E}_\pi^*\big[\psi_q^2(B_{\pi_i})\,\psi_q^2(B_{\pi_j})\big] - \big(P_n\psi_q^2\big)^2 = \frac{1}{\varphi(n)(\varphi(n)-1)}\sum_{i\ne j}\psi_q^2(B_i)\,\psi_q^2(B_j) - \big(P_n\psi_q^2\big)^2 \le \frac{1}{\varphi(n)-1}\big(P_n\psi_q^2\big)^2 \le \frac{1}{\varphi(n)-1}\,P_n\psi_q^4.$$
This shows that
$$A_{\varphi(n),q}^{(2)} \xrightarrow{P_\xi} c^2\,\mathbb{E}\,\psi_q^2, \quad a.s.$$
For $\ell \ge 3$:
$$\mathbb{E}_\pi^{*,\xi}\big|A_{\varphi(n),q}^{(\ell)}\big| \le \frac{1}{\varphi(n)^{\ell/2}}\sum_{i=1}^{\varphi(n)}|\xi_i-1|^{\ell}\cdot\frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}\big|\psi_q(B_i)\big|^{\ell} \le \Big(\frac{\max_i|\xi_i-1|^2}{\varphi(n)}\Big)^{\frac{\ell-2}{2}}\cdot\frac{1}{\varphi(n)}\sum_{i=1}^{\varphi(n)}|\xi_i-1|^2\cdot\|\psi_q\|_\infty^{\ell} \xrightarrow{P_\xi} 0, \quad a.s.$$
This shows that
$$A_{\varphi(n),q}^{(\ell)} \xrightarrow{P_\xi} 0, \quad a.s.$$
Then, we have
$$R_m\big(A_{\varphi(n),q}^{(1)},\dots,A_{\varphi(n),q}^{(m)}\big) \rightsquigarrow R_m\big(G_P(c\,\psi_q),\ \mathbb{E}(c\,\psi_q)^2,\ 0,\dots,0\big) = c\,(m!)^{-1/2}\cdot K_P(\psi_q), \quad a.s.,$$
where $K_P$ is the Gaussian chaos process defined on ($\oplus$ is the orthogonal sum in $L_2(E,\mathcal{E},P)$)
$$\mathbb{R}\oplus L_2^{c,\mathbb{N}}(P) \equiv \mathbb{R}\oplus\bigoplus_{m=1}^{\infty} L_2^{c,m}(P).$$
Hence, it follows, by the linearity of $K_P$, that
$$\binom{n}{m}^{1/2} R_{l_n}^*(f_\epsilon) \rightsquigarrow c\cdot K_P(f_\epsilon), \quad a.s.$$
The last term in (26) follows from the definition of $K_P$:
$$(III) \le c\,\sqrt{\mathbb{E}\,K_P^2(\bar f_\epsilon)} \to 0 \quad (\epsilon \to 0).$$
All these results together give the finite-dimensional convergence.
We now take a step-by-step approach to establish the stochastic equicontinuity. The class of functions is assumed to be bounded: we suppose that $h \le H$, for $H$ an envelope. Throughout the following, we denote by
$$\mathcal{F}_\delta := \big\{f, g \in \mathcal{F} : d(f,g) \le \delta\big\}.$$
  • Step 1
Let
$$Z_n^* := \binom{n^*}{m}^{1/2}\big(U_n^*(h) - \mathbb{E}^*(U_n^*(h))\big),$$
and
$$\breve T_{l_n}^* := \binom{n}{m}^{1/2}\binom{l_n}{m}^{-1}\big(R_{l_n}^* - \mathbb{E}^*(R_{l_n}^*)\big).$$
In this step, we must prove that the stochastic equicontinuity of the U-process implies that of the regenerative U-process. This is a consequence of Proposition 1 and, for the weighted bootstrap, of Proposition 2 and part (ii) of Lemma 1.
  • Step 2
Define
$$\breve T_{l_n}^* := \binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I_{l_n}^m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(B_{i_1},\dots,B_{i_m})$$
and
$$\widetilde T_{l_n}^* = \binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I_{E(l_n)}^m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m}).$$
Hypothesis: the stochastic equicontinuity of $\breve T_{l_n}^*$ implies the stochastic equicontinuity of $\widetilde T_{l_n}^*$.
Proof. 
In order to prove the previous implication, we only need to show that:
$$\mathbb{P}^*\Bigl(\bigl\|\tilde T^*_{l_n} - \breve T^*_{l_n}\bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr) \le \mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr),$$
where the index set $I$ collects the $m$-tuples with at least one coordinate between $l_n$ and $E(l_n)$; it is defined precisely below.
Suppose that $l_n \le E(l_n)$; the opposite case can be treated in a similar way. We have
$$\mathbb{P}^*\Bigl(\bigl\|\tilde T^*_{l_n} - \breve T^*_{l_n}\bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr) = \mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(B_{i_1},\dots,B_{i_m}) - \binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{l_n}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr).$$
Define $I := \bigl\{(i_1,\dots,i_m):\ 1\le i_1<\dots<i_m\le E(l_n),\ i_j\neq i_k\ \text{for } j\neq k,\ \text{such that } \exists\,\ell\in\{1,\dots,m\}:\ l_n \le i_\ell \le E(l_n)\bigr\}$. Then, the last probability equals
$$\mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr)$$
$$= \mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon,\ \bigl|E(l_n)-l_n\bigr| \le n/4\Bigr) + \mathbb{P}^*\Bigl(\bigl|E(l_n)-l_n\bigr| > n/4\Bigr).$$
However, $|E(l_n)-l_n| = O_{\mathbb{P}}(\sqrt{n})$ by Lemma 1, part (i). Hence, for every $\epsilon>0$ and all $n$ large enough,
$$\mathbb{P}^*\bigl(|E(l_n)-l_n| > n/4\bigr) < \epsilon,$$
and the first term in the previous display is bounded by:
$$\mathbb{P}^*\Bigl(\max_{M\le n/2+E(l_n)}\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr),$$
where $I' := \bigl\{(i_1,\dots,i_m):\ 1\le i_1<\dots<i_m,\ \exists\,\ell\in\{1,\dots,m\}\ \text{with } E(l_n) < i_\ell \le M,\ i_j\neq i_k\ \text{for } j\neq k\bigr\}$. This, in turn, is at most
$$C_1\,\mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > C_2\,\epsilon\Bigr),$$
where $I'_m := \bigl\{(i_1,\dots,i_m):\ 1\le i_1<\dots<i_m,\ \exists\,\ell\in\{1,\dots,m\}\ \text{with } E(l_n) < i_\ell \le E(l_n)+n/2,\ i_j\neq i_k\ \text{for } j\neq k\bigr\}$.
The last bound follows from the Montgomery–Smith inequality. Since
$$E(l_n)/n \longrightarrow \alpha^{-1},$$
the last expression matches the stochastic equicontinuity condition for $\tilde T^*_{l_n}$. This proves the step. □
Before passing to the next step, we introduce a new bootstrap sample. Define $\hat B_i := \bigl(X_{T_{i-1}+1},\dots,X_{T_i}\bigr)$ for $i = 1,\dots,E(l_n)$, and apply the weighted bootstrap procedure to the sample $\{\hat B_i\}_{i=1}^{E(l_n)}$. This procedure is the same as the previous one for the $B_i$, but here we aim to replace the random quantity $l_n$ with the deterministic one, $E(l_n)$.
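To make this block construction concrete, the sketch below simulates a hypothetical positive recurrent chain on $\{0,1,2\}$ (chosen only for illustration), cuts its trajectory at returns to the atom $\{0\}$ into regenerative blocks, and then draws exchangeable multinomial weights for a deterministic number of block slots close to $E(l_n)$, in the spirit of replacing $l_n$ by $E(l_n)$.

```python
import numpy as np

rng = np.random.default_rng(3)

# A hypothetical positive recurrent chain on {0, 1, 2}; state 0 is the atom.
P = np.array([[0.5, 0.3, 0.2],
              [0.4, 0.4, 0.2],
              [0.3, 0.3, 0.4]])
n = 5000
X = np.zeros(n, dtype=int)
for t in range(1, n):
    X[t] = rng.choice(3, p=P[X[t - 1]])

# Regenerative blocks: segments between successive visits to the atom.
visits = np.flatnonzero(X == 0)
blocks = [X[visits[k] + 1: visits[k + 1] + 1] for k in range(len(visits) - 1)]
l_n = len(blocks)

# Deterministic surrogate for E(l_n): n * pi(0), with pi the stationary law
# obtained from the left-eigenvector problem pi P = pi for this toy chain.
w, v = np.linalg.eig(P.T)
pi = v[:, np.argmax(w.real)].real
pi /= pi.sum()
E_ln = int(n * pi[0])

# Exchangeable weights on the E_ln block slots (Efron bootstrap: multinomial).
xi = rng.multinomial(E_ln, np.ones(E_ln) / E_ln)
print("l_n =", l_n, " E(l_n) ~", E_ln, " sum of weights =", xi.sum())
```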
  • Step 3
Define:
$$\hat T^*_{l_n} = \binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(\hat B_{i_1},\dots,\hat B_{i_m}).$$
Hypothesis: The stochastic equicontinuity of $\tilde T^*_{l_n}$ implies the stochastic equicontinuity of $\hat T^*_{l_n}$.
Proof. 
First case: $l_n \le E(l_n)$.
In this case, all of the terms in the following computation should be multiplied by $\mathbb{1}(l_n \le E(l_n))$; we leave it out to keep the already heavy notation simple. Define
$$A_n := \{B_1,\dots,B_{l_n}\}, \qquad \check T^*_{l_n} := \hat T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n\bigr) + T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr).$$
$\check T^*_{l_n}$ is well defined, built from i.i.d. blocks, and has the same distribution as $\tilde T^*_{l_n}$, with $(i_1,\dots,i_m)\in I^m_{E(l_n)}$. Hence, if we show that
$$\lim_{\delta\to 0}\limsup_{n\to\infty}\mathbb{P}^*\bigl(\|\check T^*_{l_n}\|_{\mathcal{F}_\delta} > \epsilon\bigr) = 0 \quad\text{in probability},$$
then the stochastic equicontinuity of $\tilde T^*_{l_n}$ is established. However, we aim to approximate that of $\hat T^*_{l_n}$. To achieve this, it is sufficient to estimate:
$$\begin{aligned}
\bigl\|\check T^*_{l_n} - \hat T^*_{l_n}\bigr\|_{\mathcal{F}_\delta} &= \Bigl\|\hat T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n\bigr) + T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr) - \hat T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n\bigr) - \hat T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr)\Bigr\|_{\mathcal{F}_\delta}\\
&\le \bigl\|T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr)\bigr\|_{\mathcal{F}_\delta} + \bigl\|\hat T^*_{l_n}\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr)\bigr\|_{\mathcal{F}_\delta}\\
&:= I_n + II_n.
\end{aligned}$$
For $I_n$: Let
$$S_n^* := \sum_{i=1}^{E(l_n)}\mathbb{1}\bigl(\hat B_i^* \in A_n^c\bigr)$$
denote the number of bootstrapped blocks falling outside $A_n$; conditionally on the sample, we have:
$$\mathcal{L}\Bigl(\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(\hat B_{i_1},\dots,\hat B_{i_m})\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr)\Bigr) = \mathcal{L}\Bigl(\sum_{(i_1,\dots,i_m)\in I^m_{S_n^*}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr).$$
Hence,
$$\begin{aligned}
\mathbb{P}^*\bigl(I_n > \epsilon\bigr) &= \mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{S_n^*}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr)\\
&= \mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{S_n^*}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon,\ S_n^* \le K\sqrt{n}\Bigr) + \mathbb{P}^*\bigl(S_n^* > K\sqrt{n}\bigr)\\
&\le \mathbb{P}^*\Bigl(\max_{M\le K\sqrt{n}}\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr) + \mathbb{P}^*\bigl(S_n^* > K\sqrt{n}\bigr), \quad\text{for any } K>0,\\
&\qquad\text{with } I' := \bigl\{(i_1,\dots,i_m):\ 1\le i_1<\dots<i_m,\ \exists\,\ell\in\{1,\dots,m\}\ \text{with } S_n^* < i_\ell \le M,\ i_j\neq i_k\ \text{for } j\neq k\bigr\},\\
&\le C_1\,\mathbb{P}^*\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > C_2\,\epsilon\Bigr) + \mathbb{P}^*\bigl(S_n^* > K\sqrt{n}\bigr), \quad\text{for any } K>0,
\end{aligned}$$
where
$$I'_m := \bigl\{(i_1,\dots,i_m):\ 1\le i_1<\dots<i_m,\ \exists\,\ell\in\{1,\dots,m\}\ \text{with } S_n^* < i_\ell \le K\sqrt{n},\ i_j\neq i_k\ \text{for } j\neq k\bigr\}.$$
For $n$ large enough, we need to show that there exists $K>0$ such that
$$\mathbb{P}^*\bigl(S_n^* > K\sqrt{n}\bigr) \longrightarrow 0.$$
As the indicators $\mathbb{1}(\hat B_i^*\in A_n^c)$ are i.i.d. and bounded,
$$\frac{S_n^* - E(S_n^*)}{\sqrt{E(l_n)}} \rightsquigarrow N\bigl(0,\eta^2\bigr) \quad\text{in probability};$$
therefore, we can find $M>0$ such that
$$\mathbb{P}^*\bigl(S_n^* > E(S_n^*) + M\sqrt{n}\bigr) < \epsilon.$$
However,
$$E(S_n^*) = E(l_n)\,\mathbb{P}^*\bigl(\hat B_i^*\in A_n^c\bigr) = E(l_n) - l_n = O_{\mathbb{P}}(\sqrt{n}),$$
by Lemma 1 (i); then
$$\mathbb{P}^*\bigl(S_n^* > K\sqrt{n}\bigr) \longrightarrow 0.$$
Then, we only need to estimate the first part in (33). Define the following bootstrap procedure: let $\bar B_i := \bigl(X_{T_{i-1}+1},\dots,X_{T_i},0,0,\dots\bigr)$, and let $\bar{\mathcal{F}}$ be a class of functions, related to the class of functions $\mathcal{F}$, such that, for every $\omega_{\bar h}\in\bar{\mathcal{F}}$:
$$\omega_{\bar h}\bigl(\bar B_1,\bar B_2,\dots,\bar B_k\bigr) = \begin{cases}\displaystyle\sum_{i_1=1}^{\infty}\cdots\sum_{i_k=1}^{\infty} h\bigl(x_{i_1},\dots,x_{i_k}\bigr)\,\mathbb{1}\{x_{i_k}\neq 0\} & \text{if defined},\\[4pt] 0 & \text{otherwise}.\end{cases}$$
It is classical that the $\{\bar B_i\}$ are i.i.d.; we apply the same bootstrap method as in Algorithm 1. This new sample allows us to enlarge and bound (33) by
$$\mathbb{P}^*\Bigl(\sup_{h\in\mathcal{H}}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr| > \epsilon\Bigr),$$
where $h\in\mathcal{H} = \{\omega_f - \omega_g,\ f,g\in\mathcal{F}\}$ and the corresponding class $\bar{\mathcal{H}} = \{\omega_{\bar f} - \omega_{\bar g},\ f,g\in\mathcal{F}\}$, with envelopes $\tilde F$ and $F$, respectively. To estimate the last expression, we use bracketing. Define the bracket $[f^\ell, f^u]$ by:
$$[f^\ell, f^u] := \bigl\{f\in\mathcal{F}:\ f^\ell \le f \le f^u\bigr\},$$
and the bracketing entropy number by $N_1(\gamma,\mathcal{F},P)$, which denotes the minimal number $N_1$ for which there exist functions $f_1^\ell,\dots,f_N^\ell$ and $f_1^u,\dots,f_N^u$ such that:
$$\mathcal{F}\subset\bigcup_{k=1}^{N}\bigl[f_k^\ell, f_k^u\bigr], \qquad \bigl\|f_k^u - f_k^\ell\bigr\|_{P} \le \gamma.$$
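As a concrete illustration of these brackets (on a toy class, not the class $\mathcal{H}$ of the proof), the sketch below covers the indicators $\{\mathbb{1}\{x\le t\}:\ t\in\mathbb{R}\}$ with $\gamma$-brackets in $L_1(P)$ for $P = N(0,1)$ by placing cut points at the $\gamma$-quantiles, so the bracketing number grows like $1/\gamma$; all choices here are assumptions made for the example.

```python
import numpy as np
from scipy.stats import norm

def bracketing_number(gamma):
    """Count gamma-brackets in L1(N(0,1)) for the class {1{x <= t} : t real}.

    With cut points t_1 < ... < t_N at quantile levels gamma, 2*gamma, ...,
    each bracket [1{x <= t_{k-1}}, 1{x <= t_k}] has P-size
    P(t_{k-1} < X <= t_k) <= gamma, and the brackets cover the whole class.
    """
    levels = np.arange(gamma, 1.0, gamma)
    cuts = norm.ppf(levels)
    return len(cuts) + 1

for gamma in [0.5, 0.1, 0.01]:
    print(f"gamma={gamma:5.2f}  N_1(gamma) = {bracketing_number(gamma)}")
```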
For the class of functions $\mathcal{H}$, consider the bracket $[h^\ell, h^u]$ such that $\mathbb{E}^*(h^u - h^\ell) \le \gamma$, where $\gamma>0$ will be determined later. In this framework, the bracketing entropy number is $N^*(\gamma) := N_1(\gamma,\mathcal{H},D^*)$, for
$$D^* = \binom{l_n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I^m_{l_n}}\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}\,\delta_{(B_{i_1},\dots,B_{i_m})}.$$
Hence, we have the following inequalities
$$\begin{aligned}
\sup_{h\in\mathcal{H}}&\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr|\\
&\le \max_{k\le N^*(\gamma)}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\bigl(h_k^u - h_k^\ell\bigr)(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr|\\
&\le \max_{k\le N^*(\gamma)}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\Bigl(h_k^u(\bar B_{i_1},\dots,\bar B_{i_m}) - \mathbb{E}^* h_k^u(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr)\Bigr|\\
&\quad + \max_{k\le N^*(\gamma)}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\Bigl(h_k^\ell(\bar B_{i_1},\dots,\bar B_{i_m}) - \mathbb{E}^* h_k^\ell(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr)\Bigr|\\
&\quad + \max_{k\le N^*(\gamma)}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\Bigl(\mathbb{E}^* h_k^u(\bar B_{i_1},\dots,\bar B_{i_m}) - \mathbb{E}^* h_k^\ell(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr)\Bigr|\\
&:= I_A + I_B + I_C.
\end{aligned}$$
Treating each term, and keeping in mind Condition (A.1), i.e., $\sum_{i=1}^{n}\xi_i = n$, we have
$$I_C := \max_{1\le k\le N^*(\gamma)}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\Bigl(\mathbb{E}^* h_k^u - \mathbb{E}^* h_k^\ell\Bigr)\Bigr| \le \gamma\,\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\Bigr| = \gamma\,\binom{n}{m}^{-1/2}\sum_{i_\ell,\,\ell\in\{1,\dots,m-1\}}\ \sum_{j=1}^{m}\prod_{\substack{k=1\\ k\neq j}}^{m}\xi_{i_k,n}\sum_{i_j=1}^{n}\bigl(\xi_{i_j,n}-1\bigr) = 0,$$
and
$$\mathbb{P}\bigl(I_B > \epsilon\bigr) \le N^*(\gamma)\max_{k\le N^*(\gamma)}\mathbb{P}\Bigl(\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\bigl(h_k^\ell - \mathbb{E}^* h_k^\ell\bigr)(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr| > \epsilon\Bigr).$$
Writing $h_k^\ell = h_k^\ell\,\mathbb{1}\{h_k^\ell\le M_n\} + h_k^\ell\,\mathbb{1}\{h_k^\ell > M_n\}$, and applying Chebyshev's inequality to the truncated part and Markov's inequality to the tail part, the last quantity is bounded by
$$\begin{aligned}
&\frac{N^*(\gamma)}{\epsilon^2}\,\binom{n}{m}^{-1/2}\mathbb{E}_\xi\Bigl(\prod_{i=1}^{m}(\xi_{i,l_n}-1)^2\Bigr)\,\mathbb{E}^*\Bigl(h_k^\ell\,\mathbb{1}\{h_k^\ell\le M_n\} - \mathbb{E}^* h_k^\ell\,\mathbb{1}\{h_k^\ell\le M_n\}\Bigr)^{2} + \frac{N^*(\gamma)}{\epsilon}\,\mathbb{E}^*\Bigl|(\xi_{1,l_n}-1)\cdots(\xi_{m,l_n}-1)\,h_k^\ell\,\mathbb{1}\{h_k^\ell > M_n\}\Bigr|\\
&\quad\le \frac{N^*(\gamma)}{\epsilon^2}\,\binom{n}{m}^{-1/2} c^2\cdot 4M_n^2 + \frac{2N^*(\gamma)}{\epsilon}\,\mathbb{E}^*\Bigl|(\xi_{1,l_n}-1)\cdots(\xi_{m,l_n}-1)\,\tilde F\,\mathbb{1}\{\tilde F > M_n\}\Bigr|\\
&\quad\le \frac{N^*(\gamma)}{\epsilon^2}\,\binom{n}{m}^{-1/2} c^2\cdot 4M_n^2 + \frac{4N^*(\gamma)}{\epsilon}\,\binom{l_n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I^m_{l_n}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\tilde F\,\mathbb{1}\{\tilde F > M_n\}(B_{i_1},\dots,B_{i_m}),
\end{aligned}$$
yet the $\bar B_i$ are i.i.d. and $\mathbb{E}\tilde F = \mathbb{E}(\tau)^m\,\mathbb{E} F < \infty$, so for any $M_n\to\infty$, we have
$$\binom{E(l_n)}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\tilde F\,\mathbb{1}\{\tilde F > M_n\}(B_{i_1},\dots,B_{i_m}) \longrightarrow 0 \quad a.s.$$
Using the same argument as in part (iii) of Lemma 1, we can prove that
$$\binom{l_n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I^m_{l_n}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,F\,\mathbb{1}\{F > M_n\}(B_{i_1},\dots,B_{i_m}) \longrightarrow 0 \quad\text{in probability}.$$
Then, it remains to show that, for every fixed $\gamma>0$, $N^*(\gamma)$ is bounded in probability, as the last expression in (38) does not depend on $k$. It is worth noting that $N_1(\gamma,\mathcal{H},P)$ is finite, owing to the boundedness of $\mathcal{H}$ by $2F$ with $\mathbb{E} F(B) < \infty$ and the fact that the $B_i$ are i.i.d. and discrete random variables. Under the norm $L_1(P)$, define $\gamma/2$-brackets $h_1^\ell,\dots,h_{N(\gamma/2)}^\ell$ and $h_1^u,\dots,h_{N(\gamma/2)}^u$. Observe that
$$\max_{j\le N(\gamma/2)}\Bigl|\binom{l_n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{l_n}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\bigl(h_j^u - h_j^\ell\bigr)(B_{i_1},\dots,B_{i_m})\Bigr|$$
converges to zero in probability, and $N(\gamma/2)$ does not depend on $n$. This implies that $N^*(\gamma)\le N(\gamma/2) < \infty$ in probability. Replacing $h^\ell$ by $h^u$, $I_A$ is treated identically to $I_B$; i.e., $I_A$ also converges to zero in probability. This proves the convergence of $I_n$ to zero in probability.
For $II_n$: In the same manner, let
$$S_n^* := \sum_{i=1}^{E(l_n)}\mathbb{1}\bigl(\hat B_i^*\in A_n^c\bigr).$$
Define a new bootstrap sample $\{B_i^{**}\}$ for $i = l_n+1,\dots,E(l_n)$. Clearly, the new sample is well defined, since we assumed at the beginning that $l_n \le E(l_n)$, and it is drawn independently of $B_i^*$ and $\hat B_i^*$. In this case:
$$\mathcal{L}\Bigl(\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(\hat B_{i_1},\dots,\hat B_{i_m})\,\mathbb{1}\bigl((\hat B_{i_1},\dots,\hat B_{i_m})\in A_n^c\bigr)\Bigr) = \mathcal{L}\Bigl(\sum_{(i_1,\dots,i_m)\in I^m_{S_n^*}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B^{**}_{i_1},\dots,B^{**}_{i_m})\Bigr).$$
Hence, as in (33), we have:
$$\begin{aligned}
\mathbb{P}^*\bigl(II_n > \epsilon\bigr) &= \mathbb{P}^{**}\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I^m_{S_n^*}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > \epsilon\Bigr)\\
&\le C_1\,\mathbb{P}^{**}\Bigl(\Bigl\|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_{\tilde h}(B_{i_1},\dots,B_{i_m})\Bigr\|_{\mathcal{F}_\delta} > C_2\,\epsilon\Bigr) + \mathbb{P}^*\bigl(S_n^* > K\sqrt{n}\bigr), \quad\text{for any } K>0,
\end{aligned}$$
where
$$I'_m := \bigl\{(i_1,\dots,i_m):\ 1\le i_1<\dots<i_m,\ \exists\,\ell\in\{1,\dots,m\}:\ S_n^* < i_\ell \le K\sqrt{n},\ i_j\neq i_k\ \text{for } j\neq k\bigr\}.$$
Using the same bootstrap procedure defined previously for $I_n$, let
$$\bar B_i := \bigl(X_{T_{i-1}+1},\dots,X_{T_i},0,0,\dots\bigr),$$
for $i = l_n+1,\dots,E(l_n)$, and let $\bar{\mathcal{F}}$ be a class of functions such that, for every $\omega_{\bar h}\in\bar{\mathcal{F}}$:
$$\omega_{\bar h}\bigl(\bar B_1,\bar B_2,\dots,\bar B_k\bigr) = \begin{cases}\displaystyle\sum_{i_1=1}^{\infty}\cdots\sum_{i_k=1}^{\infty} h\bigl(x_{i_1},\dots,x_{i_k}\bigr)\,\mathbb{1}\{x_{i_k}\neq 0\} & \text{if defined},\\[4pt] 0 & \text{otherwise}.\end{cases}$$
It is classical that the $\{\bar B_i\}$ are i.i.d.; we apply the same bootstrap method as in Algorithm 1. This new sample allows us to enlarge and bound (33) by
$$\mathbb{P}^{**}\Bigl(\sup_{h\in\mathcal{H}}\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\omega_h(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr| > \epsilon\Bigr),$$
where
$$h\in\mathcal{H} = \{\omega_f - \omega_g,\ f,g\in\mathcal{F}\},$$
corresponding to the class
$$\bar{\mathcal{H}} = \{\omega_{\bar f} - \omega_{\bar g},\ f,g\in\mathcal{F}\},$$
with envelopes $\tilde F$ and $F$, respectively. As before, for the class of functions $\mathcal{H}$, consider the bracket $[h^\ell, h^u]$ such that
$$\mathbb{E}^{**}\bigl(h^u - h^\ell\bigr) \le \gamma,$$
where $\gamma>0$ will be determined later. In this framework, the bracketing entropy number is $N^{**}(\gamma) := N_1(\gamma,\mathcal{H},D^{**})$, for
$$D^{**} = \binom{E(l_n)-l_n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)-l_n}}\xi_{i_1,l_n}\cdots\xi_{i_m,l_n}\,\delta_{(B_{i_1},\dots,B_{i_m})}.$$
Following the same arguments as in Equations (37) and (38), we find that (42) is bounded by
$$\begin{aligned}
&N^{**}(\gamma)\max_{k\le N^{**}(\gamma)}\mathbb{P}\Bigl(\Bigl|\binom{n}{m}^{-1/2}\sum_{(i_1,\dots,i_m)\in I'_m}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\bigl(h_k^\ell - \mathbb{E}^{**} h_k^\ell\bigr)(\bar B_{i_1},\dots,\bar B_{i_m})\Bigr| > \epsilon\Bigr)\\
&\quad\le \frac{N^{**}(\gamma)}{\epsilon^2}\,\binom{n}{m}^{-1/2} c^2\cdot 4M_n^2 + \frac{2N^{**}(\gamma)}{\epsilon}\,\mathbb{E}^{**}\Bigl|(\xi_{1,l_n}-1)\cdots(\xi_{m,l_n}-1)\,\tilde F\,\mathbb{1}\{\tilde F > M_n\}\Bigr|\\
&\quad\le \frac{N^{**}(\gamma)}{\epsilon^2}\,\binom{n}{m}^{-1/2} c^2\cdot 4M_n^2 + \frac{4N^{**}(\gamma)}{\epsilon}\,\binom{E(l_n)-l_n}{m}^{-1}\sum_{(i_1,\dots,i_m)\in I^m_{E(l_n)-l_n}}(\xi_{i_1,l_n}-1)\cdots(\xi_{i_m,l_n}-1)\,\tilde F\,\mathbb{1}\{\tilde F > M_n\}(\bar B_{i_1},\dots,\bar B_{i_m}).
\end{aligned}$$
Here, we must pay attention to the randomness of $N^{**}$, which depends on $n$. According to Lemma 1 (i), we can see that $|E(l_n)-l_n| \to \infty$ in probability, under the assumption that $l_n < E(l_n)$. Now, using the same treatment as for $I_n$, and taking $M_n := n^{1/3}$ (to ensure the convergence of $M_n$ to $\infty$), as in [47], we obtain the convergence of (43) to zero in probability. Estimating now $N^{**}$ by considering the same $\gamma/2$-brackets $h_1^\ell,\dots,h_{N(\gamma/2)}^\ell$ and $h_1^u,\dots,h_{N(\gamma/2)}^u$, we have $N^{**}(\gamma) \le N(\gamma/2)$, which does not depend on $n$. Then, the claim for $II_n$ is proved. Following the same steps, we can treat the case where $l_n > E(l_n)$. This proves Step 3. □
The previous step shows that we only need to establish the stochastic equicontinuity of $\hat T^*_{l_n}$, in which the number of blocks is replaced by the deterministic quantity $E(l_n)$. In order to achieve the equicontinuity of this statistic, Lemma 2 shows that it is sufficient to prove that:
$$\mathbb{E}\Bigl\|\sum_{1\le i_k\le \ell_k,\,1\le k\le m(n)}\epsilon^{(1)}_{i_1}\cdots\epsilon^{(m)}_{i_m}\,\omega_h\bigl(\hat B^{(1)}_{i_1},\dots,\hat B^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{H}} \le a(\ell_1,\dots,\ell_m)\prod_{k=1}^{m}\ell_k^{1/2}$$
for all $1\le \ell_1,\dots,\ell_m\le n$. We begin by defining the distance:
$$e^2(f,g) \equiv \Bigl(\prod_{k=1}^{m}\ell_k\Bigr)^{-1}\sum_{1\le i_k\le \ell_k,\,1\le k\le m}\omega^2_{f-g}\bigl(\hat B^{(1)}_{i_1},\dots,\hat B^{(m)}_{i_m}\bigr),$$
defined in $L_2$, associated with the Rademacher chaos process
$$\Bigl\{\Bigl(\prod_{k=1}^{m}\ell_k\Bigr)^{-1/2}\sum_{1\le i_k\le \ell_k,\,1\le k\le m}\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,\omega_h\bigl(\hat B^{(1)}_{i_1},\dots,\hat B^{(m)}_{i_m}\bigr):\ h\in\mathcal{H}\Bigr\}, \quad\text{conditionally on } \hat B^{(1)},\dots,\hat B^{(m)}.$$
Take $\|f\|_2 \equiv e(f,0)$ and
$$r(\delta) \equiv \sup_{f\in\mathcal{F}_\delta}\|f\|_2.$$
Using Corollary A1, we have
$$\begin{aligned}
\mathbb{E}\Bigl\|\Bigl(\prod_{k=1}^{m}\ell_k\Bigr)^{-1/2}\sum_{1\le i_k\le \ell_k,\,1\le k\le m}\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,\omega_h\bigl(\hat B^{(1)}_{i_1},\dots,\hat B^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{H}}
&\le C\int_0^{r(\delta)}\bigl(\log N(\varepsilon,\mathcal{F},e)\bigr)^{m/2}\,d\varepsilon\\
&= C\,\|F\|\int_0^{r(\delta)/\|F\|}\bigl(\log N(\varepsilon\|F\|,\mathcal{F},e)\bigr)^{m/2}\,d\varepsilon\\
&\le C\,\|F\|\int_0^{r(\delta)/\|F\|}\sup_{Q}\bigl(\log N(\varepsilon\|F\|_{L_2(Q)},\mathcal{F},L_2(Q))\bigr)^{m/2}\,d\varepsilon.
\end{aligned}$$
Assuming that $\|F\| \ge 1$, the upper bound of the integral can be replaced by $r(\delta)$. The following proposition is needed in the sequel.
Proposition 3
([46]). Let $X_i$ be i.i.d. random variables with law $P$. Let $\mathcal{H}$ be a class of measurable real-valued functions defined on $(\mathcal{X}^m,\mathcal{A}^m)$ with a $P^m$-integrable envelope such that the following holds: for any fixed $\delta>0$, $M>0$, $1\le k\le m$,
$$\max_{1\le j\le k}\,\mathbb{E}\,\frac{\log N\bigl(\delta,\pi_k\mathcal{H}_M,e_{\ell,j}\bigr)}{\ell_j^{1/2}} \longrightarrow 0$$
holds for any $\ell_1,\dots,\ell_k\to\infty$. Here, for $j = 1,\dots,k$ and $\{X_i\}_{i=1}^{\infty}$,
$$e_{\ell,j}(f,g) \equiv \frac{1}{\ell_j}\sum_{i_j=1}^{\ell_j}\Bigl(\prod_{j'\neq j}\ell_{j'}\Bigr)^{-1}\sum_{1\le i_{j'}\le \ell_{j'}:\ j'\neq j}\bigl|(f-g)\bigl(X_{i_1},\dots,X_{i_k}\bigr)\bigr|$$
and
$$\pi_k\mathcal{H}_M \equiv \bigl\{h\,\mathbb{1}\{\mathcal{H}_k\le M\}:\ h\in\pi_k\mathcal{H}\bigr\},$$
where $\mathcal{H}_k$ is an envelope for $\pi_k\mathcal{H}$. Then,
$$\sup_{h\in\mathcal{H}}\Bigl|\Bigl(\prod_{k=1}^{m}\ell_k\Bigr)^{-1}\sum_{1\le i_k\le \ell_k,\,1\le k\le m}h\bigl(X_{i_1},\dots,X_{i_m}\bigr) - P^m h\Bigr| \longrightarrow 0$$
in $L_1$ as $\ell_1,\dots,\ell_m\to\infty$. The above equation can be replaced by its decoupled version.
By this proposition, $\|F\| \to_{P} \|F\|_{L_2(P)}$ as $\ell_1,\dots,\ell_m\to\infty$; therefore, it suffices to obtain $r(\delta)\to_{p} 0$ as $\ell_1,\dots,\ell_m\to\infty$ and $\delta\to 0$. It is obvious that all that is left to do now is to demonstrate that
$$\sup_{f\in\tilde{\mathcal{F}}_\delta}\Bigl|\Bigl(\prod_{k=1}^{m}\ell_k\Bigr)^{-1}\sum_{1\le i_k\le \ell_k,\,1\le k\le m}\omega^2_{\tilde h}\bigl(\hat B^{(1)}_{i_1},\dots,\hat B^{(m)}_{i_m}\bigr) - P^m\omega^2_{\tilde h}\Bigr| \to_{p} 0.$$
Verifying condition (45):
$$\begin{aligned}
\max_{1\le j\le k}\,\mathbb{E}\,\frac{\log N\bigl(\delta,\mathcal{F}^2_M,e_{\ell,j}\bigr)}{\ell_j^{1/2}}
&\le \frac{1}{\ell_k^{1/2}}\,\delta^{-1}\,\mathbb{E}\int_0^{\delta}\bigl(\log N(\varepsilon,\mathcal{F}^2_M,e_{\ell,j})\bigr)^{m/2}\,d\varepsilon\\
&\le \Bigl(\frac{\delta}{2M}\Bigr)^{-1}\frac{1}{\ell_k^{1/2}}\,\mathbb{E}\int_0^{\delta/2M}\bigl(\log N(\varepsilon,\mathcal{F}_M,e_{\ell,j})\bigr)^{m/2}\,d\varepsilon\\
&\le \Bigl(\frac{\delta}{2M}\Bigr)^{-1}\frac{1}{\ell_k^{1/2}}\int_0^{1}\sup_{Q}\bigl(\log N(\varepsilon\|F\|_{L_2(Q)},\mathcal{F},L_2(Q))\bigr)^{m/2}\,d\varepsilon \times \|F\|_{L_2(P^m)} \longrightarrow 0.
\end{aligned}$$
The shift from the second to the third line holds because
$$N\bigl(\delta,\mathcal{F}^2_M,L_2(Q)\bigr) \le N\bigl(\delta/2M,\mathcal{F}_M,L_2(Q)\bigr).$$
As the condition is verified, letting $\ell_1,\dots,\ell_m\to\infty$, (46) follows directly from the previous proposition. Hence, there exists a sequence $\{a_\ell\}$ such that $a_\ell\to 0$ for any sequence $\{\delta_\ell\}$ with $\delta_\ell\to 0$, both as $\ell_1,\dots,\ell_m\to\infty$, such that:
$$\mathbb{E}\Bigl\|\sum_{1\le i_k\le \ell_k,\,1\le k\le m}\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,\omega_{\tilde h}\bigl(\hat B^{(1)}_{i_1},\dots,\hat B^{(m)}_{i_m}\bigr)\Bigr\|_{\tilde{\mathcal{H}}} \le a_\ell\prod_{k=1}^{m}\ell_k^{1/2}.$$
An application of Lemma 2 proves that
$$\binom{n}{m}^{-1/2}\,\mathbb{E}\Bigl\|\sum_{1\le i_1,\dots,i_m\le n}(\xi_{i_1}-1)\cdots(\xi_{i_m}-1)\,\omega_{\tilde h}\bigl(\hat B_{i_1},\dots,\hat B_{i_m}\bigr)\Bigr\|_{\mathcal{F}_{\delta_n}} \longrightarrow 0, \quad n\to\infty.$$
This completes the proof of the asymptotic equicontinuity.
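As a closing numerical aside (not part of the proof), the entropy integrals that drive the chaining bounds above are easy to evaluate for a VC-type class with polynomial covering numbers $N(\varepsilon) = (A/\varepsilon)^{v}$; the sketch below computes $\int_0^{r}(\log N(\varepsilon))^{m/2}\,d\varepsilon$ numerically and shows how it vanishes with the modulus $r = r(\delta)$. The constants $A$, $v$ and the grid are illustrative assumptions.

```python
import numpy as np

def entropy_integral(r, m=2, A=2.0, v=1.0, grid=100000):
    """Evaluate int_0^r (log N(eps))^(m/2) d(eps) for N(eps) = (A/eps)^v,
    using a simple Riemann sum that avoids the eps = 0 endpoint."""
    eps = np.linspace(r / grid, r, grid)
    integrand = (v * np.log(A / eps)) ** (m / 2.0)
    return integrand.mean() * r

# The integral inherits the decay of the modulus r(delta) -> 0:
for r in [1.0, 0.1, 0.01, 0.001]:
    print(f"r = {r:6.3f}   integral = {entropy_integral(r):.5f}")
```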

Author Contributions

I.S. and S.B.: conceptualization, methodology, investigation, writing—original draft, writing—review and editing. All authors contributed equally to the writing of this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the Special Issue Editor of the Special Issue on “Current Developments in Theoretical and Applied Statistics”, Christophe Chesneau, for the invitation. The authors are indebted to the Editor-in-Chief and the three referees for their very generous comments and suggestions on the first version of our article, which helped us to improve the content, presentation and layout of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This appendix contains supplementary information that is an essential part of providing a more comprehensive understanding of the paper. We also refer to [46] for more details.
Proof of Proposition 1.
Let $B_0 = \bigl(X_1,\dots,X_{T_0}\bigr)$ and $B^{(n)}_{l_n} = \bigl(X_{T_{l_n-1}+1},\dots,X_n\bigr)$ be the possibly empty non-regenerative blocks of observations. Note that, for $l_n \le 2$, the demonstration can be found directly in [59]; under the assumptions (C1), (C2) and (C3), we have
$$\mathbb{P}_\nu\bigl(l_n \le 2\bigr) = O\bigl(n^{-2}\bigr).$$
Otherwise, for $l_n > 2$, we can write $W_n(h)$ as follows:
$$W_n(h) = (I) + (II),$$
where
$$\begin{aligned}
(I) &= \frac{1}{\binom{n}{m}}\Biggl[\sum_{1\le i_1<\dots<i_{m-1}\le l_n-1}\omega_{\tilde h}\bigl(B_0,B_{i_1},\dots,B_{i_{m-1}}\bigr) + \sum_{1\le i_1<\dots<i_{m-1}\le l_n-1}\omega_{\tilde h}\bigl(B_{i_1},\dots,B_{i_{m-1}},B^{(n)}_{l_n}\bigr)\Biggr],\\
(II) &= \frac{1}{\binom{n}{m}}\Biggl[\sum_{j=2}^{m}\sum_{0\le k<i_1<\dots<i_{m-j}\le l_n}\omega_{\tilde h}\bigl(\underbrace{B_k,\dots,B_k}_{j\ \text{times}},B_{i_1},\dots,B_{i_{m-j}}\bigr) - \sum_{j=2}^{m}\sum_{1\le k<i_1<\dots<i_{m-j}\le n}\tilde h\bigl(\underbrace{X_k,\dots,X_k}_{j\ \text{times}},X_{i_1},\dots,X_{i_{m-j}}\bigr)\Biggr]\\
&= \frac{1}{\binom{n}{m}}\Biggl[\sum_{(I^{l_n-1}_m)^c}\omega_{\tilde h}\bigl(B_{i_1},\dots,B_{i_m}\bigr) - \sum_{(I^{n}_m)^c}\tilde h\bigl(X_{i_1},\dots,X_{i_m}\bigr)\Biggr],
\end{aligned}$$
where
$$\bigl(I^s_m\bigr)^c = \bigl\{(i_1,\dots,i_m):\ i_j\in\mathbb{N},\ 1\le i_j\le s;\ \text{there are at least two indices } j \text{ and } k \text{ such that } i_j = i_k\bigr\},$$
the complement of the index set, with cardinality equal to $\binom{s+m-1}{m} - \binom{s}{m} =: \overline{s^m}$. To prove the convergence of $W_n(h)$ to zero in probability, we must establish the convergence of (I) and (II) to zero in probability. First,
$$A = \overline{n^m}^{\,-1}\sum_{(I^{l_n-1}_m)^c}\omega_{\tilde h}\bigl(B_{i_1},\dots,B_{i_m}\bigr) \xrightarrow[n\to\infty]{} \alpha^{-m}\Bigl[\mathbb{E}\Bigl(\omega_h\bigl(B_1,\dots,\underbrace{B_k,\dots,B_k}_{u\ \text{times}},B_{k+u},\dots,B_m\bigr)\Bigr) - \mathbb{E}_A\bigl(\tau^u\bigr)\bigl(\mathbb{E}_A(\tau)\bigr)^{m-u}\mu(h)\Bigr],$$
where $1\le k\le m$ and $1\le u\le k$. We apply the SLLN for Harris Markov chains to obtain the convergence of
$$B = \overline{n^m}^{\,-1}\sum_{(I^{n}_m)^c}\tilde h\bigl(X_{i_1},\dots,X_{i_m}\bigr)$$
to
$$\int h\bigl(x_1,\dots,\underbrace{x_k,\dots,x_k}_{u\ \text{times}},x_{k+u},\dots,x_m\bigr)\,d\mu(x_1)\cdots d\mu^{u}(x_k)\cdots d\mu(x_{k+u})\cdots d\mu(x_m) - \mu(h).$$
Using the conditions, all terms in $A$ and $B$ are finite, and we can prove the convergence of $(II)$ to zero. Now, for $(I)$, applying the SLLN and part (i) of Lemma 3.2 in [47], we can see that
$$\mathbb{P}_\nu\Bigl(\lim_{n\to+\infty}\frac{l_n}{n} = \alpha^{-1} = \bigl(\mathbb{E}_A(\tau)\bigr)^{-1}\Bigr) = 1.$$
We have
$$n^{-2m}\,\mathbb{E}_\nu\Bigl(\sum_{1\le i_1<\dots<i_{m-1}\le l_n-1}\omega_{\tilde h}\bigl(B_0,B_{i_1},\dots,B_{i_{m-1}}\bigr)\Bigr)^{2} \le 2\,\alpha^{-2m}\Bigl\{\mathbb{E}_\nu\,\omega_{|h|}\bigl(B_0,B_1,\dots,B_{m-1}\bigr)^{2} + \mathbb{E}_\nu(\tau_0)\bigl(\mathbb{E}_A(\tau)\bigr)^{m-1}\mu(h)^{2}\Bigr\} < \infty.$$
We obtain, in turn, that
$$n^{-2m}\,\mathbb{E}_\nu\Bigl(\sum_{1\le i_1<\dots<i_{m-1}\le l_n}\omega_{\tilde h}\bigl(B_{i_1},\dots,B_{i_{m-1}},B^{(n)}_{l_n}\bigr)\Bigr)^{2} \le 2\,\alpha^{-2m}\Bigl\{\mathbb{E}_\nu\,\omega_{|h|}\bigl(B_1,\dots,B_{m-1},B^{(n)}_{l_n}\bigr)^{2} + \bigl(\mathbb{E}_A(\tau)\bigr)^{m}\mu(|h|)^{2}\Bigr\} < \infty.$$
Hence, $(I)$ also converges to zero a.s. under $\mathbb{P}_\nu$ as $n\to\infty$. □
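The renewal fact $l_n/n \to \alpha^{-1} = (\mathbb{E}_A(\tau))^{-1}$ invoked in this proof is easy to visualize; the sketch below uses a hypothetical two-state chain with atom $\{0\}$, for which $1/\mathbb{E}_A(\tau)$ equals the stationary mass $\pi(0) = c/(b+c)$ in closed form, and checks the ratio along one long trajectory.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two-state chain; atom A = {0}.  P(0 -> 1) = b, P(1 -> 0) = c.
b, c = 0.8, 0.6
pi0 = c / (b + c)        # stationary mass of the atom = 1 / E_A(tau) (Kac's formula)

n = 200_000
X = np.zeros(n, dtype=int)
for t in range(1, n):
    p_one = b if X[t - 1] == 0 else 1.0 - c    # probability of moving to state 1
    X[t] = rng.random() < p_one

l_n = np.count_nonzero(X == 0)   # regeneration count = number of visits to the atom
print("l_n / n      =", l_n / n)
print("1 / E_A(tau) =", pi0)
```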
Proof of Theorem 1.
In what follows, let $L = l_n - 1$ denote the number of complete blocks observed. We find that
$$R_L(h) = S_L(h) + D_L(h),$$
where
$$S_L(h) = \frac{m}{L}\sum_{i=1}^{L}\tilde h^{(1)}(B_i), \qquad D_L(h) = \sum_{j=2}^{m}\binom{m}{j}\binom{L}{j}^{-1}\sum_{1\le i_1<\dots<i_j\le L}\tilde h^{(j)}\bigl(B_{i_1},\dots,B_{i_j}\bigr),$$
and where $\tilde h^{(c)}(\cdot)$ represents the conditional expectation of $\omega_{\tilde h}(\cdot)$ given $c$ of the coordinates, for all $B_c\in\mathcal{T}$. The U-statistic $D_L(h)$ is obtained by truncating the Hoeffding decomposition after the first term $S_L(h)$. Then, we just need to show the two points below (a small numerical illustration follows the list):
  • $L^{1/2}\,S_L(h)$ converges weakly to a Gaussian process $G_P$ on $\ell^{\infty}(\mathcal{F})$;
  • $L^{1/2}\,\bigl\|D_L(h)\bigr\|_{\mathcal{F}} \to 0$.
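The following minimal sketch illustrates this decomposition numerically. It is an assumption-laden toy: i.i.d. $\mathrm{Exp}(1)$ data and the kernel $h(x,y) = xy$ with $m = 2$, for which the first projection $\tilde h^{(1)}(x) = x\mu - \mu^2$ is explicit. The linear part carries the $\sqrt{L}$ fluctuations, while the degenerate remainder is an order of magnitude smaller.

```python
import numpy as np

rng = np.random.default_rng(5)
mu = 1.0                        # E X for Exp(1); theta = E h(X, Y) = mu^2

for L in [100, 400, 1600]:
    x = rng.exponential(mu, size=L)
    # U-statistic for h(x, y) = x * y over pairs i < j, in closed form:
    # sum_{i<j} x_i x_j = ((sum x)^2 - sum x^2) / 2, divided by C(L, 2).
    U = (x.sum() ** 2 - (x ** 2).sum()) / (L * (L - 1))
    S = (2.0 / L) * np.sum(x * mu - mu ** 2)    # Hajek projection term S_L
    D = U - mu ** 2 - S                         # degenerate remainder D_L
    print(f"L={L:5d}  sqrt(L)*S_L = {np.sqrt(L) * S:+.3f}"
          f"   sqrt(L)*D_L = {np.sqrt(L) * D:+.4f}")
```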
For $P_L(h^{(1)}) := \frac{1}{L}\sum_{i=1}^{L}\tilde h^{(1)}(B_i)$, introduce
$$Z_L(h) = \sqrt{L}\bigl(P_L(h^{(1)}) - P(h^{(1)})\bigr) = \frac{1}{\sqrt{L}}\sum_{i=1}^{L}\Bigl(P^{m-1}(h)(B_i) - P^{m}(h)\Bigr).$$
Using (A1), we can replace the random variable $L = l_n - 1$ with the deterministic quantity $\breve L$, and we write
$$Z_{\breve L}(h) = \frac{1}{\sqrt{\breve L}}\sum_{i=1}^{\breve L}\Bigl(P^{m-1}(h)(B_i) - P^{m}(h)\Bigr) + o_{\mathbb{P}}(1),$$
where $\breve L = 1 + \lfloor n/\mathbb{E}_A(\tau)\rfloor$. In order to establish the weak convergence of the empirical process $Z_{\breve L}(h)$, it is sufficient and necessary to prove the finite-dimensional convergence and the stochastic equicontinuity. For the finite-dimensional convergence, we have to prove that $\bigl(Z_{\breve L}(h_{i_1}),\dots,Z_{\breve L}(h_{i_k})\bigr)$ converges weakly to $\bigl(G(h_{i_1}),\dots,G(h_{i_k})\bigr)$, for every fixed finite collection of functions
$$h_{i_1},\dots,h_{i_k}\in\mathcal{F}.$$
In order to establish this, it is enough to show that, for every fixed $a_1,\dots,a_k\in\mathbb{R}$,
$$\sum_{j=1}^{k} a_j Z_{\breve L}(h_{i_j}) \longrightarrow N\bigl(0,\sigma^2\bigr) \quad\text{in distribution},$$
where
$$\sigma^2 = \sum_{j=1}^{k} a_j^2\operatorname{Var}\bigl(Z_{\breve L}(h_{i_j})\bigr) + \sum_{s\neq r} a_s a_r\operatorname{Cov}\bigl(Z_{\breve L}(h_{i_s}),Z_{\breve L}(h_{i_r})\bigr).$$
By linearity, and following the arguments of ([57], Chapter 17), we can prove that
$$\frac{1}{\sqrt{n}}\sum_{j=1}^{\breve L}\tilde h^{(1)}(B_j) \longrightarrow N\bigl(0,\gamma^2_{h^{(1)}}\bigr),$$
where, under Condition (C5),
$$\gamma^2_{h^{(1)}} = \alpha^{-1}\,\mathbb{E}_A\,\tilde h^{(1)2}(B_1).$$
We readily infer that
$$\sqrt{L}\,S_L(h) \longrightarrow N\Bigl(0,\ m^2\,\mathbb{E}_A\,\tilde h^{(1)2}(B_1)\Bigr).$$
Now, to verify the equicontinuity, we need to check that, for every $\epsilon>0$,
$$\lim_{\delta\to 0}\lim_{n\to\infty}\mathbb{P}\Bigl(\sup_{d(f,g)\le\delta}\bigl|Z_L(f) - Z_L(g)\bigr| > \epsilon\Bigr) = 0,$$
where $d(\cdot,\cdot)$ is a pseudo-distance for which the class $\mathcal{F}$ is totally bounded, and $f,g$ belong to $\mathcal{F}$. According to [72], we have
$$\bigl|Z_L(f-g)\bigr| = \Bigl|\frac{1}{\sqrt{L}}\sum_{k=1}^{L}\Bigl((f-g)(B_k) - P^m(f-g)\Bigr)\Bigr| \le \Bigl|\frac{1}{\sqrt{L}}\sum_{a\le k\le b}\Bigl((f-g)(B_k) - P^m(f-g)\Bigr)\Bigr| + \Bigl|\frac{1}{\sqrt{L}}\sum_{1\le k\le n/\mathbb{E}(\tau)}\Bigl((f-g)(B_k) - P^m(f-g)\Bigr)\Bigr|,$$
where $a = \min\bigl(L, n/\mathbb{E}(\tau)\bigr)$ and $b = \max\bigl(L, n/\mathbb{E}(\tau)\bigr)$. For the left-hand part of the last bound, we have
$$\Bigl|\sum_{a\le k\le b}\Bigl((f-g)(B_k) - P^m(f-g)\Bigr)\Bigr| \le 2\sup_{f\in\mathcal{F}}\max_{1\le s\le n}\Bigl|\sum_{1\le k\le s}\Bigl(f(B_k) - P^m(f)\Bigr)\Bigr|.$$
Dividing the last inequality by $L^{1/2}$ and using the convergence result in ([72], Lemma 2.11) with Condition (C1), we obtain the desired result. The right-hand part of the bound is treated using ([72], Lemma 4.2), provided that $\mathbb{E}_A(\tau)^{2+\alpha} < \infty$ for some $\alpha>0$, together with the hypothesis of a finite uniform entropy integral. To complete the weak convergence of the regenerative U-statistic, we must treat the remaining terms of its Hoeffding decomposition. For $h\in\mathcal{F}$, let us introduce
$$\zeta := \omega_{\tilde h}(B_1,\dots,B_m) - P^m(h) - \sum_{i=1}^{m}\tilde h^{(1)}(B_i), \qquad B_i\in\mathcal{T}.$$
One can see that $\zeta$ is centered, that is,
$$\int \zeta(B_1,\dots,B_m)\,dP(B_1)\cdots dP(B_i)\cdots dP(B_m) = 0.$$
By the randomization theorem, according to [7] (for $r = 2$):
$$\mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le\breve L}\zeta\bigl(B_{i_1},\dots,B_{i_m}\bigr)\Bigr\| = \mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le\breve L}\varepsilon^{(1)}_{i_1}\varepsilon^{(2)}_{i_2}\,\zeta\bigl(B^{(1)}_{i_1},\dots,B^{(1)}_{i_m}\bigr)\Bigr\|.$$
Hence, for $C$ a constant:
$$\mathbb{E}\,\breve L^{-1/2}\Bigl\|\sum_{1\le i_1<\dots<i_m\le\breve L}\zeta\bigl(B_{i_1},\dots,B_{i_m}\bigr)\Bigr\|_{\mathcal{F}} \le C\,\mathbb{E}\int_0^{\breve L^{-1/2}}\log N_{n,2}(\varepsilon,\mathcal{F})\,d\varepsilon.$$
It now suffices to use the theorem's hypothesis of a finite uniform entropy integral to complete the proof of the theorem. □
Proof of Theorem 2.
We have
$$\mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}\xi_{i_1}\cdots\xi_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}}.$$
By decoupling of the U-process, due to [6], this is bounded by
$$C_m\,\mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}\xi^{(1)}_{i_1}\cdots\xi^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}}.$$
By symmetrization, due to [6], this is further bounded by
$$2^m C_m\,\mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}\xi_{i_1}\cdots\xi_{i_m}\,\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}},$$
for $\bigl(\operatorname{sgn}(\xi_1)\varepsilon^*_1,\dots,\operatorname{sgn}(\xi_n)\varepsilon^*_n\bigr)$ a sequence independent of, and with the same distribution as, $(\xi_1,\dots,\xi_n)$. By the invariance of $(P_\epsilon\otimes P)^{mn}$ and the fact that $\xi$ is independent of $(X_\cdot,\epsilon_\cdot)$, we have that the last quantity
$$\begin{aligned}
&= 2^m C_m\,\mathbb{E}_{x,\epsilon}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}|\xi_{i_1}|\cdots|\xi_{i_m}|\,\operatorname{sgn}(\xi_{i_1})\varepsilon^{(1)}_{i_1}\cdots\operatorname{sgn}(\xi_{i_m})\varepsilon^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}}\\
&= 2^m C_m\,\mathbb{E}_{x,\epsilon}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}|\xi_{i_1}|\cdots|\xi_{i_m}|\,\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}},
\end{aligned}$$
using the reversed order statistics of $\{|\xi_i|\}_{i=1}^{n}$, $|\xi_{(1)}| \ge \dots \ge |\xi_{(n)}|$, and the permutations between the different sequences of random variables, following the same steps as [46], this equals
$$2^m C_m\,\mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}|\xi_{(i_1)}|\cdots|\xi_{(i_m)}|\,\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}};$$
substituting $|\xi_{(i)}|$ by $\sum_{k=i}^{n}\bigl(|\xi_{(k)}| - |\xi_{(k+1)}|\bigr)$, with $|\xi_{(n+1)}| = 0$, we have the bound
$$\begin{aligned}
&2^m C_m\,\mathbb{E}\Bigl\|\sum_{1\le i_1<\dots<i_m\le n}\ \sum_{\ell_j\ge i_j,\,1\le j\le m}\bigl(|\xi_{(\ell_1)}|-|\xi_{(\ell_1+1)}|\bigr)\cdots\bigl(|\xi_{(\ell_m)}|-|\xi_{(\ell_m+1)}|\bigr)\,\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}}\\
&\quad\le 2^m C_m\,\mathbb{E}\Biggl[\sum_{1\le\ell_1,\dots,\ell_m\le n}\bigl(|\xi_{(\ell_1)}|-|\xi_{(\ell_1+1)}|\bigr)\cdots\bigl(|\xi_{(\ell_m)}|-|\xi_{(\ell_m+1)}|\bigr)\,\mathbb{E}\Bigl\|\sum_{1\le i_k\le \ell_k,\,1\le k\le m}\varepsilon^{(1)}_{i_1}\cdots\varepsilon^{(m)}_{i_m}\,f\bigl(Y^{(1)}_{i_1},\dots,Y^{(m)}_{i_m}\bigr)\Bigr\|_{\mathcal{F}}\Biggr]\\
&\quad\le 2^m C_m\,\mathbb{E}\sum_{1\le\ell_1,\dots,\ell_m\le n}\int_{|\xi_{(\ell_1+1)}|}^{|\xi_{(\ell_1)}|}\cdots\int_{|\xi_{(\ell_m+1)}|}^{|\xi_{(\ell_m)}|}\psi_n\bigl(\ell_1,\dots,\ell_m\bigr)\,dt_m\cdots dt_1\\
&\quad\le 2^m C_m\,\mathbb{E}\sum_{1\le\ell_1,\dots,\ell_m\le n}\int_{|\xi_{(\ell_1+1)}|}^{|\xi_{(\ell_1)}|}\cdots\int_{|\xi_{(\ell_m+1)}|}^{|\xi_{(\ell_m)}|}\psi_n\bigl(|\{i:|\xi_i|>t_1\}|,\dots,|\{i:|\xi_i|>t_m\}|\bigr)\,dt_m\cdots dt_1\\
&\quad\le 2^m C_m\,\mathbb{E}\int_{\mathbb{R}_{\ge 0}^{m}}\psi_n\bigl(|\{i:|\xi_i|>t_1\}|,\dots,|\{i:|\xi_i|>t_m\}|\bigr)\,dt_1\cdots dt_m\\
&\quad= 2^m C_m\int_{\mathbb{R}_{\ge 0}^{m}}\mathbb{E}\,\psi_n\Bigl(\sum_{i=1}^{n}\mathbb{1}\{|\xi_i|>t_1\},\dots,\sum_{i=1}^{n}\mathbb{1}\{|\xi_i|>t_m\}\Bigr)\,dt_1\cdots dt_m \quad\text{(by Fubini's theorem)}.
\end{aligned}$$
Now, suppose that $\psi_n(\ell_1,\dots,\ell_m) = \bar\psi_n\bigl(\prod_{k=1}^{m}\ell_k\bigr)$. Then, we may further bound the above expression by
$$\begin{aligned}
\int_{\mathbb{R}_{\ge 0}^{m}}\mathbb{E}\,\bar\psi_n\Bigl(\prod_{k=1}^{m}\sum_{i=1}^{n}\mathbb{1}\{|\xi_i|>t_k\}\Bigr)\,dt_1\cdots dt_m &= \int_{\mathbb{R}_{\ge 0}^{m}}\mathbb{E}\,\bar\psi_n\Bigl(\sum_{1\le i_1,\dots,i_m\le n}\prod_{k=1}^{m}\mathbb{1}\{|\xi_{i_k}|>t_k\}\Bigr)\,dt_1\cdots dt_m\\
&\le \int_{\mathbb{R}_{\ge 0}^{m}}\bar\psi_n\Bigl(\sum_{1\le i_1,\dots,i_m\le n}\mathbb{E}\prod_{k=1}^{m}\mathbb{1}\{|\xi_{i_k}|>t_k\}\Bigr)\,dt_1\cdots dt_m \quad(\text{by Jensen's inequality})\\
&\le \int_{\mathbb{R}_{\ge 0}^{m}}\bar\psi_n\Bigl(\sum_{1\le i_1,\dots,i_m\le n}\prod_{k=1}^{m}\bigl(\mathbb{P}(|\xi_{i_k}|>t_k)\bigr)^{1/m}\Bigr)\,dt_1\cdots dt_m,
\end{aligned}$$
where the last inequality follows from the generalized Hölder inequality and the assumption that $\bar\psi_n$ is nondecreasing. □
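The generalized Hölder step can be checked numerically in its worst case, where all $m$ coordinates share the same weight: then $\mathbb{E}\prod_{k}\mathbb{1}\{|\xi_1|>t_k\} = \mathbb{P}(|\xi_1|>\max_k t_k) \le \prod_k \mathbb{P}(|\xi_1|>t_k)^{1/m}$. The sketch below (exponential weights and thresholds chosen arbitrarily) confirms the bound.

```python
import numpy as np

rng = np.random.default_rng(6)
m = 3
t = np.array([0.5, 1.0, 2.0])

# Take all m indices equal (the strongest dependence the bound must absorb).
xi = rng.exponential(1.0, size=10**6)
lhs = np.mean(np.all(np.abs(xi)[:, None] > t, axis=1))  # E prod_k 1{|xi_1| > t_k}
rhs = np.prod(np.exp(-t) ** (1.0 / m))                  # prod_k P(|xi_1| > t_k)^(1/m)
print("E prod_k 1{|xi|>t_k} =", round(lhs, 4), " <=  Holder bound =", round(rhs, 4))
```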
Proof of Lemma 2.
For
$$\psi_n\bigl(\ell_1,\dots,\ell_m\bigr) \equiv a\bigl(\ell_1,\dots,\ell_m\bigr)\prod_{k=1}^{m}\ell_k^{1/2},$$
Theorem 2 implies that:
$$\begin{aligned}
\mathbb{E}\Bigl\|\sum_{1\le i_1,\dots,i_m\le n}\xi_{i_1}\cdots\xi_{i_m}\,f\bigl(Y_{i_1},\dots,Y_{i_m}\bigr)\Bigr\|_{\mathcal{F}}
&\le K_m\int_{\mathbb{R}_{\ge 0}^{m}}\mathbb{E}\Biggl[a\Bigl(\sum_{i=1}^{n}\mathbb{1}\{\xi_i>t_1\},\dots,\sum_{i=1}^{n}\mathbb{1}\{\xi_i>t_m\}\Bigr)\prod_{k=1}^{m}\Bigl(\sum_{i=1}^{n}\mathbb{1}\{\xi_i>t_k\}\Bigr)^{1/2}\Biggr]\,dt_1\cdots dt_m\\
&\le K_m\int_{\mathbb{R}_{\ge 0}^{m}} A_{2,n}\bigl(t_1,\dots,t_m\bigr)\Biggl[\mathbb{E}\prod_{k=1}^{m}\sum_{i=1}^{n}\mathbb{1}\{\xi_i>t_k\}\Biggr]^{1/2}\,dt_1\cdots dt_m\\
&\le K_m\int_{\mathbb{R}_{\ge 0}^{m}} A_{2,n}\bigl(t_1,\dots,t_m\bigr)\Biggl[\sum_{1\le i_1,\dots,i_m\le n}\prod_{k=1}^{m}\bigl(\mathbb{P}(\xi_{i_k}>t_k)\bigr)^{1/m}\Biggr]^{1/2}\,dt_1\cdots dt_m\\
&= n^{m/2} K_m\int_{\mathbb{R}_{\ge 0}^{m}} A_{2,n}\bigl(t_1,\dots,t_m\bigr)\prod_{k=1}^{m}\bigl(\mathbb{P}(\xi_1>t_k)\bigr)^{1/2m}\,dt_1\cdots dt_m.
\end{aligned}$$
Here
$$A_{2,n}\bigl(t_1,\dots,t_m\bigr) \equiv \Biggl[\mathbb{E}\,a^2\Bigl(\sum_{i=1}^{n}\mathbb{1}\{\xi_i>t_1\},\dots,\sum_{i=1}^{n}\mathbb{1}\{\xi_i>t_m\}\Bigr)\Biggr]^{1/2} \longrightarrow 0$$
as long as none of the probabilities $\bigl\{\mathbb{P}(\xi_1>t_k):\ 1\le k\le m\bigr\}$ vanishes. The claim now follows from the dominated convergence theorem. □
Corollary A1
([6]). Let $X(t)$, $t\in T$, be a (weak) Gaussian or Rademacher chaos process of degree $m$, and let
$$d_X(s,t) := \Bigl(\mathbb{E}\bigl(X(t)-X(s)\bigr)^{2}\Bigr)^{1/2}, \qquad s,t\in T.$$
If
$$\int_0^{D}\bigl(\log N(T,d_X,\varepsilon)\bigr)^{m/2}\,d\varepsilon < \infty,$$
then there is a version of $X$, which we keep denoting $X$, with almost all of its sample paths in $C_u\bigl(T,d_X\bigr)$ and such that
$$\Bigl\|\sup_{t\in T}|X(t)|\Bigr\|_{\psi_{2/m}} \le \bigl\|X_{t_0}\bigr\|_{\psi_2} + K\int_0^{D}\bigl(\log N(T,d_X,\varepsilon)\bigr)^{m/2}\,d\varepsilon,$$
and
$$\Bigl\|\sup_{\substack{s,t\in T\\ d_X(s,t)\le\delta}}\bigl|X(t)-X(s)\bigr|\Bigr\|_{\psi_{2/m}} \le K\int_0^{\delta}\bigl(\log N(T,d_X,\varepsilon)\bigr)^{m/2}\,d\varepsilon,$$
for all $0<\delta\le D$, where $K$ is a universal constant and $D$ is the diameter of $T$ for the pseudodistance $d_X$. In fact, every separable version of $X$ satisfies these properties.
Theorem A1
([73]). For any random elements Y n with values in a metric space ( S , d ) , where Y is measurable and has a separable range, the following are equivalent:
  • Y n converge in law to Y;
  • $d_{\mathrm{BL}}\bigl(Y_n, Y\bigr) \to 0$ as $n\to\infty$.

References

  1. Halmos, P.R. The theory of unbiased estimation. Ann. Math. Stat. 1946, 17, 34–43.
  2. Hoeffding, W. A class of statistics with asymptotically normal distribution. Ann. Math. Stat. 1948, 19, 293–325.
  3. van der Vaart, A.W. Asymptotic Statistics; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 1998; Volume 3, p. xvi+443.
  4. Mises, R.V. On the asymptotic distribution of differentiable statistical functions. Ann. Math. Stat. 1947, 18, 309–348.
  5. Serfling, R.J. Approximation Theorems of Mathematical Statistics; John Wiley & Sons: Hoboken, NJ, USA, 2009; Volume 162.
  6. de la Peña, V.H.; Giné, E. Decoupling: From Dependence to Independence. Randomly Stopped Processes, U-Statistics and Processes, Martingales and Beyond; Probability and Its Applications; Springer: New York, NY, USA, 1999; p. xvi+392.
  7. Arcones, M.A.; Giné, E. Limit theorems for U-processes. Ann. Probab. 1993, 21, 1494–1542.
  8. Lee, A.J. U-Statistics: Theory and Practice; Statistics: Textbooks and Monographs, Volume 110; Marcel Dekker: New York, NY, USA, 1990; p. xii+302.
  9. Stute, W. Almost sure representations of the product-limit estimator for truncated data. Ann. Statist. 1993, 21, 146–156.
  10. Arcones, M.A.; Wang, Y. Some new tests for normality based on U-processes. Statist. Probab. Lett. 2006, 76, 69–82.
  11. Giné, E.; Mason, D.M. Laws of the iterated logarithm for the local U-statistic process. J. Theoret. Probab. 2007, 20, 457–485.
  12. Giné, E.; Mason, D.M. On local U-statistic processes and the estimation of densities of functions of several sample variables. Ann. Statist. 2007, 35, 1105–1145.
  13. Schick, A.; Wang, Y.; Wefelmeyer, W. Tests for normality based on density estimators of convolutions. Statist. Probab. Lett. 2011, 81, 337–343.
  14. Joly, E.; Lugosi, G. Robust estimation of U-statistics. Stochastic Process. Appl. 2016, 126, 3760–3773.
  15. Lee, S.; Linton, O.; Whang, Y.J. Testing for stochastic monotonicity. Econometrica 2009, 77, 585–602.
  16. Ghosal, S.; Sen, A.; van der Vaart, A.W. Testing monotonicity of regression. Ann. Statist. 2000, 28, 1054–1082.
  17. Abrevaya, J.; Jiang, W. A nonparametric approach to measuring and testing curvature. J. Bus. Econom. Statist. 2005, 23, 1–19.
  18. Nolan, D.; Pollard, D. U-processes: Rates of convergence. Ann. Statist. 1987, 15, 780–799.
  19. Sherman, R.P. Maximal inequalities for degenerate U-processes with applications to optimization estimators. Ann. Statist. 1994, 22, 439–459.
  20. Yoshihara, K.-i. Limiting behavior of U-statistics for stationary, absolutely regular processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 1976, 35, 237–252.
  21. Borovkova, S.; Burton, R.; Dehling, H. Limit theorems for functionals of mixing processes with applications to U-statistics and dimension estimation. Trans. Amer. Math. Soc. 2001, 353, 4261–4318.
  22. Denker, M.; Keller, G. On U-statistics and v. Mises' statistics for weakly dependent processes. Z. Wahrsch. Verw. Gebiete 1983, 64, 505–522.
  23. Leucht, A. Degenerate U- and V-statistics under weak dependence: Asymptotic theory and bootstrap consistency. Bernoulli 2012, 18, 552–585.
  24. Leucht, A.; Neumann, M.H. Degenerate U- and V-statistics under ergodicity: Asymptotics, bootstrap and applications in statistics. Ann. Inst. Statist. Math. 2013, 65, 349–386.
  25. Bouzebda, S.; Nemouchi, B. Weak-convergence of empirical conditional processes and conditional U-processes involving functional mixing data. Stat. Inference Stoch. Process. 2022, 1–56.
  26. Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Statist. 1979, 7, 1–26.
  27. Hall, P. The Bootstrap and Edgeworth Expansion; Springer Series in Statistics; Springer: New York, NY, USA, 1992; p. xiv+352.
  28. Bickel, P.J.; Freedman, D.A. Some asymptotic theory for the bootstrap. Ann. Statist. 1981, 9, 1196–1217.
  29. Arcones, M.A.; Giné, E. On the bootstrap of U and V statistics. Ann. Statist. 1992, 20, 655–674.
  30. Dehling, H.; Mikosch, T. Random quadratic forms and the bootstrap for U-statistics. J. Multivariate Anal. 1994, 51, 392–413.
  31. Leucht, A.; Neumann, M.H. Consistency of general bootstrap methods for degenerate U-type and V-type statistics. J. Multivariate Anal. 2009, 100, 1622–1633.
  32. Politis, D.N.; Romano, J.P. A circular block-resampling procedure for stationary data. In Exploring the Limits of Bootstrap (East Lansing, MI, 1990); Wiley Series in Probability and Mathematical Statistics; Wiley: New York, NY, USA, 1992; pp. 263–270.
  33. Carlstein, E. The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann. Statist. 1986, 14, 1171–1179.
  34. Politis, D.N.; Romano, J.P. The stationary bootstrap. J. Amer. Statist. Assoc. 1994, 89, 1303–1313.
  35. Rubin, D.B. The Bayesian bootstrap. Ann. Statist. 1981, 9, 130–134.
  36. Lo, A.Y. A Bayesian method for weighted sampling. Ann. Statist. 1993, 21, 2138–2148.
  37. Mason, D.M.; Newton, M.A. A rank statistics approach to the consistency of a general bootstrap. Ann. Statist. 1992, 20, 1611–1624.
  38. Præstgaard, J.; Wellner, J.A. Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 1993, 21, 2053–2086.
  39. van der Vaart, A. New Donsker classes. Ann. Probab. 1996, 24, 2128–2140.
  40. Alvarez-Andrade, S.; Bouzebda, S. Strong approximations for weighted bootstrap of empirical and quantile processes with applications. Stat. Methodol. 2013, 11, 36–52.
  41. Bouzebda, S. On the strong approximation of bootstrapped empirical copula processes with applications. Math. Methods Statist. 2012, 21, 153–188.
  42. Bouzebda, S.; Elhattab, I.; Ferfache, A.A. General M-estimator processes and their m out of n bootstrap with functional nuisance parameters. Methodol. Comput. Appl. Probab. 2022, 1–45.
  43. Huskova, M.; Janssen, P. Consistency of the generalized bootstrap for degenerate U-statistics. Ann. Stat. 1993, 21, 1811–1823.
  44. Janssen, P. Weighted bootstrapping of U-statistics. J. Stat. Plan. Inference 1994, 38, 31–41.
  45. Alvarez-Andrade, S.; Bouzebda, S. Cramér's type results for some bootstrapped U-statistics. Statist. Papers 2020, 61, 1685–1699.
  46. Han, Q. Multiplier U-processes: Sharp bounds and applications. Bernoulli 2022, 28, 87–124.
  47. Radulović, D. Renewal type bootstrap for Markov chains. Test 2004, 13, 147–192.
  48. Giné, E.; Zinn, J. Bootstrapping general empirical measures. Ann. Probab. 1990, 18, 851–869.
  49. Nummelin, E. General Irreducible Markov Chains and Nonnegative Operators; Cambridge Tracts in Mathematics; Cambridge University Press: Cambridge, UK, 1984; Volume 83, p. xi+156.
  50. Athreya, K.B.; Ney, P. A new approach to the limit theory of recurrent Markov chains. Trans. Amer. Math. Soc. 1978, 245, 493–501.
  51. Nummelin, E. A splitting technique for Harris recurrent Markov chains. Z. Wahrsch. Verw. Gebiete 1978, 43, 309–318.
  52. Chung, K.L. Markov Chains with Stationary Transition Probabilities, 2nd ed.; Die Grundlehren der mathematischen Wissenschaften, Band 104; Springer: New York, NY, USA, 1967; p. xi+301.
  53. van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes: With Applications to Statistics; Springer Series in Statistics; Springer: New York, NY, USA, 1996; p. xvi+508.
  54. Pollard, D. Convergence of Stochastic Processes; Springer Series in Statistics; Springer: New York, NY, USA, 1984; p. xiv+215.
  55. Douc, R.; Guillin, A.; Moulines, E. Bounds on regeneration times and limit theorems for subgeometric Markov chains. Ann. Inst. Henri Poincaré Probab. Stat. 2008, 44, 239–257.
  56. Meyn, S.; Tweedie, R.L. Markov Chains and Stochastic Stability, 2nd ed.; Cambridge University Press: Cambridge, UK, 2009.
  57. Meyn, S.P.; Tweedie, R.L. Markov Chains and Stochastic Stability; Communications and Control Engineering Series; Springer: London, UK, 1993; p. xvi+548.
  58. Bertail, P.; Clémençon, S. Regeneration-based statistics for Harris recurrent Markov chains. In Dependence in Probability and Statistics; Springer: New York, NY, USA, 2006; Volume 187, pp. 3–54.
  59. Bertail, P.; Clémençon, S. A renewal approach to Markovian U-statistics. Math. Methods Statist. 2011, 20, 79–105.
  60. Revuz, D. Markov Chains, 2nd ed.; North-Holland Publishing Co.: Amsterdam, The Netherlands, 1984; Volume 11.
  61. Cheng, G. Moment consistency of the exchangeably weighted bootstrap for semiparametric M-estimation. Scand. J. Stat. 2015, 42, 665–684.
  62. Shao, J.; Wu, C. Heteroscedasticity-robustness of jackknife variance estimators in linear models. Ann. Statist. 1987, 15, 1563–1579.
  63. Weng, C.S. On a second-order asymptotic property of the Bayesian bootstrap mean. Ann. Statist. 1989, 17, 705–710.
  64. van Zwet, W.R. The Edgeworth expansion for linear combinations of uniform order statistics. In Second Prague Symposium on Asymptotic Statistics (Hradec Králové, 1978); North-Holland: Amsterdam, The Netherlands, 1979; pp. 93–101.
  65. Pauly, M. Consistency of the subsample bootstrap empirical process. Statistics 2012, 46, 621–626.
  66. Shao, J.; Tu, D.S. The Jackknife and Bootstrap; Springer Series in Statistics; Springer: New York, NY, USA, 1995; p. xviii+516.
  67. Bertail, P.; Clémençon, S. Regenerative block bootstrap for Markov chains. Bernoulli 2006, 12, 689–712.
  68. Bertail, P.; Clémençon, S. Approximate regenerative-block bootstrap for Markov chains. Comput. Statist. Data Anal. 2008, 52, 2739–2756.
  69. Fan, Y.; Ullah, A. On goodness-of-fit tests for weakly dependent processes using kernel method. J. Nonparametr. Statist. 1999, 11, 337–360.
  70. Frees, E.W. Infinite order U-statistics. Scand. J. Statist. 1989, 16, 29–45.
  71. Rempala, G.; Gupta, A. Weak limits of U-statistics of infinite order. Random Oper. Stochastic Equations 1999, 7, 39–52.
  72. Levental, S. Uniform limit theorems for Harris recurrent Markov chains. Probab. Theory Related Fields 1988, 80, 101–118.
  73. Dudley, R.M. Nonlinear functionals of empirical measures and the bootstrap. In Probability in Banach Spaces, 7 (Oberwolfach, 1988); Progress in Probability; Birkhäuser Boston: Boston, MA, USA, 1990; Volume 21, pp. 63–82.