Abstract
We explore an exchangeably weighted bootstrap of the general function-indexed empirical U-processes in the Markov setting, which is a natural higher-order generalization of the weighted bootstrap empirical processes. As a result of our findings, a considerable variety of bootstrap resampling strategies arise. This paper aims to provide theoretical justifications for the exchangeably weighted bootstrap consistency in the Markov setup. General structural conditions on the classes of functions (possibly unbounded) and the underlying distributions are required to establish our results. This paper provides the first general theoretical study of the bootstrap of the empirical U-processes in the Markov setting. Potential applications include the symmetry test, Kendall’s tau and the test of independence.
Keywords:
bootstrap; Markov chains; regenerative processes; empirical processes; VC classes of functions; U-processes; Donsker classes; weak convergence
MSC:
60F05; 60G15; 60K05; 60K15; 62F40
1. Introduction
U-statistics are a class of estimators, initially explored in connection with unbiased estimation by [1] and formally introduced by [2]. They are defined as follows: let be a sequence of random variables defined on a measurable space , and let be a measurable function; the U-statistic of order m and kernel h based on the sequence is
where
The empirical variance, Gini’s mean difference and Kendall’s rank correlation coefficient are common examples of U-estimators, while a classical test based on a U-statistic is Wilcoxon’s signed rank test for the hypothesis of location at zero (see, e.g., [3], Example 12.4). The authors in [1,2,4] provided, among others, the first asymptotic results for the case in which the underlying random variables are independent and identically distributed. An extensive literature has treated the theory of U-statistics; see, for instance, [5,6,7,8]. Complex statistical problems are also amenable to solution using U-processes; examples include tests for goodness-of-fit, nonparametric regression and density estimation. U-processes are collections of U-statistics indexed by a family of kernels. They may be viewed as infinite-dimensional variants of U-statistics with a single kernel function, or as nonlinear stochastic extensions of empirical processes. Both viewpoints have advantages: first, considering a large family of statistics rather than a single statistic is statistically more interesting; second, ideas from the theory of empirical processes can be used to establish limit or approximation theorems for U-processes. Nevertheless, obtaining results for U-processes is not easy. Extending U-statistics to U-processes requires significant effort and distinct methodologies; generalizing empirical processes to U-processes is quite challenging, especially when the U-processes are considered in the stationary setting. We highlight that U-processes are often used in statistics, for example when higher-order terms appear in von Mises expansions. In particular, the study of estimators (including function estimators) with various degrees of smoothness involves U-statistics. For instance, Ref. [9] applied almost-sure uniform bounds for -canonical U-processes to analyze the product limit estimator for truncated data.
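As a concrete illustration (ours, not part of the original text), the definition above can be sketched in a few lines of Python; the helper `u_statistic` and the Gini mean difference example are illustrative assumptions:

```python
from itertools import combinations

def u_statistic(sample, kernel, m=2):
    # Average of the symmetric kernel over all size-m subsets of the sample.
    subsets = list(combinations(sample, m))
    return sum(kernel(*s) for s in subsets) / len(subsets)

# Gini's mean difference corresponds to the kernel h(x, y) = |x - y|.
gini = u_statistic([1.0, 2.0, 4.0], lambda x, y: abs(x - y))  # (1 + 3 + 2) / 3
```

For kernels of order m = 2 this is simply the average of the kernel over all unordered pairs of observations.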
Two new tests for normality based on U-processes were also presented in [10]. Inspired by [11,12,13], the authors developed further tests for normality that employ, as test statistics, weighted -distances between the standard normal density and local U-statistics based on standardized observations. Estimating the mean of multivariate functions in the case of possibly heavy-tailed distributions was explored by [14], who also presented the median-of-means approach; both explorations were based on U-statistics. Moreover, other researchers have emphasized the importance of U-processes: refs. [15,16,17] used them for testing qualitative features of functions in nonparametric statistics, ref. [18] represented cross-validation for density estimation through U-statistics, and in [6,7,19] the authors established limiting distributions of M-estimators. Since then, this discipline has made significant advancements, and the results have been broadly extended. Asymptotic behavior was demonstrated under weak dependence assumptions, for example, in the works of [20,21,22], more recently in [23], and more generally in [24,25]. However, in practice, explicit computation is not always possible, owing to the complexity of the U-processes’ limiting distributions or their functionals. To address this issue, we propose a general bootstrap of the U-processes in the Markov setting, which is a challenging problem. The concept of the bootstrap, introduced by [26] in the case of independent and identically distributed (i.i.d.) random variables, is to resample from an original sample of observations of an unknown marginal distribution function , , a new i.i.d. sample with marginal distribution function , the empirical distribution function constructed from the original sample. Moreover, it is commonly known that the bootstrap approach gives a better approximation to the statistic’s distribution, particularly when the sample size is small [27].
Bootstraps for U-statistics of independent observations were studied by [28,29,30,31]. However, the bootstrap technique differs for dependent variables because the dependence structure cannot be preserved in the new sample. For this reason, blockwise bootstrap methods were introduced, aiming to preserve the dependence structure. Among these methods, we can cite the circular block bootstrap introduced by [32] and the nonoverlapping block bootstrap introduced by [33]. In [34], the authors proposed a bootstrap method suited to weakly dependent stationary observations, the stationary bootstrap. The latter can be seen as an extension of the circular block bootstrap in which the block length is random, for instance geometrically distributed. It is important to note that Efron’s initial bootstrap formulation (see [26]) has a few flaws. More precisely, certain observations might be sampled several times while others might not be sampled at all. A more general version of the bootstrap, the weighted bootstrap, was developed to circumvent this issue and was also shown to be computationally more appealing in some applications. This resampling strategy was initially described in [35] and thoroughly investigated by [28], who coined the name “weighted bootstrap”. One example is the Bayesian bootstrap, obtained when the weight vector
is equal in distribution to the vector of n spacings of ordered uniform random variables, that is, it follows a Dirichlet distribution of parameters . For more details, see [36]. This diversity of resampling approaches calls for a unified treatment, commonly known as general weighted resampling, which was first described by [37] and has since been developed by [38,39]. In [40], the authors investigated the almost-sure rate of convergence of the strong approximation of the weighted bootstrap process by a sequence of Brownian bridge processes; refer to [41] for the multivariate setting and [42] for recent references. The concept of the generalized bootstrap, introduced by [37], was extended to the class of nondegenerate U-statistics of degree two and the corresponding Studentized U-statistics by [43]; see also [44,45]. In [46], the author generalized this theory to higher orders, developing a multiplier inequality for U-processes of i.i.d. random variables. We mention that the theory of multiplier processes is directly and strongly related to the symmetrization inequalities investigated by [6,7].
This paper investigates the exchangeable bootstrap for U-processes in the spirit of [46], but without the restriction to the independence setting. The previous reference focused on U-processes in an independent framework, whereas this paper considers U-processes in the dependent setting of Markov chains. To the best of our knowledge, we are the first to treat this general context successfully. We combine the techniques of the renewal bootstrap with the randomly weighted bootstrap in a nontrivial way. We mention at this point a connection with the moving-blocks bootstrap and its modification, the matched-block bootstrap. Instead of artificially splitting a sample into fixed-size blocks and then resampling them, the latter seeks to match the blocks to create a smoother transition; for more information, see [47]. The main difficulties in proving Theorem 3 are due to the random size of the resampled blocks. This randomness creates a problem with random stopping times, which cannot be removed simply by replacing a random stopping time with its expectation. In the present setting, the bootstrap random variables are generated by resampling from a random number of blocks. One might think that conditioning arguments could overcome this problem, but they do not. Our proof uses arguments from [46,47], verifying bootstrap stochastic equicontinuity by comparison with the original process, in a similar way as in [48]. However, as we shall see, combining the concepts of these papers is not enough to solve the problem: to deal with U-processes in the Markov framework, sophisticated mathematical derivations are necessary. We present the first complete theoretical justification of the bootstrap consistency, which requires the efficient use of large-sample theoretical tools established for U-empirical processes.
The rest of this paper is organized as follows. Section 2 is devoted to the introduction of the Markov framework, the U-process, the bootstrap weights and the definitions needed in our work. In Section 3, we recall the necessary ingredient for U-statistics and U-processes in the Markov setting. Furthermore, we provide some asymptotic results including the weak convergence of U-processes in Theorem 1. In Section 4, we derive the main results concerning the bootstrap of the U-processes. In Section 5, we collect some examples of weighted U-statistics. Some concluding remarks and possible future developments are relegated to Section 6. To prevent the interruption of the flow of the presentation, all proofs are gathered in Section 7. Appendix A contains a few pertinent technical findings and proofs.
2. Notation and Definitions
In what follows, we aim to properly define our settings. For this reason, we have collected the definitions and notation needed.
2.1. Markov Chain
Let be a homogeneous -irreducible Markov chain, meaning that the chain has stationary transition probabilities, defined on a measurable space , where is a separable -algebra. Let be the transition probability and the initial probability. We denote by , or simply , the probability measure for . Likewise, denotes integration with respect to . In our framework, let be a probability measure such that , and let denote the -expectation. We further assume that the Markov chain is Harris positive recurrent with an atom .
Definition 1
(Harris recurrent). A Markov chain is said to be Harris recurrent if there exists a σ-finite measure such that, for ψ a positive measure on a countably generated measurable space , and if for all with
Recall that a chain is positive Harris recurrent and aperiodic if and only if it is ergodic ([49] Proposition 6.3), i.e., there exists a probability measure , called the stationary distribution, such that, in total variation distance,
Definition 2
(Small sets). A set is said to be Ψ-small if there exists a positive probability measure Ψ supported by S and an integer , such that
Definition 3.
Let be a Markov chain taking value in . We say that is positive recurrent if
- 1.
- is recurrent (or Harris recurrent if is countably generated), where is a set, , m is an integer and ν is a probability measure.
- 2.
- , where is the hitting time of A by the m-step chain, roughly speaking, .
Definition 4.
A ψ-irreducible aperiodic chain X is called regenerative or atomic if there exists a measurable set , called an atom, such that and for all we have . Roughly speaking, an atom is a set on which the transition probabilities are the same. If the chain visits only a finite number of states or subsets, then any state or subset of states is in fact an atom.
Definition 5
(Aperiodicity). Assuming ψ-irreducibility, there exist and disjoint sets (set ) positively weighted by ψ such that
and
The period of the chain is the greatest common divisor d of such integers; the chain is said to be aperiodic if d = 1.
Definition 6
(Irreducibility). The chain is ψ-irreducible if there exists a σ-finite measure ψ such that, for every set with and for any , there exists such that .
One of the most important properties of Harris recurrent Markov chains is the existence of an invariant distribution, which is called (a limiting probability distribution, also called the occupation measure). Furthermore, a Harris recurrent Markov chain can always be embedded in a Markov chain on an extended sample space with a recurrent atom. The existence of a recurrent atom immediately allows the construction of a regenerative extension of the chain; the times at which the chain hits a given atom (recurrent state) are seen as the regeneration times. In [50,51], the authors give the construction of such a regenerative extension, which makes regenerative techniques available for studying this type of Markov chain. As mentioned above, we assume in this work that the Harris recurrent chain is atomic, i.e., there is a well-defined accessible set that is visited infinitely often almost surely; this set is called an atom. By definition, an atom is a set in , where , and for all , . Let (respectively, ) be the probability measure on the underlying space such that (respectively, the -expectation).
The conditions imposed on the Markov chain ensure that the defined atom (or the constructed one in the case of a nonatomic chain) is one recurrent class, and let us define the following terms.
2.1.1. Hitting Times
Define by
A well-known property of the hitting time is that for all , , ([52], chap. I14).
2.1.2. Renewal Times
Using the hitting times, we can define the renewal times as
As for a regenerative process, the sequence of renewal times has i.i.d. increments, which are independent of the choice of the initial probability. Throughout this work, we set and .
Definition 7
(Strong Markov property). Let be a Markov chain and let T be a stopping time of . Then, conditionally on and , is again a Markov chain and is independent of .
2.1.3. Regenerative Blocks
Let be the number of visits to the atom . Using the strong Markov property, it is possible to divide the given sample into a sequence of blocks such that:
where is the total number of blocks. The length of each block is denoted by
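As a hedged illustration of this block construction (the helper below and its list-based conventions are ours, not the paper's), a trajectory can be cut at the successive visits to the atom:

```python
def regenerative_blocks(chain, atom):
    # Indices of the visits to the atom.
    hits = [t for t, x in enumerate(chain) if x == atom]
    if len(hits) < 2:
        return [], list(chain), []      # no complete regeneration cycle
    first = chain[:hits[0] + 1]         # nonregenerative initial segment
    blocks = [chain[hits[k] + 1:hits[k + 1] + 1] for k in range(len(hits) - 1)]
    last = chain[hits[-1] + 1:]         # incomplete final segment
    return blocks, first, last

blocks, first, last = regenerative_blocks([5, 0, 1, 2, 0, 3, 0, 4], atom=0)
```

Each complete block runs from just after one visit to the atom up to and including the next visit, so the number of complete blocks is the number of visits minus one.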
2.2. Exchangeable Weights
In what follows, represents a real-valued random variable, and are independent of . For , we denote the p-norm by
Definition 8
(Exchangeability). Let be a sequence of random variables with joint distribution and let be the group of all permutations acting on . We say that is exchangeable if, for all ,
Assuming the following:
- (A1)
- are exchangeable, non-negative and symmetric, and for all n
- (A2)
- in -probability, which is satisfied under the moment assumption
- (A3)
- There exists such that, in -probability,
- (A4)
- Assume
2.3. The U-Process Framework
Let be a sequence of random variables with values in a measurable space . Let be a measurable function symmetric in its arguments. The U-statistic of order (or degree) m and kernel is defined as:
Accordingly, a U-process is the collection , where is the class of kernels of m variables. The decoupling inequality of U-statistics and U-processes plays a central role in the latest developments in the asymptotic theory. As a result, the decoupling inequality can give a relation between the quantities
where is a non-negative function and are independent copies of the original sequence . One of the useful reasons for decoupling is randomization, which is frequently used in the study of the asymptotic theory of U-statistics, and was studied by [6,7]. The main idea of randomization is to compare the tail probabilities or moments of the original U-statistic or process, , with the tail probabilities or moments of the statistic
where are independent Rademacher variables, independent from , and the variables depend on the degree of degeneracy (centering) of the kernel .
Definition 9
([6]). A symmetric -integrable kernel is -degenerate of order if and only if
holds for any , whereas
is not a constant function. If h is furthermore -centered, that is, , we write . For notational simplicity, we usually write .
Moreover, is said to be canonical or completely degenerate if the integral with respect to any one variable is equal to zero, i.e.,
The fact that the kernel is completely degenerate, together with the condition , is used to obtain the orthogonality of the different terms of the Hoeffding decomposition of the U-statistic.
Definition 10
(Covering number). The covering number is defined as the minimal number of balls with radius ε needed to cover the class of functions in the norm , where Q is a measure on E with finite support.
We can associate some distances to the covering numbers, where
In this work, we use the two distances defined below
For decoupled statistics, we also associate covering numbers, well-known as and a distance, which can be defined for as follows:
Definition 11.
A class of measurable functions is said to be of VC type (or Vapnik–Chervonenkis type) for an envelope F and admissible characteristic (positive constants) such that and , if for all probability measure on with and every ,
We assume that the class is countable to avoid measurability issues (the noncountable case may be handled similarly by using an outer probability and additional measurability assumptions, see [53]).
Definition 12
(Stochastic equicontinuity, ([54])). Let be a sequence of stochastic processes. Call stochastically equicontinuous at if for each , there exists a neighborhood D of such that
In the context of the U-process , the stochastic equicontinuity at a function implies generally that should be uniformly small for all close enough to , with high probability and for all n large enough.
2.4. Gaussian Chaos Process
Definition 13.
Let H denote a real separable Hilbert space with scalar product . We say that a stochastic process defined on a complete probability space is an isonormal Gaussian process (or a Gaussian process on H) if is a centered Gaussian family of random variables such that for all .
Define the mapping . Under the assumption mentioned above, this map is linear and provides a linear isometry of H onto a closed subspace whose elements are zero-mean Gaussian random variables. Let be the isonormal Gaussian chaos process associated with , determined by:
where and is a polynomial defined as a sum of monomials of degree m; ref. [6] gives a simple expression for this polynomial, derived from Newton’s identity:
Therefore,
Hence, by the continuous mapping theorem, the CLT and the LLN give:
Under the linearity of the kernel, we only need to show that:
for the weak convergence to hold. The limit is useful in the case of degenerate U-statistics: it provides convergence of all moments, which plays a crucial role thanks to hypercontractivity, which improves uniform integrability. For a good exposition of , readers may consult ([6] Chapter 4, Section 4.2).
2.5. Technical Assumptions
For our results, we need the following assumptions.
- (C.1)
- (Block-length assumption) For all and ,
- (C.2)
- (Nonregenerative blocks) For , we haveand
- (C.3)
- (Block-sum: moment assumptions) For , we haveand
- (C.4)
- For , we have
- (C.5)
- (Nondegeneracy.) We suppose also that
Remark 1
(Moment assumptions). In practice, block-moment assumptions for the split Markov chain can generally be checked by establishing drift conditions of Lyapunov type for the original chain; see Chapter 11 in [55,56]. All these moment conditions are discussed in detail in ([57], Chapters 11 and 17). A key condition in the proof of ergodic theorems in the Markovian context is that , for any set in such that . In fact, when there is a finite invariant measure and an atom , this condition is readily satisfied. We also refer to [58] for an explicit check of such conditions in several important examples, and to §4.1.2 of [59] for sufficient conditions expressed in terms of a uniform return rate to small sets. Finally, as discussed in Chapter 8 of [60], similar conditions can be expressed in terms of potential kernels. Observe that, in the positive recurrent case, the assumptions in (C.1) are not independent when : from basic renewal theory, one has for all . Hence, conditions and are equivalent.
3. Preliminary Results
A significant issue arises in estimating our parameter of interest via the U-process. The parameter has the following form:
where is a kernel function. This parameter can be estimated by U-statistics of the form:
Based on Kac’s theorem for the occupation measure, the parameter of interest can be written in the regenerative setup as follows:
In the Markovian context, since the variables are not independent, the approximation based on the i.i.d. blocks of the regenerative setting is introduced below:
Definition 14
(Regenerative kernel). Let be a kernel. We define the regenerative kernel as follows:
The kernel need not be symmetric, as long as . In fact, we can symmetrize in the following way
where the first sum is over all permutations of . Next, we consider the U-statistic formed by the regenerative data.
Definition 15
(Regenerative U-statistic). Let be a kernel such that , and set . The regenerative U-statistic associated with the sequence of regenerative blocks , generated by the Markov chain , is given by
Hence, is a standard U-statistic with mean zero.
Proposition 1.
Let us define
Then, under conditions (C.1), (C.2), (C.3) and (C.4), we have the following stochastic convergences:
Before stating the weak convergence in the next theorem, we define the corresponding U-processes related to the U-statistic and the regenerative U-statistic , respectively:
Theorem 1.
Let be a positive recurrent Harris Markov chain with an accessible atom , satisfying conditions (C.1) and (C.2) (moment assumptions), (C.3), (C.4) and (C.5), and, for a fixed , . Let be a uniformly bounded class of functions with a square-integrable envelope H such that:
Then, the process converges weakly in probability under to a Gaussian process indexed by whose sample paths are bounded and uniformly continuous with respect to the metric .
The Bootstrapped U-Processes
To facilitate the use of the bootstrap technique, we detail the steps of the regenerative block construction and the weighted bootstrap method in Algorithm 1:
| Algorithm 1 Regenerative block and weighted bootstrap construction. |
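A hedged Python sketch of one replicate of such a scheme (the Dirichlet choice of weights and the function names are illustrative assumptions, not the authors' exact algorithm): split the trajectory into regenerative blocks, draw exchangeable weights, and reweight the block statistics.

```python
import random

def weighted_block_bootstrap(block_stats, seed=0):
    # One weighted-bootstrap replicate: instead of resampling the regenerative
    # blocks, reweight their statistics with exchangeable Dirichlet(1, ..., 1)
    # weights, normalized so that the weights have mean 1 (Bayesian bootstrap).
    rng = random.Random(seed)
    N = len(block_stats)
    e = [rng.expovariate(1.0) for _ in range(N)]
    s = sum(e)
    w = [N * ei / s for ei in e]        # exchangeable, non-negative, sum N
    return sum(wi * bi for wi, bi in zip(w, block_stats)) / N
```

Since the weights sum to N exactly, a constant block statistic is reproduced without error by every replicate.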
In what follows, we denote by and , respectively, the conditional probability and the conditional expectation given the sample . The same notation is used for the sample . Define the bootstrapped U-statistic as
and the regenerative bootstrapping
and the U-processes are:
and
Given , a real-valued function, defined on the product probability space, we say that is of an order in -probability if for any
and that is of an order in -probability if for any , there exists a such that
We must comment here that the bootstrap works in probability if
where
and
and is the measurable envelope of . In addition, for any measurable random elements and Y, the convergence in law of to Y is in the sense of Hoffmann–Jørgensen, which is defined as
for g bounded and continuous. This weak convergence is metrizable by Theorem A1 in Appendix A.
Proposition 2.
Suppose that the bootstrap weights satisfy Assumptions (A1)–(A4). Let
Then, we have
The proof of Proposition 2 is postponed until Section 7.
The following lemma collects some instrumental results needed later.
Lemma 1.
Let be a Markov chain defined in Section 2.1. Define . Then, for any initial probability ν, we have:
- (i)
- For some and :
- (ii)
- in -probability.
- (iii)
- Let be a sequence of random variables. If
then, for any integer-valued sequence of random variables,
The proof of Lemma 1 is postponed until Section 7.
4. Weighted Bootstrap Weak Convergence
In this section, we extend some existing results on the multiplier U-process to prove the bootstrap uniform weak convergence. Most of these results can be found in [46], which generalizes the empirical process work of [38] in the i.i.d. setting. The weak convergence is proved for degenerate U-processes, as mentioned before, under the weighted regenerative bootstrap scheme described in Algorithm 1. Before stating the weak convergence theorem, we recall the following important results. The next theorem, proved in [46], is a sharp multiplier inequality, which is essential in the study of the multiplier U-process. These results are based on the decoupled symmetrized U-process, a basic tool of U-statistics theory. In [47], the author solved the corresponding problems for empirical processes in the Markov setting (multinomial bootstrap); we generalize this to the U-process by considering more general weights, i.e., the exchangeably weighted bootstrap.
Theorem 2
([46]). Let be a random vector independent of . Then, there exists some measurable function such that the expected supremum of the decoupled (Here “decoupled” refers to the fact that are independent copies of , and are independent copies of the Rademacher sequence .) U-processes
for all , consequently,
Furthermore, if there exists a concave and nondecreasing function such that , then
Here, is a constant depending on m only and can be taken as for .
Lemma 2
([46]). Let be function classes such that for all . Suppose that the ’s have the same marginal distributions with . Suppose that there exists some bounded measurable function with as , such that the expected supremum of the decoupled U-processes satisfies
for all . Then,
The main result of this paper is stated in the following theorem. It is worth noting that, as explained in the Introduction, proving stochastic equicontinuity is not easy in the present setting.
Definition 16
(Permissible classes of functions). Let be a measurable space ( a Borel σ-field on E). Let be a class of functions indexed by a parameter x belonging to a set E. is called permissible if it can be indexed by E in such a way that:
- There exists a function from to that is a measurable function, where is the Borel σ-algebra generated by the metric on .
- E is a Suslin measurable space, meaning that E is an analytic subset of a compact metric space from which it inherits its metric and Borel σ-field.
Theorem 3.
Suppose Assumptions (A1) to (A4), and Conditions (C.1)–(C.5) hold. Let be permissible and admit a -square integrable envelope F such that
where the supremum is taken over all discrete probability measures. Then,
where c is the constant in (A3), and the convergence in probability is with respect to the outer probability of defined on .
The proof of Theorem 3 is postponed until Section 7.
4.1. Bootstrap Weights Examples
Let be a class of real random variables satisfying Assumptions (A1)–(A4). We give some examples of bootstrap weights; for instance, refer to [38,61] for more explanations.
4.1.1. Bayesian Resampling Scheme
In this case, are positive i.i.d. random variables with mean and finite variance . The weights satisfy , and we define
The Bayesian bootstrapped weight can be defined as:
satisfying
For or , the Bayesian weights are distributionally equivalent to Dirichlet weights. For the value of , we have:
4.1.2. Efron’s Resampling Scheme
For Efron’s bootstrap, we have
Condition (A1) follows directly. Condition (A3) follows from ([37] Lemma 4.1), and Condition (A2) is detailed in [43].
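A minimal sketch of Efron's weights as multinomial counts (an illustration of ours, using only the standard library):

```python
import random

def efron_weights(n, seed=1):
    # Multinomial(n; 1/n, ..., 1/n) counts: W_i is the number of times
    # observation i is selected in n draws with replacement.
    rng = random.Random(seed)
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts
```

By construction the weights are non-negative integers summing to n, which is the multinomial bootstrap normalization.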
4.1.3. The Delete h-Jackknife
In [62], the authors permute deterministic weights , where
in order to build new bootstrap weights, defining where is a random permutation uniformly distributed over . These weights are called the delete-h jackknife weights. In order to satisfy Assumption (A3), we must assume that , as and .
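A hedged sketch of these permuted deterministic weights (the function name and the explicit shuffle are our illustrative choices):

```python
import random

def delete_h_jackknife_weights(n, h, seed=2):
    # Place the value n/(n - h) on a uniformly random (n - h)-subset of the
    # indices and 0 on the remaining h indices, via a random permutation.
    rng = random.Random(seed)
    w = [n / (n - h)] * (n - h) + [0.0] * h
    rng.shuffle(w)
    return w
```

The nonzero weights compensate for the h deleted observations, so the weights always sum to n.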
4.1.4. The Multivariate Hypergeometric Resampling Scheme
As its name indicates, the bootstrap weights of this scheme follow the multivariate hypergeometric distribution with density:
where K is a positive integer. Assumption (A3) is satisfied with .
Remark 2.
As pointed out in [38], the aforementioned bootstraps are in some sense “smoother” than the multinomial bootstrap, because they place some (random) weight on all elements of the sample, whereas the multinomial bootstrap places positive weight on a proportion of about of the sample elements, on average. Notice that when , the are equivalent to four spacings from a sample of Uniform random variables. In [63,64], it was noticed that, in addition to being four times more expensive to implement, the choice of four spacings depends on the functional of interest and is not universal.
Remark 3.
It is noteworthy that a proper choice of the bootstrap weights implies a smaller limit variance, that is, smaller than 1. Typical examples are the multivariate hypergeometric bootstrap ([38] Example 3.4) and the subsample bootstrap ([65] Remark 2.2-(3)). A thorough treatment of weight selection is undoubtedly outside the scope of the current work; for a review, we refer the readers to [66].
Remark 4.
In the present paper, we considered a renewal-type bootstrap for atomic Markov chains under minimal moment conditions on the renewal times. The atomicity assumption can be dropped by mimicking the ideas of [50,51]: introducing an artificial atom yields a bootstrap procedure that applies to nonatomic Markov chains. Precisely, in the case of a general irreducible chain X with a transition kernel satisfying the minorization condition:
for an accessible measurable set S, a probability measure ψ and (note that such a minorization condition always holds for Π or an iterate when the chain is irreducible), an atomic extension of the chain may be explicitly constructed by the Nummelin splitting technique (see [49]) from the parameters and the transition probability Π, see for instance [47,67]. From a practical viewpoint, the size of the first block may be large compared to the size n of the whole trajectory, for instance, in the case where the expected return time to the (pseudo-)atom when starting with the initial probability distribution is large. The effective sample size for constructing the data blocks and the corresponding statistic is then dramatically reduced. However, in [68], some simulations were given together with examples including content-dependent storage systems and general AR models supporting the method discussed in this work.
5. Applications
Example 1
(Symmetry test). This example gives an application of the bootstrap U-statistics, inspired by the goodness-of-fit tests in [69], where the symmetry test for the distribution of was considered. Let be a stationary mixing process with a Lebesgue density. We test the hypothesis:
The estimator of is:
where is a kernel function and is a smoothing parameter (the bandwidth). The symmetry test is based on an appropriate estimator of the integrated squared difference:
According to [69], I can be estimated by
where with , for . Clearly, is a degenerate U-statistic with a kernel varying with the sample size n. Thus, the stationary bootstrap test statistic,
can be shown to have the same limit as .
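For illustration only (this is not the exact statistic of [69]), a degenerate U-statistic of this type can be sketched as follows; the Gaussian kernel, the bandwidth h = 0.5 and the function names are our own assumptions:

```python
import math
import random
from itertools import combinations

def gauss_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def symmetry_ustat(sample, h):
    # Degenerate U-statistic contrasting pairwise differences against
    # pairwise sums; under symmetry about 0, (X_i - X_j) and (X_i + X_j)
    # have the same distribution, so the expectation is exactly zero.
    n = len(sample)
    s = 0.0
    for xi, xj in combinations(sample, 2):
        s += gauss_kernel((xi - xj) / h) - gauss_kernel((xi + xj) / h)
    return 2.0 * s / (n * (n - 1) * h)

rng = random.Random(1)
sym = [rng.gauss(0.0, 1.0) for _ in range(400)]        # symmetric under H0
asym = [abs(rng.gauss(0.0, 1.0)) for _ in range(400)]  # half-normal: asymmetric
h = 0.5
I_sym = symmetry_ustat(sym, h)
I_asym = symmetry_ustat(asym, h)
```

The statistic fluctuates around zero for the symmetric sample and concentrates near the positive quantity it estimates for the asymmetric one.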
Example 2
(Kendall’s tau). The covariance matrix quantifies the linear dependence in a random vector. The rank correlation is another measure of dependence in a random vector, capable of capturing nonlinear association. Two generic vectors and in are said to be concordant if . For , define
Then, Kendall’s tau rank correlation coefficient matrix is a matrix-valued U-statistic with a bounded kernel. It is clear that quantifies the monotonic dependence between and , and it is an unbiased estimator of
that is, the probability that and are concordant.
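A minimal sketch of the scalar case: Kendall's tau as a U-statistic of order 2 with the bounded kernel sign((x1 − x2)(y1 − y2)); the function name is our own illustrative choice:

```python
from itertools import combinations

def kendall_tau(xs, ys):
    # U-statistic of order 2 with kernel
    # h((x1, y1), (x2, y2)) = sign((x1 - x2) * (y1 - y2));
    # equals 2 * P(concordant) - 1 for continuous distributions.
    n = len(xs)
    s = 0
    for i, j in combinations(range(n), 2):
        d = (xs[i] - xs[j]) * (ys[i] - ys[j])
        s += (d > 0) - (d < 0)
    return 2.0 * s / (n * (n - 1))

xs = [1, 2, 3, 4, 5]
tau_concordant = kendall_tau(xs, xs)        # all pairs concordant
tau_discordant = kendall_tau(xs, xs[::-1])  # all pairs discordant
```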
Example 3
(Test of independence). In [2], the author introduced the parameter
where and is the distribution function of and . The parameter Δ has the property that if and only if and are independent. From [8], an alternative expression for Δ can be developed by introducing the functions
and
We have
The corresponding U-statistics may be used to test the independence.
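For illustration only, a plug-in (V-statistic) estimate of the parameter Δ can be coded directly from empirical distribution functions; this sketch is not the U-statistic built from the kernels of [8], and all names are our own:

```python
import random

def delta_plugin(xs, ys):
    # Plug-in (V-statistic) estimate of Hoeffding's parameter
    # Delta = integral of (F(x, y) - F1(x) * F2(y))^2 dF(x, y),
    # which vanishes iff X and Y are independent (continuous case).
    n = len(xs)
    total = 0.0
    for i in range(n):
        Fxy = sum(xs[k] <= xs[i] and ys[k] <= ys[i] for k in range(n)) / n
        F1 = sum(xs[k] <= xs[i] for k in range(n)) / n
        F2 = sum(ys[k] <= ys[i] for k in range(n)) / n
        total += (Fxy - F1 * F2) ** 2
    return total / n

rng = random.Random(2)
x_dep = [rng.random() for _ in range(200)]
y_dep = list(x_dep)                         # perfectly dependent: Y = X
x_ind = [rng.random() for _ in range(200)]
y_ind = [rng.random() for _ in range(200)]
d_dep = delta_plugin(x_dep, y_dep)
d_ind = delta_plugin(x_ind, y_ind)
```

The estimate is close to zero for independent coordinates and bounded away from zero under dependence, which is what a test of independence exploits.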
6. Conclusions
The present paper was concerned with the randomly weighted bootstrap of the U-process in a Markov framework. A large number of bootstrap resampling schemes emerge as special cases of our setting, in particular the multinomial bootstrap, the best-known scheme, introduced by [26]. One of the main tools was the approximation of the Markov U-process by the corresponding regenerative one. We mimicked this result in Proposition 2 in order to approximate the weighted-bootstrap U-process by the regenerative weighted-bootstrap U-process . Further technical arguments were given in Lemma 1, which extends the work of [47]. These intricate tools were used to obtain full independence of the regenerative block variables, by proving that the random number of blocks can be replaced by a deterministic one; this was the main obstacle in extending the bootstrap results to the Markov framework. Once this independence was established, we applied the results of [46]. All the above steps led us to prove the weak convergence of the regenerative-block weighted-bootstrap U-process, which in turn implies the weak convergence of the weighted-bootstrap U-process. It would be of interest to extend the paper to the semi-Markov setting. A more delicate problem is the setting of incomplete data, such as censored or missing observations; to the best of our knowledge, this problem has not been considered even for the original sample (without bootstrap) in the Markov framework. It would also be interesting to extend our work to locally stationary processes, which requires nontrivial mathematics and would go well beyond the scope of the present paper.
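To make the notion of exchangeable weights concrete, here is an illustrative sketch of two weight schemes covered by our setting: the multinomial weights of Efron's bootstrap [26] and the Dirichlet-type weights of the Bayesian bootstrap [35]; the function names are our own:

```python
import random

def multinomial_weights(n, rng):
    # Efron's multinomial bootstrap: W_i = number of times index i is
    # redrawn in n uniform draws; (W_1, ..., W_n) is exchangeable, sums to n.
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts

def bayesian_weights(n, rng):
    # Rubin's Bayesian bootstrap: standard exponentials normalized to
    # sum n, i.e. a flat Dirichlet vector rescaled.
    e = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(e)
    return [n * w / s for w in e]

rng = random.Random(4)
w_multinomial = multinomial_weights(10, rng)
w_bayesian = bayesian_weights(10, rng)
```

Both vectors are exchangeable and nonnegative with total mass n; they differ in smoothness (integer counts versus continuous weights), which is exactly the kind of variety the exchangeably weighted framework unifies.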
7. Mathematical Development
This section is devoted to the proof of our results. The previously defined notations continue to be used in what follows.
Proof of Proposition 2.
We have
Given ( is not excluded) and , we set to be the point of obtained from i by deleting the coordinates in the places not in J (e.g., if , then ). Furthermore, indicates the sum over ; for instance, if and , then
By convention, . Notice that
In a similar way, we have
Making use of Proposition 1 and the law of large numbers, we infer that
Hence, the proof is completed. □
Proof of Lemma 1.
The proof of part and part follows from ([47] Lemma 3.1 and Lemma 3.2). In order to prove , we need to show that, for every ,
which follows if, conditioned on the sample,
We have:
We denote by the expectation conditionally on . By the fact that are i.i.d. and using Chebyshev’s inequality, we have:
The last inequality follows using , which implies that and where
for . For we have:
The last expression converges to zero by the fact that and by part iii)
This proves Lemma 1. □
Proof of Theorem 3.
For the weak convergence, we need to show the finite-dimensional convergence and the asymptotic equicontinuity. According to Proposition 2 and [6], the finite-dimensional convergence is obtained if, for every fixed finite collection of functions ,
where is the Gaussian chaos process. By the Cramér–Wold device and the countability of , we only need to show that, for any ,
By ([6] Section 4.2) and ([29] Section 2A), any can be expanded in by , where is a sequence of real numbers and
for some bounded . Fix . Then, there exists such that with ,
The left-hand side of (25) can be further bounded by
Let ; noting that is bounded by one and using Lemma 1, we can replace by , which is deterministic. In the following, we denote by a random permutation uniformly distributed over , the set of all permutations over . We have
We have, according to [43], for a non-negative sequence of variables such that and for a random permutation of , for any and ,
Furthermore, according to [70,71], we have:
Hence we have
Now for the second term, we have:
where is the polynomial of degree m (see [6], p. 175):
As we mentioned before, this polynomial follows from Newton’s inequality and allows us to express a polynomial function as a sum of monomials. All that remains is to check each argument of this polynomial function. □
For : We first recall the following lemma from [53].
Lemma 3
([53]). Let be a vector and be a vector of exchangeable random variables. Suppose that
and
Then,
Applying Lemma 3 with and replaced by , we can see that
where is a Gaussian process defined on with covariance
For : Note that
Furthermore,
The first inequality in the above display follows, since
This shows that
For :
This shows that
Then, we have
where is the Gaussian chaos process defined on (⊕ is the orthogonal sum in )
Hence, it follows that, by linearity of ,
The last term in (26) follows from the definition of
All these final results give the finite-dimensional convergence.
Now, we take a step-by-step approach to establish stochastic equicontinuity. We assume that the class of functions is bounded; that is, we suppose that , for H an envelope. Throughout the following, we denote by
- Step 1
Let
and
In this step, we must prove that the stochastic equicontinuity of the U-process implies that of the regenerative U-process. This is a consequence of 1 and, for the weighted bootstrap, of Proposition 2 and part ii) of Lemma 1.
- Step 2
Define
and
Hypothesis: The stochastic equicontinuity of implies the stochastic equicontinuity of .
Proof.
In order to prove the previous implication, we only need to show that:
Suppose that , the opposite case can be treated in a similar way. We have
However, by Lemma 1, part i). Then, there exists a constant such that, for every ,
and the first term in the previous display is bounded by:
The last expression follows from the Montgomery–Smith inequality. Since
the last expression matches the stochastic equicontinuity condition for . This proves this step. □
Before passing to the next step, we introduce a new bootstrap sample. Define for . Now, apply the weighted bootstrap procedure to the sample . This new procedure is the same as the old one for , but here we aim to replace the random quantity with a deterministic one, namely .
- Step 3
Define:
Hypothesis: The stochastic equicontinuity of implies the stochastic equicontinuity of .
Proof.
First case: :
In this case, all of the terms in the following computation should be multiplied by . We omit this factor to keep the already heavy notation simple. Define
is well defined, i.i.d., and has the same distribution as and . Hence, if we show that:
then the stochastic equicontinuity of is established. However, our aim is to approximate that of . In order to achieve this, it is sufficient to estimate:
For : Let
conditioned on the sample, we have:
Hence,
where
For n large enough, we need to show that there exists such that
As are i.i.d. and bounded,
therefore, we can find such that
However,
by Lemma 1 i), then
Then, we only need to estimate the first part in (33). Define the following bootstrap procedure: let and let be a class of functions, related to the class of functions , such that, for every :
It is classical that are i.i.d. when applying the same bootstrap method as in Algorithm 1. This new sample allows us to enlarge and bound (33) by
where and the corresponding class , with envelope and F, respectively. To estimate the last expression, we use bracketing. Define the bracket by:
and the bracketing entropy number by , which denotes the minimal number for which there exist functions and such that:
For the class of functions , consider the bracket , such that , where will be determined later. In this framework, the bracketing entropy number is , for
Hence, we have the following inequalities
Treating each term, keeping in mind Condition (A.1), i.e., , we have
and
yet, are i.i.d. and and , so for any , we have
Using the same argument as in part iii) of Lemma 1, we can prove that
Then, it remains to show that, for every fixed , is bounded in probability, as the last expression in (38) does not depend on k. It is interesting to note that is finite, due to the boundedness of by with and the fact that are i.i.d. and discrete random variables. Under the norm , define brackets and . Observe that
converges to zero in probability, and does not depend on n. That implies that in probability. Replacing by , is identical to , i.e., also converges to zero in probability. This proves the convergence of to zero in probability.
For : In the same manner, let
Define a new bootstrap sample in . Clearly, the new sample is well-defined since we assumed at the beginning that , and it is defined independently from and . In this case:
Hence, as in (33), we have:
where
Using the same bootstrap procedure defined previously for , let
for , and let be a class of functions such that, for every :
It is classical that are i.i.d. when applying the same bootstrap method as in Algorithm 1. This new sample allows us to enlarge and bound (33) by
where
corresponding to the class
with envelope and F, respectively. As before, for the class of functions , consider the bracket , such that
where will be determined later. In this framework, the bracketing entropy number is , for
Following the same arguments from Equations (37) through (38), we can find that (42) is
Here, we must pay attention to the randomness of , which depends on n. According to Lemma 1 i), we can see that in probability, under the assumption that . Now, using the same treatment of , and choosing ( so as to ensure the convergence of to ∞), as in [47], we obtain the convergence of (43) to zero in probability. Estimating now by considering the same brackets and , we have , which does not depend on n. Then, is proved. Following the same steps, we can prove the case where . This proves Step 3. □
The previous step shows that we only need to establish the stochastic equicontinuity of , where the number of blocks is replaced by the deterministic quantity . In order to achieve the equicontinuity of this statistic, Lemma 2 shows that it is sufficient to prove that:
for all . We begin by defining the distance:
defined in , associated with the Rademacher process
Take and
Using Corollary A1, we have
Assuming that , the upper bound in the integral can be replaced by . The following proposition is needed in the sequel.
Proposition 3
([46]). Let be i.i.d. random variables with law . Let be a class of measurable real-valued functions defined on with an -integrable envelope such that the following holds: for any fixed ,
holds for any . Here for and ,
and
where is an envelope for . Then,
in as . The above equation can be replaced by the decoupled version.
By this proposition, as ; therefore, it suffices to show that as and . All that remains is to demonstrate that
Verifying condition (45)
The passage from the second to the third line holds because
As the condition is verified, as well as , (46) follows directly from the previous proposition. Hence, there exists a sequence such that, for any sequence with both under , we have:
An application of Lemma 2 proves that
This completes the proof for the asymptotic equicontinuity.
Author Contributions
I.S. and S.B.: conceptualization, methodology, investigation, writing—original draft, writing—review and editing. All authors contributed equally to the writing of this paper. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
Not applicable.
Acknowledgments
The authors would like to thank the Special Issue Editor of the Special Issue on “Current Developments in Theoretical and Applied Statistics”, Christophe Chesneau for the invitation. The authors are indebted to the Editor-in-Chief and the three referees for their very generous comments and suggestions on the first version of our article, which helped us to improve the content, presentation and layout of the manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
This appendix contains supplementary information that is an essential part of providing a more comprehensive understanding of the paper. We also refer to [46] for more details.
Proof of Proposition 1.
Let and denote the possibly empty non-regenerative blocks of observations. Note that, for , the proof follows directly from [59]; under the assumptions (C1), (C2) and (C3), we can see that
Otherwise, for , we can write as follows:
where
where
the complement of the index set, with cardinality equal to . To prove the convergence of to zero in probability, it suffices to show that (I) and (II) converge to zero in probability.
where and . We apply the SLLN for Harris Markov chains to find the convergence of
to
Using the conditions, all terms in and are finite and we can prove the convergence of to zero. Now, for , applying the SLLN and by Lemma 3.2 in [47] part i), we can see that
We have
We obtain, in turn, that
Hence, also converges to zero a.s. under as . □
Proof of Theorem 1.
In what follows, let denote the number of blocks observed. We find that
where
where represents the conditional expectation of given c of the coordinates, for all . The U-statistic is obtained by truncating the Hoeffding decomposition after the first term . Then, we just need to show that:
For , introduce
Using (A1), we can replace the random variable with the deterministic quantity and we write
where . In order to establish the weak convergence of the empirical process , it is necessary and sufficient to prove the finite-dimensional convergence and the stochastic equicontinuity. For the finite-dimensional convergence, we have to prove that converges weakly to , for every fixed finite collection of functions
To this end, it is enough to show that, for every fixed ,
where
By linearity, and following the arguments of ([57], Chapter 17), we can prove that
where, under Condition (C5),
We readily infer that we have
Now, to verify the equicontinuity, we need to check that for every ,
where is a pseudo distance for which the class is totally bounded, and belong to . According to [72], we have
where For the left-hand part in the last inequality, we have
Dividing the last inequality by and using the convergence result in ([72] Lemma 2.11) with Condition (C1), we obtain the desired result. The right-hand part in the inequality is treated using ([72] Lemma 4.2) providing that for and the hypothesis of a finite uniform entropy integral. To complete the weak convergence of the regenerative U-statistic, we must treat the remaining terms of its Hoeffding decomposition. For , let us introduce
One can see that is centered, that is
By the randomization theorem, according to [7] (for ):
Hence, for a constant:
It is sufficient now to use the theorem hypothesis of a uniform entropy integral to complete the proof of the theorem. □
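The truncation of the Hoeffding decomposition after the first projection, used in the proof above, can be checked numerically on a toy example; the variance kernel h(a, b) = (a − b)²/2 with standard normal data is our own illustrative choice:

```python
import random
from itertools import combinations

rng = random.Random(3)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
n = len(x)

def h(a, b):
    # symmetric kernel of order 2 whose mean is Var(X)
    return 0.5 * (a - b) ** 2

# full U-statistic of order 2 (here: the unbiased sample variance)
u_n = sum(h(a, b) for a, b in combinations(x, 2)) * 2.0 / (n * (n - 1))

# Hoeffding decomposition truncated after the first (Hajek) projection:
# theta + (2/n) * sum_i h1(X_i), with theta = Var(X) = 1 and
# h1(a) = E[h(a, Y)] - theta = (a^2 - 1)/2 for Y ~ N(0, 1)
theta = 1.0
proj = theta + (2.0 / n) * sum((a * a - 1.0) / 2.0 for a in x)
```

The remainder u_n − proj is of smaller order than the linear term, which is why the first projection drives the limiting distribution in the nondegenerate case.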
Proof of Theorem 2.
We have
By decoupling of the U-process, due to [6],
By symmetrization, due to [6], we have
for a sequence independent and with the same distribution as . By the invariance of and the fact that is independent of , we have that
using the reversed order statistics of , , and the permutations between the different sequences of random variables, and following the same steps as [46],
substituting by , with , we have
Now, suppose that . Then, we may further bound the above equation by
where the last inequality follows from the generalized Hölder inequality and the assumption that is nondecreasing. □
Proof of Lemma 2.
For
Theorem 2 implies that:
Here
as long as none of vanishes. The claim now follows from the dominated convergence theorem. □
Corollary A1
([6]). Let , be a (weak) Gaussian or Rademacher chaos process of degree m and let
If
then there is a version of X, which we keep denoting X, with almost all of its sample paths in and such that
and
for all , where K is a universal constant and D is the diameter of T for the pseudodistance . In fact, every separable version of X satisfies these properties.
Theorem A1
([73]). For any random elements with values in a metric space , where Y is measurable and has a separable range, the following are equivalent:
- converge in law to Y;
- as ;
References
- Halmos, P.R. The theory of unbiased estimation. Ann. Math. Stat. 1946, 17, 34–43. [Google Scholar] [CrossRef]
- Hoeffding, W. A class of statistics with asymptotically normal distribution. Ann. Math. Stat. 1948, 19, 293–325. [Google Scholar] [CrossRef]
- van der Vaart, A.W. Asymptotic Statistics; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 1998; Volume 3, p. xvi+443. [Google Scholar]
- Mises, R.V. On the asymptotic distribution of differentiable statistical functions. Ann. Math. Stat. 1947, 18, 309–348. [Google Scholar] [CrossRef]
- Serfling, R.J. Approximation Theorems of Mathematical Statistics; John Wiley & Sons: Hoboken, NJ, USA, 2009; Volume 162. [Google Scholar]
- de la Peña, V.H.; Giné, E. Decoupling: From Dependence to Independence; Randomly stopped processes, U-statistics and processes, martingales and beyond; Probability and its Applications (New York); Springer: New York, NY, USA, 1999; p. xvi+392. [Google Scholar]
- Arcones, M.A.; Giné, E. Limit theorems for U-processes. Ann. Probab. 1993, 21, 1494–1542. [Google Scholar] [CrossRef]
- Lee, A.J. U-statistics; Volume 110, Statistics: Textbooks and Monographs; Theory and practice; Marcel Dekker, Inc.: New York, NY, USA, 1990; p. xii+302. [Google Scholar]
- Stute, W. Almost sure representations of the product-limit estimator for truncated data. Ann. Statist. 1993, 21, 146–156. [Google Scholar] [CrossRef]
- Arcones, M.A.; Wang, Y. Some new tests for normality based on U-processes. Statist. Probab. Lett. 2006, 76, 69–82. [Google Scholar] [CrossRef]
- Giné, E.; Mason, D.M. Laws of the iterated logarithm for the local U-statistic process. J. Theoret. Probab. 2007, 20, 457–485. [Google Scholar] [CrossRef]
- Giné, E.; Mason, D.M. On local U-statistic processes and the estimation of densities of functions of several sample variables. Ann. Statist. 2007, 35, 1105–1145. [Google Scholar] [CrossRef]
- Schick, A.; Wang, Y.; Wefelmeyer, W. Tests for normality based on density estimators of convolutions. Statist. Probab. Lett. 2011, 81, 337–343. [Google Scholar] [CrossRef]
- Joly, E.; Lugosi, G. Robust estimation of U-statistics. Stochastic Process. Appl. 2016, 126, 3760–3773. [Google Scholar] [CrossRef]
- Lee, S.; Linton, O.; Whang, Y.J. Testing for stochastic monotonicity. Econometrica 2009, 77, 585–602. [Google Scholar]
- Ghosal, S.; Sen, A.; van der Vaart, A.W. Testing monotonicity of regression. Ann. Statist. 2000, 28, 1054–1082. [Google Scholar] [CrossRef]
- Abrevaya, J.; Jiang, W. A nonparametric approach to measuring and testing curvature. J. Bus. Econom. Statist. 2005, 23, 1–19. [Google Scholar] [CrossRef]
- Nolan, D.; Pollard, D. U-processes: Rates of convergence. Ann. Statist. 1987, 15, 780–799. [Google Scholar] [CrossRef]
- Sherman, R.P. Maximal inequalities for degenerate U-processes with applications to optimization estimators. Ann. Statist. 1994, 22, 439–459. [Google Scholar] [CrossRef]
- Yoshihara, K.i. Limiting behavior of U-statistics for stationary, absolutely regular processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 1976, 35, 237–252. [Google Scholar] [CrossRef]
- Borovkova, S.; Burton, R.; Dehling, H. Limit theorems for functionals of mixing processes with applications to U-statistics and dimension estimation. Trans. Amer. Math. Soc. 2001, 353, 4261–4318. [Google Scholar] [CrossRef]
- Denker, M.; Keller, G. On U-statistics and v. Mises’ statistics for weakly dependent processes. Z. Wahrsch. Verw. Gebiete 1983, 64, 505–522. [Google Scholar] [CrossRef]
- Leucht, A. Degenerate U- and V-statistics under weak dependence: Asymptotic theory and bootstrap consistency. Bernoulli 2012, 18, 552–585. [Google Scholar] [CrossRef]
- Leucht, A.; Neumann, M.H. Degenerate U- and V-statistics under ergodicity: Asymptotics, bootstrap and applications in statistics. Ann. Inst. Statist. Math. 2013, 65, 349–386. [Google Scholar] [CrossRef]
- Bouzebda, S.; Nemouchi, B. Weak-convergence of empirical conditional processes and conditional U-processes involving functional mixing data. In Statistical Inference for Stochastic Processes; Springer: New York, NY, USA, 2022; pp. 1–56. [Google Scholar]
- Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Statist. 1979, 7, 1–26. [Google Scholar] [CrossRef]
- Hall, P. The Bootstrap and Edgeworth Expansion; Springer Series in Statistics; Springer: New York, NY, USA, 1992; p. xiv+352. [Google Scholar]
- Bickel, P.J.; Freedman, D.A. Some asymptotic theory for the bootstrap. Ann. Statist. 1981, 9, 1196–1217. [Google Scholar] [CrossRef]
- Arcones, M.A.; Giné, E. On the bootstrap of U and V statistics. Ann. Statist. 1992, 20, 655–674. [Google Scholar] [CrossRef]
- Dehling, H.; Mikosch, T. Random quadratic forms and the bootstrap for U-statistics. J. Multivariate Anal. 1994, 51, 392–413. [Google Scholar] [CrossRef]
- Leucht, A.; Neumann, M.H. Consistency of general bootstrap methods for degenerate U-type and V-type statistics. J. Multivariate Anal. 2009, 100, 1622–1633. [Google Scholar] [CrossRef]
- Politis, D.N.; Romano, J.P. A circular block-resampling procedure for stationary data. In Exploring the Limits of Bootstrap (East Lansing, MI, 1990); Wiley Ser. Probab. Math. Statist.; Wiley: New York, NY, USA, 1992; pp. 263–270. [Google Scholar]
- Carlstein, E. The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann. Statist. 1986, 14, 1171–1179. [Google Scholar] [CrossRef]
- Politis, D.N.; Romano, J.P. The stationary bootstrap. J. Amer. Statist. Assoc. 1994, 89, 1303–1313. [Google Scholar] [CrossRef]
- Rubin, D.B. The Bayesian bootstrap. Ann. Statist. 1981, 9, 130–134. [Google Scholar] [CrossRef]
- Lo, A.Y. A Bayesian method for weighted sampling. Ann. Statist. 1993, 21, 2138–2148. [Google Scholar] [CrossRef]
- Mason, D.M.; Newton, M.A. A rank statistics approach to the consistency of a general bootstrap. Ann. Statist. 1992, 20, 1611–1624. [Google Scholar] [CrossRef]
- Præstgaard, J.; Wellner, J.A. Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 1993, 21, 2053–2086. [Google Scholar] [CrossRef]
- van der Vaart, A. New Donsker classes. Ann. Probab. 1996, 24, 2128–2140. [Google Scholar] [CrossRef]
- Alvarez-Andrade, S.; Bouzebda, S. Strong approximations for weighted bootstrap of empirical and quantile processes with applications. Stat. Methodol. 2013, 11, 36–52. [Google Scholar] [CrossRef]
- Bouzebda, S. On the strong approximation of bootstrapped empirical copula processes with applications. Math. Methods Statist. 2012, 21, 153–188. [Google Scholar] [CrossRef]
- Bouzebda, S.; Elhattab, I.; Ferfache, A.A. General M-Estimator Processes and their m out of n Bootstrap with Functional Nuisance Parameters. In Methodology and Computing in Applied Probability; Springer: New York, NY, USA, 2022; pp. 1–45. [Google Scholar]
- Huskova, M.; Janssen, P. Consistency of the generalized bootstrap for degenerate U-statistics. Ann. Stat. 1993, 21, 1811–1823. [Google Scholar] [CrossRef]
- Janssen, P. Weighted bootstrapping of U-statistics. J. Stat. Plan. Inference 1994, 38, 31–41. [Google Scholar] [CrossRef]
- Alvarez-Andrade, S.; Bouzebda, S. Cramér’s type results for some bootstrapped U-statistics. Statist. Papers 2020, 61, 1685–1699. [Google Scholar] [CrossRef]
- Han, Q. Multiplier U-processes: Sharp bounds and applications. Bernoulli 2022, 28, 87–124. [Google Scholar] [CrossRef]
- Radulović, D. Renewal type bootstrap for Markov chains. Test 2004, 13, 147–192. [Google Scholar] [CrossRef]
- Giné, E.; Zinn, J. Bootstrapping general empirical measures. Ann. Probab. 1990, 18, 851–869. [Google Scholar] [CrossRef]
- Nummelin, E. General Irreducible Markov Chains and Nonnegative Operators; Cambridge Tracts in Mathematics; Cambridge University Press: Cambridge, UK, 1984; Volume 83, p. xi+156. [Google Scholar]
- Athreya, K.B.; Ney, P. A new approach to the limit theory of recurrent Markov chains. Trans. Amer. Math. Soc. 1978, 245, 493–501. [Google Scholar] [CrossRef]
- Nummelin, E. A splitting technique for Harris recurrent Markov chains. Z. Wahrsch. Verw. Gebiete 1978, 43, 309–318. [Google Scholar] [CrossRef]
- Chung, K.L. Markov Chains with Stationary Transition Probabilities, 2nd ed.; Die Grundlehren der mathematischen Wissenschaften, Band 104; Springer: New York, NY, USA, 1967; p. xi+301. [Google Scholar]
- van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes; Springer Series in Statistics; With applications to statistics; Springer: New York, NY, USA, 1996; p. xvi+508. [Google Scholar]
- Pollard, D. Convergence of Stochastic Processes; Springer Series in Statistics; Springer: New York, NY, USA, 1984; p. xiv+215. [Google Scholar]
- Douc, R.; Guillin, A.; Moulines, E. Bounds on regeneration times and limit theorems for subgeometric Markov chains. Ann. Inst. Henri Poincaré Probab. Stat. 2008, 44, 239–257. [Google Scholar] [CrossRef]
- Meyn, S.; Tweedie, R.L. Markov Chains and Stochastic Stability, 2nd ed.; Cambridge University Press: Cambridge, UK, 2009. [Google Scholar]
- Meyn, S.P.; Tweedie, R.L. Markov Chains and Stochastic Stability; Communications and Control Engineering Series; Springer: London, UK, 1993; p. xvi+548. [Google Scholar]
- Bertail, P.; Clémençon, S. Regeneration-based statistics for Harris recurrent Markov chains. In Dependence in Probability and Statistics; Springer: New York, NY, USA, 2006; Volume 187, pp. 3–54. [Google Scholar]
- Bertail, P.; Clémençon, S. A renewal approach to Markovian U-statistics. Math. Methods Statist. 2011, 20, 79–105. [Google Scholar] [CrossRef]
- Revuz, D. Markov Chains, 2nd ed.; North-Holland Publishing Co.: Amsterdam, The Netherlands, 1984; Volume 11. [Google Scholar]
- Cheng, G. Moment consistency of the exchangeably weighted bootstrap for semiparametric M-estimation. Scand. J. Stat. 2015, 42, 665–684. [Google Scholar] [CrossRef]
- Shao, J.; Wu, C. Heteroscedasticity-robustness of jackknife variance estimators in linear models. Ann. Statist. 1987, 15, 1563–1579. [Google Scholar] [CrossRef]
- Weng, C.S. On a second-order asymptotic property of the Bayesian bootstrap mean. Ann. Statist. 1989, 17, 705–710. [Google Scholar] [CrossRef]
- van Zwet, W.R. The Edgeworth expansion for linear combinations of uniform order statistics. In Second Prague Symposium on Asymptotic Statistics (Hradec Králové, 1978); North-Holland: Amsterdam, The Netherlands, 1979; pp. 93–101. [Google Scholar]
- Pauly, M. Consistency of the subsample bootstrap empirical process. Statistics 2012, 46, 621–626. [Google Scholar] [CrossRef]
- Shao, J.; Tu, D.S. The Jackknife and Bootstrap; Springer Series in Statistics; Springer: New York, NY, USA, 1995; p. xviii+516. [Google Scholar]
- Bertail, P.; Clémençon, S. Regenerative block bootstrap for Markov chains. Bernoulli 2006, 12, 689–712. [Google Scholar] [CrossRef]
- Bertail, P.; Clémençon, S. Approximate regenerative-block bootstrap for Markov chains. Comput. Statist. Data Anal. 2008, 52, 2739–2756. [Google Scholar] [CrossRef]
- Fan, Y.; Ullah, A. On goodness-of-fit tests for weakly dependent processes using kernel method. J. Nonparametr. Statist. 1999, 11, 337–360, First NIU Symposium on Statistical Sciences (De Kalb, IL, 1996). [Google Scholar] [CrossRef]
- Frees, E.W. Infinite order U-statistics. Scand. J. Statist. 1989, 16, 29–45. [Google Scholar]
- Rempala, G.; Gupta, A. Weak limits of U-statistics of infinite order. Random Oper. Stochastic Equations 1999, 7, 39–52. [Google Scholar] [CrossRef]
- Levental, S. Uniform limit theorems for Harris recurrent Markov chains. Probab. Theory Related Fields 1988, 80, 101–118. [Google Scholar] [CrossRef]
- Dudley, R.M. Nonlinear functionals of empirical measures and the bootstrap. In Probability in Banach Spaces, 7 (Oberwolfach, 1988); Progr. Probab.; Birkhäuser Boston: Boston, MA, USA, 1990; Volume 21, pp. 63–82. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).