A Blockwise Bootstrap-Based Two-Sample Test for High-Dimensional Time Series

Yang, Lin

doi:10.3390/e26030226


by

Lin Yang

Joint Laboratory of Data Science and Business Intelligence, Southwestern University of Finance and Economics, Chengdu 611130, China

Entropy 2024, 26(3), 226; https://doi.org/10.3390/e26030226

Submission received: 16 January 2024 / Revised: 24 February 2024 / Accepted: 27 February 2024 / Published: 1 March 2024

(This article belongs to the Special Issue Recent Advances in Statistical Inference for High Dimensional Data)


Abstract

We propose a two-sample testing procedure for high-dimensional time series. To obtain the asymptotic distribution of our

ℓ_{\infty}

-type test statistic under the null hypothesis, we establish high-dimensional central limit theorems (HCLTs) for an

α

-mixing sequence. Specifically, we derive two HCLTs for the maximum of a sum of high-dimensional

α

-mixing random vectors under the assumptions of bounded finite moments and exponential tails, respectively. The proposed HCLT for

α

-mixing sequences under the bounded finite moments assumption is novel, and in comparison with existing results, we improve the convergence rate of the HCLT under the exponential tails assumption. To compute the critical value, we employ the blockwise bootstrap method. Importantly, our approach does not require the independence of the two samples, making it applicable to detecting change points in high-dimensional time series. Numerical results demonstrate the effectiveness and advantages of our method.

Keywords:

two-sample testing; high-dimensional time series; α-mixing; Gaussian approximation; blockwise bootstrap

1. Introduction

A fundamental testing problem in multivariate analysis involves assessing the equality of two mean vectors, denoted as

μ_{X}

and

μ_{Y}

. Since its inception by [1], the Hotelling

T^{2}

test has proven to be a valuable tool in multivariate analyses. Subsequently, numerous studies have addressed the testing of

μ_{X} = μ_{Y}

, within various contexts and under distinct assumptions. See refs. [2,3], along with their respective references.

Consider two sets of observations,

{X_{t}}_{t = 1}^{n_{1}}

and

{Y_{t}}_{t = 1}^{n_{2}}

, where

X_{t} = {(X_{t, 1}, \dots, X_{t, p})}^{T}

and

Y_{t} = {(Y_{t, 1}, \dots, Y_{t, p})}^{T}

. These observations are drawn from two populations with means

μ_{X}

and

μ_{Y}

, respectively. The classical test aims to test the hypotheses:

\begin{matrix} H_{0} : μ_{X} = μ_{Y} versus H_{1} : μ_{X} \neq μ_{Y} . \end{matrix}

(1)

When

{X_{t}}_{t = 1}^{n_{1}}

and

{Y_{t}}_{t = 1}^{n_{2}}

are two sequences of independent observations that are also independent of each other, a considerable body of literature focuses on testing Hypothesis (1). The

ℓ_{2}

-type test statistic corresponding to (1) is of the form

{(\bar{X} - \bar{Y})}^{T} S^{- 1} (\bar{X} - \bar{Y})

, where

\bar{X} = n_{1}^{- 1} \sum_{t = 1}^{n_{1}} X_{t}

,

\bar{Y} = n_{2}^{- 1} \sum_{t = 1}^{n_{2}} Y_{t}

and

S^{- 1}

is the weight matrix. A straightforward choice for

S^{- 1}

is the identity matrix

I_{p}

[4,5], implying equal weighting for each dimension. Several classical asymptotic theories have been developed based on this selection of

S^{- 1}

. However, this choice disregards the variability in each dimension and the correlations between them, resulting in suboptimal performance, particularly in the presence of heterogeneity or the existence of correlations between dimensions. In recent decades, numerous researchers have investigated various choices for

S^{- 1}

along with the corresponding asymptotic theories. See refs. [6,7]. In addition, some researchers have developed a framework centered on

ℓ_{\infty}

-type test statistics, represented as

{max}_{j \in [p]} | {(S^{- 1 / 2} (\bar{X} - \bar{Y}))}_{j} |

[8,9,10]. Extreme value theory plays a pivotal role in deriving the asymptotic behaviors of these test statistics.

However, when

{X_{t}}_{t = 1}^{n_{1}}

and

{Y_{t}}_{t = 1}^{n_{2}}

are two weakly dependent sequences and are not independent of each other, the above methods may not work well. In this paper, we introduce an

ℓ_{\infty}

-type test statistic

T_{n} : = {(n_{1} n_{2})}^{1 / 2} {(n_{1} + n_{2})}^{- 1 / 2} {| \bar{X} - \bar{Y} |}_{\infty}

for testing

H_{0}

under two dependent sequences. Based on

Σ

, which represents the variance of

{(n_{1} n_{2})}^{1 / 2} {(n_{1} + n_{2})}^{- 1 / 2} (\bar{X} - \bar{Y})

, we construct a Gaussian maximum, denoted as

T_{n}^{G}

, to approximate

T_{n}

under the null hypothesis. When

n_{1} = n_{2} = n

,

T_{n}

can be written as

| S_{n} |_{\infty}

, the maximum of a sum of high-dimensional weakly dependent random vectors, where

S_{n} = n^{- 1 / 2} \sum_{t = 1}^{n} (X_{t} - Y_{t})

. Let

T_{n}^{G} = {| G |}_{\infty}

with

G = {(G_{1}, \dots, G_{p})}^{T}

∼

N {0, var (S_{n})}

and

A

be a class of Borel subsets in

R^{p}

. Define

ρ_{n} (A) = sup_{A \in A} | P (S_{n} \in A) - P (G \in A) | .

Particularly, let

A^{\max}

consist of all sets

A^{\max}

of the form

A^{\max} = {{(a_{1}, \dots, a_{p})}^{T} \in R^{p}

:

{max}_{j \in [p]} | a_{j} | \leq x}

with some

x \in R

. Then we have

\begin{matrix} ρ_{n} (A^{\max}) = sup_{x \in R} | P (T_{n} \leq x) - P (T_{n}^{G} \leq x) | . \end{matrix}

Note that

ρ_{n} (A^{\max})

is the Kolmogorov distance between

T_{n}

and

T_{n}^{G}

.
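When simulated draws of both statistics are available, the Kolmogorov distance above can be approximated by the largest gap between the two empirical distribution functions. The following is a minimal sketch under that assumption; the function name `kolmogorov_distance` is hypothetical and not part of the original paper.

```python
import numpy as np

def kolmogorov_distance(sample_a, sample_b):
    """Sup-distance between the empirical CDFs of two 1-D samples."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    # Both empirical CDFs are right-continuous step functions, so the
    # supremum of their difference is attained at one of the pooled points.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))
```

Passing Monte Carlo draws of the test statistic and of its Gaussian analog gives a simulation-based proxy for the distance, subject to Monte Carlo error.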

When the dimension p diverges exponentially with respect to the sample size n, several studies have focused on deriving

ρ_{n} (A^{\max}) = o (1)

under a weakly dependent assumption. Based on the coupling method for

β

-mixing sequence, ref. [11] obtained

ρ_{n} (A^{\max}) = o (1)

under the

β

-mixing condition, contributing to the understanding of such phenomena. Ref. [12] extended the scope of the investigation to the physical dependence framework introduced by [13]. Considering three distinct types of dependence—namely

α

-mixing, m-dependence, and physical dependence measures—ref. [14] made significant strides. They established nonasymptotic error bounds for Gaussian approximations of sums involving high-dimensional dependent random vectors. Their analysis encompassed various scenarios of

A

, including hyper-rectangles, simple convex sets, and sparsely convex sets. Let

A^{re}

be the class of all hyper-rectangles in

R^{p}

. Under the

α

-mixing scenario and some mild regularity conditions, [14] showed

ρ_{n} (A^{re}) ≲ \frac{{log (p n)}^{7 / 6}}{n^{1 / 9}},

hence the Gaussian approximation holds if

log (p n) = o (n^{2 / 21})

. In this paper, under some conditions similar to or even weaker than [14], we obtain

ρ_{n} (A^{\max}) ≲ \frac{{log (p n)}^{3 / 2}}{n^{1 / 6}},

which implies the Gaussian approximation holds if

log (p n) = o (n^{1 / 9})

. Refer to Remark 1 for more details on the comparison of the convergence rates. By using the Gaussian-to-Gaussian comparison and Nazarov’s inequality for p-dimensional random vectors, we can easily extend our result to

ρ_{n} (A^{re}) ≲ {log (p n)}^{3 / 2} n^{- 1 / 6}

. Given that our framework and numerous testing procedures rely on

ℓ_{\infty}

-type test statistics, we thus propose our results under

A^{\max}

. When p diverges polynomially with respect to n, to the best of our knowledge, there is no existing literature providing the convergence rate of

ρ_{n} (A^{\max})

for

α

-mixing sequences under bounded finite moments.

Based on the Gaussian approximation for high-dimensional independent random vectors [15,16], we employ the coupling method for

α

-mixing sequences [17] and the “big-and-small” block technique to specify the convergence rate of

ρ_{n} (A^{\max})

under various divergence rates of p. For more details, refer to Theorem 1 in Section 3.1 and its corresponding proof in Appendix A. Given that

Σ

is typically unknown in practice, we develop a data-driven procedure based on blockwise wild bootstrap [18] to determine the critical value for a given significance level

α

. The blockwise wild bootstrap method is widely used in time series analysis. See [19,20] and references therein.

The independence between

{X_{t}}_{t = 1}^{n_{1}}

and

{Y_{t}}_{t = 1}^{n_{2}}

is not a necessary assumption in our method. We only require that the pair sequence

{(X_{t}, Y_{t})}

is weakly dependent. Therefore, our method can be applied effectively to detect change points in high-dimensional time series. Further details on this application can be found in Section 4.

The rest of this paper is organized as follows. Section 2 introduces the test statistic and the blockwise bootstrap method. The convergence rates of Gaussian approximations for high-dimensional

α

-mixing sequences and the theoretical properties of the proposed test can be found in Section 3. In Section 4, an application to change point detection for high-dimensional time series is presented. The selection method for the tuning parameter and a simulation study investigating the numerical performance of the test are presented in Section 5. We apply the proposed method to the opening price data from multiple stocks in Section 6. Section 7 discusses the results and outlines our future work. The proofs of the main results in Section 3 are detailed in Appendix A, Appendix B, Appendix C and Appendix D.

Notation:

For any positive integer

p \geq 1

, we write

[p] = {1, \dots, p}

. We use

{| a |}_{\infty} = {max}_{j \in [p]} | a_{j} |

to denote the

ℓ_{\infty}

-norm of the p-dimensional vector

a

. Let

⌊ x ⌋

and

⌈ x ⌉

represent the greatest integer less than or equal to x and the smallest integer greater than or equal to x, respectively. For two sequences of positive numbers

{a_{n}}

and

{b_{n}}

, we write

a_{n} ≲ b_{n}

or

b_{n} ≳ a_{n}

if

{lim sup}_{n \to \infty} a_{n} / b_{n} ⩽ c_{0}

for some positive constant

c_{0}

. Let

a_{n} ≍ b_{n}

if

a_{n} ≲ b_{n}

and

b_{n} ≲ a_{n}

hold simultaneously. Denote

0_{p} = {(0, \dots, 0)}^{T} \in R^{p}

. For any

m \times m

matrix

A = {(a_{i j})}_{m \times m}

, let

{| A |}_{\infty} = {max}_{i, j \in [m]} | a_{i j} |

and

{∥ A ∥}_{2}

be the spectral norm of

A

. Additionally, denote

λ_{min} (A)

as the smallest eigenvalue of

A

. Let

1 (\cdot)

be the indicator function. For any

x, y \in R

, denote

x \lor y = max {x, y}

and

x \land y = min {x, y}

. Given

γ > 0

, we define the function

ψ_{γ} (x) : = exp (x^{γ}) - 1

for any

x > 0

. For a real-valued random variable

ξ

, we define

{∥ ξ ∥}_{ψ_{γ}} : = inf [λ > 0 : E {ψ_{γ} (| ξ | / λ)} \leq 1]

. Throughout the paper, we use

c, C \in (0, \infty)

to denote two generic finite constants that do not depend on

(n_{1}, n_{2}, p)

, and may be different in different uses.

2. Methodology

2.1. Test Statistic and Its Gaussian Analog

Consider two weakly stationary time series

{X_{t}, t \in Z}

and

{Y_{t}, t \in Z}

with

X_{t} = {(X_{t, 1}, \dots, X_{t, p})}^{T}

and

Y_{t} = {(Y_{t, 1}, \dots, Y_{t, p})}^{T}

. Let

μ_{X} = E (X_{t})

and

μ_{Y} = E (Y_{t})

. The primary focus is on testing equality of mean vectors of the two populations:

\begin{matrix} H_{0} : μ_{X} = μ_{Y} versus H_{1} : μ_{X} \neq μ_{Y} . \end{matrix}

Given the observations

{X_{t}}_{t = 1}^{n_{1}}

and

{Y_{t}}_{t = 1}^{n_{2}}

, the estimations of

μ_{X}

and

μ_{Y}

are, respectively,

{\hat{μ}}_{X} = {n_{1}}^{- 1} \sum_{t = 1}^{n_{1}} X_{t}

and

{\hat{μ}}_{Y} = {n_{2}}^{- 1} \sum_{t = 1}^{n_{2}} Y_{t}

. In this paper, we assume

n_{1} ≍ n_{2} ≍ n

. It is natural to consider the

ℓ_{\infty}

-type test statistic

T_{n} = {(n_{1} n_{2})}^{1 / 2} {(n_{1} + n_{2})}^{- 1 / 2} {| {\hat{μ}}_{X} - {\hat{μ}}_{Y} |}_{\infty}

. Write

\tilde{n} = max {n_{1}, n_{2}}

. Define two new sequences

{{\tilde{X}}_{t}}_{t = 1}^{\tilde{n}}

and

{{\tilde{Y}}_{t}}_{t = 1}^{\tilde{n}}

with

\begin{matrix} {\tilde{X}}_{t} = X_{t \land n_{1}} 1 (1 \leq t \leq n_{1}) and {\tilde{Y}}_{t} = Y_{t \land n_{2}} 1 (1 \leq t \leq n_{2}) . \end{matrix}

For each

t \in [\tilde{n}]

, let

Z_{t} = \sqrt{\frac{n_{2} \tilde{n}}{n_{1} (n_{1} + n_{2})}} {\tilde{X}}_{t} - \sqrt{\frac{n_{1} \tilde{n}}{n_{2} (n_{1} + n_{2})}} {\tilde{Y}}_{t} .

Then,

T_{n}

can be rewritten as

\begin{matrix} T_{n} = | \frac{1}{\sqrt{\tilde{n}}} \sum_{t = 1}^{\tilde{n}} Z_{t} |_{\infty} . \end{matrix}

(2)
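The rewriting in (2) can be checked numerically. The sketch below builds the zero-padded, rescaled sequence and computes the statistic from it; the function names `combined_sequence` and `two_sample_statistic` are hypothetical, and the inputs are assumed to be arrays with one observation per row.

```python
import numpy as np

def combined_sequence(X, Y):
    """Build Z_t for t = 1, ..., n_tilde from X (n1 x p) and Y (n2 x p)."""
    n1, p = X.shape
    n2 = Y.shape[0]
    n_tilde = max(n1, n2)
    # Zero-pad the shorter sample, mirroring the indicator functions
    # in the definitions of the padded sequences.
    X_pad = np.zeros((n_tilde, p))
    X_pad[:n1] = X
    Y_pad = np.zeros((n_tilde, p))
    Y_pad[:n2] = Y
    weight_x = np.sqrt(n2 * n_tilde / (n1 * (n1 + n2)))
    weight_y = np.sqrt(n1 * n_tilde / (n2 * (n1 + n2)))
    return weight_x * X_pad - weight_y * Y_pad

def two_sample_statistic(X, Y):
    """T_n computed as the sup-norm of the scaled sum of Z_t, as in (2)."""
    Z = combined_sequence(X, Y)
    return float(np.max(np.abs(Z.sum(axis=0))) / np.sqrt(Z.shape[0]))
```

The value returned by `two_sample_statistic` should agree, up to floating-point error, with the direct formula based on the two sample means.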

We reject the null hypothesis

H_{0}

if

T_{n} > {cv}_{α}

, where

{cv}_{α}

represents the critical value at the significance level

α \in (0, 1)

. Determining

{cv}_{α}

involves deriving the distribution of

T_{n}

under

H_{0}

. However, due to the divergence of p in a high-dimensional scenario, obtaining the distribution of

T_{n}

is challenging. To address this challenge, we employ the Gaussian approximation theorem [15,16]. We seek a Gaussian analog, denoted as

T_{n}^{G}

, satisfying the property that the Kolmogorov distance between

T_{n}

and

T_{n}^{G}

converges to zero under

H_{0}

. Then, we can replace

{cv}_{α}

by

{cv}_{α}^{G} : = inf {x > 0 : P (T_{n}^{G} > x) \leq α}

. Define a p-dimensional Gaussian vector

\begin{matrix} G \sim N (0_{p}, Ξ_{\tilde{n}}) with Ξ_{\tilde{n}} = var (\frac{1}{\sqrt{\tilde{n}}} \sum_{t = 1}^{\tilde{n}} Z_{t}) . \end{matrix}

(3)

We then define the Gaussian analogue of

T_{n}

as

\begin{matrix} T_{n}^{G} = {| G |}_{\infty} . \end{matrix}

Proposition 1 below demonstrates that the null distribution of

T_{n}

can be effectively approximated by the distribution of

T_{n}^{G}

.
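If the covariance matrix in (3) were known, the Gaussian critical value could be approximated by simulation. The following sketch assumes a known covariance matrix `Xi`, which is rarely available in practice; the function name `gaussian_critical_value` and its parameters are hypothetical.

```python
import numpy as np

def gaussian_critical_value(Xi, alpha, n_draws=100_000, seed=0):
    """Monte Carlo (1 - alpha)-quantile of |G|_inf with G ~ N(0, Xi)."""
    rng = np.random.default_rng(seed)
    p = Xi.shape[0]
    # Draw n_draws Gaussian vectors with the assumed-known covariance Xi.
    G = rng.multivariate_normal(np.zeros(p), Xi, size=n_draws)
    # Gaussian analog of the test statistic: the sup-norm of each draw.
    T_G = np.max(np.abs(G), axis=1)
    return float(np.quantile(T_G, 1.0 - alpha))
```

For p = 1 and unit variance the result should be close to the familiar two-sided normal quantile, which gives a quick sanity check of the simulation.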

2.2. Blockwise Bootstrap

Note that the long-run covariance matrix

Ξ_{\tilde{n}}

specified in (3) is typically unknown. As a result, determining

{cv}_{α}^{G}

through the distribution of

T_{n}^{G}

becomes challenging. To address this challenge, we introduce a parametric bootstrap estimator for

T_{n}

using the blockwise bootstrap method [18].

For some positive constant

ϑ \in [1 / 2, 1)

, let

S ≍ {\tilde{n}}^{1 - ϑ}

and

B = ⌈ \tilde{n} / S ⌉

be the size of each block and the number of blocks, respectively. Denote

I_{b} = {(b - 1) S + 1, \dots, b S}

for

b \in [B - 1]

and

I_{B} = {(B - 1) S + 1, \dots, \tilde{n}}

. Let

{ϱ_{b}}_{b = 1}^{B}

be the sequence of i.i.d. standard normal random variables and

ϱ^{'} = (ϱ_{1}^{'}, \dots, ϱ_{\tilde{n}}^{'})

, where

ϱ_{t}^{'} = ϱ_{b}

if

t \in I_{b}

. Define the bootstrap estimator of

T_{n}

as

\begin{matrix} {\hat{T}}_{n}^{G} = | \frac{1}{\sqrt{\tilde{n}}} \sum_{t = 1}^{\tilde{n}} (Z_{t} - \bar{Z}) ϱ_{t}^{'} |_{\infty}, \end{matrix}

where

\bar{Z} = {\tilde{n}}^{- 1} \sum_{t = 1}^{\tilde{n}} Z_{t}

. Based on this estimator, we define the estimated critical value

{\hat{cv}}_{α}

as

\begin{matrix} {\hat{cv}}_{α} : = & inf {x > 0 : P ({\hat{T}}_{n}^{G} > x | E) \leq α}, \end{matrix}

(4)

where

E = {X_{1}, \dots, X_{n_{1}}, Y_{1}, \dots, Y_{n_{2}}}

. Then, we reject the null hypothesis

H_{0}

if

T_{n} > {\hat{cv}}_{α}

. The procedure for selecting the parameter

ϑ

(or block size S) is detailed in Section 5.1. In practice, we obtain

{\hat{cv}}_{α}

through the following bootstrap procedure: Generate K independent sequences

{ϱ_{(1), t}^{'}}_{t = 1}^{\tilde{n}}, \dots, {ϱ_{(K), t}^{'}}_{t = 1}^{\tilde{n}}

, with each

{ϱ_{(k), t}^{'}}_{t = 1}^{\tilde{n}}

generated as

{ϱ_{t}^{'}}_{t = 1}^{\tilde{n}}

. For each

k \in [K]

, calculate

{\hat{T}}_{(k), n}^{G}

with

{ϱ_{(k), t}^{'}}_{t = 1}^{\tilde{n}}

. Then,

{\hat{cv}}_{α}

is the

⌈ (1 - α) K ⌉

-th smallest value among

{{\hat{T}}_{(1), n}^{G}, \dots, {\hat{T}}_{(K), n}^{G}}

. Here, K is the number of bootstrap replications.
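The bootstrap procedure above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `blockwise_bootstrap_cv` is hypothetical, the input `Z` is assumed to hold the combined sequence with one time point per row, and the critical value is read as the empirical (1 - α)-quantile of the K bootstrap draws, consistent with definition (4). Because the multiplier is constant within a block, the bootstrap sum reduces to a weighted sum of centred block sums.

```python
import numpy as np

def blockwise_bootstrap_cv(Z, block_size, alpha=0.05, K=1000, seed=0):
    """Blockwise multiplier bootstrap critical value for the sup-norm statistic."""
    rng = np.random.default_rng(seed)
    n_tilde, p = Z.shape
    B = int(np.ceil(n_tilde / block_size))
    # Block index of each time point: blocks 1, ..., B-1 have length S,
    # and the last block takes whatever remains.
    block_of_t = np.arange(n_tilde) // block_size
    # Sum the centred observations within each block (B x p array).
    Z_centred = Z - Z.mean(axis=0)
    block_sums = np.zeros((B, p))
    np.add.at(block_sums, block_of_t, Z_centred)
    # One i.i.d. standard normal multiplier per block and per replication.
    multipliers = rng.standard_normal((K, B))
    T_boot = np.max(np.abs(multipliers @ block_sums), axis=1) / np.sqrt(n_tilde)
    # Empirical version of (4): the ceil((1 - alpha) K)-th smallest draw.
    j = int(np.ceil(round((1.0 - alpha) * K, 10)))
    cv_hat = float(np.sort(T_boot)[j - 1])
    return cv_hat, T_boot
```

The null hypothesis would then be rejected when the observed statistic exceeds `cv_hat`. Larger K reduces the Monte Carlo error of the estimated quantile at a proportional computational cost.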

3. Theoretical Results

We employ the concept of ‘

α

-mixing’ to characterize the serial dependence of

{(X_{t}, Y_{t})}

, with the

α

-mixing coefficient at lag

κ

defined as

\begin{matrix} α (κ) : = sup_{r} sup_{A \in F_{- \infty}^{r}, B \in F_{r + κ}^{\infty}} | P (A B) - P (A) P (B) |, \end{matrix}

(5)

where

F_{- \infty}^{r}

and

F_{r + κ}^{\infty}

are the

σ

-fields generated by

{(X_{t}, Y_{t}) : t \leq r}

and

{(X_{t}, Y_{t}) : t \geq r + κ}

, respectively. We say that the sequence

{(X_{t}, Y_{t})}

is

α

-mixing if

α (κ) \to 0

as

κ \to \infty

.

3.1. Gaussian Approximation for High-Dimensional α-Mixing Sequence

To show that the Kolmogorov distance between

T_{n}

and

T_{n}^{G}

converges to zero under various divergence rates of p, we need the following central limit theorems for high-dimensional

α

-mixing sequence.

Theorem 1.

Let

{ξ_{t}}_{t = 1}^{n}

be an α-mixing sequence of p-dimensional centered random vectors and

{α (κ)}_{κ \geq 1}

denote the α-mixing coefficients of

{ξ_{t}}

, defined in the same manner as (5). Write

S_{n} = {(S_{n, 1}, \dots, S_{n, p})}^{T} = n^{- 1 / 2} \sum_{t = 1}^{n} ξ_{t}

and

W = {(W_{1}, \dots, W_{p})}^{T} \sim N (0_{p}, Σ_{n})

with

Σ_{n} = E (S_{n} S_{n}^{T})

. Define

ρ_{n} = sup_{x \in R} | P ({| S_{n} |}_{\infty} \leq x) - P ({| W |}_{\infty} \leq x) | .

(i): If ${max}_{t \in [n]} {max}_{j \in [p]} E (| ξ_{t, j} |^{m}) \leq C_{1}^{*}$ , $α (κ) \leq C_{2}^{*} κ^{- τ}$ and $λ_{min} (Σ_{n}) \geq C_{3}^{*}$ for some $m > 3$ , $τ > max {2 m / (m - 3), 3}$ and constants $C_{1}^{*}, C_{2}^{*}, C_{3}^{*} > 0$ , we have

$ρ_{n} ≲ \frac{p^{1 / 2} {(log p)}^{1 / 4}}{n^{\tilde{τ}}}$

provided that $p = o (n^{2 \tilde{τ}})$ , where $\tilde{τ} = τ / (11 τ + 12)$ .
(ii): If ${max}_{t \in [n]} {max}_{j \in [p]} {∥ ξ_{t, j} ∥}_{ψ_{γ_{1}}} \leq M_{n}$ , $α (κ) \leq C_{1}^{* *} exp (- C_{2}^{* *} κ^{γ_{2}})$ and ${min}_{j \in [p]} {(Σ_{n})}_{j, j} \geq C_{3}^{* *}$ for some $M_{n} \geq 1$ , $γ_{1} \in (0, 2]$ , $γ_{2} > 0$ and constants $C_{1}^{* *}, C_{2}^{* *}, C_{3}^{* *} > 0$ , we have

$ρ_{n} ≲ \frac{M_{n} {log (p n)}^{max {(2 γ_{2} + 1) / 2 γ_{2}, 3 / 2}}}{n^{1 / 6}}$

provided that ${log (p n)}^{3} = o {n^{γ_{1} γ_{2} / (2 γ_{1} + 2 γ_{2} - γ_{1} γ_{2})}}$ and $M_{n}^{2} {log (p n)}^{1 / γ_{2}} = o (n^{1 / 3})$ .

Remark 1.

In scenarios where the dimension p diverges polynomially with respect to n, Theorem 1(i) represents a novel contribution to the existing literature. Moreover, if

τ \to \infty

(i.e.,

α (κ) ≲ exp (- C κ)

for some constant

C > 0

), we have

\tilde{τ} \to 1 / 11

, and thus

ρ_{n} = o (1)

if

p {(log p)}^{1 / 2} = o (n^{2 / 11})

. Compared with Theorem 1 in [14], which provides a Gaussian approximation result when p diverges exponentially with respect to n, Theorem 1(ii) has three improvements. Firstly, all conditions of Theorem 1(ii) are equivalent to those in Theorem 1 of [14], with the exception that we permit

γ_{1} \in (0, 1)

, thereby offering a weaker assumption that is more broadly applicable. Secondly, the convergence rate dependent on n via

n^{- 1 / 6}

in Theorem 1(ii) outperforms the rate of

n^{- 1 / 9}

demonstrated in Theorem 1 of [14]. Note that the convergence rate in Theorem 1 of [14] can be rewritten as

{[\frac{M_{n} {log (p n)}^{(2 γ_{2} + 1) / 2 γ_{2}}}{n^{1 / 6}}]}^{2 / 3} + \frac{M_{n} {log (p n)}^{7 / 6}}{n^{1 / 9}} .

To ensure

ρ_{n} = o (1)

, in our result, it is necessary to allow

M_{n}^{6} {log (p n)}^{(6 γ_{2} + 3) / γ_{2}} = o (n)

when

γ_{2} \leq 2 / 3

and

M_{n}^{6} {log (p n)}^{max {(6 γ_{2} + 3) / γ_{2}, 9}} = o (n)

when

γ_{2} > 2 / 3

, respectively. Comparatively, the basic requirements under Theorem 1 of [14] are

M_{n}^{6} {log (p n)}^{(6 γ_{2} + 3) / γ_{2}} = o (n)

when

γ_{2} \leq 2 / 3

and

M_{n}^{9} {log (p n)}^{21 / 2} = o (n)

when

γ_{2} > 2 / 3

, respectively. Due to

(6 γ_{2} + 3) / γ_{2} < 21 / 2

when

γ_{2} > 2 / 3

, our result permits a larger or equal divergence rate of p compared with Theorem 1 in [14].
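The exponent arithmetic in Remark 1 can be verified with exact fractions. The sketch below is only a check of the stated formulas; the function names are hypothetical, and the entries for [14] simply transcribe the requirements quoted in the remark.

```python
from fractions import Fraction

def tilde_tau(tau):
    """Rate exponent in Theorem 1(i): tau / (11 tau + 12)."""
    tau = Fraction(tau)
    return tau / (11 * tau + 12)

def log_exponent_ours(gamma2):
    """Power of log(pn) required by Theorem 1(ii) for rho_n = o(1)."""
    g = Fraction(gamma2)
    return max((6 * g + 3) / g, Fraction(9))

def log_exponent_ref14(gamma2):
    """Power of log(pn) quoted in Remark 1 for Theorem 1 of [14]."""
    g = Fraction(gamma2)
    return (6 * g + 3) / g if g <= Fraction(2, 3) else Fraction(21, 2)
```

For instance, `tilde_tau` increases towards 1/11 as its argument grows, and the two log-exponents coincide at 2/3, after which the requirement of Theorem 1(ii) is strictly smaller.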

3.2. Theoretical Properties

In order to derive the theoretical properties of

T_{n}

, the following regularity assumptions are needed.

Assumption 1.

(i): For some $m > 4$ , there exists a constant $C_{1} > 0$ s.t. ${max}_{t \in [\tilde{n}]} {max}_{j \in [p]} E (| Z_{t, j} |^{m}) \leq C_{1}$ .
(ii): There exists a constant $C_{2} > 0$ s.t. $α (κ) \leq C_{2} κ^{- τ}$ for some $τ > 3 m / (m - 4)$ .
(iii): There exists a constant $C_{3} > 0$ s.t. $λ_{min} (Ξ_{\tilde{n}}) \geq C_{3}$ .

Assumption 2.

(i): There exists a constant $C_{1}^{'} > 0$ s.t. ${max}_{t \in [\tilde{n}]} {max}_{j \in [p]} {∥ Z_{t, j} ∥}_{ψ_{2}} \leq C_{1}^{'}$ .
(ii): There exist two constants $C_{2}^{'}, C_{3}^{'} > 0$ s.t. $α (κ) \leq C_{2}^{'} exp (- C_{3}^{'} κ)$ .
(iii): There exists a constant $C_{4}^{'} > 0$ s.t. ${min}_{j \in [p]} {(Ξ_{\tilde{n}})}_{j, j} \geq C_{4}^{'}$ .

Remark 2.

Assumptions 1 and 2 are mild conditions on

{(X_{t}, Y_{t})}

under which Gaussian approximation theory can be developed when the dimension p diverges at a polynomial rate and at an exponential rate relative to the sample size n, respectively. Assumptions 1(i) and 1(ii) are common assumptions in multivariate time series analysis. Due to

n_{1} ≍ n_{2} ≍ n

, if

{max}_{t \in [n_{1}], j \in [p]} E (| X_{t, j} |^{m}) \leq C

and

{max}_{t \in [n_{2}], j \in [p]} E (| Y_{t, j} |^{m}) \leq C

, then Assumption 1(i) holds, as verified by the triangle inequality. Additionally, Assumption 1(iii) necessitates the strong nondegeneracy of

Ξ_{\tilde{n}}

, a requirement commonly assumed in Gaussian approximation theories (see refs. [21,22], among others). Note that Assumption 2(iii) is implied by Assumption 1(iii): Assumption 2(iii) only requires the nondegeneracy of

{min}_{j \in [p]} var ({\tilde{n}}^{- 1 / 2} \sum_{t = 1}^{\tilde{n}} Z_{t, j})

. We can modify Assumption 2(i) to

{max}_{t \in [\tilde{n}]} {max}_{j \in [p]} {∥ Z_{t, j} ∥}_{ψ_{γ}} \leq C

for any

γ \in (0, 2]

, a standard assumption in the literature on ultra-high-dimensional data analysis. This assumption ensures subexponential upper bounds for the tail probabilities of the statistics in question when

p ≫ n

, as discussed in [23,24]. The requirement of sub-Gaussian properties in Assumption 2(i) is made for the sake of simplicity. If

{X_{t}}

and

{Y_{t}}

share the same tail probability, Assumption 2(i) is satisfied automatically. Assumption 2(ii) necessitates that the α-mixing coefficients decay at an exponential rate.

Write

Δ_{n} : = max {n_{1}, n_{2}} - min {n_{1}, n_{2}}

. Define two cases with respect to the distinct divergence rates of p as

Case1: ${X_{t}}_{t = 1}^{n_{1}}$ and ${Y_{t}}_{t = 1}^{n_{2}}$ satisfy Assumption 1, and the dimension p satisfies $p^{2} log p = o {n^{4 τ / (11 τ + 12)}}$ and $Δ_{n}^{2} log p = o (n)$ ;
Case2: ${X_{t}}_{t = 1}^{n_{1}}$ and ${Y_{t}}_{t = 1}^{n_{2}}$ satisfy Assumption 2, and the dimension p satisfies $log (p n) = o (n^{1 / 9})$ and $Δ_{n}^{2} log p = o (n)$ .

Note that

Δ_{n}^{2} log p = o (n)

restricts the maximum difference between the two sample sizes. Proposition 1 below demonstrates that, under the aforementioned cases and

H_{0}

, the Kolmogorov distance between

T_{n}

and

T_{n}^{G}

converges to zero as the sample size approaches infinity. Proposition 1 can be directly derived from Theorem 1. Note that, in the scenario where the dimension p diverges at a polynomial rate with respect to n, obtaining Proposition 1 requires only

m > 3

and

τ > max {2 m / (m - 3), 3}

, an assumption weaker than Assumption 1. The more stringent restrictions

m > 4

and

τ > 3 m / (m - 4)

in Assumption 1 are imposed to establish the results presented in Theorems 2 and 3.

Proposition 1.

In either Case1 or Case2, it holds under the null hypothesis

H_{0}

that

\begin{matrix} sup_{x \in R} | P (T_{n} \leq x) - P (T_{n}^{G} \leq x) | = o (1) . \end{matrix}

According to Proposition 1, the critical value

{cv}_{α}

can be substituted with

{cv}_{α}^{G}

. However, in practical scenarios, the long-run covariance

Ξ_{\tilde{n}}

defined in (3) is typically unknown. This implies that obtaining

{cv}_{α}^{G}

directly from the distribution of

T_{n}^{G}

is not feasible. We introduce a bootstrap method for obtaining the estimator

{\hat{cv}}_{α}

defined in (4). In situations where the dimension p diverges at a polynomial rate relative to the sample size n, we require an additional Assumption 3 to ensure that

{\hat{cv}}_{α}

serves as a reliable estimator for

{cv}_{α}

. Assumption 3 places restrictions on the cumulant function, a commonly assumed criterion in time series analysis. Refer to [25,26] for examples of such assumptions in the literature.

Assumption 3.

For each

i, j \in [p]

, define

{cum}_{i, j} (h, t, s) = cov ({\overset{˚}{Z}}_{0, i} {\overset{˚}{Z}}_{h, j}, {\overset{˚}{Z}}_{t, i} {\overset{˚}{Z}}_{s, j}) - γ_{t, i, i} γ_{s - h, j, j} - γ_{s, i, j} γ_{t - h, j, i}

, where

γ_{h, i, j} = cov (Z_{0, i}, Z_{h, j})

and

{\overset{˚}{Z}}_{t, j} = Z_{t, j} - E (Z_{t, j})

. There exists a constant

C_{4} > 0

s.t.

max_{i, j \in [p]} \sum_{h = - \infty}^{\infty} \sum_{t = - \infty}^{\infty} \sum_{s = - \infty}^{\infty} | {cum}_{i, j} (h, t, s) | < C_{4} .

Similar to Case1 and Case2, we consider two cases corresponding to different divergence rates of the dimension p, as outlined below:

Case3: ${X_{t}}_{t = 1}^{n_{1}}$ and ${Y_{t}}_{t = 1}^{n_{2}}$ satisfy Assumptions 1 and 3.
Case4: ${X_{t}}_{t = 1}^{n_{1}}$ and ${Y_{t}}_{t = 1}^{n_{2}}$ satisfy Assumption 2.

Theorem 2.

In either Case3 with

p log p = o [n^{min {(1 - ϑ) / 4, 2 τ / (11 τ + 12)}}]

and

Δ_{n}^{2} log p = o (n)

, or Case4 with

log (p n) = o [n^{min {(1 - ϑ) / 2, ϑ / 7, 1 / 9}}]

and

Δ_{n}^{2} log p = o (n)

, it holds under

H_{0}

that

{sup}_{x \in R} | P (T_{n} \leq x) - P ({\hat{T}}_{n}^{G} \leq x | E) | = o_{p} (1)

. Moreover, it holds under

H_{0}

that

P (T_{n} > {\hat{cv}}_{α}) \to α as n \to \infty .

Theorem 3.

In either Case3 with

p = o {n^{(1 - ϑ) / 4}}

or Case4 with

log (p n) = o [n^{min {ϑ / 3, (1 - ϑ) / 2}}]

, if

{max}_{j \in [p]} | μ_{X, j} - μ_{Y, j} | ≫ n^{- 1 / 2} {(log p)}^{1 / 2}

, it holds that

P (T_{n} > {\hat{cv}}_{α}) \to 1 as n \to \infty .

Remark 3.

The different requirements for the divergence rates of p follow from the fact that we do not rely on the Gaussian approximation and comparison results under certain alternative hypotheses. By Theorem 2 and Theorem 3, the optimal selections for ϑ are

1 / 2

and

7 / 9

in Case3 and Case4, respectively. This implies that

{lim}_{n \to \infty} P_{H_{0}} (T_{n} > {\hat{cv}}_{α}) = α

holds with

p log p = o (n^{1 / 8})

in Case3 and

log (p n) = o (n^{1 / 9})

in Case4. Under certain alternative hypotheses,

{lim}_{n \to \infty} P_{H_{1}} (T_{n} > {\hat{cv}}_{α}) = 1

holds with

p = o (n^{1 / 8})

in Case3 and

log (p n) = o (n^{1 / 9})

in Case4.
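The choices of ϑ in Remark 3 follow from maximizing the exponents in Theorem 2 over the admissible range. A small exact-arithmetic check is sketched below; the function names and the grid are illustrative assumptions, and the value of τ used in the Case3 example is arbitrary.

```python
from fractions import Fraction

def size_exponent_case3(theta, tau):
    """Exponent of n in the Case3 condition of Theorem 2."""
    theta, tau = Fraction(theta), Fraction(tau)
    return min((1 - theta) / 4, 2 * tau / (11 * tau + 12))

def size_exponent_case4(theta):
    """Exponent of n in the Case4 condition of Theorem 2."""
    theta = Fraction(theta)
    return min((1 - theta) / 2, theta / 7, Fraction(1, 9))

# Grid over theta in [1/2, 1) with step 1/180, which contains 7/9 exactly.
grid = [Fraction(k, 180) for k in range(90, 180)]
best_case3 = max(grid, key=lambda th: size_exponent_case3(th, tau=12))
best_case4 = max(grid, key=size_exponent_case4)
```

On this grid the maximizers are 1/2 and 7/9, with exponents 1/8 and 1/9, matching the remark. Since Assumption 1(ii) forces τ > 3, the term 2τ/(11τ + 12) exceeds 1/8, so the Case3 exponent at ϑ = 1/2 is always 1/8.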

4. Application: Change Point Detection

In this section, we show that our two-sample testing procedure can be regarded as a novel method for detecting change points in high-dimensional time series. For illustration, we present the setup for detecting a single change point, with the understanding that it can be easily extended to the case of multiple change points.

Consider a p-dimensional time series

{X_{t}}_{t = 1}^{n}

. Let

μ_{t} = E (X_{t})

. Consider the following hypothesis testing problem:

H_{0}^{'} : μ_{1} = \dots = μ_{n} versus H_{1}^{'} : μ_{1} = \dots = μ_{τ_{0} - 1} \neq μ_{τ_{0}} = \dots = μ_{n} .

Here,

τ_{0}

is the unknown change point. Let w be a positive integer such that

w < min {τ_{0}, n - τ_{0}}

. We define

{\bar{μ}}^{t} = w^{- 1} \sum_{l = t - w / 2 + 1}^{t + w / 2} μ_{l}

,

{\bar{μ}}^{(1)} = w^{- 1} \sum_{l = 1}^{w} μ_{l}

and

{\bar{μ}}^{(2)} = w^{- 1} \sum_{l = n - w + 1}^{n} μ_{l}

. Then for each

t \in [3 w / 2, n - 3 w / 2]

, define

Δ^{t, (1)} = {\bar{μ}}^{t} - {\bar{μ}}^{(1)}

and

Δ^{t, (2)} = {\bar{μ}}^{t} - {\bar{μ}}^{(2)}

. Thus,

\begin{matrix} Δ^{t, (1)} = \{\begin{matrix} 0_{p} & , if 3 w / 2 \leq t \leq τ_{0} - w / 2, \\ ({\bar{μ}}^{(2)} - {\bar{μ}}^{(1)}) \frac{t + w / 2 - τ_{0}}{w} & , if τ_{0} - w / 2 < t \leq τ_{0} + w / 2, \\ {\bar{μ}}^{(2)} - {\bar{μ}}^{(1)} & , if τ_{0} + w / 2 < t \leq n - 3 w / 2, \end{matrix} \\ Δ^{t, (2)} = \{\begin{matrix} {\bar{μ}}^{(1)} - {\bar{μ}}^{(2)} & , if 3 w / 2 \leq t \leq τ_{0} - w / 2, \\ ({\bar{μ}}^{(1)} - {\bar{μ}}^{(2)}) \frac{- t + w / 2 + τ_{0}}{w} & , if τ_{0} - w / 2 < t \leq τ_{0} + w / 2, \\ 0_{p} & , if τ_{0} + w / 2 < t \leq n - 3 w / 2 . \end{matrix} \end{matrix}

Assume

| {\bar{μ}}^{(1)} - {\bar{μ}}^{(2)} |_{\infty} = O (1)

, which represents the sparse signals case. Define

t_{1} (ε^{t, (1)}) = min {t \in [3 w / 2, n - 3 w / 2] : | Δ^{t, (1)} | > ε^{t, (1)}}

and

t_{2} (ε^{t, (2)}) = max {t \in [3 w / 2, n - 3 w / 2] : | Δ^{t, (2)} | > ε^{t, (2)}}

with two well-defined thresholds

ε^{t, (1)}, ε^{t, (2)} \geq 0

. Due to the symmetry of

| Δ^{t, (1)} |

and

| Δ^{t, (2)} |

, it holds under

H_{1}^{'}

that

\begin{matrix} τ_{0} = \frac{t_{1} (ε^{t, (1)}) + t_{2} (ε^{t, (2)})}{2} . \end{matrix}

The sample estimators of

{\bar{μ}}^{t}

,

{\bar{μ}}^{(1)}

and

{\bar{μ}}^{(2)}

are, respectively,

{\hat{\bar{μ}}}^{t} = w^{- 1} \sum_{l = t - w / 2 + 1}^{t + w / 2} X_{l}

,

{\hat{\bar{μ}}}^{(1)} = w^{- 1} \sum_{l = 1}^{w} X_{l}

and

{\hat{\bar{μ}}}^{(2)} = w^{- 1} \sum_{l = n - w + 1}^{n} X_{l}

. Based on the method proposed in Section 2, with

n_{1} = n_{2} = w

, we define the following two test statistics:

T_{w}^{t, (1)} = \sqrt{w} | {\hat{\bar{μ}}}^{t} - {\hat{\bar{μ}}}^{(1)} |_{\infty} and T_{w}^{t, (2)} = \sqrt{w} {| {\hat{\bar{μ}}}^{t} - {\hat{\bar{μ}}}^{(2)} |}_{\infty} .

Given a significance level

α > 0

, we choose

ε^{t, (1)} = {cv}_{1 α}^{t}

and

ε^{t, (2)} = {cv}_{2 α}^{t}

, where

{cv}_{1 α}^{t}

and

{cv}_{2 α}^{t}

are, respectively, the

(1 - α)

-quantiles of the distributions of

T_{w}^{t, (1)}

and

T_{w}^{t, (2)}

. The estimated critical values

{\hat{cv}}_{1 α}^{t}

and

{\hat{cv}}_{2 α}^{t}

can be obtained by (4). Thus,

{\hat{t}}_{1} = min {t \in [3 w / 2, n - 3 w / 2] : T_{w}^{t, (1)} > {\hat{cv}}_{1 α}^{t}}

and

{\hat{t}}_{2} = max {t \in [3 w / 2, n - 3 w / 2] : T_{w}^{t, (2)} > {\hat{cv}}_{2 α}^{t}}

. Hence, the estimator of

τ_{0}

is given by

\begin{matrix} {\hat{τ}}_{0} = \frac{{\hat{t}}_{1} + {\hat{t}}_{2}}{2} . \end{matrix}
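The moving-window estimator can be sketched as follows. This is a simplified illustration under two loudly stated assumptions: w is even, and a single fixed `threshold` replaces the time-dependent bootstrap critical values described above. The function names `window_statistics` and `estimate_change_point` are hypothetical.

```python
import numpy as np

def window_statistics(X, w):
    """Moving-window statistics for t in [3w/2, n - 3w/2] (1-indexed t)."""
    n = X.shape[0]
    half = w // 2
    head_mean = X[:w].mean(axis=0)        # mean of the first w observations
    tail_mean = X[n - w:].mean(axis=0)    # mean of the last w observations
    ts = np.arange(3 * half, n - 3 * half + 1)
    T1 = np.empty(ts.size)
    T2 = np.empty(ts.size)
    for i, t in enumerate(ts):
        # Rows t-w/2+1, ..., t+w/2 in 1-indexed notation.
        mid_mean = X[t - half:t + half].mean(axis=0)
        T1[i] = np.sqrt(w) * np.max(np.abs(mid_mean - head_mean))
        T2[i] = np.sqrt(w) * np.max(np.abs(mid_mean - tail_mean))
    return ts, T1, T2

def estimate_change_point(X, w, threshold):
    """Average of the first exceedance of T1 and the last exceedance of T2."""
    ts, T1, T2 = window_statistics(X, w)
    hits_1 = ts[T1 > threshold]
    hits_2 = ts[T2 > threshold]
    if hits_1.size == 0 or hits_2.size == 0:
        return None  # no change point detected at this threshold
    return (hits_1.min() + hits_2.max()) / 2
```

In a full implementation the fixed threshold would be replaced by the estimated critical values obtained from (4) at each t, which is considerably more expensive.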

We utilize

T_{w}^{t, (1)}

as an illustrative example to elucidate the applicability of our proposed method. Let w be an even integer. For any

t \in [5 w / 2, n - 3 w / 2]

, we have

T_{w}^{t, (1)} = {| w^{- 1 / 2} \sum_{l = 1}^{w} (X_{t - w / 2 + l} - X_{l}) |}_{\infty}

, where the sequence

{X_{t - w / 2 + l} - X_{l}}_{l = 1}^{w}

possesses the same weak dependence properties and similar moment/tail conditions as

{X_{l}}_{l = 1}^{n}

. For

t \in [3 w / 2, 5 w / 2 - 1]

, let

{{\tilde{X}}_{l}}_{l = 1}^{t - w / 2}

be defined as

{\tilde{X}}_{l} = X_{l}

when

l \in [1, w]

and

{\tilde{X}}_{l} = 0_{p}

when

l \in [w + 1, t - w / 2]

. Additionally, define

{{\tilde{Y}}_{l}}_{l = t - w / 2 + 1}^{2 t - w}

as

{\tilde{Y}}_{l} = X_{l}

when

l \in [t - w / 2 + 1, t + w / 2]

and

{\tilde{Y}}_{l} = 0_{p}

when

l \in [t + w / 2 + 1, 2 t - w]

. Then,

T_{w}^{t, (1)}

can be expressed as

| w^{- 1 / 2} \sum_{l = 1}^{t / 2 - w / 4} {({\tilde{Y}}_{t - w / 2 + l} - {\tilde{X}}_{l}) + ({\tilde{Y}}_{2 t - w + 1 - l} - {\tilde{X}}_{t - w / 2 + 1 - l})} |_{\infty}

, and

{({\tilde{Y}}_{t - w / 2 + l} - {\tilde{X}}_{l}) + ({\tilde{Y}}_{2 t - w + 1 - l} - {\tilde{X}}_{t - w / 2 + 1 - l})}_{l = 1}^{t / 2 - w / 4}

shares the same weak dependence properties and similar moment/tail conditions as

{X_{l}}_{l = 1}^{n}

. Hence, our method can be applied to change point detection.

The selections of w and

α

are crucial in this method. We will elaborate on the specific choices for them in future work.

5. Simulation Study

5.1. Tuning Parameter Selection

Given the observations

{X_{t}}_{t = 1}^{n_{1}}

and

{Y_{t}}_{t = 1}^{n_{2}}

, we use the minimum volatility (MV) method proposed in [27] to select the block size S.

When the data are independent, by the multiplier bootstrap method described in [28], we set

B = \tilde{n}

(thus

S = 1

). In this case,

\begin{matrix} {\hat{Ξ}}_{\tilde{n}} = & var (\frac{1}{\sqrt{\tilde{n}}} \sum_{t = 1}^{\tilde{n}} (Z_{t} - \bar{Z}) ϱ_{t}^{'} | Z_{1}, \dots, Z_{\tilde{n}}) \\ = & \frac{1}{\tilde{n}} \sum_{b = 1}^{B} \{(\sum_{t \in I_{b}} (Z_{t} - \bar{Z})) {(\sum_{t \in I_{b}} (Z_{t} - \bar{Z}))}^{T}\} = \frac{1}{\tilde{n}} \sum_{t = 1}^{\tilde{n}} (Z_{t} - \bar{Z}) {(Z_{t} - \bar{Z})}^{T} \end{matrix}

proves to be a reliable estimator of

Ξ_{\tilde{n}}

introduced in Section 3. When the data are weakly dependent (and thus nearly independent), we expect a small value for S and a large value for B. Therefore, we recommend exploring a narrow range of S, such as

S \in {1, \dots, m}

, where m is a moderate integer. In our theoretical analysis, the quality of the bootstrap approximation depends on how well

{\hat{Ξ}}_{\tilde{n}}

approximates the covariance

Ξ_{\tilde{n}}

. The idea behind the MV method is that the conditional covariance

{\hat{Ξ}}_{\tilde{n}}

should exhibit stable behavior as a function of S within an appropriate range. For more comprehensive discussions on the MV method and its applications in time series analysis, we refer readers to [27,29]. For a moderately sized integer m, let

S_{1} < S_{2} < \dots < S_{m}

be a sequence of equally spaced candidate block sizes, and

S_{0} = 2 S_{1} - S_{2}

,

S_{m + 1} = 2 S_{m} - S_{m - 1}

. For each

i \in {0, \dots, m + 1}

, let

Y_{j}^{i} = \sum_{b = 1}^{B (S_{i})} {\{\sum_{t \in I_{b}} (Z_{t, j} - {\bar{Z}}_{j})\}}^{2},

where

j \in [p]

and

B (S) = ⌈ \tilde{n} / S ⌉

. Then for each

i \in {1, \dots, m}

, we compute

Y^{i} = \sum_{j = 1}^{p} sd ({Y_{j}^{l}}_{l = i - 1}^{i + 1}),

where

sd (\cdot)

is the standard deviation. Then, we select the block size

S_{i^{*}}

with

i^{*} = arg {min}_{i \in {1, \dots, m}} Y^{i}

.

5.2. Simulation Settings

We present the results of a simulation study aimed at evaluating the performance of tests based on

T_{n}

, as defined in (2), in finite samples. To assess the finite-sample properties of the proposed test, we employed the following fundamental generating processes:

W = H A + f (a) \in R^{n \times p}

, where

A \in R^{p \times p}

is the loading matrix,

f (\cdot) : R \to R^{n \times p}

is a constant function, the parameter a belongs to the set

{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6}

, representing the distance between the null and alternative hypotheses. Additionally,

H = {(H_{1}, \dots, H_{n})}^{T} \in R^{n \times p}

with

H_{t} = ρ H_{t - 1} + ε_{t} \in R^{p \times 1}

, where

ε_{t} \overset{i i d}{\sim} N (0_{p}, I_{p})

and

ρ \in {0, 0.1, 0.2}

. Construct

f_{i} (a) = {(m_{1}^{(i)}, \dots, m_{n}^{(i)})}^{T} \in R^{n \times p}

such that

m_{t}^{(i)} = {(m_{t, 1}^{(i)}, \dots, m_{t, p}^{(i)})}^{T}

for

i \in {1, 2}

, where

m_{t, j}^{(1)} = a^{j}

and

m_{t, j}^{(2)} = a (1 - j / p)

for each

t \in [n]

and

j \in [p]

. Then

f_{1} (\cdot)

and

f_{2} (\cdot)

represent the sparse and dense signal cases, respectively. We consider three different loading matrices for

A

as follows:

(M1).: Let $V = {(v_{k, l})}_{1 \leq k, l \leq p}$ s.t. $v_{k, l} = 0.995^{| k - l |}$ , then let $A = V^{1 / 2}$ .
(M2).: Let $A = {(a_{k, l})}_{1 \leq k, l \leq p}$ s.t. $a_{k, k} = 1$ , $a_{k, l} = 0.7$ for $| k - l | = 1$ and $a_{k, l} = 0$ otherwise.
(M3).: Let $r = ⌈ p / 2.5 ⌉$ , $V = {(v_{k, l})}_{1 \leq k, l \leq p}$ , where $v_{k, k} = 1$ , $v_{k, l} = 0.9$ for $r (q - 1) + 1 \leq k \neq l \leq r q$ with $q = 1, \dots, ⌊ p / r ⌋$ , and $v_{k, l} = 0$ otherwise. Let $A = V^{1 / 2}$ .
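The generating process and the three loading matrices can be sketched in Python as below. The function names are hypothetical, the symmetric square root is computed by an eigendecomposition, and the AR(1) recursion is started from H_0 = 0 without burn-in. The starting value is an assumption, since the text does not specify the initialization.

```python
import numpy as np

def sym_sqrt(V):
    """Symmetric square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(V)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def loading_matrix(model, p):
    """Loading matrix A for (M1), (M2), or (M3)."""
    idx = np.arange(p)
    dist = np.abs(idx[:, None] - idx[None, :])
    if model == "M1":
        return sym_sqrt(0.995 ** dist)
    if model == "M2":
        return np.where(dist == 0, 1.0, np.where(dist == 1, 0.7, 0.0))
    if model == "M3":
        r = int(np.ceil(p / 2.5))
        V = np.eye(p)
        for q in range(p // r):  # floor(p / r) diagonal blocks of size r
            V[r * q : r * (q + 1), r * q : r * (q + 1)] = 0.9
        np.fill_diagonal(V, 1.0)
        return sym_sqrt(V)
    raise ValueError("model must be 'M1', 'M2', or 'M3'")

def mean_matrix(kind, a, n, p):
    """f_1(a) (kind='sparse') or f_2(a) (kind='dense') as an (n, p) array."""
    j = np.arange(1, p + 1)
    row = a ** j if kind == "sparse" else a * (1.0 - j / p)
    return np.tile(row, (n, 1))

def generate(n, p, rho, a, model, kind, rng):
    """One draw of W = H A + f(a), with H_t = rho * H_{t-1} + eps_t."""
    eps = rng.standard_normal((n, p))
    H = np.empty((n, p))
    H[0] = eps[0]  # assumption: H_0 = 0, no burn-in
    for t in range(1, n):
        H[t] = rho * H[t - 1] + eps[t]
    return H @ loading_matrix(model, p) + mean_matrix(kind, a, n, p)
```

For example, `generate(200, 50, 0.1, 0.3, "M1", "sparse", np.random.default_rng(0))` returns a 200 by 50 array whose rows play the role of the observations.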

We assess the finite sample performance of our proposed test (denoted by

Yang

) in comparison with tests introduced by [5] (denoted by

Dempster

), [4] (denoted by

BS

), [6] (denoted by

SD

), and [8] (denoted by

CLX

). All tests in our simulations are conducted at the

5 %

significance level with 1000 Monte Carlo replications, and the number of bootstrap replications is set to 1000. We consider dimensions

p \in {50, 200, 400, 800}

and sample size pairs

(n_{1}, n_{2}) \in {(200, 220), (400, 420)}

.

5.3. Simulation Results

For the testing of the null hypothesis, consider independent generations of

{X_{t}}

and

{Y_{t}}

, following the same process as

W

, with identical values for

ρ

and

f (a) = 0

. The choice of

f (a) = 0

here is made for the sake of simplicity. We present only the simulation results for (M1) in the main body of the paper. The results for (M2) and (M3) are analogous to those for (M1) and are detailed in Appendix E.

Table 1 presents the performance of various methods in controlling Type I errors based on (M1). As the dimension p or sample size

(n_{1}, n_{2})

increases, the results of all methods exhibit small changes, except BS’s. When

ρ

equals 0, indicating samples are generated from independent Gaussian distributions, both Yang’s method and BS’s method effectively control Type I errors at around

5 %

, while the control achieved by the other three methods is less optimal. It is noteworthy that, with an increase in

ρ

, the data generated by the AR(1) model significantly influence the other methods. In contrast, Yang’s method demonstrates superior and more stable results with increasing

ρ

. These comparative effects are also observable in the results based on (M2) and (M3) in Appendix E. For this reason, we compare the empirical power of the different methods only for

ρ = 0

.

Figure 1 and Figure 2 depict the empirical power results of various methods for sparse and dense signals based on (M1). Similarly, as the dimension p increases, the results of all methods show little variation, except Dempster’s. However, with an increase in sample size

(n_{1}, n_{2})

, most methods exhibit improvement in their results. In Figure 1, it is evident that Yang’s method outperforms others significantly when the signal is sparse. Methods like SD, BS, and Dempster rely on the

ℓ_{2}

-norm of the data, aggregating signals across all dimensions for testing. This makes them less effective when the signal is sparse, i.e., when anomalies appear in only a few dimensions. CLX’s approach, akin to Yang’s, tests whether the largest signal is abnormal. Consequently, CLX performs better than the other three methods in scenarios with sparse signals but still falls short of Yang’s method. In contrast, when the signal is dense, Figure 2 shows that all methods yield favorable results, with Dempster’s method proving to be the most effective. Yang’s method performs at a relatively high level among these methods, whereas CLX’s method, which performs well for sparse signals, exhibits a relatively lower level of performance for dense signals. In conclusion, the proposed method exhibits the most stable performance among all methods and performs exceptionally well on sparse data.

6. Real Data Analysis

In this section, we apply the proposed method to a dataset comprised of stock data obtained from Bloomberg’s public database. This dataset includes daily opening prices from 1 January 2018 to 31 December 2021 for 30 companies in the Consumer Discretionary Sector (CDS) and 31 companies in the Information Technology Sector (ITS), all listed in the S&P 500. The sample sizes for the years 2018, 2019, 2020, and 2021 are 251, 250, 253, and 252, respectively. The findings are presented in Table 2. For both the CDS and the ITS, all p-values from the tests between two consecutive years are 0. This suggests a significant variation in the average annual opening prices across different years in both sectors.

For data visualization, Figure 3 displays the average annual opening prices of 30 companies in the CDS (left subgraph) and 31 companies in the ITS (right subgraph) in 2018, 2019, 2020, and 2021. Both subgraphs exhibit a pattern of annual growth in the opening prices of nearly every stock. These results are well in line with the conclusion of Table 2.

7. Discussion

In this paper, we propose a two-sample test for high-dimensional time series based on blockwise bootstrap. Our

ℓ_{\infty}

-type test statistic is designed to detect the largest abnormal signal among dimensions. Unlike some frameworks, we do not necessarily require independence within each observation or between the two sets of observations. Instead, we rely on the weak dependence property of the pair sequence

{(X_{t}, Y_{t})}

to ensure the asymptotic properties of our proposed method. We derive two Gaussian approximation results for two cases in which the dimension p diverges, one at a polynomial rate relative to the sample size n and the other at an exponential rate relative to the sample size n. In the bootstrap procedure, the block size serves as the tuning parameter, and we employ the minimum volatility method, as proposed by [27], for block size selection.

Our test statistic is designed to pinpoint the maximum value among dimensions, facilitating the detection of significant differences in certain dimensions. In cases where differences in each dimension are minimal, it is more appropriate to consider the

ℓ_{2}

-type test statistic rather than the

ℓ_{\infty}

-type one. Consequently, in the absence of prior information, the utilization of test statistics that combine both types proves advantageous. However, deriving theoretical results from such a combined approach is a significant challenge. As discussed in Section 4, our two-sample testing procedure can be applied to change point detection in high-dimensional time series. The choices of w, the size of each subsample mean, and the significance level

α

play crucial roles in this change point detection procedure. We leave these considerations for future research.

Funding

This research received no external funding.

Data Availability Statement

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Proof of Theorem 1

Appendix A.1. Proof of Theorem 1(i)

Proof.

We first show that, for any

τ > (q - 1) m / (m - q)

with some

q \in [2, ⌊ m ⌋]

,

\begin{matrix} max_{j \in [p]} E (| \sum_{t = 1}^{n} ξ_{t, j} |^{q}) ≲ n^{q / 2} . \end{matrix}

(A1)

If

q = 2

, due to

\sum_{κ = 1}^{\infty} α^{\frac{m - 2}{m}} (κ) ≲ \sum_{κ = 1}^{\infty} κ^{- \frac{(m - 2) τ}{m}} < \infty

, Equation (1.12b) (Davydov’s inequality) of [30] yields

\begin{matrix} E \{{(\sum_{t = 1}^{n} ξ_{t, j})}^{2}\} = & \sum_{t = 1}^{n} E (| ξ_{t, j} |^{2}) + \sum_{t_{1} \neq t_{2}} cov (ξ_{t_{1}, j}, ξ_{t_{2}, j}) \\ ≲ & n + \sum_{t_{1} \neq t_{2}} {E (| ξ_{t_{1}, j} |^{m} {)}}^{\frac{1}{m}} {E (| ξ_{t_{2}, j} |^{m} {)}}^{\frac{1}{m}} α^{\frac{m - 2}{m}} (| t_{1} - t_{2} |) \\ ≲ & n + n \sum_{κ = 1}^{n} α^{\frac{m - 2}{m}} (κ) ≲ n \end{matrix}

(A2)

for any

j \in [p]

. For

q > 2

and

j \in [p]

, Theorem 6.3 of [30] yields

\begin{matrix} E (| \sum_{t = 1}^{n} ξ_{t, j} |^{q}) \leq a_{q} s_{n, j}^{q} + n b_{q} \int_{0}^{1} {[α^{- 1} (u) \land n]}^{q - 1} {\{sup_{t \in [n]} Q_{t, j} (u)\}}^{q} d u, \end{matrix}

where

a_{q}, b_{q} > 0

are two constants depending only on q,

s_{n, j}^{2} = \sum_{t_{1}, t_{2} = 1}^{n} | Cov (ξ_{t_{1}, j}, ξ_{t_{2}, j}) |

,

α^{- 1} (u) = \sum_{κ \geq 0} 1 (u \leq α (κ))

and

Q_{t, j} (u) = inf {x : P (| ξ_{t, j} | > x) \leq u}

. By (A2), it holds that

s_{n, j}^{q} = {(s_{n, j}^{2})}^{q / 2} ≲ n^{q / 2}

. Due to

{max}_{t \in [n]} {max}_{j \in [p]} E (| ξ_{t, j} |^{m}) \leq C

, we have

{max}_{t \in [n]} {max}_{j \in [p]} Q_{t, j} (u) ≲ u^{- \frac{1}{m}}

. By the definition of

α^{- 1} (\cdot)

, we know that

α^{- 1} (u) ≲ u^{- \frac{1}{τ}}

. Thus

\int_{0}^{1} {[α^{- 1} (u) \land n]}^{q - 1} {\{sup_{t \in [n]} Q_{t, j} (u)\}}^{q} d u ≲ \int_{0}^{1} u^{- \frac{q - 1}{τ} - \frac{q}{m}} d u \leq C,

where the last inequality follows from

τ > (q - 1) m / (m - q)

. Hence, we have

E (| \sum_{t = 1}^{n} ξ_{t, j} |^{q}) ≲ n^{q / 2}

for any

j \in [p]

. Combining the above results completes the proof of (A1).
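For the reader's convenience, the role of the condition on τ in the last integral can be made explicit (a short check added for illustration, assuming q < m). The integrand is a power of u, which is integrable on (0, 1) exactly when its exponent exceeds -1, and

\begin{matrix} \frac{q - 1}{τ} + \frac{q}{m} < 1 \Longleftrightarrow \frac{q - 1}{τ} < \frac{m - q}{m} \Longleftrightarrow τ > \frac{(q - 1) m}{m - q} . \end{matrix}

Under this condition,

\begin{matrix} \int_{0}^{1} u^{- \frac{q - 1}{τ} - \frac{q}{m}} d u = {(1 - \frac{q - 1}{τ} - \frac{q}{m})}^{- 1}, \end{matrix}

which is a constant free of n and j. For q = 2 the condition reads τ > m / (m - 2), which is the same requirement that makes the series used for (A2) summable.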

Now, we begin to prove Theorem 1(i). Define

\begin{matrix} ω_{n} = sup_{x > 0} | P (max_{j \in [p]} S_{n, j} \leq x) - P (max_{j \in [p]} W_{j} \leq x) | . \end{matrix}

Let

{\overset{ˇ}{S}}_{n} = {({\overset{ˇ}{S}}_{n, 1}, \dots, {\overset{ˇ}{S}}_{n, 2 p})}^{T} = {(S_{n, 1}, - S_{n, 1}, \dots, S_{n, p}, - S_{n, p})}^{T}

and

\overset{ˇ}{W} = {({\overset{ˇ}{W}}_{1}, \dots, {\overset{ˇ}{W}}_{2 p})}^{T} = {(W_{1}, - W_{1}, \dots, W_{p}, - W_{p})}^{T}

. Then, we have

{max}_{j \in [p]} | S_{n, j} | = {max}_{j \in [2 p]} {\overset{ˇ}{S}}_{n, j}

and

{max}_{j \in [p]} | W_{j} | = {max}_{j \in [2 p]} {\overset{ˇ}{W}}_{j}

. Then, to obtain Theorem 1(i), without loss of generality, it suffices to specify the convergence rate of

ω_{n}

.

For some constant

ς \in (0, 1)

, let

B_{n} = ⌊ n^{ς} ⌋

and

K_{n} = ⌈ n / B_{n} ⌉

be the number of blocks and the size of each block, respectively. For simplicity, we assume

B_{n} ≍ n^{ς}

and

K_{n} = n / B_{n} ≍ n^{1 - ς}

. We first decompose the sequence

{1, \dots, n}

into

B_{n}

blocks:

G_{b} = {(b - 1) K_{n} + 1, \dots, b K_{n}}

for

b \in [B_{n}]

. Let

g_{n} ≫ k_{n}

be two non-negative integers such that

K_{n} = g_{n} + k_{n}

. We then decompose each

G_{b} (b \in [B_{n}])

into a “large” block

I_{b}

with length

g_{n}

and a “small” block

J_{b}

with length

k_{n}

:

I_{b} = {(b - 1) K_{n} + 1, \dots, b K_{n} - k_{n}}

and

J_{b} = {b K_{n} - k_{n} + 1, \dots, b K_{n}}

. Let

H_{b} = {(H_{b, 1}, \dots, H_{b, p})}^{T} = K_{n}^{- 1 / 2} \sum_{t \in I_{b}} ξ_{t}

. For each

b \in [B_{n}]

and some

D_{n} \to \infty

, define

H_{b}^{+} = {(H_{b, 1}^{+}, \dots, H_{b, p}^{+})}^{T}

with

H_{b, j}^{+} = H_{b, j} 1 (| H_{b, j} | \leq D_{n}) - E {H_{b, j} 1 (| H_{b, j} | \leq D_{n})}

and

H_{b}^{-} = {(H_{b, 1}^{-}, \dots, H_{b, p}^{-})}^{T}

with

H_{b, j}^{-} = H_{b, j} 1 (| H_{b, j} | > D_{n}) - E {H_{b, j} 1 (| H_{b, j} | > D_{n})}

. For each

j \in [p]

, by Theorem 2 of [17], there exists an independent sequence

{{\tilde{H}}_{b, j}}_{b = 1}^{B_{n}}

such that

{\tilde{H}}_{b, j}

has the same distribution as

H_{b, j}^{+}

and

\begin{matrix} E (| {\tilde{H}}_{b, j} - H_{b, j}^{+} |) ≲ \int_{0}^{α (k_{n})} inf {x \in R : P (| H_{b, j}^{+} | > x) \leq u} d u . \end{matrix}

Due to

| H_{b, j}^{+} | \leq 2 D_{n}

, we have

inf {x \in R : P (| H_{b, j}^{+} | > x) \leq u} ≲ D_{n}

for any

u \geq 0

, which implies

\begin{matrix} E (| {\tilde{H}}_{b, j} - H_{b, j}^{+} |) ≲ D_{n} α (k_{n}) . \end{matrix}

(A3)
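The big-block/small-block decomposition used above can be visualized with a short Python sketch. The helper `big_small_blocks` is hypothetical and follows the simplification that the block length K_n = n / B_n is an integer. It returns, for each b, the 1-indexed sets G_b, I_b, and J_b.

```python
def big_small_blocks(n, B, k):
    """Return [(G_b, I_b, J_b)] with 1-indexed time points.

    n: sample size, B: number of blocks, k: length of each small block.
    Assumes n is divisible by B, mirroring K_n = n / B_n in the text.
    """
    if n % B != 0:
        raise ValueError("n must be divisible by B in this sketch")
    K = n // B
    if not 0 <= k < K:
        raise ValueError("k must satisfy 0 <= k < K")
    blocks = []
    for b in range(1, B + 1):
        G = list(range((b - 1) * K + 1, b * K + 1))
        I = G[: K - k]   # large block of length g = K - k
        J = G[K - k :]   # small block of length k
        blocks.append((G, I, J))
    return blocks
```

With n = 20, B = 4, and k = 1, the first block splits into I_1 = {1, 2, 3, 4} and J_1 = {5}, so consecutive large blocks are separated by the k indices of the intervening small block. This separation is what lets the coupling error be controlled through the mixing coefficient evaluated at k_n.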

Define

{\tilde{S}}_{n} = {({\tilde{S}}_{n, 1}, \dots, {\tilde{S}}_{n, p})}^{T} = B_{n}^{- 1 / 2} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b}

with

{\tilde{S}}_{n, j} = B_{n}^{- 1 / 2} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}

and

\begin{matrix} {\tilde{ω}}_{n} = sup_{x > 0} | P (max_{j \in [p]} {\tilde{S}}_{n, j} \leq x) - P (max_{j \in [p]} W_{j} \leq x) | . \end{matrix}

(A4)

For any

ϵ_{1} > 0

, the triangle inequality implies

\begin{matrix} P (max_{j \in [p]} S_{n, j} \leq x) \leq P (max_{j \in [p]} {\tilde{S}}_{n, j} \leq x + ϵ_{1}) + P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{1}) \\ \leq & P (max_{j \in [p]} W_{j} \leq x + ϵ_{1}) + {\tilde{ω}}_{n} + P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{1}) \\ \leq & P (max_{j \in [p]} W_{j} \leq x) + P (x - ϵ_{1} < max_{j \in [p]} W_{j} \leq x + ϵ_{1}) + {\tilde{ω}}_{n} + P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{1}) \end{matrix}

for any

x > 0

, then

P ({max}_{j \in [p]} S_{n, j} \leq x) - P ({max}_{j \in [p]} W_{j} \leq x) \leq P (x - ϵ_{1} < {max}_{j \in [p]} W_{j} \leq x + ϵ_{1}) + {\tilde{ω}}_{n} + P (| {max}_{j \in [p]} S_{n, j} - {max}_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{1})

. Likewise,

P ({max}_{j \in [p]} S_{n, j} \leq x) - P ({max}_{j \in [p]} W_{j} \leq x) \geq - P (x - ϵ_{1} < {max}_{j \in [p]} W_{j} \leq x + ϵ_{1}) - {\tilde{ω}}_{n} - P (| {max}_{j \in [p]} S_{n, j} - {max}_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{1})

. Due to

{min}_{j \in [p]} {(Σ_{n})}_{j, j} \geq λ_{min} (Σ_{n}) \geq c

, Lemma A.1 of [31] yields

\begin{matrix} sup_{x \in R} P (x - ϵ_{1} < max_{j \in [p]} W_{j} \leq x + ϵ_{1}) ≲ ϵ_{1} {(log p)}^{1 / 2} \end{matrix}

for any

ϵ_{1} > 0

. Thus, we can conclude that

\begin{matrix} ω_{n} ≲ {\tilde{ω}}_{n} + P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{1}) + ϵ_{1} {(log p)}^{1 / 2} . \end{matrix}

(A5)

Define

S_{n}^{+} = {(S_{n, 1}^{+}, \dots, S_{n, p}^{+})}^{T} = B_{n}^{- 1 / 2} \sum_{b = 1}^{B_{n}} H_{b}^{+}

. By the triangle inequality,

\begin{matrix} | max_{j \in [p]} S_{n, j} - max_{j \in [p]} S_{n, j}^{+} | \leq max_{j \in [p]} | S_{n, j} - S_{n, j}^{+} | \leq max_{j \in [p]} | \frac{1}{n^{1 / 2}} \sum_{b = 1}^{B_{n}} \sum_{t \in J_{b}} ξ_{t, j} | + max_{j \in [p]} | \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} H_{b, j}^{-} | . \end{matrix}

By (A1), we have

E (| H_{b, j} |^{3}) \leq C

. Thus

E (| H_{b, j}^{-} |^{3}) ≲ E (| H_{b, j} |^{3}) \leq C

, and

\begin{matrix} E (| H_{b, j}^{-} |^{2}) ≲ E {| H_{b, j} |^{2} 1 (| H_{b, j} | > D_{n})} \leq E (| H_{b, j} |^{3}) D_{n}^{- 1} ≲ D_{n}^{- 1} . \end{matrix}

(A6)

Similar to (A2), we have

E (| \sum_{b = 1}^{B_{n}} \sum_{t \in J_{b}} ξ_{t, j} |) ≲ B_{n}^{1 / 2} k_{n}^{1 / 2}

for any

j \in [p]

, and

\begin{matrix} E \{{(\sum_{b = 1}^{B_{n}} H_{b, j}^{-})}^{2}\} = \sum_{b = 1}^{B_{n}} E (| H_{b, j}^{-} |^{2}) + \sum_{b_{1} \neq b_{2}} cov (H_{b_{1}, j}^{-}, H_{b_{2}, j}^{-}) \\ ≲ & B_{n} D_{n}^{- 1} + \sum_{b_{1} \neq b_{2}} α^{\frac{1}{3}} \{k_{n} 1 (| b_{1} - b_{2} | = 1) + | b_{2} - b_{1} - 1 | K_{n} 1 (| b_{1} - b_{2} | > 1)\} \\ ≲ & B_{n} D_{n}^{- 1} + \sum_{| b_{1} - b_{2} | = 1} α^{\frac{1}{3}} (k_{n}) + \sum_{| b_{1} - b_{2} | > 1} α^{\frac{1}{3}} (| b_{1} - b_{2} - 1 | K_{n}) ≲ B_{n} D_{n}^{- 1} + B_{n} k_{n}^{- \frac{τ}{3}}, \end{matrix}

(A7)

where the last inequality follows from

τ > 3

. Thus,

E (| \sum_{b = 1}^{B_{n}} H_{b, j}^{-} |) ≲ B_{n}^{1 / 2} (D_{n}^{- 1 / 2} + k_{n}^{- τ / 6})

and

\begin{matrix} E (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} S_{n, j}^{+} |) ≲ \frac{p k_{n}^{1 / 2}}{K_{n}^{1 / 2}} + \frac{p}{D_{n}^{1 / 2}} + \frac{p}{k_{n}^{τ / 6}} . \end{matrix}

(A8)

Due to

{\tilde{H}}_{b, j}

having the same distribution as

H_{b, j}^{+}

and

| H_{b, j}^{+} | \leq 2 D_{n}

, by (A3), we have

E (| {\tilde{H}}_{b, j} - H_{b, j}^{+} |^{s}) ≲ D_{n}^{s} k_{n}^{- τ}

for

s \in {2, 3}

. Thus, following the same arguments as in the proof of (A7), it holds that

\begin{matrix} E [{\{\sum_{b = 1}^{B_{n}} ({\tilde{H}}_{b, j} - H_{b, j}^{+})\}}^{2}] \\ ≲ & \sum_{b = 1}^{B_{n}} D_{n}^{2} k_{n}^{- τ} + D_{n}^{2} k_{n}^{- 2 τ / 3} \{\sum_{| b_{1} - b_{2} | = 1} α^{\frac{1}{3}} (k_{n}) + \sum_{| b_{1} - b_{2} | > 1} α^{\frac{1}{3}} (| b_{1} - b_{2} - 1 | K_{n})\} \\ ≲ & B_{n} D_{n}^{2} k_{n}^{- τ} . \end{matrix}

Thus,

E (| \sum_{b = 1}^{B_{n}} ({\tilde{H}}_{b, j} - H_{b, j}^{+}) |) ≲ B_{n}^{1 / 2} D_{n} k_{n}^{- τ / 2}

and

\begin{matrix} E (| max_{j \in [p]} {\tilde{S}}_{n, j} - max_{j \in [p]} S_{n, j}^{+} |) \leq E (max_{j \in [p]} | {\tilde{S}}_{n, j} - S_{n, j}^{+} |) ≲ \frac{p D_{n}}{k_{n}^{τ / 2}} . \end{matrix}

Together with (A8), we have

E (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} |) ≲ \frac{p k_{n}^{1 / 2}}{K_{n}^{1 / 2}} + \frac{p}{D_{n}^{1 / 2}} + \frac{p}{k_{n}^{τ / 6}} + \frac{p D_{n}}{k_{n}^{τ / 2}} .

Let

ϵ_{1} = p^{1 / 2} {(log p)}^{- 1 / 4} {(k_{n}^{1 / 2} K_{n}^{- 1 / 2} + D_{n}^{- 1 / 2} + k_{n}^{- τ / 6} + D_{n} k_{n}^{- τ / 2})}^{1 / 2}

. It holds by (A5) and Markov’s inequality that

\begin{matrix} ω_{n} ≲ {\tilde{ω}}_{n} + p^{1 / 2} {(log p)}^{1 / 4} {(\frac{k_{n}^{1 / 2}}{K_{n}^{1 / 2}} + \frac{1}{D_{n}^{1 / 2}} + \frac{1}{k_{n}^{τ / 6}} + \frac{D_{n}}{k_{n}^{τ / 2}})}^{1 / 2} . \end{matrix}

(A9)

Define

{\tilde{Σ}}_{G} = B_{n}^{- 1} \sum_{b = 1}^{B_{n}} var ({\tilde{H}}_{b})

and

Δ = | Σ_{n} - {\tilde{Σ}}_{G} |

, where

Σ_{n} = E (S_{n} S_{n}^{T})

. Note that

\begin{matrix} Δ = & | \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} \{var ({\tilde{H}}_{b}) - var (H_{b}^{+})\} + \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} \{var (H_{b}^{+}) - var (H_{b})\} + \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} var (H_{b}) - Σ_{n} | \\ \leq & \underset{Δ_{1}}{\underset{︸}{\frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} | var ({\tilde{H}}_{b}) - var (H_{b}^{+}) |}} + \underset{Δ_{2}}{\underset{︸}{\frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} | var (H_{b}^{+}) - var (H_{b}) |}} + \underset{Δ_{3}}{\underset{︸}{| \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} var (H_{b}) - Σ_{n} |}} . \end{matrix}

(A10)

In the sequel, we specify the convergence rates of

| Δ_{1} |_{\infty}

,

| Δ_{2} |_{\infty}

, and

| Δ_{3} |_{\infty}

, respectively. Note that the

(i, j)

-th element of

var ({\tilde{H}}_{b}) - var (H_{b}^{+})

is

E ({\tilde{H}}_{b, i} {\tilde{H}}_{b, j} - H_{b, i}^{+} H_{b, j}^{+})

. Due to

{\tilde{H}}_{b, j}

having the same distribution as

H_{b, j}^{+}

and

| H_{b, j}^{+} | ≲ D_{n}

for any

b \in [B_{n}]

and

j \in [p]

, it holds by (A3) that

\begin{matrix} | E ({\tilde{H}}_{b, i} {\tilde{H}}_{b, j} - H_{b, i}^{+} H_{b, j}^{+}) | \leq | E {({\tilde{H}}_{b, i} - H_{b, i}^{+}) {\tilde{H}}_{b, j}} | + | E {({\tilde{H}}_{b, j} - H_{b, j}^{+}) H_{b, i}^{+}} | ≲ D_{n}^{2} k_{n}^{- τ} \end{matrix}

for any

b \in [B_{n}]

and

i, j \in [p]

. Thus, we can conclude that

| Δ_{1} |_{\infty} ≲ D_{n}^{2} k_{n}^{- τ}

. The

(i, j)

-th element of

var (H_{b}^{+}) - var (H_{b})

is

E (H_{b, i}^{+} H_{b, j}^{+} - H_{b, i} H_{b, j})

. Note that

E (| H_{b, j}^{-} |) ≲ E {| H_{b, j} | 1 (| H_{b, j} | > D_{n})} \leq E (| H_{b, j} |^{3}) D_{n}^{- 2} ≲ D_{n}^{- 2}

. Due to

H_{b, j} = H_{b, j}^{+} + H_{b, j}^{-}

, it holds by (A6) that

\begin{matrix} | E (H_{b, i}^{+} H_{b, j}^{+} - H_{b, i} H_{b, j}) | = & | E {H_{b, i}^{+} H_{b, j}^{+} - (H_{b, i}^{+} + H_{b, i}^{-}) (H_{b, j}^{+} + H_{b, j}^{-})} | \\ \leq & | E (H_{b, i}^{+} H_{b, j}^{-}) | + | E (H_{b, j}^{+} H_{b, i}^{-}) | + | E (H_{b, i}^{-} H_{b, j}^{-}) | ≲ D_{n}^{- 1} \end{matrix}

for any

b \in [B_{n}]

and

i, j \in [p]

. Thus, we can conclude that

| Δ_{2} |_{\infty} ≲ D_{n}^{- 1}

. The

(i, j)

-th element of

Σ_{n} - {B_{n}}^{- 1} \sum_{b = 1}^{B_{n}} var (H_{b})

is

n^{- 1} \sum_{t_{1}, t_{2} = 1}^{n} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) - n^{- 1} \sum_{b = 1}^{B_{n}} \sum_{t_{1}, t_{2} \in I_{b}} E (ξ_{t_{1}, i} ξ_{t_{2}, j})

, and

\begin{matrix} | \frac{1}{n} \sum_{t_{1}, t_{2} = 1}^{n} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) - \frac{1}{n} \sum_{b = 1}^{B_{n}} \sum_{t_{1}, t_{2} \in I_{b}} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) | \\ = & \frac{1}{n} | \sum_{b_{1} \neq b_{2}} E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} \\ + \sum_{b = 1}^{B_{n}} E \{(\sum_{t \in I_{b}} ξ_{t, i}) (\sum_{t \in J_{b}} ξ_{t, j}) + (\sum_{t \in J_{b}} ξ_{t, i}) (\sum_{t \in G_{b}} ξ_{t, j})\} | . \end{matrix}

(A11)

Similar to the proof of (A2), we have

\begin{matrix} | E \{(\sum_{t \in J_{b}} ξ_{t, i}) (\sum_{t \in G_{b}} ξ_{t, j})\} | \\ = & | \sum_{t \in J_{b}} cov (ξ_{t, i}, ξ_{t, j}) + \sum_{t_{1} \neq t_{2} : t_{1}, t_{2} \in J_{b}} cov (ξ_{t_{1}, i}, ξ_{t_{2}, j}) + \sum_{t_{1} \in J_{b}} \sum_{t_{2} \in I_{b}} cov (ξ_{t_{1}, i}, ξ_{t_{2}, j}) | \\ ≲ & k_{n} + \sum_{t_{1} \neq t_{2} : t_{1}, t_{2} \in J_{b}} {E (| ξ_{t_{1}, i} |^{3} {)}}^{\frac{1}{3}} {E (| ξ_{t_{2}, j} |^{3} {)}}^{\frac{1}{3}} α^{\frac{1}{3}} (| t_{1} - t_{2} |) \\ + \sum_{t_{1} \in J_{b}} \sum_{t_{2} \in I_{b}} {E (| ξ_{t_{1}, i} |^{3} {)}}^{\frac{1}{3}} {E (| ξ_{t_{2}, j} |^{3} {)}}^{\frac{1}{3}} α^{\frac{1}{3}} (| t_{1} - t_{2} |) ≲ k_{n} . \end{matrix}

Similarly, we can also obtain

| E \{(\sum_{t \in I_{b}} ξ_{t, i}) (\sum_{t \in J_{b}} ξ_{t, j})\} | ≲ k_{n} .

Thus,

\begin{matrix} | \sum_{b = 1}^{B_{n}} E \{(\sum_{t \in I_{b}} ξ_{t, i}) (\sum_{t \in J_{b}} ξ_{t, j}) + (\sum_{t \in J_{b}} ξ_{t, i}) (\sum_{t \in G_{b}} ξ_{t, j})\} | ≲ k_{n} B_{n} . \end{matrix}

Analogously to the proof of (A2), if

b_{1} < b_{2}

, due to

τ > 2 m / (m - 3)

,

\begin{matrix} | E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} | ≲ & \sum_{t_{1} \in G_{b_{1}}} \sum_{t_{2} \in G_{b_{2}}} {E (| ξ_{t_{1}, i} |^{m} {)}}^{\frac{1}{m}} {E (| ξ_{t_{2}, j} |^{m} {)}}^{\frac{1}{m}} α^{\frac{m - 2}{m}} (| t_{1} - t_{2} |) \\ ≲ & \sum_{δ = 1}^{K_{n}} δ α^{\frac{m - 2}{m}} {(b_{2} - b_{1} - 1) K_{n} + δ} \\ ≲ & 1 (b_{2} - b_{1} = 1) + K_{n}^{2} α^{\frac{m - 2}{m}} {(b_{2} - b_{1} - 1) K_{n}} 1 (b_{2} - b_{1} > 1) . \end{matrix}

Then,

\begin{matrix} \sum_{b_{1} < b_{2}} | E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} | ≲ B_{n} + B_{n} K_{n}^{\frac{2 m - (m - 2) τ}{m}} \sum_{δ = 1}^{B_{n}} δ^{\frac{- (m - 2) τ}{m}} ≲ B_{n} . \end{matrix}

The same result still holds for

b_{1} > b_{2}

. Thus, we can conclude that

\begin{matrix} \sum_{b_{1} \neq b_{2}} | E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} | ≲ B_{n} . \end{matrix}

Then, by (A11), it holds that

\begin{matrix} | \frac{1}{n} \sum_{t_{1}, t_{2} = 1}^{n} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) - \frac{1}{n} \sum_{b = 1}^{B_{n}} \sum_{t_{1}, t_{2} \in I_{b}} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) | ≲ \frac{k_{n}}{K_{n}} \end{matrix}

for any

i, j \in [p]

. Thus,

| Δ_{3} |_{\infty} ≲ k_{n} K_{n}^{- 1}

. By (A10), we can conclude that

\begin{matrix} {| Δ |}_{\infty} \leq | Δ_{1} |_{\infty} + | Δ_{2} |_{\infty} + {| Δ_{3} |}_{\infty} ≲ \frac{D_{n}^{2}}{k_{n}^{τ}} + \frac{1}{D_{n}} + \frac{k_{n}}{K_{n}} . \end{matrix}

Let

{{\tilde{H}}_{b}^{G}}_{b = 1}^{B_{n}}

be a sequence of independent Gaussian vectors such that

{\tilde{H}}_{b}^{G} = {({\tilde{H}}_{b, 1}^{G}, \dots, {\tilde{H}}_{b, p}^{G})}^{T}

\sim N {0_{p}, var ({\tilde{H}}_{b})}

for each

b \in [B_{n}]

, where

{\tilde{H}}_{b} = {({\tilde{H}}_{b, 1}, \dots, {\tilde{H}}_{b, p})}^{T}

. By Theorem 1.1 of [15], the Cauchy–Schwarz inequality, and Jensen’s inequality,

\begin{matrix} sup_{x > 0} | P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j} \leq x) - P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}^{G} \leq x) | \\ ≲ p^{1 / 4} \cdot \{\sum_{b = 1}^{B_{n}} E (| {\tilde{Σ}}_{G}^{- 1 / 2} B_{n}^{- 1 / 2} {\tilde{H}}_{b} |_{2}^{3})\} \\ ≲ \frac{p^{1 / 4}}{B_{n}^{3 / 2}} {∥ {\tilde{Σ}}_{G}^{- 1 / 2} ∥}_{2}^{3} \cdot [\sum_{b = 1}^{B_{n}} E \{{(\sum_{j = 1}^{p} {\tilde{H}}_{b, j}^{2})}^{3 / 2}\}] \\ ≲ \frac{p^{7 / 4}}{B_{n}^{3 / 2}} {∥ {\tilde{Σ}}_{G}^{- 1 / 2} ∥}_{2}^{3} \cdot \{\sum_{b = 1}^{B_{n}} max_{j \in [p]} E (| {\tilde{H}}_{b, j} |^{3})\}, \end{matrix}

where

{\tilde{Σ}}_{G} = B_{n}^{- 1} \sum_{b = 1}^{B_{n}} var ({\tilde{H}}_{b})

. Note that

| λ_{min} ({\tilde{Σ}}_{G}) - λ_{min} (Σ_{n}) | \leq {∥ Δ ∥}_{2} \leq p {| Δ |}_{\infty} .

Due to

λ_{min} (Σ_{n}) \geq c

, we have

λ_{min} ({\tilde{Σ}}_{G}) \geq c

as long as

p {| Δ |}_{\infty} = o (1)

. Thus, if

p {| Δ |}_{\infty} = o (1)

, we have

∥ {\tilde{Σ}}_{G}^{- 1 / 2} ∥_{2} \leq C

. Since

H_{b, j} = K_{n}^{- 1 / 2} \sum_{t \in I_{b}} ξ_{t, j}

, (A1) yields

E (| {\tilde{H}}_{b, j} |^{3}) = E (| H_{b, j}^{+} |^{3}) ≲ E (| H_{b, j} |^{3}) \leq C

for any

b \in [B_{n}]

and

j \in [p]

, which implies

\begin{matrix} sup_{x > 0} | P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j} \leq x) - P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}^{G} \leq x) | ≲ \frac{p^{7 / 4}}{B_{n}^{1 / 2}} \end{matrix}

(A12)

provided that

p {| Δ |}_{\infty} = o (1)

. By Proposition 2.1 of [16], we have

\begin{matrix} sup_{x > 0} | P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}^{G} \leq x) - P (max_{j \in [p]} W_{j} \leq x) | ≲ {| Δ |}_{\infty}^{1 / 2} log p . \end{matrix}

(A13)

Then, by (A4), (A12), and (A13), we have

\begin{matrix} {\tilde{ω}}_{n} ≲ \frac{p^{7 / 4}}{B_{n}^{1 / 2}} + {| Δ |}_{\infty}^{1 / 2} log p \end{matrix}

provided that

p {| Δ |}_{\infty} = o (1)

. Together with (A9),

\begin{matrix} ω_{n} ≲ & \frac{p^{7 / 4}}{B_{n}^{1 / 2}} + {| Δ |}_{\infty}^{1 / 2} log p + p^{1 / 2} {(log p)}^{1 / 4} {(\frac{k_{n}^{1 / 2}}{K_{n}^{1 / 2}} + \frac{1}{D_{n}^{1 / 2}} + \frac{1}{k_{n}^{τ / 6}} + \frac{D_{n}}{k_{n}^{τ / 2}})}^{1 / 2} \end{matrix}

provided that

p {| Δ |}_{\infty} = o (1)

. Select

D_{n} ≍ n^{4 τ / (11 τ + 12)}

,

k_{n} ≍ n^{12 / (11 τ + 12)}

, and

ς = 7 τ / (11 τ + 12)

. Then, if

p = o {n^{2 τ / (11 τ + 12)}}

, we have

\begin{matrix} ω_{n} ≲ \frac{p^{7 / 4}}{n^{7 τ / (22 τ + 24)}} + \frac{log p}{n^{2 τ / (11 τ + 12)}} + \frac{p^{1 / 2} {(log p)}^{1 / 4}}{n^{τ / (11 τ + 12)}} ≲ \frac{p^{1 / 2} {(log p)}^{1 / 4}}{n^{τ / (11 τ + 12)}} . \end{matrix}

Hence, we complete the proof of Theorem 1(i). □
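The choice of D_n, k_n, and ς above balances the error terms. The following Python sketch checks this with exact rational arithmetic, writing each quantity as a power of n and ignoring p and logarithmic factors. The function `rate_exponents` is a hypothetical helper, and the bound on α is assumed to be the polynomial decay used in the proof.

```python
from fractions import Fraction

def rate_exponents(tau):
    """Exponents of n for the selected D_n, k_n, and varsigma.

    Returns the exponents of the four terms inside the last bracket of (A9),
    the exponent of the bound on |Delta|_inf, and varsigma.
    """
    tau = Fraction(tau)
    d = 11 * tau + 12
    D = 4 * tau / d             # D_n = n^D
    k = Fraction(12) / d        # k_n = n^k
    varsigma = 7 * tau / d      # B_n = n^varsigma
    K = 1 - varsigma            # K_n = n^K
    coupling = {
        "(k_n / K_n)^{1/2}": (k - K) / 2,
        "D_n^{-1/2}": -D / 2,
        "k_n^{-tau/6}": -tau * k / 6,
        "D_n k_n^{-tau/2}": D - tau * k / 2,
    }
    # |Delta|_inf <~ D_n^2 k_n^{-tau} + D_n^{-1} + k_n / K_n
    delta = max(2 * D - tau * k, -D, k - K)
    return coupling, delta, varsigma
```

For every admissible τ the four terms share the exponent -2τ/(11τ + 12), so their square root contributes the factor n^{-τ/(11τ + 12)} in the final rate, and the bound on the covariance error has exponent -4τ/(11τ + 12), which gives the middle term of the display above.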

Appendix A.2. Proof of Theorem 1(ii)

Proof.

Define

{(G_{b}, I_{b}, J_{b})}_{b = 1}^{B_{n}}

,

{H_{b}^{+}}_{b = 1}^{B_{n}}

, and

{H_{b}^{-}}_{b = 1}^{B_{n}}

in the same manner as in the proof of Theorem 1(i) with

B_{n} ≍ n^{ς}

,

K_{n} ≍ n^{1 - ς}

,

k_{n} ≪ n^{1 - ς}

and

D_{n} \to \infty

, where

ς \in (0, 1)

. Let

\begin{matrix} ω_{n} = sup_{x > 0} | P (max_{j \in [p]} S_{n, j} \leq x) - P (max_{j \in [p]} W_{j} \leq x) | . \end{matrix}

Analogously to (A5), due to

{min}_{j \in [p]} {(Σ_{n})}_{j, j} > c

, we have

\begin{matrix} ω_{n} ≲ {\tilde{ω}}_{n} + P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} | > ϵ_{2}) + ϵ_{2} {(log p)}^{1 / 2} \end{matrix}

(A14)

for some

ϵ_{2} > 0

, where

{\tilde{S}}_{n, j} = B_{n}^{- 1 / 2} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}

with

{{\tilde{H}}_{b, j}}

specified in the same manner as in the proof of Theorem 1(i), and

\begin{matrix} {\tilde{ω}}_{n} = sup_{x > 0} | P (max_{j \in [p]} {\tilde{S}}_{n, j} \leq x) - P (max_{j \in [p]} W_{j} \leq x) | . \end{matrix}

Define

S_{n}^{+} = {(S_{n, 1}^{+}, \dots, S_{n, p}^{+})}^{T} = B_{n}^{- 1 / 2} \sum_{b = 1}^{B_{n}} H_{b}^{+}

. By the triangle inequality,

\begin{matrix} | max_{j \in [p]} S_{n, j} - max_{j \in [p]} S_{n, j}^{+} | \leq max_{j \in [p]} | S_{n, j} - S_{n, j}^{+} | \leq max_{j \in [p]} | \frac{1}{n^{1 / 2}} \sum_{b = 1}^{B_{n}} \sum_{t \in J_{b}} ξ_{t, j} | + max_{j \in [p]} | \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} H_{b, j}^{-} | . \end{matrix}

Note that

P (| ξ_{t, j} {M_{n}}^{- 1} | > x) ≲ exp (- C x^{γ_{1}})

for any

x > 0

. Let

\tilde{γ} = {(1 / γ_{1} + 1 / γ_{2})}^{- 1}

. By Theorem 1 of [32] and the Bonferroni inequality, we have

\begin{matrix} P (max_{j \in [p]} | \frac{1}{n^{1 / 2}} \sum_{b = 1}^{B_{n}} \sum_{t \in J_{b}} ξ_{t, j} | > x) ≲ p B_{n} k_{n} exp (- \frac{C n^{\tilde{γ} / 2} x^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + p exp (- \frac{C n x^{2}}{M_{n}^{2} B_{n} k_{n}}) \end{matrix}

(A15)

for any

x ≫ M_{n} n^{- 1 / 2}

. Similarly, by Theorem 1 of [32] again, for any

x ≫ M_{n} K_{n}^{- 1 / 2}

,

\begin{matrix} P (| H_{b, j} | > x) = P (| \frac{1}{K_{n}^{1 / 2}} \sum_{t \in I_{b}} ξ_{t, j} | > x) ≲ K_{n} exp (- \frac{C K_{n}^{\tilde{γ} / 2} x^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + exp (- \frac{C x^{2}}{M_{n}^{2}}) . \end{matrix}

Then, if

D_{n} > M_{n}

,

\begin{matrix} E {H_{b, j}^{2} 1 (| H_{b, j} | > D_{n})} = & 2 \int_{0}^{D_{n}} x P (| H_{b, j} | > D_{n}) d x + 2 \int_{D_{n}}^{\infty} x P (| H_{b, j} | > x) d x \\ ≲ & D_{n}^{2} \{K_{n} exp (- \frac{C K_{n}^{\tilde{γ} / 2} D_{n}^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + exp (- \frac{C D_{n}^{2}}{M_{n}^{2}})\} \\ + K_{n} \int_{D_{n}}^{\infty} x exp (- \frac{C K_{n}^{\tilde{γ} / 2} x^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) d x + \int_{D_{n}}^{\infty} x exp (- \frac{C x^{2}}{M_{n}^{2}}) d x \\ ≲ & D_{n}^{2} \{K_{n} exp (- \frac{C K_{n}^{\tilde{γ} / 2} D_{n}^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + exp (- \frac{C D_{n}^{2}}{M_{n}^{2}})\} . \end{matrix}

Thus, for any

b \in [B_{n}]

and

j \in [p]

,

\begin{matrix} E (| H_{b, j}^{-} |^{2}) ≲ E {H_{b, j}^{2} 1 (| H_{b, j} | > D_{n})} ≲ D_{n}^{2} \{K_{n} exp (- \frac{C K_{n}^{\tilde{γ} / 2} D_{n}^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + exp (- \frac{C D_{n}^{2}}{M_{n}^{2}})\} . \end{matrix}

(A16)

Select

D_{n} = C^{*} M_{n} {log (p n)}^{1 / 2}

for some sufficiently large constant

C^{*} > 0

. Thus, for any

x \geq 0

,

\begin{matrix} P (max_{j \in [p]} | \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} H_{b, j}^{-} | > x) \leq \frac{p B_{n}^{1 / 2} {max}_{j \in [p]} {max}_{b \in [B_{n}]} E (| H_{b, j}^{-} |)}{x} ≲ \frac{{(p n)}^{- 1}}{x} \end{matrix}

provided that

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Then, by (A15), we can conclude that for any

x ≫ M_{n} n^{- 1 / 2}

,

\begin{matrix} P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} S_{n, j}^{+} | > x) ≲ p B_{n} k_{n} exp (- \frac{C n^{\tilde{γ} / 2} x^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + p exp (- \frac{C n x^{2}}{M_{n}^{2} B_{n} k_{n}}) + \frac{{(p n)}^{- 1}}{x} \end{matrix}

(A17)

provided that

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Similar to (A3), we have

\begin{matrix} E (| {\tilde{H}}_{b, j} - H_{b, j}^{+} |) ≲ D_{n} α (k_{n}) ≲ D_{n} exp (- C k_{n}^{γ_{2}}) . \end{matrix}

(A18)

Select

k_{n} = C^{* *} {log (p n)}^{1 / γ_{2}}

for some sufficiently large constant

C^{* *} > 0

. By (A18) and the triangle inequality,

\begin{matrix} P (| max_{j \in [p]} {\tilde{S}}_{n, j} - max_{j \in [p]} S_{n, j}^{+} | > x) \leq \frac{p B_{n}^{1 / 2} {max}_{b \in [B_{n}]} {max}_{j \in [p]} E (| {\tilde{H}}_{b, j} - H_{b, j}^{+} |)}{x} ≲ \frac{{(p n)}^{- 1}}{x} \end{matrix}

for any

x \geq 0

. Thus, by (A17), for any

x ≫ M_{n} n^{- 1 / 2}

,

\begin{matrix} P (| max_{j \in [p]} S_{n, j} - max_{j \in [p]} {\tilde{S}}_{n, j} | > x) ≲ p B_{n} k_{n} exp (- \frac{C n^{\tilde{γ} / 2} x^{\tilde{γ}}}{M_{n}^{\tilde{γ}}}) + p exp (- \frac{C n x^{2}}{M_{n}^{2} B_{n} k_{n}}) + \frac{{(p n)}^{- 1}}{x} \end{matrix}

(A19)

provided that

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Let

ϵ_{2} = C^{* * *} M_{n} k_{n}^{1 / 2} K_{n}^{- 1 / 2} {log (p n)}^{1 / 2}

for some sufficiently large constant

C^{* * *} > 0

. It holds by (A14) that

\begin{matrix} ω_{n} ≲ {\tilde{ω}}_{n} + \frac{M_{n} {log (p n)}^{(2 γ_{2} + 1) / 2 γ_{2}}}{K_{n}^{1 / 2}} \end{matrix}

(A20)

provided that

log (p n) = o {k_{n}^{\tilde{γ} / (2 - \tilde{γ})} B_{n}^{\tilde{γ} / (2 - \tilde{γ})} \land K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Define

{\tilde{Σ}}_{G} = B_{n}^{- 1} \sum_{b = 1}^{B_{n}} var ({\tilde{H}}_{b})

and

Δ = | Σ_{n} - {\tilde{Σ}}_{G} |

, where

Σ_{n} = E (S_{n} S_{n}^{T})

. Note that

\begin{matrix} Δ = & | \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} \{var ({\tilde{H}}_{b}) - var (H_{b}^{+})\} + \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} \{var (H_{b}^{+}) - var (H_{b})\} + \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} var (H_{b}) - Σ_{n} | \\ \leq & \underset{Δ_{1}}{\underset{︸}{\frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} | var ({\tilde{H}}_{b}) - var (H_{b}^{+}) |}} + \underset{Δ_{2}}{\underset{︸}{\frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} | var (H_{b}^{+}) - var (H_{b}) |}} + \underset{Δ_{3}}{\underset{︸}{| \frac{1}{B_{n}} \sum_{b = 1}^{B_{n}} var (H_{b}) - Σ_{n} |}} . \end{matrix}

(A21)

In the sequel, we will specify the convergence rates of

| Δ_{1} |_{\infty}

,

| Δ_{2} |_{\infty}

and

| Δ_{3} |_{\infty}

, respectively. Note that the

(i, j)

-th element of

var ({\tilde{H}}_{b}) - var (H_{b}^{+})

is

E ({\tilde{H}}_{b, i} {\tilde{H}}_{b, j} - H_{b, i}^{+} H_{b, j}^{+})

. Since

{\tilde{H}}_{b, j}

has the same distribution as

H_{b, j}^{+}

and

| H_{b, j}^{+} | ≲ D_{n}

for any

b \in [B_{n}]

and

j \in [p]

, it holds by (A18) that

\begin{matrix} | E ({\tilde{H}}_{b, i} {\tilde{H}}_{b, j} - H_{b, i}^{+} H_{b, j}^{+}) | \leq | E {({\tilde{H}}_{b, i} - H_{b, i}^{+}) {\tilde{H}}_{b, j}} | + | E {({\tilde{H}}_{b, j} - H_{b, j}^{+}) H_{b, i}^{+}} | ≲ {(p n)}^{- 1} \end{matrix}

for any

b \in [B_{n}]

and

i, j \in [p]

. Thus, we can conclude that

| Δ_{1} |_{\infty} ≲ {(p n)}^{- 1}

. The

(i, j)

-th element of

var (H_{b}^{+}) - var (H_{b})

is

E (H_{b, i}^{+} H_{b, j}^{+} - H_{b, i} H_{b, j})

. Since

H_{b, j} = H_{b, j}^{+} + H_{b, j}^{-}

, it holds by (A16) that

\begin{matrix} | E (H_{b, i}^{+} H_{b, j}^{+} - H_{b, i} H_{b, j}) | = | E {H_{b, i}^{+} H_{b, j}^{+} - (H_{b, i}^{+} + H_{b, i}^{-}) (H_{b, j}^{+} + H_{b, j}^{-})} | \\ \leq & | E (H_{b, i}^{+} H_{b, j}^{-}) | + | E (H_{b, j}^{+} H_{b, i}^{-}) | + | E (H_{b, i}^{-} H_{b, j}^{-}) | ≲ {(p n)}^{- 1} \end{matrix}

for any

b \in [B_{n}]

and

i, j \in [p]

provided that

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Thus, we can conclude that

| Δ_{2} |_{\infty} ≲ {(p n)}^{- 1}

provided that

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. The

(i, j)

-th element of

Σ_{n} - {B_{n}}^{- 1} \sum_{b = 1}^{B_{n}} var (H_{b})

is

n^{- 1} \sum_{t_{1}, t_{2} = 1}^{n} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) - n^{- 1} \sum_{b = 1}^{B_{n}} \sum_{t_{1}, t_{2} \in I_{b}} E (ξ_{t_{1}, i} ξ_{t_{2}, j})

, and

\begin{matrix} | \frac{1}{n} \sum_{t_{1}, t_{2} = 1}^{n} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) - \frac{1}{n} \sum_{b = 1}^{B_{n}} \sum_{t_{1}, t_{2} \in I_{b}} E (ξ_{t_{1}, i} ξ_{t_{2}, j}) | \\ = & \frac{1}{n} | \sum_{b_{1} \neq b_{2}} E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} \\ + \sum_{b = 1}^{B_{n}} E \{(\sum_{t \in I_{b}} ξ_{t, i}) (\sum_{t \in J_{b}} ξ_{t, j}) + (\sum_{t \in J_{b}} ξ_{t, i}) (\sum_{t \in G_{b}} ξ_{t, j})\} | . \end{matrix}

(A22)

Note that

E (| ξ_{t, j} |^{r}) ≲ M_{n}^{r}

for any constant integer

r > 0

. Equation (1.12b) of [30] yields

\begin{matrix} | E \{(\sum_{t \in J_{b}} ξ_{t, i}) (\sum_{t \in G_{b}} ξ_{t, j})\} | \\ = & | \sum_{t \in J_{b}} cov (ξ_{t, i}, ξ_{t, j}) + \sum_{t_{1} \neq t_{2} : t_{1}, t_{2} \in J_{b}} cov (ξ_{t_{1}, i}, ξ_{t_{2}, j}) + \sum_{t_{1} \in J_{b}} \sum_{t_{2} \in I_{b}} cov (ξ_{t_{1}, i}, ξ_{t_{2}, j}) | \\ ≲ & M_{n}^{2} k_{n} + \sum_{t_{1} \neq t_{2} : t_{1}, t_{2} \in J_{b}} {E (| ξ_{t_{1}, i} |^{3} {)}}^{\frac{1}{3}} {E (| ξ_{t_{2}, j} |^{3} {)}}^{\frac{1}{3}} α^{\frac{1}{3}} (| t_{1} - t_{2} |) \\ + \sum_{t_{1} \in J_{b}} \sum_{t_{2} \in I_{b}} {E (| ξ_{t_{1}, i} |^{3} {)}}^{\frac{1}{3}} {E (| ξ_{t_{2}, j} |^{3} {)}}^{\frac{1}{3}} α^{\frac{1}{3}} (| t_{1} - t_{2} |) ≲ M_{n}^{2} k_{n} . \end{matrix}

(A23)

Similarly, we can also obtain

| E \{(\sum_{t \in I_{b}} ξ_{t, i}) (\sum_{t \in J_{b}} ξ_{t, j})\} | ≲ M_{n}^{2} k_{n} .

Thus,

\begin{matrix} | \sum_{b = 1}^{B_{n}} E \{(\sum_{t \in I_{b}} ξ_{t, i}) (\sum_{t \in J_{b}} ξ_{t, j}) + (\sum_{t \in J_{b}} ξ_{t, i}) (\sum_{t \in G_{b}} ξ_{t, j})\} | ≲ M_{n}^{2} k_{n} B_{n} . \end{matrix}

By Equation (1.12b) of [30], if

b_{1} < b_{2}

,

\begin{matrix} | E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} | \\ ≲ & \sum_{t_{1} \in G_{b_{1}}} \sum_{t_{2} \in G_{b_{2}}} {E (| ξ_{t_{1}, i} |^{3} {)}}^{\frac{1}{3}} {E (| ξ_{t_{2}, j} |^{3} {)}}^{\frac{1}{3}} α^{\frac{1}{3}} (| t_{1} - t_{2} |) \\ ≲ & M_{n}^{2} \sum_{δ = 1}^{K_{n}} δ exp [- C {(b_{2} - b_{1} - 1) K_{n} + δ}^{γ_{2}}] \\ ≲ & M_{n}^{2} 1 (b_{2} - b_{1} = 1) + M_{n}^{2} K_{n}^{2} exp {- C {(b_{2} - b_{1} - 1)}^{γ_{2}} {K_{n}}^{γ_{2}}} 1 (b_{2} - b_{1} > 1) . \end{matrix}

Thus,

\begin{matrix} \sum_{b_{1} < b_{2}} | E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} | \\ ≲ & M_{n}^{2} B_{n} + M_{n}^{2} K_{n}^{2} \sum_{b_{2} - b_{1} = 2}^{B_{n} - 1} exp {- C {(b_{2} - b_{1} - 1)}^{γ_{2}} {K_{n}}^{γ_{2}}} ≲ M_{n}^{2} B_{n} . \end{matrix}

The same result holds for

b_{1} > b_{2}

. Thus we can conclude that

\begin{matrix} \sum_{b_{1} \neq b_{2}} | E \{(\sum_{t \in G_{b_{1}}} ξ_{t, i}) (\sum_{t \in G_{b_{2}}} ξ_{t, j})\} | ≲ M_{n}^{2} B_{n} . \end{matrix}

Note that the above upper bounds do not depend on

(i, j)

. Then by (A22), it holds that

| Δ_{3} |_{\infty} ≲ M_{n}^{2} k_{n} K_{n}^{- 1}

. By (A21), we can conclude that

\begin{matrix} {| Δ |}_{\infty} ≲ \frac{M_{n}^{2} k_{n}}{K_{n}} \end{matrix}

(A24)

provided that

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Let

{{\tilde{H}}_{b}^{G}}_{b = 1}^{B_{n}}

be a sequence of independent Gaussian vectors such that

{\tilde{H}}_{b}^{G} = {({\tilde{H}}_{b, 1}^{G}, \dots, {\tilde{H}}_{b, p}^{G})}^{T} \sim N {0_{p}, var ({\tilde{H}}_{b})}

for any

b \in [B_{n}]

, where

{\tilde{H}}_{b} = {({\tilde{H}}_{b, 1}, \dots, {\tilde{H}}_{b, p})}^{T}

. Due to

k_{n} ≍ {log (p n)}^{1 / γ_{2}}

, we know that

{min}_{j \in [p]} {({\tilde{Σ}}_{G})}_{j, j} > c

provided that

M_{n}^{2} {log (p n)}^{1 / γ_{2}} = o (K_{n})

and

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. Due to

| {\tilde{H}}_{b, j} | \leq 2 D_{n} ≲ M_{n} {log (p n)}^{1 / 2}

, it holds that

E ({\tilde{H}}_{b, j}^{4}) ≲ D_{n}^{2} E ({\tilde{H}}_{b, j}^{2}) ≲ M_{n}^{4} log (p n)

for any

b \in [B_{n}]

and

j \in [p]

, where the last inequality follows from

E ({\tilde{H}}_{b, j}^{2}) = E (| H_{b, j}^{+} |^{2}) ≲ E (H_{b, j}^{2})

and arguments similar to those in the proof of (A23). By Theorem 2.1 of [16], we have

\begin{matrix} sup_{x > 0} | P (max_{j \in [p]} {\tilde{S}}_{n, j} \leq x) - P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}^{G} \leq x) | ≲ \frac{M_{n} {log (p n)}^{3 / 2}}{B_{n}^{1 / 4}} \end{matrix}

(A25)

provided that

M_{n}^{2} {log (p n)}^{1 / γ_{2}} = o (K_{n})

and

log (p n) = o {K_{n}^{\tilde{γ} / (2 - \tilde{γ})}}

. By Proposition 2.1 of [16] and (A24), we have

\begin{matrix} sup_{x > 0} | P (max_{j \in [p]} \frac{1}{B_{n}^{1 / 2}} \sum_{b = 1}^{B_{n}} {\tilde{H}}_{b, j}^{G} \leq x) - P (max_{j \in [p]} W_{j} \leq x) | \\ ≲ & {| Δ |}_{\infty}^{1 / 2} log p ≲ \frac{M_{n} {log (p n)}^{(2 γ_{2} + 1) / 2 γ_{2}}}{K_{n}^{1 / 2}} . \end{matrix}

(A26)

By (A20), (A25) and (A26), due to

\tilde{γ} = {(1 / γ_{1} + 1 / γ_{2})}^{- 1}

, we have

\begin{matrix} ω_{n} ≲ \frac{M_{n} {log (p n)}^{3 / 2}}{B_{n}^{1 / 4}} + \frac{M_{n} {log (p n)}^{(2 γ_{2} + 1) / 2 γ_{2}}}{K_{n}^{1 / 2}} \end{matrix}

provided that

log (p n) = o {B_{n}^{γ_{1} γ_{2} / (γ_{1} + 2 γ_{2} - γ_{1} γ_{2})} \land K_{n}^{γ_{1} γ_{2} / (2 γ_{1} + 2 γ_{2} - γ_{1} γ_{2})}}

and

M_{n}^{2} {log (p n)}^{1 / γ_{2}} = o (K_{n})

. Select

ς = 2 / 3

. Then

B_{n} ≍ n^{2 / 3}

,

K_{n} ≍ n^{1 / 3}

and

\begin{matrix} ω_{n} ≲ \frac{M_{n} {log (p n)}^{max {(2 γ_{2} + 1) / 2 γ_{2}, 3 / 2}}}{n^{1 / 6}} \end{matrix}

provided that

M_{n}^{2} {log (p n)}^{1 / γ_{2}} = o (n^{1 / 3})

and

{log (p n)}^{3} = o {n^{γ_{1} γ_{2} / (2 γ_{1} + 2 γ_{2} - γ_{1} γ_{2})}}

. Thus we complete the proof of Theorem 1(ii). □
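The choice ς = 2/3 can be read as balancing the two terms in the bound on ω_n. The Python sketch below is only an illustration: it assumes, as the final display suggests, that B_n ≍ n^ς and K_n ≍ n^{1 - ς}, and it ignores M_n and all logarithmic factors. The function names and the search grid are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def term_exponents(varsigma):
    """n-exponents of the decay of the two terms.

    Assumption: B_n ~ n^varsigma and K_n ~ n^(1 - varsigma).
    """
    varsigma = Fraction(varsigma)
    gaussian_term = varsigma / 4        # from B_n^{-1/4}
    blocking_term = (1 - varsigma) / 2  # from K_n^{-1/2}
    return gaussian_term, blocking_term


def overall_exponent(varsigma):
    """The slower of the two decay rates governs the bound on omega_n."""
    return min(term_exponents(varsigma))


# Hypothetical grid of candidate values: twelfths of the unit interval.
grid = [Fraction(k, 12) for k in range(1, 12)]
best_varsigma = max(grid, key=overall_exponent)
```

On this grid the two exponents coincide at ς = 2/3, where both equal 1/6, which matches the n^{1/6} denominator in the final bound.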

Appendix B. Proof of Proposition 1

Proof.

Define

{\overset{˚}{T}}_{n} = | \frac{1}{\sqrt{\tilde{n}}} \sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t} |_{\infty},

where

{\overset{˚}{Z}}_{t} = Z_{t} - E (Z_{t})

. Under

H_{0}

, we know that

μ_{X} = μ_{Y} = : μ

. Recall

n_{1} ≍ n_{2} ≍ n

and

Δ_{n} = n_{1} \lor n_{2} - n_{1} \land n_{2}

. Without loss of generality, we assume

n_{1} \leq n_{2}

. By the triangle inequality, for any

j \in [p]

,

\begin{matrix} | \sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j} - \sum_{t = 1}^{\tilde{n}} Z_{t, j} | ≲ \sum_{t = 1}^{n_{1}} | \sqrt{\frac{n_{2}^{2}}{n_{1} (n_{1} + n_{2})}} - \sqrt{\frac{n_{1}}{(n_{1} + n_{2})}} | + \sum_{t = n_{1} + 1}^{n_{2}} | \sqrt{\frac{n_{1}}{(n_{1} + n_{2})}} | = O (Δ_{n}) . \end{matrix}

Thus

| T_{n} - {\overset{˚}{T}}_{n} | = O (Δ_{n} n^{- 1 / 2})

. Write

δ_{n} = Δ_{n} n^{- 1 / 2} π_{n}

, where

π_{n} > 0

diverges at a sufficiently slow rate. Thus, we have

\begin{matrix} P (T_{n} \leq x) \leq P ({\overset{˚}{T}}_{n} \leq x + δ_{n}) + P (| T_{n} - {\overset{˚}{T}}_{n} | > δ_{n}) \\ \leq & P (T_{n}^{G} \leq x + δ_{n}) + sup_{x \in R} | P ({\overset{˚}{T}}_{n} \leq x) - P (T_{n}^{G} \leq x) | + o (1) \\ \leq & P (T_{n}^{G} \leq x) + sup_{x \in R} P (x - δ_{n} \leq T_{n}^{G} \leq x + δ_{n}) + sup_{x \in R} | P ({\overset{˚}{T}}_{n} \leq x) - P (T_{n}^{G} \leq x) | + o (1) . \end{matrix}

Analogously, we can also obtain that

P (T_{n} \leq x) \geq P (T_{n}^{G} \leq x) - {sup}_{x \in R} P (x - δ_{n} \leq T_{n}^{G} \leq x + δ_{n}) - {sup}_{x \in R} | P ({\overset{˚}{T}}_{n} \leq x) - P (T_{n}^{G} \leq x) | - o (1)

. Thus,

\begin{matrix} sup_{x \in R} | P (T_{n} \leq x) - P (T_{n}^{G} \leq x) | \\ \leq & sup_{x \in R} P (x - δ_{n} \leq T_{n}^{G} \leq x + δ_{n}) + sup_{x \in R} | P ({\overset{˚}{T}}_{n} \leq x) - P (T_{n}^{G} \leq x) | + o (1) . \end{matrix}

In Case1, by Assumption 1(iii), we have

{min}_{j \in [p]} {(Ξ_{\tilde{n}})}_{j, j} > c

. Then by Lemma A.1 of [31], due to

Δ_{n}^{2} log p = o (n)

, we have

{sup}_{x \in R} P (x - δ_{n} \leq T_{n}^{G} \leq x + δ_{n}) ≲ Δ_{n} n^{- 1 / 2} π_{n} {(log p)}^{1 / 2} = o (1)

. By Assumption 1(i), we have

{max}_{t \in [\tilde{n}]} {max}_{j \in [p]} E (| {\overset{˚}{Z}}_{t, j} |^{m}) \leq C

. Note that

Ξ_{\tilde{n}} = E \{({\tilde{n}}^{- 1 / 2} \sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t}) ({\tilde{n}}^{- 1 / 2} \sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t}^{T})\}

. Then by Assumption 1 and Theorem 1(i), due to

3 m / (m - 4) > max {2 m / (m - 3), 3}

, we have

{sup}_{x \in R} | P ({\overset{˚}{T}}_{n} \leq x) - P (T_{n}^{G} \leq x) | = o (1)

provided that

p^{2} log p = o {n^{4 τ / (11 τ + 12)}}

. Thus, if

p^{2} log p = o {n^{4 τ / (11 τ + 12)}}

,

sup_{x \in R} | P (T_{n} \leq x) - P (T_{n}^{G} \leq x) | = o (1) .

Similarly, in Case2, by Assumption 2 and Theorem 1(ii) with

(M_{n}, γ_{1}, γ_{2}) = (C, 2, 1)

, we have

{sup}_{x \in R} P (x - δ_{n} \leq T_{n}^{G} \leq x + δ_{n}) ≲ Δ_{n} n^{- 1 / 2} π_{n} {(log p)}^{1 / 2} = o (1)

and

{sup}_{x \in R} | P ({\overset{˚}{T}}_{n} \leq x) - P (T_{n}^{G} \leq x) | = o (1)

provided that

log (p n) = o (n^{1 / 9})

. Thus, if

log (p n) = o (n^{1 / 9})

,

sup_{x \in R} | P (T_{n} \leq x) - P (T_{n}^{G} \leq x) | = o (1) .

We complete the proof of Proposition 1. □
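The condition log (p n) = o (n^{1/9}) in Case2 follows from substituting (M_n, γ_1, γ_2) = (C, 2, 1) into the three side conditions at the end of the proof of Theorem 1(ii). The Python sketch below carries out that substitution with exact fractions. It assumes M_n is a constant, and the function name is illustrative only.

```python
from fractions import Fraction


def case2_log_exponent(gamma1, gamma2):
    """Exponent a such that log(pn) = o(n^a) implies the three side conditions.

    Assumption: M_n is bounded by a constant, as in Case2.
    """
    g1, g2 = Fraction(gamma1), Fraction(gamma2)
    # Power of log(pn) in the bound on omega_n, whose denominator is n^{1/6}.
    log_power = max((2 * g2 + 1) / (2 * g2), Fraction(3, 2))
    from_rate = Fraction(1, 6) / log_power
    # log(pn)^{1/gamma2} = o(n^{1/3}).
    from_moment = g2 * Fraction(1, 3)
    # log(pn)^3 = o(n^{g1 g2 / (2 g1 + 2 g2 - g1 g2)}).
    from_tail = g1 * g2 / (2 * g1 + 2 * g2 - g1 * g2) / 3
    return min(from_rate, from_moment, from_tail)
```

With (γ_1, γ_2) = (2, 1), the binding requirement is the one that makes the bound on ω_n vanish, and it yields the exponent 1/9.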

Appendix C. Proof of Theorem 2

Appendix C.1. Proof of Theorem 2 under Case3

Proof.

By Proposition 1 under Case1, it suffices to show

\begin{matrix} sup_{x \in R} | P ({\hat{T}}_{n}^{G} \leq x | E) - P (T_{n}^{G} \leq x) | = o_{p} (1) . \end{matrix}

Recall

T_{n}^{G} = {| G |}_{\infty}

with

G \sim N (0, Ξ_{\tilde{n}})

and

{\hat{T}}_{n}^{G} = {| {\tilde{n}}^{- 1 / 2} \sum_{t = 1}^{\tilde{n}} (Z_{t} - \bar{Z}) ϱ_{t}^{'} |}_{\infty}

, where

Ξ_{\tilde{n}} = var ({\tilde{n}}^{- 1 / 2} \sum_{t = 1}^{\tilde{n}} Z_{t})

. Let

\begin{matrix} {\hat{Ξ}}_{\tilde{n}} = \frac{1}{\tilde{n}} \sum_{b = 1}^{B} \{(\sum_{t \in I_{b}} (Z_{t} - \bar{Z})) {(\sum_{t \in I_{b}} (Z_{t} - \bar{Z}))}^{T}\} . \end{matrix}

(A27)

By Proposition 2.1 of [16], we have

\begin{matrix} sup_{x \in R} | P ({\hat{T}}_{n}^{G} \leq x | E) - P (T_{n}^{G} \leq x) | ≲ Γ^{1 / 2} log p, \end{matrix}

(A28)

where

Γ = | Ξ_{\tilde{n}} - {\hat{Ξ}}_{\tilde{n}} |_{\infty} = \frac{1}{\tilde{n}} | \sum_{b = 1}^{B} \{(\sum_{t \in I_{b}} (Z_{t} - \bar{Z})) {(\sum_{t \in I_{b}} (Z_{t} - \bar{Z}))}^{T}\} - var (\sum_{t = 1}^{\tilde{n}} Z_{t}) |_{\infty} .

Let

{\overset{˚}{Z}}_{t} = {({\overset{˚}{Z}}_{t, 1}, \dots, {\overset{˚}{Z}}_{t, p})}^{T} = Z_{t} - E (Z_{t})

. Then, for any

i, j \in [p]

, the triangle inequality yields

\begin{matrix} | \sum_{b = 1}^{B} \{(\sum_{t \in I_{b}} (Z_{t, i} - {\bar{Z}}_{i})) (\sum_{t \in I_{b}} (Z_{t, j} - {\bar{Z}}_{j}))\} - E \{(\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j})\} | \\ = & | \sum_{b = 1}^{B} \{(\sum_{t \in I_{b}} ({\overset{˚}{Z}}_{t, i} - {\bar{\overset{˚}{Z}}}_{i})) (\sum_{t \in I_{b}} ({\overset{˚}{Z}}_{t, j} - {\bar{\overset{˚}{Z}}}_{j}))\} - E \{(\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j})\} | \\ \leq & | \sum_{b = 1}^{B} \{(\sum_{t \in I_{b}} ({\overset{˚}{Z}}_{t, i} - {\bar{\overset{˚}{Z}}}_{i})) (\sum_{t \in I_{b}} ({\overset{˚}{Z}}_{t, j} - {\bar{\overset{˚}{Z}}}_{j}))\} - \sum_{b = 1}^{B} E \{(\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j})\} | \\ + | \sum_{b = 1}^{B} E \{(\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j})\} - E \{(\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j})\} | \\ \leq & \underset{I_{1, i, j}}{\underset{︸}{| \sum_{b = 1}^{B} \sum_{t_{1}, t_{2} \in I_{b}} {{\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j} - E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j})} |}} + \underset{I_{2, i, j}}{\underset{︸}{\frac{S}{\tilde{n}} | (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j}) |}} \\ + \underset{I_{3, i, j}}{\underset{︸}{| \sum_{b = 1}^{B} E \{(\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j})\} - E \{(\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j})\} |}} . \end{matrix}

In the sequel, we will specify the upper bounds of

I_{1, i, j}

,

I_{2, i, j}

and

I_{3, i, j}

, respectively. Without loss of generality, we assume

\tilde{n} = B S

with

B ≍ n^{ϑ}

and

S ≍ n^{1 - ϑ}

. By Assumption 1(i), it holds that

{max}_{t \in [\tilde{n}]} {max}_{j \in [p]} E (| {\overset{˚}{Z}}_{t, j} |^{m}) \leq C

for some

m > 4

. Then, due to

τ > 3 m / (m - 4)

, (A1) yields

E ({[\sum_{t_{1}, t_{2} \in I_{b}} {{\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j} - E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j})}]}^{2}) \leq E \{{(\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i})}^{2} {(\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j})}^{2}\} ≲ S^{2} .

By the triangle inequality,

\begin{matrix} E ({[\sum_{b = 1}^{B} \sum_{t_{1}, t_{2} \in I_{b}} {{\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j} - E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j})}]}^{2}) \\ ≲ B S^{2} + \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} cov ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j}, {\overset{˚}{Z}}_{t_{3}, i} {\overset{˚}{Z}}_{t_{4}, j}) | \\ \leq B S^{2} + \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} {cum}_{i, j} (t_{2} - t_{1}, t_{3} - t_{1}, t_{4} - t_{1}) | \\ + \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{3}, i}) E ({\overset{˚}{Z}}_{t_{2}, j} {\overset{˚}{Z}}_{t_{4}, j}) | \\ + \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{4}, j}) E ({\overset{˚}{Z}}_{t_{3}, i} {\overset{˚}{Z}}_{t_{2}, j}) | . \end{matrix}

(A29)

By Assumption 3,

\sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} {cum}_{i, j} (t_{2} - t_{1}, t_{3} - t_{1}, t_{4} - t_{1}) ≲ S

, which implies

\begin{matrix} \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} {cum}_{i, j} (t_{2} - t_{1}, t_{3} - t_{1}, t_{4} - t_{1}) | ≲ B^{2} S . \end{matrix}

(A30)

For any

b \in [B - 1]

and

s \in [B - b]

, due to

τ > 3 m / (m - 4)

, Equation (1.12b) of [30] yields

\begin{matrix} | \sum_{t_{1} \in I_{b}} \sum_{t_{3} \in I_{b + s}} E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{3}, i}) | \leq \sum_{t_{1} \in I_{b}} \sum_{t_{3} \in I_{b + s}} | E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{3}, i}) | \\ ≲ & \sum_{t_{1} \in I_{b}} \sum_{t_{3} \in I_{b + s}} {E (| Z_{t_{1}, i} |^{m} {)}}^{\frac{1}{m}} {E (| Z_{t_{3}, i} |^{m} {)}}^{\frac{1}{m}} α^{\frac{m - 2}{m}} (t_{3} - t_{1}) ≲ \sum_{t_{1} \in I_{b}} \sum_{t_{3} \in I_{b + s}} α^{\frac{m - 2}{m}} (t_{3} - t_{1}) \\ ≲ & \sum_{h = 1}^{S} h α^{\frac{m - 2}{m}} {h + (s - 1) S} ≲ 1 (s = 1) + S^{\frac{2 m - (m - 2) τ}{m}} {(s - 1)}^{- \frac{(m - 2) τ}{m}} 1 (s > 1) . \end{matrix}

Similarly, we also have

| \sum_{t_{2} \in I_{b}} \sum_{t_{4} \in I_{b + s}} E ({\overset{˚}{Z}}_{t_{2}, j} {\overset{˚}{Z}}_{t_{4}, j}) | ≲ 1 (s = 1) + S^{\frac{2 m - (m - 2) τ}{m}} {(s - 1)}^{- \frac{(m - 2) τ}{m}} 1 (s > 1) .

Thus,

\begin{matrix} \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{3}, i}) E ({\overset{˚}{Z}}_{t_{2}, j} {\overset{˚}{Z}}_{t_{4}, j}) | \\ ≲ & \sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} \{1 (s = 1) + S^{\frac{4 m - 2 (m - 2) τ}{m}} {(s - 1)}^{- \frac{2 (m - 2) τ}{m}} 1 (s > 1)\} ≲ B . \end{matrix}

(A31)

Analogously, we also have

\sum_{b = 1}^{B - 1} \sum_{s = 1}^{B - b} | \sum_{t_{1}, t_{2} \in I_{b}} \sum_{t_{3}, t_{4} \in I_{b + s}} E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{4}, j}) E ({\overset{˚}{Z}}_{t_{3}, i} {\overset{˚}{Z}}_{t_{2}, j}) | ≲ B

. Combining this with (A29)–(A31), due to

B \geq S

,

E ({[\sum_{b = 1}^{B} \sum_{t_{1}, t_{2} \in I_{b}} {{\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j} - E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j})}]}^{2}) ≲ B^{2} S .

Then it holds that

\begin{matrix} I_{1, i, j} = O_{p} (B S^{1 / 2}) . \end{matrix}

(A32)

Similar to (A1), we have

| (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j}) | = O_{p} (n)

. Thus, we know that

\begin{matrix} I_{2, i, j} = O_{p} (S) . \end{matrix}

(A33)

Note that

\begin{matrix} I_{3, i, j} \leq \sum_{b_{1} \neq b_{2}} | E \{(\sum_{t \in I_{b_{1}}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b_{2}}} {\overset{˚}{Z}}_{t, j})\} | . \end{matrix}

For

b_{1} < b_{2}

, due to

τ > 3 m / (m - 4)

, Equation (1.12b) of [30] yields

\begin{matrix} | E \{(\sum_{t \in I_{b_{1}}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b_{2}}} {\overset{˚}{Z}}_{t, j})\} | \\ ≲ & \sum_{s = 1}^{S} s {E (| Z_{t, i} |^{m} {)}}^{\frac{1}{m}} {E (| Z_{t + s, j} |^{m} {)}}^{\frac{1}{m}} α^{\frac{m - 2}{m}} {s + (b_{2} - b_{1} - 1) S} \\ ≲ & 1 (b_{2} - b_{1} = 1) + S^{\frac{2 m - (m - 2) τ}{m}} {(b_{2} - b_{1} - 1)}^{- \frac{(m - 2) τ}{m}} 1 (b_{2} - b_{1} > 1) . \end{matrix}

Thus,

\begin{matrix} \sum_{b_{1} < b_{2}} | E \{(\sum_{t \in I_{b_{1}}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b_{2}}} {\overset{˚}{Z}}_{t, j})\} | ≲ & B + S^{\frac{2 m - (m - 2) τ}{m}} \sum_{b_{2} - b_{1} > 1} {(b_{2} - b_{1} - 1)}^{- \frac{(m - 2) τ}{m}} ≲ B . \end{matrix}

Similarly, we can also obtain

\sum_{b_{1} > b_{2}} | E {(\sum_{t \in I_{b_{1}}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b_{2}}} {\overset{˚}{Z}}_{t, j})} | ≲ B

, which implies

I_{3, i, j} ≲ B

. Then by (A32) and (A33), it holds that

\frac{1}{\tilde{n}} | \sum_{b = 1}^{B} (\sum_{t \in I_{b}} (Z_{t, i} - {\bar{Z}}_{i})) (\sum_{t \in I_{b}} (Z_{t, j} - {\bar{Z}}_{j})) - E \{(\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j})\} | = O_{p} (S^{- 1 / 2}) .

Then, by Markov’s inequality,

\begin{matrix} Γ = | Ξ_{\tilde{n}} - {\hat{Ξ}}_{\tilde{n}} |_{\infty} = O_{p} (p^{2} S^{- 1 / 2}) . \end{matrix}

(A34)

By (A28), due to

S ≍ n^{1 - ϑ}

, it holds that

\begin{matrix} sup_{x \in R} | P ({\hat{T}}_{n}^{G} \leq x | E) - P (T_{n}^{G} \leq x) | = o_{p} (1) \end{matrix}

(A35)

provided that

p log p = o {n^{(1 - ϑ) / 4}}

.
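To make the estimator in (A27) concrete, the following Python sketch computes the matrix of outer products of centred block sums, scaled by 1/ñ. It is a minimal illustration and not the author's code: it assumes the data are a list of ñ rows of length p, that the blocks I_b are consecutive and non-overlapping, and that the block length S divides ñ. The name `block_covariance` is hypothetical.

```python
def block_covariance(Z, S):
    """Blockwise estimate of the long-run covariance, following the form of (A27).

    Z is a list of n rows, each of length p; S is the block length.
    Assumption: blocks are consecutive, non-overlapping, and S divides n.
    """
    n, p = len(Z), len(Z[0])
    if n % S != 0:
        raise ValueError("block size S must divide the sample size")
    zbar = [sum(row[j] for row in Z) / n for j in range(p)]
    xi_hat = [[0.0] * p for _ in range(p)]
    for start in range(0, n, S):
        # Sum of centred observations over the block I_b.
        block_sum = [
            sum(Z[t][j] - zbar[j] for t in range(start, start + S))
            for j in range(p)
        ]
        for i in range(p):
            for j in range(p):
                xi_hat[i][j] += block_sum[i] * block_sum[j] / n
    return xi_hat
```

For an alternating scalar sequence such as 1, -1, 1, -1, blocks of length one give the marginal variance, whereas blocks of length two give zero because the negative serial dependence cancels within each block. This is the kind of dependence that the block sums are meant to capture.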

Recall

{\hat{cv}}_{α} = inf {x > 0 : P ({\hat{T}}_{n}^{G} > x | E) \leq α}

. For any

ϵ > 0

, let

{cv}_{α}^{(ϵ)}

and

{cv}_{α}^{(- ϵ)}

be two constants which satisfy

P {T_{n}^{G} > {cv}_{α}^{(ϵ)}} = α + ϵ

and

P {T_{n}^{G} > {cv}_{α}^{(- ϵ)}} = α - ϵ

, respectively. We claim that for any

ϵ > 0

, it holds that

P {{cv}_{α}^{(ϵ)} < {\hat{cv}}_{α} < {cv}_{α}^{(- ϵ)}} \to 1

as

n \to \infty

. Otherwise, if

{\hat{cv}}_{α} \leq {cv}_{α}^{(ϵ)}

, by (A35), we have

\begin{matrix} α = P ({\hat{T}}_{n}^{G} > {\hat{cv}}_{α} | E) \geq P {{\hat{T}}_{n}^{G} > {cv}_{α}^{(ϵ)} | E} = P {T_{n}^{G} > {cv}_{α}^{(ϵ)}} + o_{p} (1) = α + ϵ + o_{p} (1), \end{matrix}

which is a contradiction with probability approaching one as

n \to \infty

. Analogously, if

{\hat{cv}}_{α} \geq {cv}_{α}^{(- ϵ)}

, by (A35), we have

\begin{matrix} α = P ({\hat{T}}_{n}^{G} > {\hat{cv}}_{α} | E) \leq P {{\hat{T}}_{n}^{G} > {cv}_{α}^{(- ϵ)} | E} = P {T_{n}^{G} > {cv}_{α}^{(- ϵ)}} + o_{p} (1) = α - ϵ + o_{p} (1), \end{matrix}

which is also a contradiction with probability approaching one as

n \to \infty

.
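In practice the conditional probability in the definition of the critical value is approximated by simulation. The Python sketch below is one hedged way to do this. It assumes that the multiplier ϱ'_t takes a single standard normal value on each block I_b, which is consistent with the covariance in (A27) but is not restated in this appendix, and that the block length S divides ñ. The function name, the number of draws `n_boot` and the seed are illustrative choices.

```python
import math
import random


def bootstrap_critical_value(Z, S, alpha, n_boot=2000, seed=0):
    """Monte Carlo approximation of the bootstrap critical value.

    Assumption: one N(0, 1) multiplier per block of length S,
    with S dividing the number of rows of Z.
    """
    n, p = len(Z), len(Z[0])
    if n % S != 0:
        raise ValueError("block size S must divide the sample size")
    rng = random.Random(seed)
    zbar = [sum(row[j] for row in Z) / n for j in range(p)]
    # Centred block sums: one p-vector per block I_b.
    block_sums = [
        [sum(Z[t][j] - zbar[j] for t in range(start, start + S)) for j in range(p)]
        for start in range(0, n, S)
    ]
    draws = []
    for _ in range(n_boot):
        g = [rng.gauss(0.0, 1.0) for _ in block_sums]
        coords = [sum(gb * bs[j] for gb, bs in zip(g, block_sums)) for j in range(p)]
        # Sup-norm of the multiplier sum, scaled by n^{-1/2}.
        draws.append(max(abs(c) for c in coords) / math.sqrt(n))
    draws.sort()
    # Smallest order statistic x with (#draws > x) / n_boot <= alpha.
    k = min(n_boot - 1, max(0, math.ceil((1.0 - alpha) * n_boot) - 1))
    return draws[k]
```

The test then rejects when T_n exceeds the returned value. Because the statistic is homogeneous in the data, rescaling Z by a constant rescales the simulated critical value by the same constant.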

For any

ϵ > 0

, define the event

E_{1, ϵ} = {{cv}_{α}^{(ϵ)} < {\hat{cv}}_{α} < {cv}_{α}^{(- ϵ)}}

. Then

P (E_{1, ϵ}) \to 1

as

n \to \infty

. On the one hand, by Proposition 1,

\begin{matrix} P (T_{n} > {\hat{cv}}_{α}) \leq & P (T_{n} > {\hat{cv}}_{α}, E_{1, ϵ}) + P (E_{1, ϵ}^{c}) \leq P {T_{n} > {cv}_{α}^{(ϵ)}} + o (1) \\ = & P {T_{n}^{G} > {cv}_{α}^{(ϵ)}} + o (1) = α + ϵ + o (1), \end{matrix}

which implies that

\limsup_{n \to \infty} P (T_{n} > {\hat{cv}}_{α}) \leq α + ϵ

. On the other hand, by Proposition 1,

\begin{matrix} P (T_{n} > {\hat{cv}}_{α}) \geq & P (T_{n} > {\hat{cv}}_{α}, E_{1, ϵ}) \geq P {T_{n} > {cv}_{α}^{(- ϵ)}} - P (E_{1, ϵ}^{c}) \\ \geq & P {T_{n}^{G} > {cv}_{α}^{(- ϵ)}} - o (1) = α - ϵ - o (1), \end{matrix}

which implies that

\liminf_{n \to \infty} P (T_{n} > {\hat{cv}}_{α}) \geq α - ϵ

. Since

P (T_{n} > {\hat{cv}}_{α})

does not depend on

ϵ

, by letting

ϵ \to 0^{+}

, we have

{lim}_{n \to \infty} P (T_{n} > {\hat{cv}}_{α}) = α

. Thus we complete the proof of Theorem 2 under Case3. □

Appendix C.2. Proof of Theorem 2 under Case4

Proof.

By Proposition 1 under Case2 and the arguments in Appendix C.1, it suffices to show

\begin{matrix} sup_{x \in R} | P ({\hat{T}}_{n}^{G} \leq x | E) - P (T_{n}^{G} \leq x) | ≲ Γ^{1 / 2} log p = o_{p} (1), \end{matrix}

where

Γ \leq max_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{1, i, j} | + max_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{2, i, j} | + max_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{3, i, j} |

with

I_{1, i, j}

,

I_{2, i, j}

and

I_{3, i, j}

specified in Appendix C.1. In the sequel, we will specify the upper bounds of

{max}_{i, j \in [p]} | I_{1, i, j} |

,

{max}_{i, j \in [p]} | I_{2, i, j} |

and

{max}_{i, j \in [p]} | I_{3, i, j} |

, respectively.

Without loss of generality, we assume

\tilde{n} = B S

with

B ≍ {\tilde{n}}^{ϑ}

and

S ≍ {\tilde{n}}^{1 - ϑ}

for some

ϑ \in [1 / 2, 1)

. Let

W_{b, i, j} = \sum_{t_{1}, t_{2} \in I_{b}} {\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j} - E (\sum_{t_{1}, t_{2} \in I_{b}} {\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j})

. For

R_{n} > C^{*} S

with some sufficiently large constant

C^{*} > 0

, denote

W_{b, i, j}^{+} = W_{b, i, j} 1 (| W_{b, i, j} | \leq R_{n}) - E {W_{b, i, j} 1 (| W_{b, i, j} | \leq R_{n})}

and

W_{b, i, j}^{-} = W_{b, i, j} 1 (| W_{b, i, j} | > R_{n}) - E {W_{b, i, j} 1 (| W_{b, i, j} | > R_{n})}

. Then for some

C_{n} > 0

, it holds by the Bonferroni inequality that

\begin{matrix} P (max_{i, j \in [p]} | \sum_{b = 1}^{B} W_{b, i, j} | > \tilde{n} x) \leq & p^{2} max_{i, j \in [p]} P (| \sum_{b = 1}^{B} W_{b, i, j}^{+} | + | \sum_{b = 1}^{B} W_{b, i, j}^{-} | > \tilde{n} x) \\ \leq & p^{2} max_{i, j \in [p]} \{P (| \sum_{b = 1}^{B} W_{b, i, j}^{+} | > \tilde{n} x - C_{n}) + P (| \sum_{b = 1}^{B} W_{b, i, j}^{-} | > C_{n})\} \end{matrix}

for all

x > C_{n} {\tilde{n}}^{- 1}

. Note that

\begin{matrix} E {W_{b, i, j}^{2} 1 (| W_{b, i, j} | > R_{n})} = 2 \int_{0}^{R_{n}} x P (| W_{b, i, j} | > R_{n}) d x + 2 \int_{R_{n}}^{\infty} x P (| W_{b, i, j} | > x) d x . \end{matrix}
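The identity above is a tail-integral representation of a truncated second moment. Its first term equals R_n^2 P (| W_{b, i, j} | > R_n), since 2 \int_0^{R_n} x d x = R_n^2. As a numerical sanity check only, the Python sketch below verifies the identity for a standard exponential variable, a hypothetical stand-in for | W_{b, i, j} | with tail probability exp (- x). The integration range and step count are arbitrary choices.

```python
import math


def truncated_moment_direct(D, upper=60.0, steps=100000):
    """Midpoint-rule value of E{H^2 1(H > D)} for H ~ Exp(1)."""
    h = (upper - D) / steps
    total = 0.0
    for k in range(steps):
        x = D + (k + 0.5) * h
        total += x * x * math.exp(-x)  # x^2 times the Exp(1) density
    return total * h


def truncated_moment_tail(D, upper=60.0, steps=100000):
    """Same quantity via D^2 P(H > D) + 2 * int_D^upper x P(H > x) dx."""
    h = (upper - D) / steps
    total = 0.0
    for k in range(steps):
        x = D + (k + 0.5) * h
        total += x * math.exp(-x)  # x times the Exp(1) tail probability
    return D * D * math.exp(-D) + 2.0 * total * h
```

Both functions approximate the same number, which for the exponential example has the closed form exp (- D) (D^2 + 2 D + 2).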

By Assumptions 2(i)–(ii) and the Cauchy–Schwarz inequality,

E (\sum_{t_{1}, t_{2} \in I_{b}} {\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j}) ≲ S

. By Assumptions 2(i)–(ii) again and Theorem 1 of [32], we know that

P (| \sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i} | > x) ≲ S exp (- C x^{2 / 3}) + exp (- C S^{- 1} x^{2})

for any

x \to \infty

. Thus, for any

x > C S

, we have

\begin{matrix} P (| W_{b, i, j} | > x) \leq P (| \sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i} | | \sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j} | > C x) \\ \leq & P (| \sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i} | > C x^{1 / 2}) + P (| \sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j} | > C x^{1 / 2}) ≲ S exp (- C x^{1 / 3}) + exp (- C S^{- 1} x) . \end{matrix}

Due to

R_{n} > C S

, we can show that

\begin{matrix} E (| W_{b, i, j}^{-} |^{2}) ≲ E {W_{b, i, j}^{2} 1 (| W_{b, i, j} | > R_{n})} ≲ R_{n}^{2} S exp (- C R_{n}^{1 / 3}) + R_{n}^{2} exp (- C S^{- 1} R_{n}) . \end{matrix}

Selecting

R_{n} = C^{* *} S log (p n)

for some sufficiently large constant

C^{* *} > 0

, and

C_{n} ≍ B^{1 / 2}

, it holds by Markov’s inequality that

\begin{matrix} p^{2} max_{i, j \in [p]} P (| \sum_{b = 1}^{B} W_{b, i, j}^{-} | > C_{n}) ≲ p^{2} B^{1 / 2} max_{i, j \in [p]} max_{b \in [B]} E (| W_{b, i, j}^{-} |) = o (1) \end{matrix}

provided that

log (p n) = o (S^{1 / 2})

. Due to

| W_{b, i, j}^{+} | \leq 2 R_{n}

, by Theorem 1 of [33],

\begin{matrix} P (| \sum_{b = 1}^{B} W_{b, i, j}^{+} | > \tilde{n} x - C_{n}) ≲ exp \{- \frac{C {\tilde{n}}^{2} x^{2}}{B R_{n}^{2} + R_{n} \tilde{n} x (log B) (log log B)}\} \end{matrix}

for any

x > C B^{- 1 / 2} S^{- 1}

. Thus, we can conclude that

\begin{matrix} max_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{1, i, j} | = max_{i, j \in [p]} | \frac{1}{\tilde{n}} \sum_{b = 1}^{B} W_{b, i, j} | = O_{p} [\frac{{log (p n)}^{3 / 2}}{B^{1 / 2}}] \end{matrix}

provided that

log (p n) = o [min {S^{1 / 2}, B {(log n log log n)}^{- 2}}]

. By the Bonferroni inequality and Theorem 1 of [32], we know that

\begin{matrix} P (max_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{2, i, j} | > x) ≲ & p^{2} max_{i \in [p]} P (| \sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i} | > C S^{- 1 / 2} n x^{1 / 2}) \\ ≲ & p^{2} n exp (- C S^{- 1 / 3} n^{2 / 3} x^{1 / 3}) + p^{2} exp (- C S^{- 1} n x) \end{matrix}

for any

x ≫ S n^{- 2}

. Then, we can conclude that

{max}_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{2, i, j} | = O_{p} {B^{- 1} log (p n)}

provided that

log (p n) = o (n^{1 / 2})

. Finally, Equation (1.12b) of [30] yields

\begin{matrix} | \sum_{b = 1}^{B} E \{(\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, i}) (\sum_{t \in I_{b}} {\overset{˚}{Z}}_{t, j})\} - E \{(\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, i}) (\sum_{t = 1}^{\tilde{n}} {\overset{˚}{Z}}_{t, j})\} | \\ \leq & \sum_{b = 1}^{B - 1} \sum_{κ = 1}^{B - b} | \sum_{t_{1} \in I_{b}} \sum_{t_{2} \in I_{b + κ}} E ({\overset{˚}{Z}}_{t_{1}, i} {\overset{˚}{Z}}_{t_{2}, j}) | ≲ \sum_{b = 1}^{B - 1} \sum_{κ = 1}^{B - b} \sum_{δ = 1}^{B} δ exp [- C {δ + (κ - 1) S}] \\ ≲ & \sum_{b = 1}^{B - 1} \sum_{δ = 1}^{B} δ exp (- C δ) + \sum_{b = 1}^{B - 1} \sum_{κ = 2}^{B - b} B^{2} exp {- C (κ - 1) S} ≲ B + B^{3} exp (- C S) ≲ B, \end{matrix}

which implies

{max}_{i, j \in [p]} {\tilde{n}}^{- 1} | I_{3, i, j} | = O (S^{- 1})

. Thus,

\begin{matrix} Γ = O_{p} [\frac{{log (p n)}^{3 / 2}}{B^{1 / 2}} + \frac{1}{S}] \end{matrix}

(A36)

provided that

log (p n) = o [min {S^{1 / 2}, B {(log n log log n)}^{- 2}}]

. It holds that

\begin{matrix} sup_{x \in R} | P ({\hat{T}}_{n}^{G} \leq x | E) - P (T_{n}^{G} \leq x) | ≲ Γ^{1 / 2} log p = o_{p} (1) \end{matrix}

provided that

log (p n) = o [n^{min {(1 - ϑ) / 2, ϑ / 7}}]

. The second result of Theorem 2 under Case4 follows by the same argument as under Case3. Thus, we complete the proof of Theorem 2. □
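The deterministic term of order S^{- 1} in (A36) reflects the bias incurred by ignoring covariances between different blocks. As a scalar illustration only, the Python sketch below uses a stationary AR(1) sequence with coefficient `phi` and unit innovation variance. This is a hypothetical example, not a model from the paper. For such a sequence with ñ = B S, the scalar analogue of ñ^{- 1} | I_3 | is the gap between the variance of a normalised sum over S terms and over ñ terms.

```python
def scaled_sum_variance(m, phi):
    """Variance of m^{-1/2} (x_1 + ... + x_m) for a stationary AR(1) sequence.

    Assumption: unit innovation variance, so gamma(h) = phi^|h| / (1 - phi^2).
    """
    gamma0 = 1.0 / (1.0 - phi * phi)
    weighted = sum((1.0 - h / m) * phi ** h for h in range(1, m))
    return gamma0 * (1.0 + 2.0 * weighted)


def block_bias(n, S, phi):
    """|V_S - V_n|: scalar analogue of n^{-1} I_3 when n = B * S."""
    return abs(scaled_sum_variance(S, phi) - scaled_sum_variance(n, phi))
```

Evaluating `S * block_bias(n, S, phi)` for increasing S with n fixed and large gives values that stay close to a constant, in line with the O (S^{- 1}) rate.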

Appendix D. Proof of Theorem 3

Proof.

Let

s = C π_{n} p^{2} S^{- 1 / 2}

in Case3 and

s = C π_{n} [B^{- 1 / 2} {log (p n)}^{3 / 2} + C S^{- 1}]

in Case4, where

π_{n} > 0

diverges at a sufficiently slow rate. Then

s = o (1)

provided that

p = o (S^{1 / 4})

in Case3 and

log (p n) = o (B^{1 / 3})

in Case4. Define an event

Φ (s) = \{max_{j \in [p]} | \frac{{({\hat{Ξ}}_{\tilde{n}})}_{j, j}}{{(Ξ_{\tilde{n}})}_{j, j}} - 1 | \leq s\},

where

{\hat{Ξ}}_{\tilde{n}}

and

Ξ_{\tilde{n}}

are specified in (A27) and (3), respectively. By (A34) and (A36) in Appendix C, the bound

\begin{matrix} max_{j \in [p]} | {({\hat{Ξ}}_{\tilde{n}})}_{j, j} - {(Ξ_{\tilde{n}})}_{j, j} | \leq & | {\hat{Ξ}}_{\tilde{n}} - Ξ_{\tilde{n}} |_{\infty} = o_{p} (s) \end{matrix}

holds under Case3 and Case4 with

log (p n) = o (S^{1 / 2})

. By Assumption 1(iii) and Assumption 2(iii), we know that

{min}_{j \in [p]} {(Ξ_{\tilde{n}})}_{j, j} > c

holds under Case3 and Case4. Therefore,

max_{j \in [p]} | \frac{{({\hat{Ξ}}_{\tilde{n}})}_{j, j}}{{(Ξ_{\tilde{n}})}_{j, j}} - 1 | \leq \frac{{max}_{j \in [p]} | {({\hat{Ξ}}_{\tilde{n}})}_{j, j} - {(Ξ_{\tilde{n}})}_{j, j} |}{{min}_{j \in [p]} {(Ξ_{\tilde{n}})}_{j, j}} = o_{p} (s) .

Then it holds that

P \{Φ^{c} (s) | E\} = o_{p} (1)

under Case3 and Case4. Let

ϱ = {max}_{j \in [p]} {(Ξ_{\tilde{n}})}_{j, j}

. On the event

Φ (s)

, there exists a constant

C_{0} > 0

such that

\begin{matrix} E ({\hat{T}}_{n}^{G} | E) \leq & C_{0} {(log p)}^{1 / 2} max_{j \in [p]} {({\hat{Ξ}}_{\tilde{n}})}_{j, j}^{1 / 2} \leq {(1 + s)}^{1 / 2} C_{0} {(log p)}^{1 / 2} ϱ^{1 / 2} . \end{matrix}

By the Borell inequality for Gaussian processes,

\begin{matrix} P \{{\hat{T}}_{n}^{G} > E ({\hat{T}}_{n}^{G} | E) + x | E\} \leq 2 exp \{- \frac{x^{2}}{2 {max}_{j \in [p]} {({\hat{Ξ}}_{\tilde{n}})}_{j, j}}\} \end{matrix}

for any

x > 0

. Let

x_{0} = ϱ^{1 / 2} {(1 + s)}^{1 / 2} [C_{0} {(log p)}^{1 / 2} + \{2 log (4 / α)\}^{1 / 2}]

. On the event

Φ (s)

, we have

x_{0} \geq E ({\hat{T}}_{n}^{G} | E) + {(2 ϱ)}^{1 / 2} {(1 + s)}^{1 / 2} {log}^{1 / 2} (\frac{4}{α}),

which implies

\begin{matrix} P \{{\hat{T}}_{n}^{G} > x_{0}, Φ (s) | E\} \leq 2 exp \{- \frac{2 ϱ (1 + s) log (4 / α)}{2 ϱ (1 + s)}\} = \frac{α}{2} . \end{matrix}
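To make the constants explicit, the display above and the step that follows it in the proof can be chained as below. This is only a worked restatement of the proof's own steps, using the fact that the largest diagonal entry of the estimated covariance is at most (1 + s)ϱ on Φ(s). As a notational assumption, the conditioning object printed as E in the text is written as \mathcal{E} here to keep it apart from the expectation operator.

```latex
\begin{aligned}
x_{0} - E(\hat{T}_{n}^{G} \mid \mathcal{E})
  &\geq x := \{2 \varrho (1 + s) \log(4 / \alpha)\}^{1/2}, \\
2 \exp\Big\{- \frac{x^{2}}{2 \max_{j \in [p]} (\hat{\Xi}_{\tilde{n}})_{j, j}}\Big\}
  &\leq 2 \exp\Big\{- \frac{2 \varrho (1 + s) \log(4 / \alpha)}{2 \varrho (1 + s)}\Big\}
   = 2 \cdot \frac{\alpha}{4} = \frac{\alpha}{2}, \\
P(\hat{T}_{n}^{G} > x_{0} \mid \mathcal{E})
  &\leq P\{\hat{T}_{n}^{G} > x_{0}, \Phi(s) \mid \mathcal{E}\}
      + P\{\Phi^{c}(s) \mid \mathcal{E}\}
   \leq \frac{\alpha}{2} + \frac{\alpha}{4} \leq \alpha .
\end{aligned}
```

The last line holds with probability approaching one, since that is the sense in which the probability of the complement of Φ(s) is bounded by α / 4.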

Since

P \{Φ^{c} (s) | E\} = o_{p} (1)

, we have

P \{Φ^{c} (s) | E\} \leq α / 4

with probability approaching one. Hence,

P ({\hat{T}}_{n}^{G} > x_{0} | E) \leq α

with probability approaching one. Similar to the proof of (A1), we know that

ϱ \leq C

under Case3 and Case4. By the definition of

{\hat{cv}}_{α}

, it holds with probability approaching one that

\begin{matrix} {\hat{cv}}_{α} \leq & ϱ^{1 / 2} {(1 + s)}^{1 / 2} [C_{0} {(log p)}^{1 / 2} + \{2 log (4 / α)\}^{1 / 2}] ≲ {(log p)}^{1 / 2} \end{matrix}

under Case3 with

p = o (S^{1 / 4})

and Case4 with

log (p n) = o (B^{1 / 3} \land S^{1 / 2})

. Let

μ_{X} = {(μ_{X, 1}, \dots, μ_{X, p})}^{T} = E (X_{t})

,

μ_{Y} = {(μ_{Y, 1}, \dots, μ_{Y, p})}^{T} = E (Y_{t})

and

j_{0} = arg {max}_{j \in [p]} | μ_{X, j} - μ_{Y, j} |

, then

\begin{matrix} T_{n} = & \sqrt{\frac{n_{1} n_{2}}{n_{1} + n_{2}}} {| {\hat{μ}}_{X} - {\hat{μ}}_{Y} |}_{\infty} \geq \sqrt{\frac{n_{1} n_{2}}{n_{1} + n_{2}}} | \frac{1}{n_{1}} \sum_{t = 1}^{n_{1}} X_{t, j_{0}} - \frac{1}{n_{2}} \sum_{t = 1}^{n_{2}} Y_{t, j_{0}} | \\ = & \sqrt{\frac{n_{1} n_{2}}{n_{1} + n_{2}}} | \frac{1}{n_{1}} \sum_{t = 1}^{n_{1}} (X_{t, j_{0}} - μ_{X, j_{0}}) - \frac{1}{n_{2}} \sum_{t = 1}^{n_{2}} (Y_{t, j_{0}} - μ_{Y, j_{0}}) + μ_{X, j_{0}} - μ_{Y, j_{0}} | \\ \geq & \sqrt{\frac{n_{1} n_{2}}{n_{1} + n_{2}}} | μ_{X, j_{0}} - μ_{Y, j_{0}} | - \sqrt{\frac{n_{1} n_{2}}{n_{1} + n_{2}}} | \frac{1}{n_{1}} \sum_{t = 1}^{n_{1}} (X_{t, j_{0}} - μ_{X, j_{0}}) - \frac{1}{n_{2}} \sum_{t = 1}^{n_{2}} (Y_{t, j_{0}} - μ_{Y, j_{0}}) | . \end{matrix}

Similar to the proof of (A1), we know that

| \frac{1}{n_{1}} \sum_{t = 1}^{n_{1}} (X_{t, j_{0}} - μ_{X, j_{0}}) - \frac{1}{n_{2}} \sum_{t = 1}^{n_{2}} (Y_{t, j_{0}} - μ_{Y, j_{0}}) | = O_{p} (n^{- 1 / 2})

under Case3 and Case4. If

| μ_{X, j_{0}} - μ_{Y, j_{0}} | ≫ n^{- 1 / 2}

, we can conclude that

P (T_{n} \geq C^{*} n^{1 / 2} | μ_{X, j_{0}} - μ_{Y, j_{0}} |) \to 1

as

n \to \infty

for some constant

C^{*} > 0

. Since

{\hat{cv}}_{α} ≲ {(log p)}^{1 / 2} = o (n^{1 / 2} | μ_{X, j_{0}} - μ_{Y, j_{0}} |)

under Case3 and Case4, Theorem 3 holds in both cases. □

Appendix E. Additional Simulation Results

Table A1. The Type I error rates, expressed as percentages, were calculated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M2). The simulations were replicated 1000 times.

$(n_{1}, n_{2})$	$ρ$	$p$	Yang	Dempster	BS	SD	CLX
(200,220)	0	50	4.6	2.3	6.4	4.5	3
		200	3.3	0	6.7	5.8	3.4
		400	3.7	0	5	4.4	4.1
		800	3.3	0	6.4	5.2	4.2
	0.1	50	6.2	12.4	23.4	18.4	9.8
		200	5.2	1.8	43.2	39.5	12.9
		400	5.8	0.1	63	59.7	14.7
		800	5.3	0	88.7	87.3	19.4
	0.2	50	8.1	35.6	51.3	44.3	21.9
		200	7.7	23.5	87.2	85.5	37.9
		400	9.2	16.9	98.5	98.3	43
		800	9.6	9.8	100	100	54.5
(400,420)	0	50	4.9	1.8	5.4	3.6	3.5
		200	3.5	0	6.4	5.3	3.2
		400	5	0	5.7	4.8	4.4
		800	4.3	0	6	5	4.5
	0.1	50	6.9	12.2	21.7	17	9.3
		200	4.9	1.9	41.7	39	11.1
		400	7.8	0.1	63.6	61.4	17.7
		800	7.3	0	87.9	86.7	18.3
	0.2	50	8.6	33.7	46.9	40.7	20.6
		200	7.7	23.7	86.3	84.7	31
		400	9.4	17.1	99.2	99	43.6
		800	9	9.5	100	100	53.2
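When reading the table, it helps to know how much Monte Carlo noise 1000 replications leave in an empirical rejection rate. The sketch below computes the binomial standard error of such a rate. The nominal level of 5% is an assumption here, since it is not stated in this appendix, and the helper name `mc_standard_error` is hypothetical.

```python
import math

def mc_standard_error(rate, n_rep):
    """Binomial standard error of an empirical rejection rate (as a fraction)."""
    return math.sqrt(rate * (1.0 - rate) / n_rep)

nominal = 0.05   # assumed nominal level, not taken from this appendix
n_rep = 1000     # number of replications reported in the caption

se = mc_standard_error(nominal, n_rep)
low, high = nominal - 1.96 * se, nominal + 1.96 * se
print(f"SE = {100 * se:.2f} percentage points")
print(f"approximate 95% band = [{100 * low:.2f}%, {100 * high:.2f}%]")
```

Under that assumption the standard error is about 0.7 percentage points, so entries within roughly 1.4 points of 5 are compatible with an exact test, and only larger departures indicate size distortion.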

Figure A1. The empirical powers with sparse signals were evaluated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ based on (M2), $f (\cdot) = 0$ and $ρ = 0$, and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M2), $f (\cdot) = f_{1} (\cdot)$ and $ρ = 0$. The parameter $a$ represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure A2. The empirical powers with dense signals were evaluated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ based on (M2), $f (\cdot) = 0$ and $ρ = 0$, and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M2), $f (\cdot) = f_{2} (\cdot)$ and $ρ = 0$. The parameter $a$ represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Table A2. The Type I error rates, expressed as percentages, were calculated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M3). The simulations were replicated 1000 times.

$(n_{1}, n_{2})$	$ρ$	$p$	Yang	Dempster	BS	SD	CLX
(200,220)	0	50	5.7	16.8	7.7	3	1.6
		200	4.3	14.9	6.9	0.9	1.6
		400	3.5	14.7	7.7	0.2	1.2
		800	4.2	15.4	6.9	0.2	1.7
	0.1	50	7.9	25.2	13.7	5.5	5.4
		200	6.2	23	12	2.7	5
		400	5.5	23.3	12.5	1.2	4.2
		800	6.9	24	12.9	0.7	5.5
	0.2	50	8.6	33.8	21	10.7	12.8
		200	7.5	32.5	19.7	5.8	13.9
		400	6.9	30.4	20.2	4.4	15
		800	9.3	32.3	20.7	1.7	18.4
(400,420)	0	50	5.4	13.9	6.7	1.7	1.6
		200	5.1	15.5	6.4	1	1.1
		400	5.3	14.1	7.1	0.8	1.3
		800	4	16.2	6.3	0.1	1.3
	0.1	50	6.9	21.3	10.7	4.7	5.4
		200	6.6	23.1	12.5	2.7	4.9
		400	7.3	22	11.4	1.7	5.9
		800	6.2	23.8	12.7	0.6	5.4
	0.2	50	8.2	31.8	18.2	8.6	11
		200	8.2	31	19.6	5.2	13.5
		400	8.7	32.6	18.9	3.9	14.7
		800	7.4	35.2	21.3	1.8	17.3

Figure A3. The empirical powers with sparse signals were evaluated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ based on (M3), $f (\cdot) = 0$ and $ρ = 0$, and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M3), $f (\cdot) = f_{1} (\cdot)$ and $ρ = 0$. The parameter $a$ represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure A4. The empirical powers with dense signals were evaluated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ based on (M3), $f (\cdot) = 0$ and $ρ = 0$, and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M3), $f (\cdot) = f_{2} (\cdot)$ and $ρ = 0$. The parameter $a$ represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

References

Hotelling, H. The generalization of Student’s ratio. Ann. Math. Stat. 1931, 2, 360–378.
Hu, J.; Bai, Z. A review of 20 years of naive tests of significance for high-dimensional mean vectors and covariance matrices. Sci. China Math. 2016, 59, 2281–2300.
Harrar, S.W.; Kong, X. Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches. J. Multivar. Anal. 2022, 188, 104855.
Bai, Z.; Saranadasa, H. Effect of high dimension: By an example of a two sample problem. Stat. Sin. 1996, 6, 311–329.
Dempster, A.P. A high dimensional two sample significance test. Ann. Math. Stat. 1958, 29, 995–1010.
Srivastava, M.S.; Du, M. A test for the mean vector with fewer observations than the dimension. J. Multivar. Anal. 2008, 99, 386–402.
Gregory, K.B.; Carroll, R.J.; Baladandayuthapani, V.; Lahiri, S.N. A two-sample test for equality of means in high dimension. J. Am. Stat. Assoc. 2015, 110, 837–849.
Cai, T.T.; Liu, W.; Xia, Y. Two-sample test of high dimensional means under dependence. J. R. Stat. Soc. Ser. B Stat. Methodol. 2014, 76, 349–372.
Chang, J.; Zheng, C.; Zhou, W.X.; Zhou, W. Simulation-Based Hypothesis Testing of High Dimensional Means Under Covariance Heterogeneity. Biometrics 2017, 73, 1300–1310.
Xu, G.; Lin, L.; Wei, P.; Pan, W. An adaptive two-sample test for high-dimensional means. Biometrika 2017, 103, 609–624.
Chernozhukov, V.; Chetverikov, D.; Kato, K. Testing Many Moment Inequalities; Cemmap working paper, No. CWP42/16; Centre for Microdata Methods and Practice (cemmap): London, UK, 2016.
Zhang, D.; Wu, W.B. Gaussian approximation for high dimensional time series. Ann. Stat. 2017, 45, 1895–1919.
Wu, W.B. Nonlinear system theory: Another look at dependence. Proc. Natl. Acad. Sci. USA 2005, 102, 14150–14154.
Chang, J.; Chen, X.; Wu, M. Central limit theorems for high dimensional dependent data. Bernoulli 2024, 30, 712–742.
Raič, M. A multivariate Berry–Esseen theorem with explicit constants. Bernoulli 2019, 25, 2824–2853.
Chernozhukov, V.; Chetverikov, D.; Kato, K.; Koike, Y. Improved central limit theorem and bootstrap approximations in high dimensions. Ann. Stat. 2022, 50, 2562–2586.
Peligrad, M. Some remarks on coupling of dependent random variables. Stat. Probab. Lett. 2002, 60, 201–209.
Künsch, H.R. The jackknife and the bootstrap for general stationary observations. Ann. Stat. 1989, 17, 1217–1241.
Liu, R.Y. Bootstrap procedures under some non-I.I.D. models. Ann. Stat. 1988, 16, 1696–1708.
Hill, J.B.; Li, T. A global wavelet based bootstrapped test of covariance stationarity. arXiv 2022, arXiv:2210.14086.
Fang, X.; Koike, Y. High-dimensional central limit theorems by Stein’s method. Ann. Appl. Probab. 2021, 31, 1660–1686.
Chernozhukov, V.; Chetverikov, D.; Koike, Y. Nearly optimal central limit theorem and bootstrap approximations in high dimensions. Ann. Appl. Probab. 2023, 33, 2374–2425.
Chang, J.; He, J.; Yang, L.; Yao, Q. Modelling matrix time series via a tensor CP-decomposition. J. R. Stat. Soc. Ser. B Stat. Methodol. 2023, 85, 127–148.
Koike, Y. High-dimensional central limit theorems for homogeneous sums. J. Theor. Probab. 2023, 36, 1–45.
Hörmann, S.; Kokoszka, P. Weakly dependent functional data. Ann. Stat. 2010, 38, 1845–1884.
Zhang, X. White noise testing and model diagnostic checking for functional time series. J. Econ. 2016, 194, 76–95.
Politis, D.N.; Romano, J.P.; Wolf, M. Subsampling; Springer Series in Statistics; Springer: Berlin/Heidelberg, Germany, 1999.
Chernozhukov, V.; Chetverikov, D.; Kato, K. Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann. Stat. 2013, 41, 2786–2819.
Zhou, Z. Heteroscedasticity and autocorrelation robust structural change detection. J. Am. Stat. Assoc. 2013, 108, 726–740.
Rio, E. Inequalities and Limit Theorems for Weakly Dependent Sequences, 3rd Cycle. France. 2013, p. 170. Available online: https://cel.hal.science/cel-00867106v2 (accessed on 8 December 2023).
Chernozhukov, V.; Chetverikov, D.; Kato, K. Central limit theorems and bootstrap in high dimensions. Ann. Probab. 2017, 45, 2309–2352.
Merlevède, F.; Peligrad, M.; Rio, E. A Bernstein type inequality and moderate deviations for weakly dependent sequences. Probab. Theory Relat. Fields 2011, 151, 435–474.
Merlevède, F.; Peligrad, M.; Rio, E. Bernstein inequality and moderate deviations under strong mixing conditions. In High Dimensional Probability V: The Luminy Volume; Institute of Mathematical Statistics: Waite Hill, OH, USA, 2009; Volume 5, pp. 273–292.

Figure 1. The empirical powers with sparse signals were evaluated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ based on (M1), $f (\cdot) = 0$ and $ρ = 0$, and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M1), $f (\cdot) = f_{1} (\cdot)$ and $ρ = 0$. The parameter $a$ represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure 2. The empirical powers with dense signals were evaluated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ based on (M1), $f (\cdot) = 0$ and $ρ = 0$, and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M1), $f (\cdot) = f_{2} (\cdot)$ and $ρ = 0$. The parameter $a$ represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure 3. The average annual opening prices of 30 Consumer Discretionary corporations and 31 Information Technology corporations in 2018, 2019, 2020, and 2021.

Table 1. The Type I error rates, expressed as percentages, were calculated by independently generated sequences $\{X_{t}\}_{t = 1}^{n_{1}}$ and $\{Y_{t}\}_{t = 1}^{n_{2}}$ based on (M1). The simulations were replicated 1000 times.

$(n_{1}, n_{2})$	$ρ$	$p$	Yang	Dempster	BS	SD	CLX
(200,220)	0	50	5	18.5	5.8	0.9	0.3
		200	5.9	16.5	6.6	0.4	0.4
		400	5.4	17.4	6.2	0.2	0.3
		800	4.2	13.5	6.7	0.3	0.2
	0.1	50	6.5	22.8	9.3	2	1
		200	6.6	22.6	9.6	1.2	0.8
		400	7.4	22.9	10.4	1	0.8
		800	5.8	22.5	12.4	1	1.2
	0.2	50	6.8	30.2	13.8	3.1	2.5
		200	7.7	29.9	14.3	2.2	2.7
		400	9.3	30.5	18.2	2.2	2.4
		800	7.9	33.3	21.3	3	3.2
(400,420)	0	50	5.2	17.6	6.8	1	0.5
		200	5.3	17.2	6.8	0.5	0.1
		400	4.6	15.1	5.7	0.3	0
		800	5.2	14.2	6.3	0.3	0.4
	0.1	50	5.6	22.4	9.6	1.4	1
		200	6.3	22.5	9.6	1.3	0.8
		400	6.1	21.4	9.7	0.8	0.8
		800	6.5	23.6	12.1	0.7	1.2
	0.2	50	6.7	26.9	12.8	2.5	1.9
		200	7.6	29.2	14.9	2.3	2.4
		400	7.6	29.4	15.1	1.5	2.9
		800	8.3	36.3	21.9	2.5	3.8

Table 2. The p-values for testing the equality of average annual opening prices across two consecutive years in the Consumer Discretionary Sector and Information Technology Sector, respectively.

Sector of S&P 500	2018–2019	2019–2020	2020–2021
Consumer Discretionary	0	0	0
Information Technology	0	0	0
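A p-value reported as 0 from a resampling test typically means that no bootstrap draw reached the observed statistic, so it is best read as "below the resolution of the bootstrap" rather than as exactly zero. The sketch below shows the plain proportion next to a common add-one convention. It does not reproduce the paper's computation: the function name `bootstrap_p_value`, the number of draws, and the stand-in draws are hypothetical.

```python
import numpy as np

def bootstrap_p_value(observed, draws, add_one=False):
    """Share of bootstrap draws at least as large as the observed statistic."""
    draws = np.asarray(draws)
    exceed = int(np.sum(draws >= observed))
    if add_one:
        return (exceed + 1) / (len(draws) + 1)   # never exactly zero
    return exceed / len(draws)

rng = np.random.default_rng(0)
draws = np.abs(rng.standard_normal(1000))   # stand-in bootstrap statistics

# An observed value far in the tail: no draw reaches it.
print(bootstrap_p_value(10.0, draws))                 # plain proportion
print(bootstrap_p_value(10.0, draws, add_one=True))   # add-one version
```

With 1000 draws the plain proportion can only take multiples of 0.001, so a reported 0 is consistent with any true p-value well below that.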

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, L. A Blockwise Bootstrap-Based Two-Sample Test for High-Dimensional Time Series. Entropy 2024, 26, 226. https://doi.org/10.3390/e26030226

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
