Article

Test of the Equality of Several High-Dimensional Covariance Matrices: A Normal-Reference Approach

1 Department of Statistics and Data Science, National University of Singapore, Singapore 117546, Singapore
2 National Institute of Education, Nanyang Technological University, Singapore 637616, Singapore
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(2), 295; https://doi.org/10.3390/math13020295
Submission received: 17 December 2024 / Revised: 13 January 2025 / Accepted: 16 January 2025 / Published: 17 January 2025
(This article belongs to the Special Issue Computational Statistics and Data Analysis, 2nd Edition)

Abstract
As the field of big data continues to evolve, there is an increasing necessity to evaluate the equality of multiple high-dimensional covariance matrices. Many existing methods approximate the null distribution of the test statistic by a normal or extreme-value distribution under stringent conditions, leading to outcomes that are either overly permissive or excessively cautious. Consequently, these methods often lack robustness when applied to real-world data, as verifying the required assumptions can be arduous. In response to these challenges, we introduce a novel test statistic utilizing the normal-reference approach. We demonstrate that, under certain regularity conditions, the test statistic under the null hypothesis and a chi-square-type mixture share the same limiting distribution, with the latter reliably estimable from data using the three-cumulant matched chi-square-approximation. Additionally, we establish the asymptotic power of our proposed test. Through comprehensive simulation studies and real data analysis, our proposed test demonstrates superior performance in terms of size control compared to several competing methods.

1. Introduction

With the rapid advancement in data collection and storage, it has become increasingly common to encounter datasets characterized by a large number of features but a limited number of individuals. For instance, in financial studies, particularly those involving long-term data, each index often comprises hundreds or thousands of time points. However, due to constraints such as market capacity, policy restrictions, and other factors, resources are typically scarce, resulting in only a few subjects available for comparison across indexes. In such scenarios, the data dimension p approaches or even surpasses the total sample size n, a characteristic known as the “large p, small n” phenomenon. This feature renders many conventional methods inapplicable, necessitating specialized approaches. We refer to datasets exhibiting this characteristic as high-dimensional data, and the associated challenge as a “large p, small n” problem. A key focus of multivariate statistical analysis is to compare covariance matrices across several high-dimensional populations. The motivation for this paper partially stems from a financial dataset provided by the Credit Research Initiative of the National University of Singapore (NUS-CRI). In finance, contagion refers to a phenomenon observed through concurrent movements in exchange rates, stock prices, sovereign spreads, and capital flows [1]. Identifying the presence of financial contagion is crucial, as it signifies potential risks for countries aiming to integrate their financial systems with international markets and institutions. Additionally, it aids in understanding economic crises that spread to neighboring countries or regions. A common approach to detecting contagion involves examining the variance–covariance relationships of financial indices across different regions or time periods, as demonstrated by [2,3,4]. 
The Probability of Default (PD) serves as a metric for quantifying the likelihood of an obligor being unable to meet its financial obligations and forms the core of the credit product within the NUS-CRI corporate default prediction system, built on the forward intensity model of [5]. A notable example is the financial contagion observed during the 1997 Asian Financial Crisis, described in Section 4. Consequently, there is interest in investigating whether the covariance matrices of daily PD for neighboring countries during periods of stability and crisis are equal. This inquiry stimulates a k-sample equal-covariance matrix testing problem tailored for high-dimensional data.
Mathematically, a k-sample equal-covariance matrix testing problem for high-dimensional data is described as follows. Let us consider the following k independent high-dimensional samples:
$$\mathbf{y}_{\alpha 1}, \ldots, \mathbf{y}_{\alpha n_\alpha}\ \text{are i.i.d. with}\ \operatorname{E}(\mathbf{y}_{\alpha 1}) = \boldsymbol{\mu}_\alpha,\ \operatorname{Cov}(\mathbf{y}_{\alpha 1}) = \boldsymbol{\Sigma}_\alpha, \quad \alpha = 1, \ldots, k, \tag{1}$$
where the dimension p is significantly large, potentially exceeding the total sample size $n = \sum_{\alpha=1}^{k} n_\alpha$. The objective is to test whether the k covariance matrices are equal:
$$H_0: \boldsymbol{\Sigma}_1 = \cdots = \boldsymbol{\Sigma}_k \quad \text{vs.} \quad H_1: H_0\ \text{is not true}. \tag{2}$$
When $k = 2$, the k-sample equal-covariance matrix testing problem in (2) simplifies to a two-sample equal-covariance matrix testing problem, which has been the subject of several previous studies. Ref. [6] devised a test based on an unbiased estimator, using U-statistics, of the usual squared Frobenius norm of the covariance matrix difference $\boldsymbol{\Sigma}_1 - \boldsymbol{\Sigma}_2$. Under certain stringent conditions, ref. [6] demonstrated that the null distribution of their test statistic is asymptotically normal, without relying on the normality assumption for the samples. However, this test may lack power when the entries of the covariance matrix difference $\boldsymbol{\Sigma}_1 - \boldsymbol{\Sigma}_2$ are sparse, due to its reliance on an $L_2$-norm-based approach. To address this limitation, ref. [7] proposed an $L_\infty$-type test. They showed that under certain regularity conditions, their test statistic asymptotically follows an extreme-value distribution of Type I. Unfortunately, simulation results presented in [8] reveal that [7]'s test is excessively conservative, exhibiting notably small empirical sizes.
For a general $k > 2$, the problem of testing for equality of covariance matrices across all groups has attracted significant attention from researchers. Extending the test to multiple groups necessitates careful consideration of the problem's complexity and the potential trade-offs between power and Type I error control. Ref. [9] addressed (2) by constructing an unbiased estimator for the sum of the usual squared Frobenius norms of the covariance matrix differences $\boldsymbol{\Sigma}_\alpha - \boldsymbol{\Sigma}_\beta$, where $1 \le \alpha < \beta \le k$. However, to derive the asymptotic normal distribution of his test statistic, Schott imposed strong assumptions, including the assumption of Gaussian populations. Nevertheless, this assumption may not hold in real datasets, leading to inaccurate results. Specifically, empirical results in Section 3.1 demonstrate that [9]'s test is overly permissive, particularly when the k samples (1) are non-Gaussian. With a nominal size of 5%, the empirical sizes of [9]'s test can reach 9.61% and 10.08% for $k = 3$ and 4, respectively, even when the samples are normally distributed. When the samples are not normally distributed, the empirical sizes can soar to 32.14% and 41.86% for $k = 3$ and 4, respectively. To mitigate the reliance on normality assumptions, ref. [10] proposed a test statistic extending [6]'s test to the k-sample high-dimensional equal-covariance matrix testing problem. However, they also inherited strong assumptions imposed by [6], such as the existence of the samples' eighth moments. According to the results in Section 3.1, [10]'s test may also be overly permissive, with empirical sizes reaching as high as 13.46% when the assumptions are not satisfied. This aligns with the simulation results presented in [8], which suggested that [6]'s test is overly permissive. Furthermore, both [9]'s and [10]'s tests are $L_2$-norm-based, which may yield poor performance when the entries of the covariance matrix differences are sparse.
In an effort to address both sparse and dense alternatives, ref. [11] combined two types of norms to characterize the distance among the covariance matrices: the Frobenius norm, as adopted by [6], and the maximum norm, introduced by [7]. However, empirical results displayed in Section 3.1 indicate that [11]’s test remains overly permissive in many cases. A common issue with these existing tests is their reliance on achieving normality of their null limiting distributions under certain strong conditions. However, in numerous scenarios, satisfying these conditions is challenging, rendering testing based on normal distribution inadequate.
From the preceding discussion, it is apparent that existing methods often struggle to control the size of the test effectively. In this paper, we address this issue by proposing and examining a normal-reference test for the k-sample equal-covariance matrix testing problem for high-dimensional data as described in (2). Our primary contributions are outlined below. Firstly, leveraging the well-known Kronecker product, we transform the k-sample equal-covariance matrix testing problem (2) on original high-dimensional samples (1) into a k-sample equal-mean vector testing problem on induced high-dimensional samples. This novel approach offers a fresh and innovative method tailored specifically for testing the equality of covariance matrices in high-dimensional data settings. Secondly, to address the k-sample equal-mean vector problem, we adopt the methodology introduced by [12] to construct a U-statistic-based test statistic on the induced high-dimensional samples. Under certain regularity conditions and the null hypothesis, it is demonstrated that the proposed test statistic and a chi-square-type mixture share the same normal or non-normal limiting distribution. Therefore, approximating the null distribution of the test statistic using the normal distribution, as carried out in the works of [9,10], may not always be appropriate. Our approach, termed the normal-reference approach, utilizes the chi-square-type mixture, obtained when the k induced samples are normally distributed, to accurately approximate the null distribution of the test statistic. A key advantage of this approach is its elimination of the need to verify whether the limiting distribution is normal or non-normal. Thirdly, instead of estimating the unknown coefficients of the chi-square-type mixture, we employ the three-cumulant matched chi-square-approximation method proposed by [13] to approximate the distribution of the chi-square-type mixture. 
The approximation parameters are consistently estimated from the data. Fourthly, we establish the asymptotic power under a local alternative. Fifthly, alongside the theoretical foundation, we conduct two simulation studies and a real data application to empirically demonstrate the superiority of our method over several competitors, such as the tests proposed by [9,10,11]. It is worth highlighting that our adaptation of the normal-reference test to the k-sample equal-covariance matrix testing problem is not a direct application of the results from [8]. The asymptotic properties presented in Theorems 1–3 are not directly derived from the theoretical results of [8,14], as these were proposed for the two-sample testing problem. The proofs of Theorems 1–3 are significantly more complex than those in [8].
The structure of this paper is organized as follows: Section 2 presents the main results. Simulation studies are detailed in Section 3. An application to a financial dataset is provided in Section 4. Concluding remarks are offered in Section 5. The technical proofs of the main results are outlined in Appendix A.

2. Main Results

2.1. Test Statistic

Without loss of generality and for simplicity, throughout this section, we assume $\boldsymbol{\mu}_1 = \cdots = \boldsymbol{\mu}_k = \mathbf{0}$, since in this paper we focus solely on the equal-covariance matrix testing problem. This zero-mean assumption is commonly adopted for equal-covariance matrix testing in high-dimensional data, following a convention observed in various studies including [15,16,17], among others. In practice, it is often sufficient to replace $\mathbf{y}_{\alpha i}$, $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$ with $\mathbf{y}_{\alpha i} - \bar{\mathbf{y}}_\alpha$, $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$, where $\bar{\mathbf{y}}_\alpha$, $\alpha = 1, \ldots, k$ are the usual group sample mean vectors of the samples (1), when $\boldsymbol{\mu}_\alpha$, $\alpha = 1, \ldots, k$ are not actually equal to $\mathbf{0}$. Under this assumption, we can express the equal-covariance matrix testing problem (2) based on the k samples (1) as an equal-mean vector testing problem using the following simple transformation.
Let $\operatorname{vec}(\mathbf{A})$ denote the column vector obtained by stacking the columns of a matrix $\mathbf{A}$ one by one. We have $\operatorname{vec}(\mathbf{y}\mathbf{y}^\top) = \mathbf{y} \otimes \mathbf{y}$, where $\otimes$ denotes the well-known Kronecker product and $\mathbf{y}$ is a column vector. Then, the equal-covariance matrix testing problem (2) can be equivalently expressed as the following equal-mean vector testing problem:
$$H_0: \operatorname{vec}(\boldsymbol{\Sigma}_1) = \cdots = \operatorname{vec}(\boldsymbol{\Sigma}_k) \quad \text{vs.} \quad H_1: H_0\ \text{is not true}, \tag{3}$$
based on the following k induced samples:
$$\mathbf{w}_{\alpha i} = \operatorname{vec}(\mathbf{y}_{\alpha i}\mathbf{y}_{\alpha i}^\top) = \mathbf{y}_{\alpha i} \otimes \mathbf{y}_{\alpha i}, \quad i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k, \tag{4}$$
with $\operatorname{E}(\mathbf{w}_{\alpha i}) = \operatorname{vec}(\boldsymbol{\Sigma}_\alpha)$ and $\operatorname{Cov}(\mathbf{w}_{\alpha i}) = \boldsymbol{\Omega}_\alpha$ for $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$. To test (3), it is natural to construct an unbiased estimator of $\sum_{1 \le \alpha < \beta \le k} n^{-1} n_\alpha n_\beta \|\operatorname{vec}(\boldsymbol{\Sigma}_\alpha) - \operatorname{vec}(\boldsymbol{\Sigma}_\beta)\|^2$, where $\|\mathbf{a}\|$ denotes the usual $L_2$-norm of a vector $\mathbf{a}$. It is also apparent that $\|\operatorname{vec}(\boldsymbol{\Sigma}_\alpha - \boldsymbol{\Sigma}_\beta)\|^2 = \operatorname{tr}[(\boldsymbol{\Sigma}_\alpha - \boldsymbol{\Sigma}_\beta)^2]$, representing the usual squared Frobenius norm of the covariance matrix difference $\boldsymbol{\Sigma}_\alpha - \boldsymbol{\Sigma}_\beta$ for $1 \le \alpha < \beta \le k$. Let
$$\bar{\mathbf{w}}_\alpha = n_\alpha^{-1}\sum_{i=1}^{n_\alpha}\mathbf{w}_{\alpha i}, \quad \text{and} \quad \hat{\boldsymbol{\Omega}}_\alpha = (n_\alpha - 1)^{-1}\sum_{i=1}^{n_\alpha}(\mathbf{w}_{\alpha i} - \bar{\mathbf{w}}_\alpha)(\mathbf{w}_{\alpha i} - \bar{\mathbf{w}}_\alpha)^\top, \tag{5}$$
represent the usual group sample mean vectors and sample covariance matrices of the k induced samples (4). Following [12], for $\alpha \ne \beta$, $\alpha, \beta = 1, \ldots, k$, the U-statistics for estimating $\operatorname{vec}(\boldsymbol{\Sigma}_\alpha)^\top\operatorname{vec}(\boldsymbol{\Sigma}_\alpha)$ and $\operatorname{vec}(\boldsymbol{\Sigma}_\alpha)^\top\operatorname{vec}(\boldsymbol{\Sigma}_\beta)$ are given by
$$S_{\alpha\alpha} = \frac{2\sum_{1 \le i < j \le n_\alpha}\mathbf{w}_{\alpha i}^\top\mathbf{w}_{\alpha j}}{n_\alpha(n_\alpha - 1)} = \|\bar{\mathbf{w}}_\alpha\|^2 - \frac{\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha)}{n_\alpha}, \quad \text{and} \quad S_{\alpha\beta} = \frac{\sum_{i=1}^{n_\alpha}\sum_{j=1}^{n_\beta}\mathbf{w}_{\alpha i}^\top\mathbf{w}_{\beta j}}{n_\alpha n_\beta} = \bar{\mathbf{w}}_\alpha^\top\bar{\mathbf{w}}_\beta.$$
It follows that $\sum_{1 \le \alpha < \beta \le k} n^{-1} n_\alpha n_\beta (S_{\alpha\alpha} + S_{\beta\beta} - 2S_{\alpha\beta})$ is an unbiased estimator of $\sum_{1 \le \alpha < \beta \le k} n^{-1} n_\alpha n_\beta \|\operatorname{vec}(\boldsymbol{\Sigma}_\alpha) - \operatorname{vec}(\boldsymbol{\Sigma}_\beta)\|^2$. Consequently, we can construct a U-statistic-based test statistic for (3) as follows:
$$T_{n,p} = \sum_{1 \le \alpha < \beta \le k}\frac{n_\alpha n_\beta}{n}(S_{\alpha\alpha} + S_{\beta\beta} - 2S_{\alpha\beta}) = \sum_{\alpha=1}^{k}\frac{n_\alpha(n - n_\alpha)}{n}S_{\alpha\alpha} - 2\sum_{1 \le \alpha < \beta \le k}\frac{n_\alpha n_\beta}{n}S_{\alpha\beta}. \tag{6}$$
To save computation time, we can equivalently rewrite $T_{n,p}$ in (6) as follows:
$$T_{n,p} = \sum_{\alpha=1}^{k} n_\alpha\|\bar{\mathbf{w}}_\alpha - \bar{\mathbf{w}}\|^2 - \operatorname{tr}(\hat{\boldsymbol{\Omega}}_n),$$
where $\bar{\mathbf{w}} = n^{-1}\sum_{\alpha=1}^{k} n_\alpha\bar{\mathbf{w}}_\alpha$ and $\operatorname{tr}(\hat{\boldsymbol{\Omega}}_n) = \sum_{\alpha=1}^{k}(1 - n_\alpha/n)\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha)$.

2.2. Asymptotic Null Distribution

To further investigate the null distribution of $T_{n,p}$ in (6), we set $\mathbf{u}_{\alpha i} = \mathbf{w}_{\alpha i} - \operatorname{vec}(\boldsymbol{\Sigma}_\alpha)$, $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$, and let $\bar{\mathbf{u}}_\alpha$ be the usual sample mean vector of $\mathbf{u}_{\alpha i}$, $i = 1, \ldots, n_\alpha$, so that $\bar{\mathbf{u}}_\alpha = \bar{\mathbf{w}}_\alpha - \operatorname{vec}(\boldsymbol{\Sigma}_\alpha)$, $\alpha = 1, \ldots, k$. We can then further write
$$T_{n,p} = T_{n,p,0} + 2Q_{n,p} + \sum_{1 \le \alpha < \beta \le k}\frac{n_\alpha n_\beta}{n}\operatorname{tr}[(\boldsymbol{\Sigma}_\alpha - \boldsymbol{\Sigma}_\beta)^2],$$
where
$$T_{n,p,0} = \sum_{\alpha=1}^{k} n_\alpha\|\bar{\mathbf{u}}_\alpha - \bar{\mathbf{u}}\|^2 - \operatorname{tr}(\hat{\boldsymbol{\Omega}}_n), \quad \text{and} \quad Q_{n,p} = \sum_{\alpha=1}^{k} n_\alpha(\bar{\mathbf{u}}_\alpha - \bar{\mathbf{u}})^\top\operatorname{vec}(\boldsymbol{\Sigma}_\alpha), \tag{8}$$
with $\bar{\mathbf{u}} = n^{-1}\sum_{\alpha=1}^{k} n_\alpha\bar{\mathbf{u}}_\alpha$. It is clear that under the null hypothesis, $T_{n,p}$ and $T_{n,p,0}$ have the same distribution. For further study, we can express $T_{n,p,0}$ in (8) as $T_{n,p,0} = \mathbf{v}^\top(\mathbf{H} \otimes \mathbf{I}_{p^2})\mathbf{v} - \operatorname{tr}(\hat{\boldsymbol{\Omega}}_n)$, where $\mathbf{v} = [\sqrt{n_1}\,\bar{\mathbf{u}}_1^\top, \ldots, \sqrt{n_k}\,\bar{\mathbf{u}}_k^\top]^\top$, and $\mathbf{H} = \mathbf{I}_k - \boldsymbol{\delta}_n\boldsymbol{\delta}_n^\top$ with $\boldsymbol{\delta}_n = [\sqrt{n_1/n}, \ldots, \sqrt{n_k/n}]^\top$. It is easy to check that $\mathbf{H} \otimes \mathbf{I}_{p^2}$ is an idempotent matrix. Following the proof of Theorem 3 in [12], we have $\operatorname{E}(T_{n,p,0}) = 0$, and
$$\sigma_T^2 = \operatorname{Var}(T_{n,p,0}) = 2\left\{\sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^2 n_\alpha}{n^2(n_\alpha - 1)}\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2) + 2\sum_{1 \le \alpha < \beta \le k}\frac{n_\alpha n_\beta}{n^2}\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta)\right\}. \tag{9}$$
When the k induced samples (4) are treated as normally distributed, we denote the k Gaussian induced samples as $\mathbf{w}^*_{\alpha i} \overset{\text{i.i.d.}}{\sim} N_{p^2}(\operatorname{vec}(\boldsymbol{\Sigma}_\alpha), \boldsymbol{\Omega}_\alpha)$, $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$, and set $\mathbf{u}^*_{\alpha i} = \mathbf{w}^*_{\alpha i} - \operatorname{vec}(\boldsymbol{\Sigma}_\alpha)$, $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$. Then, we have $T^*_{n,p,0} = \sum_{\alpha=1}^{k} n_\alpha\|\bar{\mathbf{u}}^*_\alpha - \bar{\mathbf{u}}^*\|^2 - \operatorname{tr}(\hat{\boldsymbol{\Omega}}^*_n)$, where $\bar{\mathbf{u}}^*_\alpha = n_\alpha^{-1}\sum_{i=1}^{n_\alpha}\mathbf{u}^*_{\alpha i}$, $\bar{\mathbf{u}}^* = n^{-1}\sum_{\alpha=1}^{k} n_\alpha\bar{\mathbf{u}}^*_\alpha$, $\hat{\boldsymbol{\Omega}}^*_\alpha = (n_\alpha - 1)^{-1}\sum_{i=1}^{n_\alpha}(\mathbf{u}^*_{\alpha i} - \bar{\mathbf{u}}^*_\alpha)(\mathbf{u}^*_{\alpha i} - \bar{\mathbf{u}}^*_\alpha)^\top$, $\alpha = 1, \ldots, k$, and $\operatorname{tr}(\hat{\boldsymbol{\Omega}}^*_n) = \sum_{\alpha=1}^{k}(1 - n_\alpha/n)\operatorname{tr}(\hat{\boldsymbol{\Omega}}^*_\alpha)$. In other words, $T^*_{n,p,0}$ is obtained from $T_{n,p,0}$ when the k induced samples (4) are treated as normally distributed. We then call the distribution of $T^*_{n,p,0}$ the normal-reference distribution of $T_{n,p,0}$ and the resulting test a normal-reference test. In what follows, we shall show that the distribution of $T_{n,p,0}$ can be asymptotically approximated by that of $T^*_{n,p,0}$.
Throughout this paper, let $\overset{d}{=}$ denote equality in distribution and $\chi^2_v$ denote a central chi-square distribution with $v$ degrees of freedom. For any given n and p, it is easy to show that $T^*_{n,p,0}$ has the same distribution as a chi-square-type mixture:
$$T^*_{n,p,0} \overset{d}{=} \sum_{r=1}^{kp^2}\lambda_{n,p,r}A_r - \sum_{\alpha=1}^{k}\sum_{r=1}^{p^2}\frac{(n - n_\alpha)\lambda_{\alpha r}}{n(n_\alpha - 1)}B_{\alpha r}, \tag{10}$$
where $\lambda_{n,p,r}$, $r = 1, \ldots, kp^2$ are the eigenvalues of
$$\boldsymbol{\Omega}_n = \operatorname{Cov}[(\mathbf{H} \otimes \mathbf{I}_{p^2})\mathbf{v}] = (\mathbf{H} \otimes \mathbf{I}_{p^2})\operatorname{diag}(\boldsymbol{\Omega}_1, \ldots, \boldsymbol{\Omega}_k)(\mathbf{H} \otimes \mathbf{I}_{p^2}), \tag{11}$$
while $\lambda_{\alpha r}$, $r = 1, \ldots, p^2$ are the eigenvalues of $\boldsymbol{\Omega}_\alpha$ for $\alpha = 1, \ldots, k$, and $A_r \overset{\text{i.i.d.}}{\sim} \chi^2_1$ and $B_{\alpha r} \overset{\text{i.i.d.}}{\sim} \chi^2_{n_\alpha - 1}$, $\alpha = 1, \ldots, k$ are mutually independent. Obviously, we have $\operatorname{E}(T^*_{n,p,0}) = 0$ and $\operatorname{Var}(T^*_{n,p,0}) = \sigma_T^2$.
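To build intuition for the mixture representation (10), the following Monte Carlo sketch draws from the mixture and checks that its mean is near zero. The eigenvalues here are small, made-up stand-ins for $\lambda_{\alpha r}$; the $\lambda_{n,p,r}$ are chosen only so that the centering constraint $\sum_r \lambda_{n,p,r} = \sum_\alpha (1 - n_\alpha/n)\operatorname{tr}(\boldsymbol{\Omega}_\alpha)$ holds (in the theorem they are the eigenvalues of $\boldsymbol{\Omega}_n$, not free parameters):

```python
import numpy as np

rng = np.random.default_rng(1)
n_alpha = np.array([30, 40, 50]); n = n_alpha.sum()
# Hypothetical eigenvalues standing in for lambda_{alpha r} (three per group):
lam = [np.array([3.0, 1.0, 0.5]), np.array([2.0, 1.5, 0.5]), np.array([2.5, 1.0, 1.0])]
w = 1 - n_alpha / n
# A simple centered choice of lambda_{n,p,r}: scale each group's eigenvalues
# by (1 - n_a/n) so that sum_r lambda_{n,p,r} = sum_a (1 - n_a/n) tr(Omega_a).
lam_np = np.concatenate([w[a] * lam[a] for a in range(3)])

B = 200_000
pos = rng.chisquare(1, size=(B, lam_np.size)) @ lam_np        # sum_r lambda_{n,p,r} A_r
neg = sum((w[a] / (n_alpha[a] - 1))
          * (rng.chisquare(n_alpha[a] - 1, size=(B, lam[a].size)) @ lam[a])
          for a in range(3))                                   # weighted B_{alpha r} part
T_star = pos - neg
print(abs(T_star.mean()) < 0.1)   # E(T*) = 0, so the empirical mean is near zero
```

The weight $(n - n_\alpha)/[n(n_\alpha - 1)]$ in (10) equals $(1 - n_\alpha/n)/(n_\alpha - 1)$, which is what the code uses.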
Remark 1. 
In practice, the k induced samples (4) are rarely normally distributed. Nevertheless, as a normal-reference test, we treat them as normally distributed to simplify $T^*_{n,p,0}$ to the chi-square-type mixture (10). The crux of the proposed normal-reference test is thus to demonstrate that $T^*_{n,p,0}$ and $T_{n,p,0}$ share the same asymptotic limit and that approximating the distribution of $T^*_{n,p,0}$ is straightforward.
For further theoretical discussion, following [14], we introduce a norm which measures the difference between two probability measures. For two probability measures $\nu_1$ and $\nu_2$ on $\mathbb{R}$, let $\nu_1 - \nu_2$ denote the signed measure such that for any Borel set $A$, $(\nu_1 - \nu_2)(A) = \nu_1(A) - \nu_2(A)$. Let $B^3_b(\mathbb{R})$ denote the class of bounded functions with continuous derivatives up to order 3. It is known that a sequence of random variables $\{x_i\}_{i=1}^\infty$ converges weakly to a random variable $x$ if and only if for every $f \in B^3_b(\mathbb{R})$, we have $\operatorname{E}[f(x_i)] \to \operatorname{E}[f(x)]$; see [14] for details. We use this property to give a definition of weak convergence in $\mathbb{R}$. For a function $f \in B^3_b(\mathbb{R})$, let $f^{(r)}$ denote the r-th derivative of $f$, $r = 1, 2, 3$. For a finite signed measure $\nu$ on $\mathbb{R}$, we define the norm $\|\nu\|_3$ as $\sup_f \left|\int_{\mathbb{R}} f(x)\,\nu(dx)\right|$, where the supremum is taken over all $f \in B^3_b(\mathbb{R})$ such that $\sup_{x \in \mathbb{R}}|f^{(r)}(x)| \le 1$, $r = 1, 2, 3$. It is straightforward to verify that $\|\cdot\|_3$ is indeed a norm. Also, a sequence of probability measures $\{\nu_i\}_{i=1}^\infty$ converges weakly to a probability measure $\nu$ if and only if $\|\nu_i - \nu\|_3 \to 0$. For simplicity, we often denote $[\operatorname{E}(X)]^2$ and $[\operatorname{Var}(X)]^2$ as $\operatorname{E}^2(X)$ and $\operatorname{Var}^2(X)$, respectively. Let $c_{n,p,r} = \lambda_{n,p,r}/\sqrt{\operatorname{tr}(\boldsymbol{\Omega}_n^2)}$, $r = 1, \ldots, kp^2$. These values are the eigenvalues of $\boldsymbol{\Omega}_n/\sqrt{\operatorname{tr}(\boldsymbol{\Omega}_n^2)}$, arranged in descending order, as $\lambda_{n,p,r}$, $r = 1, \ldots, kp^2$ are the eigenvalues of $\boldsymbol{\Omega}_n$ as defined in (11). We further impose the following conditions.
C1. 
As $n \to \infty$, we have $n_\alpha/n \to \tau_\alpha \in (0, 1)$, $\alpha = 1, \ldots, k$.
C2. 
There is a universal constant $3 \le \gamma < \infty$ such that for any $q \times p^2$ real matrix $\mathbf{B}$, we have $\operatorname{E}\|\mathbf{B}\mathbf{u}_{\alpha i}\|^4 \le \gamma \operatorname{E}^2(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2)$, for all $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$.
C3. 
As $n, p \to \infty$, we have $c_{n,p,r} \to c_r \ge 0$ for all $r = 1, 2, \ldots$, uniformly.
C4. 
As $n, p \to \infty$, we have $p^2/n \to c \in (0, \infty)$.
Condition C1 is regular for any k-sample testing problem. It requires that the k sample sizes $n_1, \ldots, n_k$ tend to infinity proportionally. Under Condition C1, by (9) and (11), as $n \to \infty$, we have
$$\sigma_T^2 = 2\left\{\operatorname{tr}(\boldsymbol{\Omega}_n^2) + \sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^2\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2)}{n^2(n_\alpha - 1)}\right\} = 2\operatorname{tr}(\boldsymbol{\Omega}_n^2)[1 + o(1)].$$
Condition C2 is a key condition in this study. It is largely equivalent to the assumption that the original k samples (1) have finite eighth moments, as imposed in [18]. Remark 1 of [8] shows that Condition C2 automatically holds under Assumption 1 of [18]. To give more insight into Condition C2, we list the following remarks.
Remark 2. 
When $\mathbf{B}$ is a row vector, e.g., $\mathbf{B} = \mathbf{b}^\top$, Condition C2 implies that the kurtosis of $\mathbf{b}^\top\mathbf{u}_{\alpha i}$ is bounded by $\gamma$ for all $\mathbf{b}$: $\operatorname{kurt}(\mathbf{b}^\top\mathbf{u}_{\alpha i}) = \operatorname{E}(\mathbf{b}^\top\mathbf{u}_{\alpha i})^4/\operatorname{Var}^2(\mathbf{b}^\top\mathbf{u}_{\alpha i}) \le \gamma$. In simpler terms, the kurtosis of $\mathbf{u}_{\alpha i}$ is uniformly bounded in any projection direction for all $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$. According to [19], the kurtosis value reflects the tails of a distribution. Thus, Condition C2 essentially ensures that the distribution of $\mathbf{u}_{\alpha i}$ does not exhibit heavy tails in any projection direction, and is therefore quite mild.
Remark 3. 
We have $\operatorname{E}(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^4) = \operatorname{Var}(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2) + \operatorname{E}^2(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2)$. This expression, along with Condition C2, implies that the variances of $\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2$ are uniformly bounded by $(\gamma - 1)\operatorname{E}^2(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2)$ and that the noise-to-signal ratios $\operatorname{Var}^{1/2}(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2)/\operatorname{E}(\|\mathbf{B}\mathbf{u}_{\alpha i}\|^2)$ are also uniformly bounded.
Remark 4. 
When $\mathbf{u}_{\alpha i}$, $i = 1, \ldots, n_\alpha;\ \alpha = 1, \ldots, k$ are normally distributed, Condition C2 is automatically satisfied with $\gamma = 3$. A proof is outlined in Appendix A.
Condition C3 ensures the existence of the limits of $c_{n,p,r}$, which are the eigenvalues of $\boldsymbol{\Omega}_n/\sqrt{\operatorname{tr}(\boldsymbol{\Omega}_n^2)}$. It is used to obtain the limiting distributions of the standardized versions of $T_{n,p,0}$ and $T^*_{n,p,0}$, namely $\tilde{T}_{n,p,0} = T_{n,p,0}/\sigma_T$ and $\tilde{T}^*_{n,p,0} = T^*_{n,p,0}/\sigma_T$, which have zero mean and unit variance. Condition C4 is imposed for studying the ratio consistency of the estimators used in the proposed normal-reference test. It is analogous to the condition $p/n \to c \in (0, \infty)$ imposed in [20] for testing high-dimensional mean vectors; in our equal-covariance matrix testing problem, the associated dimension is $p^2$. Throughout this paper, let $\mathcal{L}(y)$ denote the distribution of a random variable $y$ and $\overset{\mathcal{L}}{\to}$ denote convergence in distribution. We have the following useful theorem, whose proof is presented in Appendix A.
Theorem 1. 
Under Condition C2, we have
$$\|\mathcal{L}(\tilde{T}_{n,p,0}) - \mathcal{L}(\tilde{T}^*_{n,p,0})\|_3 \le (2\gamma)^{3/2}\,3^{-1/4}\sum_{\alpha=1}^{k} n_\alpha^{-1/2},$$
where $\gamma$ is defined in Condition C2.
Theorem 1 states that the distance between the distributions of $\tilde{T}_{n,p,0}$ and $\tilde{T}^*_{n,p,0}$ is $O(n_{\min}^{-1/2})$, where $n_{\min} = \min_{1 \le \alpha \le k} n_\alpha$. This theorem demonstrates that the distributions of $\tilde{T}_{n,p,0}$ and $\tilde{T}^*_{n,p,0}$ become asymptotically equivalent. Hence, Theorem 1 furnishes a systematic theoretical justification for employing the distribution of $\tilde{T}^*_{n,p,0}$ to approximate the distribution of $\tilde{T}_{n,p,0}$. Consequently, we study the asymptotic distribution of $\tilde{T}^*_{n,p,0}$ in Theorem 2, which is proved in Appendix A.
Theorem 2. 
Under Conditions C1–C3, as $n, p \to \infty$, we have $\tilde{T}^*_{n,p,0} \overset{\mathcal{L}}{\to} \zeta$ with
$$\zeta \overset{d}{=} \left(1 - \sum_{r=1}^{\infty} c_r^2\right)^{1/2} z_0 + 2^{-1/2}\sum_{r=1}^{\infty} c_r(z_r^2 - 1), \tag{13}$$
where $z_0, z_1, z_2, \ldots$ are i.i.d. $N(0, 1)$, and $c_r$, $r = 1, 2, \ldots$ are defined in Condition C3.
Theorem 2 offers a unified expression for the possible asymptotic distributions of $\tilde{T}^*_{n,p,0}$, namely the distribution of a weighted sum of a standard normal random variable and a sequence of centered chi-square random variables. From Fatou's Lemma and Condition C3, we have $\sum_{r=1}^{\infty} c_r^2 \le \lim_{n,p\to\infty}\sum_{r=1}^{kp^2} c_{n,p,r}^2 = 1$, indicating that $\sum_{r=1}^{\infty} c_r^2$ lies within the interval $[0, 1]$. Below, we provide some remarks to elucidate certain special cases of the possible distribution of $\zeta$ in (13).
Remark 5. 
We have $\zeta \overset{d}{=} z_0 \sim N(0, 1)$ when $\sum_{r=1}^{\infty} c_r^2 = 0$, equivalently, $c_r = 0$, $r = 1, 2, \ldots$, which holds under the following condition: as $p \to \infty$,
$$\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta\boldsymbol{\Omega}_\gamma\boldsymbol{\Omega}_\beta) = o\{\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta)\operatorname{tr}(\boldsymbol{\Omega}_\gamma\boldsymbol{\Omega}_\beta)\}, \quad \text{for}\ \alpha, \beta, \gamma = 1, \ldots, k.$$
The above condition was proposed and used in [12]; it is a multi-sample analog of condition (3.6) in [21].
Remark 6. 
We have $\zeta \overset{d}{=} 2^{-1/2}\sum_{r=1}^{\infty} c_r(z_r^2 - 1)$, a weighted sum of centered chi-square random variables, when $\sum_{r=1}^{\infty} c_r^2 = 1$, which holds under Condition C3 when $\lim_{n,p\to\infty}\sum_{r=1}^{kp^2} c_{n,p,r}^2 = \sum_{r=1}^{\infty}\lim_{n,p\to\infty} c_{n,p,r}^2$ holds.
Remark 7. 
The preceding two remarks show that the null limiting distribution of $T_{n,p}$ can be either normal or non-normal. Nevertheless, in practical scenarios, verifying whether $\sum_{r=1}^{\infty} c_r^2 = 0$ or $\sum_{r=1}^{\infty} c_r^2 = 1$ can be quite challenging. Consequently, it may not always be suitable to rely on the normal approximation for the null distribution of $T_{n,p}$. This theoretical insight elucidates why test statistics based on normal approximation, such as those proposed by [9,10], are not universally applicable.

2.3. Implementation

To implement the proposed normal-reference test, we approximate the null distribution of $T_{n,p}$ with the distribution of $T^*_{n,p,0}$, which is a chi-square-type mixture as given in (10). However, accurately estimating the coefficients of this mixture poses a challenge. To surmount this hurdle, we adopt the three-cumulant (3-c) matched chi-square-approximation [13] to approximate the distribution of $T^*_{n,p,0}$. The core concept of the 3-c matched $\chi^2$-approximation is to approximate the distribution of $T^*_{n,p,0}$ by that of a random variable of the form $R \overset{d}{=} \beta_0 + \beta_1\chi^2_d$, where $\beta_0$, $\beta_1$, and $d$ are the approximation parameters, with $d$ representing the approximate degrees of freedom of the 3-c matched $\chi^2$-approximation. These parameters are determined by matching the first three cumulants (mean, variance, and third central moment) of $T^*_{n,p,0}$ and $R$. For simplicity, let $\mathcal{K}_\ell(X)$, $\ell = 1, 2, 3$ denote the first three cumulants of a random variable $X$. It is evident that the first three cumulants of $R$ are given by $\mathcal{K}_1(R) = \beta_0 + \beta_1 d$, $\mathcal{K}_2(R) = 2\beta_1^2 d$, and $\mathcal{K}_3(R) = 8\beta_1^3 d$, while the first three cumulants of $T^*_{n,p,0}$ are given by $\mathcal{K}_1(T^*_{n,p,0}) = \operatorname{E}(T^*_{n,p,0}) = 0$,
$$\mathcal{K}_2(T^*_{n,p,0}) = \operatorname{Var}(T^*_{n,p,0}) = \sigma_T^2 = 2\left\{\operatorname{tr}(\boldsymbol{\Omega}_n^2) + \sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^2}{n^2(n_\alpha - 1)}\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2)\right\}, \quad \text{and} \quad \mathcal{K}_3(T^*_{n,p,0}) = \operatorname{E}(T^{*3}_{n,p,0}) = 8\left\{\operatorname{tr}(\boldsymbol{\Omega}_n^3) - \sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^3}{n^3(n_\alpha - 1)^2}\operatorname{tr}(\boldsymbol{\Omega}_\alpha^3)\right\}. \tag{14}$$
By some simple algebra, we have
$$\mathcal{K}_2(T^*_{n,p,0}) = 2\left\{\sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^2 n_\alpha}{n^2(n_\alpha - 1)}\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2) + 2\sum_{1 \le \alpha < \beta \le k}\frac{n_\alpha n_\beta}{n^2}\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta)\right\}, \quad \text{and} \quad \mathcal{K}_3(T^*_{n,p,0}) = 8\left\{\sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^3 n_\alpha(n_\alpha - 2)}{n^3(n_\alpha - 1)^2}\operatorname{tr}(\boldsymbol{\Omega}_\alpha^3) + 3\sum_{\alpha \ne \beta}\frac{n_\alpha n_\beta(n - n_\alpha)}{n^3}\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2\boldsymbol{\Omega}_\beta) - 6\sum_{1 \le \alpha < \beta < \gamma \le k}\frac{n_\alpha n_\beta n_\gamma}{n^3}\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta\boldsymbol{\Omega}_\gamma)\right\}. \tag{15}$$
It is evident that $\mathcal{K}_2(T^*_{n,p,0}) > 0$ and $\mathcal{K}_3(T^*_{n,p,0}) > 0$, since we always have $n_\alpha > 2$ and $\boldsymbol{\Omega}_\alpha$ non-negative definite for all $\alpha = 1, \ldots, k$. Matching the first three cumulants of $T^*_{n,p,0}$ and $R$ then leads to
$$\beta_0 = -\frac{2\mathcal{K}_2^2(T^*_{n,p,0})}{\mathcal{K}_3(T^*_{n,p,0})}, \quad \beta_1 = \frac{\mathcal{K}_3(T^*_{n,p,0})}{4\mathcal{K}_2(T^*_{n,p,0})}, \quad \text{and} \quad d = \frac{8\mathcal{K}_2^3(T^*_{n,p,0})}{\mathcal{K}_3^2(T^*_{n,p,0})}. \tag{16}$$
This leads to $\beta_0 < 0$, $\beta_1 > 0$, and $d > 0$. The negative value of $\beta_0$ is expected, since $T^*_{n,p,0}$ is a chi-square-type mixture with both positive and negative coefficients. Note that the skewness of $T^*_{n,p,0}$ can be expressed as
$$\frac{\operatorname{E}(T^{*3}_{n,p,0})}{\operatorname{Var}^{3/2}(T^*_{n,p,0})} = \frac{\mathcal{K}_3(T^*_{n,p,0})}{[\mathcal{K}_2(T^*_{n,p,0})]^{3/2}} = \sqrt{8/d}.$$
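The matching step in (16) is a mechanical three-equation solve. A minimal sketch (the function name and the numeric cumulants below are ours, for illustration) recovers $(\beta_0, \beta_1, d)$ and verifies that the first three cumulants of $\beta_0 + \beta_1\chi^2_d$ are indeed matched:

```python
import numpy as np

def match_three_cumulants(K2, K3):
    """Solve the 3-c matched chi-square approximation: find (beta0, beta1, d)
    so that beta0 + beta1*chi2_d has cumulants (0, K2, K3)."""
    beta1 = K3 / (4.0 * K2)
    d = 8.0 * K2 ** 3 / K3 ** 2
    beta0 = -beta1 * d            # equals -2*K2^2/K3, forcing the mean to 0
    return beta0, beta1, d

b0, b1, d = match_three_cumulants(K2=10.0, K3=4.0)
# Cumulants of beta0 + beta1*chi2_d: (b0 + b1*d, 2*b1^2*d, 8*b1^3*d).
print(np.isclose(b0 + b1 * d, 0.0),
      np.isclose(2 * b1 ** 2 * d, 10.0),
      np.isclose(8 * b1 ** 3 * d, 4.0))   # → True True True
```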
Remark 8. 
For large $n_\alpha$, $\alpha = 1, \ldots, k$, by (14), we have
$$\mathcal{K}_2(T^*_{n,p,0}) = 2\operatorname{tr}(\boldsymbol{\Omega}_n^2)[1 + o(1)], \quad \text{and} \quad \mathcal{K}_3(T^*_{n,p,0}) = 8\operatorname{tr}(\boldsymbol{\Omega}_n^3)[1 + o(1)].$$
Then, by (16), we have
$$\beta_0 = -\frac{\operatorname{tr}^2(\boldsymbol{\Omega}_n^2)}{\operatorname{tr}(\boldsymbol{\Omega}_n^3)}[1 + o(1)], \quad \beta_1 = \frac{\operatorname{tr}(\boldsymbol{\Omega}_n^3)}{\operatorname{tr}(\boldsymbol{\Omega}_n^2)}[1 + o(1)], \quad \text{and} \quad d = \frac{\operatorname{tr}^3(\boldsymbol{\Omega}_n^2)}{\operatorname{tr}^2(\boldsymbol{\Omega}_n^3)}[1 + o(1)].$$
To apply the 3-c matched $\chi^2$-approximation, we need to estimate $\mathcal{K}_2(T^*_{n,p,0})$ and $\mathcal{K}_3(T^*_{n,p,0})$ consistently. Recall that the usual unbiased estimators of $\boldsymbol{\Omega}_\alpha$, $\alpha = 1, \ldots, k$ are given by $\hat{\boldsymbol{\Omega}}_\alpha$, $\alpha = 1, \ldots, k$ as in (5). We first find an unbiased and ratio-consistent estimator of $\mathcal{K}_2(T^*_{n,p,0})$. According to (15), this requires unbiased and ratio-consistent estimators of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2)$, $\alpha = 1, \ldots, k$, and $\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta)$, $\alpha \ne \beta$, respectively. By Lemma S.3 of [22], the unbiased and ratio-consistent estimators of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2)$, $\alpha = 1, \ldots, k$ are given by
$$\widehat{\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2)} = \frac{(n_\alpha - 1)^2}{(n_\alpha - 2)(n_\alpha + 1)}\left[\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha^2) - \frac{1}{n_\alpha - 1}\operatorname{tr}^2(\hat{\boldsymbol{\Omega}}_\alpha)\right], \quad \alpha = 1, \ldots, k.$$
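This estimator can be coded directly. In the sketch below, `tr_omega2_hat` is a name of our choosing, and a small dimension stands in for $p^2$; the Monte Carlo check illustrates the unbiasedness for a known covariance matrix:

```python
import numpy as np

def tr_omega2_hat(X):
    """Unbiased, ratio-consistent estimator of tr(Omega^2) from an
    n x q data matrix X whose rows are i.i.d. with covariance Omega."""
    n = X.shape[0]
    S = np.cov(X, rowvar=False)           # sample covariance (denominator n - 1)
    return ((n - 1) ** 2 / ((n - 2) * (n + 1))) * (
        np.trace(S @ S) - np.trace(S) ** 2 / (n - 1))

# Monte Carlo check: for Omega = diag(2, 1, 1), tr(Omega^2) = 6.
rng = np.random.default_rng(2)
omega = np.diag([2.0, 1.0, 1.0])
est = np.mean([tr_omega2_hat(rng.multivariate_normal(np.zeros(3), omega, size=50))
               for _ in range(2000)])
print(abs(est - 6.0) < 0.3)   # the average of the estimates sits close to 6
```

The correction factor $(n_\alpha - 1)^2/[(n_\alpha - 2)(n_\alpha + 1)]$ removes the bias that the plug-in quantity $\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha^2)$ carries in finite samples.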
By the proof of Theorem 2 of [23], when the k induced samples (4) are normally distributed, an unbiased and ratio-consistent estimator of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta)$ is given by $\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha\hat{\boldsymbol{\Omega}}_\beta)$, $\alpha \ne \beta$. Therefore, based on (15), the unbiased and ratio-consistent estimator of $\mathcal{K}_2(T^*_{n,p,0})$ is given by
$$\widehat{\mathcal{K}_2(T^*_{n,p,0})} = 2\left\{\sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^2 n_\alpha}{n^2(n_\alpha - 1)}\widehat{\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2)} + 2\sum_{1 \le \alpha < \beta \le k}\frac{n_\alpha n_\beta}{n^2}\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha\hat{\boldsymbol{\Omega}}_\beta)\right\}.$$
We now find an unbiased and ratio-consistent estimator of $\mathcal{K}_3(T^*_{n,p,0})$. According to (15), this requires unbiased and ratio-consistent estimators of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^3)$, $\alpha = 1, \ldots, k$, $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2\boldsymbol{\Omega}_\beta)$, $\alpha \ne \beta$, and $\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta\boldsymbol{\Omega}_\gamma)$, $1 \le \alpha < \beta < \gamma \le k$, respectively. By Lemma 1 of [24], under Condition C4 and when the k induced samples (4) are normally distributed, the unbiased and ratio-consistent estimators of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^3)$, $\alpha = 1, \ldots, k$ are given by
$$\widehat{\operatorname{tr}(\boldsymbol{\Omega}_\alpha^3)} = \frac{(n_\alpha - 1)^4}{(n_\alpha^2 + n_\alpha - 6)(n_\alpha^2 - 2n_\alpha - 3)}\left[\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha^3) - \frac{3\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha)\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha^2)}{n_\alpha - 1} + \frac{2\operatorname{tr}^3(\hat{\boldsymbol{\Omega}}_\alpha)}{(n_\alpha - 1)^2}\right].$$
By Lemma 1 of [12], when the k induced samples (4) are normally distributed, the unbiased estimators of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2\boldsymbol{\Omega}_\beta)$, $\alpha \ne \beta$ are given by
$$\widehat{\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2\boldsymbol{\Omega}_\beta)} = \frac{n_\alpha - 1}{(n_\alpha - 2)(n_\alpha + 1)}\left[(n_\alpha - 1)\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha^2\hat{\boldsymbol{\Omega}}_\beta) - \operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha\hat{\boldsymbol{\Omega}}_\beta)\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha)\right].$$
Under some regularity conditions and when the k induced samples (4) are normally distributed, ref. [25] showed that the above estimators are also ratio-consistent for $\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2\boldsymbol{\Omega}_\beta)$, $\alpha \ne \beta$. By Lemma 2 of [12], when the k induced samples (4) are normally distributed, the unbiased estimators of $\operatorname{tr}(\boldsymbol{\Omega}_\alpha\boldsymbol{\Omega}_\beta\boldsymbol{\Omega}_\gamma)$, $\alpha < \beta < \gamma$ are given by $\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha\hat{\boldsymbol{\Omega}}_\beta\hat{\boldsymbol{\Omega}}_\gamma)$, $\alpha < \beta < \gamma$. Then, the unbiased and ratio-consistent estimator of $\mathcal{K}_3(T^*_{n,p,0})$ is given by
$$\widehat{\mathcal{K}_3(T^*_{n,p,0})} = 8\left\{\sum_{\alpha=1}^{k}\frac{(n - n_\alpha)^3 n_\alpha(n_\alpha - 2)}{n^3(n_\alpha - 1)^2}\widehat{\operatorname{tr}(\boldsymbol{\Omega}_\alpha^3)} + 3\sum_{\alpha \ne \beta}\frac{n_\alpha n_\beta(n - n_\alpha)}{n^3}\widehat{\operatorname{tr}(\boldsymbol{\Omega}_\alpha^2\boldsymbol{\Omega}_\beta)} - 6\sum_{1 \le \alpha < \beta < \gamma \le k}\frac{n_\alpha n_\beta n_\gamma}{n^3}\operatorname{tr}(\hat{\boldsymbol{\Omega}}_\alpha\hat{\boldsymbol{\Omega}}_\beta\hat{\boldsymbol{\Omega}}_\gamma)\right\}.$$
It follows that the ratio-consistent estimators of $\beta_0$, $\beta_1$, and $d$ are given by
$$\hat{\beta}_0 = -\frac{2[\widehat{\mathcal{K}_2(T^*_{n,p,0})}]^2}{\widehat{\mathcal{K}_3(T^*_{n,p,0})}}, \quad \hat{\beta}_1 = \frac{\widehat{\mathcal{K}_3(T^*_{n,p,0})}}{4\widehat{\mathcal{K}_2(T^*_{n,p,0})}}, \quad \text{and} \quad \hat{d} = \frac{8[\widehat{\mathcal{K}_2(T^*_{n,p,0})}]^3}{[\widehat{\mathcal{K}_3(T^*_{n,p,0})}]^2}. \tag{17}$$
Remark 9. 
Recognizing that the k induced samples (4) typically deviate from a normal distribution, the estimators $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{d}$ are, at best, of a normal-reference nature. Nevertheless, the simulation results presented in Section 3 demonstrate the robust size control of the proposed normal-reference test, indicating that, as anticipated, the normal-reference estimators $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{d}$ can still perform effectively even when the k induced samples (4) are not normally distributed.
For any nominal significance level α * > 0 , let χ d 2 ( α * ) denote the upper 100 α * -percentile of χ d 2 . Then, using (17), the normal-reference test for the k-sample equal-covariance matrix testing problem (2) is conducted via using the approximate critical value β ^ 0 + β ^ 1 χ d ^ 2 ( α * ) or the approximate p-value Pr [ χ d ^ 2 ( T n , p β ^ 0 ) / β ^ 1 ] .
In practice, one may often use the normalized version of $T_{n,p}$: $\tilde T_{n,p} = T_{n,p}/\big[\widehat{K_2(T_{n,p,0}^*)}\big]^{1/2}$. Then, approximating the null distribution of $T_{n,p}$ by the distribution of $\hat\beta_0 + \hat\beta_1\chi^2_{\hat d}$ is equivalent to approximating the null distribution of $\tilde T_{n,p}$ by the distribution of $(\chi^2_{\hat d} - \hat d)/\sqrt{2\hat d}$. In this case, the normal-reference test using $\tilde T_{n,p}$ is conducted using the approximate critical value $[\chi^2_{\hat d}(\alpha^*) - \hat d]/\sqrt{2\hat d}$ or the approximate p-value $\Pr\big[\chi^2_{\hat d} \ge \hat d + \sqrt{2\hat d}\,\tilde T_{n,p}\big]$.
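Given estimates of the second and third cumulants, the three-cumulant matched chi-square approximation and the resulting decision rule take only a few lines. The sketch below is a hypothetical helper (the name `nr_decision` is ours; $\widehat{K_2}$ and $\widehat{K_3}$ are assumed to be estimated elsewhere):

```python
from scipy.stats import chi2

def nr_decision(T, K2_hat, K3_hat, alpha_star=0.05):
    """Three-cumulant matched chi-square approximation: match the mean,
    variance, and third cumulant of beta0 + beta1 * chi2_d to those of T,
    then return the approximate critical value and p-value."""
    beta1 = K3_hat / (4.0 * K2_hat)
    d = 8.0 * K2_hat**3 / K3_hat**2
    beta0 = -beta1 * d                       # equivalently -2 K2^2 / K3
    crit = beta0 + beta1 * chi2.ppf(1.0 - alpha_star, d)
    pval = chi2.sf((T - beta0) / beta1, d)
    return crit, pval
```

For example, with $\widehat{K_2} = 2$ and $\widehat{K_3} = 8$, the approximation reduces to $\chi_1^2 - 1$, so the 5% critical value is $\chi_1^2(0.05) - 1$.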

2.4. Asymptotic Power

We now consider the asymptotic power of T n , p under the following local alternative:
$$\operatorname{Var}(Q_{n,p}) = \operatorname{Var}\big[\phi_n^\top(H\otimes I_{p^2})v\big] = \phi_n^\top\Omega_n\phi_n = o\big[\operatorname{tr}(\Omega_n^2)\big]\quad\text{as } n,p\to\infty,$$
where $Q_{n,p}$ is defined in (8) and $\phi_n = \big[\sqrt{n_1}\operatorname{vec}(\Sigma_1)^\top,\ldots,\sqrt{n_k}\operatorname{vec}(\Sigma_k)^\top\big]^\top$. This is the case when $\operatorname{Var}(Q_{n,p})$ is negligible compared with $\operatorname{Var}(T_{n,p,0}) = \sigma_T^2$, so that we have
$$T_{n,p} = T_{n,p,0} + \sum_{1\le\alpha<\beta\le k}\frac{n_\alpha n_\beta}{n}\operatorname{tr}\big[(\Sigma_\alpha - \Sigma_\beta)^2\big] + o_p(\sigma_T).$$
Under Condition C1, as $n\to\infty$, we have $H\to H^* = I_k - \delta\delta^\top$, where $\delta = (\sqrt{\tau_1},\ldots,\sqrt{\tau_k})^\top$, so that
$$\Omega_n \to \Omega = (H^*\otimes I_{p^2})\operatorname{diag}(\Omega_1,\ldots,\Omega_k)(H^*\otimes I_{p^2}).$$
The asymptotic power of T n , p is established in Theorem 3, and its proof is provided in Appendix A.
Theorem 3. 
Assume that, as $n,p\to\infty$, $\hat\beta_0$, $\hat\beta_1$, and $\hat d$ are ratio-consistent for $\beta_0$, $\beta_1$, and d. Under Conditions C1–C4 and the local alternative (18), as $n,p\to\infty$, we have
$$\Pr\big[T_{n,p} > \hat\beta_0 + \hat\beta_1\chi^2_{\hat d}(\alpha^*)\big] = \Pr\left\{\zeta \ge \frac{\chi_d^2(\alpha^*)-d}{\sqrt{2d}} - \frac{\sum_{1\le\alpha<\beta\le k} n\,\tau_\alpha\tau_\beta \operatorname{tr}\big[(\Sigma_\alpha-\Sigma_\beta)^2\big]}{\sqrt{2\operatorname{tr}(\Omega^2)}}\,[1+o(1)]\right\},$$
where ζ is defined in Theorem 2. In addition, when $d\to\infty$, the above expression can be further written as
$$\Pr\big[T_{n,p} > \hat\beta_0 + \hat\beta_1\chi^2_{\hat d}(\alpha^*)\big] = \Phi\left\{-z_{\alpha^*} + \frac{\sum_{1\le\alpha<\beta\le k} n\,\tau_\alpha\tau_\beta \operatorname{tr}\big[(\Sigma_\alpha-\Sigma_\beta)^2\big]}{\sqrt{2\operatorname{tr}(\Omega^2)}}\,[1+o(1)]\right\},$$
where z α * denotes the upper 100 α * -percentile of N ( 0 , 1 ) .

3. Simulation Studies

In this section, we conduct two simulation studies to assess the finite-sample performance of the proposed normal-reference test, denoted as $T_{NEW}$, by comparing it against three competitors, [9]'s test ($T_S$), [10]'s test ($T_{ZBHW}$), and [11]'s test ($T_{ZLGY}$), in terms of size control and power. We compare their performance for the k-sample equal-covariance matrix testing problem (2) in cases where k = 3 and k = 4. To generate "large p, small n" samples, we consider three cases with p = 50, 100, 500. For k = 3, we specify three cases of $\mathbf{n} = (n_1, n_2, n_3)$ as $\mathbf{n}_1 = (50, 80, 110)$, $\mathbf{n}_2 = (80, 110, 140)$, $\mathbf{n}_3 = (120, 150, 180)$, and for k = 4, we specify three cases of $\mathbf{n} = (n_1, n_2, n_3, n_4)$ as $\mathbf{n}_1 = (50, 80, 110, 140)$, $\mathbf{n}_2 = (80, 110, 140, 170)$, $\mathbf{n}_3 = (120, 150, 180, 210)$. We compute the empirical size or power of a test as the proportion of rejections out of N simulation runs. Throughout this section, we set the nominal size $\alpha^*$ as 5% and the number of simulation runs as N = 10,000. We adopt the average relative error (ARE) to measure the overall performance of a test in maintaining the nominal size. The ARE value of a test is calculated as $\mathrm{ARE} = 100 M^{-1}\sum_{j=1}^{M}|\hat\alpha_j - \alpha^*|/\alpha^*$, where $\hat\alpha_j,\ j = 1,\ldots,M$ denote the empirical sizes under the M simulation settings. A smaller ARE value of a test indicates a better performance of that test in terms of size control.
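The ARE criterion above takes one line to compute; a minimal sketch (function name ours), with the empirical sizes supplied as proportions:

```python
import numpy as np

def average_relative_error(empirical_sizes, alpha_star=0.05):
    """ARE = 100 * M^{-1} * sum_j |alpha_hat_j - alpha*| / alpha*."""
    sizes = np.asarray(empirical_sizes, dtype=float)
    return float(100.0 * np.mean(np.abs(sizes - alpha_star)) / alpha_star)
```

For example, empirical sizes of 6%, 4%, and 5% give an ARE of about 13.33.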

3.1. Simulation 1

In this simulation study, under Condition C2, we generate the k samples (1) using $y_{\alpha i} = \mu + \Sigma_\alpha^{1/2} z_{\alpha i},\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k$, where $z_{\alpha i} = (z_{\alpha i 1},\ldots,z_{\alpha i p})^\top,\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k$ are i.i.d. random vectors with $E(z_{\alpha i}) = 0$ and $\operatorname{Cov}(z_{\alpha i}) = I_p$. The p entries of $z_{\alpha i}$ are generated using the following three models:
Model 1: 
$z_{\alpha i h},\ h = 1,\ldots,p \overset{\text{i.i.d.}}{\sim} N(0,1)$.
Model 2: 
$z_{\alpha i h} = u_{\alpha i h}/\sqrt{5/3}$, with $u_{\alpha i h},\ h = 1,\ldots,p \overset{\text{i.i.d.}}{\sim} t_5$.
Model 3: 
$z_{\alpha i h} = (u_{\alpha i h} - 1)/\sqrt{2}$, with $u_{\alpha i h},\ h = 1,\ldots,p \overset{\text{i.i.d.}}{\sim} \chi^2_1$.
The above three generative models correspond to three types of distributions: the normal distribution, a symmetric but non-normal distribution, and an asymmetric distribution, respectively. Without loss of generality, we set $\mu = 0$. The covariance matrices are specified as $\Sigma_\alpha = V_\alpha^{1/2}\big[(1-\rho_\alpha)I_p + \rho_\alpha J_p\big]V_\alpha^{1/2},\ \alpha = 1,\ldots,k$, where $J_p$ is the $p\times p$ matrix of ones, and $V_\alpha = \operatorname{diag}(v_\alpha),\ \alpha = 1,\ldots,k$ with $v_\alpha = (v_{\alpha 1},\ldots,v_{\alpha p})^\top$. It is apparent that the covariance matrix difference $\Sigma_\alpha - \Sigma_1,\ \alpha = 2,\ldots,k$ is determined by two tuning parameters, $v_\alpha$ and $\rho_\alpha$: $v_\alpha,\ \alpha = 1,\ldots,k$ controls the variances of the generated k samples (1), while $\rho_\alpha,\ \alpha = 1,\ldots,k$ controls their corresponding correlations. The null hypothesis (2) holds when $v_1 = \cdots = v_k = v$ and $\rho_1 = \cdots = \rho_k = \rho$. For simplicity, we set $v = 4\cdot 1_p$, where $1_p$ represents the p-dimensional vector of ones, and consider three cases of $\rho = 0.3, 0.5$, and $0.9$ so that the simulated data are less correlated, moderately correlated, and highly correlated, respectively. For power consideration, we keep $v_1 = v$, but for $\alpha = 2,\ldots,k$, we set $v_\alpha = (v_{\alpha 1},\ldots,v_{\alpha p})^\top$, with $v_{\alpha h},\ h = 1,\ldots,p$ randomly generated from the uniform distribution $U(3.5, 4.5)$. Additionally, we set $\rho_1 = 0.5$ and consider three cases of $\rho_\alpha = \rho,\ \alpha = 2,\ldots,k$ with $\rho = 0.3, 0.5, 0.9$. The empirical powers of the tests are expected to increase when the value of $\Delta = |\rho - \rho_1|$ increases.
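The data-generating scheme of Simulation 1 can be sketched as follows (a minimal illustration; function names are ours, and we assume the compound-symmetry covariance $\Sigma_\alpha = V_\alpha^{1/2}[(1-\rho_\alpha)I_p + \rho_\alpha J_p]V_\alpha^{1/2}$ described above, noting that $(V^{1/2}AV^{1/2})_{ij} = \sqrt{v_i v_j}A_{ij}$):

```python
import numpy as np

def generate_z(n, p, model, rng):
    """Standardized entries of z_{alpha i} under Models 1-3."""
    if model == 1:
        return rng.standard_normal((n, p))
    if model == 2:
        return rng.standard_t(df=5, size=(n, p)) / np.sqrt(5.0 / 3.0)
    return (rng.chisquare(df=1, size=(n, p)) - 1.0) / np.sqrt(2.0)

def generate_sample(n, p, model, v, rho, rng):
    """One sample y_i = Sigma^{1/2} z_i with mu = 0 and compound-symmetry Sigma."""
    cs = (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))
    Sigma = np.sqrt(np.outer(v, v)) * cs   # elementwise = V^{1/2} cs V^{1/2}
    L = np.linalg.cholesky(Sigma)          # any square root gives Cov = Sigma
    return generate_z(n, p, model, rng) @ L.T
```

A quick sanity check: with $v = 4\cdot 1_p$ and $\rho = 0.5$, the sample covariance of a large generated sample should have diagonal entries near 4 and off-diagonal entries near 2.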
Table 1 displays the empirical sizes of T S , T Z B H W , T Z L G Y , and T N E W when k = 3 with the last row showing their ARE values associated with the three values of ρ . From Table 1, we can draw the following conclusions regarding size control. Firstly, T N E W generally performs well regardless of the correlation in the generated data, as its empirical sizes under various settings range from 4.11% to 6.67%, with ARE values of 8.06, 14.86, and 20.55 for ρ = 0.3 , 0.5 , and 0.9 , respectively. Secondly, T S appears to be rather liberal, with empirical sizes ranging from 8.58% to 32.14%, and ARE values of 185.76, 120.06, and 108.41. When comparing the empirical sizes under different models but keeping the other settings the same, it appears that T S is more liberal for Model 2, which represents a non-normal but symmetric distribution, and is the most liberal for Model 3, which represents an asymmetric distribution. This suggests that in the case of non-normal data, T S would be inadequate due to its assumption of normality in the population. Thirdly, T Z B H W is generally less liberal than T S because it does not require the assumption of normality for the k samples. Nevertheless, it is still quite liberal with empirical sizes ranging from 8.29% to 13.47% and ARE values of 116.80, 113.60 and 111.76. This is not surprising since T Z B H W extends [6]’s test to the k-sample case and hence exhibits similar performance to [6]’s test in Tables 1 and 4 of [8]. Fourthly, similar to T Z B H W , T Z L G Y exhibits superior performance compared to T S since the former does not require the normality of the k samples. Although T Z L G Y incorporates approaches from both [6]’s and [7]’s tests, it still exhibits a general trend of being liberal in terms of empirical sizes, which range from 4.74% to 12.63%. Additionally, its associated ARE values are 72.93, 37.65, and 29.06, respectively, when ρ = 0.3 , 0.5 , and 0.9. 
To sum up, T N E W generally outperforms its competitors T S , T Z B H W , and T Z L G Y in terms of size control.
For a more direct visualization, Figure 1 illustrates the histograms of the empirical sizes of T S , T Z B H W , T Z L G Y , and T N E W (from top to bottom), from which some of the above conclusions may be further verified visually. For example, all three competitors exhibit liberal behavior as shown by their histograms being shifted to the right from the nominal size (5%), while T S is more liberal compared to T Z B H W and T Z L G Y as evidenced by its greater degree of deviation. On the other hand, T N E W demonstrates better size control performance, as indicated by its histogram being more concentrated around the nominal size.
Table 2 and Figure 2 display the empirical sizes and the corresponding histograms of T S , T Z B H W , T Z L G Y , and T N E W when k = 4 . We can draw similar conclusions as those drawn from Table 1 and Figure 1. Essentially, T N E W continues to perform well as evidenced by its histogram of empirical sizes being concentrated at the nominal size, ranging from 4.14% to 6.86%, and its ARE values are 13.33, 21.27, and 23.22 for ρ = 0.3, 0.5, and 0.9, respectively. In addition, T N E W continues to perform much better than T S , T Z B H W , and T Z L G Y when k = 4 , since their empirical sizes range from 8.50% to 41.86%, 9.9% to 13.50%, and 5.53% to 12.96%, respectively. In terms of ARE values, it is worth noting that all of the competitors for the 4-sample case are more liberal than those for the 3-sample case, indicating that T S , T Z B H W , and T Z L G Y perform less effectively when dealing with more samples.
Table 3 displays the estimated approximate degrees of freedom $\hat d$ in (17) of $T_{NEW}$ under various settings in Simulation 1 when k = 3 and k = 4, which helps explain why $T_S$ and $T_{ZBHW}$ perform worse than $T_{NEW}$ in terms of size control in Table 1 and Table 2. It is seen that the values of $\hat d$ are generally quite small under each setting, showing that the underlying null distribution of $T_{NEW}$ is unlikely to be normal. Therefore, the null distributions of $T_S$ and $T_{ZBHW}$ cannot be adequately approximated by normal distributions. This partially explains why, in terms of size control, $T_S$ and $T_{ZBHW}$ are inaccurate no matter how the data are correlated. It is also seen that the value of $\hat d$ decreases as the value of $\rho$ increases. This means that the more highly correlated the data are, the less adequate the normal approximations to the null distributions of $T_S$ and $T_{ZBHW}$ become.
We now proceed by comparing the empirical powers of the four considered tests: $T_S$, $T_{ZBHW}$, $T_{ZLGY}$, and $T_{NEW}$. Table 4 and Table 5 present the empirical powers of these tests when k = 3 and k = 4 under various configurations, respectively. As anticipated, with an increase in the value of $\Delta = |\rho - \rho_1|$, the empirical powers of the tests rise due to the escalating differences between the covariance matrices. It is noteworthy that a strong association exists between the empirical powers and the corresponding empirical sizes: a test with a larger empirical size tends to exhibit a greater empirical power than another test under the same conditions, and vice versa. Hence, from Table 4 and Table 5, it is evident that the empirical powers of $T_S$, $T_{ZBHW}$, and $T_{ZLGY}$ generally surpass those of $T_{NEW}$. This aligns with the conclusions drawn from Table 1 and Figure 1 for the 3-sample case, and from Table 2 and Figure 2 for the 4-sample case, namely that $T_S$, $T_{ZBHW}$, and $T_{ZLGY}$ tend to be liberal. This underscores the difficulty, and the limited value, of comparing empirical powers when the empirical sizes differ substantially: relying solely on empirical powers can be misleading if a test fails to control the size properly. A test with robust size control is often preferred over a test with high empirical powers but poor size control.

3.2. Simulation 2

In this simulation study, we continue to compare T N E W against T S , T Z B H W , and T Z L G Y in terms of size control but with the k samples (1) generated from the following moving average model:
$$y_{\alpha i h} = z_{\alpha i h} + \theta_{\alpha 1} z_{\alpha i (h+1)} + \cdots + \theta_{\alpha m_\alpha} z_{\alpha i (h+m_\alpha)},\quad h = 1,\ldots,p;\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k,$$
where $y_{\alpha i h}$ denotes the h-th component of $y_{\alpha i}$, and $z_{\alpha i \ell},\ \ell = 1,\ldots,p+m_\alpha;\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k$ are i.i.d. random variables generated in the same ways as described in Simulation 1. The covariance matrix difference is then determined by $m_\alpha,\ \alpha = 1,\ldots,k$ and $\theta_{\alpha j},\ j = 1,\ldots,m_\alpha;\ \alpha = 1,\ldots,k$. When $m_1 = \cdots = m_k = m$ and $\theta_{1j} = \cdots = \theta_{kj} = \theta_j,\ j = 1,\ldots,m$, the generated k samples (1) share the same covariance matrix, so that the null hypothesis (2) holds. To evaluate their level accuracy, we set $m = 0.5p$ and let $\theta_j,\ j = 1,\ldots,m$ be generated from the uniform distribution $U(2, 3)$. For power comparison, we set $m_\alpha = (0.6 - 0.1\alpha)p,\ \alpha = 1,\ldots,k$ and let $\theta_{\alpha j},\ j = 1,\ldots,m_\alpha$ be generated from the uniform distribution $U(\alpha+1, \alpha+2)$. Since the data in this simulation are generated from a moving average model, the correlation between two components of an observation decreases as the distance between them increases. As a result, the samples in this study are only moderately correlated or even close to uncorrelated.
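The moving-average scheme can be sketched as follows (function names ours; `innov` stands in for any of the three innovation models of Simulation 1):

```python
import numpy as np

def generate_ma_sample(n, p, m, theta, innov, rng):
    """Moving-average sample y_{ih} = z_{ih} + sum_{j=1}^{m} theta_j z_{i(h+j)};
    `innov(n, length, rng)` draws the i.i.d. innovations."""
    z = innov(n, p + m, rng)                  # each row needs p + m innovations
    coef = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    # y[:, h] = sum_j coef[j] * z[:, h + j]
    return np.stack([z[:, h:h + m + 1] @ coef for h in range(p)], axis=1)
```

For instance, with m = 1, theta = (2,), and a single innovation row (1, 2, 3), the two output components are 1 + 2*2 = 5 and 2 + 2*3 = 8.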
Figure 3 displays the histograms of the empirical sizes (in %) of the four considered tests when k = 3 (left column) and k = 4 (right column), respectively. It can be seen visually that T N E W still performs well generally regardless of whether k = 3 or k = 4 , since its histograms are concentrated at the nominal size (5%). All the histograms of its competitors are on the right of the nominal size.
To save space, we do not present the empirical powers of the four tests in this simulation study, since the conclusions drawn from them are similar to those drawn from Table 4 and Table 5. That is, the empirical powers of $T_S$, $T_{ZBHW}$, and $T_{ZLGY}$ are generally "larger" than those of $T_{NEW}$ since they are generally more liberal than $T_{NEW}$.

4. Application to the Financial Data

In this section, we apply T S , T Z B H W , T Z L G Y , and T N E W to the financial dataset briefly described in Section 1. The dataset investigates financial contagion during the period of the well-known “1997 Asian financial crisis” and is accessible at https://nuscri.org/en/datadownload/, accessed on 1 December 2024. This crisis originated in Thailand in 1997 and subsequently spread to neighboring countries such as Indonesia, Malaysia, and the Philippines, causing a ripple effect and raising concerns about a global economic downturn due to financial contagion. However, the recovery in 1998 was swift, and concerns about a meltdown quickly diminished.
The dataset provides daily aggregated Probability of Default (PD) data for four sectors, energy, financials, real estate, and industrials, across the aforementioned four countries in 1997. Our interest lies in examining whether there were any structural breaks in the correlations (variance–covariance matrices) of the PDs for these countries and sectors during the crisis period. For this purpose, we divide the dataset into four groups labeled as Q 1 , Q 2 , Q 3 , and Q 4 . These groups represent the daily aggregated PD for each quarter of 1997, with each quarter spanning a three-month period and p = 65 representing the 65 trading days in a quarter. Additionally, since we analyze the daily aggregated PD of the four sectors across the four countries, each group comprises 16 observations, i.e., n 1 = = n 4 = 16 .
To ensure that we have four independent samples, we conduct six pairwise independence tests by utilizing distance correlation-based tests proposed by [26], implemented in the R package energy. As all the p-values exceed 0.05, we can conclude that there is insufficient evidence to reject the null hypothesis that any two groups are independent. Subsequently, we employ T S , T Z B H W , T Z L G Y , and T N E W to test the equality of covariance matrices for this financial dataset.
Table 6 presents the p-values of the four considered tests for testing the equality of covariance matrices, along with the corresponding estimated approximate degrees of freedom d ^ of T N E W under the column labeled “d.f.”. We initially apply the four considered tests to assess the equality of covariance matrices among the four groups. Given the small p-values observed, there is compelling evidence to reject the null hypothesis of no difference between the covariance matrices of the four groups. This suggests significant divergence among the covariance matrices, potentially indicating the presence of financial contagion during the crisis period.
Subsequently, we aim to ascertain whether the inequality of the four covariance matrices is attributable to financial contagion. We commence by conducting the contrast test “ Q 1 vs . Q 2 vs . Q 3 ”, with the test results displayed in Table 6. Notably, all considered tests yield consistent conclusions, as all p-values exceed 0.05, implying that the covariance matrices for the first three quarters are equivalent. This finding is plausible, suggesting a gradual dissipation of financial contagion towards the end of the year. The equivalence of covariance matrices for the initial three quarters indicates a relatively stable level of financial contagion during that period. It is pertinent to mention that the estimated approximate degrees of freedom (d.f.) are relatively small, indicating that the normal approximation to the null distributions of T S and T Z B H W may not be adequate. Consequently, their p-values may not be reliable.
To further illustrate the finite-sample performance of $T_{NEW}$ in terms of size control, we utilize this dataset to calculate the empirical sizes of these test procedures. The empirical size is computed from 10,000 runs. Building upon the testing results provided in Table 6, where we have established that the first three quarters share the same covariance matrix, we proceed to calculate their empirical sizes based on the first two quarters (k = 2) and the first three quarters (k = 3). The procedures are outlined as follows: in each run, we randomly partition the 16k observations from the first k quarters into k sub-groups of equal size and then compute the p-values to assess the equality of covariance structures among the k sub-groups. The empirical size is determined as the proportion of times the p-value is smaller than the nominal level $\alpha^* = 5\%$ across the 10,000 independent runs.
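The random-partition procedure can be sketched as follows (a hypothetical helper; `pvalue_fn` stands in for any of the four considered tests, which are not implemented here):

```python
import numpy as np

def empirical_size(pooled, k, pvalue_fn, n_runs=10000, alpha_star=0.05, seed=0):
    """Randomly split the pooled observations into k equal sub-groups,
    apply a covariance-equality test to the split, and report the
    rejection proportion; under the null this should be near 5%."""
    rng = np.random.default_rng(seed)
    n_total = pooled.shape[0]
    size_each = n_total // k
    rejections = 0
    for _ in range(n_runs):
        idx = rng.permutation(n_total)
        groups = [pooled[idx[j * size_each:(j + 1) * size_each]] for j in range(k)]
        rejections += (pvalue_fn(groups) < alpha_star)
    return rejections / n_runs
```

Because the sub-groups are drawn from a common pool, the null hypothesis of equal covariance matrices holds by construction, which is what makes the rejection proportion an empirical size.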
Table 7 presents the empirical sizes of the four tests: T S , T Z B H W , T Z L G Y , and T N E W . It is evident from this table that T N E W exhibits significantly improved level accuracy compared to the other three tests, which tend to be quite liberal. This finding aligns with the conclusions drawn from the simulation studies presented in Section 3.

5. Concluding Remarks

In this paper, we introduce and investigate a normal-reference test for the k-sample equal-covariance matrix testing problem, particularly tailored for high-dimensional data. Several existing tests necessitate strong assumptions or conditions, rendering them excessively liberal. Addressing this concern, under certain regularity conditions and null hypothesis, we establish that our proposed test statistic and a chi-square-type mixture share the same limiting distribution. This equivalence permits us to approximate the null distribution of our test statistic without solely relying on the normal approximation. Instead, we leverage the distribution of the chi-square-type mixture for this purpose, ensuring more reliable results and mitigating potential issues associated with the normal approximation, such as unreliable p-values or incorrect rejection rates. Furthermore, we utilize the three-cumulant matched chi-square-approximation proposed by [13] to approximate the distribution of the chi-square-type mixture, with parameters consistently estimated from the data. We apply our methodology to a financial dataset encompassing various sectors across multiple countries during a financial crisis, showcasing the efficacy of our approach in detecting potential financial contagion.

Author Contributions

Conceptualization, J.-T.Z.; methodology, J.W., T.Z. and J.-T.Z.; software, J.W.; validation, J.W., T.Z. and J.-T.Z.; formal analysis, J.W., T.Z. and J.-T.Z.; investigation, T.Z. and J.-T.Z.; resources, J.W.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W., T.Z. and J.-T.Z.; visualization, J.W.; supervision, T.Z. and J.-T.Z.; project administration, T.Z.; funding acquisition, T.Z. and J.-T.Z. All authors have read and agreed to the published version of the manuscript.

Funding

Wang and Zhang’s studies were partially supported by the National University of Singapore academic research grants (22-5699-A0001 and 23-1046-A0001), and Zhu’s research was supported by the National Institute of Education (NIE), Singapore, under its Academic Research Fund (RI 4/22 ZTM).

Data Availability Statement

The original contributions presented in this study are included in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Technical Proofs

Proof of Remark 4. 
When $u_{\alpha i},\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k$ are normally distributed, we have $Bu_{\alpha i}\sim N_q(0, B\Omega_\alpha B^\top)$ and hence $\|Bu_{\alpha i}\|^2 \overset{d}{=} \sum_{r=1}^{q} v_{\alpha r} z_{\alpha r}^2$, where $v_{\alpha r},\ r = 1,\ldots,q$ are the eigenvalues of $B\Omega_\alpha B^\top$ and $z_{\alpha r},\ r = 1,\ldots,q \overset{\text{i.i.d.}}{\sim} N(0,1)$. It follows that $E(\|Bu_{\alpha i}\|^2) = \sum_{r=1}^{q} v_{\alpha r}$ and $\operatorname{Var}(\|Bu_{\alpha i}\|^2) = 2\sum_{r=1}^{q} v_{\alpha r}^2$. Thus,
$$E(\|Bu_{\alpha i}\|^4) = \operatorname{Var}(\|Bu_{\alpha i}\|^2) + E^2(\|Bu_{\alpha i}\|^2) = 2\sum_{r=1}^{q} v_{\alpha r}^2 + \Big(\sum_{r=1}^{q} v_{\alpha r}\Big)^2 \le 3\Big(\sum_{r=1}^{q} v_{\alpha r}\Big)^2 = 3E^2(\|Bu_{\alpha i}\|^2),$$
as desired. □
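Since the moment identity and the factor-3 bound in the proof above depend only on the eigenvalues $v_{\alpha r}$, they can be spot-checked numerically for an arbitrary nonnegative spectrum (the eigenvalues below are randomly chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
v = rng.uniform(0.1, 5.0, size=12)        # eigenvalues of B Omega_alpha B'
second = v.sum()                           # E ||B u||^2 under normality
fourth = 2.0 * (v**2).sum() + second**2    # E ||B u||^4 under normality
# sum(v^2) <= (sum v)^2 for nonnegative v, hence the factor-3 bound:
assert fourth <= 3.0 * second**2
```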
Proof of Theorem 1. 
We first set $N_0 = 0$ and $N_\alpha = \sum_{r=1}^{\alpha} n_r,\ \alpha = 1,\ldots,k$. It is seen that $N_k = n = \sum_{\alpha=1}^{k} n_\alpha$. Since we can write $T_{n,p,0}$ in (8) as
$$T_{n,p,0} = \sum_{\alpha=1}^{k}\frac{2(n-n_\alpha)}{n(n_\alpha-1)}\sum_{1\le i<j\le n_\alpha} u_{\alpha i}^\top u_{\alpha j} \;-\; \sum_{1\le\alpha<\beta\le k}\frac{2}{n}\sum_{i=1}^{n_\alpha}\sum_{j=1}^{n_\beta} u_{\alpha i}^\top u_{\beta j},$$
then $\tilde T_{n,p,0}$ can be written as the following generalized quadratic form, as defined in ([14], Section S.2 of the Appendix): $\tilde T_{n,p,0} = \sum_{1\le s<t\le n} a_{st}\,\xi_s^\top\xi_t$, where $\xi_{N_{\alpha-1}+i} = u_{\alpha i},\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k$, and
$$a_{st} = \begin{cases} \dfrac{2(n-n_\alpha)}{n(n_\alpha-1)\sigma_T}, & \text{when } N_{\alpha-1}+1\le s<t\le N_\alpha,\ \alpha = 1,\ldots,k,\\[2mm] -\dfrac{2}{n\sigma_T}, & \text{when } N_{\alpha-1}+1\le s\le N_\alpha,\ N_{\beta-1}+1\le t\le N_\beta,\ 1\le\alpha<\beta\le k. \end{cases}$$
Similarly, $\tilde T_{n,p,0}^*$ can also be written as the generalized quadratic form $\tilde T_{n,p,0}^* = \sum_{1\le s<t\le n} a_{st}\,\xi_s^{*\top}\xi_t^*$, where $\xi^*_{N_{\alpha-1}+i} = u^*_{\alpha i},\ i = 1,\ldots,n_\alpha;\ \alpha = 1,\ldots,k$.
We will employ Theorem S.1 of [14] in the following proofs; to do so, we first check Assumptions S.1 and S.2 of [14]. Note that we have $\sigma_{st}^2 = E(a_{st}\xi_s^\top\xi_t)^2 = a_{st}^2\operatorname{tr}(\Omega_s\Omega_t) < \infty,\ 1\le s<t\le n$, where $\Omega_u = \Omega_\alpha$ when $N_{\alpha-1}+1\le u\le N_\alpha$. Under Condition C2, we have
$$E(a_{st}\xi_s^\top\xi_t)^4 = a_{st}^4 E(\xi_s^\top\xi_t)^4 = a_{st}^4 E\big\{E\big[(\xi_s^\top\xi_t)^4\,\big|\,\xi_s\big]\big\} \le a_{st}^4 E\big\{\gamma E^2\big[(\xi_s^\top\xi_t)^2\,\big|\,\xi_s\big]\big\} = \gamma a_{st}^4 E\big[(\xi_s^\top\Omega_t\xi_s)^2\big] = \gamma a_{st}^4 E\big(\|\Omega_t^{1/2}\xi_s\|^4\big) \le \gamma^2 a_{st}^4 E^2\big(\|\Omega_t^{1/2}\xi_s\|^2\big) = \gamma^2 a_{st}^4 \operatorname{tr}^2(\Omega_s\Omega_t) = \gamma^2\sigma_{st}^4 < \infty,$$
where γ is defined in Condition C2. Then, Assumption S.1(a) of [14] is satisfied. Similarly, we can show that $E(a_{st}\xi_s^\top\xi_t^*)^4 \le \gamma^2\sigma_{st}^4 < \infty$, $E(a_{st}\xi_s^{*\top}\xi_t)^4 \le \gamma^2\sigma_{st}^4 < \infty$, and $E(a_{st}\xi_s^{*\top}\xi_t^*)^4 \le \gamma^2\sigma_{st}^4 < \infty$, indicating that Assumption S.2(a) of [14] is also satisfied.
In addition, Assumptions S.1(b), S.2(b), and S.2(c) of [14] are also satisfied by $\tilde T_{n,p,0}$ and $\tilde T_{n,p,0}^*$, which are independent of each other. Applying Theorem S.1 of [14], we have
$$\big\|\mathcal{L}(\tilde T_{n,p,0}) - \mathcal{L}(\tilde T_{n,p,0}^*)\big\|_3 \le \gamma^{3/2}\, 3^{1/4} \sum_{s=1}^{n} \mathrm{Inf}_s^{3/2}.$$
For α = 1 , , k , when N α 1 + 1 s N α , by ([14] p. 23 of the Appendix), we have
$$\begin{aligned}
\mathrm{Inf}_s &= \sum_{\ell=1}^{s-1}\sigma_{\ell s}^2 + \sum_{\ell=s+1}^{n}\sigma_{s\ell}^2 = \sum_{\ell=1}^{N_{\alpha-1}}\sigma_{\ell s}^2 + \sum_{\ell=N_{\alpha-1}+1}^{s-1}\sigma_{\ell s}^2 + \sum_{\ell=s+1}^{N_{\alpha}}\sigma_{s\ell}^2 + \sum_{\ell=N_{\alpha}+1}^{n}\sigma_{s\ell}^2 \\
&= \sum_{\beta=1}^{\alpha-1}\sum_{\ell=N_{\beta-1}+1}^{N_{\beta}}\sigma_{\ell s}^2 + \sum_{\ell=N_{\alpha-1}+1}^{s-1}\sigma_{\ell s}^2 + \sum_{\ell=s+1}^{N_{\alpha}}\sigma_{s\ell}^2 + \sum_{\beta=\alpha+1}^{k}\sum_{\ell=N_{\beta-1}+1}^{N_{\beta}}\sigma_{s\ell}^2 \\
&= \sum_{\beta=1}^{\alpha-1}\sum_{\ell=N_{\beta-1}+1}^{N_{\beta}}\frac{4}{n^2\sigma_T^2}\operatorname{tr}(\Omega_\beta\Omega_\alpha) + \sum_{\ell=N_{\alpha-1}+1}^{s-1}\frac{4(n-n_\alpha)^2}{n^2(n_\alpha-1)^2\sigma_T^2}\operatorname{tr}(\Omega_\alpha^2) \\
&\qquad + \sum_{\ell=s+1}^{N_{\alpha}}\frac{4(n-n_\alpha)^2}{n^2(n_\alpha-1)^2\sigma_T^2}\operatorname{tr}(\Omega_\alpha^2) + \sum_{\beta=\alpha+1}^{k}\sum_{\ell=N_{\beta-1}+1}^{N_{\beta}}\frac{4}{n^2\sigma_T^2}\operatorname{tr}(\Omega_\alpha\Omega_\beta) \\
&= \sum_{\beta=1}^{\alpha-1}\frac{4n_\beta}{n^2\sigma_T^2}\operatorname{tr}(\Omega_\beta\Omega_\alpha) + \frac{4(n-n_\alpha)^2}{n^2(n_\alpha-1)\sigma_T^2}\operatorname{tr}(\Omega_\alpha^2) + \sum_{\beta=\alpha+1}^{k}\frac{4n_\beta}{n^2\sigma_T^2}\operatorname{tr}(\Omega_\alpha\Omega_\beta) \\
&= \frac{2}{n_\alpha}\left[\sum_{\beta\neq\alpha}\frac{2n_\alpha n_\beta}{n^2\sigma_T^2}\operatorname{tr}(\Omega_\alpha\Omega_\beta) + \frac{2n_\alpha(n-n_\alpha)^2}{n^2(n_\alpha-1)\sigma_T^2}\operatorname{tr}(\Omega_\alpha^2)\right] \equiv \frac{2G_\alpha}{n_\alpha}.
\end{aligned}$$
It is easy to see from (9) that $0 < G_\alpha < 1$ and $\sum_{\alpha=1}^{k} G_\alpha = 1$. Therefore, we have $\sum_{s=1}^{n}\mathrm{Inf}_s = \sum_{\alpha=1}^{k}\sum_{s=N_{\alpha-1}+1}^{N_\alpha} 2G_\alpha/n_\alpha = 2$, and $\sum_{s=1}^{n}\mathrm{Inf}_s^2 = \sum_{\alpha=1}^{k}\sum_{s=N_{\alpha-1}+1}^{N_\alpha} 4G_\alpha^2/n_\alpha^2 \le 4\sum_{\alpha=1}^{k} n_\alpha^{-1}$. By the Cauchy–Schwarz inequality, we have
$$\sum_{s=1}^{n}\mathrm{Inf}_s^{3/2} \le \Big(\sum_{s=1}^{n}\mathrm{Inf}_s\Big)^{1/2}\Big(\sum_{s=1}^{n}\mathrm{Inf}_s^{2}\Big)^{1/2} = \Big(\sum_{s=1}^{n}\mathrm{Inf}_s \sum_{s=1}^{n}\mathrm{Inf}_s^{2}\Big)^{1/2}.$$
It follows that
$$\sum_{s=1}^{n}\mathrm{Inf}_s^{3/2} \le 2^{3/2}\Big(\sum_{\alpha=1}^{k}\frac{1}{n_\alpha}\Big)^{1/2}.$$
Thus, we have
$$\big\|\mathcal{L}(\tilde T_{n,p,0}) - \mathcal{L}(\tilde T_{n,p,0}^*)\big\|_3 \le (2\gamma)^{3/2}\, 3^{1/4}\Big(\sum_{\alpha=1}^{k}\frac{1}{n_\alpha}\Big)^{1/2}.$$
The proof is complete. □
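The identities $\sum_{s=1}^n \mathrm{Inf}_s = 2$ and $\sum_{s=1}^n \mathrm{Inf}_s^2 \le 4\sum_{\alpha=1}^k n_\alpha^{-1}$ used in the proof hold for any covariance matrices $\Omega_\alpha$. The following numerical spot-check (with a small assumed configuration and random SPD matrices standing in for the $\Omega_\alpha$) verifies them from the definition $\mathrm{Inf}_s = \sum_{\ell\neq s}\sigma_{s\ell}^2$ with $\sigma_{s\ell}^2 = a_{s\ell}^2\operatorname{tr}(\Omega_s\Omega_\ell)$:

```python
import numpy as np

rng = np.random.default_rng(0)
ns = [5, 7, 9]                       # group sizes n_1, ..., n_k (assumed)
n, k = sum(ns), len(ns)
q = 4                                # stand-in dimension for the Omega_alpha
Omegas = []
for _ in range(k):                   # random SPD matrices as Omega_alpha
    A = rng.standard_normal((q, q))
    Omegas.append(A @ A.T)
grp = np.repeat(np.arange(k), ns)    # group label of each index s

# unnormalized coefficients a_{st} of T_{n,p,0}
a = np.zeros((n, n))
for s in range(n):
    for t in range(s + 1, n):
        if grp[s] == grp[t]:
            na = ns[grp[s]]
            a[s, t] = 2.0 * (n - na) / (n * (na - 1))
        else:
            a[s, t] = -2.0 / n
a = a + a.T

tr_prod = np.array([[np.trace(Omegas[grp[s]] @ Omegas[grp[t]])
                     for t in range(n)] for s in range(n)])
sigma2 = a**2 * tr_prod                            # sigma_{st}^2, zero diagonal
sigma_T2 = sigma2[np.triu_indices(n, 1)].sum()     # Var(T_{n,p,0})
inf = sigma2.sum(axis=1) / sigma_T2                # Inf_s after normalization

assert abs(inf.sum() - 2.0) < 1e-8
assert (inf**2).sum() <= 4.0 * sum(1.0 / m for m in ns) + 1e-8
```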
Proof of Theorem 2. 
For $\alpha = 1,\ldots,k$, we have $u_{\alpha i}^*,\ i = 1,\ldots,n_\alpha \overset{\text{i.i.d.}}{\sim} N_{p^2}(0,\Omega_\alpha)$, and the k induced samples are independent of each other. Let $W\sim W_p(v,\Sigma/v)$ denote a Wishart distribution with v degrees of freedom and covariance matrix $\Sigma/v$. Then, we have $(n_\alpha-1)\hat\Omega_\alpha^* \sim W_{p^2}(n_\alpha-1,\Omega_\alpha)$. Therefore, we have $E[\operatorname{tr}(\hat\Omega_n^*)] = \operatorname{tr}(\Omega_n)$ and $\operatorname{Var}[\operatorname{tr}(\hat\Omega_n^*)] = 2\sum_{\alpha=1}^{k}[n^2(n_\alpha-1)]^{-1}(n-n_\alpha)^2\operatorname{tr}(\Omega_\alpha^2)$. It follows that, under Condition C1, as $n\to\infty$, we have
$$\operatorname{Var}\big[\operatorname{tr}(\hat\Omega_n^*)/\operatorname{tr}(\Omega_n)\big] = 2\sum_{\alpha=1}^{k}\frac{(n-n_\alpha)^2}{n^2(n_\alpha-1)}\operatorname{tr}(\Omega_\alpha^2)\Big/\operatorname{tr}^2(\Omega_n) \to 0,$$
uniformly for all p. Thus, $\operatorname{tr}(\hat\Omega_n^*)/\operatorname{tr}(\Omega_n)\to 1$ in probability uniformly for all p. By (12), we have $\sigma_T^2 = 2\operatorname{tr}(\Omega_n^2)[1+o(1)]$. In addition, we have $(H\otimes I_{p^2})v^* \sim N_{kp^2}(0,\Omega_n)$. Thus, we can express $v^{*\top}(H\otimes I_{p^2})v^* = \epsilon_{kp^2}^\top\Omega_n\epsilon_{kp^2}$, where $\epsilon_{kp^2}\sim N_{kp^2}(0,I_{kp^2})$. It follows that we have
$$\tilde T_{n,p,0}^* = \frac{T_{n,p,0}^*}{\sigma_T} = \frac{v^{*\top}(H\otimes I_{p^2})v^* - \operatorname{tr}(\hat\Omega_n^*)}{\sigma_T} = \frac{\epsilon_{kp^2}^\top\Omega_n\epsilon_{kp^2} - \operatorname{tr}(\Omega_n)}{\sqrt{2\operatorname{tr}(\Omega_n^2)}}\,[1+o_p(1)].$$
Under Condition C3, the expression (13) follows from Corollary 1 of [14] immediately. The proof is complete. □
Proof of Theorem 3. 
By (7) and under the local alternative (18), we have
$$T_{n,p} = T_{n,p,0} + \sum_{1\le\alpha<\beta\le k}\frac{n_\alpha n_\beta}{n}\operatorname{tr}\big[(\Sigma_\alpha-\Sigma_\beta)^2\big]\,[1+o_p(1)].$$
By (9) and (19), we have $\sigma_T^2 = 2\operatorname{tr}(\Omega_n^2)[1+o(1)] = 2\operatorname{tr}(\Omega^2)[1+o(1)]$. In addition, under the given conditions, we have $\hat\beta_0/\beta_0\xrightarrow{P}1$, $\hat\beta_1/\beta_1\xrightarrow{P}1$, and $\hat d/d\xrightarrow{P}1$ as $n,p\to\infty$. We first prove (20). Under Conditions C1–C3, Theorems 1 and 2 indicate that, as $n,p\to\infty$, we have $\tilde T_{n,p,0} = T_{n,p,0}/\sigma_T\xrightarrow{L}\zeta$, where ζ is defined in Theorem 2. It follows that, as $n,p\to\infty$, we have
$$\begin{aligned}
\Pr\big[T_{n,p} > \hat\beta_0 + \hat\beta_1\chi^2_{\hat d}(\alpha^*)\big] &= \Pr\left\{\tilde T_{n,p,0} > \frac{\beta_0 + \beta_1\chi_d^2(\alpha^*)}{\sigma_T} - \frac{\sum_{1\le\alpha<\beta\le k} n\,\tau_\alpha\tau_\beta\operatorname{tr}\big[(\Sigma_\alpha-\Sigma_\beta)^2\big]}{\sqrt{2\operatorname{tr}(\Omega^2)}}\,[1+o(1)]\right\} \\
&= \Pr\left\{\zeta \ge \frac{\chi_d^2(\alpha^*)-d}{\sqrt{2d}} - \frac{\sum_{1\le\alpha<\beta\le k} n\,\tau_\alpha\tau_\beta\operatorname{tr}\big[(\Sigma_\alpha-\Sigma_\beta)^2\big]}{\sqrt{2\operatorname{tr}(\Omega^2)}}\,[1+o(1)]\right\},
\end{aligned}$$
where $\tau_\alpha,\ \alpha = 1,\ldots,k$, are defined in Condition C1.
We next prove (21). Under the given conditions, when $d\to\infty$, Theorem 2 indicates that, as $n,p\to\infty$, we have $\tilde T_{n,p,0}\xrightarrow{L}\zeta\sim N(0,1)$ and $\tilde T_{n,p,0}^*\xrightarrow{L}N(0,1)$. In addition, as $d\to\infty$, we have $[\chi_d^2(\alpha^*)-d]/\sqrt{2d}\to z_{\alpha^*}$, where $z_{\alpha^*}$ denotes the upper $100\alpha^*$-percentile of $N(0,1)$. Then, by (A1), as $n,p\to\infty$, we have
$$\Pr\big[T_{n,p} > \hat\beta_0 + \hat\beta_1\chi^2_{\hat d}(\alpha^*)\big] = \Phi\left\{-z_{\alpha^*} + \frac{\sum_{1\le\alpha<\beta\le k} n\,\tau_\alpha\tau_\beta\operatorname{tr}\big[(\Sigma_\alpha-\Sigma_\beta)^2\big]}{\sqrt{2\operatorname{tr}(\Omega^2)}}\,[1+o(1)]\right\},$$
where Φ ( · ) denotes the cumulative distribution function of N ( 0 , 1 ) . The proof is complete. □

References

  1. Dornbusch, R.; Park, Y.C.; Claessens, S. Contagion: Understanding How It Spreads. World Bank Res. Obs. 2000, 15, 177–197. [Google Scholar] [CrossRef]
  2. King, M.A.; Wadhwani, S. Transmission of Volatility between Stock Markets. Rev. Financ. Stud. 1990, 3, 5–33. [Google Scholar] [CrossRef]
  3. Bekaert, G.; Harvey, C.; Ng, A. Market Integration and Contagion. J. Bus. 2005, 78, 39–69. [Google Scholar] [CrossRef]
  4. Corsetti, G.; Pericoli, M.; Sbracia, M. Some contagion, some interdependence: More pitfalls in tests of financial contagion. J. Int. Money Financ. 2005, 24, 1177–1199. [Google Scholar] [CrossRef]
  5. Duan, J.C.; Sun, J.; Wang, T. Multiperiod corporate default prediction: A forward intensity approach. J. Econom. 2012, 170, 191–209. [Google Scholar] [CrossRef]
  6. Li, J.; Chen, S.X. Two sample tests for high-dimensional covariance matrices. Ann. Stat. 2012, 40, 908–940. [Google Scholar] [CrossRef]
  7. Cai, T.; Liu, W.; Xia, Y. Two-Sample Covariance Matrix Testing and Support Recovery in High-Dimensional and Sparse Settings. J. Am. Stat. Assoc. 2013, 108, 265–277. [Google Scholar] [CrossRef]
  8. Wang, J.; Zhu, T.; Zhang, J.T. Two-sample test for high-dimensional covariance matrices: A normal-reference approach. J. Multivar. Anal. 2024, 204, 105354. [Google Scholar] [CrossRef]
  9. Schott, J.R. A test for the equality of covariance matrices when the dimension is large relative to the sample sizes. Comput. Stat. Data Anal. 2007, 51, 6535–6542. [Google Scholar] [CrossRef]
  10. Zhang, C.; Bai, Z.; Hu, J.; Wang, C. Multi-sample test for high-dimensional covariance matrices. Commun. Stat.—Theory Methods 2018, 47, 3161–3177. [Google Scholar] [CrossRef]
  11. Zheng, S.; Lin, R.; Guo, J.; Yin, G. Testing homogeneity of high-dimensional covariance matrices. Stat. Sin. 2020, 30, 35–53. [Google Scholar] [CrossRef]
  12. Zhang, J.T.; Zhu, T. A new normal reference test for linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA. Comput. Stat. Data Anal. 2022, 168, 107385. [Google Scholar] [CrossRef]
  13. Zhang, J.T. Approximate and Asymptotic Distributions of Chi-Squared-Type Mixtures With Applications. J. Am. Stat. Assoc. 2005, 100, 273–285. [Google Scholar] [CrossRef]
  14. Wang, R.; Xu, W. An approximate randomization test for the high-dimensional two-sample Behrens–Fisher problem under arbitrary covariances. Biometrika 2022, 109, 1117–1132. [Google Scholar] [CrossRef]
  15. Li, W.; Qin, Y. Hypothesis testing for high-dimensional covariance matrices. J. Multivar. Anal. 2014, 128, 108–119. [Google Scholar] [CrossRef]
  16. Hu, J.; Li, W.; Liu, Z.; Zhou, W. High-dimensional covariance matrices in elliptical distributions with application to spherical test. Ann. Stat. 2019, 47, 527–555. [Google Scholar] [CrossRef]
  17. Yu, X.; Li, D.; Xue, L. Fisher’s Combined Probability Test for High-Dimensional Covariance Matrices. J. Am. Stat. Assoc. 2024, 119, 511–524. [Google Scholar] [CrossRef]
  18. Chen, S.X.; Zhang, L.X.; Zhong, P.S. Tests for high-dimensional covariance matrices. J. Am. Stat. Assoc. 2010, 105, 810–819. [Google Scholar] [CrossRef]
  19. Westfall, P.H. Kurtosis as peakedness, 1905–2014. RIP. Am. Stat. 2014, 68, 191–195. [Google Scholar] [CrossRef] [PubMed]
  20. Bai, Z.D.; Saranadasa, H. Effect of high dimension: By an example of a two sample problem. Stat. Sin. 1996, 6, 311–329. [Google Scholar]
  21. Chen, S.X.; Qin, Y.L. A two-sample test for high-dimensional data with applications to gene-set testing. Ann. Stat. 2010, 38, 808–835. [Google Scholar] [CrossRef]
  22. Zhang, J.T.; Guo, J.; Zhou, B.; Cheng, M.Y. A simple two-sample test in high dimensions based on L2-norm. J. Am. Stat. Assoc. 2020, 115, 1011–1027. [Google Scholar] [CrossRef]
  23. Zhang, J.T.; Zhou, B.; Guo, J.; Zhu, T. Two-sample Behrens–Fisher Problems for High-Dimensional Data: A Normal Reference Approach. J. Stat. Plan. Inference 2021, 213, 142–161. [Google Scholar] [CrossRef]
  24. Zhang, J.T.; Zhou, B.; Guo, J. Testing high-dimensional mean vector with applications: A normal reference approach. Stat. Pap. 2022, 63, 1105–1137. [Google Scholar] [CrossRef]
  25. Hyodo, M.; Nishiyama, T.; Pavlenko, T. On error bounds for high-dimensional asymptotic distribution of L2-type test statistic for equality of means. Stat. Probab. Lett. 2020, 157, 108637. [Google Scholar] [CrossRef]
  26. Székely, G.J.; Rizzo, M.L.; Bakirov, N.K. Measuring and testing dependence by correlation of distances. Ann. Stat. 2007, 35, 2769–2794. [Google Scholar] [CrossRef]
Figure 1. Histograms of the empirical sizes (in %) of T S , T Z B H W , T Z L G Y and T N E W (from top to bottom) in Simulation 1 when k = 3 .
Figure 2. Histograms of the empirical sizes (in %) of T_S, T_ZBHW, T_ZLGY and T_NEW (from top to bottom) in Simulation 1 when k = 4.
Figure 3. Histograms of the empirical sizes (in %) in Simulation 2 when k = 3 (left column) and k = 4 (right column).
Table 1. Empirical sizes (in %) of T_S, T_ZBHW, T_ZLGY, and T_NEW in Simulation 1 when k = 3. Columns are grouped by ρ = 0.3, ρ = 0.5 and ρ = 0.9; within each group the order is T_S, T_ZBHW, T_ZLGY, T_NEW.

| Model | p | n | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 50 | n1 | 9.02 | 10.27 | 5.42 | 5.41 | 9.55 | 10.95 | 5.33 | 5.28 | 9.07 | 10.88 | 5.64 | 5.74 |
| | | n2 | 8.88 | 10.14 | 5.21 | 5.18 | 9.16 | 10.89 | 5.23 | 5.36 | 9.12 | 10.87 | 6.62 | 5.86 |
| | | n3 | 9.28 | 10.59 | 5.54 | 5.61 | 9.06 | 10.84 | 6.31 | 6.09 | 9.23 | 11.16 | 5.86 | 6.47 |
| | 100 | n1 | 8.98 | 10.25 | 10.64 | 5.15 | 8.58 | 10.02 | 8.88 | 5.37 | 9.21 | 10.95 | 6.67 | 5.62 |
| | | n2 | 9.31 | 10.65 | 11.86 | 5.31 | 9.29 | 11.22 | 9.14 | 6.03 | 9.22 | 10.97 | 6.90 | 6.39 |
| | | n3 | 9.05 | 10.50 | 12.32 | 5.81 | 9.61 | 11.27 | 8.68 | 5.77 | 9.44 | 11.19 | 6.80 | 5.76 |
| | 500 | n1 | 9.48 | 10.34 | 11.73 | 4.77 | 9.03 | 10.62 | 8.29 | 5.21 | 9.02 | 10.89 | 6.44 | 5.24 |
| | | n2 | 9.26 | 10.69 | 12.13 | 4.93 | 9.12 | 10.83 | 8.99 | 6.08 | 8.78 | 10.67 | 7.28 | 6.17 |
| | | n3 | 9.57 | 11.31 | 12.60 | 5.61 | 9.53 | 11.47 | 8.92 | 6.48 | 9.04 | 11.07 | 7.23 | 6.26 |
| 2 | 50 | n1 | 16.14 | 10.68 | 5.36 | 5.33 | 12.32 | 11.46 | 5.66 | 5.62 | 10.62 | 11.71 | 6.47 | 5.77 |
| | | n2 | 15.67 | 10.38 | 5.60 | 4.46 | 12.35 | 11.30 | 5.72 | 5.52 | 10.16 | 11.22 | 5.85 | 6.12 |
| | | n3 | 16.17 | 10.92 | 5.48 | 4.59 | 12.01 | 11.57 | 5.20 | 5.63 | 10.43 | 11.63 | 6.27 | 5.99 |
| | 100 | n1 | 12.45 | 10.36 | 10.50 | 5.17 | 10.64 | 10.29 | 7.01 | 5.56 | 11.01 | 9.48 | 6.76 | 5.91 |
| | | n2 | 12.66 | 10.70 | 12.55 | 4.64 | 10.57 | 10.27 | 7.15 | 5.48 | 10.64 | 9.45 | 7.29 | 6.64 |
| | | n3 | 13.17 | 11.28 | 11.52 | 5.05 | 10.93 | 10.08 | 7.87 | 5.73 | 10.97 | 9.38 | 7.31 | 5.89 |
| | 500 | n1 | 10.43 | 9.88 | 9.20 | 5.19 | 10.84 | 9.47 | 6.42 | 6.02 | 11.32 | 9.82 | 5.95 | 6.01 |
| | | n2 | 10.48 | 9.88 | 10.84 | 5.13 | 10.28 | 8.91 | 7.44 | 5.59 | 11.11 | 9.33 | 6.68 | 6.17 |
| | | n3 | 11.19 | 10.35 | 10.62 | 5.39 | 11.31 | 9.62 | 7.25 | 6.17 | 11.15 | 9.17 | 7.27 | 6.16 |
| 3 | 50 | n1 | 30.03 | 11.61 | 4.96 | 4.16 | 16.88 | 12.63 | 4.88 | 5.63 | 11.09 | 10.72 | 6.21 | 5.83 |
| | | n2 | 31.21 | 10.08 | 5.11 | 4.46 | 14.41 | 11.66 | 4.74 | 5.53 | 13.21 | 12.24 | 5.55 | 4.52 |
| | | n3 | 32.14 | 13.47 | 5.68 | 5.49 | 15.92 | 11.03 | 6.50 | 5.88 | 13.11 | 13.10 | 5.45 | 5.92 |
| | 100 | n1 | 20.32 | 13.01 | 8.10 | 5.58 | 11.57 | 10.41 | 6.44 | 5.63 | 12.67 | 11.12 | 5.73 | 5.91 |
| | | n2 | 19.18 | 12.12 | 9.08 | 5.56 | 12.29 | 12.25 | 6.76 | 5.99 | 11.53 | 10.27 | 5.90 | 6.32 |
| | | n3 | 18.03 | 11.79 | 8.68 | 4.58 | 11.05 | 10.21 | 7.33 | 5.86 | 10.83 | 10.68 | 6.77 | 6.49 |
| | 500 | n1 | 11.02 | 10.47 | 6.86 | 4.49 | 10.28 | 9.74 | 6.37 | 5.71 | 9.38 | 8.97 | 6.55 | 4.11 |
| | | n2 | 11.69 | 10.57 | 8.18 | 5.47 | 9.87 | 8.88 | 6.26 | 5.79 | 9.26 | 8.29 | 5.85 | 6.06 |
| | | n3 | 10.97 | 10.39 | 7.56 | 5.52 | 10.63 | 10.47 | 6.29 | 6.05 | 10.74 | 10.65 | 6.92 | 6.67 |
| ARE | | | 185.76 | 116.80 | 72.93 | 8.06 | 120.06 | 113.60 | 37.65 | 14.86 | 108.41 | 111.76 | 29.06 | 20.55 |
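The ARE rows in Tables 1 and 2 summarize size control across all settings. Assuming ARE denotes the average relative error of the empirical sizes with respect to the nominal level α = 5%, i.e., ARE = (100/M) Σ|α̂_m − α|/α over the M = 27 settings in a column (a definition not restated in this excerpt), the following sketch reproduces the T_NEW entry under ρ = 0.3 in Table 1; the function name and script layout are illustrative, not from the paper.

```python
def average_relative_error(empirical_sizes, nominal=5.0):
    """ARE (in %) of empirical sizes (in %) against the nominal level.

    ARE = (100 / M) * sum_m |alpha_hat_m - alpha| / alpha.
    """
    m = len(empirical_sizes)
    return 100.0 * sum(abs(s - nominal) for s in empirical_sizes) / (m * nominal)


# The 27 empirical sizes of T_NEW under rho = 0.3 in Table 1 (in %).
t_new_rho03 = [
    5.41, 5.18, 5.61, 5.15, 5.31, 5.81, 4.77, 4.93, 5.61,
    5.33, 4.46, 4.59, 5.17, 4.64, 5.05, 5.19, 5.13, 5.39,
    4.16, 4.46, 5.49, 5.58, 5.56, 4.58, 4.49, 5.47, 5.52,
]

print(round(average_relative_error(t_new_rho03), 2))  # 8.06, matching the ARE row
```

A smaller ARE means the empirical sizes stay closer to the nominal 5% level, which is how the tables compare the size control of the four tests across all settings at a glance.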
Table 2. Empirical sizes (in %) of T_S, T_ZBHW, T_ZLGY and T_NEW in Simulation 1 when k = 4. Columns are grouped by ρ = 0.3, ρ = 0.5 and ρ = 0.9; within each group the order is T_S, T_ZBHW, T_ZLGY, T_NEW.

| Model | p | n | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 50 | n1 | 9.19 | 10.38 | 5.56 | 5.84 | 9.12 | 10.72 | 5.77 | 5.93 | 9.18 | 10.67 | 6.55 | 6.14 |
| | | n2 | 9.74 | 10.90 | 6.28 | 5.98 | 9.17 | 10.61 | 6.05 | 6.13 | 9.14 | 11.01 | 6.49 | 6.27 |
| | | n3 | 9.34 | 10.33 | 6.08 | 6.07 | 9.66 | 11.48 | 6.67 | 6.16 | 9.01 | 11.07 | 6.50 | 6.35 |
| | 100 | n1 | 9.78 | 10.92 | 10.38 | 5.28 | 9.23 | 11.31 | 11.64 | 6.24 | 9.04 | 10.99 | 7.12 | 6.27 |
| | | n2 | 10.08 | 11.36 | 11.62 | 5.28 | 8.83 | 10.57 | 11.09 | 6.43 | 8.89 | 10.94 | 8.52 | 6.11 |
| | | n3 | 9.38 | 10.48 | 10.75 | 6.17 | 9.12 | 10.87 | 12.31 | 6.38 | 9.53 | 11.80 | 8.52 | 5.74 |
| | 500 | n1 | 9.66 | 10.92 | 11.82 | 5.12 | 9.11 | 11.05 | 11.08 | 5.97 | 8.50 | 10.43 | 7.90 | 5.82 |
| | | n2 | 9.71 | 10.93 | 12.74 | 5.93 | 9.39 | 11.20 | 11.33 | 6.05 | 9.25 | 11.46 | 8.28 | 6.24 |
| | | n3 | 9.68 | 11.33 | 12.23 | 5.84 | 9.20 | 11.26 | 12.31 | 6.84 | 8.84 | 10.74 | 8.39 | 6.55 |
| 2 | 50 | n1 | 18.24 | 10.89 | 6.20 | 4.88 | 12.62 | 11.06 | 5.53 | 5.71 | 10.59 | 11.55 | 5.88 | 6.22 |
| | | n2 | 18.84 | 11.11 | 6.30 | 4.84 | 12.19 | 11.13 | 6.23 | 6.14 | 11.01 | 12.15 | 5.92 | 6.22 |
| | | n3 | 19.19 | 11.49 | 7.09 | 5.58 | 12.96 | 11.71 | 6.50 | 6.28 | 10.99 | 12.11 | 6.73 | 6.28 |
| | 100 | n1 | 13.99 | 10.91 | 12.86 | 5.11 | 10.34 | 10.77 | 8.75 | 5.88 | 9.66 | 11.28 | 6.59 | 6.14 |
| | | n2 | 14.28 | 10.94 | 12.96 | 4.97 | 11.66 | 12.06 | 10.16 | 6.03 | 9.99 | 11.62 | 7.00 | 5.87 |
| | | n3 | 14.39 | 11.48 | 12.09 | 6.13 | 11.26 | 11.57 | 10.26 | 5.87 | 10.16 | 12.01 | 7.38 | 5.86 |
| | 500 | n1 | 10.28 | 10.40 | 11.97 | 5.07 | 9.14 | 10.77 | 8.42 | 6.43 | 9.61 | 11.39 | 6.85 | 6.86 |
| | | n2 | 10.38 | 10.85 | 11.20 | 5.88 | 9.62 | 10.97 | 7.97 | 6.25 | 9.23 | 10.95 | 6.50 | 6.78 |
| | | n3 | 10.72 | 11.35 | 11.67 | 6.24 | 10.19 | 11.68 | 9.00 | 6.28 | 9.46 | 11.33 | 8.00 | 5.89 |
| 3 | 50 | n1 | 38.75 | 11.90 | 6.08 | 4.14 | 15.69 | 10.80 | 5.89 | 4.83 | 12.88 | 12.50 | 6.01 | 6.17 |
| | | n2 | 38.78 | 11.50 | 5.88 | 5.70 | 17.65 | 11.10 | 6.36 | 5.61 | 13.89 | 13.50 | 5.83 | 6.26 |
| | | n3 | 41.86 | 13.30 | 6.82 | 4.93 | 18.51 | 11.70 | 6.05 | 6.04 | 11.58 | 11.30 | 7.13 | 5.72 |
| | 100 | n1 | 20.16 | 10.20 | 10.49 | 5.66 | 14.26 | 12.30 | 7.52 | 6.07 | 11.76 | 13.10 | 6.16 | 5.87 |
| | | n2 | 22.61 | 11.70 | 11.15 | 6.21 | 14.20 | 12.50 | 8.20 | 6.26 | 10.87 | 11.20 | 7.02 | 5.72 |
| | | n3 | 20.94 | 11.50 | 11.33 | 5.89 | 13.77 | 12.60 | 8.04 | 5.65 | 9.40 | 11.00 | 7.01 | 5.47 |
| | 500 | n1 | 10.82 | 10.90 | 8.92 | 5.73 | 10.96 | 11.90 | 7.57 | 5.61 | 9.24 | 9.90 | 6.41 | 6.65 |
| | | n2 | 10.16 | 10.60 | 9.44 | 5.88 | 10.61 | 11.90 | 7.42 | 6.16 | 10.07 | 12.60 | 7.09 | 6.32 |
| | | n3 | 12.30 | 12.10 | 9.90 | 6.16 | 9.88 | 10.70 | 8.04 | 6.15 | 8.73 | 10.60 | 7.49 | 6.56 |
| ARE | | | 220.93 | 122.72 | 92.44 | 13.33 | 128.40 | 126.88 | 67.53 | 21.27 | 100.37 | 129.04 | 40.20 | 23.22 |
Table 3. Estimated approximate degrees of freedom of T_NEW under various settings in Simulation 1. Columns are grouped by Model 1, Model 2 and Model 3; within each group the order is ρ = 0.3, ρ = 0.5, ρ = 0.9.

| k | p | n | ρ = 0.3 | ρ = 0.5 | ρ = 0.9 | ρ = 0.3 | ρ = 0.5 | ρ = 0.9 | ρ = 0.3 | ρ = 0.5 | ρ = 0.9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 50 | n1 | 1.97 | 1.35 | 1.21 | 2.33 | 1.41 | 1.21 | 2.96 | 1.49 | 1.21 |
| | | n2 | 1.82 | 1.29 | 1.18 | 2.14 | 1.34 | 1.18 | 2.67 | 1.42 | 1.17 |
| | | n3 | 1.73 | 1.27 | 1.17 | 2.01 | 1.31 | 1.16 | 2.51 | 1.36 | 1.15 |
| | 100 | n1 | 1.75 | 1.32 | 1.21 | 1.97 | 1.34 | 1.21 | 2.29 | 1.38 | 1.21 |
| | | n2 | 1.61 | 1.26 | 1.18 | 1.79 | 1.29 | 1.18 | 2.11 | 1.32 | 1.17 |
| | | n3 | 1.53 | 1.23 | 1.16 | 1.67 | 1.25 | 1.16 | 1.93 | 1.28 | 1.16 |
| | 500 | n1 | 1.57 | 1.29 | 1.21 | 1.62 | 1.29 | 1.21 | 1.97 | 1.33 | 1.21 |
| | | n2 | 1.44 | 1.24 | 1.18 | 1.48 | 1.24 | 1.18 | 1.85 | 1.27 | 1.18 |
| | | n3 | 1.38 | 1.21 | 1.16 | 1.40 | 1.21 | 1.16 | 1.68 | 1.24 | 1.16 |
| 4 | 50 | n1 | 2.32 | 1.62 | 1.45 | 2.70 | 1.67 | 1.44 | 3.23 | 1.71 | 1.46 |
| | | n2 | 2.19 | 1.58 | 1.44 | 2.52 | 1.61 | 1.42 | 3.11 | 1.68 | 1.41 |
| | | n3 | 2.11 | 1.55 | 1.43 | 2.37 | 1.59 | 1.41 | 2.94 | 1.64 | 1.43 |
| | 100 | n1 | 2.08 | 1.58 | 1.46 | 2.30 | 1.61 | 1.44 | 2.66 | 1.63 | 1.45 |
| | | n2 | 1.95 | 1.53 | 1.43 | 2.11 | 1.56 | 1.43 | 2.47 | 1.59 | 1.42 |
| | | n3 | 1.86 | 1.51 | 1.43 | 2.03 | 1.53 | 1.42 | 2.32 | 1.56 | 1.41 |
| | 500 | n1 | 1.87 | 1.55 | 1.45 | 1.94 | 1.55 | 1.46 | 2.33 | 1.57 | 1.45 |
| | | n2 | 1.75 | 1.50 | 1.44 | 1.79 | 1.51 | 1.43 | 2.29 | 1.51 | 1.44 |
| | | n3 | 1.68 | 1.47 | 1.42 | 1.71 | 1.47 | 1.42 | 2.15 | 1.47 | 1.42 |
Table 4. Empirical powers (in %) in Simulation 1 for k = 3 with Δ = |ρ − ρ1|. Columns are grouped by Δ = 0, Δ = 0.2 and Δ = 0.4; within each group the order is T_S, T_ZBHW, T_ZLGY, T_NEW.

| Model | p | n | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 50 | n1 | 20.68 | 24.50 | 16.72 | 17.81 | 70.12 | 74.54 | 49.38 | 57.36 | 86.49 | 88.94 | 72.00 | 62.07 |
| | | n2 | 29.89 | 29.73 | 29.11 | 29.19 | 87.70 | 94.83 | 93.13 | 88.74 | 85.53 | 88.29 | 75.77 | 88.69 |
| | | n3 | 45.67 | 49.47 | 38.67 | 34.16 | 90.28 | 87.87 | 99.66 | 99.43 | 89.98 | 92.16 | 90.48 | 85.56 |
| | 100 | n1 | 41.04 | 36.77 | 53.01 | 19.39 | 86.24 | 74.02 | 77.49 | 56.52 | 84.13 | 87.08 | 61.44 | 72.64 |
| | | n2 | 39.72 | 31.50 | 52.87 | 26.96 | 95.95 | 94.83 | 92.29 | 69.95 | 83.26 | 87.04 | 96.88 | 73.68 |
| | | n3 | 47.01 | 36.68 | 77.88 | 31.45 | 96.02 | 88.60 | 81.59 | 93.74 | 93.08 | 91.74 | 93.97 | 85.27 |
| | 500 | n1 | 35.50 | 26.14 | 47.98 | 22.19 | 81.75 | 69.67 | 75.01 | 58.97 | 72.18 | 82.40 | 70.73 | 67.96 |
| | | n2 | 47.67 | 41.46 | 60.12 | 21.70 | 93.72 | 80.31 | 83.15 | 93.90 | 89.22 | 91.27 | 87.54 | 82.85 |
| | | n3 | 51.91 | 46.31 | 83.99 | 30.96 | 94.27 | 90.09 | 95.71 | 95.96 | 89.64 | 89.34 | 93.91 | 92.00 |
| 2 | 50 | n1 | 14.66 | 23.21 | 11.65 | 12.50 | 79.47 | 68.10 | 65.37 | 29.23 | 73.93 | 78.21 | 54.64 | 56.07 |
| | | n2 | 17.81 | 29.32 | 19.21 | 19.36 | 90.51 | 88.97 | 78.16 | 60.05 | 92.38 | 94.55 | 70.90 | 73.54 |
| | | n3 | 25.17 | 37.11 | 32.06 | 27.58 | 96.53 | 94.60 | 94.88 | 90.39 | 97.10 | 91.14 | 93.85 | 94.73 |
| | 100 | n1 | 17.88 | 22.46 | 23.08 | 13.14 | 84.73 | 67.69 | 96.23 | 41.78 | 66.17 | 93.22 | 49.91 | 55.63 |
| | | n2 | 29.96 | 35.72 | 40.36 | 18.54 | 97.07 | 82.30 | 92.80 | 66.55 | 81.06 | 96.39 | 70.62 | 59.12 |
| | | n3 | 34.42 | 48.52 | 76.89 | 34.58 | 99.88 | 99.23 | 91.36 | 94.67 | 85.75 | 94.49 | 97.32 | 87.03 |
| | 500 | n1 | 22.82 | 36.05 | 32.64 | 16.96 | 74.47 | 68.72 | 77.20 | 45.01 | 62.91 | 90.38 | 72.06 | 65.81 |
| | | n2 | 27.99 | 40.90 | 28.81 | 23.60 | 84.51 | 89.07 | 70.99 | 83.86 | 71.51 | 92.59 | 64.96 | 85.98 |
| | | n3 | 35.93 | 56.88 | 54.25 | 31.03 | 88.18 | 99.15 | 91.57 | 98.89 | 82.12 | 90.04 | 92.93 | 82.53 |
| 3 | 50 | n1 | 24.56 | 23.95 | 16.27 | 9.48 | 92.06 | 76.89 | 57.48 | 35.39 | 87.96 | 93.50 | 58.05 | 58.92 |
| | | n2 | 38.36 | 28.80 | 27.65 | 18.44 | 95.13 | 95.51 | 85.92 | 83.29 | 91.80 | 99.59 | 71.74 | 94.44 |
| | | n3 | 45.92 | 43.14 | 24.83 | 27.39 | 98.85 | 98.03 | 96.93 | 69.98 | 85.54 | 80.03 | 98.84 | 83.47 |
| | 100 | n1 | 40.15 | 32.34 | 21.11 | 15.73 | 75.55 | 59.11 | 71.09 | 38.36 | 74.78 | 91.32 | 55.08 | 44.88 |
| | | n2 | 47.85 | 39.54 | 30.06 | 19.78 | 92.04 | 87.65 | 99.42 | 88.57 | 81.88 | 96.41 | 77.72 | 66.36 |
| | | n3 | 51.07 | 56.04 | 40.34 | 24.92 | 96.75 | 88.06 | 90.46 | 92.93 | 79.97 | 95.32 | 86.71 | 74.00 |
| | 500 | n1 | 30.98 | 38.88 | 28.20 | 17.28 | 72.55 | 68.22 | 86.13 | 64.02 | 72.92 | 87.07 | 55.90 | 97.30 |
| | | n2 | 38.32 | 54.23 | 30.74 | 22.95 | 73.76 | 81.57 | 93.19 | 80.87 | 93.94 | 88.07 | 88.77 | 86.83 |
| | | n3 | 32.63 | 39.29 | 53.83 | 32.73 | 92.14 | 96.06 | 99.81 | 92.14 | 74.36 | 91.70 | 91.46 | 86.15 |
Table 5. Empirical powers (in %) in Simulation 1 for k = 4 with Δ = |ρ − ρ1|. Columns are grouped by Δ = 0, Δ = 0.2 and Δ = 0.4; within each group the order is T_S, T_ZBHW, T_ZLGY, T_NEW.

| Model | p | n | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW | T_S | T_ZBHW | T_ZLGY | T_NEW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 50 | n1 | 21.65 | 25.03 | 15.44 | 15.86 | 68.82 | 73.75 | 48.20 | 53.14 | 85.45 | 90.69 | 62.00 | 58.02 |
| | | n2 | 29.86 | 30.51 | 25.18 | 25.52 | 79.95 | 88.22 | 77.29 | 76.87 | 85.34 | 87.16 | 77.23 | 82.89 |
| | | n3 | 42.84 | 46.71 | 36.58 | 33.77 | 89.70 | 90.08 | 90.81 | 91.90 | 92.18 | 92.91 | 99.59 | 87.18 |
| | 100 | n1 | 38.15 | 32.58 | 40.44 | 16.68 | 79.18 | 69.48 | 79.49 | 55.13 | 85.71 | 86.76 | 57.58 | 65.11 |
| | | n2 | 41.78 | 33.44 | 43.60 | 25.29 | 97.86 | 88.90 | 94.23 | 70.35 | 86.35 | 87.28 | 86.59 | 77.05 |
| | | n3 | 49.54 | 38.03 | 54.88 | 28.44 | 92.64 | 88.77 | 93.52 | 88.27 | 92.20 | 96.48 | 98.93 | 85.57 |
| | 500 | n1 | 35.18 | 25.13 | 35.92 | 19.36 | 80.23 | 65.97 | 74.43 | 54.94 | 76.60 | 86.04 | 57.65 | 61.18 |
| | | n2 | 46.30 | 40.09 | 47.70 | 21.81 | 89.38 | 78.55 | 79.18 | 78.07 | 84.68 | 84.98 | 76.97 | 81.92 |
| | | n3 | 53.77 | 47.18 | 60.82 | 29.33 | 93.20 | 99.92 | 98.65 | 92.18 | 91.67 | 92.08 | 89.54 | 97.48 |
| 2 | 50 | n1 | 14.31 | 24.05 | 11.92 | 12.31 | 70.32 | 66.79 | 56.52 | 31.93 | 74.14 | 79.30 | 60.16 | 52.02 |
| | | n2 | 18.05 | 29.76 | 17.66 | 17.41 | 83.60 | 83.13 | 69.53 | 55.34 | 85.24 | 87.31 | 70.11 | 72.36 |
| | | n3 | 23.32 | 36.67 | 25.61 | 24.73 | 89.77 | 99.41 | 88.83 | 74.35 | 92.15 | 97.13 | 87.35 | 90.35 |
| | 100 | n1 | 18.40 | 21.46 | 18.49 | 12.43 | 75.40 | 64.28 | 94.90 | 42.28 | 75.42 | 86.75 | 51.21 | 53.55 |
| | | n2 | 27.16 | 30.42 | 28.40 | 16.85 | 86.06 | 80.49 | 89.90 | 62.13 | 86.34 | 86.53 | 73.62 | 66.88 |
| | | n3 | 33.42 | 42.27 | 59.00 | 33.75 | 91.41 | 97.50 | 87.04 | 77.99 | 92.59 | 97.23 | 96.35 | 87.48 |
| | 500 | n1 | 27.06 | 31.70 | 24.87 | 15.88 | 75.55 | 65.28 | 59.35 | 46.08 | 74.11 | 77.92 | 62.61 | 57.66 |
| | | n2 | 29.91 | 33.22 | 26.91 | 21.10 | 85.32 | 81.11 | 68.68 | 73.16 | 86.08 | 87.41 | 66.74 | 78.25 |
| | | n3 | 39.88 | 46.85 | 43.69 | 30.49 | 92.05 | 99.53 | 92.45 | 85.42 | 96.79 | 97.15 | 93.45 | 86.32 |
| 3 | 50 | n1 | 26.43 | 28.00 | 13.49 | 11.05 | 71.35 | 75.01 | 46.82 | 35.56 | 75.73 | 80.19 | 60.00 | 55.67 |
| | | n2 | 31.32 | 30.25 | 20.61 | 18.18 | 84.61 | 83.71 | 74.70 | 65.17 | 96.82 | 99.36 | 68.29 | 75.41 |
| | | n3 | 39.49 | 40.67 | 26.70 | 26.66 | 98.93 | 99.29 | 89.05 | 77.93 | 96.84 | 92.78 | 98.46 | 86.39 |
| | 100 | n1 | 32.58 | 27.37 | 18.08 | 14.59 | 76.15 | 75.39 | 54.90 | 37.82 | 80.57 | 77.52 | 51.29 | 45.18 |
| | | n2 | 41.42 | 38.75 | 24.78 | 18.93 | 95.05 | 90.79 | 89.16 | 79.30 | 86.85 | 88.41 | 65.34 | 73.32 |
| | | n3 | 40.98 | 45.41 | 36.78 | 25.85 | 91.91 | 90.28 | 99.92 | 95.59 | 92.14 | 92.55 | 83.80 | 87.80 |
| | 500 | n1 | 29.06 | 31.83 | 23.73 | 17.58 | 73.89 | 65.53 | 66.31 | 50.17 | 74.02 | 78.89 | 57.15 | 60.13 |
| | | n2 | 35.65 | 40.47 | 25.97 | 21.57 | 84.86 | 81.34 | 80.76 | 75.23 | 86.38 | 97.42 | 73.23 | 83.25 |
| | | n3 | 35.11 | 38.45 | 42.10 | 32.20 | 91.10 | 99.66 | 91.49 | 91.53 | 91.48 | 92.13 | 93.71 | 87.60 |
Table 6. Testing results (p-values) for the financial dataset.

| Hypothesis | T_S | T_ZBHW | T_ZLGY | T_NEW | d.f. |
|---|---|---|---|---|---|
| Q1 vs. Q2 vs. Q3 vs. Q4 | 0 | 1.59 × 10⁻⁸ | 0 | 0.01 | 1.74 |
| Q1 vs. Q2 vs. Q3 | 0.08 | 0.49 | 0.11 | 0.24 | 2.09 |
Table 7. Empirical sizes (%) of the financial dataset with the nominal level α* = 5%.

| k | T_S | T_ZBHW | T_ZLGY | T_NEW |
|---|---|---|---|---|
| 2 | 39.29 | 21.01 | 12.37 | 5.73 |
| 3 | 48.16 | 33.89 | 24.25 | 6.15 |
Share and Cite

Wang, J.; Zhu, T.; Zhang, J.-T. Test of the Equality of Several High-Dimensional Covariance Matrices: A Normal-Reference Approach. Mathematics 2025, 13, 295. https://doi.org/10.3390/math13020295
