Article

A Novel Robust Test to Compare Covariance Matrices in High-Dimensional Data

Hasan Bulut
Department of Statistics, Faculty of Science, Ondokuz Mayıs University, 55139 Samsun, Türkiye
Axioms 2025, 14(6), 427; https://doi.org/10.3390/axioms14060427
Submission received: 16 April 2025 / Revised: 21 May 2025 / Accepted: 27 May 2025 / Published: 30 May 2025
(This article belongs to the Special Issue Computational Statistics and Its Applications, 2nd Edition)

Abstract

The homogeneity of covariance matrices is one of the most important assumptions in many multivariate hypothesis tests, such as Hotelling's T² and MANOVA. The sample covariance matrix, however, is singular in high-dimensional data, where the number of variables (p) is greater than the sample size (n); its determinant is therefore zero, and its inverse cannot be calculated. Although many studies addressing this problem are discussed in the Introduction, they have not focused on outliers in the data. In this study, we propose a test statistic that can be used on high-dimensional datasets without being affected by outliers. Because the proposed test is permutation-based, it requires no distributional assumption. We investigate the performance of the proposed test through simulation studies and a real data example. In all cases, the proposed test demonstrates good type-1 error control, power, and robustness. Additionally, we have implemented the test as an R function in the “MVTests” package, so it can easily be applied to real datasets.

1. Introduction

In many scientific disciplines, such as genomics, finance, and medical research, comparing the variability structures of multiple groups plays a crucial role in understanding group differences. In multivariate data analysis, these structures are often characterized by covariance matrices, which capture not only the spread of variables but also their mutual relationships. Before conducting comparisons of mean vectors using techniques such as the Hotelling T² test or MANOVA, it is essential to verify whether the covariance matrices across groups are homogeneous [1,2].
The assumption of homogeneity of covariance matrices, often referred to as the multivariate version of the Behrens–Fisher problem, underlies the validity of many multivariate hypothesis tests. Violations of this assumption can lead to misleading statistical conclusions. For this reason, testing the equality of covariance matrices is a fundamental problem in multivariate statistics [1].
Classical methods, such as Box’s M test [3], have long been used to assess the equality of covariance matrices. However, this test requires that the sample size be larger than the number of variables, and it is highly sensitive to outliers. As high-dimensional datasets have become increasingly common in modern research, especially when the number of variables p exceeds the sample size n, the limitations of classical approaches have become more pronounced.
In recent years, several methods have been proposed for application in high-dimensional settings. Notable among them are the tests developed by Schott [4], Srivastava and Yanagihara [5], and Li and Chen [6], which are based on Frobenius norms of covariance matrix differences. Additionally, Yu [7] introduced a permutation-based approach that does not rely on any distributional assumptions and has shown promising results for multi-sample high-dimensional comparisons.
Despite these advances, the existing methods still suffer from a major limitation: they are not robust to outliers. This study addresses this gap by proposing a novel test statistic that is applicable in high-dimensional settings and remains reliable even in the presence of outliers. Our approach is based on minimum regularized covariance determinant (MRCD) estimators [8] and utilizes a permutation-based framework, eliminating the need for distributional assumptions. The proposed method was evaluated through extensive simulation studies and a real data application to demonstrate its type-1 error control, power, and robustness.
The rest of the paper is organized as follows. Section 2 reviews the tests proposed in the literature for testing the homogeneity of covariance matrices in high-dimensional data. In Section 3, we introduce the MRCD estimators. Section 4 introduces the proposed test procedure. Section 5 presents a simulation study comparing the type-1 error control, power, and robustness of the proposed approach with the T_M statistic proposed by Yu [7]. Section 6 exemplifies the use of the proposed approach with a real data application. Section 7 introduces the R function developed to apply the proposed approach in real data studies. Finally, Section 8 presents a discussion.

2. Literature Review

The homogeneity of variances is an important issue for inferential statistics [1,4]. In univariate mean tests (the t-test or ANOVA), unequal variances are referred to as the univariate Behrens–Fisher problem. Similarly, the Hotelling T² and MANOVA methods used for the comparison of multivariate mean vectors assume that the variance–covariance matrices of the groups are equal [1,2]. This assumption is called the homogeneity assumption, and not meeting it is called the multivariate Behrens–Fisher problem. Hypotheses about the equality of covariance matrices are established as follows:
H_0: \Sigma_1 = \Sigma_2 = \cdots = \Sigma_g \qquad \text{vs.} \qquad H_1: \text{at least one } \Sigma_j \text{ is different from the others}, \quad j = 1, 2, \ldots, g \qquad (1)
where Σ_j is the covariance matrix of the jth group, and g is the number of groups. To test these hypotheses, let us draw samples with sample sizes of n_1, n_2, …, n_g units from multivariate normal distributions, and let the sample covariance matrices of these samples be S_1, S_2, …, S_g, respectively. Accordingly, we can use the following test statistic to test the null hypothesis given in (1):
U = -2(1 - c_1)\ln M \sim \chi^2_{p(p+1)(g-1)/2} \qquad (2)
where
M = \frac{|S_1|^{(n_1-1)/2}\,|S_2|^{(n_2-1)/2} \cdots |S_g|^{(n_g-1)/2}}{|S_{pool}|^{\sum_{j=1}^{g}(n_j-1)/2}}, \qquad S_{pool} = \frac{\sum_{j=1}^{g}(n_j-1)S_j}{\sum_{j=1}^{g}(n_j-1)} \qquad (3)
c_1 = \begin{cases} \dfrac{2p^2+3p-1}{6(p+1)(g-1)}\left(\displaystyle\sum_{j=1}^{g}\dfrac{1}{n_j-1} - \dfrac{1}{\sum_{j=1}^{g}(n_j-1)}\right), & n_i \neq n_j,\ i \neq j = 1, 2, \ldots, g \\[2ex] \dfrac{(g+1)(2p^2+3p-1)}{6g(p+1)(n-1)}, & n_1 = n_2 = \cdots = n_g = n. \end{cases} \qquad (4)
When U > \chi^2_{p(p+1)(g-1)/2;\,\alpha}, we reject the null hypothesis [1,2]. However, the test statistic given in Equation (2) can only be used for low-dimensional data, where p < min(n_1, n_2, …, n_g), and when the samples come from a multivariate normal distribution [3].
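For illustration, a minimal R sketch of the statistic in Equations (2)–(4) might look as follows. The function name boxM_sketch is hypothetical, and the log-determinant formulation is simply a numerically safer way to evaluate ln M; this is not the implementation used in the paper.

```r
# Minimal sketch of Box's M test, Equations (2)-(4); boxM_sketch is a
# hypothetical helper, not code from the paper or the MVTests package.
boxM_sketch <- function(data, group) {
  group <- as.factor(group)
  p <- ncol(data)
  g <- nlevels(group)
  n_j <- as.numeric(table(group))
  S_list <- lapply(split(as.data.frame(data), group), cov)
  # Pooled covariance matrix, Equation (3)
  S_pool <- Reduce(`+`, Map(function(S, n) (n - 1) * S, S_list, n_j)) / sum(n_j - 1)
  # ln M computed from log-determinants
  logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
  lnM <- 0.5 * sum((n_j - 1) * sapply(S_list, logdet)) -
    0.5 * sum(n_j - 1) * logdet(S_pool)
  # Correction factor c1, Equation (4); the two branches coincide when all n_j are equal
  c1 <- (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1)) *
    (sum(1 / (n_j - 1)) - 1 / sum(n_j - 1))
  U <- -2 * (1 - c1) * lnM
  df <- p * (p + 1) * (g - 1) / 2
  list(U = U, df = df, p.value = pchisq(U, df, lower.tail = FALSE))
}
```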
In addition to hypothesis testing procedures, recent studies have focused on improving covariance matrix estimation in high-dimensional settings. Among these, shrinkage-based estimators have attracted particular attention due to their ability to produce well-conditioned covariance estimates when the number of variables exceeds the sample size. Ledoit and Wolf [9] introduced a linear shrinkage estimator that combines the sample covariance matrix with a structured target to stabilize estimation. Similarly, Schäfer and Strimmer [10] proposed a shrinkage approach particularly suited to applications in genomics and functional data analysis. Although our study focuses on hypothesis testing, incorporating such estimation techniques can improve the reliability of test statistics in challenging high-dimensional scenarios.
Several tests have been proposed to test hypotheses about the equality of covariance matrices in high-dimensional data [4,5,6,7,11]. Schott [4] and Srivastava and Yanagihara [5] suggested tests to compare any two covariance matrices based on the Frobenius norm. Similarly, Li and Chen [6] proposed a test statistic based on the Frobenius norm to test the hypothesis about the equality of the covariance matrices of two independent high-dimensional groups. Their test statistic is given in Equation (5).
T_{n_1,n_2} = A_{n_1} + A_{n_2} - 2C_{n_1 n_2} \qquad (5)
where, under the assumption that μ_1 = μ_2 = 0, we can calculate the A_{n_h} (h = 1, 2) and C_{n_1 n_2} values as follows:
A_{n_h} = \frac{1}{n_h(n_h-1)} \sum_{i \neq j} \left(X_{hi}^{\top} X_{hj}\right)^2, \qquad C_{n_1 n_2} = \frac{1}{n_1 n_2} \sum_{i} \sum_{j} \left(X_{1i}^{\top} X_{2j}\right)^2. \qquad (6)
Li and Chen [6] defined the asymptotic distribution of the T_{n_1,n_2} statistic under the null hypothesis H_0: Σ_1 = Σ_2 = Σ as follows:
\frac{T_{n_1,n_2}}{\hat{\sigma}_{n_1,n_2}} \sim N(0,1), \qquad \hat{\sigma}^2_{n_1,n_2} = 4\left(\frac{1}{n_1} + \frac{1}{n_2}\right)^2 \mathrm{tr}^2\!\left(\Sigma^2\right). \qquad (7)
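Under the zero-mean assumption above, Equations (5)–(7) can be sketched in R as follows. The helper name li_chen_sketch and the plug-in choice for tr(Σ²) are my own; the published test involves additional refinements that are omitted here.

```r
# Sketch of the Li-Chen two-sample statistic, Equations (5)-(7), assuming
# zero mean vectors; li_chen_sketch is a hypothetical helper name.
li_chen_sketch <- function(X1, X2) {
  A_n <- function(X) {
    n <- nrow(X)
    G <- tcrossprod(X)                              # G[i, j] = X_i' X_j
    (sum(G^2) - sum(diag(G)^2)) / (n * (n - 1))     # sum over i != j, Equation (6)
  }
  C_n <- function(X1, X2) sum((X1 %*% t(X2))^2) / (nrow(X1) * nrow(X2))
  T_stat <- A_n(X1) + A_n(X2) - 2 * C_n(X1, X2)     # Equation (5)
  # Simple plug-in estimate of tr(Sigma^2) under H0, used in Equation (7)
  tr_S2 <- (A_n(X1) + A_n(X2)) / 2
  sigma_hat <- sqrt(4 * (1 / nrow(X1) + 1 / nrow(X2))^2 * tr_S2^2)
  c(T = T_stat, z = T_stat / sigma_hat)             # z is compared to N(0, 1)
}
```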
This test proposed by Li and Chen [6] can test the null hypothesis given in (1) only when the number of groups is two. To solve this problem, Wang [11] proposed dividing the null hypothesis given in (1) into g − 1 pieces and using the T_{n_1,n_2} statistic defined in Equation (5) to test these hypotheses. For this purpose, we can rewrite the null hypothesis in (1) as H_0 = H_{02} ∩ H_{03} ∩ ⋯ ∩ H_{0g}, where the H_{0j} and H_{1j} hypotheses can be written as follows:
H_{0j}: \Sigma_{j-1} = \Sigma_j \quad \text{vs.} \quad H_{1j}: \Sigma_{j-1} \neq \Sigma_j, \qquad j = 2, 3, \ldots, g. \qquad (8)
Because there are g − 1 comparisons, the Bonferroni correction is used. Accordingly, the test is performed according to the largest, in absolute value, of the calculated T_{j−1,j} (j = 2, …, g) statistics. If the null hypothesis cannot be rejected according to this statistic, then the null hypothesis given by (1) cannot be rejected.
Yu [7] suggested a permutation test to compare covariance matrices and to test the null hypothesis in Equation (1) in high-dimensional datasets. Let S_j be the sample covariance matrix of the jth group (j = 1, 2, …, g). The pooled covariance matrix can be calculated as follows:
S_{pool} = \frac{\sum_{j=1}^{g} n_j S_j}{\sum_{j=1}^{g} n_j}. \qquad (9)
If the mean vectors of the groups are zero vectors, M_hk can be used for the pairwise comparison of the covariance matrices of the hth and kth groups. We can define M_hk as follows:
M_{hk} = \max\{\lambda_1, \lambda_2, \ldots, \lambda_s\} \qquad (10)
where λ_1, λ_2, …, λ_s are the non-zero eigenvalues of the matrix (n_h n_k / n) S_pool.d^(-1/2) (S_h − S_k) S_pool.d^(-1/2), and S_pool.d is a diagonal matrix with the same diagonal elements as S_pool. Yu [7] proposed the following test statistic:
T_M^{YU} = \frac{2}{g(g-1)} \sum_{h<k} M_{hk}. \qquad (11)
Because the distribution of the T_M^YU statistic is not known, Yu [7] proposed using a permutation approach to obtain the sampling distribution of the test statistic. Detailed information is available in [7].
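As a rough illustration of Equations (9)–(11), the observed value of Yu's statistic could be computed in R as sketched below. The helper name tm_yu_sketch is hypothetical, the permutation loop used to obtain the null distribution is omitted, and the group covariance matrices are computed with cov(), i.e., centered at the sample means rather than assumed to have zero means.

```r
# Sketch of the observed T_M^YU statistic, Equations (9)-(11); tm_yu_sketch
# is a hypothetical helper name and the permutation step is omitted.
tm_yu_sketch <- function(X, group) {
  group <- as.factor(group)
  g <- nlevels(group)
  n_h <- as.numeric(table(group))
  n <- sum(n_h)
  S_list <- lapply(split(as.data.frame(X), group), cov)
  S_pool <- Reduce(`+`, Map(`*`, S_list, n_h)) / n            # Equation (9)
  D_inv_sqrt <- diag(1 / sqrt(diag(S_pool)))                  # S_pool.d^(-1/2)
  total <- 0
  for (h in 1:(g - 1)) for (k in (h + 1):g) {
    A <- (n_h[h] * n_h[k] / n) *
      D_inv_sqrt %*% (S_list[[h]] - S_list[[k]]) %*% D_inv_sqrt
    lam <- eigen(A, symmetric = TRUE, only.values = TRUE)$values
    total <- total + max(lam[abs(lam) > 1e-10])               # M_hk, Equation (10)
  }
  2 * total / (g * (g - 1))                                   # Equation (11)
}
```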
All of the approaches introduced above are sensitive to outliers in the dataset since they are based on classical estimations. This study aims to propose a test statistic to compare the covariance matrices of g independent groups in high-dimensional data contaminated with outliers. The proposed test statistic is based on the minimum regularized covariance determinant (MRCD) estimators introduced in Section 3, and the details of the proposed test are given in Section 4.

3. MRCD Estimators

Outliers in multivariate data can significantly distort classical estimates of location and dispersion. The minimum covariance determinant (MCD) estimator is a robust alternative that is less sensitive to outliers, which makes it well suited for estimating the location and scatter parameters in contaminated data. However, MCD estimates cannot be computed in high-dimensional data where the number of variables p exceeds the sample size n, which restricts their applicability in such settings.
Boudt et al. [8] proposed minimum regularized covariance determinant (MRCD) estimators of location and scatter parameters without being affected by outliers in high-dimensional data. The MRCD estimator retains the good breakdown point properties of the MCD estimator [8,12].
To compute the MRCD estimates, the data are first standardized using the median and Q_n as the univariate location and scatter estimators [13], and a target matrix T, which is symmetric and positive definite, is chosen. The regularized covariance matrix of any subset H obtained from the standardized data Z is computed as follows:
K(H) = \rho T + (1-\rho)\, c_\alpha\, S_Z(H) \qquad (12)
where ρ is the regularization parameter, c_α is the consistency factor defined by Croux and Haesbroeck [14], and
S_Z(H) = \frac{1}{h-1}\,\big(Z_H - \mu_Z(H)\big)^{\top}\big(Z_H - \mu_Z(H)\big), \qquad \mu_Z(H) = \frac{1}{h} Z_H^{\top}\mathbf{1}_h. \qquad (13)
The MRCD estimates are obtained from the subset H_MRCD that solves the minimization problem given by Equation (14):
H_{MRCD} = \underset{H \in \mathcal{H}}{\operatorname{argmin}}\ \big(\det K(H)\big)^{1/p} \qquad (14)
where 𝓗 is the set of all subsets of size h in the data. Finally, the MRCD location and scatter estimators are obtained as given in Equations (15) and (16):
\hat{\mu}_{MRCD} = v_X + D_X\, \mu_Z(H_{MRCD}) \qquad (15)
\hat{\Sigma}_{MRCD} = D_X\, Q\, \Lambda^{1/2} \Big[\rho I + (1-\rho)\, c_\alpha\, S_w(H_{MRCD})\Big] \Lambda^{1/2} Q^{\top} D_X \qquad (16)
where Λ and Q are the eigenvalue and eigenvector matrices of T, respectively, and S_w(H_MRCD) is calculated as follows:
S_w(H_{MRCD}) = \Lambda^{-1/2} Q^{\top} S_Z(H_{MRCD})\, Q\, \Lambda^{-1/2}. \qquad (17)
More detailed information on the MRCD estimators can be found in [8]. In this study, we used the “rrcov” package in R for all MRCD calculations [15]. When using this package, we assumed that the outlier rate of the data was known, and we kept the default values of the regularization parameter (rho) and the target matrix, which the function determines automatically from the dataset.
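For example, MRCD estimates can be obtained in R roughly as follows. The call assumes the rrcov interface as I recall it (CovMrcd() with a subset proportion alpha and the accessor functions getCenter()/getCov()); the data matrix X below is simulated only to make the snippet self-contained.

```r
library(rrcov)

set.seed(1)
X <- matrix(rnorm(20 * 100), nrow = 20, ncol = 100)   # n = 20 observations, p = 100 variables

# MRCD fit using a 75% subset; rho and the target matrix are left at their
# package defaults, as in the paper.
fit <- CovMrcd(X, alpha = 0.75)

mu_mrcd    <- getCenter(fit)   # robust location estimate
Sigma_mrcd <- getCov(fit)      # regularized robust scatter estimate
```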

4. Proposed Test Statistic

To test the equality of covariance matrices in high-dimensional data, the test statistics given by Equations (10) and (11) have been shown to be more successful than alternative methods [7]. However, since these test statistics are based on classical covariance matrices, they are affected by outliers in the dataset. Moreover, the S_pool.d matrix used in these test statistics is a diagonal matrix formed from the diagonal elements of the S_pool covariance matrix; therefore, it only considers the variances of the variables and ignores the relationships among them.
In this study, to test the null hypothesis given by (1) in contaminated high-dimensional data, we propose using the MRCD estimators introduced in Section 3 instead of the classical estimators in the test statistics given by Equations (10) and (11). We also recommend using the S_pool^(-1/2) matrix directly instead of the S_pool.d^(-1/2) matrix, so that the relationships among the variables are taken into account. In the proposed approach, the S_pool^(1/2) matrix can be calculated by using the spectral decomposition as follows:
S_{pool} = S_{pool}^{1/2}\, S_{pool}^{1/2} \qquad (18)
S_{pool}^{1/2} = P\, \Lambda^{1/2} P^{\top} \qquad (19)
where Λ is the diagonal eigenvalue matrix of S_pool, and P is the orthogonal eigenvector matrix. We calculate the S_pool matrix based on the MRCD estimates, as given in Equation (20):
S_{pool} = \frac{\sum_{h=1}^{g} n_h\, S_{MRCD.h}}{\sum_{h=1}^{g} n_h} \qquad (20)
where S_MRCD.h is the MRCD covariance matrix of the hth group (h = 1, 2, …, g).
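A short R sketch of Equations (18)–(20) is given below. The helper names mat_inv_sqrt and pool_mrcd are hypothetical, and the eigenvalue floor tol is an assumption added only to guard against numerically negative eigenvalues.

```r
# Symmetric (inverse) square root of a covariance matrix via its spectral
# decomposition, Equations (18)-(19); mat_inv_sqrt is a hypothetical helper.
mat_inv_sqrt <- function(S, tol = 1e-10) {
  e <- eigen(S, symmetric = TRUE)
  vals <- pmax(e$values, tol)              # guard against tiny negative eigenvalues
  e$vectors %*% diag(1 / sqrt(vals)) %*% t(e$vectors)
}

# Pooled matrix built from the group-wise MRCD covariance estimates,
# Equation (20); S_mrcd is a list of MRCD covariance matrices and n_h the
# corresponding group sample sizes.
pool_mrcd <- function(S_mrcd, n_h) {
  Reduce(`+`, Map(`*`, S_mrcd, n_h)) / sum(n_h)
}
```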
If the mean vectors of the groups are zero vectors, M_hk can be used for the pairwise comparison of the covariance matrices of the hth and kth groups as follows:
M_{hk} = \max\{\lambda_1, \lambda_2, \ldots, \lambda_s\} \qquad (21)
where λ_1, λ_2, …, λ_s are the non-zero eigenvalues of the matrix (n_h n_k / n) S_pool^(-1/2) (S_h − S_k) S_pool^(-1/2), and S_pool is calculated as given in Equation (20). We propose the following test statistic to test the null hypothesis given in (1):
T_M^{MRCD} = \frac{2}{g(g-1)} \sum_{h<k} M_{hk}. \qquad (22)
Because the distribution of the T_M^MRCD statistic is not known, we can use a permutation approach to obtain the sampling distribution of the test statistic, as proposed by Yu [7]. For this purpose, we propose the following Algorithm 1 to test the null hypothesis:
Algorithm 1: Robust test for comparison of covariance matrices
  • (i) Let X_hi be the ith (i = 1, 2, …, n_h) observation vector in the hth group (h = 1, 2, …, g). Combine all X_hi observation vectors into the data matrix X_{n×p}.
  • (ii) Randomly distribute the observations in the data X_{n×p} into g groups such that there are n_h observations in the hth group. After this operation, let X_hi^(1) represent the n_h observations in the hth group.
  • (iii) For each group, calculate the MRCD covariance matrices S_MRCD.h (h = 1, 2, …, g) based on the X_hi^(1) observations and calculate the statistic given by Equation (22). Let us denote this statistic as T_M^MRCD(1).
  • (iv) Repeat steps (i)–(iii) R times and calculate the statistics T_M^MRCD(r) (r = 1, 2, …, R) at each step.
  • (v) Calculate the p-value as follows:
         p\text{-value} = \#\{r : T_M^{MRCD(r)} > T_M^{MRCD}\} / R. \qquad (23)
When the p-value is lower than the significance level, the null hypothesis given by (1) is rejected, indicating that the covariance matrices are not homogeneous.
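A compact R sketch of Algorithm 1 is shown below. It assumes a helper tm_mrcd(X, group) that returns the statistic in Equation (22) for a given grouping (for instance, by combining CovMrcd() fits for each group with the pooling and eigenvalue steps sketched above); both the helper and the function name perm_cov_test are hypothetical.

```r
# Sketch of the permutation procedure in Algorithm 1; tm_mrcd() is an assumed
# helper computing the T_M^MRCD statistic of Equation (22).
perm_cov_test <- function(X, group, tm_mrcd, R = 100) {
  T_obs <- tm_mrcd(X, group)               # observed statistic
  T_perm <- replicate(R, {
    perm_group <- sample(group)            # step (ii): reshuffle the group labels
    tm_mrcd(X, perm_group)                 # step (iii): recompute the statistic
  })
  p_value <- mean(T_perm > T_obs)          # Equation (23)
  list(statistic = T_obs, p.value = p_value, permuted = T_perm)
}
```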
If the mean vectors are not equal to the zero vectors, then the test statistic given by Equation (22) is instead defined as follows:
T_M^{MRCD} = \frac{2}{g(g-1)} \sum_{h<k} \max\Big\{\text{eigenvalues of } \frac{n_h n_k}{n}\big(S_{MRCD.h} - S_{MRCD.k}\big)\Big\} \qquad (24)
where S_MRCD.h = (1/n_h) Σ_{i=1}^{n_h} Z_hi Z_hi^⊤ with Z_hi = S_pool^(-1/2)(X_hi − M_h), (·)^⊤ is the transpose operator, S_pool is calculated as given in Equation (20), and M_h is the MRCD location estimate of the hth group.
Lemma 1.
Let b be the number of test statistics computed from R randomly sampled permutations (without replacement) that are more extreme than the observed test statistic T_M^MRCD. Then, the p-value defined as
p = \frac{b}{R}
is a consistent estimator under the null hypothesis and converges to the true significance level as R → ∞, provided that the permutation distribution approximates the true null distribution.
This formulation aligns with the permutation-based inference strategy described by Yu [7] and is widely used in the literature for high-dimensional testing problems. Although alternatives, such as (b + 1)/(R + 1), have been proposed [16] to avoid zero p-values and improve small-sample behavior, we did not observe any zero p-values in our simulation settings. Thus, the classical estimator b/R remains practically appropriate and computationally efficient in our context.
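As a quick numerical illustration of the difference between the two estimators (with b and R chosen only for this example): with b = 0 exceedances out of R = 1000 permutations, b/R is exactly zero, whereas (b + 1)/(R + 1) is about 0.001.

```r
# Comparing the two p-value estimators discussed above in an extreme case
b <- 0; R <- 1000
c(classical = b / R, adjusted = (b + 1) / (R + 1))   # 0 versus roughly 0.001
```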
The test we propose can test whether the covariance matrices of high-dimensional independent groups are equal or not without being affected by outliers in the dataset. Moreover, since it is a permutation test, it does not require any distributional assumption.
The proposed test statistic T_M^MRCD integrates a robust covariance estimation approach (MRCD) with a permutation-based inference strategy. This design grants the test several important theoretical properties:
  • Nonparametric nature: The test does not require multivariate normality. Its null distribution is derived empirically via permutation of the group labels. Under the null hypothesis and assuming exchangeability, this ensures exact control of type-1 error in finite samples.
  • Robustness to outliers: Unlike the T_M^YU test, the T_M^MRCD test based on MRCD estimates is robust to outliers in the data. The MRCD estimator, while not affine equivariant due to the use of a fixed regularization matrix, offers strong resistance to outliers. It guarantees a minimum eigenvalue bounded away from zero when the regularization parameter is applied, resulting in a 100% implosion breakdown value [8]. Though not maximally robust in all directions, this feature provides practical protection against severe contamination.
  • Finite sample validity: Since permutation testing is used, type-1 error control holds in finite samples, making the procedure reliable even for small sample sizes, provided that the group labels are exchangeable.
  • Asymptotic behavior: Although a formal proof of asymptotic consistency is beyond the scope of this paper, the simulation results show that the type-1 error rates stabilize near the nominal level as the sample size increases. This is consistent with findings from the permutation test literature.
  • High-dimensional applicability: Unlike Box’s M test, which fails in high-dimensional contexts, T_M^MRCD remains valid and operational. This makes it suitable for applications such as gene expression analysis, where the number of variables often exceeds the sample size.
These properties make T_M^MRCD a robust and flexible tool for covariance matrix comparison in challenging data environments.

5. Simulation Study

In this section, we perform simulation studies to compare our statistic with the T_M statistic proposed by Yu [7] and the M statistic suggested by Box [3]. Since the M statistic can only be used for low-dimensional data, it is not included in the comparisons for high-dimensional data. In the simulation studies, we compared the test statistics in terms of type-1 error, power, and robustness. The significance level was α = 0.05 for all tests. Although these test statistics can compare the covariance matrices of two or more groups, only three group covariance matrices were compared in the simulation studies. In other words, each test evaluated the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3.
In the simulation study, the sample sizes of the groups were taken as equal to each other. Accordingly, the sample sizes n h were taken as 10, 30, and 60. The number of variables p was determined as 5, 10, 50, 100, and 300. Therefore, both low-dimensional and high-dimensional data were included in the dataset used in simulation studies. In each step of the simulation study, the number of trials R was taken as 1000 repetitions.

5.1. Comparisons of Type-1 Error Rates

To investigate the type-1 error performance of our proposed test, we compared the rejection rates when the null hypothesis was true. Similar to the simulation study used by Yu [7], the diagonal elements of the Σ_1 matrix were σ_1ii = 1 (i = 1, 2, …, p), and its off-diagonal elements were σ_1ij = 0.6^|i−j| (i ≠ j = 1, 2, …, p). In order for the null hypothesis to be true, we defined the other covariance matrices as Σ_2 = Σ_3 = Σ_1.
We generated datasets under two scenarios: a multivariate normal distribution and a mixed distribution. We give details about these scenarios below. To make more precise comparisons in terms of type-1 error, the average relative error (ARE) values were calculated for each statistic. The ARE value measures the deviation of the empirical type-1 error rates of a test statistic from the nominal significance level and is calculated as given in Equation (25):
ARE = \frac{100}{\theta\,\alpha} \sum_{i=1}^{\theta} \left|\hat{\alpha}_i - \alpha\right| \qquad (25)
where θ is the number of type-1 error rates calculated for each statistic in the table, α̂_i is the ith estimated type-1 error rate, and α is the nominal significance level used in the testing process. In this study, we used a significance level of α = 5%. The test statistic with the smallest ARE value has the best performance in terms of the type-1 error rate.
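Equation (25) can be computed with a one-line helper such as the following sketch (the function name are() and the example rates are mine, not values from the paper).

```r
# Average relative error of a set of empirical type-1 error rates, Equation (25)
are <- function(alpha_hat, alpha = 0.05) {
  100 * mean(abs(alpha_hat - alpha)) / alpha
}

are(c(0.047, 0.059, 0.076))   # about 25.3 for these illustrative rates
```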
  • Scenario 1: We generated datasets from a multivariate normal distribution. For this purpose, the mean vectors were set as μ_1 = μ_2 = μ_3 = 0 without any loss of generality. Accordingly, observations in each group were randomly generated from X_h ~ N_p(μ_h, Σ_h) (h = 1, 2, 3), with sample sizes n_h = 10, 30, 60. We tested the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3, which is true in this case. The obtained results are presented in Table 1 and visualized in Figure 1; a data-generation sketch is given below.
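A minimal R sketch of this data-generation step follows, with p and n_h fixed at one illustrative combination; MASS::mvrnorm is used here for convenience and is not necessarily the generator used in the original study.

```r
# Sketch of the Scenario 1 design: three groups from N_p(0, Sigma_1) with
# sigma_1ij = 0.6^|i - j|; p = 50 and n_h = 30 are one illustrative setting.
library(MASS)

p   <- 50
n_h <- 30
Sigma1 <- 0.6^abs(outer(1:p, 1:p, "-"))    # diagonal = 1, off-diagonal = 0.6^|i-j|

set.seed(123)
X_list <- lapply(1:3, function(h) mvrnorm(n_h, mu = rep(0, p), Sigma = Sigma1))
X      <- do.call(rbind, X_list)
group  <- rep(1:3, each = n_h)
```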
The results presented in Table 1 and illustrated in Figure 1 show the behaviors of the three test statistics in terms of type-1 error control under various sample sizes and dimensional settings. The proposed T_M^MRCD test consistently yielded rejection rates that were closest to the nominal 5% level across all considered scenarios. This accuracy is reflected in its notably lower average relative error (ARE) compared to the alternative methods. In contrast, T_M^YU tended to deviate from the nominal level, especially for higher values of p, while Box’s M test was applicable only in low-dimensional settings and exhibited greater variability. The graphical representation reinforces these numerical findings and highlights the comparative advantage of the proposed test in controlling the type-1 error rate under Scenario 1.
  • Scenario 2: The mixed distribution data in this scenario were generated in two stages, as follows:
    • The observation vectors Z_hi = (Z_hi1, Z_hi2, …, Z_hip)^⊤ were generated. Here, the first p_1 = p/2 variables came from the standard normal distribution N(0, 1), and the remaining p_2 = p − p_1 variables came from the standardized chi-square distribution (χ²_(2) − 2)/2.
    • The X_hi observation vectors (i = 1, 2, …, n_h) were obtained with the transformation X_hi = Σ_h^(1/2) Z_hi (h = 1, 2, 3). The X_hi observations therefore had a mean vector of 0 and a covariance matrix Σ_h.
After the datasets were generated in this way, we tested the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3, which is actually true. The p, n_h, and Σ_h values used here correspond to those given in Scenario 1. The results are presented in Table 2 and visualized in Figure 2; a sketch of the generation step follows.
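A minimal R sketch of this two-stage generation, under the assumption that p_1 is taken as ⌊p/2⌋ when p is odd, is as follows; the helper names mat_sqrt and gen_mixed_group are hypothetical.

```r
# Sketch of the Scenario 2 generation: half the coordinates standard normal,
# the rest standardized chi-square(2), then transformed by Sigma_h^{1/2}.
mat_sqrt <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(sqrt(pmax(e$values, 0))) %*% t(e$vectors)
}

gen_mixed_group <- function(n_h, Sigma_h) {
  p  <- ncol(Sigma_h)
  p1 <- floor(p / 2)
  p2 <- p - p1
  Z  <- cbind(matrix(rnorm(n_h * p1), n_h, p1),
              matrix((rchisq(n_h * p2, df = 2) - 2) / 2, n_h, p2))
  Z %*% mat_sqrt(Sigma_h)    # rows have mean 0 and covariance Sigma_h
}
```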
The simulation results under the mixed distributed data are summarized in Table 2 and visualized in Figure 2. The T_M^MRCD statistic continued to exhibit robust performance in controlling the type-1 error rate, yielding rejection rates that remained consistently close to the nominal level across all combinations of n and p. This is reflected in the lowest ARE value (6.254) among the three methods. The T_M^YU test also performed reasonably well in this setting, with moderate deviations from the nominal 5% level, particularly at higher dimensions. In contrast, the classical Box’s M test showed severe inflation of type-1 error rates under the mixed distribution, with values exceeding 25% in all applicable scenarios. For this reason, the Box M test was excluded from Figure 2 to improve the readability and visual interpretability of the results.
In both Figure 1 and Figure 2, the horizontal dashed line represents the nominal 5% significance level; test statistics with values closer to this line are considered more accurate in terms of type-1 error control.

5.2. Comparison of Powers

To examine the power performance of the proposed approach, we compared the rejection rates when the null hypothesis was false. Similar to the simulation design used by Yu [7], the Σ_1 matrix was defined as shown in Section 5.1. However, we defined Σ_2 = 2Σ_1 and Σ_3 = 10Σ_1 to ensure the null hypothesis was false. Datasets were generated under two different scenarios: a multivariate normal distribution and a mixed distribution.
  • Scenario 3: We generated datasets from a multivariate normal distribution. For this purpose, the mean vectors were set as μ_1 = μ_2 = μ_3 = 0 without any loss of generality. Accordingly, observations in each group were randomly generated from X_h ~ N_p(μ_h, Σ_h) (h = 1, 2, 3), with sample sizes n_h = 10, 30, 60. We tested the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3, which is false. The power performance of the test statistics under the multivariate normal distribution is presented in Table 3 and visualized in Figure 3.
As expected, the classical Box’s M test yielded the highest power values across all scenarios, often exceeding 95%. However, this high power comes at the cost of poor type-1 error control in high-dimensional or non-normal data, as shown in the previous results. The T_M^YU test exhibited moderately high power, typically ranging between 43% and 51%, and remained relatively stable across dimensions. The proposed T_M^MRCD test demonstrated the lowest power values among the three, generally around 30%, but still consistent across varying sample sizes and dimensions. These results reflect the classical trade-off between power and robustness: while T_M^MRCD is more conservative, it provides strong type-1 error control even in contaminated or non-normal data settings. Therefore, although its power is slightly lower, it offers a more reliable alternative when robustness is essential.
  • Scenario 4: In this scenario, we generated observations as defined in Scenario 2. Unlike in Scenario 2, the Σ_h (h = 1, 2, 3) matrices were different from each other here. After the datasets were generated in this way, we tested the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3, which is actually false. The p, n_h, and Σ_h values used here were as given in Scenario 3. The results are presented in Table 4 and visualized in Figure 4.
The results for Scenario 4, which involved mixed distributed data under the alternative hypothesis, are presented in Table 4 and visualized in Figure 4. Similar to the previous scenario, Box’s M test showed the highest power values, often exceeding 97%, but its known sensitivity to distributional deviations limits its practical reliability. The T_M^YU test maintained stable and moderately high power across varying dimensionalities and sample sizes, typically around 50%. The proposed T_M^MRCD test again demonstrated lower power, generally between 26% and 32%, but its behavior remained consistent across the simulation settings. These results align with the classical robustness–power trade-off, where T_M^MRCD prioritizes robustness and type-1 error control over maximizing power. While the power of the proposed test was lower in this mixed-distribution scenario, its performance did not substantially deteriorate, suggesting resilience under non-ideal conditions.

5.3. Comparisons of Robustness

In order to examine the robustness of the proposed approach to outliers, we first contaminated the dataset and then calculated the rejection rates for a null hypothesis that is true. The generated observations can be categorized as regular observations and outliers. Here, too, the data were contaminated under two different scenarios. In each case, the covariance matrices were generated as given in Section 5.1.
  • Scenario 5: We generated datasets from a multivariate normal distribution. To generate regular observations, we set the mean vectors as μ_1 = μ_2 = μ_3 = 0 without any loss of generality. Accordingly, regular observations in each group were randomly generated from X_h ~ N_p(μ_h, Σ_h) (h = 1, 2, 3). The proportion of regular observations in the data was (100 − φ)%, where φ denotes the contamination rate. In Scenario 5, we used φ = 10 and 25 for sensitivity analysis. Outliers were generated from multivariate normal distributions X_h.out ~ N_p(μ_h.out, Σ_h) (h = 1, 2, 3), with the following group-specific mean vectors:
\mu_{1.out} = \log(p)\,\mathbf{1}_{p\times 1}, \qquad \mu_{2.out} = p\,\mathbf{1}_{p\times 1}, \qquad \mu_{3.out} = -p\,\mathbf{1}_{p\times 1}
Although the mean vectors of the outliers differed among the groups, the covariance matrices remained identical, ensuring that the null hypothesis remained true even under contamination. Finally, we tested the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3, which holds in reality. The results for φ = 10 are presented in Table 5 and visualized in Figure 5. Similarly, the results for φ = 25 are presented in Table 6 and visualized in Figure 6.
The results of Scenario 5, given in Table 5 and Table 6 and visualized in Figure 5 and Figure 6, explore the sensitivity of the test statistics to outlier contamination by comparing the results obtained under 10% and 25% contamination levels. When benchmarked against the clean data scenario (Scenario 1; Table 1 and Figure 1), a clear pattern emerges: as the proportion of contamination increased, the performance of the classical Box’s M and T_M^YU tests deteriorated significantly, while the proposed T_M^MRCD test remained stable. Specifically, under 10% contamination, Box’s M exhibited severely inflated type-1 error rates (ARE = 522.667), and T_M^YU also showed poor robustness (ARE = 66.855). This effect became even more pronounced under 25% contamination, where Box’s M failed entirely (ARE = 672.444) and T_M^YU consistently returned zero rejection rates, indicating excessive conservatism or breakdown. In contrast, T_M^MRCD maintained type-1 error rates remarkably close to the nominal 5% level across all contamination levels, with decreasing ARE values as contamination increased (11.002 at 10% vs. 4.103 at 25%). This stable performance under increasing contamination levels confirms that the MRCD estimator is effective in identifying and limiting the impact of outliers. Overall, the results clearly demonstrate that the proposed T_M^MRCD test provided strong robustness and reliability when the data included outliers.
  • Scenario 6: All observation values in this scenario were initially generated as described in Scenario 2. To contaminate the data, we modified the last observation vector in each group: the last observation in group 1 was multiplied by −5, in group 2 by 5, and in group 3 by 15. This approach introduced group-specific outliers of varying magnitude, which resulted in different contamination rates depending on the group sample sizes. Importantly, the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3 remained valid in this setting. The impact of these structured outliers on the performance of the test statistics was evaluated, and the results are presented in Table 7 and Figure 7.
Table 7 and Figure 7 present the robustness performance of the test statistics under Scenario 6, in which structured outliers of varying magnitude were introduced to each group. As expected, the proposed T_M^MRCD test maintained type-1 error rates close to the nominal 5% level across all combinations of n and p. This stability is confirmed by the lowest ARE value (5.522) among the three methods. In contrast, T_M^YU displayed substantial variability and severely distorted rejection rates, particularly for small sample sizes and moderate dimensional settings. The classical Box’s M test was severely affected by the contamination, with type-1 error rates exceeding 98% in nearly all configurations, yielding an extremely large ARE value of 1885.582. These findings confirm that the T_M^MRCD test is highly robust to structured outliers, whereas both T_M^YU and Box’s M failed to provide reliable control over the type-1 error under contaminated conditions.

6. Real Data Example

In this section, as in the simulation study, we analyze a real dataset to compare the performance of the proposed test statistic with the T_M^YU statistic proposed by Yu [7]. For this purpose, we used the dataset available on the NCBI website under the code GSE57275. This dataset includes 14 observations and 45,281 genes (variables), and these 14 observations are divided into three different groups.
These groups are called the controlled, infected, and infected-medication groups. The infected group consists of chips GSM1378192, GSM1378193, GSM1378194, and GSM1378195, resulting in a total of four observations in the first group. The infected-medication group consists of chips GSM1378196, GSM1378197, GSM1378198, GSM1378199, and GSM1378200. Finally, the control group consists of chips GSM1378201, GSM1378202, GSM1378203, GSM1378204, and GSM1378205. Therefore, there were five observations in both the second and third groups.
We tested whether the covariance matrices of these high-dimensional groups were equal or not and compared the test statistics. For this purpose, we also examined how the test statistics were affected as the p/n ratio increased by considering different numbers of genes (variables). The p values were chosen as 5, 20, 100, 200, 300, 400, and 500, and at each stage the first p genes in the data were selected.
To examine the sensitivity of the test statistics to outliers, after performing the test process for clean data, we multiplied the last observation row in the first (infected) group by 10 and repeated this test process by creating an outlier in the dataset. Thus, we can see whether this outlier changed the decision made by the test statistics. The results are given in Table 8.
According to Table 8, both the T_M^YU and T_M^MRCD statistics failed to reject the null hypothesis H_0: Σ_1 = Σ_2 = Σ_3 when there were no outliers in the data (p-values > 0.05). The T_M^YU statistic, however, rejected the null hypothesis when we added an outlier to the data (p-value < 0.001). Accordingly, it is concluded that the outlier in the data changed the decision of the T_M^YU statistic, indicating that this statistic is sensitive to outliers. On the other hand, the T_M^MRCD statistic still failed to reject the null hypothesis when the data were contaminated with the outlier. Therefore, we can say that the outlier had no effect on the decision of the T_M^MRCD statistic and that this statistic is robust to outliers.

7. Software Availability

We constructed the function RobPer_CovTest() in the R package “MVTests” to perform the proposed robust permutational test on real datasets. This function takes four arguments: the data matrix is assigned to the argument x, and the grouping vector of the observations is assigned to the argument group. The permutation number is assigned to the argument N (default N = 100). Finally, the argument alpha, which takes a value between 0.5 and 1 (default alpha = 0.75), determines the trimming parameter. The function returns the p-value, the T_M^MRCD value, and the T_M^MRCD(r) (r = 1, 2, …, N) values. With this function, researchers can apply the proposed test to compare covariance matrices in high-dimensional data without being affected by outliers. The package can be installed from GitHub by using the following code block:
install.packages("devtools")
devtools::install_github("hsnbulut/MVTests")
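Once installed, a call of the test might look as follows, where X is a data matrix and group the corresponding group labels (for example, those produced in the simulation sketches above); the components of the returned object are inspected with str() because their exact names are not specified here.

```r
library(MVTests)

# Proposed robust permutational test with 100 permutations and a 75% subset
res <- RobPer_CovTest(x = X, group = group, N = 100, alpha = 0.75)
str(res)   # contains the p-value, the observed T_M^MRCD and the permuted values
```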

8. Conclusions

Classical methods cannot be used to test the equality of covariance matrices in high-dimensional settings because the classical covariance matrix becomes singular, making its determinant zero and its inverse undefined. Several alternative methods have been proposed in the literature to address this problem, as summarized in Section 2. However, although these tests can be applied to high-dimensional data, they are not robust to outliers. To overcome this limitation, this study proposes a new test statistic, T_M^MRCD, designed to compare covariance matrices in high-dimensional data while being resistant to outlier effects.
The performance of the proposed test was evaluated through an extensive simulation study focusing on its type-1 error control, statistical power, and robustness. In the simulation study, the proposed approach had a lower ARE value in terms of type-1 error than the T_M^YU statistic proposed by Yu [7]. Accordingly, it can be said that the proposed approach is more successful in terms of the type-1 error rate. Although T_M^YU exhibited higher statistical power, T_M^MRCD remained competitive. Importantly, the robustness comparisons reveal that T_M^MRCD maintained a low ARE under contamination, while the T_M^YU statistic became highly sensitive to outliers and yielded severely distorted rejection rates.
Although T_M^MRCD does not have a closed-form distribution under the null or alternative hypotheses due to its permutation-based structure, its asymptotic consistency can be justified empirically and theoretically. Under the null hypothesis, permutation-based tests are known to control the type-1 error asymptotically [17]. Under the alternative hypothesis, the proposed test statistic diverges from its permutation distribution as the group differences in covariance increase, leading the power to converge to 1 as the sample size increases. This behavior is supported by the simulation results reported in Table 1, Table 2, Table 3 and Table 4.
To further compare their practical performance, we applied both test statistics to a real gene expression dataset. In these analyses, both the T_M^YU and T_M^MRCD statistics failed to reject the null hypothesis on clean data. However, when the data were contaminated, the T_M^YU statistic rejected the null hypothesis, while T_M^MRCD maintained the same decision without being affected by the outlier. This real data example shows that the proposed test can be used on high-dimensional data without being affected by outliers. For a concise summary of the key differences among the proposed T_M^MRCD test, the T_M^YU statistic, and Box’s M test, we refer the reader to the comparative table provided in Table A2 in Appendix A.
Finally, to support real-world applications, we implemented the proposed test as an R function in the MVTests package. In conclusion, we believe that the proposed approach can contribute to the literature not only theoretically but also practically.

Funding

This research received no external funding.

Data Availability Statement

We used the dataset available at the NCBI website under the code GSE57275.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

Table A1. List of notations.

Notation     Description
p            Number of variables (dimension)
n            Total sample size (sum of all groups)
n_h          Sample size of group h
g            Number of groups
μ_h          Mean vector of group h
Σ_h          Population covariance matrix of group h
S_h          Sample covariance matrix of group h
S_pool       Pooled covariance matrix across all groups
X_h          Observation matrix from group h
X_h.out      Outlier observations from group h
α            Trimming proportion in MRCD estimation
φ            Contamination rate (proportion of outliers)
M_hk         Test component comparing groups h and k
T_M^YU       Test statistic used by Yu [7]
T_M^MRCD     Proposed test statistic using MRCD estimators
ARE          Average relative error (%) with respect to the nominal type-1 error level
Table A2. Comparison of test statistics.

Test        Robust to Outliers   High-Dimensional Applicability (p > n)   Estimation Approach
Box's M     No                   No                                       Classical covariance
T_M^YU      No                   Yes                                      Classical covariance + permutation
T_M^MRCD    Yes                  Yes                                      MRCD estimators + permutation

References

  1. Rencher, A.C. Methods of Multivariate Analysis; John Wiley & Sons, Inc.: Montreal, QC, Canada, 2002.
  2. Bulut, H. Multivariate Statistical Methods with R Applications, 2nd ed.; Nobel Academic Publishing: Ankara, Turkey, 2023.
  3. Box, G.E.P. A General Distribution Theory for a Class of Likelihood Criteria. Biometrika 1949, 36, 317–346.
  4. Schott, J.R. A Test for the Equality of Covariance Matrices When the Dimension Is Large Relative to the Sample Sizes. Comput. Stat. Data Anal. 2007, 51, 6535–6542.
  5. Srivastava, M.S.; Yanagihara, H. Testing the Equality of Several Covariance Matrices with Fewer Observations Than the Dimension. J. Multivar. Anal. 2010, 101, 1319–1329.
  6. Li, J.; Chen, S.X. Two Sample Tests for High-Dimensional Covariance Matrices. Ann. Stat. 2012, 40, 908–940.
  7. Yu, W. A New Method for Multi-Sample High-Dimensional Covariance Matrices Test Based on Permutation. Commun. Stat.-Theor. Methods 2022, 51, 4476–4486.
  8. Boudt, K.; Rousseeuw, P.J.; Vanduffel, S.; Verdonck, T. The Minimum Regularized Covariance Determinant Estimator. Stat. Comput. 2020, 30, 113–128.
  9. Ledoit, O.; Wolf, M. A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices. J. Multivar. Anal. 2004, 88, 365–411.
  10. Schäfer, J.; Strimmer, K. A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics. Stat. Appl. Genet. Mol. Biol. 2005, 4.
  11. Wang, X.B. Homogeneity Test of K Covariance Matrices for Large-Dimensional Data. Master's Thesis, Yunnan University, Kunming, China, 2018.
  12. Bulut, H. A Robust Hotelling Test Statistic for One Sample Case in High Dimensional Data. Commun. Stat.-Theor. Methods 2023, 52, 4590–4604.
  13. Rousseeuw, P.J.; Croux, C. Alternatives to the Median Absolute Deviation. J. Am. Stat. Assoc. 1993, 88, 1273–1283.
  14. Croux, C.; Haesbroeck, G. Influence Function and Efficiency of the Minimum Covariance Determinant Scatter Matrix Estimator. J. Multivar. Anal. 1999, 71, 161–190.
  15. Todorov, V.; Filzmoser, P. An Object-Oriented Framework for Robust Multivariate Analysis. J. Stat. Softw. 2010, 32, 1–47.
  16. Phipson, B.; Smyth, G.K. Permutation P-Values Should Never Be Zero: Calculating Exact P-Values When Permutations Are Randomly Drawn. Stat. Appl. Genet. Mol. Biol. 2010, 9.
  17. Lehmann, E.L.; Romano, J.P. Testing Statistical Hypotheses; Springer: Berlin/Heidelberg, Germany, 1986; Volume 3.
Figure 1. Type-1 error rates of methods based on normally distributed data.
Figure 2. Type-1 error rates of methods based on mixed distributed data.
Figure 3. Powers of methods based on normally distributed data.
Figure 4. Powers of methods based on mixed distributed data.
Figure 5. Robustness of methods based on normally distributed data with 10% contamination.
Figure 6. Robustness of methods based on normally distributed data with 25% contamination.
Figure 7. Robustness of methods based on mixed distributed data.
Table 1. Type-1 error rates of methods based on normally distributed data.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     4.751      4.738     5.818
30    5     4.923      5.861     6.727
60    5     5.987      7.603     5.364
10    10    5.275      5.709     -
30    10    5.276      5.549     6.455
60    10    5.549      3.874     7.455
10    50    5.533      5.319     -
30    50    5.359      5.955     -
60    50    5.333      7.294     5.727
10    100   5.255      5.290     -
30    100   5.299      4.236     -
60    100   4.554      5.398     -
10    300   5.061      6.088     -
30    300   5.017      8.081     -
60    300   5.661      5.588     -
ARE:        7.171      21.182    25.152
Table 2. Type-1 error rates of methods based on mixed distributed data.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     5.153      5.151     27.455
30    5     5.618      5.554     27.727
60    5     5.100      6.182     29.364
10    10    5.814      5.171     -
30    10    5.044      5.445     26.091
60    10    5.509      5.057     26.636
10    50    4.928      5.538     -
30    50    5.200      4.800     -
60    50    5.362      5.759     27.818
10    100   5.400      4.940     -
30    100   5.296      5.783     -
60    100   5.158      6.080     -
10    300   5.419      5.506     -
30    300   5.514      4.851     -
60    300   5.031      5.684     -
ARE:        6.254      9.758     180.121
Table 3. Powers of methods based on normally distributed data.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     30.909     48.182    98.909
30    5     31.818     42.727    99.000
60    5     29.091     50.909    98.091
10    10    30.909     47.273    -
30    10    31.818     43.636    98.182
60    10    34.545     49.091    98.273
10    50    30.000     45.455    -
30    50    30.000     45.455    -
60    50    32.727     50.000    97.727
10    100   30.000     45.455    -
30    100   29.091     48.182    -
60    100   30.909     46.364    -
10    300   32.727     50.909    -
30    300   29.091     49.091    -
60    300   30.000     46.364    -
Table 4. Powers of methods based on mixed distributed data.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     28.182     51.818    98.182
30    5     27.273     50.000    98.364
60    5     29.091     50.000    97.636
10    10    30.000     51.818    -
30    10    32.727     46.364    97.909
60    10    30.000     50.909    98.909
10    50    28.182     48.182    -
30    50    30.000     50.000    -
60    50    28.182     52.727    98.091
10    100   28.182     46.364    -
30    100   26.364     46.364    -
60    100   31.818     50.000    -
10    300   30.909     51.818    -
30    300   28.182     47.273    -
60    300   28.182     50.000    -
Table 5. Robustness of methods based on normally distributed data with 10% contamination.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     5.158      2.526     33.233
30    5     5.141      1.307     30.133
60    5     6.054      0.591     30.700
10    10    5.194      0.729     -
30    10    5.337      0.113     30.900
60    10    5.736      3.060     30.667
10    50    5.634      1.692     -
30    50    5.823      3.409     -
60    50    5.679      2.399     31.167
10    100   5.561      1.035     -
30    100   5.657      1.857     -
60    100   5.481      2.553     -
10    300   5.110      1.358     -
30    300   5.780      1.310     -
60    300   5.907      0.920     -
ARE:        11.002     66.855    522.667
Table 6. Robustness of methods based on normally distributed data with 25% contamination.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     5.385      0.000     31.133
30    5     4.615      0.000     31.633
60    5     4.615      0.000     31.733
10    10    5.000      0.000     -
30    10    5.000      0.000     30.967
60    10    5.000      0.000     30.400
10    50    5.000      0.000     -
30    50    4.615      0.000     -
60    50    4.615      0.000     30.867
10    100   5.000      0.000     -
30    100   4.615      0.000     -
60    100   4.615      0.000     -
10    300   5.000      0.000     -
30    300   5.000      0.000     -
60    300   4.615      0.000     -
ARE:        4.103      100.000   672.444
Table 7. Robustness of methods based on mixed distributed data.

n_h   p     T_M^MRCD   T_M^YU    M
10    5     5.292      0.013     99.193
30    5     5.315      3.468     98.678
60    5     5.230      0.464     99.137
10    10    4.652      1.518     -
30    10    5.220      0.623     99.315
60    10    5.313      2.818     99.748
10    50    5.212      0.400     -
30    50    5.226      6.973     -
60    50    5.256      0.913     99.603
10    100   4.784      0.449     -
30    100   5.208      0.713     -
60    100   5.267      0.439     -
10    300   5.305      0.673     -
30    300   4.665      0.807     -
60    300   5.399      0.486     -
ARE:        5.522      77.585    1885.582
Table 8. Results of tests using real data examples.

Dimensions   Data    T_M^MRCD Statistic   T_M^MRCD p-Value   T_M^YU Statistic   T_M^YU p-Value
5            Clean   2.876                0.340              3.169              0.747
5            Cont.   3.570                0.393              13.934             0.000
20           Clean   3.884                0.060              17.544             0.210
20           Cont.   3.811                0.100              55.676             0.000
100          Clean   3.747                0.620              66.871             0.510
100          Cont.   3.863                0.230              278.434            0.000
200          Clean   3.879                0.260              135.704            0.470
200          Cont.   3.966                0.051              556.622            0.000
300          Clean   3.872                0.260              208.216            0.380
300          Cont.   3.967                0.060              834.640            0.000
400          Clean   3.871                0.500              277.074            0.410
400          Cont.   3.971                0.070              1111.947           0.000
500          Clean   3.875                0.530              340.932            0.290
500          Cont.   3.979                0.070              1390.207           0.000