A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor
Abstract
1. Introduction
2. Test Based on Posterior Bayes Factor
- When no knowledge about the prior is available, a non-informative prior is suggested. A usual choice is the Jeffreys prior. As a result, for the parameters μ1 and μ2, and for the common parameter μ under the null hypothesis, we choose the Jeffreys prior, i.e., Lebesgue measure.
- For the covariance matrix Σ, the posterior distribution under the Jeffreys prior does not exist when the dimension p exceeds the total sample size n = n1 + n2. Therefore, we take the inverse Wishart distribution, which is the conjugate prior for a normal covariance matrix.
- This paper aims to investigate whether the test based on the posterior Bayes factor statistic performs better than the existing methods in high dimensional settings. If the results turn out as expected, the posterior Bayes factor can be recommended as a test statistic for high dimensional data sets. Hence, we take simple priors. Furthermore, in the priors for the covariance matrices we take the scale matrix V = kI with small k, so that the variation of the prior is large.
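As a concrete sketch, the standardized statistic computed in the R code of Appendix B can be written in NumPy as below. The constants m = 2p (inverse Wishart degrees of freedom) and k = 1/(log10(n) · λmax(A) · p) follow Appendix B; the function name `t_bf` is ours, not the paper's notation.

```python
import numpy as np

def t_bf(X, Y, m=None):
    """Standardized posterior-Bayes-factor statistic (mirrors Appendix B)."""
    n1, p = X.shape
    n2 = Y.shape[0]
    n = n1 + n2
    if m is None:
        m = 2 * p  # degrees of freedom of the inverse Wishart prior
    xbar, ybar = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - xbar, Y - ybar
    A = Xc.T @ Xc + Yc.T @ Yc               # pooled within-sample SSCP matrix
    lam_max = np.linalg.eigvalsh(A)[-1]     # largest eigenvalue of A
    k = 1.0 / (np.log10(n) * lam_max * p)   # small prior-scale constant
    V_inv = (1.0 / k) * np.eye(p)           # inverse of the prior scale V = k * I_p
    B = ((m + 2 * n) * np.linalg.inv(V_inv + 2 * A)
         - (m + n) / 2 * np.linalg.inv(V_inv + A))
    d = xbar - ybar
    T = n1 * n2 / n * (d @ B @ d)
    S_n = A / (n - 2)
    BS = B @ S_n
    mu_T = np.trace(BS)                     # estimated null mean of T
    sigma2_T = np.trace(BS @ BS) - np.trace(BS) ** 2 / (n - 2)
    return (T - mu_T) / np.sqrt(2 * sigma2_T)
```

The null hypothesis is rejected at level 0.05 when the returned value exceeds the standard normal 0.95 quantile, exactly as in the simulation loop of Appendix B.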
3. Simulation
- Σ1 is the identity matrix.
- Σ2 is a covariance matrix with (Σ2)ij = 0.4^|i−j|.
- Σ3 is a block diagonal matrix with 25 × 25 blocks in which the diagonal entries are 1 and the off-diagonal entries are 0.15.
- Alternative 1: simulate μ20 from N(1, I), set δp randomly selected elements of μ20 to 0, and scale to obtain μ2 with μ2' Σ^(−1) μ2 = 2.
- Alternative 2: simulate μ20 from N(1, I), set δp randomly selected elements of μ20 to 0, and scale to obtain μ2 with μ2' μ2 / sqrt(tr(Σ²)) = 0.1.
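The simulation design above can be sketched in NumPy as follows, mirroring the commented alternatives in the Appendix B R code. The dimension is reduced to p = 200 here purely for speed; the constants (0.4, 0.15, δ = 0.975, the targets 2 and 0.1) are the ones used in Appendix B.

```python
import numpy as np

p = 200  # reduced from p = 1000 in Appendix B for a quick illustration

# Sigma2: (Sigma2)_{ij} = 0.4 ** |i - j|
idx = np.arange(p)
Sigma2 = 0.4 ** np.abs(idx[:, None] - idx[None, :])

# Sigma3: block diagonal, 25 x 25 blocks (1 on the diagonal, 0.15 off it)
block = 0.85 * np.eye(25) + 0.15 * np.ones((25, 25))
Sigma3 = np.kron(np.eye(p // 25), block)

# Sparse alternative mean: draw mu20 ~ N(1, I), zero out delta * p entries
rng = np.random.default_rng(1)
delta = 0.975
mu20 = rng.normal(1.0, 1.0, size=p)
zero_idx = rng.choice(p, size=round(delta * p), replace=False)
mu20[zero_idx] = 0.0

Sigma = np.eye(p)  # Sigma1
# Alternative 1: scale so that mu2' Sigma^{-1} mu2 = 2
scal1 = np.sqrt(mu20 @ np.linalg.solve(Sigma, mu20) / 2)
mu2_alt1 = mu20 / scal1
# Alternative 2: scale so that mu2' mu2 / sqrt(tr(Sigma @ Sigma)) = 0.1
scal2 = np.sqrt(mu20 @ mu20 / np.sqrt(np.trace(Sigma @ Sigma)) / 0.1)
mu2_alt2 = mu20 / scal2
```

With δ = 0.975 only 2.5% of the components of μ2 are nonzero, so the alternatives are sparse; smaller δ values in the tables below correspond to denser alternatives.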
4. An Application Example
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Proof
Appendix B. R code
rm(list = ls(all = TRUE))
library(MASS)    # mvrnorm
library(Matrix)  # bdiag
# install.packages("lava")
library(lava)    # tr

## Sample sizes, dimension, and prior degrees of freedom
n1 <- 70
n2 <- 70
n <- n1 + n2
p <- 1000
M <- 1000   # number of Monte Carlo replications
m <- 2 * p  # degrees of freedom of the inverse Wishart prior

mu1 <- rep(0, p)

## Sigma 1: identity
Sigma1 <- diag(1, p)

## Sigma 2: (Sigma2)_{ij} = ro^{|i - j|}
# ro <- 0.4
# Sigma2 <- matrix(1, p, p)
# for (i in 1:p) {
#   for (j in 1:p) {
#     Sigma2[i, j] <- ro^abs(j - i)
#   }
# }

## Sigma 3: block diagonal, 25 x 25 blocks with 1 on the diagonal and 0.15 off it
# Sigma3_1 <- diag(0.85, 25) + matrix(0.15, 25, 25)
# list2 <- vector("list", p / 25)
# for (i in 1:(p / 25)) list2[[i]] <- Sigma3_1
# Sigma3 <- as.matrix(bdiag(list2))

Sigma <- Sigma1
delta <- 0.975  # proportion of zero components in mu2
t1 <- proc.time()

## Sparse alternative mean
p0 <- delta * p
mu20 <- mvrnorm(1, rep(1, p), diag(rep(1, p)))
zero_idx <- sort(sample(1:p, p0))
mu20[zero_idx] <- 0
## Alternative 1: scale so that t(mu2) %*% solve(Sigma) %*% mu2 = 2
scal <- sqrt((t(mu20) %*% solve(Sigma) %*% mu20) / 2)
## Alternative 2: scale so that t(mu2) %*% mu2 / sqrt(tr(t(Sigma) %*% Sigma)) = 0.1
# scal <- sqrt(t(mu20) %*% mu20 / sqrt(tr(t(Sigma) %*% Sigma)) / 0.1)
mu2 <- mu20 / rep(scal, p)

count <- 0
T_BF <- rep(0, M)
for (q in 1:M) {
  xi <- mvrnorm(n1, mu1, Sigma)
  yi <- mvrnorm(n2, mu2, Sigma)
  x_mean <- colMeans(xi)
  y_mean <- colMeans(yi)
  ## Pooled within-sample sum of squares and cross-products
  xc <- sweep(xi, 2, x_mean)
  yc <- sweep(yi, 2, y_mean)
  A <- t(xc) %*% xc + t(yc) %*% yc
  ## Prior scale matrix V = k * I_p with small k
  k <- 1 / log10(n) / eigen(A)$values[1] / p
  V <- k * diag(rep(1, p))
  B <- (m + 2 * n) * solve(solve(V) + 2 * A) - (m + n) / 2 * solve(solve(V) + A)
  T <- n1 * n2 / n * (t(x_mean - y_mean) %*% B %*% (x_mean - y_mean))
  S_n <- A / (n - 2)
  mu_T <- tr(B %*% S_n)
  sigma_T <- tr((B %*% S_n) %*% (B %*% S_n)) - 1 / (n - 2) * (tr(B %*% S_n))^2
  T_BF[q] <- (T - mu_T) / sqrt(2 * sigma_T)
  if (T_BF[q] >= qnorm(0.95)) count <- count + 1
}

t2 <- proc.time()
t <- t2 - t1
cat("power =", count / M, "time", t[3][[1]], "s", "\n")
References
- Bai, Z.; Saranadasa, H. Effect of high dimension: By an example of a two sample problem. Stat. Sin. 1996, 6, 311–329.
- Hotelling, H. The Generalization of Student's Ratio. Ann. Math. Stat. 1931, 2, 360–378.
- Anderson, T.W. An Introduction to Multivariate Statistical Analysis; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1958.
- Srivastava, M.S.; Du, M. A test for the mean vector with fewer observations than the dimension. J. Multivar. Anal. 2008, 99, 386–402.
- Srivastava, M.S. A test for the mean vector with fewer observations than the dimension under non-normality. J. Multivar. Anal. 2009, 100, 518–532.
- Srivastava, M.S.; Katayama, S.; Kano, Y. A two sample test in high dimensional data. J. Multivar. Anal. 2013, 114, 349–358.
- Chen, S.X.; Qin, Y.L. A two-sample test for high-dimensional data with applications to gene-set testing. Ann. Stat. 2010, 38, 808–835.
- Feng, L.; Zou, C.; Wang, Z.; Zhu, L. Two-sample Behrens-Fisher problem for high-dimensional data. Stat. Sin. 2015, 25, 1297–1312.
- Cai, T.T.; Liu, W.; Xia, Y. Two-sample test of high dimensional means under dependence. J. R. Stat. Soc. Ser. B Stat. Methodol. 2014, 76, 349–372.
- Lopes, M.; Jacob, L.; Wainwright, M.J. A more powerful two-sample test in high dimensions using random projection. Adv. Neural Inf. Process. Syst. 2011, 24, 1206–1214.
- Srivastava, R.; Li, P.; Ruppert, D. RAPTT: An exact two-sample test in high dimensions using random projections. J. Comput. Graph. Stat. 2016, 25, 954–970.
- Zoh, R.S.; Sarkar, A.; Carroll, R.J.; Mallick, B.K. A powerful Bayesian test for equality of means in high dimensions. J. Am. Stat. Assoc. 2018, 113, 1733–1741.
- Wang, R.; Xu, X. On two-sample mean tests under spiked covariances. J. Multivar. Anal. 2018, 167, 225–249.
- Kuelbs, J.; Vidyashankar, A.N. Asymptotic inference for high-dimensional data. Ann. Stat. 2010, 38, 836–869.
- Thulin, M. A high-dimensional two-sample test for the mean using random subspaces. Comput. Stat. Data Anal. 2014, 74, 26–38.
- Gregory, K.B.; Carroll, R.J.; Baladandayuthapani, V.; Lahiri, S.N. A two-sample test for equality of means in high dimension. J. Am. Stat. Assoc. 2015, 110, 837–849.
- Yu, W.; Xu, W.; Zhu, L. A combined p-value test for the mean difference of high-dimensional data. Sci. China Math. 2018, 62, 961.
- Chen, S.X.; Li, J.; Zhong, P.S. Two-sample and ANOVA tests for high dimensional means. Ann. Stat. 2019, 47, 1443–1474.
- Zhang, J.T.; Guo, J.; Zhou, B.; Cheng, M.Y. A simple two-sample test in high dimensions based on L2-norm. J. Am. Stat. Assoc. 2020, 115, 1011–1027.
- Zhu, Y.; Bradic, J. Significance testing in non-sparse high-dimensional linear models. Electron. J. Stat. 2018, 12, 3312–3364.
- Aitkin, M. Posterior Bayes factors. J. R. Stat. Soc. Ser. B Methodol. 1991, 53, 111–128.
- Wang, R.; Xu, X. Least favorable direction test for multivariate analysis of variance in high dimension. Stat. Sin. 2021, 31, 723–747.
| T_BF | RMPBT | RMPBT | SD | CQ |
|---|---|---|---|---|
| 0.049 | 0.031 | 0.030 | 0.040 | 0.063 |
| 0.052 | 0.038 | 0.035 | 0.037 | 0.049 |
| 0.060 | 0.060 | 0.040 | 0.045 | 0.063 |
| Alternative | δ | T_BF | RMPBT | RMPBT | SD | CQ |
|---|---|---|---|---|---|---|
| Alternative 1 | 0.975 | 0.470 | 0.332 | 0.309 | 0.384 | 0.450 |
| | 0.950 | 0.478 | 0.388 | 0.339 | 0.423 | 0.474 |
| | 0.800 | 0.482 | 0.337 | 0.304 | 0.389 | 0.448 |
| | 0.750 | 0.482 | 0.348 | 0.294 | 0.401 | 0.470 |
| | 0.500 | 0.485 | 0.372 | 0.343 | 0.422 | 0.473 |
| Alternative 2 | 0.975 | 0.764 | 0.685 | 0.612 | 0.722 | 0.761 |
| | 0.950 | 0.797 | 0.694 | 0.612 | 0.741 | 0.775 |
| | 0.800 | 0.785 | 0.660 | 0.581 | 0.717 | 0.762 |
| | 0.750 | 0.806 | 0.695 | 0.616 | 0.756 | 0.789 |
| | 0.500 | 0.786 | 0.677 | 0.588 | 0.727 | 0.767 |
| Alternative | δ | T_BF | RMPBT | RMPBT | SD | CQ |
|---|---|---|---|---|---|---|
| Alternative 1 | 0.975 | 0.269 | 0.259 | 0.243 | 0.219 | 0.266 |
| | 0.950 | 0.277 | 0.249 | 0.232 | 0.209 | 0.258 |
| | 0.800 | 0.282 | 0.261 | 0.222 | 0.221 | 0.270 |
| | 0.750 | 0.299 | 0.264 | 0.236 | 0.242 | 0.284 |
| | 0.500 | 0.336 | 0.303 | 0.265 | 0.268 | 0.326 |
| Alternative 2 | 0.975 | 0.783 | 0.791 | 0.738 | 0.722 | 0.768 |
| | 0.950 | 0.780 | 0.786 | 0.734 | 0.718 | 0.766 |
| | 0.800 | 0.794 | 0.755 | 0.699 | 0.700 | 0.756 |
| | 0.750 | 0.792 | 0.772 | 0.722 | 0.730 | 0.785 |
| | 0.500 | 0.789 | 0.753 | 0.686 | 0.720 | 0.766 |
| Alternative | δ | T_BF | RMPBT | RMPBT | SD | CQ |
|---|---|---|---|---|---|---|
| Alternative 1 | 0.975 | 0.296 | 0.315 | 0.278 | 0.245 | 0.294 |
| | 0.950 | 0.303 | 0.335 | 0.307 | 0.270 | 0.311 |
| | 0.800 | 0.332 | 0.348 | 0.318 | 0.285 | 0.343 |
| | 0.750 | 0.357 | 0.327 | 0.294 | 0.278 | 0.331 |
| | 0.500 | 0.422 | 0.414 | 0.379 | 0.353 | 0.401 |
| Alternative 2 | 0.975 | 0.785 | 0.836 | 0.776 | 0.716 | 0.755 |
| | 0.950 | 0.801 | 0.827 | 0.776 | 0.730 | 0.782 |
| | 0.800 | 0.795 | 0.796 | 0.734 | 0.728 | 0.775 |
| | 0.750 | 0.793 | 0.790 | 0.727 | 0.718 | 0.764 |
| | 0.500 | 0.778 | 0.774 | 0.717 | 0.720 | 0.761 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jiang, Y.; Xu, X. A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor. Mathematics 2022, 10, 1741. https://doi.org/10.3390/math10101741