Article

A High Dimensional Omnibus Regression Test

1 Department of Basic Sciences, Saudi Electronic University, Madinah 42351, Saudi Arabia
2 Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC 27109, USA
3 School of Mathematical & Statistical Sciences, Southern Illinois University, Carbondale, IL 62901, USA
* Author to whom correspondence should be addressed.
Stats 2025, 8(4), 107; https://doi.org/10.3390/stats8040107
Submission received: 13 September 2025 / Revised: 27 October 2025 / Accepted: 31 October 2025 / Published: 5 November 2025
(This article belongs to the Section Regression Models)

Abstract

Consider regression models where the response variable Y depends on the p × 1 vector of predictors x = (x_1, …, x_p)^T only through the sufficient predictor SP = α + x^T β. Let the covariance vector Cov(x, Y) = Σ_{xY}. Assume the cases (x_i^T, Y_i)^T are independent and identically distributed random vectors for i = 1, …, n. Then for many such regression models, β = 0 if and only if Σ_{xY} = 0, where 0 is the p × 1 vector of zeroes. The test of H_0: Σ_{xY} = 0 versus H_1: Σ_{xY} ≠ 0 is equivalent to the high dimensional one sample test H_0: μ = 0 versus H_A: μ ≠ 0 applied to w_1, …, w_n, where w_i = (x_i − μ_x)(Y_i − μ_Y) with expected values E(x) = μ_x and E(Y) = μ_Y. Since μ_x and μ_Y are unknown, the test of H_0: β = 0 versus H_1: β ≠ 0 is implemented by applying the one sample test to v_i = (x_i − x̄)(Y_i − Ȳ) for i = 1, …, n. This test has milder regularity conditions than its few competitors. For the multiple linear regression one component partial least squares and marginal maximum likelihood estimators, the test can be adapted to test H_0: (β_{i_1}, …, β_{i_k})^T = 0 versus H_1: (β_{i_1}, …, β_{i_k})^T ≠ 0 where 1 ≤ k ≤ p.

1. Introduction

This section reviews regression models where the response variable Y depends on the p × 1 vector of predictors x = (x_1, …, x_p)^T only through the sufficient predictor SP = α + x^T β. Then there are n cases (Y_i, x_i^T)^T. For the regression models, the conditioning and subscripts, such as i, will often be suppressed. This paper gives a high dimensional test for H_0: β = 0 versus H_1: β ≠ 0, where 0 = (0, …, 0)^T is the p × 1 vector of zeroes.
A useful multiple linear regression (MLR) model is
Y_i = α + x_{i,1} β_1 + ⋯ + x_{i,p} β_p + e_i = α + x_i^T β + e_i    (1)
for i = 1, …, n. Assume that the e_i are independent and identically distributed (iid) with expected value E(e_i) = 0 and variance V(e_i) = σ². In matrix form, this model is
Y = X ϕ + e,

where Y is an n × 1 vector of dependent variables, X is an n × (p + 1) matrix with ith row (1, x_i^T), ϕ = (α, β^T)^T is a (p + 1) × 1 vector, and e is an n × 1 vector of unknown errors. Also E(e) = 0 and Cov(e) = σ² I_n, where I_n is the n × n identity matrix.
For a multiple linear regression model with heterogeneity, assume model (1) holds with E(e) = 0 and Cov(e) = Σ_e = diag(σ_i²) = diag(σ_1², …, σ_n²), an n × n positive definite matrix. Under regularity conditions, the ordinary least squares (OLS) estimator ϕ̂_OLS = (X^T X)^{-1} X^T Y can be shown to be a consistent estimator of ϕ.
For estimation with ordinary least squares, let the covariance matrix of x be Cov(x) = Σ_x = E[(x − E(x))(x − E(x))^T] and let the p × 1 vector η = Cov(x, Y) = Σ_{xY} = E[(x − E(x))(Y − E(Y))] = (Cov(x_1, Y), …, Cov(x_p, Y))^T. Let

η̂ = Σ̂_{xY} = (1/(n − 1)) ∑_{i=1}^n (x_i − x̄)(Y_i − Ȳ)

and

η̃ = Σ̃_{xY} = ((n − 1)/n) Σ̂_{xY}.
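In R, η̂ is just the vector of sample covariances between each predictor and Y. A minimal sketch with simulated data (the names etahat and etatilde are ours):

set.seed(1)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] + rnorm(n)
etahat <- cov(x, y)                # p x 1 matrix: Sigma^hat_xY, divisor n - 1
etatilde <- (n - 1) * etahat / n   # Sigma~_xY, divisor n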
For a multiple linear regression model with iid cases, β̂_OLS is a consistent estimator of β_OLS = Σ_x^{-1} Σ_{xY} under mild regularity conditions, while α̂_OLS is a consistent estimator of E(Y) − β_OLS^T E(x).
Ref. [1] showed that the one component partial least squares (OPLS) estimator β̂_OPLS = λ̂ Σ̂_{xY} estimates λ Σ_{xY} = β_OPLS where

λ = (Σ_{xY}^T Σ_{xY}) / (Σ_{xY}^T Σ_x Σ_{xY})  and  λ̂ = (Σ̂_{xY}^T Σ̂_{xY}) / (Σ̂_{xY}^T Σ̂_x Σ̂_{xY})

for Σ_{xY} ≠ 0. If Σ_{xY} = 0, then β_OPLS = 0. Also see [2,3,4]. Ref. [5] derived the large sample theory for η̂_OPLS = Σ̂_{xY} and OPLS under milder regularity conditions than those in the previous literature, where η_OPLS = Σ_{xY}. Ref. [6] showed that for iid cases (x_i^T, Y_i)^T, these results still hold for multiple linear regression models with heterogeneity.
The marginal maximum likelihood estimator (MMLE or marginal least squares estimator) is due to [7,8]. This estimator computes the marginal regression of Y on x_i, such as a Poisson regression, resulting in the estimators (α̂_{i,M}, β̂_{i,M}) for i = 1, …, p. Then β̂_MMLE = (β̂_{1,M}, …, β̂_{p,M})^T.
For multiple linear regression, the marginal estimators are the simple linear regression estimators. Hence

β̂_MMLE = [diag(Σ̂_x)]^{-1} Σ̂_{xY}.

If the t_i are the predictors that are scaled or standardized to have unit sample variances, then

β̂_MMLE = β̂_MMLE(t, Y) = Σ̂_{tY} = η̂_OPLS(t, Y)
where ( t , Y ) denotes that Y was regressed on t. Ref. [6] derived large sample theory for the MMLE for multiple linear regression models, including models with heterogeneity.
For Poisson regression and related models, the response variable Y is a nonnegative count variable. A useful Poisson regression (PR) model is Y ∼ Poisson(e^{SP}). This model has E(Y | SP) = V(Y | SP) = exp(SP). The quasi-Poisson regression model has E(Y | SP) = exp(SP) and V(Y | SP) = ϕ exp(SP), where the dispersion parameter ϕ > 0. Note that this model and the Poisson regression model have the same conditional mean function, and the conditional variance functions are the same if ϕ = 1.
Some notation is needed for the negative binomial regression model. If Y has a (generalized) negative binomial distribution, Y ∼ NB(μ, κ), then the probability mass function (pmf) of Y is

P(Y = y) = [Γ(y + κ) / (Γ(κ) Γ(y + 1))] (κ/(μ + κ))^κ (1 − κ/(μ + κ))^y

for y = 0, 1, 2, …, where μ > 0 and κ > 0. Then E(Y) = μ and V(Y) = μ + μ²/κ.
The negative binomial regression model states that Y_1, …, Y_n are independent random variables with

Y | SP ∼ NB(exp(SP), κ).

This model has E(Y | SP) = exp(SP) and

V(Y | SP) = exp(SP) [1 + exp(SP)/κ] = exp(SP) + τ exp(2 SP).

Following Ref. [9] (p. 560), as τ = 1/κ → 0, it can be shown that the negative binomial regression model converges to the Poisson regression model.
Let the log transformation be Z_i = log(Y_i) if Y_i > 0 and Z_i = log(0.5) if Y_i = 0. This transformation often results in a linear model with heterogeneity:

Z_i = α_Z + x_i^T β_Z + e_i

where the e_i are independent with expected value E(e_i) = 0 and variance V(e_i) = σ_i². For Poisson regression, the minimum chi-square estimator is the weighted least squares estimator from the regression of Z_i on x_i with weights w_i = e^{Z_i}. See [9] (pp. 611–612).
If the regression model for Y depends on x only through α + x^T β, and if the predictors x_i are iid from a large class of elliptically contoured distributions, then [10,11] showed that, under regularity conditions, β_OLS = c β. Hence Σ_{xY} = c Σ_x β. Thus Σ_{xY} = d β if Σ_x = τ² I_p, where τ² > 0 and I_p is the p × p identity matrix. If β = β_OLS in this case, then β_i = 0 implies that Cov(x_i, Y) = 0. The constant c is typically nonzero unless there is a lot of symmetry in the distribution of α + x^T β. Simulation with Σ̂_{xY} can be difficult if the population values of c and d are unknown. Results from [12] (p. 89) suggest that for the Poisson regression model, a rough approximation is β̂_PR ≈ β̂_OLS/Ȳ. Results from [13] suggest that for binary logistic regression, a rough approximation is β̂_LR ≈ β̂_OLS/MSE, where MSE is the mean square error from the OLS regression.
Ref. [14] has an interesting result for the multiple linear regression model (1). Assume that the cases (x_i^T, Y_i)^T are iid with E(Y) = μ_Y, E(x) = μ_x, and nonsingular Cov(x) = Σ_x. Let β = β_OLS. Then testing H_0: β = β_0 versus H_1: β ≠ β_0 is equivalent to testing H_0: μ = 0 versus H_1: μ ≠ 0 with μ = E(w_i) = Σ_x(β − β_0), where w_i = (x_i − μ_x)(Y_i − μ_Y − (x_i − μ_x)^T β_0), and a one sample test can be applied to v_i = (x_i − x̄)(Y_i − Ȳ − (x_i − x̄)^T β_0).
Ref. [14] notes that there are only a few high dimensional analogs of the low dimensional multiple linear regression F-test for H_0 versus H_1. See [15,16,17,18]. The assumptions on the predictors in these four papers are very strong.
This paper uses the above test with β_0 = 0, which is equivalent to a test for Σ_{xY} = 0. The resulting test is not limited to OLS for multiple linear regression with iid errors. As shown below and in the following paragraph, the test can be used for multiple linear regression when heterogeneity is present, and the test can also be used for many regression models that depend on the predictors only through x_i^T β. Suppose β_D = D^{-1} Σ_{xY}, where D is a p × p positive definite matrix. Then β_D = 0 if and only if Σ_{xY} = 0. Here D^{-1} = λ I for OPLS, D^{-1} = Σ_x^{-1} for OLS, and D^{-1} = [diag(Σ_x)]^{-1} for the MMLE. The k-component partial least squares estimator can be found by regressing Y on a constant and on W_i = η̂_i^T x for i = 1, …, k, where η̂_i = Σ̂_x^{i−1} Σ̂_{xY} for i = 1, …, k. See [19]. Hence β_{kPLS} = 0 if Σ_{xY} = 0. Thus if the cases (x_i^T, Y_i)^T are iid, then using β_0 = 0 gives tests for H_0: β = 0, H_0: β_MMLE = 0, H_0: Σ_{xY} = 0, H_0: β_OPLS = 0, and H_0: β_{kPLS} = 0. For multiple linear regression with heterogeneity, β̂_OLS is still a consistent estimator of β = β_OLS = Σ_x^{-1} Σ_{xY}. Hence the test can be used when the constant variance assumption is violated.
Under iid cases with β = 0, if the response variables Y_i depend on the x_i only through x_i^T β, then Y_i ∼ Y_i | α ∼ Y_i | (α + x_i^T β). Hence the Y_i are iid and do not depend on x, and thus satisfy a multiple linear regression model with β_OLS = β = 0. For a parametric regression, such as a generalized linear model, assume Y_i ∼ D(τ(α + x_i^T β), θ), where D is the parametric distribution and τ is a real valued function. For example, D could be the negative binomial distribution with τ(SP) = e^{SP} and θ = κ. If β = 0, then the iid Y_i ∼ D(τ(α), θ). Typically, if β ≠ 0, then Σ_{xY} ≠ 0, and the test can have good power. An exception is when there is a lot of symmetry, which rarely occurs with real data. For example, suppose Y = m(SP) + e, where the iid errors e_i ∼ N(0, σ_1²) are independent of the predictors, SP ∼ N(0, σ_2²), and the function m is symmetric about 0, e.g., m(SP) = (SP)². Then β_OLS = 0 and Σ_{xY} = 0 even if β ≠ 0.
If β_0 = 0, then w_i = (x_i − μ_x)(Y_i − μ_Y) and E(w_i) = Σ_{xY}. Then apply a high dimensional one sample test to the v_i = (x_i − x̄)(Y_i − Ȳ). Note that the sample mean v̄ = Σ̃_{xY}.
Section 2.1 reviews and derives some results for the one sample test that will be used. Section 2.2 reviews some two sample tests. Section 2.3 gives theory for the test given in the above paragraph.

2. Materials and Methods

2.1. A High Dimensional One Sample Test

This section reviews and derives some results for the one sample test that will be used. Suppose x_1, …, x_n are iid random vectors with E(x) = μ and covariance matrix Cov(x) = Σ. Then the test of H_0: μ = 0 versus H_1: μ ≠ 0 is equivalent to the test of H_0: μ^T μ = 0 versus H_1: μ^T μ ≠ 0. Let S = Σ̂. A U-statistic for estimating μ^T μ is

T_n = T_n(x) = (1/(n(n − 1))) ∑_{i ≠ j} x_i^T x_j = [n x̄^T x̄ − tr(S)]/n    (7)

where tr() is the trace function. See, for example, [20].
To see that the last equality holds, note that

T_n = (1/(n(n − 1))) [∑_i ∑_j x_i^T x_j − ∑_i x_i^T x_i] = [n² x̄^T x̄ − ∑_i x_i^T x_i] / (n(n − 1)).

Now

S = (1/(n − 1)) ∑_{i=1}^n (x_i − x̄)(x_i − x̄)^T = (1/(n − 1)) [∑_i x_i x_i^T − n x̄ x̄^T].

Thus

tr(S) = (1/(n − 1)) [∑_i tr(x_i x_i^T) − n tr(x̄ x̄^T)] = (1/(n − 1)) [∑_i x_i^T x_i − n x̄^T x̄].

Thus

n x̄^T x̄ − tr(S) = n x̄^T x̄ + (n/(n − 1)) x̄^T x̄ − (1/(n − 1)) ∑_i x_i^T x_i = [n² x̄^T x̄ − ∑_i x_i^T x_i] / (n − 1).
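The identity can also be checked numerically. A small sketch in R (our function names), assuming a data matrix x with rows x_i:

Tn_pair <- function(x){                 # (1/(n(n-1))) sum_{i != j} x_i^T x_j
  n <- nrow(x)
  G <- x %*% t(x)                       # G[i, j] = x_i^T x_j
  (sum(G) - sum(diag(G))) / (n * (n - 1))
}
Tn_trace <- function(x){                # (n xbar^T xbar - tr(S)) / n
  n <- nrow(x)
  xbar <- colMeans(x)
  (n * sum(xbar^2) - sum(diag(cov(x)))) / n
}
x <- matrix(rnorm(50 * 20), 50, 20)
all.equal(Tn_pair(x), Tn_trace(x))      # TRUE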
Next, we derive a simple test. Let the variance V(x_i^T x_j) = V(W) = V(W_{ij}) = σ_W² for i ≠ j. Let m = floor(n/2) = ⌊n/2⌋ be the integer part of n/2, so floor(100/2) = floor(101/2) = 50. Let the iid random variables be W_1 = x_1^T x_2, W_2 = x_3^T x_4, …, W_m = x_{2m−1}^T x_{2m}. Note that E(W_i) = μ^T μ and V(W_i) = σ_W². Let S_W² be the sample variance of the W_i:

S_W² = (1/(m − 1)) ∑_{i=1}^m (W_i − W̄)².
The following new theorem follows from the univariate central limit theorem.
Theorem 1.
Assume x_1, …, x_n are iid, E(x_i) = μ, and the variance V(x_i^T x_j) = σ_W² for i ≠ j. Let W_1, …, W_m be defined as above. Then

(a) √m (W̄ − μ^T μ) →_D N(0, σ_W²).

(b) √m (W̄ − μ^T μ)/S_W →_D N(0, 1)

as n → ∞.
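A minimal sketch of this test in R (our function name; x has iid rows):

thm1_test <- function(x){
  n <- nrow(x); m <- floor(n / 2)
  W <- sapply(1:m, function(i) sum(x[2*i - 1, ] * x[2*i, ]))  # W_i = x_{2i-1}^T x_{2i}
  z <- sqrt(m) * mean(W) / sd(W)        # approximately N(0, 1) under H0: mu = 0
  c(stat = z, pval = 2 * pnorm(-abs(z)))
}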
The following theorem derives the variance V(T_n) under simpler regularity conditions than those in the literature, and the new proof of the theorem is also simpler.
Theorem 2.
Assume x_1, …, x_n are iid, E(x_i) = μ, and the variance V(x_i^T x_j) = σ_W² for i ≠ j. Let W_{ij} = x_i^T x_j for i ≠ j. Let θ = Cov(W_{ij}, W_{id}) = μ^T Σ μ, where j ≠ d, i < j, and i < d. Then

(a) V(T_n) = 2σ_W²/(n(n − 1)) + 4(n − 2)θ/(n(n − 1)).

(b) If H_0: μ = 0 is true, then θ = 0 and

V_0 = V(T_n) = 2σ_W²/(n(n − 1)).
Proof. 
(a) To find the variance V(T_n) with T_n from Equation (7), let W_{ij} = x_i^T x_j = W_{ji}, and note that

T_n = (2/(n(n − 1))) H_n  where  H_n = ∑_{i<j} x_i^T x_j = ∑_{i<j} W_{ij}.

Then V(H_n) = Cov(H_n, H_n) =

Cov(∑_{i<j} W_{ij}, ∑_{k<d} W_{kd}) = ∑_{i<j} ∑_{k<d} Cov(W_{ij}, W_{kd}).    (8)

Let V(W_{ij}) = σ_W² for i ≠ j. The covariances are of 3 types. First, if (i, j) = (k, d) with i < j, then Cov(W_{ij}, W_{kd}) = V(W_{ij}) = σ_W². Second, if i, j, k, d are distinct with i < j and k < d, then W_{ij} and W_{kd} are independent with Cov(W_{ij}, W_{kd}) = 0. Third, there are terms where exactly three of the four subscripts are distinct, which have Cov(W_{ij}, W_{id}) = θ, where j ≠ d, i < j, and i < d, or Cov(W_{ij}, W_{kj}) = θ, where i ≠ k, i < j, and k < j. These covariance terms are all equal to the same number θ since W_{ij} = W_{ji}. The number of ways to get three distinct subscripts is

a − b − c = C(n,2)² − C(n,2) C(n−2,2) − C(n,2) = n(n − 1)(n − 2)

since a is the number of terms on the right hand side of (8), b is the number of terms where i, j, k, d are distinct with i < j and k < d, and c is the number of terms where (i, j) = (k, d) with i < j. [Note that n(n − 1) terms have i and j distinct. Half of these terms have i < j and half have i > j. Similarly, n(n − 1)(n − 2)(n − 3) terms have i, j, k, d distinct, and half of the n(n − 1) terms have i < j, while half of the (n − 2)(n − 3) terms have k < d.] Thus

V(H_n) = 0.5 n(n − 1) σ_W² + n(n − 1)(n − 2) θ.

This calculation was adapted from [21] (pp. 336–337). Thus

V(T_n) = (4/[n(n − 1)]²) V(H_n) = 2σ_W²/(n(n − 1)) + 4(n − 2)θ/(n(n − 1)).
(b) Now θ = Cov(x_i^T x_j, x_i^T x_k), where x_i, x_j, and x_k are iid. Hence θ =

Cov(∑_d x_{id} x_{jd}, ∑_t x_{it} x_{kt}) = ∑_d ∑_t Cov(x_{id} x_{jd}, x_{it} x_{kt}) =

∑_d ∑_t [E(x_{id} x_{jd} x_{it} x_{kt}) − E(x_{id} x_{jd}) E(x_{it} x_{kt})] =

∑_d ∑_t [E(x_{id} x_{it}) E(x_{jd}) E(x_{kt}) − E(x_{id}) E(x_{jd}) E(x_{it}) E(x_{kt})] =

∑_d ∑_t [E(x_{jd}) E(x_{kt}) (E(x_{id} x_{it}) − E(x_{id}) E(x_{it}))] =

∑_d ∑_t [E(x_{jd}) E(x_{kt}) Cov(x_{id}, x_{it})] = μ^T Σ μ.

Under H_0, μ = 0 and thus θ = 0. □
Note that T_n is the sample mean of the 0.5 n(n − 1) distinct, identically distributed W_{ij} = x_i^T x_j for i < j. When μ = 0, Theorem 2 proves that the W_{ij} are uncorrelated. Hence when H_0 is true, V(T_n) satisfies Theorem 2(b). Ref. [14] (p. 2024) showed that V(W_{ij}) = V(x_i^T x_j) = σ_W² = tr(Σ²) + 2μ^T Σ μ. Plugging this value into Theorem 2(a) gives the [22] result

V(T_n) = (2/(n(n − 1))) tr(Σ²) + 4μ^T Σ μ/n.
Note that θ = μ^T Σ μ can be consistently estimated as follows. Let g = floor(n/3). Let W_1 = x_1^T x_2, Z_1 = x_1^T x_3, W_2 = x_4^T x_5, Z_2 = x_4^T x_6, …, W_g = x_{3g−2}^T x_{3g−1}, Z_g = x_{3g−2}^T x_{3g}. Then θ̂ is the sample covariance of the (W_i, Z_i) for i = 1, …, g. Note that a consistent estimator of tr(Σ²) is S_W² − 2θ̂.
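A sketch of these estimators in R (our names; x is a data matrix with iid rows):

g <- floor(nrow(x) / 3)
W <- sapply(1:g, function(i) sum(x[3*i - 2, ] * x[3*i - 1, ]))  # W_i = x_{3i-2}^T x_{3i-1}
Z <- sapply(1:g, function(i) sum(x[3*i - 2, ] * x[3*i, ]))      # Z_i = x_{3i-2}^T x_{3i}
thetahat <- cov(W, Z)                # consistent for theta = mu^T Sigma mu
tau2hat <- var(W) - 2 * thetahat     # consistent for tr(Sigma^2) = sigma_W^2 - 2 theta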
Let V̂(T_n) and V̂_0(T_n) be consistent estimators of V(T_n) and V_0(T_n), respectively. Then refs. [22,23,24,25], and others, proved that under mild regularity conditions when H_0 is true,

T_n/√(V̂(T_n)) = T_n/√(V̂_0(T_n)) →_D N(0, 1).

Under regularity conditions when H_0 is true, ref. [25] proved that T_n/√(V̂_0(T_n)) →_D t_k as p → ∞ for fixed n ≥ 3, where k = 0.5 n(n − 1) − 1.
A consistent estimator of V_0(T_n) needs a consistent estimator of σ_W² = 0.5 n(n − 1) V_0(T_n). Let s_n² = V̂_0(T_n). Then one estimator is 0.5 n(n − 1) s_n² = S_W² from Theorem 1. An estimator nearly the same as the one used by [25] is

0.5 n(n − 1) s_n² = σ̂_W² = (1/(n(n − 1))) ∑_{i ≠ j} (x_i^T x_j − T_n)² = (1/(n(n − 1))) ∑_{i ≠ j} (W_{ij} − T_n)².
Note that σ_W can be proportional to p since σ_W is the standard deviation of a sum of p random variables. Thus to have good asymptotic power against all alternatives, we likely need p/n → 0 as n, p → ∞. When μ ≠ 0, T_n/√(V̂_0(T_n)) tends to have more power than T_n/√(V̂(T_n)) since V_0(T_n) < V(T_n). Suppose μ = δ 1, where the constant δ > 0 and 1 is the p × 1 vector of ones. Then μ^T μ = δ² p, and the test using V̂_0(T_n) may have good power for T_n/√(V̂_0(T_n)) > 1.96 ≈ 2, or for

δ² p / √(2σ_W²/(n(n − 1))) > 2  or  δ² > 2√2 σ_W/(np).
For computing V̂_0(T_n), a question is whether to use an estimator of σ_W² or of τ² = tr(Σ²). Let the ijth element of Σ be σ_{ij} with Σ = (σ_{ij}). Let ‖Σ‖_F be the Frobenius norm of Σ, and ‖a‖ be the Euclidean norm of a vector a. Let vec(Σ) be the vector formed by stacking the columns of Σ into a vector. Then τ² = tr(Σ^T Σ) = ‖Σ‖_F² = ∑_{i=1}^p ∑_{j=1}^p σ_{ij}² = ‖vec(Σ)‖². There is a level-power tradeoff. Using σ̂_W² is good for controlling the level = P(type I error) when H_0 is true. Since σ_W² = τ² + 2μ^T Σ μ = τ² + 2θ, the parameter τ² can be much smaller than σ_W², and using a good estimator of τ² may result in better power.
In high dimensions, it is often very difficult to estimate a k × 1 vector θ when k > n. This result is a form of “the curse of dimensionality.” If a √n consistent estimator of θ is available, then the squared norm

‖θ̂ − θ‖² = ∑_{i=1}^k (θ̂_i − θ_i)² ≍ k/n.

Hence estimators τ̂² that use many parameters, such as plug in estimators Σ̂, are likely to be poor. The two parameter estimator τ̂² = σ̂_W² − 2θ̂ likely has more variability than σ̂_W² when H_0 is true, and better estimators of θ are needed. In simulations, τ̂_1² = σ̂_W² − 2θ̂ was often negative. Let τ̂_2² = τ̂_1² if τ̂_1² > 0 and τ̂_2² = σ̂_W² otherwise. In limited simulations, this estimator did about as well as τ̂_3² = σ̂_W². Obtaining an estimator that clearly outperforms σ̂_W² would improve the omnibus test, but is beyond the scope of this paper.
We also considered replacing x_i by z_i = ss(x_i), where the spatial sign function ss(x_i) = 0 if x_i = 0, and ss(x_i) = x_i/‖x_i‖ otherwise. This function projects the nonzero x_i onto the unit p-dimensional hypersphere centered at 0. Let T_n(w) denote the statistic T_n computed from an iid sample w_1, …, w_n. Since the z_i are iid if the x_i are iid, use T_n(z) to test H_0: μ_z = 0 versus H_A: μ_z ≠ 0, where μ_z = E(z_i). In general, μ_z ≠ μ = μ_x = E(x_i), but μ_z = μ = 0 can occur if the x_i have a lot of symmetry about 0. In particular, μ_z = μ = 0 if the x_i are iid from an elliptically contoured distribution with E(x_i) = μ = 0. The test based on the statistic T_n(z) can be useful if the first or second moments of the x_i do not exist, for example if the x_i are iid from a multivariate Cauchy distribution. These results may be useful for understanding papers such as [26].
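A minimal sketch of the spatial sign transform in R (our function name):

spatial_sign <- function(x){
  nrm <- sqrt(rowSums(x^2))
  nrm[nrm == 0] <- 1            # ss(0) = 0: zero rows stay zero
  x / nrm                       # divides row i by ||x_i|| (column-major recycling)
}
# z <- spatial_sign(x); then apply the one sample test to the rows z_i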
The nonparametric bootstrap draws a bootstrap data set x_1^*, …, x_n^* with replacement from the x_i and computes T_1^* by applying T_n to the bootstrap data set. This process is repeated B times to get a bootstrap sample T_1^*, …, T_B^*. For the statistic T_n, the nonparametric bootstrap fails in high dimensions because terms like x_j^T x_j need to be avoided, and the nonparametric bootstrap has replicates: the proportion of cases in the bootstrap sample that are not replicates is about 1 − e^{−1} ≈ 2/3 ≈ 7/11. The m out of n bootstrap draws a sample of size m without replacement from the n cases. Using m = floor(2n/3) worked well in simulations. Sampling without replacement is also known as subsampling and the delete-d jackknife.
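A sketch of the m out of n bootstrap for T_n with m = floor(2n/3) (our function names):

Tn <- function(x){
  n <- nrow(x)
  G <- x %*% t(x)
  (sum(G) - sum(diag(G))) / (n * (n - 1))
}
moon_boot <- function(x, B = 100){
  n <- nrow(x); m <- floor(2 * n / 3)
  replicate(B, Tn(x[sample(n, m), , drop = FALSE]))   # sample WITHOUT replacement
}
Tstar <- moon_boot(matrix(rnorm(60 * 100), 60, 100))  # bootstrap sample T_1^*, ..., T_B^*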

2.2. Three High Dimensional Two Sample Tests

If (x_{1i}, x_{2i}) come in correlated pairs, a high dimensional analog of the paired t test applies the one sample test to z_i = x_{1i} − x_{2i}.
Now suppose there are two independent random samples x_{1,1}, …, x_{1,n_1} and x_{2,1}, …, x_{2,n_2} from two populations or groups, and that it is desired to test H_0: μ_1 = μ_2 versus H_1: μ_1 ≠ μ_2, where E(x_i) = μ_i are p × 1 vectors. Let n = n_1 + n_2. Let S_i be the sample covariance matrix of x_i and let Cov(x_i) = Σ_i for i = 1, 2.
A simple test takes m = min(n_1, n_2) and z_i = x_{1i} − x_{2i} for i = 1, …, m. Then apply the one sample test from Theorem 2 to the z_i. This paired test might work well in high dimensions because of the superior power of the Theorem 2 test, but in low dimensions, it is known that there are better tests.
Let x_1 be the x_i that has n_1 ≤ n_2. Then let

y_i = x_{1i} − √(n_1/n_2) x_{2i} + (1/√(n_1 n_2)) ∑_{j=1}^{n_1} x_{2j} − x̄_2 = x_{1i} − √(n_1/n_2) x_{2i} + a_{n_1,n_2} x̄_2

for i = 1, …, n_1. Note that y_i = z_i = x_{1i} − x_{2i} if n_1 = n_2. Ref. [27] (pp. 177–178) proved that ȳ = x̄_1 − x̄_2, that y_i and y_j are uncorrelated for i ≠ j, that E(y_i) = μ_1 − μ_2, and that Cov(y_i) = Cov(x_1) + (n_1/n_2) Cov(x_2) for i = 1, …, n_1. Ref. [25] showed that T_n(y)/√(V̂_0(y)) →_D N(0, 1), where the y denotes that the one sample test was computed using the y_i.
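A sketch of this transformation in R (our function name; assumes nrow(x1) <= nrow(x2)):

make_y <- function(x1, x2){
  n1 <- nrow(x1); n2 <- nrow(x2)
  const <- colSums(x2[1:n1, , drop = FALSE]) / sqrt(n1 * n2) - colMeans(x2)
  sweep(x1 - sqrt(n1 / n2) * x2[1:n1, , drop = FALSE], 2, const, "+")
}
# check: colMeans(make_y(x1, x2)) equals colMeans(x1) - colMeans(x2)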
Note that H_0: μ_1 = μ_2 holds if and only if ‖μ_1 − μ_2‖² = μ_1^T μ_1 + μ_2^T μ_2 − 2μ_1^T μ_2 = 0. These terms can be estimated by T_n = T_n(x, y) = T_1 + T_2 − 2T_3, where T_1 and T_2 are the one sample test statistic applied to samples 1 and 2 and n_1 n_2 T_3 = ∑_{i=1}^{n_1} ∑_{j=1}^{n_2} x_{1i}^T x_{2j}.
Let X_{ij} = x_{1i}^T x_{1j} = X_{ji} and Y_{ij} = x_{2i}^T x_{2j} = Y_{ji}, where i ≠ j. Let Z_{ij} = x_{1i}^T x_{2j} ≠ Z_{ji}. Let σ_X² = V(X_{ij}), σ_Y² = V(Y_{ij}), and σ_Z² = V(Z_{ij}). Let V_0(T_n) be the variance of T_n when H_0 is true. Assume s_n² is a consistent estimator of V_0(T_n). Under H_0: μ_1 = μ_2 and additional regularity conditions, ref. [22] showed that V_0(T_n) =

(2/(n_1(n_1 − 1))) tr(Σ_1²) + (2/(n_2(n_2 − 1))) tr(Σ_2²) + (4/(n_1 n_2)) tr(Σ_1 Σ_2)

and that

T_n/s_n →_D N(0, 1).
Let θ_1 = Cov(X_{ij}, X_{it}) = μ_1^T Σ_1 μ_1, where j ≠ t, i < j, and i < t; θ_2 = Cov(Y_{ij}, Y_{it}) = μ_2^T Σ_2 μ_2, where j ≠ t, i < j, and i < t; θ_3 = Cov(Z_{ij}, Z_{it}) = μ_2^T Σ_1 μ_2, where j ≠ t; and θ_4 = Cov(Z_{ij}, Z_{kj}) = μ_1^T Σ_2 μ_1, where i ≠ k.
Ref. [22] showed that

V(T_3) = tr(Σ_1 Σ_2)/(n_1 n_2) + θ_3/n_1 + θ_4/n_2.

Ref. [28], using arguments similar to Theorem 2, showed that

V(T_3) = σ_Z²/(n_1 n_2) + θ_3(n_2 − 1)/(n_1 n_2) + θ_4(n_1 − 1)/(n_1 n_2).

Thus σ_Z² = tr(Σ_1 Σ_2) + θ_3 + θ_4 and tr(Σ_1 Σ_2) = σ_Z² − θ_3 − θ_4. Hence

V_0(T_n) = 2(σ_X² − 2θ_1)/(n_1(n_1 − 1)) + 2(σ_Y² − 2θ_2)/(n_2(n_2 − 1)) + 4(σ_Z² − θ_3 − θ_4)/(n_1 n_2).

If μ_1 = μ_2 = 0, then the θ_i = 0, and the formula with the θ_i = 0 worked well in simulations. Note that σ_X², σ_Y², and the θ_i can be estimated as in Section 2.1. Let m = min(n_1, n_2) and Z_i = x_{1i}^T x_{2i} for i = 1, …, m. Let S_Z² be the sample variance of the Z_i. Another estimator of σ_Z² is

σ̂_Z² = (1/(n_1 n_2)) ∑_{i=1}^{n_1} ∑_{j=1}^{n_2} (x_{1i}^T x_{2j} − T_3(x, y))².

2.3. Theory for Testing H_0: A Σ_{xY} = 0

Consider tests of the form H_0: A Σ_{xY} = 0 versus H_1: A Σ_{xY} ≠ 0. The omnibus test uses A = I_p and tests H_0: Σ_{xY} = 0 versus H_1: Σ_{xY} ≠ 0.
Let w_i = (x_i − μ_x)(Y_i − μ_Y) and v_i = (x_i − x̄)(Y_i − Ȳ) for i = 1, …, n. Then T_n(w)/s_n(w) →_D N(0, 1) under mild regularity conditions by Section 2.1, where w indicates that the test was applied to the w_i. Ref. [14] showed that T_n(v)/s_n(v) →_D N(0, 1) and used p = 1.5n for multiple linear regression in their simulations.
Let O = {i_1, …, i_k}, x_O = (x_{i_1}, …, x_{i_k})^T, and x_{i,O} = (x_{i,i_1}, …, x_{i,i_k})^T. Then testing H_0: Σ_{x_O Y} = 0 uses the one sample test on the v_{i,O} = (x_{i,O} − x̄_O)(Y_i − Ȳ). This test is equivalent to testing H_0: β_{OPLS,O} = 0 and H_0: β_{MMLE,O} = 0. Note that data splitting could be used to select O. For multiple linear regression and the MMLE and OPLS estimators, these tests are high dimensional analogs of the OLS partial F tests for testing whether a reduced model is good. If I ∪ O = {1, …, p}, then I corresponds to the predictors in the reduced model while O corresponds to the predictors out of the reduced model.
In low dimensions, important tests for regression include (a) H_0: β_i = 0 (the Wald tests for MLR), (b) H_0: β = 0 (the Anova F test for MLR), and (c) H_0: (β_{i_1}, …, β_{i_k})^T = 0 (the partial F test for MLR). The above paragraph shows how to do these high dimensional tests for the multiple linear regression OPLS and MMLE estimators, with or without heterogeneity; see the sketch below. Data splitting is not needed if O is known. Note that (a) corresponds to testing H_0: Cov(x_i, Y) = 0 while (c) corresponds to testing H_0: (Cov(x_{i_1}, Y), …, Cov(x_{i_k}, Y))^T = 0.
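A sketch of the subset test in R, using the hdomni function listed in Section 4 with a hypothetical index set O (x and y are the data):

O <- c(3, 7, 10)                          # hypothetical O = {i1, i2, i3}
res <- hdomni(x[, O, drop = FALSE], y)    # one sample test on the v_{i,O}
res$rpval                                 # right tail p-value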
The next subsection reviews competitors for the above tests when k is small compared to n.

2.4. Theory for Certain A

This subsection reviews some large sample theory for η̂_OPLS = Σ̂_{xY} and OPLS for the multiple linear regression model, including some high dimensional tests for low dimensional quantities such as H_0: β_i = 0 or H_0: β_i − β_j = 0. These tests depend on iid cases, but not on linearity or the constant variance assumption. Hence the tests are useful for multiple linear regression with heterogeneity.
The following [5] theorem gives the large sample theory for η̂ = Cov̂(x, Y). Ref. [6] gave alternative proofs. This theory needs η = η_OPLS = Σ_{xY} to exist for η̂ = Σ̂_{xY} to be a consistent estimator of η. Let x_i = (x_{i1}, …, x_{ip})^T and let w_i and v_i be defined as below, where

Cov(w_i) = Σ_w = E[(x_i − μ_x)(x_i − μ_x)^T (Y_i − μ_Y)²] − Σ_{xY} Σ_{xY}^T.

Then low order moments are needed for Σ̂_v to be a consistent estimator of Σ_w.
Theorem 3.
Assume the cases (x_i^T, Y_i)^T are iid. Assume E(x_{ij}^k Y_i^m) exists for j = 1, …, p and k, m = 0, 1, 2. Let μ_x = E(x) and μ_Y = E(Y). Let w_i = (x_i − μ_x)(Y_i − μ_Y) with sample mean w̄_n. Let η = Σ_{xY}. Then (a)

√n (w̄_n − η) →_D N_p(0, Σ_w),  √n (η̂_n − η) →_D N_p(0, Σ_w),

and  √n (η̃_n − η) →_D N_p(0, Σ_w).

(b) Let v_i = (x_i − x̄_n)(Y_i − Ȳ_n). Then Σ̂_w = Σ̂_v + O_P(n^{−1/2}). Hence Σ̃_w = Σ̃_v + O_P(n^{−1/2}). (c) Let A be a k × p full rank constant matrix with k ≤ p, assume H_0: A β_OPLS = 0 is true, and assume λ̂ →_P λ ≠ 0. Then

√n A(β̂_OPLS − β_OPLS) →_D N_k(0, λ² A Σ_w A^T).
For the following theorem, consider a subset of k distinct elements from Σ̃ or from Σ̂. Stack the elements into a vector, and let each vector have the same ordering. For example, the largest subset of distinct elements corresponds to

vech(Σ̃) = (σ̃_11, …, σ̃_1p, σ̃_22, …, σ̃_2p, …, σ̃_{p−1,p−1}, σ̃_{p−1,p}, σ̃_pp)^T = [σ̃_{jk}].

For random variables x_1, …, x_p, use notation such as x̄_j = the sample mean of the x_j, μ_j = E(x_j), and σ_{jk} = Cov(x_j, x_k). Let

n vech(Σ̃) = [n σ̃_{jk}] = ∑_{i=1}^n [(x_{ij} − x̄_j)(x_{ik} − x̄_k)].

For general vectors of elements, the ordering of the vectors will all be the same and be denoted by vectors such as ĉ = [σ̂_{jk}], c̃ = [σ̃_{jk}], c = [σ_{jk}], v_i = [(x_{ij} − x̄_j)(x_{ik} − x̄_k)], and w_i = [(x_{ij} − μ_j)(x_{ik} − μ_k)]. Let w̄_n = ∑_{i=1}^n w_i/n be the sample mean of the w_i. Assuming that Cov(w_i) = Σ_w exists, then E(w_i) = E(w̄_n) = c.
The following [6] theorem provides large sample theory for ĉ and c̃. We use Cov(w_i) = Σ_d to avoid confusion with the Σ_w used in Theorem 3. Note that the x_i are dummy variables and could be replaced by u_i = (Y_{i1}, …, Y_{im}, x_{i1}, …, x_{ip})^T to get information about m response variables Y_1, …, Y_m. Testing H_0: (Σ_{xY_1}^T, Σ_{xY_2}^T)^T = 0 could likely be done by applying the one sample test to z_1 = ((x_1 − x̄)^T(Y_{1,1} − Ȳ_1), (x_2 − x̄)^T(Y_{2,2} − Ȳ_2))^T, …, z_m = ((x_{n−1} − x̄)^T(Y_{1,n−1} − Ȳ_1), (x_n − x̄)^T(Y_{2,n} − Ȳ_2))^T, assuming n = 2m and iid cases.
Theorem 4.
Assume the cases x_i are iid and that Cov(w_i) = Σ_d exists. Using the above notation with c a k × 1 vector,

(i) √n (c̃ − c) →_D N_k(0, Σ_d).

(ii) √n (ĉ − c) →_D N_k(0, Σ_d).

(iii) Σ̂_d = Σ̂_v + O_P(n^{−1/2}) and Σ̃_d = Σ̃_v + O_P(n^{−1/2}).

2.5. Testing

As noted by [5], the following simple testing method reduces a possibly high dimensional problem to a low dimensional problem. Testing H_0: A β_OPLS = 0 versus H_1: A β_OPLS ≠ 0 is equivalent to testing H_0: A η = 0 versus H_1: A η ≠ 0, where A is a k × p constant matrix. Let Cov(Σ̂_{xY}) = Cov(η̂) = Σ_w be the asymptotic covariance matrix of η̂ = Σ̂_{xY}. In high dimensions where n < 5p, we cannot get a good nonsingular estimator of Cov(Σ̂_{xY}), but we can get good nonsingular estimators of Cov(Σ̂_{uY}) = Cov((η̂_{i_1}, …, η̂_{i_k})^T) with u = x_I = (x_{i_1}, …, x_{i_k})^T, where n ≥ Jk with J ≥ 10. Here I = {i_1, …, i_k} denotes predictors that are in the model. (Values of J much larger than 10 may be needed if some of the k predictors and/or Y are skewed.) Simply apply Theorem 3 to the predictors u used in the hypothesis test, and thus use the sample covariance matrix of the vectors (u_i − ū)(Y_i − Ȳ). Hence we can test hypotheses like H_0: β_i − β_j = 0. In particular, testing H_0: β_i = 0 is equivalent to testing H_0: η_i = σ_{x_i,Y} = 0, where σ_{x_i,Y} = Cov(x_i, Y).
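A minimal sketch for a single predictor (our function name), following Theorem 3 with u = x_i:

eta_ci <- function(xi, y, alpha = 0.05){
  n <- length(y)
  v <- (xi - mean(xi)) * (y - mean(y))   # v_i for the single predictor x_i
  etahat <- mean(v)                      # estimates eta_i = Cov(x_i, Y)
  se <- sd(v) / sqrt(n)
  etahat + c(-1, 1) * qt(1 - alpha / 2, n - 1) * se
}
# reject H0: beta_i = 0 (OPLS or MMLE) if 0 is outside the interval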

2.6. High Dimensional Outlier Detection

High dimensional outlier detection is important. This subsection follows [29] closely. See [29,30] for examples and simulations. Let W be a data matrix, where the rows w_i correspond to cases. For example, w_i = x_i or w_i = z_i = (Y_i, x_{i1}, …, x_{ip})^T. One of the simplest outlier detection methods uses the Euclidean distances of the w_i from the coordinatewise median, D_i = D_i(MED(W), I_p). Concentration type steps compute the weighted median MED_j: the coordinatewise median computed from the “half set” of cases w_i with D_i² ≤ MED(D_i²(MED_{j−1}, I_p)), where MED_0 = MED(W). We often used j = 0 (no concentration type steps) or j = 9. Let D_i = D_i(MED_j, I_p). Let W_i = 1 if D_i ≤ MED(D_1, …, D_n) + k MAD(D_1, …, D_n), where k ≥ 0 and k = 5 is the default choice. Let W_i = 0 otherwise. Using k ≥ 0 insures that at least half of the cases get weight 1. This weighting corresponds to the weighting that would be used in a one sided metrically trimmed mean (Huber type skipped mean) of the distances. Here, the sample median absolute deviation is MAD(n) = MAD(D_1, …, D_n) = MED(|D_i − MED(n)|, i = 1, …, n), where MED(n) = MED(D_1, …, D_n) is the sample median of D_1, …, D_n.
Let the covmb2 set B of at least n/2 cases correspond to the cases with weight W_i = 1. Then the covmb2 estimator (T, C) is the sample mean and sample covariance matrix applied to the cases in set B. If w_i = x_i, then

T = ∑_{i=1}^n W_i x_i / ∑_{i=1}^n W_i  and  C = ∑_{i=1}^n W_i (x_i − T)(x_i − T)^T / (∑_{i=1}^n W_i − 1).
This estimator was built for speed, applications, and outlier resistance.
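A minimal sketch of the covmb2 weighting in R (our function name; the slpack function covmb2 is the reference implementation):

covmb2_sketch <- function(x, k = 5, steps = 0){
  med <- apply(x, 2, median)                    # coordinatewise median
  D <- sqrt(rowSums(sweep(x, 2, med)^2))        # Euclidean distances
  if (steps > 0) for (j in 1:steps){            # concentration type steps
    half <- D^2 <= median(D^2)
    med <- apply(x[half, , drop = FALSE], 2, median)
    D <- sqrt(rowSums(sweep(x, 2, med)^2))
  }
  wt <- D <= median(D) + k * mad(D, constant = 1)  # raw MAD, default k = 5
  B <- x[wt, , drop = FALSE]                    # the covmb2 set B
  list(T = colMeans(B), C = cov(B))
}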
Another method to get an outlier resistant estimator Σ̂_{xY} is to use the following identity. If X and Y are random variables, then

Cov(X, Y) = [Var(X + Y) − Var(X − Y)]/4.

Then replace Var(W) by [σ̂(W)]², where σ̂(W) is a robust estimator of scale or standard deviation and W = X + Y or W = X − Y. We used σ̂(W) = 1.483 MAD(W), where MAD(W) = MAD(n) = MAD(W_1, …, W_n). Hence

Cov̂(X, Y) = ([1.483 MAD(X + Y)]² − [1.483 MAD(X − Y)]²)/4.
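A sketch of this robust covariance estimate in R (our function name):

rcov_mad <- function(x, y){
  s <- function(w) 1.483 * mad(w, constant = 1)   # robust scale: 1.483 MAD
  (s(x + y)^2 - s(x - y)^2) / 4                   # plug into the Cov identity
}
# sanity check of the identity with classical variances:
# (var(x + y) - var(x - y)) / 4 equals cov(x, y)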
The function ddplot5 plots the Euclidean distances from the coordinatewise median versus the Euclidean distances from the covmb2 location estimator. Typically the plotted points in this DD plot cluster about the identity line, and outliers appear in the upper right corner of the plot with a gap between the bulk of the data and the outliers.
The function rcovxy makes the classical and three robust estimators of η = Σ_{xY}, and makes a scatterplot matrix of the four estimated sufficient predictors η̂^T x and Y. Only two robust estimators are made if n ≤ 2.5p.

3. Results

Example 1.
The [31] data was collected from n = 26 districts in Prussia in 1843. Let Y = the number of women married to civilians in the district, with a constant and predictors x_1 = the population of the district in 1843, x_2 = the number of married civilian men in the district, x_3 = the number of married men in the military in the district, and x_4 = the number of women married to husbands in the military in the district. Sometimes the person conducting the survey would not count a spouse if the spouse was not at home. Hence Y and x_2 are highly correlated but not equal. Similarly, x_3 and x_4 are highly correlated but not equal. We expect β = β_OLS ≈ (0, 1, 0, 0)^T. Then β̂_OLS = (0.00035, 0.9995, 0.2328, 0.1531)^T, β̂_MMLE = (0.1782, 1.0010, 48.5630, 51.5513)^T, Σ̂_{xY} = (9285758004, 1674298902, 9855702, 9653811)^T, and β̂_OPLS = (0.1727, 0.0311, 0.0002, 0.0002)^T. Let the omnibus test statistic be Z(v) = T_n(v)/√(V̂_0(T_n(v))) applied to the v_i = (x_i − x̄)(Y_i − Ȳ). Then Z(v) = 9.3281 and the hypotheses H_0: Σ_{xY} = 0, H_0: β_OLS = 0, H_0: β_OPLS = 0, and H_0: β_MMLE = 0 are all rejected. The classical F-test also rejects H_0 with p-value = 0.
Example 2.
The [32] pottery data has n = 36 pottery shards of Roman earthware produced between the second century B.C. and the fourth century A.D. Often the pottery was stamped by the manufacturer. A chemical analysis was done for p = 20 chemicals (variables), and the types of pottery were 1-Arretine, 2-not-Arretine, 3-North Italian, 4-Central Italian, and 5-questionable origin. Let the binary response variable Y = 1 for type 1 and Y = 0 for types 2–5. The omnibus test had Z = 2.146 for a two sided p-value of 0.0319 and the more correct right tailed p-value of 0.016. The chi-square logistic regression test for β = 0 had p-value = 0.0002, but the GLM did not converge.

3.1. One Sample Tests

In the simulations, we examined five one sample tests. The first “test” used the m out of n bootstrap to compute T_1^*, …, T_B^* with B = 100. We used the shorth bootstrap confidence interval described in [30] (ch. 2). This “test” has not been proven to have level α. The second test computed the usual t confidence interval

[W̄ − t_{1−α/2,m−1} S_W/√m, W̄ + t_{1−α/2,m−1} S_W/√m]

for μ^T μ based on the W_i from Theorem 1. The third and fourth tests used Theorem 2(b) and T_n/s_n →_D N(0, 1) if s_n² is a consistent estimator of V(T_n) when H_0 is true. The third test used 0.5 n(n − 1) s_n² = σ̂_W², while the fourth test used 0.5 n(n − 1) s_n² = S_W² based on Theorem 1. These two tests computed intervals (“confidence intervals for 0”)

[T_n − t_{1−α/2,m−1} s_n, T_n + t_{1−α/2,m−1} s_n].

Tests 2–4 use the same cutoff t_{1−α/2,m−1} so that the average interval lengths are more comparable. The fifth test used the Theorem 2 test applied to the spatial sign vectors with S_W².
The simulation used four distribution types where x = A y + δ 1 with E(x) = δ 1, where 1 is the p × 1 vector of ones. Type 1 used y ∼ N_p(0, I), type 2 used the mixture distribution y ∼ 0.6 N_p(0, I) + 0.4 N_p(0, 25I), type 3 used a multivariate t_4 distribution, and type 4 used a multivariate lognormal distribution where y = (y_1, …, y_p)^T with w_i = exp(Z), Z ∼ N(0, 1), and y_i = w_i − E(w_i), where E(w_i) = exp(0.5). The covariance matrix type depended on the matrix A. Type 1 used A = I_p, type 2 used A = diag(1, …, p), and type 3 used A = ψ 1 1^T + (1 − ψ) I_p, giving cor(x_{ij}, x_{ik}) = ρ for j ≠ k, where ρ = 0 if ψ = 0, ρ → 1/(c + 1) as p → ∞ if ψ = 1/√(cp) where c > 0, and ρ → 1 as p → ∞ if ψ ∈ (0, 1) is a constant. We used δ = 0 and δ > 0 chosen so that at least one test had good power. The simulation used 5000 runs, the 4 x distributions, and the 3 matrices A. For the third A, we used ψ = 1/√p.
Table 1 and Table 2 summarize some simulation results. There are two lines for each simulation scenario. The first line gives the simulated power = proportion of times H_0: μ = 0 was rejected. The second line gives the average length of the confidence interval for 0, where H_0 is rejected if 0 is not in the confidence interval. When δ = 0, observed coverage between 0.04 and 0.06 suggests that coverage = power = level is close to the nominal value 0.05. For larger δ, we want the coverage near 1 for good power. See [28] for more simulations.
The bootstrap test corresponds to the Boot column, and the tests using (W̄, S_W), (T_n, σ̂_W), and (T_n, S_W) correspond to the next three columns. The last column corresponds to the spatial sign test. This test tends to have much shorter lengths because of the transformation of the data. The test using (W̄, S_W) has simple large sample theory, but low power compared to the other methods. This test's length is approximately √(n − 1) times the length of that corresponding to (T_n, S_W), where √99 ≈ 10 in the tables. The bootstrap test was sometimes conservative with observed coverage < 0.04 when δ = 0. For xtype = 4 and δ = 0, H_0 was not true for the spatial test. Hence the coverage for the spatial test was sometimes higher than 0.06 for this scenario. For δ = 0, the test with (T_n, σ̂_W) sometimes had coverage less than 0.04, while the test with (T_n, S_W) sometimes had coverage greater than 0.06. In the simulations, the spatial test often performed well, but typically E(z_i) = μ_z ≠ μ_x = E(x_i), which makes the spatial test harder to use. For testing H_0: μ_x = 0, the test with (T_n, σ̂_W) appeared to perform better than the three competitors.

3.2. Two Sample Tests

In the simulations, we examined three two sample tests. The first “test” used the m out of n bootstrap with m_i = floor(2n_i/3) to bootstrap the [22] test that estimates ‖μ_1 − μ_2‖² = μ_1^T μ_1 + μ_2^T μ_2 − 2μ_1^T μ_2. The second test was the “paired test” with m = min(n_1, n_2) and z_i = x_{1i} − x_{2i} for i = 1, …, m, which applies the one sample test from Theorem 2 to the z_i. The third test was the [25] Li test. The last two tests used S_W² applied to the z_i or the y_i.
The simulation used four distribution types where x_1 = A_1 y_1 + δ 1 and x_2 = A_2 y_2, where y_1 and y_2 had the same distribution, with E(x_1) = δ 1 and E(x_2) = 0. Type 1 used y ∼ N_p(0, I), type 2 used the mixture distribution y ∼ 0.6 N_p(0, I) + 0.4 N_p(0, 25I), type 3 used a multivariate t_4 distribution, and type 4 used a multivariate lognormal distribution where y = (y_1, …, y_p)^T with w_i = exp(Z), Z ∼ N(0, 1), and y_i = w_i − E(w_i), where E(w_i) = exp(0.5). The covariance matrix type depended on the matrices A_i.
For the covariance types, covtyp = 1 used Cov(x_1) = I and Cov(x_2) = σ² Cov(x_1); covtyp = 2 used Cov(x_1) = diag(1, 2, …, p) and Cov(x_2) = σ² Cov(x_1); and covtyp = 3 used Cov(x_1) = I and Cov(x_2) = σ² diag(1, 2, …, p). Table 3 shows some results. Two lines were used for each simulation scenario, with coverages on the first line and lengths on the second line. When n_1 = n_2, the paired test and Li test gave the same results. When n_1/n_2 was not near 1, the Li test had better power and shorter length. Increasing δ could greatly increase the length for the bootstrap test, but the coverage would be 1. Improving the one sample test would improve the Li test, but the Li test performed well in simulations.

3.3. Theorem 3 Tests

We illustrate Theorem 3 and Section 2.5 for Poisson regression and negative binomial regression. This simulation is similar to that done by [6] for multiple linear regression with and without heterogeneity. Let x ∼ N_{p−1}(0, I) be the (p − 1) × 1 vector of nontrivial predictors. Let SP_i = α + x_i^T β = 1 + x_{i,1} + ⋯ + x_{i,k} for i = 1, …, n. Hence α = 1 and ϕ = (α, β^T)^T = (1, 1, …, 1, 0, …, 0)^T with k + 1 ones and p − k − 1 zeros. Here β is the Poisson regression parameter vector β_PR or the negative binomial regression parameter vector β_NBR. Let Z_i = log(Y_i) if Y_i > 0 and Z_i = log(0.5) if Y_i = 0. Then a multiple linear regression model with heterogeneity is Z_i = α_Z + x_i^T β_Z + e_i, where the e_i are independent with expected value E(e_i) = 0 and variance V(e_i) = σ_i². Since the cases (x_i, Y_i) are iid, the OLS estimator β_OLS = c_o β = Σ_x^{-1} Σ_{xZ} = Σ_{xZ} because Σ_x = I_{p−1}. Thus Σ_{xZ} = (c_o, …, c_o, 0, …, 0)^T with the first k values equal to c_o and p − k − 1 zeros.
Let η_OPLS = Σ_{xZ} = (η_1, …, η_{p−1})^T. Then the Theorem 3 large sample 100(1 − δ)% confidence interval (CI) η̂_i ± t_{n−1,1−δ/2} SE(η̂_i) could be computed for each η_i. If 0 is not in the confidence interval, then H_0: η_i = 0 and H_0: β_{iE} = 0 are both rejected for estimators E = OPLS and MMLE for the multiple linear regression model with Z. In the simulations with n = 50, p = 4, and ψ > 0, the maximum observed undercoverage was about 0.05 = 5%. Hence the program has the option to replace the cutoff t_{n−1,1−δ/2} by t_{n−1,up}, where up = min(1 − δ/2 + 0.05, 1 − δ/2 + 2.5/n) if δ/2 > 0.1, and

up = min(1 − δ/4, 1 − δ/2 + 12.5δ/n)

if δ/2 ≤ 0.1. If up < 1 − δ/2 + 0.001, then use up = 1 − δ/2. This correction factor was used in the simulations for the nominal 95% CIs, where the correction factor uses a cutoff that is between t_{n−1,0.975} and the cutoff t_{n−1,0.9875} that would be used for a 97.5% CI. The nominal coverage was 0.95 with δ = 0.05. Observed coverage between 0.94 and 0.96 suggests that the coverage is close to the nominal value. Ref. [33] noted that weighted least squares tests tend to reject H_0 too often (liberal tests with undercoverage).
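A sketch of this correction factor in R (our function name; delta is the nominal CI error):

up_cutoff <- function(n, delta){
  up <- if (delta / 2 > 0.1){
    min(1 - delta/2 + 0.05, 1 - delta/2 + 2.5/n)
  } else {
    min(1 - delta/4, 1 - delta/2 + 12.5 * delta/n)
  }
  if (up < 1 - delta/2 + 0.001) up <- 1 - delta/2
  qt(up, n - 1)                 # corrected t cutoff t_{n-1, up}
}
up_cutoff(100, 0.05)            # between qt(0.975, 99) and qt(0.9875, 99)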
To summarize the p − 1 confidence intervals, the average length of each confidence interval over the 5000 runs was computed. Then the minimum, mean, and maximum of the average lengths were computed. The proportion of times each confidence interval contained zero was computed. These proportions were the observed coverages of the p − 1 confidence intervals. Then the minimum observed coverage was found. The percentages of the observed coverages that were ≥ 0.9, 0.92, 0.93, 0.94, and 0.96 were also recorded. The test H_0: (η_i, η_j)^T = (0, 0)^T was also done where H_0 was true. The coverage of the test was recorded, and a correction factor was not used. Negative binomial regression and Poisson regression were used, where κ = ∞ indicates that Poisson regression was used.
Table 4 illustrates Theorem 3(a) where k = 1, and Table 4 replaces Y with Z. For Table 4, confidence intervals were made for η_i = Cov(x_i, Z) for i = 1, …, 99, and the coverage was the percentage of the 5000 CIs that contained 0. Here η_1 ≠ 0, but η_i = 0 for i = 2, …, 99. The first two lines of Table 4 correspond to Poisson regression. The confidence interval for η_1 never contained 0, hence the minimum coverage was 0 with observed power = 1 − 0 = 1. The proportion of CIs that had coverage ≥ 0.94 was 0.9898 (98/99 CIs). Hence this was also the proportion of CIs with coverage ≥ 0.90, 0.92, and 0.93. The proportion of CIs that had coverage ≥ 0.96 was 0.8081 (80/99 CIs). The typical coverage was near 0.965, hence the correction factor was slightly too large. The test H_0: (η_98, η_99)^T = (0, 0)^T did not use a correction factor, and coverage was 0.9438. The minimum average CI length was 0.4166, the sample mean of the average CI lengths was 0.4187, and the maximum average length was 0.4875, corresponding to η_1. The second two lines and below of Table 4 were for negative binomial regression with kappa = κ = 0.5, 1, 10, 100. For κ = 1000 and 10,000, the simulations were very similar to those for κ = ∞. Using Y instead of Z gave similar results with longer lengths.

3.4. Omnibus Test

Multiple Linear Regression
For this simulation, the x were generated as in Section 3.1 with μ = 0, and then Y = α + x^T β + e, where β = δ 1. Hence H_0: β = 0 is true when δ = 0. The one sample test was applied to the v_i using S_W and σ̂_W. The zero mean errors e_i were iid from five distributions: (i) N(0, 1), (ii) t_3, (iii) EXP(1) − 1, (iv) uniform(−1, 1), and (v) 0.9 N(0, 1) + 0.1 N(0, 100). Only distribution (iii) is not symmetric. With 5000 runs, we would like the coverage to be between 0.04 and 0.06 when δ = 0. In Table 5, the coverage was a bit high when S_W was used (second to last column) instead of σ̂_W (fourth column). Power near 0.95 was good for δ = 0.0035.
Poisson Regression
For this simulation, the x_i were generated in a manner similar to Section 3.1 when the x_i were from a multivariate normal distribution. Let β = δ(1, …, 1, 0, …, 0)^T, where there were k ones and p − k zeros. Then the x_i were scaled such that SP ∼ N(1, 1) when δ = 1. In general, SP ∼ N(1, δ²) for δ ≥ 0. Hence the population Poisson regression was fairly strong for δ = 1 and rather weak for δ = 0.25. Table 6 shows that using σ̂_W controlled the nominal level 0.05 better than using S_W. As p got larger, the power performance could decrease. See line 8 of Table 6.
Sample R code for the above two tables is shown below.

4. Discussion

The omnibus test is resistant to model misspecification. For example, (a) the constant variance multiple linear regression model could be assumed when there is heterogeneity, and (b) for count data, a multiple linear regression model, a negative binomial regression model, or a quasi-Poisson regression model may fit the data much better than the count model actually chosen. The test can also be used in low dimensions when the MLE fails to converge.
Based on the simulations and the theory, (a) the omnibus test and one sample test will not have good power against all alternatives unless σ_W/n → 0 as n, p → ∞. (b) The omnibus test and one sample test tended to have simulated observed level near the nominal level (control the type I error) if σ̂_W was used, but the omnibus test could be conservative if n was small: n = 10 for multiple linear regression and n = 30 for Poisson regression in the simulations. Sometimes σ̂_W exploded if p was large or if H_0 was false. (c) The omnibus test and one sample test have little outlier resistance. Thus it is important to check for outliers before performing the tests. (d) Both tests worked fairly well in simulations for n ≥ 50 and p ≤ 10n, and Ref. [14] used p = 1.5n in their simulations for multiple linear regression.
Right tail tests should be used for μ^T μ since they have more power, but two tail tests are easier to explain and compare. Ref. [14] used the statistic

t_n = (2/(n(n − 1))) ∑_{i<j} [v_i^T v_j + k_n (a^T v_i)(v_j^T a)]

with a = 1/√p, where 1 is the p × 1 vector of ones, and k_n = (p/ln(p))^{1/2}. This statistic can also be used for an omnibus test when β_0 = 0. The extra term was used to increase power and is likely a good idea, but better formulas for V̂_0(t_n) may be needed.
Ref. [28] has many references for high dimensional one and two sample tests. For classification with two groups, let Σ be the pooled covariance matrix. Then β = Σ^{-1}(μ_1 − μ_2) = 0 if and only if μ_1 − μ_2 = 0, which can be tested with a two sample test. For the importance of β in discriminant analysis, see, for example, [34].
Let the “fail to reject region” be the complement of the rejection region. Often the fail to reject region is a confidence region for the parameter or parameter vector of interest, where a confidence interval is a special case of a confidence region. In high dimensions, the length or volume of the fail to reject region does not necessarily converge to 0 as n, p → ∞, and the volume could diverge to ∞ if p/n → ∞. For the one sample test, the test based on the fail to reject region using V̂_0 has much more power than the test using a confidence interval for μ^T μ.
Simulations were done in R. See [35]. The collection of [30] R functions slpack, available from (http://parker.ad.siu.edu/Olive/slpack.txt, accessed on 28 October 2025), has some useful functions for the inference. The function hdomni does the omnibus test. The relevant R code is shown below.
hdomni <- function(x, y, alpha = 0.05){
  n <- nrow(x)
  k <- n * (n - 1)
  xx <- scale(x, scale = FALSE)      # centered but not scaled
  v <- xx * c(y - mean(y))           # rows are v_i = (x_i - xbar)(Y_i - Ybar)
  a <- apply(v, 2, sum)
  Thd <- (t(a) %*% a - sum(v^2)) / k # 1 by 1 matrix
  Thd <- as.double(Thd)              # so the test statistic Thd = Tn is a scalar
  sscp <- v %*% t(v)                 # entries W_ij = v_i^T v_j
  ss <- (sscp - Thd)^2
  vw1 <- (sum(ss) - sum(diag(ss))) / k   # sigma^hat_W^2
  Vohat <- 2 * vw1 / k                   # V^hat_0(Tn)
  Z <- Thd / sqrt(Vohat)
  pval <- 2 * pnorm(-abs(Z))         # two tail p-value
  rpval <- 1 - pnorm(Z)              # right tail p-value
  list(Tn = Thd, Z = Z, pval = pval, rpval = rpval)
}
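A small usage example under H_0 (β = 0), assuming the function above has been sourced:

set.seed(1)
n <- 100; p <- 1000
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)                   # Y does not depend on x, so Sigma_xY = 0
hdomni(x, y)$rpval              # right tail p-value, approximately uniform under H0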
The function hdhot1sim3 was used to simulate the five one sample tests, and was used for Table 1 and Table 2. The function hdhot1sim4 added the test using τ̂². The function hdhot2sim simulates the two sample tests, which apply the fast paired test on the z_i = x_{1i} − x_{2i} for i = 1, …, m, the [25] test, and the two sample [22] test based on subsampling with m_i = floor(2n_i/3) for i = 1, 2. See Table 3. Proofs for Theorems 3 and 4 were not given, but are available from preprints of the corresponding published papers from (http://parker.ad.siu.edu/Olive/preprints.htm, accessed on 28 October 2025).
For Table 4, the function nbinroplssimz was used to create negative binomial regression data sets for finite κ, while the function PRoplssimz was used to create the Poisson regression data sets corresponding to κ = ∞. The functions without the z do not use the Z = log(Y) transformation.
For the omnibus test, the function mlrcovxysim was used for multiple linear regression, while the function prcovxysim was used for Poisson regression.
The spatial sign vectors have some outlier resistance. If the predictor variables are all continuous, the covmb2 and ddplot5 functions are useful for detecting outliers in high dimensions. See [30] (Section 1.4.3). Ref. [36] gave estimators for the variance of U-statistics.

Author Contributions

Conceptualization, A.M.A., P.A.Q. and D.J.O.; methodology, A.M.A., P.A.Q. and D.J.O.; writing-original draft preparation, D.J.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data sets are available from (http://parker.ad.siu.edu/Olive/sldata.txt, accessed on 28 October 2025).

Acknowledgments

The authors thank the editors and referees for their work.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CI    confidence interval
iid    independent and identically distributed
MDPI    Multidisciplinary Digital Publishing Institute
MLR    multiple linear regression
MMLE    marginal maximum likelihood estimator
OLS    ordinary least squares
OPLS    one component partial least squares
SP    sufficient predictor

References

  1. Cook, R.D.; Helland, I.S.; Su, Z. Envelopes and partial least squares regression. J. Roy. Stat. Soc. B 2013, 75, 851–877. [Google Scholar] [CrossRef]
  2. Basa, J.; Cook, R.D.; Forzani, L.; Marcos, M. Asymptotic distribution of one-component partial least squares regression estimators in high dimensions. Can. J. Stat. 2024, 52, 118–130. [Google Scholar] [CrossRef]
  3. Cook, R.D.; Forzani, L. Partial Least Squares Regression: And Related Dimension Reduction Methods; Chapman and Hall/CRC: Boca Raton, FL, USA, 2024. [Google Scholar]
  4. Wold, H. Soft modelling by latent variables: The non-linear partial least squares (NIPALS) approach. J. Appl. Prob. 1975, 12, 117–142. [Google Scholar] [CrossRef]
  5. Olive, D.J.; Zhang, L. One component partial least squares, high dimensional regression, data splitting, and the multitude of models. Commun. Stat. Theory Methods 2025, 54, 130–145. [Google Scholar] [CrossRef]
  6. Olive, D.J.; Alshammari, A.A.; Pathiranage, K.G.; Hettige, L.A.W. Testing with the one component partial least squares and the marginal maximum likelihood estimators. Commun. Stat. Theory Methods 2025. [Google Scholar] [CrossRef]
  7. Fan, J.; Lv, J. Sure independence screening for ultrahigh dimensional feature space. J. Roy. Stat. Soc. B 2008, 70, 849–911. [Google Scholar] [CrossRef]
  8. Fan, J.; Song, R. Sure independence screening in generalized linear models with np-Dimensionality. Ann. Stat. 2010, 38, 3217–3841. [Google Scholar] [CrossRef]
  9. Agresti, A. Categorical Data Analysis, 2nd ed.; Wiley: Hoboken, NJ, USA, 2002. [Google Scholar]
  10. Li, K.C.; Duan, N. Regression analysis under link violation. Ann. Stat. 1989, 17, 1009–1052. [Google Scholar] [CrossRef]
  11. Chen, C.H.; Li, K.C. Can SIR be as popular as multiple linear regression? Stat. Sinica 1998, 8, 289–316. [Google Scholar]
  12. Cameron, A.C.; Trivedi, P.K. Regression Analysis of Count Data; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]
  13. Haggstrom, G.W. Logistic regression and discriminant analysis by ordinary least squares. J. Bus. Econ. Stat. 1983, 1, 229–238. [Google Scholar] [CrossRef]
  14. Zhao, A.; Li, C.; Li, R.; Zhang, Z. Testing high-dimensional regression coefficients in linear models. Ann. Stat. 2024, 52, 2034–2058. [Google Scholar] [CrossRef]
  15. Cui, H.; Guo, W.; Zhong, W. Test for high-dimensional regression coefficients using refitted cross-validation variance estimation. Ann. Stat. 2018, 46, 958–988. [Google Scholar] [CrossRef]
  16. Goeman, J.J.; van de Geer, S.A.; van Houwelingen, H.C. Testing against a high dimensional alternative. J. R. Stat. Soc. B 2006, 68, 477–493. [Google Scholar] [CrossRef]
  17. Lan, W.; Wang, H.; Tsai, C.-L. Testing covariates in high-dimensional regression. Ann. Inst. Statist. Math. 2014, 66, 279–301. [Google Scholar] [CrossRef]
  18. Zhong, P.-S.; Chen, S.X. Tests for high dimensional regression coefficients with factorial designs. J. Amer. Stat. Assoc. 2011, 106, 260–274. [Google Scholar] [CrossRef]
  19. Helland, I.S. Partial least squares regression and statistical models. Scand. J. Stat. 1990, 17, 97–114. [Google Scholar]
  20. Park, J.; Ayyala, D.N. A test for the mean vector in large dimension and small samples. J. Stat. Plan. Inf. 2013, 143, 929–943. [Google Scholar] [CrossRef]
  21. Lehmann, E.L. Nonparametrics: Statistical Methods Based on Ranks; Holden-Day: San Francisco, CA, USA, 1975. [Google Scholar]
  22. Chen, S.X.; Qin, Y.L. A two sample test for high-dimensional data with applications to gene-set testing. Ann. Stat. 2010, 38, 808–835. [Google Scholar] [CrossRef]
  23. Srivastava, M.S.; Du, M. A test for the mean vector with fewer observations than the dimension. J. Mult. Anal. 2008, 99, 386–402. [Google Scholar] [CrossRef]
  24. Bai, Z.D.; Saranadasa, H. Effects of high dimension: By an example of a two sample problem. Stat. Sinica 1996, 6, 311–329. [Google Scholar]
  25. Li, J. Finite sample t-tests for high-dimensional means. J. Mult. Anal. 2023, 196, 105183. [Google Scholar] [CrossRef] [PubMed]
  26. Wang, L.; Peng, B.; Li, R. A high-dimensional nonparametric multivariate test for mean vector. J. Am. Stat. Assoc. 2015, 110, 1658–1669. [Google Scholar] [CrossRef]
  27. Anderson, T.W. An Introduction to Multivariate Statistical Analysis, 2nd ed. Wiley: New York, NY, USA, 1984.
  28. Abid, A.M. Some Simple High Dimensional One and Two Sample Tests. Ph.D. Thesis, Southern Illinois University, Carbondale, IL, USA, 2025. Available online: http://parker.ad.siu.edu/Olive/sAhlam.pdf (accessed on 28 October 2025).
  29. Olive, D.J. Some useful techniques for high dimensional statistics. Stats 2025, 8, 60. [Google Scholar] [CrossRef]
  30. Olive, D.J. Prediction and Statistical Learning, Online Course Notes. 2025. Available online: http://parker.ad.siu.edu/Olive/slearnbk.htm (accessed on 28 October 2025).
  31. Hebbler, B. Statistics of Prussia. J. Roy. Stat. Soc. A 1847, 10, 154–186. [Google Scholar] [CrossRef]
  32. Wisseman, S.U.; Hopke, P.K.; Schindler-Kaudelka, E. Multielemental and multivariate analysis of Italian terra sigillata in the world heritage museum, university of Illinois at Urbana-Champaign. Archeomaterials 1987, 1, 101–107. [Google Scholar]
  33. Pötscher, B.M.; Preinerstorfer, D. How reliable are bootstrap-based heteroskedasticity robust tests? Econ. Theory 2023, 39, 789–847. [Google Scholar] [CrossRef]
  34. Wang, Y.; Wu, Z.; Wang, C. High dimensional discriminant analysis under weak sparsity. Commun. Stat. Theory Methods 2025, 54, 2657–2674. [Google Scholar] [CrossRef]
  35. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024. [Google Scholar]
  36. Xu, T.; Zhu, R.; Shao, X. On variance estimation of random forests with infinite-order U-statistics. Electr. J. Stat. 2024, 18, 2135–2207. [Google Scholar] [CrossRef]
Table 1. One sample tests, covtyp = 1, cov = observed type I error for δ = 0 and power for δ > 0. Boldface for good performance not including spatial.

n | p | ψ | xtype | δ | Boot | (w̄, S_W) | (T_n, σ̂_W) | (T_n, S_W) | Spatial
100 | 100 | 0 | 1 | 0 | 0.0230 | 0.0580 | 0.0400 | 0.0452 | 0.0444
len | | | | | 0.6732 | 5.6520 | 0.5711 | 0.5681 | 0.0057
100 | 100 | 0 | 1 | 0.075 | 0.8160 | 0.0688 | 0.9216 | 0.9176 | 0.9166
len | | | | | 0.8081 | 5.7018 | 0.5741 | 0.5731 | 0.0057
100 | 100 | 0 | 2 | 0 | 0.0236 | 0.0436 | 0.0466 | 0.0776 | 0.0478
len | | | | | 7.0590 | 58.2593 | 6.0094 | 5.8553 | 0.0057
100 | 100 | 0 | 2 | 0.15 | 0.1938 | 0.0506 | 0.3128 | 0.3490 | 0.9988
len | | | | | 7.5830 | 58.1417 | 6.0204 | 5.8435 | 0.0057
100 | 100 | 0 | 3 | 0 | 0.0222 | 0.0466 | 0.0450 | 0.0680 | 0.0468
len | | | | | 1.3031 | 10.6946 | 1.1140 | 1.0749 | 0.0057
100 | 100 | 0 | 3 | 0.1 | 0.7536 | 0.0544 | 0.8720 | 0.8714 | 0.9956
len | | | | | 1.5563 | 10.8976 | 1.1260 | 1.0953 | 0.0057
100 | 100 | 0 | 4 | 0 | 0.0206 | 0.0556 | 0.0372 | 0.0656 | 0.0906
len | | | | | 3.1105 | 25.4558 | 2.6543 | 2.5584 | 0.0057
100 | 100 | 0 | 4 | 0.17 | 0.9024 | 0.0546 | 0.9622 | 0.9496 | 0.7668
len | | | | | 3.7816 | 25.5420 | 2.6708 | 2.5671 | 0.0057
100 | 1000 | 0 | 1 | 0 | 0.0236 | 0.0482 | 0.0448 | 0.0506 | 0.0506
len | | | | | 2.1403 | 17.8302 | 1.8059 | 1.7920 | 0.0018
100 | 1000 | 0 | 1 | 0.0415 | 0.872 | 0.068 | 0.9438 | 0.9398 | 0.9388
len | | | | | 2.2771 | 17.9004 | 1.8089 | 1.7991 | 0.0018
100 | 1000 | 0 | 2 | 0 | 0.0236 | 0.0448 | 0.0458 | 0.0712 | 0.0558
len | | | | | 22.4434 | 185.1105 | 19.0973 | 18.6043 | 0.0018
100 | 1000 | 0 | 2 | 0.075 | 0.142 | 0.0480 | 0.2222 | 0.2616 | 0.9978
len | | | | | 22.8203 | 182.6556 | 18.9772 | 18.3576 | 0.0018
100 | 1000 | 0 | 3 | 0 | 0.0214 | 0.0432 | 0.0436 | 0.0650 | 0.0450
len | | | | | 4.1649 | 34.1708 | 3.5444 | 3.4343 | 0.0018
100 | 1000 | 0 | 3 | 0.05 | 0.6458 | 0.0558 | 0.7642 | 0.7770 | 0.9908
len | | | | | 4.3708 | 34.0483 | 3.5586 | 3.4220 | 0.0018
100 | 1000 | 0 | 4 | 0 | 0.0192 | 0.0544 | 0.0378 | 0.0518 | 0.0484
len | | | | | 9.9417 | 82.3953 | 8.4267 | 8.2810 | 0.0018
100 | 1000 | 0 | 4 | 0.087 | 0.8430 | 0.0576 | 0.9282 | 0.9242 | 0.8774
len | | | | | 10.5664 | 82.8816 | 8.4523 | 8.3299 | 0.0018
Table 2. One sample tests, covtyp = 2, p = 10,000, cov = observed type I error for δ = 0 and power for δ > 0. Boldface for good performance not including spatial.

n | p | ψ | xtype | δ | Boot | (w̄, S_W) | (T_n, σ̂_W) | (T_n, S_W) | Spatial
100 | 10,000 | 0 | 1 | 0 | 0.0272 | 0.0536 | 0.045 | 0.0502 | 0.0496
len | | | | | 39,006.52 | 326,271.5 | 32,976.41 | 32,791.52 | 0.0007
100 | 10,000 | 0 | 1 | 1.69 | 0.8482 | 0.0582 | 0.93 | 0.9294 | 0.9286
len | | | | | 39,690.34 | 327,648.8 | 32,994.63 | 32,929.94 | 0.0007
100 | 10,000 | 0 | 2 | 0 | 0.0244 | 0.0442 | 0.0486 | 0.0876 | 0.0526
len | | | | | 408,860 | 3,330,506 | 347,476 | 334,728.5 | 0.0007
100 | 10,000 | 0 | 2 | 3 | 0.1126 | 0.0488 | 0.1778 | 0.2148 | 0.9952
len | | | | | 411,196.1 | 3,349,674 | 347,862.6 | 336,654.9 | 0.0007
100 | 10,000 | 0 | 3 | 0 | 0.0206 | 0.044 | 0.0436 | 0.0632 | 0.051
len | | | | | 75,976.41 | 624,134.1 | 64,858.9 | 62,727.84 | 0.0007
100 | 10,000 | 0 | 3 | 2.5 | 0.8918 | 0.0608 | 0.9462 | 0.9454 | 1
len | | | | | 77,389.1 | 625,801.9 | 64,740.62 | 62,895.46 | 0.0007
100 | 10,000 | 0 | 4 | 0 | 0.0236 | 0.0534 | 0.038 | 0.0444 | 0.0454
len | | | | | 181,871.7 | 1,517,807 | 154,052.2 | 152,545.3 | 0.0007
100 | 10,000 | 0 | 4 | 3.80 | 0.8952 | 0.0578 | 0.9558 | 0.9522 | 0.948
len | | | | | 185,192.4 | 1,518,189 | 154,094.1 | 152,583.7 | 0.0007
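Tables 1 and 2 compare one sample tests of H0: μ = 0 versus HA: μ ≠ 0 applied to w_1, ..., w_n. For readers who wish to experiment, the following R sketch gives one simple instance of such a test; the function name one.sample.test and the scalar reduction u_i = 1^T w_i are illustrative assumptions, and the statistics need not match the (w̄, S_W), (T_n, σ̂_W), (T_n, S_W), bootstrap, or spatial tests tabulated above.

```r
# Illustrative one sample high dimensional test of H0: mu = 0.
# Each row of w is one observation w_i; reduce it to the scalar u_i = 1^T w_i,
# then test H0: E(u_i) = 0 with an ordinary t-test and, as a stand-in for a
# bootstrap test, a 95% percentile bootstrap interval for E(u_i).
one.sample.test <- function(w, B = 1000) {
  u <- rowSums(as.matrix(w))               # scalar reduction u_i = 1^T w_i
  tpval <- t.test(u, mu = 0)$p.value       # t-test applied to u_1, ..., u_n
  ustar <- replicate(B, mean(sample(u, replace = TRUE)))
  ci <- quantile(ustar, c(0.025, 0.975))   # percentile bootstrap CI for E(u_i)
  list(t.pvalue = tpval, boot.rejects = unname(ci[1] > 0 | ci[2] < 0))
}

set.seed(1)
w <- matrix(rnorm(100 * 1000), nrow = 100)  # n = 100, p = 1000, H0 true
one.sample.test(w)  # t.pvalue roughly uniform under H0; boot.rejects usually FALSE
```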
Table 3. Two sample tests, covtyp = 1, cov = observed type I error for δ = 0 and power for δ > 0. Boldface for better performance.

(n_1, n_2, σ, p) | xtype | covtype | δ | Boot | Pair | Li
(100, 100, 1, 100) | 1 | 1 | 0 | 0.0246 | 0.0494 | 0.0494
len | 1 | 1 | 0 | 1.3426 | 1.1389 | 1.1389
(100, 100, 1, 100) | 1 | 1 | 0.1 | 0.7224 | 0.8586 | 0.8586
len | 1 | 1 | 0.1 | 1.5789 | 1.1417 | 1.1417
(100, 200, 1, 100) | 1 | 1 | 0 | 0.0256 | 0.0456 | 0.0462
len | 1 | 1 | 0 | 1.0019 | 1.1360 | 0.8535
(100, 200, 1, 100) | 1 | 1 | 0.1 | 0.9166 | 0.8602 | 0.9612
len | 1 | 1 | 0.1 | 1.2396 | 1.1432 | 0.8609
Table 4. Cov(x, Z), n = 100, p = 100, k = 1; want cov > 0.94 except for mincov and cov96.

κ | mincov | cov90 | cov92 | cov93 | cov94 | cov96 | testcov
0 | 0.0000 | 0.9899 | 0.9899 | 0.9899 | 0.9899 | 0.8081 | 0.9438
len | 0.4166 | 0.4187 | 0.4875
0.5 | 0.0062 | 0.9899 | 0.9899 | 0.9899 | 0.9899 | 0.7576 | 0.9440
len | 0.5050 | 0.5084 | 0.5686
1 | 0.0000 | 0.9899 | 0.9899 | 0.9899 | 0.9899 | 0.7475 | 0.9410
len | 0.4809 | 0.4834 | 0.5421
10 | 0.0000 | 0.9899 | 0.9899 | 0.9899 | 0.9899 | 0.6970 | 0.9412
len | 0.4258 | 0.4279 | 0.4929
100 | 0.0000 | 0.9899 | 0.9899 | 0.9899 | 0.9899 | 0.6566 | 0.9464
len | 0.4174 | 0.4195 | 0.4882
Table 5. Omnibus test for multiple linear regression, cov = observed type I error for δ = 0 and power for δ > 0. Boldface for good performance.

(n, p) | (xtype, etype, ψ) | δ | cov | len | S_W cov | S_W len
(100, 100) | (1, 1, 0) | 0 | 0.0574 | 6.3950 | 0.0710 | 5.9478
(100, 100) | (1, 1, 0) | 0.0035 | 0.9524 | 9.0769 | 0.9542 | 8.3259
(100, 100) | (1, 1, 0.1) | 0 | 0.0524 | 6.3914 | 0.0684 | 6.0284
(100, 100) | (1, 1, 0.1) | 0.0035 | 0.9592 | 9.1093 | 0.9550 | 8.3757
(200, 100) | (1, 1, 0) | 0 | 0.0484 | 3.2456 | 0.0586 | 3.1284
(500, 100) | (1, 1, 0) | 0 | 0.0488 | 1.3147 | 0.0548 | 1.2821
(100, 100) | (3, 2, 0) | 0 | 0.0518 | 30.055 | 0.1394 | 24.370
(100, 500) | (3, 2, 0) | 0 | 0.0466 | 657.47 | 0.1320 | 524.95
(50, 500) | (2, 4, 0) | 0 | 0.0572 | 940.29 | 0.1372 | 799.28
(50, 500) | (2, 4, 0) | 0.0001 | 0.9400 | 2587.0 | 0.9600 | 1547.5
(10, 500) | (2, 4, 0) | 0 | 0.0258 | 4417.8 | 0.1546 | 3734.8
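As outlined in the Abstract, the omnibus test reported in Tables 5 and 6 applies a one sample mean test to v_i = (x_i − x̄)(Y_i − Ȳ), since β = 0 if and only if Σ_xY = 0 for these models. A minimal R sketch of this construction follows; the function name omnibus.reg.test is an illustrative assumption, and the t-test on the scalar reduction 1^T v_i is a simple stand-in for the statistics behind the cov and len columns above.

```r
# Sketch of the omnibus test of H0: beta = 0 for a model with SP = alpha + x^T beta.
# Form v_i = (x_i - xbar)(Y_i - Ybar), then apply a one sample mean test to the v_i;
# here the scalar reduction u_i = 1^T v_i with a t-test is used for illustration.
omnibus.reg.test <- function(x, y) {
  x <- as.matrix(x); y <- as.numeric(y)
  v <- sweep(x, 2, colMeans(x)) * (y - mean(y))  # rows are v_i^T (an n x p matrix)
  t.test(rowSums(v), mu = 0)$p.value             # small p-value rejects H0: beta = 0
}

set.seed(1)
n <- 100; p <- 100
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(rep(0.1, 5), rep(0, p - 5))) + rnorm(n)  # five nonzero slopes
omnibus.reg.test(x, y)  # small p-value expected since beta != 0
```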
Table 6. Omnibus test for Poisson regression, cov = observed type I error for δ = 0 and power for δ > 0. Boldface for good performance.

(n, p) | (k, ψ) | δ | cov | len | S_W cov | S_W len
(100, 100) | (100, 0) | 0 | 0.0448 | 0.0151 | 0.0878 | 0.0144
(100, 100) | (100, 0) | 0.7 | 0.9318 | 0.0616 | 0.9184 | 0.0548
(100, 100) | (100, 0.1) | 0 | 0.0568 | 0.0015 | 0.0672 | 0.0014
(100, 100) | (100, 0.1) | 0.25 | 0.9758 | 0.0024 | 0.9742 | 0.0021
(200, 100) | (100, 0) | 0 | 0.0438 | 0.0075 | 0.0652 | 0.0073
(500, 100) | (100, 0) | 0 | 0.0484 | 0.0030 | 0.0606 | 0.0030
(100, 200) | (100, 0) | 0 | 0.0478 | 0.0215 | 0.0852 | 0.0205
(100, 500) | (100, 0) | 2 | 0.3196 | 61.606 | 0.6008 | 38.942
(100, 500) | (100, 0) | 0 | 0.0502 | 0.0344 | 0.0852 | 0.0330
(50, 500) | (100, 0) | 0 | 0.0370 | 0.0741 | 0.0906 | 0.0702
(30, 500) | (100, 0) | 0 | 0.0232 | 0.1413 | 0.0896 | 0.1302
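Since the statistic depends on the data only through the v_i, the same sketch applies unchanged to count responses. A brief hypothetical Poisson check under H0 (not one of the settings in Table 6), reusing omnibus.reg.test from the sketch after Table 5:

```r
# Poisson responses with beta = 0: Y is independent of x, so H0 holds and the
# p-value from the omnibus sketch should be roughly uniform over repeated runs.
set.seed(2)
n <- 100; p <- 100
x <- matrix(rnorm(n * p), n, p)
y <- rpois(n, lambda = exp(1))  # SP = alpha = 1 and beta = 0
omnibus.reg.test(x, y)
```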