Article

Optimal Concordant Tests

College of Health Solutions, Arizona State University, 425 N. 5th Street, #137, Phoenix, AZ 85004, USA
Appl. Sci. 2024, 14(11), 4536; https://doi.org/10.3390/app14114536
Submission received: 25 April 2024 / Revised: 16 May 2024 / Accepted: 23 May 2024 / Published: 25 May 2024
(This article belongs to the Special Issue Applied Biostatistics: Challenges and Opportunities)

Abstract

In meta-analyses, unlike model-based methods, such as fixed- or random-effect models, the p-value combining methods are distribution-free and robust. How to appropriately and powerfully combine p-values obtained from various sources remains an important but challenging topic in statistical inference. For cases where all or a majority of the individual alternative hypotheses have the same but unknown direction, concordant tests based on one-sided p-values can substantially improve the detecting power. However, there exists no test that is uniformly most powerful; therefore, figuring out how to choose a robust and powerful test to combine one-sided p-values for a given data set is desirable. In this paper, we propose and study a class of gamma distribution-based concordant tests. Those concordant tests are optimal under specific conditions. An asymptotically optimal concordant test is also studied. The excellent performances of the proposed tests were demonstrated through a numeric simulation study and real data example.

1. Introduction

Due to recent technical developments, large-volume data, including genome-wide genomic data, can be generated rapidly and at low cost. As a consequence, powerful statistical approaches, including p-value combination tests, are highly desirable for analyzing these data sets. For instance, meta-analyses, when applied to genome-wide association studies (GWASs), have discovered many associated genetic variants that could not be identified from a single GWAS. With the development and application of advanced powerful p-value combination tests, we expect that many more causal genetic variants will be identified from existing data.
In meta-analyses, p-value combination methods are important alternatives to model-based techniques [1,2], especially when the fixed- or random-effect models have a lack of fit [3]. In the literature, many p-value combination methods, although first proposed a long time ago, are still widely used nowadays. These methods include popular ones: Fisher test [4], Pearson test [5], minimal p-test [6], z-test [7], and chi-square test [8,9]. Recently, we studied a class of p-value combination tests based on gamma distribution, which include those popular tests as special cases [10]. However, as noticed by Birnbaum, there exists no uniformly most powerful (UMP) p-value combination test for all conditions [11]. Therefore, robust methods that have good power under many conditions are desirable [12].
Under some conditions, we may know or reasonably assume that all or most of the true effects under the individual alternative hypotheses have the same but unknown direction. In this case, the concordant tests which use one-sided p-values have been shown to be more powerful than those methods based on two-sided p-values, such as the chi-square test [1]. For instance, in GWASs, studies with the same phenotype are usually conducted among several different subpopulations, from which the effect of a causal genetic variant may have the same direction but various sizes. Then, if the commonly used fixed- and random-effect meta-analyses are applied to combine the results from individual studies, they may lose the power to detect the associated genetic variants. To improve the detecting power, in this paper, we study a class of concordant tests based on gamma distribution, and then an optimal concordant test based on constrained likelihood ratio test (CLRT) is developed.
The rest of the manuscript is organized as follows. In Section 2, we first introduce the Pearson’s concordant test, which was studied by Owen [1] (Section 2.1). In Section 2.2, we describe our proposed generalized concordant tests based on gamma distribution and study their properties as UMP tests under some conditions. The asymptotically UMP concordant test based on CLRT is proposed and studied in Section 2.3. In Section 3, we compare the performances of the proposed concordant tests with some existing methods through a simulation study. In addition, an example of a real data application is illustrated to demonstrate the desired performance of the proposed tests. This paper concludes in Section 4 with a discussion and conclusions.

2. Methods

Suppose we have $n$ independent studies. For each study $i$ ($i = 1, \ldots, n$), we can perform a two-sample test of the null hypothesis, $H_{i0}: \mu_i = 0$, against the alternative hypothesis, $H_{i1}: \mu_i \neq 0$, where $\mu_i$ is the mean difference between two populations. The corresponding p-value is denoted as $P_i$ ($i = 1, \ldots, n$). Throughout this paper, we assume $P_i \sim U(0,1)$ under $H_{i0}$, where $U(0,1)$ stands for the uniform distribution between 0 and 1. In a meta-analysis, we consider testing the global null hypothesis, $H_0 = \bigcap_{i=1}^{n} H_{i0}$, vs. the global alternative hypothesis, $H_1 = \bigcup_{i=1}^{n} H_{i1}$, using the $n$ observed p-values.
For a concordant test, we assume, under the global alternative, that the majority of the non-zero mean differences have the same but unknown direction (positive or negative): $\mu_i \geq 0$ with at least one strict inequality, or $\mu_i \leq 0$ with at least one strict inequality ($i = 1, \ldots, n$). Although, under this setting, a two-sided p-value can still be calculated from each data set and then used in p-value combination, in this paper we focus on concordant tests that use one-sided p-values, $P_i$, obtained from the directional alternatives $\mu_i > 0$ for all $i$ or $\mu_i < 0$ for all $i$ ($i = 1, \ldots, n$). In general, either the right- or left-sided p-values can be used, as they will produce the same result. Without loss of generality, we use the right-sided individual p-values hereafter, unless otherwise specified.

2.1. Pearson’s Concordant Test

Owen [1] revisited and studied the following Pearson’s concordant test, whose test statistic is defined as follows:
$$Q_C = \max\left(-2\sum_{i=1}^{n} \ln P_i,\; -2\sum_{i=1}^{n} \ln(1 - P_i)\right), \qquad (1)$$
where $P_i$ ($i = 1, \ldots, n$) are the one-sided p-values obtained from the same directional alternatives. The two components of $Q_C$, $-2\sum_{i=1}^{n} \ln P_i$ and $-2\sum_{i=1}^{n} \ln(1 - P_i)$, are obtained through the popular Fisher test for combining independent p-values [4]. Therefore, under the global null hypothesis, each component is a random variable, $\chi^2_{2n}$, which has a chi-square distribution with $2n$ degrees of freedom. Furthermore, although $-2\sum_{i=1}^{n} \ln P_i$ and $-2\sum_{i=1}^{n} \ln(1 - P_i)$ are not independent, the p-value of the Pearson’s concordant test defined in (1) can be easily approximated using the upper bound, $2P[\chi^2_{2n} > q_C]$, where $q_C$ is the observed test statistic. In addition, the approximation is very accurate when the true p-value is small [1].
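As a minimal sketch of the statistic in (1) and its upper-bound p-value, the Python snippet below uses only the standard library. The function names `chi2_sf_even` and `pearson_concordant` are illustrative choices, not taken from the paper; the chi-square tail probability uses the closed form that is available when the degrees of freedom are even, which is always the case for $2n$.

```python
import math


def chi2_sf_even(q, df):
    """P[chi-square(df) > q] for a positive even df, via the closed-form Poisson sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if q <= 0:
        return 1.0
    half = q / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (q/2)^k / k!
        total += term
    return math.exp(-half) * total


def pearson_concordant(p_values):
    """Return (q_C, upper-bound p-value) for one-sided p-values strictly inside (0, 1)."""
    n = len(p_values)
    stat_p = -2.0 * sum(math.log(p) for p in p_values)        # Fisher statistic on P_i
    stat_1mp = -2.0 * sum(math.log1p(-p) for p in p_values)   # Fisher statistic on 1 - P_i
    q_c = max(stat_p, stat_1mp)
    p_upper = min(1.0, 2.0 * chi2_sf_even(q_c, 2 * n))        # bound 2 P[chi2_{2n} > q_C]
    return q_c, p_upper
```

For example, `pearson_concordant([0.01, 0.03, 0.20])` and `pearson_concordant([0.99, 0.97, 0.80])` return the same pair up to rounding, which reflects the statement that right- and left-sided p-values lead to the same result.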

2.2. Concordant Tests Based on Gamma Distribution

The above Pearson’s concordant test can be extended to concordant tests based on gamma distribution. For this purpose, we first study a class of combination tests based on gamma distribution [10].
For a given shape parameter $\alpha$ and scale parameter $\beta$, a p-value combination test statistic based on the gamma distribution can be constructed as follows:
$$T_{G(\alpha,\beta)}(P_1, P_2, \ldots, P_n) = \sum_{i=1}^{n} F_{G(\alpha,\beta)}^{-1}(1 - P_i), \qquad (2)$$
where $F_{G(\alpha,\beta)}^{-1}(y)$ is the inverse function of the cumulative distribution function (CDF), $F_{G(\alpha,\beta)}(x)$, of the random variable $\mathrm{Gamma}(\alpha,\beta)$, whose probability density function (PDF) is $f_{G(\alpha,\beta)}(x) = \beta^{\alpha} x^{\alpha-1} \exp(-\beta x)/\Gamma(\alpha)$ for $x > 0$, where the gamma function is $\Gamma(z) = \int_0^{\infty} x^{z-1} e^{-x}\,dx$. The right-sided p-value of the test defined in (2) can be defined as follows:
$$P = P[\mathrm{Gamma}(n\alpha,\beta) > t] = 1 - F_{G(n\alpha,\beta)}(t) = S_{G(n\alpha,\beta)}(t), \qquad (3)$$
where $t$ is the observed value of the test statistic; $\mathrm{Gamma}(n\alpha,\beta)$ is the random variable having a gamma distribution with the shape parameter, $n\alpha$, and the scale parameter, $\beta$; and $F_{G(n\alpha,\beta)}$ and $S_{G(n\alpha,\beta)}$ are the CDF and the survival function of $\mathrm{Gamma}(n\alpha,\beta)$.
It can be shown that, due to the property of the gamma distribution, the parameter $\beta$ has no effect on the test defined in (2) and (3) [10]. Therefore, for simplicity, we use $\beta = 1$ in the above test and denote $T_{G(\alpha,1)}$ as $T_{G(\alpha)}$ hereafter. Furthermore, we can prove the following properties for the above class of gamma distribution-based tests [10]: (i) $T_{G(0)} \equiv \lim_{\alpha \to 0^{+}} T_{G(\alpha)}$ is equivalent to the Tippett’s minimal-p test; (ii) $T_{G(0.5)}$ is the same as the chi-square test with degrees of freedom equal to $n$; (iii) $T_{G(1)}$ is equivalent to the Fisher test; and (iv) $T_{G(\infty)} \equiv \lim_{\alpha \to \infty} T_{G(\alpha)}$ is the same as the z-test. Therefore, the class of gamma distribution-based tests, $T_{G(\alpha)}$, includes many popular tests as special cases. This observation motivates us to generalize the Pearson’s concordant test.
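To make (2) and (3) concrete, here is a rough standard-library sketch with $\beta = 1$. The helpers `gamma_cdf` and `gamma_ppf` are hand-written for illustration (a power series for the regularized incomplete gamma function and a bisection inverse); they are assumptions of this sketch, adequate for moderate p-values but not tuned for extreme tails, and are not the implementation used in the paper.

```python
import math


def gamma_cdf(x, shape):
    """CDF of Gamma(shape, 1): regularized lower incomplete gamma via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    k = 0
    while k < 10000:
        k += 1
        term *= x / (shape + k)
        total += term
        if term < total * 1e-16:
            break
    log_prefactor = -x + shape * math.log(x) - math.lgamma(shape)
    return min(1.0, math.exp(log_prefactor) * total)


def gamma_ppf(u, shape):
    """Inverse CDF of Gamma(shape, 1) for 0 < u < 1, found by bisection."""
    lo, hi = 0.0, shape + 1.0
    while gamma_cdf(hi, shape) < u:   # expand the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_combination_test(p_values, alpha):
    """Return (T_G(alpha), p-value) following (2) and (3) with beta = 1."""
    n = len(p_values)
    t = sum(gamma_ppf(1.0 - p, alpha) for p in p_values)
    return t, 1.0 - gamma_cdf(t, n * alpha)
```

With `alpha = 1.0` the gamma score reduces to $-\ln P_i$, so the result should agree with the Fisher test, in line with property (iii). In practice a numerical library (for instance SciPy's `scipy.stats.gamma.ppf` and `scipy.stats.gamma.sf`) would be a more robust replacement for the two helpers.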
We propose the gamma distribution-based concordant test with the following test statistic:
$$T_{G(\alpha)}^{C} = \max\left(T_{G(\alpha)}^{L},\; T_{G(\alpha)}^{R}\right), \qquad (4)$$
where $T_{G(\alpha)}^{L} = T_{G(\alpha)}(P_1, P_2, \ldots, P_n) = \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1 - P_i)$ and $T_{G(\alpha)}^{R} = T_{G(\alpha)}(1 - P_1, 1 - P_2, \ldots, 1 - P_n) = \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(P_i)$ are the gamma distribution-based tests using the left- and right-sided p-values, respectively.
It is easy to see that when $\alpha = 1$, $T_{G(\alpha)}^{C}$ is equivalent to the Pearson’s concordant test $Q_C$. Therefore, $T_{G(\alpha)}^{C}$ is a generalization of the Pearson’s concordant test. We show that the p-value for the proposed concordant test, $T_{G(\alpha)}^{C}$, can also be estimated in a similar way as for $Q_C$: $P[T_{G(\alpha)}^{C} > A] \approx 2P[T_{G(\alpha)}^{L} > A]$. Now, we study the properties of $T_{G(\alpha)}^{C}$. First, we need the following definitions and lemmas [13].
Definition 1. 
A function $f$ on $\mathbb{R}^n$ is nondecreasing if it is nondecreasing in each of its $n$ arguments when the other $n - 1$ values are held fixed.
Definition 2.
Let $X_1, \ldots, X_n$ be random variables with a joint distribution, and write $X = (X_1, \ldots, X_n)$. These random variables are associated if the covariance $\mathrm{Cov}(f(X), g(X)) \geq 0$ holds for all nondecreasing functions, $f$ and $g$, for which the covariance is defined.
Lemma 1. 
(Theorem 2.1 of Esary, Proschan, and Walkup 1967 [13]) Independent random variables are associated.
Lemma 2. 
(Theorem 5.1 of Esary, Proschan, and Walkup 1967 [13]) Let $X_1, \ldots, X_n$ be associated random variables, $X = (X_1, \ldots, X_n)$, and $S_i = f_i(X)$ with each $f_i$ nondecreasing, $i = 1, \ldots, k$. Then $P[S_1 \leq s_1, \ldots, S_k \leq s_k] \geq \prod_{i=1}^{k} P[S_i \leq s_i]$ and $P[S_1 > s_1, \ldots, S_k > s_k] \geq \prod_{i=1}^{k} P[S_i > s_i]$ for all $s_1, s_2, \ldots, s_k$.
Lemma 3. 
For integer $n \geq 1$, let $X = (X_1, \ldots, X_n) \in (0,1)^n$ have independent components. Set $T_{G(\alpha)}^{L} = T_{G(\alpha)}(X_1, \ldots, X_n)$ and $T_{G(\alpha)}^{R} = T_{G(\alpha)}(1 - X_1, \ldots, 1 - X_n)$. Then, for any $A_L > 0$ and $A_R > 0$, $P[T_{G(\alpha)}^{L} > A_L,\, T_{G(\alpha)}^{R} > A_R] \leq P[T_{G(\alpha)}^{L} > A_L]\, P[T_{G(\alpha)}^{R} > A_R]$.
Proof.
Denote $f(X) = -T_{G(\alpha)}^{L} = -\sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1 - X_i)$ and $g(X) = T_{G(\alpha)}^{R} = \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(X_i)$. It is easy to see that both $f$ and $g$ are nondecreasing functions of $X$. Moreover, the components of $X$ are independent and hence associated (Lemma 1). Therefore, by Lemma 2, we have
$$\begin{aligned}
P[T_{G(\alpha)}^{L} > A_L,\, T_{G(\alpha)}^{R} > A_R]
&= P[-f(X) > A_L,\; g(X) > A_R] = P[f(X) < -A_L,\; g(X) > A_R] \\
&= P[g(X) > A_R] - P[f(X) \geq -A_L,\; g(X) > A_R] \\
&\leq P[g(X) > A_R] - P[f(X) \geq -A_L]\, P[g(X) > A_R] \\
&= P[g(X) > A_R]\, P[f(X) < -A_L] = P[T_{G(\alpha)}^{R} > A_R]\, P[T_{G(\alpha)}^{L} > A_L].
\end{aligned}$$ □
The concordant test, $T_{G(\alpha)}^{C}$, defined in (4) has the following property.
Theorem 1. 
Let $X = (X_1, \ldots, X_n) \in (0,1)^n$ be a random vector with independent components, and denote $T_{G(\alpha)}^{L} = T_{G(\alpha)}(X_1, \ldots, X_n)$, $T_{G(\alpha)}^{R} = T_{G(\alpha)}(1 - X_1, \ldots, 1 - X_n)$, and $T_{G(\alpha)}^{C} = \max(T_{G(\alpha)}^{L}, T_{G(\alpha)}^{R})$. For any $A \in \mathbb{R}$, let $P_L = P[T_{G(\alpha)}^{L} > A]$, $P_R = P[T_{G(\alpha)}^{R} > A]$, and $P_C = P[T_{G(\alpha)}^{C} > A]$. Then $P_L + P_R - P_L P_R \leq P_C \leq P_L + P_R$.
Proof.
From Lemmas 2 and 3, we have $P_C = P[T_{G(\alpha)}^{C} > A] = P[T_{G(\alpha)}^{L} > A \text{ or } T_{G(\alpha)}^{R} > A] = P[T_{G(\alpha)}^{L} > A] + P[T_{G(\alpha)}^{R} > A] - P[T_{G(\alpha)}^{L} > A,\, T_{G(\alpha)}^{R} > A] \geq P[T_{G(\alpha)}^{L} > A] + P[T_{G(\alpha)}^{R} > A] - P[T_{G(\alpha)}^{L} > A]\, P[T_{G(\alpha)}^{R} > A] = P_L + P_R - P_L P_R$. On the other hand, from the Bonferroni inequality, we have $P_C \leq P[T_{G(\alpha)}^{L} > A] + P[T_{G(\alpha)}^{R} > A] = P_L + P_R$. □
Under the condition specified in Theorem 1 (i.e., the global null hypothesis), it is easy to see that $T_{G(\alpha)}^{L}$ and $T_{G(\alpha)}^{R}$ have the same distribution; hence, $P[T_{G(\alpha)}^{L} > A] = P[T_{G(\alpha)}^{R} > A]$.
From Theorem 1, we have the following result.
Corollary 1. 
Suppose $X \sim U(0,1)^n$. For $A > 0$, let $p_A = P[T_{G(\alpha)}^{L} > A]$; then $2p_A - p_A^2 \leq P[T_{G(\alpha)}^{C} > A] \leq 2p_A$.
From Corollary 1, the p-value of the concordant test, $T_{G(\alpha)}^{C}$, defined in (4), $P[T_{G(\alpha)}^{C} > A]$, can be easily approximated by twice the p-value from $T_{G(\alpha)}^{L}$, i.e., $P[T_{G(\alpha)}^{C} > A] \approx 2P[T_{G(\alpha)}^{L} > A]$. The approximation is very accurate if the true p-value (e.g., the p-value from the sampling distribution under the alternative) is small, as $p_A^2$ is then much smaller still.
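The bounds in Corollary 1 can be checked numerically. The sketch below does this by Monte Carlo for the special case $\alpha = 1$ (where the gamma score is $-\ln X_i$ and the null tail probability has a closed form); the choices $n = 5$, $A = 9$, the number of replicates, and the seed are arbitrary illustration values, and the function names are hypothetical.

```python
import math
import random


def gamma_sf_int(t, n):
    """P[Gamma(shape=n, rate=1) > t] for a positive integer n (closed form)."""
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= t / k
        total += term
    return math.exp(-t) * total


def open_uniform(rng):
    """Draw from U(0, 1), rejecting an exact 0 so that both logarithms stay finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def corollary1_check(n=5, threshold=9.0, reps=20000, seed=1):
    """Return (lower bound, Monte Carlo estimate of P[T^C > A], upper bound) for alpha = 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [open_uniform(rng) for _ in range(n)]
        t_left = -sum(math.log(v) for v in x)        # sum of F^{-1}(1 - X_i) when alpha = 1
        t_right = -sum(math.log1p(-v) for v in x)    # same statistic applied to 1 - X_i
        if max(t_left, t_right) > threshold:
            hits += 1
    p_a = gamma_sf_int(threshold, n)
    return 2.0 * p_a - p_a ** 2, hits / reps, 2.0 * p_a
```

Calling `corollary1_check()` returns the two bounds with the simulated probability between them, up to Monte Carlo error. The gap between the bounds is $p_A^2$, so it shrinks quickly as the threshold grows.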
In the following subsection, we study an optimal concordant test when the optimal value for α is unknown and needs to be estimated from the data.

2.3. Optimal Concordant Test

For a given set of one-sided p-values, the class of concordant tests $T_{G(\alpha)}^{C}$ proposed in Section 2.2 can be applied with different $\alpha$ values. However, in order to control the type I error, the $\alpha$ value must be chosen before we see the data. Usually the optimal value is unknown, so how to choose $\alpha$ is critical in practice. We may choose a set of $\alpha$ values, apply the tests $T_{G(\alpha)}^{C}$ to the data, and then use the smallest p-value among them. There are two limitations associated with this procedure. First, we need to adjust for multiple comparisons, which may result in power loss. Second, the final result will depend on the chosen $\alpha$ values, which might be difficult to determine. To circumvent these difficulties, we propose to estimate the optimal $\alpha$ value from the data, so that an asymptotically optimal concordant test can be constructed accordingly.
In this subsection, we make the following assumption: $P_1, P_2, \ldots, P_n$ are independently and identically distributed (iid) with the following density function for parameters $\alpha > 0$ and $c < 1$:
$$f_{\alpha,c}(p) = (1 - c)^{\alpha} \exp\left[c\, F_{G(\alpha)}^{-1}(1 - p)\right] \quad \text{for } p \in (0,1). \qquad (5)$$
For the above density function, $f_{\alpha,c}(p)$, it is not difficult to show that when $c = 0$, $f_{\alpha,0}(p) = 1$ ($0 < p < 1$). Therefore, $f_{\alpha,0}(p)$ corresponds to the global null hypothesis. In addition, we have the following property [10].
Proposition 1. 
Suppose that $P_1, P_2, \ldots, P_n$ are iid with the common density function defined in (5) with parameters $\alpha > 0$ and $c < 1$. If both $\alpha$ and $c$ are known, then the gamma distribution-based test, $T_{G(\alpha)}$, is uniformly most powerful (UMP).
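One way to see where the density in (5) comes from, and why the sum $T_{G(\alpha)}$ is the natural statistic, is the short derivation below. It is a sketch under an interpretation not spelled out above: that the gamma score $Y = F_{G(\alpha)}^{-1}(1 - P)$ follows a gamma distribution with shape $\alpha$ and rate $1 - c$ (rate 1 under the null). The symbol $Y$ is introduced here only for illustration.

```latex
% Change of variables p = 1 - F_{G(\alpha)}(y), so |dp/dy| = f_{G(\alpha,1)}(y).
\begin{align*}
f_{\alpha,c}(p)
  &= \frac{f_{G(\alpha,\,1-c)}(y)}{f_{G(\alpha,\,1)}(y)}
   = \frac{(1-c)^{\alpha}\, y^{\alpha-1} e^{-(1-c)y} / \Gamma(\alpha)}
          {y^{\alpha-1} e^{-y} / \Gamma(\alpha)}
   = (1-c)^{\alpha} \exp(c\,y),
  \qquad y = F_{G(\alpha)}^{-1}(1-p), \\[4pt]
% Likelihood ratio of n iid p-values against the null density f_{\alpha,0} = 1.
\prod_{i=1}^{n} \frac{f_{\alpha,c}(P_i)}{f_{\alpha,0}(P_i)}
  &= (1-c)^{n\alpha} \exp\!\Big( c \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1-P_i) \Big)
   = (1-c)^{n\alpha} \exp\!\big( c\, T_{G(\alpha)} \big).
\end{align*}
```

For $0 < c < 1$ the likelihood ratio is increasing in $T_{G(\alpha)}$, so by the Neyman-Pearson lemma rejecting for large $T_{G(\alpha)}$ is most powerful for every such $c$, which is consistent with Proposition 1.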
When both parameters α and c in (5) are unknown, they can be estimated via the constrained maximum likelihood estimation (CMLE), from which the following constrained likelihood ratio test (CLRT) can be defined:
$$T_{CLRT}(P_1, \ldots, P_n) = 2\hat{\alpha}_{CLRT}\, n \ln(1 - \hat{c}_{CLRT}) + 2\hat{c}_{CLRT} \sum_{i=1}^{n} F_{G(\hat{\alpha}_{CLRT})}^{-1}(1 - P_i), \qquad (6)$$
where $\hat{\alpha}_{CLRT}$ and $\hat{c}_{CLRT}$ are the CMLEs for parameters $\alpha$ and $c$, respectively, obtained by maximizing the log-likelihood function $l(\alpha, c) = n\alpha \ln(1 - c) + c \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1 - P_i)$ under the constraints $c < 1$ and $\alpha > 0$.
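For a fixed $\alpha$, the maximization over $c$ can be carried out in closed form, which reduces the CMLE to a one-dimensional search over $\alpha$. The derivation below is a sketch that takes the constraints $c < 1$ and $\alpha > 0$ exactly as stated; $S(\alpha)$ is shorthand introduced here for the sum of gamma scores.

```latex
\begin{align*}
S(\alpha) &= \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1-P_i) > 0, \\
\frac{\partial l(\alpha,c)}{\partial c}
  &= -\frac{n\alpha}{1-c} + S(\alpha) = 0
  \;\Longrightarrow\;
  \hat{c}(\alpha) = 1 - \frac{n\alpha}{S(\alpha)} < 1, \\
l\big(\alpha,\hat{c}(\alpha)\big)
  &= n\alpha \ln\frac{n\alpha}{S(\alpha)} + S(\alpha) - n\alpha \;\ge\; 0, \\
T_{CLRT} &= 2 \max_{\alpha>0}\; l\big(\alpha,\hat{c}(\alpha)\big).
\end{align*}
```

Because $\partial^2 l / \partial c^2 = -n\alpha/(1-c)^2 < 0$, the stationary point is the maximizer in $c$, and it satisfies $c < 1$ automatically. The profile value is non-negative since $y \ln y - y + 1 \geq 0$ for $y = n\alpha/S(\alpha) > 0$, with equality at $y = 1$ (that is, $\hat{c} = 0$, the null).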
For the above CLRT-based test, $T_{CLRT}$, we have the following result [10,14].
Proposition 2.
The asymptotic distribution of the test $T_{CLRT}$ is a mixture of chi-square distributions, $\sum_{i=0}^{2} w_i \chi_i^2$, where $\chi_i^2$ is the chi-square distribution with $df = i$; $\chi_0^2$ is the random variable with probability 1 of being 0; and the weights $w_0$, $w_1$, and $w_2$ are determined by the null and the alternative hypothesis.
The above asymptotic result may not be directly applicable for estimating the p-value of this test. First, the number of p-values, $n$, is usually small, so the asymptotic result may not provide a good approximation. Second, and more seriously, the above weights, $w_i$, are difficult to obtain. Hence, here we use a simple resampling method to approximate the null distribution and to estimate the p-value of $T_{CLRT}$. More specifically, for a given sample size $n$, we randomly sample $n$ independent null p-values from the uniform distribution $U(0,1)$ and then calculate the test statistic using (6). We repeat this process $B$ times (e.g., $B = 10^5$) to get the empirical null distribution of the test statistic, which is used to approximate the p-value of $T_{CLRT}$: $p = \#[T_i \geq T_{CLRT}(p_1, \ldots, p_n)]/B$, where $T_i$ ($i = 1, \ldots, B$) is the test statistic from the $i$th resampling, and $T_{CLRT}(p_1, \ldots, p_n)$ is the observed test statistic from the data.
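The resampling recipe above can be sketched generically: any statistic that maps a list of p-values to a number can be plugged in. In the snippet below the Fisher statistic is used only as a stand-in (an assumption of this sketch, chosen because its exact null distribution is known and the estimate can be checked); in an application it would be replaced by a function computing $T_{CLRT}$ from (6). The names and the default number of resamples are illustrative.

```python
import math
import random


def fisher_statistic(p_values):
    """Stand-in statistic for illustration; replace with T_CLRT in a real application."""
    return -2.0 * sum(math.log(p) for p in p_values)


def resampling_p_value(statistic, observed, n_resamples=20000, seed=0):
    """Estimate P[T >= t_obs] under the global null by resampling uniform p-values."""
    rng = random.Random(seed)
    n = len(observed)
    t_obs = statistic(observed)
    exceed = 0
    for _ in range(n_resamples):
        null_p = [1.0 - rng.random() for _ in range(n)]   # values in (0, 1], safe for log
        if statistic(null_p) >= t_obs:
            exceed += 1
    return exceed / n_resamples
```

For instance, `resampling_p_value(fisher_statistic, [0.01, 0.2, 0.5])` should land near the exact Fisher p-value for these inputs, within Monte Carlo error. The smallest non-zero p-value this procedure can report is $1/B$, which is one reason a large $B$ is used.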
For $T_{CLRT}$, we have the following result [10]:
Proposition 3.
Under the conditions specified in (5), the CLRT-based test, $T_{CLRT}$, is asymptotically UMP.
We now propose an optimal concordant test. Denote $T_{CLRT}^{L} = T_{CLRT}(P_1, P_2, \ldots, P_n)$ and $T_{CLRT}^{R} = T_{CLRT}(1 - P_1, 1 - P_2, \ldots, 1 - P_n)$, and then define the following concordant test statistic:
$$T_{CLRT}^{C} = \max\left(T_{CLRT}^{L},\; T_{CLRT}^{R}\right). \qquad (7)$$
Note that the null distributions of $T_{CLRT}^{L}$ and $T_{CLRT}^{R}$ are identical. For the concordant test, $T_{CLRT}^{C}$, we have the following property.
Theorem 2. 
Let $X = (X_1, \ldots, X_n) \in (0,1)^n$ be a random vector with independent components, and let $T_{CLRT}^{L} = \max_{\alpha > 0,\, c < 1} \left[n\alpha \ln(1 - c) + c \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1 - X_i)\right]$, $T_{CLRT}^{R} = \max_{\alpha > 0,\, c < 1} \left[n\alpha \ln(1 - c) + c \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(X_i)\right]$, and $T_{CLRT}^{C} = \max(T_{CLRT}^{L}, T_{CLRT}^{R})$. For any $A \in \mathbb{R}$, let $P_L = P[T_{CLRT}^{L} > A]$, $P_R = P[T_{CLRT}^{R} > A]$, and $P_C = P[T_{CLRT}^{C} > A]$. Then, $P_L + P_R - P_L P_R \leq P_C \leq P_L + P_R$.
Proof.
First notice that $f(X) = -\max_{\alpha > 0,\, c < 1} \left[n\alpha \ln(1 - c) + c \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(1 - X_i)\right]$ and $g(X) = \max_{\alpha > 0,\, c < 1} \left[n\alpha \ln(1 - c) + c \sum_{i=1}^{n} F_{G(\alpha)}^{-1}(X_i)\right]$ are both nondecreasing functions of associated random variables, $X$. The rest of the proof is similar to that for Theorem 1. □
In addition, we have the following result for $T_{CLRT}^{C}$.
Corollary 2.
Suppose $X \sim U(0,1)^n$. For $A > 0$, let $p_A = P[T_{CLRT}^{L} > A]$; then $2p_A - p_A^2 \leq P[T_{CLRT}^{C} > A] \leq 2p_A$.
From Corollary 2, the p-value of the concordant test, $T_{CLRT}^{C}$, defined in (7) can be estimated by the upper bound, $2p_A$, where $p_A$ is the p-value from the CLRT-based test, $T_{CLRT}^{L}$, and can be estimated using the resampling method described previously. Again, when the true p-value is small, this approximation is very accurate.

3. Results

3.1. Numeric Results from Simulation Study

In this section, we assess the performance of the proposed tests through a simulation study. In the simulation, we compare the proposed optimal concordant test, $T_{CLRT}^{C}$, with the gamma distribution-based concordant tests $T_{G(0)}^{C}$, $T_{G(1)}^{C}$ (i.e., Owen’s Pearson concordant test), and $T_{G(\infty)}^{C}$, using one-sided p-values.
In the simulation study, we combine fifty ($n = 50$) randomly simulated independent p-values using different methods. We assume that $m$ ($m = 10, 20, 40, 50$) of these $n$ p-values are from true individual alternative hypotheses, and the remaining $n - m$ are from true individual null hypotheses. The $n - m$ p-values from the true null hypotheses are randomly sampled from the uniform distribution on $(0, 1)$. For a true individual alternative hypothesis, $H_{i1}$ ($i = 1, \ldots, m$), the p-value $p_i$ is obtained via a normal variable, $z_i \sim N(\mu_i, 1)$. To consider situations where the true effects from different studies may have opposite directions (positive and negative), we randomly set $k$ of the $m$ $\mu_i$'s to have the same direction (positive or negative), and the remaining $m - k$ to have the other direction. A one-sided (right-sided) p-value for each true individual alternative hypothesis is obtained via the standard z-test, i.e., $p = P[Z > z]$, where $Z$ has the standard normal distribution, $N(0,1)$, and $z$ is the simulated number as described above.
For the hypotheses under the true alternative, we consider three different scenarios for the effects $\mu_i$. Scenario 1: We use $|\mu_i| = \mu v_i / \sum_{j=1}^{m} v_j$, where $v_i = 10^{r_i}$, $r_i \sim N(0.3, 1)$, and $\mu = 0.8, 0.6, 0.4, 0.3$ when there are 10, 20, 40, and 50 true individual alternatives, respectively. Scenario 2: We set $|\mu_i| = \mu v_i / \sum_{j=1}^{m} v_j$, where $v_i \sim U(1, 100)$ and $\mu = 1.2, 1.0, 0.6, 0.5$ when the number of true individual alternatives is $m = 10, 20, 40$, and $50$, respectively. Scenario 3: We consider $|\mu_i| = \mu / m$, with $\mu = 1.5, 1.2, 0.8$, and $0.6$ when $m = 10, 20, 40$, and $50$, respectively. Note the following about the simulation: (i) The constants (e.g., $\mu$, the parameters in the normal distribution for $r_i$, and the uniform distribution for $v_i$) are chosen so that the empirical powers are appreciable for comparison. (ii) The sum of the absolute effect sizes is equal to $\mu$ in all three scenarios. (iii) The degree of heterogeneity of the effect sizes among the true individual alternatives decreases from Scenario 1 to Scenario 3 for each given $m$.
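The data-generating step for the three scenarios can be sketched as follows, following the formulas as written above. The function names are illustrative, and one detail is an assumption of this sketch: the $k$ effects that share a direction are taken to be the positive ones.

```python
import random
from statistics import NormalDist


def effect_magnitudes(scenario, m, mu, rng):
    """Return |mu_i| for the m true alternatives; by construction they sum to mu."""
    if scenario == 1:
        weights = [10.0 ** rng.gauss(0.3, 1.0) for _ in range(m)]   # v_i = 10^{r_i}, r_i ~ N(0.3, 1)
    elif scenario == 2:
        weights = [rng.uniform(1.0, 100.0) for _ in range(m)]        # v_i ~ U(1, 100)
    elif scenario == 3:
        weights = [1.0] * m                                           # equal effects mu / m
    else:
        raise ValueError("scenario must be 1, 2, or 3")
    total = sum(weights)
    return [mu * w / total for w in weights]


def simulate_p_values(scenario, n, m, k, mu, rng):
    """Simulate n right-sided p-values: m from alternatives (k positive), n - m from nulls."""
    magnitudes = effect_magnitudes(scenario, m, mu, rng)
    signs = [1.0] * k + [-1.0] * (m - k)   # assumption: the k concordant effects are positive
    rng.shuffle(signs)
    std_normal = NormalDist()
    p_values = []
    for magnitude, sign in zip(magnitudes, signs):
        z = rng.gauss(sign * magnitude, 1.0)      # z_i ~ N(mu_i, 1)
        p_values.append(std_normal.cdf(-z))       # P[Z > z] = Phi(-z)
    p_values.extend(rng.random() for _ in range(n - m))   # null p-values ~ U(0, 1)
    return p_values
```

A call such as `simulate_p_values(3, n=50, m=20, k=15, mu=1.2, rng=random.Random(1))` produces one replicate for Scenario 3; repeating it and applying a combination test to each replicate gives the empirical power.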
The empirical power values of the tests are estimated as the rejection proportions based on 1000 replicates, with 0.05 as the significance level. For the CLRT-based concordant test, $T_{CLRT}^{C}$, we use $B = 10^5$ samples to estimate the p-values from the resampling method described in Section 2.3.
When all of the 50 p-values are from null hypotheses, the empirical power is the type I error rate. All methods considered here were able to control the type I error rate: their empirical powers under this situation were close to 0.05 when the significance level 0.05 was used.
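The rejection-proportion estimate can be illustrated for the simplest case, the global null, where it estimates the type I error rate. The sketch below does this for the Pearson concordant test only (the other tests would be plugged in the same way); the helper names and the seed are illustrative choices, and the p-value is the upper bound $2P[\chi^2_{2n} > q_C]$.

```python
import math
import random


def chi2_sf_even(q, df):
    """P[chi-square(df) > q] for a positive even df (closed-form Poisson sum)."""
    half, term, total = q / 2.0, 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def pearson_concordant_p(p_values):
    """Upper-bound p-value of the Pearson concordant test."""
    stat_p = -2.0 * sum(math.log(p) for p in p_values)
    stat_1mp = -2.0 * sum(math.log1p(-p) for p in p_values)
    return min(1.0, 2.0 * chi2_sf_even(max(stat_p, stat_1mp), 2 * len(p_values)))


def null_rejection_rate(n=50, replicates=1000, level=0.05, seed=2024):
    """Proportion of replicates rejected at `level` when all n p-values are null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replicates):
        p_values = []
        while len(p_values) < n:
            u = rng.random()
            if u > 0.0:                 # keep draws strictly inside (0, 1)
                p_values.append(u)
        if pearson_concordant_p(p_values) <= level:
            rejections += 1
    return rejections / replicates
```

With 1000 replicates the Monte Carlo standard error of a proportion near 0.05 is roughly 0.007, so an estimate within a couple of hundredths of 0.05 is consistent with type I error control.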
Figure 1, Figure 2 and Figure 3 plot the empirical power values for the concordant tests based on minimal p, $T_{G(0)}^{C}$ (denoted as Min_CC), Fisher, $T_{G(1)}^{C}$ (denoted as Fisher_CC, i.e., Pearson’s concordant test), z-test, $T_{G(\infty)}^{C}$ (denoted as Z_CC), and the proposed concordant test, $T_{CLRT}^{C}$ (denoted as LRT_CC), when one-sided p-values under Scenarios 1 to 3, respectively, are used.
From the simulation study, we have the following observations. First, under Scenario 1 (Figure 1), where the effect sizes under alternative hypotheses are extremely heterogeneous, the minimal p-based concordant test (i.e., $T_{G(0)}^{C}$) usually performs better than the Pearson’s concordant test ($T_{G(1)}^{C}$), which in turn outperforms the z-test-based concordant test ($T_{G(\infty)}^{C}$). Second, under Scenarios 2 and 3, when the degrees of heterogeneity of the effect sizes among individual alternative hypotheses are less extreme than in Scenario 1, $T_{G(1)}^{C}$ and $T_{G(\infty)}^{C}$ usually perform better or much better than $T_{G(0)}^{C}$ (Figure 2 and Figure 3). Third, each of the three concordant tests with fixed $\alpha$ values, namely $T_{G(0)}^{C}$, $T_{G(1)}^{C}$, and $T_{G(\infty)}^{C}$, may perform well under some conditions but poorly under others. Fourth, under all conditions considered, the CLRT-based concordant test, $T_{CLRT}^{C}$, always has the best or the second-best power. This demonstrates that, as expected, $T_{CLRT}^{C}$ is a robust test, meaning that, under many conditions, it has reasonable detection power compared with other tests.

3.2. Real Data Application

In this section, we demonstrate the usefulness of the proposed tests by applying them to a real-world problem. A data set from a meta-analysis is used. In that meta-analysis, results were collected from 12 randomized trials, each of which examined the effect of inpatient rehabilitation designed for geriatric patients versus usual care on improving functional outcome at 3–12-month follow-ups [15,16]. The 12 estimated odds ratios (ORs) from the individual studies are listed in Table 1.
The commonly used fixed-effect model for meta-analyses was inadequate for this data set, as the p-value from the Cochran’s test for homogeneity was 0.021. Hence, the random-effect model of the meta-analysis was applied, which estimated the overall OR as 1.36 with a 95% CI (1.08, 1.71) [16]. However, when we applied the goodness-of-fit test for the random-effect model, we obtained a p-value of 0.025 [3], which indicates a lack of fit of the random-effect model for this data set. We then applied the p-value combination methods to test whether there is an overall effect among these 12 independent trials.
To use the p-value combination methods, we need the individual p-values. For each trial, the corresponding p-value is calculated based on the reported 95% CI of the OR. If we use $U$ and $L$ to denote the upper and lower limits of the 95% CI, then the test statistic can be approximated as $z = \dfrac{\ln(U \times L)/2}{\ln(U/L)/3.92}$, i.e., the estimated log OR (the midpoint of the CI on the log scale) divided by its standard error (the width of the log-scale CI divided by $2 \times 1.96$). This statistic is asymptotically distributed as $N(0,1)$ under the null hypothesis that there is no difference between the new treatment and the control. Since the sample sizes of these 12 trials were relatively large, ranging from 108 to 1388, their p-values can be reasonably estimated using the asymptotic null distribution by comparing the test statistic with $N(0,1)$. For example, the right-sided p-value can be calculated as $P[Z > z]$, where $Z$ has a standard normal distribution, $N(0,1)$, and $z$ is the test statistic calculated as described above.
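The conversion from a reported 95% CI to a one-sided p-value can be sketched as below, using the CI limits listed in Table 1. The sketch assumes the statistic is the midpoint of the log-scale CI divided by the standard error implied by its width, $\ln(U/L)/3.92$. The last lines also compute a minimal-p concordant p-value as $2[1 - (1 - p_{(1)})^n]$, where $p_{(1)}$ is the smallest of all $P_i$ and $1 - P_i$; that closed form is an assumption of this sketch for the limiting case $T_{G(0)}^{C}$. All names are illustrative.

```python
import math
from statistics import NormalDist

# (lower, upper) limits of the 95% CI for the OR, studies 1-12 of Table 1.
CI_LIMITS = [
    (0.51, 2.39), (0.78, 1.21), (0.73, 1.72), (0.42, 2.75),
    (0.39, 1.95), (0.71, 2.30), (0.69, 2.08), (1.37, 10.60),
    (0.63, 1.79), (1.54, 5.63), (1.18, 4.72), (1.05, 2.70),
]


def z_from_ci(lower, upper, crit=1.96):
    """Approximate z statistic for ln(OR) recovered from a 95% confidence interval."""
    log_or = 0.5 * (math.log(lower) + math.log(upper))           # midpoint on the log scale
    std_err = (math.log(upper) - math.log(lower)) / (2.0 * crit)
    return log_or / std_err


def right_sided_p(z):
    """P[Z > z] for a standard normal Z."""
    return NormalDist().cdf(-z)


z_scores = [z_from_ci(lo, hi) for lo, hi in CI_LIMITS]
p_right = [right_sided_p(z) for z in z_scores]

# Minimal-p concordant p-value (upper bound): twice the Tippett p-value of the
# more extreme side.
n = len(p_right)
p_smallest = min(min(p_right), min(1.0 - p for p in p_right))
p_min_concordant = min(1.0, 2.0 * (1.0 - (1.0 - p_smallest) ** n))
```

Under these assumptions the smallest one-sided p-value comes from study 10, and `p_min_concordant` comes out close to the value of 0.013 reported for the minimal-p concordant test.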
We first calculate the one-sided p-value for each trial and then use the values for the concordant tests. The p-values from the concordant tests $T_{G(0)}^{C}$, $T_{G(1)}^{C}$, $T_{G(\infty)}^{C}$, and $T_{CLRT}^{C}$ are 0.013, 0.00017, 0.00031, and 0.00030, respectively. For comparison, we also applied the gamma distribution-based tests to this data set, using two-sided p-values calculated from each individual trial. The resulting p-values are 0.013, 0.0068, 0.075, and 0.0047 from $T_{G(0)}$, $T_{G(1)}$, $T_{G(\infty)}$, and $T_{CLRT}$, respectively. As expected, when a majority of the trials have the same direction (10 out of the 12 estimated ORs were greater than 1), concordant tests are usually more powerful than the corresponding gamma distribution-based tests using two-sided p-values. Notably, the z-test-based concordant test, $T_{G(\infty)}^{C}$, obtained a much smaller p-value (0.00031) than the z-test using two-sided p-values (0.075). Among the concordant tests, the proposed test has the second smallest p-value (0.00030).

4. Discussion and Conclusions

In this paper, we studied a class of gamma distribution-based concordant tests, which include the Pearson’s concordant test as a special case. We also proposed and studied a CLRT-based concordant test, which is asymptotically optimal, robust, and powerful under many conditions. The advantages of the proposed tests were demonstrated based on numeric simulation study and real data application.
In addition to their use for meta-analyses, the proposed concordant tests have many other applications. In a two-way contingency table, if at least one of the two variables is ordinal, a trend test may be more powerful than the popular unidirectional Pearson’s chi-square test [17]. One challenge in trend tests is how to appropriately assign scores for each level of the ordinal variable; results can be quite different with different score assignments. Our proposed concordant test, $T_{CLRT}^{C}$, can circumvent this difficulty and provide another solution. Specifically, by using the asymptotically independent statistics with the same direction obtained by the Lancaster’s chi-square partition [18,19], we can obtain asymptotically independent one-sided p-values, to which $T_{CLRT}^{C}$ can be applied.
The proposed tests will also be very useful for detecting causal genetic variants associated with diseases when applied to GWASs. They offer an alternative but powerful tool for combining information from different but related studies (e.g., studies of a similar phenotype in different subpopulations). We expect that many more genetic variants will be identified in the near future with this approach. In the future, we will also study how to combine dependent or correlated p-values.

Funding

This work was partially supported by the National Institutes of Health grants 1R03DE030259 and UL1TR002529. The content is solely the responsibility of the author and does not necessarily represent the official views of the NIH.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Owen, A.B. Karl Pearson’s meta-analysis revisited. Ann. Stat. 2009, 37, 3867–3892. [Google Scholar] [CrossRef] [PubMed]
  2. Hedges, L.; Olkin, I. Statistical Methods for Meta-Analysis; Academic: San Diego, CA, USA, 1985. [Google Scholar]
  3. Chen, Z.; Zhang, G.; Li, J. Goodness-of-fit test for meta-analysis. Sci. Rep. 2015, 5, 16983. [Google Scholar] [CrossRef] [PubMed]
  4. Fisher, R.A. Statistical Methods for Research Workers, 4th ed.; Oliver and Boyd: Edinburgh, UK, 1932. [Google Scholar]
  5. Pearson, K. On a New Method of Determining “Goodness of Fit”. Biometrika 1934, 26, 425–442. [Google Scholar]
  6. Tippett, L.H.C. Methods of Statistics; Williams Norgate: London, UK, 1931. [Google Scholar]
  7. Stouffer, S.A.; Suchman, E.A.; Devinney, L.C.; Star, S.A.; Williams, R.M., Jr. The American Soldier: Adjustment During Army Life. (Studies in Social Psychology in World War II); Princeton Univ. Press: Princeton, NJ, USA, 1949; Volume 1. [Google Scholar]
  8. Lancaster, H. The combination of probabilities: An application of orthonormal functions. Aust. J. Stat. 1961, 3, 20–33. [Google Scholar] [CrossRef]
  9. Chen, Z.; Nadarajah, S. On the optimally weighted z-test for combining probabilities from independent studies. Comput. Stat. Data Anal. 2014, 70, 387–394. [Google Scholar] [CrossRef]
  10. Chen, Z. Optimal Tests for Combining p-Values. Appl. Sci. 2022, 12, 322. [Google Scholar] [CrossRef]
  11. Birnbaum, A. Combining Independent Tests of Significance. J. Am. Stat. Assoc. 1954, 49, 559–574. [Google Scholar]
  12. Chen, Z. Robust tests for combining p-values under arbitrary dependency structures. Sci. Rep. 2022, 12, 3158. [Google Scholar] [CrossRef] [PubMed]
  13. Esary, J.D.; Proschan, F.; Walkup, D.W. Association of random variables, with applications. Ann. Math. Stat. 1967, 38, 1466–1474. [Google Scholar] [CrossRef]
  14. Self, S.G.; Liang, K.-Y. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 1987, 82, 605–610. [Google Scholar] [CrossRef]
  15. Bachmann, S.; Finger, C.; Huss, A.; Egger, M.; Stuck, A.E.; Clough-Gorr, K.M. Inpatient rehabilitation specifically designed for geriatric patients: Systematic review and meta-analysis of randomised controlled trials. Bmj 2010, 340, c1718. [Google Scholar] [CrossRef] [PubMed]
  16. Riley, R.D.; Higgins, J.P.; Deeks, J.J. Interpretation of random effects meta-analyses. Bmj 2011, 342, d549. [Google Scholar] [CrossRef] [PubMed]
  17. Cochran, W. Some methods for strengthening the common chi-square tests. Biometrics 1954, 10, 417–451. [Google Scholar] [CrossRef]
  18. Lancaster, H. The derivation and partition of χ2 in certain discrete distributions. Biometrika 1949, 36, 117–129. [Google Scholar] [CrossRef]
  19. Chen, Z. A new association test based on Chi-square partition for case-control GWA studies. Genet. Epidemiol. 2011, 35, 658–663. [Google Scholar] [CrossRef]
Figure 1. Empirical power values of the tests based on one-sided p-values under Scenario 1.
Figure 2. Empirical power values of the tests based on one-sided p-values under Scenario 2.
Figure 3. Empirical power values of the tests based on one-sided p-values under Scenario 3.
Table 1. Estimated odds ratio and its 95% CI from each study in a meta-analysis with 12 trials. Data were taken from Bachmann et al. [15] and Riley et al. [16].
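Study | OR | 95% CI | Study | OR | 95% CI | Study | OR | 95% CI
1 | 1.11 | 0.51, 2.39 | 5 | 0.88 | 0.39, 1.95 | 9 | 1.06 | 0.63, 1.79
2 | 0.97 | 0.78, 1.21 | 6 | 1.28 | 0.71, 2.30 | 10 | 2.95 | 1.54, 5.63
3 | 1.13 | 0.73, 1.72 | 7 | 1.19 | 0.69, 2.08 | 11 | 2.36 | 1.18, 4.72
4 | 1.08 | 0.42, 2.75 | 8 | 3.82 | 1.37, 10.60 | 12 | 1.68 | 1.05, 2.70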
Chen, Zhongxue. 2024. "Optimal Concordant Tests" Applied Sciences 14, no. 11: 4536. https://doi.org/10.3390/app14114536
