Article

Comparative Analysis of Exact Methods for Testing Equivalence of Prevalences in Bilateral and Unilateral Combined Data with and without Assumptions of Correlation

Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
*
Author to whom correspondence should be addressed.
Axioms 2024, 13(7), 430; https://doi.org/10.3390/axioms13070430
Submission received: 21 May 2024 / Revised: 18 June 2024 / Accepted: 23 June 2024 / Published: 26 June 2024
(This article belongs to the Special Issue New Perspectives in Mathematical Statistics)

Abstract

In clinical studies focusing on paired body parts, diseases can manifest on either both sides (bilateral) or just one side (unilateral) of the organs. Consequently, the data in these studies may consist of records from both bilateral and unilateral cases. There are two different methods of analyzing the data. One of the methods is assuming that the pair of measurements from the same subject are independent, while the other considers the correlation between paired organs. In terms of the homogeneity test of proportions, asymptotic methods have been proposed given the moderate size of data. This article extends the existing work by proposing exact methods to deal with the scenarios when the sample size is small and asymptotic methods perform poorly. The impact of the correlation assumption is also explored. Among the proposed methods, calculating p-values by replacing unknown parameters with estimated values while accounting for the correlation is recommended based on its satisfactory type I error controls and statistical powers. The proposed methods are applied to three real examples for illustration.

1. Introduction

Data collected from paired organs of human bodies can be bilateral or unilateral depending on the availability of each site or the disease status. Whether to assume or ignore the correlation between two sites of an organ remains an interesting discussion. The majority of articles published in the British Journal of Ophthalmology (BJO) in 1995 and 2017 assumed independence of the sites according to Zhang and Ying [1]. For the homogeneity tests of group-specific response rates given the outcome is binary, classical methods such as Pearson’s chi-squared test [2] can be applied. However, Rosner [3] and Donner and Banting [4] criticized this approach by pointing out that the omission of the correlation will lead to biased inferences. Nevertheless, few studies demonstrate such a bias numerically.
To avoid the possible bias, Donner [5] assumed that the number of successes from each pair of organs follows a binomial distribution with probability parameter $P$, where $P$ varies among subjects and has an expectation of $p$. The variance of $P$ is proportional to $p(1-p)$; that is, $\mathrm{Var}(P) = \rho p(1-p)$. This concept can be extended easily by assuming the intra-person correlation is $\rho$, where $\rho$ ranges from $-1$ to 1. The correlation is assumed to be constant for all individuals and is referred to as the equal correlation coefficients model. In the context of bilateral data, Ma and Liu [6] derived three asymptotic methods, the likelihood ratio test, the score test, and the Wald-type test, to assess the homogeneity of prevalences of multiple groups. Mou and Li [7] compared multiple algorithms for estimating parameters and investigated asymptotic statistics for the homogeneity test of many-to-one risk differences. Liu et al. [8] considered four exact approaches, the E approach, the M approach, the E + M approach, and the C approach, as alternatives to the methods proposed by Ma and Liu [6] when the sample size cannot ensure a good asymptotic approximation. In general, exact approaches calculate the probability of a tail area, which is defined as a set of cases that are more extreme than the observed data against the null hypothesis. The E approach, initially proposed by Liddell [9], replaces nuisance parameters with their corresponding maximum likelihood estimates. Basu [10] introduced the M approach, which involves determining the p-value by maximizing the likelihood of observing the tail area across the entire parameter space. By combining the E and the M methods, Lloyd [11] first determined the tail area by regarding the p-value from the E approach as a statistic and then applied the M method to calculate the final p-value. The previous three methods assume the group totals are fixed, while the C method in Liu’s work [8] fixes both margins of a table.
In the context of bilateral and unilateral combined data, Ma and Wang [12] derived three asymptotic approaches for testing the equality of proportions using the equal correlation coefficients model. It is worth noting that these methods still require a large sample size, and there is a need to extend Liu’s work to combined data.
This article proposes five exact approaches for testing the equivalence of proportions of multiple groups given bilateral and unilateral combined data with limited samples. Section 2 presents details of Donner’s equal correlation coefficients model. In Section 3, we introduce the individual site model, which omits the correlation between sites. Classical methods based on the individual site model are described in Section 4, and we introduce five exact approaches in Section 5 using Donner’s model. In Section 6, we provide numerical studies that compare the two models and various methods with regard to their ability to control Type I errors (TIEs) and their statistical powers. The application of the proposed methods is illustrated through three real-world examples in Section 7. The discussion and final conclusions are given in Sections 8 and 9.

2. Equal Correlation Coefficients Model

Define $Z_{ijk} = 1$ as the event of the $j$th subject in the $i$th group having a response at the $k$th site and $\pi_i$ as $\mathrm{Prob}(Z_{ijk} = 1) = \mathrm{Prob}(Z_{ij(3-k)} = 1)$, where $i = 1, 2, \ldots, g$ and $k = 1, 2$. The equal correlation coefficients model proposed by Donner [5] assumes the correlation coefficient $\mathrm{Corr}(Z_{ijk}, Z_{ij(3-k)})$ between measurements from two sites to be a constant $\rho$. In the bilateral cohort where subjects contribute one measurement at each site of the paired organs, let $m_{ir}$ stand for the number of subjects with $r$ response(s) in the $i$th group, where $r = 0, 1, 2$. In the unilateral cohort where subjects provide one measurement on only one site of the paired organs, $n_{ir^*}$ denotes the number of subjects with $r^*$ response(s) in the $i$th group, where $r^* = 0, 1$. The data layout on the subject level can be found in Table 1.
Given the nature of clinical study designs, $(m_1, m_2, \ldots, m_g, n_1, n_2, \ldots, n_g)$ are fixed, and the plausible distribution assumptions are as follows:
$n_{i0} \sim$ binomial distribution $\mathrm{Bin}(n_i, P_{ui0})$; $n_{i1} \sim$ binomial distribution $\mathrm{Bin}(n_i, P_{ui1})$; $(m_{i0}, m_{i1}, m_{i2}) \sim$ multinomial distribution $\mathrm{Multi}(m_i; P_{bi0}, P_{bi1}, P_{bi2})$.
In observational studies, the sample size of each group can be random, which is beyond the scope of this article. It can be easily shown that $P_{bi0} = (1-\pi_i)(1-\pi_i+\rho\pi_i)$, $P_{bi1} = 2(1-\rho)(\pi_i-\pi_i^2)$, $P_{bi2} = \rho\pi_i + (1-\rho)\pi_i^2$, $P_{ui0} = 1-\pi_i$, and $P_{ui1} = \pi_i$. The null and the alternative hypotheses of interest are as follows:
$$H_0: \pi_1 = \pi_2 = \cdots = \pi_g = \pi \quad \text{versus} \quad H_1: \text{not all of them are equal}.$$
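As a small illustration of the cell probabilities stated above, the following Python sketch evaluates $P_{bi0}$, $P_{bi1}$, $P_{bi2}$, $P_{ui0}$, and $P_{ui1}$ for a given pair $(\pi, \rho)$ and checks whether all of them lie in [0, 1]. The function names `cell_probabilities` and `is_admissible` are illustrative choices and do not come from the article.

```python
def cell_probabilities(pi, rho):
    """Cell probabilities of the equal correlation coefficients model.

    Keys mirror the article's symbols: Pb0, Pb1, Pb2 for bilateral subjects
    with 0, 1, 2 responses; Pu0, Pu1 for unilateral subjects.
    """
    return {
        "Pb0": (1 - pi) * (1 - pi + rho * pi),
        "Pb1": 2 * (1 - rho) * (pi - pi ** 2),
        "Pb2": rho * pi + (1 - rho) * pi ** 2,
        "Pu0": 1 - pi,
        "Pu1": pi,
    }


def is_admissible(pi, rho):
    """True when every cell probability lies in [0, 1] (illustrative helper)."""
    return all(0.0 <= p <= 1.0 for p in cell_probabilities(pi, rho).values())


if __name__ == "__main__":
    print(cell_probabilities(0.4, 0.5))
    print(is_admissible(0.1, -0.5))  # a negative rho can push Pb2 below zero
```

Not every pair with $\pi \in (0,1)$ and $\rho \in (-1,1)$ yields valid probabilities; strongly negative $\rho$ combined with small $\pi$ makes $P_{bi2}$ negative, which is why such a check is useful before evaluating a likelihood.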

3. Individual Site Model without Considering Correlation

Many studies encounter challenges due to the limited sample size of participants. One of the common strategies that researchers consider is conducting analyses on the site level of each subject to extract as much information as possible. To our knowledge, the impact of ignoring the potential correlation of the two sites of a subject remains unknown. In order to explore this effect, data from the equal correlation coefficients model are transformed to new data on the site level. Define $x_{ie}$ as the number of sites with $e$ response(s) in the $i$th group, where $e = 0, 1$ and $i = 1, 2, \ldots, g$. The new data structure is presented in Table 2. The relationship between the two types of data is presented as follows:
$$x_{i0} = 2m_{i0} + m_{i1} + n_{i0}, \quad x_{i1} = 2m_{i2} + m_{i1} + n_{i1}, \quad x_i = 2m_i + n_i.$$
The random variables $x_{i0}$ and $x_{i1}$ are assumed to follow binomial distributions $\mathrm{Bin}(x_i, 1-\pi_i)$ and $\mathrm{Bin}(x_i, \pi_i)$, respectively. The null and the alternative hypotheses are the same as stated in Section 2.
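The transformation from subject-level to site-level counts can be sketched in a few lines of Python. The tuple layout `(m0, m1, m2, n0, n1)` for one group and the function name `to_site_level` are assumptions made for this illustration only.

```python
def to_site_level(group):
    """Map subject-level counts of one group to site-level counts.

    group = (m0, m1, m2, n0, n1): bilateral subjects with 0/1/2 responses
    and unilateral subjects with 0/1 responses (layout assumed here).
    Returns (x0, x1): numbers of sites without and with a response.
    """
    m0, m1, m2, n0, n1 = group
    x0 = 2 * m0 + m1 + n0
    x1 = 2 * m2 + m1 + n1
    return x0, x1


if __name__ == "__main__":
    # m_i = 5 bilateral and n_i = 5 unilateral subjects give x_i = 2*5 + 5 = 15 sites
    print(to_site_level((2, 1, 2, 3, 2)))
```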

4. Methods for Individual Site Model

4.1. The Pearson Chi-Squared Test

Define the notation $X^*$ as an observed table $(x_{10}, x_{11}, x_{20}, x_{21}, \ldots, x_{g0}, x_{g1})$ from Table 2. Because the correlations between the sites are ignored, the individual site model becomes an unordered $g \times 2$ table where the Pearson chi-squared test statistic is given by Equation (1).
$$T_{Pearson}(X^*) = \sum_{i=1}^{g} \left[ \frac{(T x_{i1} - T_1 x_i)^2}{T\, T_1\, x_i} + \frac{(T x_{i0} - T x_i + T_1 x_i)^2}{T\, x_i\, (T - T_1)} \right] \tag{1}$$
The p-value is defined as $P_{Pearson}(X^*) = \mathrm{Prob}(\chi^2_{g-1} \ge T_{Pearson}(X^*))$ since the test statistic $T_{Pearson}(X^*)$ asymptotically follows a chi-squared distribution with $g-1$ degrees of freedom when the sample size is sufficient (Fagerland et al. [13]). The null hypothesis is rejected if the calculated p-value is less than 0.05 given the significance level of 0.05.
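The Pearson statistic of Equation (1) and its asymptotic p-value can be sketched as follows. The sketch assumes that $T$ is the total number of sites and $T_1$ the total number of sites with a response; these symbols are not defined in the text above, although this reading makes Equation (1) coincide with the usual Pearson statistic. The chi-squared tail probability is computed with a standard recurrence for the regularized upper incomplete gamma function so that only the standard library is needed; `pearson_statistic` and `chi2_sf` are illustrative names.

```python
import math


def pearson_statistic(table):
    """Equation (1) for a g x 2 table given as a list of (x_i0, x_i1) rows.

    Assumes 0 < T1 < T; otherwise the statistic is undefined.
    """
    row_totals = [x0 + x1 for x0, x1 in table]
    T = sum(row_totals)               # assumed: total number of sites
    T1 = sum(x1 for _, x1 in table)   # assumed: total number of sites with a response
    stat = 0.0
    for (x0, x1), xi in zip(table, row_totals):
        stat += (T * x1 - T1 * xi) ** 2 / (T * T1 * xi)
        stat += (T * x0 - T * xi + T1 * xi) ** 2 / (T * xi * (T - T1))
    return stat


def chi2_sf(x, df):
    """Upper tail Prob(chi2_df >= x) via Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # Q(1/2, h)
    while a < df / 2 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q


def pearson_p_value(table):
    return chi2_sf(pearson_statistic(table), len(table) - 1)


if __name__ == "__main__":
    example = [(10, 5), (5, 10)]  # made-up counts, not data from the article
    print(pearson_statistic(example), pearson_p_value(example))
```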

4.2. The Fisher–Freeman–Halton Exact Test

The asymptotic behavior described above does not hold when the numbers in Table 2 are small. To overcome this limitation, an alternative test called the Fisher–Freeman–Halton (FFH) exact test can be employed (Fagerland et al. [13]). The point probability, $\mathrm{Prob}(X)$, conditions on both the row and column totals and is used as the test statistic. Under the null hypothesis, the probability distribution of Table 2 follows the multiple hypergeometric distribution. The Fisher–Freeman–Halton exact test treats tables with a point probability no larger than $\mathrm{Prob}(X^*)$ as evidence against $H_0$, where $\mathrm{Prob}(X^*)$ is the probability of observing $X^*$. Let $\Omega_{FFH}(X^*)$ represent the collection of all tables sharing the same row and column margins as the observed table $X^*$. Therefore, the exact p-value is defined by Equation (2), where $I(\cdot)$ is the indicator function.
$$P_{FFH}(X^*) = \sum_{X \in \Omega_{FFH}(X^*)} I\big(\mathrm{Prob}(X) \le \mathrm{Prob}(X^*)\big)\, \mathrm{Prob}(X) \tag{2}$$

4.3. The Mid-P Test

The Fisher–Freeman–Halton exact test is often criticized for being unnecessarily conservative due to discreteness in certain scenarios. One of the adjustments to $P_{FFH}(X^*)$ is called the mid-P test. This test halves the weight of tables with the same point probability as the observed table $X^*$ (Fagerland et al. [13]). In other words, the adjusted p-value can be defined by Equation (3).
$$P_{midP}(X^*) = \sum_{X \in \Omega_{FFH}(X^*)} I\big(\mathrm{Prob}(X) < \mathrm{Prob}(X^*)\big)\, \mathrm{Prob}(X) + \frac{1}{2} \sum_{X \in \Omega_{FFH}(X^*)} I\big(\mathrm{Prob}(X) = \mathrm{Prob}(X^*)\big)\, \mathrm{Prob}(X) \tag{3}$$
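A minimal enumeration sketch of the FFH and mid-P p-values for a $g \times 2$ table is given below. It relies on the standard fact that, with both margins fixed, the point probability of a $g \times 2$ table is $\prod_i \binom{x_i}{x_{i1}} / \binom{T}{T_1}$, where $T$ and $T_1$ are taken to be the grand total and the total number of responses (an assumption about notation). Point probabilities are compared through their integer numerators to avoid floating-point ties. The function names are illustrative.

```python
from math import comb, prod


def tables_with_margins(row_totals, t1):
    """Yield all tuples (x_11, ..., x_g1) with 0 <= x_i1 <= x_i and sum equal to t1."""
    if len(row_totals) == 1:
        if 0 <= t1 <= row_totals[0]:
            yield (t1,)
        return
    for k in range(min(row_totals[0], t1) + 1):
        for rest in tables_with_margins(row_totals[1:], t1 - k):
            yield (k,) + rest


def ffh_and_midp(successes, row_totals):
    """Return (P_FFH, P_midP) for observed response counts per group."""
    t1 = sum(successes)
    denom = comb(sum(row_totals), t1)

    def weight(xs):
        # integer numerator of the multiple hypergeometric point probability
        return prod(comb(n, k) for n, k in zip(row_totals, xs))

    w_obs = weight(successes)
    less = equal = 0
    for xs in tables_with_margins(list(row_totals), t1):
        w = weight(xs)
        if w < w_obs:
            less += w
        elif w == w_obs:
            equal += w
    return (less + equal) / denom, (less + equal / 2) / denom


if __name__ == "__main__":
    # made-up 2 x 2 example: 3 of 3 sites respond in group 1, 0 of 3 in group 2
    print(ffh_and_midp((3, 0), (3, 3)))
```

Full enumeration is only practical for small tables, which is the setting this section targets.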

5. Methods for Equal Correlation Coefficients Model

Given observed data $M^* = (m_{10}, m_{11}, m_{12}, n_{10}, n_{11}, \ldots, m_{g0}, m_{g1}, m_{g2}, n_{g0}, n_{g1})$, the likelihood function is given by Equation (4), and the log-likelihood function (up to an additive constant) can be written as a function of $\pi_1, \pi_2, \ldots, \pi_g$, and $\rho$ as shown in Equation (5).
$$L(M^*; \pi_1, \pi_2, \ldots, \pi_g, \rho) = \prod_{i=1}^{g} \frac{m_i!}{m_{i0}!\, m_{i1}!\, m_{i2}!} \big((1-\pi_i)(1-\pi_i+\rho\pi_i)\big)^{m_{i0}} \times \big(2(1-\rho)(\pi_i-\pi_i^2)\big)^{m_{i1}} \big(\rho\pi_i + (1-\rho)\pi_i^2\big)^{m_{i2}} \times \frac{n_i!}{n_{i0}!\, n_{i1}!} (1-\pi_i)^{n_{i0}} \pi_i^{n_{i1}} \tag{4}$$
$$l(\pi_1, \pi_2, \ldots, \pi_g, \rho) = \sum_{i=1}^{g} \Big[ m_{i0} \log\big((1-\pi_i)(1-\pi_i+\rho\pi_i)\big) + m_{i1} \log\big(2(1-\rho)(\pi_i-\pi_i^2)\big) + m_{i2} \log\big(\rho\pi_i + (1-\rho)\pi_i^2\big) + n_{i0} \log(1-\pi_i) + n_{i1} \log(\pi_i) \Big] \tag{5}$$
Let $L(M^*; \pi, \rho)$ represent $L(M^*; \pi_1 = \pi_2 = \cdots = \pi_g = \pi, \rho)$ for ease of notation. Under $H_0$, the maximum likelihood estimates (MLEs) can be obtained by setting the partial derivatives with respect to $\pi$ and $\rho$ to zero and solving Equations (6) and (7), where $M_r = \sum_{i=1}^{g} m_{ir}$ and $N_{r^*} = \sum_{i=1}^{g} n_{ir^*}$ are the counts pooled over groups.
$$\frac{\partial l}{\partial \rho} = \frac{M_1}{\rho-1} - \frac{M_2(\pi-1)}{\pi+\rho-\pi\rho} + \frac{M_0\,\pi}{\pi\rho-\pi+1} = 0 \tag{6}$$
$$\frac{\partial l}{\partial \pi} = \frac{N_1}{\pi} + \frac{N_0}{\pi-1} - \frac{M_2\big(\rho - 2\pi(\rho-1)\big)}{\pi^2(\rho-1) - \pi\rho} - \frac{M_1(2\pi-1)}{\pi-\pi^2} + \frac{M_0\big((\pi-1)(\rho-1) - \pi + \pi\rho + 1\big)}{(\pi-1)(\pi\rho-\pi+1)} = 0 \tag{7}$$
Let $\hat{\pi}$ and $\hat{\rho}$ denote the constrained MLEs under $H_0$. The closed-form solutions to the above equations can be found in Ma and Wang’s work [12].
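Since the closed-form solutions are given elsewhere, the sketch below uses a plain grid search as a stand-in to illustrate what the constrained MLEs maximize. It evaluates the log-likelihood of Equation (5) under $H_0$ from the pooled counts and returns the grid point with the largest value. The grid search, the resolution of 0.01, and the names `null_loglik` and `grid_mle` are assumptions for illustration and are not the method of Ma and Wang [12].

```python
import math


def null_loglik(pi, rho, M, N):
    """Equation (5) under H0, up to an additive constant.

    M = (M0, M1, M2) and N = (N0, N1) are counts pooled over groups.
    Returns -inf when a bilateral cell probability is not positive, which
    (as a simplification) also rules out boundary solutions.
    """
    pb = ((1 - pi) * (1 - pi + rho * pi),
          2 * (1 - rho) * (pi - pi ** 2),
          rho * pi + (1 - rho) * pi ** 2)
    if min(pb) <= 0:
        return -math.inf
    ll = sum(c * math.log(p) for c, p in zip(M, pb))
    return ll + N[0] * math.log(1 - pi) + N[1] * math.log(pi)


def grid_mle(M, N, steps=100):
    """Crude grid maximizer over pi in (0, 1) and rho in (-1, 1)."""
    best_ll, best_pi, best_rho = -math.inf, None, None
    for i in range(1, steps):
        pi = i / steps
        for j in range(-steps + 1, steps):
            rho = j / steps
            ll = null_loglik(pi, rho, M, N)
            if ll > best_ll:
                best_ll, best_pi, best_rho = ll, pi, rho
    return best_pi, best_rho


if __name__ == "__main__":
    # made-up pooled counts that match pi = 0.4, rho = 0.5 exactly
    print(grid_mle((48, 24, 28), (6, 4)))
```

In practice the grid estimate could serve as a check on, or a starting value for, a root finder applied to Equations (6) and (7).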

5.1. Score Test

Ma and Wang [12] examined the likelihood ratio test, the score test, and the Wald-type test and concluded that the score test outperformed the other two approaches based on simulation studies using various combinations of parameter settings and sample sizes. Therefore, we choose the score test to explore the asymptotic behaviors given smaller sample sizes compared to Ma and Wang’s settings. This approach is referred to as the A method. The score test statistic is given by Equation (8), where $U = (U_1, U_2, \ldots, U_g, U_{g+1}) = \left(\frac{\partial l}{\partial \pi_1}, \frac{\partial l}{\partial \pi_2}, \ldots, \frac{\partial l}{\partial \pi_g}, \frac{\partial l}{\partial \rho}\right)$, and $I(\pi_1, \pi_2, \ldots, \pi_g, \rho)$ is the Fisher information matrix under group-specific proportions $\pi_i$.
$$T_{SC}(M^*) = U\, I(\pi_1, \pi_2, \ldots, \pi_g, \rho)^{-1}\, U^T \,\Big|_{\pi_1 = \pi_2 = \cdots = \pi_g = \hat{\pi},\ \rho = \hat{\rho}} \tag{8}$$
Readers are referred to Ma and Wang’s work [12] for a simplified form of $T_{SC}(M^*)$. Asymptotically, the score test statistic follows a chi-squared distribution with $g-1$ degrees of freedom when the null hypothesis is true, according to Rao [14]. The corresponding p-value can be calculated by Equation (9).
$$P_A(M^*) = \mathrm{Prob}\big(\chi^2_{g-1} > T_{SC}(M^*)\big) \tag{9}$$
Similar to the Pearson chi-squared statistic, the approximation to $\chi^2_{g-1}$ does not hold if the sample size is insufficient, leading to uncontrolled type I error rates. To alleviate this difficulty, four exact methods are proposed by fixing $(m_1, m_2, \ldots, m_g, n_1, n_2, \ldots, n_g)$ and are described in detail in Sections 5.2–5.5. Define the collection of all tables with the same row margin as $M^*$ in Table 1 as $\Omega_1(M^*)$. In Section 5.6, we propose another exact method by fixing both the row and column totals.

5.2. E Method

There are several methods to calculate p-values based on the summation of probabilities of tables that are more extreme than the observed table $M^*$ when $H_0$ is true. One of the approaches treats the values of constrained MLEs $\hat{\pi}$ and $\hat{\rho}$ as the true parameter values, and it is referred to as the E method because these estimates are used as substitutes for unknown parameters in calculations. As a result, the p-value based on the E method is defined by Equation (10).
$$P_E(M^*) = \sum_{M \in \Omega_1(M^*)} I\big(T_{SC}(M) \ge T_{SC}(M^*)\big)\, L(M \mid \pi = \hat{\pi}, \rho = \hat{\rho}) \tag{10}$$
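The enumeration behind Equation (10) can be sketched as follows. The sketch lists every table in $\Omega_1(M^*)$ (fixed $m_i$ and $n_i$ per group), evaluates its probability from Equation (4) at plug-in values, and sums the probabilities of tables whose statistic is at least the observed one. Two simplifications are assumptions of this illustration: the statistic is passed in as a callable, and `placeholder_stat` (the range of site-level response rates) merely stands in for $T_{SC}$, whose simplified form is not reproduced here; the plug-in values `pi_hat` and `rho_hat` are supplied by the caller rather than computed.

```python
from itertools import product
from math import comb, factorial, prod


def group_tables(m, n):
    """All (m0, m1, m2, n0, n1) with m0 + m1 + m2 = m and n0 + n1 = n."""
    for m0 in range(m + 1):
        for m1 in range(m - m0 + 1):
            for n1 in range(n + 1):
                yield (m0, m1, m - m0 - m1, n - n1, n1)


def group_prob(t, pi, rho):
    """One group's factor of the likelihood in Equation (4)."""
    m0, m1, m2, n0, n1 = t
    pb0 = (1 - pi) * (1 - pi + rho * pi)
    pb1 = 2 * (1 - rho) * (pi - pi ** 2)
    pb2 = rho * pi + (1 - rho) * pi ** 2
    multi = factorial(m0 + m1 + m2) // (factorial(m0) * factorial(m1) * factorial(m2))
    return (multi * pb0 ** m0 * pb1 ** m1 * pb2 ** m2
            * comb(n0 + n1, n1) * (1 - pi) ** n0 * pi ** n1)


def plug_in_p_value(observed, stat, pi_hat, rho_hat):
    """Sum of plug-in probabilities of tables at least as extreme as observed."""
    margins = [(sum(t[:3]), sum(t[3:])) for t in observed]
    t_obs = stat(observed)
    total = 0.0
    for table in product(*(list(group_tables(m, n)) for m, n in margins)):
        if stat(table) >= t_obs - 1e-12:
            total += prod(group_prob(t, pi_hat, rho_hat) for t in table)
    return total


def placeholder_stat(table):
    """Hypothetical stand-in for T_SC: range of site-level response rates."""
    rates = [(t[1] + 2 * t[2] + t[4]) / (2 * sum(t[:3]) + sum(t[3:])) for t in table]
    return max(rates) - min(rates)


if __name__ == "__main__":
    obs = [(1, 1, 1, 1, 1), (0, 1, 2, 0, 2)]  # made-up counts for g = 2
    print(plug_in_p_value(obs, placeholder_stat, 0.4, 0.3))
```

The same loop underlies the M, E + M, and CI methods, which differ in how the nuisance parameters $(\pi, \rho)$ are handled rather than in the enumeration itself.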

5.3. M Method

Instead of using the estimated parameters directly, an alternative approach is to maximize the above summation of probabilities in the whole parameter space. This technique is denoted as the M method. The exact p-value can be calculated by Equation (11).
$$P_M(M^*) = \sup_{\pi \in (0,1),\ \rho \in (-1,1)} \sum_{M \in \Omega_1(M^*)} I\big(T_{SC}(M) \ge T_{SC}(M^*)\big)\, L(M; \pi, \rho) \tag{11}$$

5.4. E + M Method

The M method discussed above calculates p-values based on the score test statistic. Notice that the exact p-value, $P_E(M^*)$, can be regarded as a statistic as well and can replace the score statistic in the M method. The fusion of the two approaches is referred to as the E + M method. The exact p-value is defined by Equation (12).
$$P_{E+M}(M^*) = \sup_{\pi \in (0,1),\ \rho \in (-1,1)} \sum_{M \in \Omega_1(M^*)} I\big(P_E(M) \le P_E(M^*)\big)\, L(M; \pi, \rho) \tag{12}$$

5.5. CI Method

Unlike the M method, a smaller parameter space consisting of the $100(1-\beta)\%$ confidence intervals (CIs) of $\pi$ and $\rho$ can be used when maximizing the summation of probabilities, and this is called the CI method. The confidence intervals of $\pi$ and $\rho$ are denoted as $CI_\pi$ and $CI_\rho$, respectively. Readers are referred to Berger [15] and Silvapulle [16] for details. In essence, the rationale is that the M method can be conservative since the generated p-values may be close to one due to the nature of the supremum. The CI method shrinks the parameter space, resulting in smaller p-values than the M approach. The corresponding p-value is defined by Equation (13).
$$P_{CI}(M^*) = \sup_{\pi \in CI_\pi,\ \rho \in CI_\rho} \sum_{M \in \Omega_1(M^*)} I\big(T_{SC}(M) \ge T_{SC}(M^*)\big)\, L(M; \pi, \rho) + 3\beta \tag{13}$$
A definition of the p-value is considered valid if $\mathrm{Prob}(p\text{-value} \le \alpha \mid H_0) \le \alpha$ holds, according to Vexler [17]. A simple proof that $P_{CI}(M^*)$ is a valid p-value is given in Appendix A.
The confidence intervals of $\pi$ and $\rho$ are derived using another score test statistic ($T^*_{SC}$) under the null hypothesis $H_0: \pi_1 = \pi_2 = \cdots = \pi_g = \pi$. The conditional MLE of $\pi$ given known $\rho$ can be obtained by solving Equation (7), and the conditional MLE of $\rho$ given $\pi$ can be obtained by solving Equation (6). The score test statistic $T^*_{SC}$ can be expressed by Equation (14),
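$$T^*_{SC}(M^*) = U_C\, I_C(\pi, \rho)^{-1}\, U_C^T \tag{14}$$
where
$$U_C = \left(\frac{\partial l}{\partial \pi}, \frac{\partial l}{\partial \rho}\right), \quad I_C(\pi, \rho) = \begin{pmatrix} I_{11} & I_{12} \\ I_{21} & I_{22} \end{pmatrix},$$
$$I_{11} = E\left(-\frac{\partial^2 l}{\partial \pi^2}\right) = \frac{N}{\pi} - 4M(\rho-1) - \frac{N}{\pi-1} + 2M(2\rho-2) + \frac{M\big(\rho - 2\pi(\rho-1)\big)^2}{\pi\rho - \pi^2(\rho-1)} - \frac{M\big(\pi\rho - \pi + (\pi-1)(\rho-1) + 1\big)}{\pi-1} - \frac{2M(2\pi-1)^2(\rho-1)}{\pi-\pi^2} - \frac{M(\rho-1)\big(\pi\rho - \pi + (\pi-1)(\rho-1) + 1\big)}{\pi\rho-\pi+1},$$
$$I_{12} = I_{21} = E\left(-\frac{\partial^2 l}{\partial \pi\, \partial \rho}\right) = \frac{M\rho(2\pi-1)}{(\pi\rho-\pi+1)(\rho+\pi-\pi\rho)}, \quad \text{and} \quad I_{22} = E\left(-\frac{\partial^2 l}{\partial \rho^2}\right) = \frac{\pi M(\pi-1)(\rho+1)}{(\rho-1)(\pi\rho-\pi+1)(\rho+\pi-\pi\rho)}.$$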
The following iterative procedure outlines the details of finding the upper limit of the CI of $\pi$:
(1) Set $\hat{\pi}$ as the starting point of $\pi$, where $\hat{\pi}$ is the constrained MLE of $\pi$ under $H_0$. Initialize flag = 1 and stepsize = $\min\{0.01, (1-\hat{\pi})/10\}$ so that the updated upper bound does not prematurely exceed 1;
(2) Update $\pi^{(t+1)} = \pi^{(t)} + \text{flag} \times \text{stepsize}$ and calculate the conditional MLE $\tilde{\rho}$ given $\pi^{(t+1)}$. Then, the score test statistic is given by $(T^*_{SC}(M^*))^{(t+1)} = U_C\, I_C(\pi, \rho)^{-1}\, U_C^T \,\big|_{\pi = \pi^{(t+1)},\ \rho = \tilde{\rho}}$;
(3) If $(T^*_{SC}(M^*))^{(t+1)} > \chi^2_{1,1-\beta}$, where $\chi^2_{1,1-\beta}$ is the $1-\beta$ quantile of the chi-squared distribution with one degree of freedom, turn to the opposite searching direction by letting flag = −1 and reduce the stepsize by multiplying it by $1/3.1416$, then return to step (2). Otherwise, set flag = 1 and return to step (2);
(4) Repeat steps (2) and (3) until the stepsize is sufficiently small (e.g., $10^{-4}$).
The lower limit of the CI of $\pi$ can be determined in the same way by initializing flag = −1 and stepsize = $\min\{0.01, \hat{\pi}/10\}$ in step (1) and then, in step (3), multiplying this initial flag by −1 if $(T^*_{SC}(M^*))^{(t+1)} > \chi^2_{1,1-\beta}$ or by 1 if $(T^*_{SC}(M^*))^{(t+1)} \le \chi^2_{1,1-\beta}$.
The following procedure can be used to determine the upper limit of the CI of $\rho$:
(1) Set $\hat{\rho}$ as the starting point of $\rho$, where $\hat{\rho}$ is the constrained MLE of $\rho$ under $H_0$. Initialize flag = 1 and stepsize = $\min\{0.01, (1-\hat{\rho})/10\}$;
(2) Update $\rho^{(t+1)} = \rho^{(t)} + \text{flag} \times \text{stepsize}$ and calculate the conditional MLE $\tilde{\pi}$ given $\rho^{(t+1)}$. Then, the score test statistic is given by $(T^*_{SC}(M^*))^{(t+1)} = U_C\, I_C(\pi, \rho)^{-1}\, U_C^T \,\big|_{\rho = \rho^{(t+1)},\ \pi = \tilde{\pi}}$;
(3) If $(T^*_{SC}(M^*))^{(t+1)} > \chi^2_{1,1-\beta}$, turn to the opposite searching direction by letting flag = −1 and reduce the stepsize by multiplying it by $1/3.1416$, then return to step (2). Otherwise, set flag = 1 and return to step (2);
(4) Repeat steps (2) and (3) until the stepsize is sufficiently small (e.g., $10^{-4}$).
The lower bound of the CI of $\rho$ can be found in the same way by initializing flag = −1 and stepsize = $\min\{0.01, (1+\hat{\rho})/10\}$ in step (1) and then, in step (3), multiplying this initial flag by −1 if $(T^*_{SC}(M^*))^{(t+1)} > \chi^2_{1,1-\beta}$ or by 1 if $(T^*_{SC}(M^*))^{(t+1)} \le \chi^2_{1,1-\beta}$.
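The step-shrinking search for an upper confidence limit described above can be sketched generically. The callable `stat` is a hypothetical stand-in for the profile statistic $(T^*_{SC}(M^*))^{(t+1)}$ with the conditional MLE of the other parameter already plugged in, and `crit` plays the role of $\chi^2_{1,1-\beta}$. One interpretive assumption is made: the step is shrunk only when the search crosses from inside the interval to outside, since shrinking on every outside point could stall before returning to the boundary. The guard `max_iter` and the handling of the boundary `hi` are additions of this sketch.

```python
import math


def upper_limit(stat, start, crit, hi=1.0, tol=1e-4, max_iter=100000):
    """Search upward from `start` for the largest x with stat(x) <= crit."""
    x = start
    step = min(0.01, (hi - start) / 10)
    flag = 1
    for _ in range(max_iter):
        if step < tol:
            break
        x_new = x + flag * step
        if x_new >= hi:
            # never step onto or past the edge of the parameter space
            step /= math.pi
            continue
        if stat(x_new) > crit:
            # outside the interval: reverse, shrinking only on a fresh crossing
            if flag == 1:
                step /= math.pi  # the text multiplies the stepsize by 1/3.1416
            flag = -1
        else:
            flag = 1
        x = x_new
    return x


if __name__ == "__main__":
    # toy quadratic statistic centred at 0.3; 3.841459 is the 0.95 quantile of chi2_1
    print(upper_limit(lambda x: ((x - 0.3) / 0.1) ** 2, 0.3, 3.841459))
```

A lower limit follows by mirroring the search direction, for instance by applying the same routine to `lambda x: stat(-x)` started at `-start` with `hi` set to the negated lower edge of the parameter space, and negating the result.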

5.6. C Method

The conditional method, denoted as the C method, assumes both the column and row totals are fixed in Table 1, and it does not involve nuisance parameters in determining p-values. Let $(m_1^*, n_1^*, m_2^*, n_2^*, \ldots, m_g^*, n_g^*, S_0^*, S_1^*, S_2^*, N_0^*, N_1^*)$ represent the margins of the observed table $M^*$ and define $\Omega_C(M^*)$ as the collection of all tables having the same margins as $M^*$. The exact p-value is determined by Equation (15).
$$P_C(M^*) = \sum_{M \in \Omega_C(M^*)} \left( I\big(T_{SC}(M) \ge T_{SC}(M^*)\big) \times \frac{\prod_{i=1}^{g} \frac{m_i!}{m_{i0}!\, m_{i1}!\, m_{i2}!}}{\frac{M!}{S_0!\, S_1!\, S_2!}} \times \frac{\prod_{i=1}^{g} \frac{n_i!}{n_{i0}!\, n_{i1}!}}{\frac{N!}{N_0!\, N_1!}} \right) \tag{15}$$
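The conditional probability weight inside Equation (15) can be checked numerically. The sketch below computes the weight of a single table with exact rational arithmetic and enumerates $\Omega_C(M^*)$ by brute force so that the weights can be verified to sum to one. The tuple layout `(m0, m1, m2, n0, n1)` per group and the function names are assumptions of this illustration; the indicator on $T_{SC}$ is omitted because the simplified statistic is not reproduced here.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def multinomial(counts):
    """Multinomial coefficient (sum counts)! / prod(count!)."""
    out = factorial(sum(counts))
    for c in counts:
        out //= factorial(c)
    return out


def conditional_prob(table):
    """Probability weight of one table in Equation (15), with all margins fixed."""
    bil = [t[:3] for t in table]
    uni = [t[3:] for t in table]
    s_cols = [sum(c) for c in zip(*bil)]   # S0, S1, S2
    n_cols = [sum(c) for c in zip(*uni)]   # N0, N1
    num_b = num_u = 1
    for row in bil:
        num_b *= multinomial(row)
    for row in uni:
        num_u *= multinomial(row)
    return Fraction(num_b, multinomial(s_cols)) * Fraction(num_u, multinomial(n_cols))


def same_margin_tables(table):
    """Brute-force enumeration of all tables sharing row and column margins."""
    rows = [(sum(t[:3]), sum(t[3:])) for t in table]
    cols = [sum(c) for c in zip(*table)]
    per_group = [[(m0, m1, m - m0 - m1, n - n1, n1)
                  for m0 in range(m + 1)
                  for m1 in range(m - m0 + 1)
                  for n1 in range(n + 1)]
                 for m, n in rows]
    for cand in product(*per_group):
        if [sum(c) for c in zip(*cand)] == cols:
            yield cand


if __name__ == "__main__":
    obs = [(1, 0, 1, 1, 1), (0, 2, 1, 0, 1)]  # made-up counts for g = 2
    print(sum(conditional_prob(t) for t in same_margin_tables(obs)))
```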

6. Numerical Study

The performance of the approaches discussed in Section 4 and Section 5 was evaluated with respect to type I error controls and statistical powers. To simulate practical scenarios characterized by small sample sizes, the total sample size $M + N$ was set to around 20 and 10. We further assumed a balanced design with $m_1 = m_2 = \cdots = m_g$ and $n_1 = n_2 = \cdots = n_g$. It is worth noting that unbalanced studies can be explored in a similar manner. The nominal level $\alpha = 5\%$ was used throughout the numerical study.
For the equal correlation coefficients model, all tables having a given margin $(m_1, m_2, \ldots, m_g, n_1, \ldots, n_g)$ were enumerated, and the corresponding p-values based on the six methods discussed were calculated. The type I errors and powers of each proposed method were determined by the summation of probabilities of tables with a p-value < 0.05 under $H_0$ and $H_1$, respectively. We set the parameter $\beta = 0.0001$ in the CI method. Figure 1, Figure 2 and Figure 3 demonstrate the contour plots of type I error rates as a function of $\pi$ and $\rho$ when the total sample size is around 10. The contour plots of type I error rates given a sample size of approximately 20 are shown in Figure A1 and Figure A2. The blank areas in the plots represent combinations of $\pi$ and $\rho$ for which at least one of the parameters $(P_{bi0}, P_{bi1}, P_{bi2}, P_{ui0}, P_{ui1})$ falls outside the interval [0, 1]. Following Tang et al. [18], a test is considered liberal if its type I error is greater than 6% and conservative if its type I error is less than 4%.
Regardless of the sample size and the group number, the asymptotic method, the M method, the E + M method, the C method, and the CI method typically yield conservative type I errors since a large area of combinations of π and ρ producing TIEs ≤ 0.04 is observed. The CI approach and the M approach generate very similar results due to their similar constructions of p-values. The CI approach defines the p-values within the space consisting of two confidence intervals, and the M approach considers the whole parameter space. When the total sample size is around 10, the C method produces extremely conservative outcomes with all parameter settings generating TIEs that are less than 3% due to discreteness. Although the E method generates a small proportion of area in the contour plots where the type I errors exceed 6%, it maintains a larger portion of type I error rates within the range of 4% and 6% compared to the other five methods.
For each combination of $(g, m_i, n_i)$ from Figure 1, Figure 2 and Figure 3 and from Figure A1 and Figure A2, a parallel study was conducted using the individual site model. For example, tables with $g = 2$, $m_i = 5$, and $n_i = 5$ can be transformed to $(x_{10}, x_{11}, x_{20}, x_{21})$, where $x_1 = x_2 = 2 \times 5 + 5 = 15$. The performance of type I error controls of the Pearson chi-squared test, the FFH test, and the mid-P test was investigated, and the contour plots are exhibited in Figure 4, Figure 5 and Figure 6 and Figure A3 and Figure A4.
Using the individual site model, the Pearson, the FFH, and the mid-P tests generate inflated type I errors when the underlying correlation coefficient $\rho$ is close to 1, while the FFH approach outputs conservative TIEs when $g = 2$, $m_i = 5$, $n_i = 5$. If the two sites of a subject have a weak correlation ($\rho \approx 0$), type I errors from the three methods can be controlled within the satisfactory region. Hence, the equal correlation coefficients model is recommended in practice given the unsatisfactory type I error controls when the correlation is ignored.
The power performance is investigated in the following part. The exact power is determined by varying the value of $\pi_g$ while fixing $\pi_1, \ldots, \pi_{g-1}$, and $\rho$. Based on Figure 7, Figure 8, Figure 9 and Figure 10 and Figure A5, Figure A6, Figure A7 and Figure A8, the E method outperforms the other five methods and has the highest powers in all scenarios. The power of the C approach is generally the lowest, which is expected given the low type I error rates observed in the numerical study above. Again, the CI and the M methods generate very similar powers due to their intrinsic similarities. Note that as the absolute difference between $\pi_g$ and the common value of $\pi_1 = \cdots = \pi_{g-1}$ decreases to 0, the powers of all methods become close to 0.05, as indicated by the horizontal dashed lines. This behavior is anticipated because the smaller the difference, the more likely the tables will support the null hypothesis. The statistical powers increase as the sample size increases from 10 to 20 no matter what method is employed.

7. Real Examples

Three real-world examples are described in detail to demonstrate the application of the proposed methods. The first example is a double-blind randomized clinical trial studying acute otitis media with effusion (OME) as described by Mandel [19]. The total number of children recruited in the study was 214, and 293 ears were registered in the study among these children. Every participant was categorized into either the bilateral disease or unilateral disease group at the study entry. All subjects underwent either unilateral or bilateral tympanocentesis and were subsequently randomized into one of two treatment groups, cefaclor or amoxicillin, for 14 days. After the course of treatment, the disease status of each ear was recorded, and 11 children were dropped from the study since they met the exclusion criteria. Table 3 displays the distribution of a subgroup of children aged ≥ 6 years. To test the homogeneity of cure rates of the two treatments, the six methods discussed in Section 5 were applied, and the corresponding p-values can be found in Table 4. All p-values are greater than 0.05, suggesting a lack of evidence to reject the null hypothesis at $\alpha = 5\%$.
The second example consists of 60 subjects receiving Orthokeratology (Ortho-k) to treat nearsightedness. This is an observational study conducted at the First Affiliated Hospital of Xiamen University in 2023 (Liang et al. [20]). Ortho-k is a vision correction procedure that uses specially designed contact lenses to temporarily reshape the cornea and improve vision. The study involved various brands that can be further classified into two categories based on the type of lens designs. Vision shaping treatment (VST) is one of the two designs constructing the edge of the lens that touches the edge of the cornea, while the other design, called corneal refractive therapy (CRT), constructs the edge of the lens so that it does not touch the cornea (Lu et al. [21]). Patients can opt to wear the lenses on either one or both eyes, depending on their individual requirements. An eye is said to have a response to the treatment if the axial length growth is less than 0.3 mm according to Rose et al. [22]. A subgroup of female subjects was included in the demonstration to test whether there was a difference between the response rates of the two designs. There were 3 subjects who received unilateral Ortho-k and 26 subjects who underwent bilateral Ortho-k, as indicated in Table 5. The p-values from Table 6 are all less than the nominal level of 5%, suggesting a rejection of the null hypothesis.
The data of the third example were collected from patients with retinitis pigmentosa referred from the outpatient facilities of the Massachusetts Eye and Ear Infirmary or from private ophthalmologists. Details of the study can be found in Berson, Rosner, and Simonoff [23]. The patients were classified into four groups based on their genetic types, which are autosomal dominant RP (DOM), autosomal recessive RP (AR), sex-linked RP (SL), and isolate RP (ISO). The type ISO was dropped from the analysis since the sample size of this group is large and asymptotic methods are preferred. The distribution of patients can be found in Table 7. This example demonstrates the capability of the proposed methods in handling bilateral data. The calculation time of the E + M method is excessively long; therefore, the corresponding p-value is omitted. All the p-values of the proposed methods displayed in Table 8 are less than 0.05, indicating a rejection of the null hypothesis, and the prevalences of the three groups are not the same.

8. Discussion

The individual site model assumes the measurements from sites of individuals are independent and does not consider the potential correlation between two sites of the same subject. Given prior knowledge of the nonexistence of the correlation, an investigator may choose to proceed with the individual site model and perform the Pearson chi-squared test or exact tests without doubt. However, it is often the case that there is no proven fact of such an assumption. As the correlation moves further from zero, the asymptotic and exact methods lose the type I error controls based on the numerical studies in Section 6. One may argue that the individual site model may carry out the hypothesis test using an appropriate method with acceptable TIEs falling in the interval of [ 0.04 , 0.06 ] . It is important to note that the equal correlation coefficients model also produces satisfactory TIE controls with a suitable approach. Therefore, the model taking into account the correlation takes precedence if the data collection involves paired sites of the same subject.
Given the priority of the equal correlation coefficients model, testing the homogeneity of group-specific proportions using the score test can be the first candidate method that researchers consider. Nevertheless, the asymptotic behavior of the score test is not guaranteed if the sample size is small. As a consequence, five exact tests are proposed, and their performance in terms of TIE controls and statistical powers was examined in different scenarios. When the sample size is relatively small as seen in Figure 1, Figure 2 and Figure 3, the E method generally performs well in controlling TIEs within 4% to 6%, while other methods are conservative with larger proportions in the contour plots below 4%. This superiority of the E method can also be seen in Figure A1 and Figure A2 where the sample size is relatively large. Regarding statistical powers, the E method is superior or similar to other methods for all sample sizes and parameter settings. For example, the E method has similar powers to the asymptotic method when $g = 2$ and separates from the other methods when $g = 3$ or 4 in Figure 7. Overall, the E method is recommended as it controls type I error in a satisfactory region and has higher powers than the other methods. The M, E + M, and CI methods generally produce comparable TIEs and powers, whereas the C method becomes the most conservative, specifically when the total sample size is approximately 10. Future work will include validation of the equal correlation coefficient assumption before performing the homogeneity test.

9. Conclusions

There are two aims of this article: first, to investigate and compare the two models with and without the assumption of correlation coefficients, and second, to propose exact methods to address the lack of type I error controls caused by poor approximations of the score test when the sample size is small. The numerical study indicates the relative appropriateness of the equal correlation model when analyzing paired binary data and highlights the superiority of the E method for exact tests of homogeneity of proportions.

Author Contributions

All authors contributed equally to the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

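Let $\pi_0$ and $\rho_0$ be the true parameters under the null hypothesis $H_0$, and note that $\mathrm{Prob}(T_{SC}(M) \ge T_{SC}(M^*) \mid \pi_0, \rho_0)$ is uniformly distributed on $[0, 1]$ under $H_0$. Following the definition of $P_{CI}(M^*)$, it is clear that

$$
\begin{aligned}
&\mathrm{Prob}(P_{CI}(M^*) \le \alpha \mid H_0) \\
&\quad= \mathrm{Prob}(P_{CI}(M^*) \le \alpha,\ \pi_0 \in CI_\pi,\ \rho_0 \in CI_\rho \mid H_0)
 + \mathrm{Prob}(P_{CI}(M^*) \le \alpha,\ \pi_0 \in CI_\pi,\ \rho_0 \notin CI_\rho \mid H_0) \\
&\qquad+ \mathrm{Prob}(P_{CI}(M^*) \le \alpha,\ \pi_0 \notin CI_\pi,\ \rho_0 \in CI_\rho \mid H_0)
 + \mathrm{Prob}(P_{CI}(M^*) \le \alpha,\ \pi_0 \notin CI_\pi,\ \rho_0 \notin CI_\rho \mid H_0) \\
&\quad\le \mathrm{Prob}(P_{CI}(M^*) \le \alpha,\ \pi_0 \in CI_\pi,\ \rho_0 \in CI_\rho \mid H_0)
 + \mathrm{Prob}(\rho_0 \notin CI_\rho \mid H_0)
 + \mathrm{Prob}(\pi_0 \notin CI_\pi \mid H_0)
 + \mathrm{Prob}(\pi_0 \notin CI_\pi \mid H_0) \\
&\quad= \mathrm{Prob}\Big( \sup_{\pi \in CI_\pi,\ \rho \in CI_\rho} \sum_{M \in \Omega_1(M^*)} I\big(T_{SC}(M) \ge T_{SC}(M^*)\big)\, L(M; \pi, \rho) + 3\beta \le \alpha,\ \pi_0 \in CI_\pi,\ \rho_0 \in CI_\rho \,\Big|\, H_0 \Big) \\
&\qquad+ \mathrm{Prob}(\rho_0 \notin CI_\rho \mid H_0) + 2 \times \mathrm{Prob}(\pi_0 \notin CI_\pi \mid H_0) \\
&\quad\le \mathrm{Prob}\Big( \mathrm{Prob}\big(T_{SC}(M) \ge T_{SC}(M^*) \mid \pi_0, \rho_0\big) + 3\beta \le \alpha,\ \pi_0 \in CI_\pi,\ \rho_0 \in CI_\rho \,\Big|\, H_0 \Big) \\
&\qquad+ \mathrm{Prob}(\rho_0 \notin CI_\rho \mid H_0) + 2 \times \mathrm{Prob}(\pi_0 \notin CI_\pi \mid H_0) \\
&\quad\le \mathrm{Prob}\Big( \mathrm{Prob}\big(T_{SC}(M) \ge T_{SC}(M^*) \mid \pi_0, \rho_0\big) \le \alpha - 3\beta \,\Big|\, H_0 \Big) + 3\beta \\
&\quad= \alpha - 3\beta + 3\beta = \alpha.
\end{aligned}
$$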
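
The construction used in this proof can be sketched numerically. The Python code below is a simplified illustration, assuming two groups under the independence model, so that the only nuisance parameter is the common proportion and the penalty is β rather than the 3β appearing above. The value β = 0.001, the Clopper–Pearson interval, the grid search that approximates the supremum, and all function names are assumptions made for this sketch rather than the article's implementation.

```python
from math import comb


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    return sum(binom_pmf(j, n, p) for j in range(k + 1))


def score_stat(x1, n1, x2, n2):
    """Pearson chi-square statistic for H0: pi_1 = pi_2 (0 if variance is 0)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    return 0.0 if var == 0 else (x1 / n1 - x2 / n2) ** 2 / var


def _bisect(g, target, tol=1e-10):
    """Solve g(p) = target on [0, 1] for an increasing function g."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, beta):
    """Exact (1 - beta) confidence interval for a binomial proportion."""
    lower = 0.0 if x == 0 else _bisect(lambda p: 1 - binom_cdf(x - 1, n, p), beta / 2)
    upper = 1.0 if x == n else _bisect(lambda p: -binom_cdf(x, n, p), -beta / 2)
    return lower, upper


def ci_p_value(x1, n1, x2, n2, beta=0.001, grid=200):
    """CI-type p-value: maximize the tail probability over a (1 - beta)
    confidence interval for the nuisance proportion, then add beta."""
    t_obs = score_stat(x1, n1, x2, n2)
    lo, hi = clopper_pearson(x1 + x2, n1 + n2, beta)  # pooled count under H0
    extreme = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
               if score_stat(a, n1, b, n2) >= t_obs - 1e-12]
    sup = 0.0
    for k in range(grid + 1):  # grid search only approximates the supremum
        pi = lo + (hi - lo) * k / grid
        tail = sum(binom_pmf(a, n1, pi) * binom_pmf(b, n2, pi) for a, b in extreme)
        sup = max(sup, tail)
    return min(1.0, sup + beta)


# Hypothetical site-level counts: 18 of 41 sites vs. 0 of 14 sites respond.
print(ci_p_value(18, 41, 0, 14))
```

Restricting the supremum to the confidence interval avoids maximizing over nuisance values that the data make implausible, and the added β pays for the chance that the interval misses the true value, which is the role played by the 3β term in the derivation.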

Appendix B

Figure A1. Contour plots of type I errors for the equal correlation coefficients model ($g = 2$, $m_i = 5$, $n_i = 5$, $i = 1, 2$).
Figure A2. Contour plots of type I errors for the equal correlation coefficients model ($g = 3$, $m_i = 4$, $n_i = 3$, $i = 1, 2, 3$).
Figure A3. Contour plots of type I errors for the individual site model ($g = 2$, $m_i = 5$, $n_i = 5$, $i = 1, 2$).
Figure A4. Contour plots of type I errors for the individual site model ($g = 3$, $m_i = 4$, $n_i = 3$, $i = 1, 2, 3$).
Figure A5. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.3$ and $\rho = 0.4$.
Figure A6. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.7$ and $\rho = 0.4$.
Figure A7. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.3$ and $\rho = 0.6$.
Figure A8. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.7$ and $\rho = 0.6$.

References

  1. Zhang, H.G.; Ying, G.S. Statistical approaches in published ophthalmic clinical science papers: A comparison to statistical practice two decades ago. Br. J. Ophthalmol. 2018, 102, 1188–1191. [Google Scholar] [CrossRef] [PubMed]
  2. Pearson, K.X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. London, Edinburgh, Dublin Philos. Mag. J. Sci. 1900, 50, 157–175. [Google Scholar] [CrossRef]
  3. Rosner, B. Statistical methods in ophthalmology: An adjustment for the intraclass correlation between eyes. Biometrics 1982, 38, 105–114. [Google Scholar] [CrossRef] [PubMed]
  4. Donner, A.; Banting, D. Analysis of site-specific data in dental studies. J. Dent. Res. 1988, 67, 1392–1395. [Google Scholar] [CrossRef] [PubMed]
  5. Donner, A. Statistical methods in ophthalmology: An adjusted chi-square approach. Biometrics 1989, 45, 605–611. [Google Scholar] [CrossRef] [PubMed]
  6. Ma, C.X.; Liu, S. Testing equality of proportions for correlated binary data in ophthalmologic studies. J. Biopharm. Stat. 2017, 27, 611–619. [Google Scholar] [CrossRef] [PubMed]
  7. Mou, K.; Li, Z. Homogeneity Test of Many-to-One Risk Differences for Correlated Binary Data under Optimal Algorithms. Complexity 2021, 2021, 6685951. [Google Scholar] [CrossRef]
  8. Liu, X.; Yang, Z.; Liu, S.; Ma, C.X. Exact methods of testing the homogeneity of prevalences for correlated binary data. J. Stat. Comput. Simul. 2017, 87, 3021–3039. [Google Scholar] [CrossRef]
  9. Liddell, D. Practical Tests of 2Times2 Contingency Tables. J. R. Stat. Soc. Ser. D (Stat.) 1976, 25, 295–304. [Google Scholar]
  10. Basu, D. On the Elimination of Nuisance Parameters. J. Am. Stat. Assoc. 1977, 72, 279–290. [Google Scholar] [CrossRef]
  11. Lloyd, C.J. Exact P-Values Discret. Model. Obtained Estim. Maximization. Aust. N. Z. J. Stat. 2008, 50, 329–345. [Google Scholar] [CrossRef]
  12. Ma, C.X.; Wang, H. Testing the equality of proportions for combined unilateral and bilateral data under equal intraclass correlation model. Stat. Biopharm. Res. 2022, 15, 608–617. [Google Scholar] [CrossRef]
  13. Fagerland, M.; Lydersen, S.; Laake, P. Statistical Analysis of Contingency Tables; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar]
  14. Rao, C.R. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Math. Proc. Camb. Philos. Soc. 1948, 44, 50–57. [Google Scholar]
  15. Berger, R.L.; Boos, D.D. P values maximized over a confidence set for the nuisance parameter. J. Am. Stat. Assoc. 1994, 89, 1012–1016. [Google Scholar] [CrossRef]
  16. Silvapulle, M.J. A test in the presence of nuisance parameters. J. Am. Stat. Assoc. 1996, 91, 1690–1693. [Google Scholar] [CrossRef]
  17. Vexler, A. Valid P-Values Expect. p-Values Revisited.Ann. Inst. Stat. Math. 2021, 73, 227–248. [Google Scholar] [CrossRef]
  18. Tang, N.S.; Tang, M.L.; Qiu, S.F. Testing the equality of proportions for correlated otolaryngologic data. Comput. Stat. Data Anal. 2008, 52, 3719–3729. [Google Scholar] [CrossRef]
  19. Mandel, E.M.; Bluestone, C.D.; Rockette, H.E.; BLATTER, M.M.; Reisinger, K.S.; Wucher, F.P.; Harper, J. Duration of effusion after antibiotic treatment for acute otitis media: Comparison of cefaclor and amoxicillin. Pediatr. Infect. Dis. J. 1982, 1, 310–316. [Google Scholar] [CrossRef] [PubMed]
  20. Liang, S.; Fang, K.T.; Huang, X.W.; Xin, Y.; Ma, C. Homogeneity Tests and Interval Estimations of Risk Differences for Stratified Bilateral and Unilateral Correlated Data. arXiv 2023, arXiv:2304.00162. [Google Scholar] [CrossRef]
  21. Lu, W.; Ning, R.; Diao, K.; Ding, Y.; Chen, R.; Zhou, L.; Lian, Y.; McAlinden, C.; Sanders, F.W.; Xia, F.; et al. Comparison of two main orthokeratology lens designs in efficacy and safety for myopia control. Front. Med. 2022, 9, 798314. [Google Scholar] [CrossRef]
  22. Rose, L.V.; Schulz, A.M.; Graham, S.L. Use baseline axial length measurements in myopic patients to predict the control of myopia with and without atropine 0.01%. PLoS ONE 2021, 16, e0254061. [Google Scholar] [CrossRef] [PubMed]
  23. Berson, E.L.; Rosner, B.; Simonoff, E. Risk factors for genetic typing and detection in retinitis pigmentosa. Am. J. Ophthalmol. 1980, 89, 763–775. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Contour plots of type I errors for the equal correlation coefficients model ($g = 2$, $m_i = 3$, $n_i = 2$, $i = 1, 2$).
Figure 2. Contour plots of type I errors for the equal correlation coefficients model ($g = 3$, $m_i = 3$, $n_i = 1$, $i = 1, 2, 3$).
Figure 3. Contour plots of type I errors for the equal correlation coefficients model ($g = 4$, $m_i = 2$, $n_i = 1$, $i = 1, 2, 3, 4$).
Figure 4. Contour plots of type I errors for the individual site model ($g = 2$, $m_i = 3$, $n_i = 2$, $i = 1, 2$).
Figure 5. Contour plots of type I errors for the individual site model ($g = 3$, $m_i = 3$, $n_i = 1$, $i = 1, 2, 3$).
Figure 6. Contour plots of type I errors for the individual site model ($g = 4$, $m_i = 2$, $n_i = 1$, $i = 1, 2, 3, 4$).
Figure 7. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.3$ and $\rho = 0.4$.
Figure 8. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.7$ and $\rho = 0.4$.
Figure 9. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.3$ and $\rho = 0.6$.
Figure 10. Power plots for $\pi_1 = \cdots = \pi_{g-1} = 0.7$ and $\rho = 0.6$.
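Table 1. Data layout on subject level.

| | Group ($i$) | 1 | 2 | ⋯ | $g$ | Total |
|---|---|---|---|---|---|---|
| Bilateral | Response ($r$) = 0 | $m_{10}$ | $m_{20}$ | ⋯ | $m_{g0}$ | $M_0$ |
| | Response ($r$) = 1 | $m_{11}$ | $m_{21}$ | ⋯ | $m_{g1}$ | $M_1$ |
| | Response ($r$) = 2 | $m_{12}$ | $m_{22}$ | ⋯ | $m_{g2}$ | $M_2$ |
| | Total | $m_1$ | $m_2$ | ⋯ | $m_g$ | $M_+$ |
| Unilateral | Response ($r^*$) = 0 | $n_{10}$ | $n_{20}$ | ⋯ | $n_{g0}$ | $N_0$ |
| | Response ($r^*$) = 1 | $n_{11}$ | $n_{21}$ | ⋯ | $n_{g1}$ | $N_1$ |
| | Total | $n_1$ | $n_2$ | ⋯ | $n_g$ | $N_+$ |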
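Table 2. Data layout on site level.

| Group ($i$) | 1 | 2 | ⋯ | $g$ | Total |
|---|---|---|---|---|---|
| Response ($e$) = 0 | $x_{10}$ | $x_{20}$ | ⋯ | $x_{g0}$ | $T_0$ |
| Response ($e$) = 1 | $x_{11}$ | $x_{21}$ | ⋯ | $x_{g1}$ | $T_1$ |
| Total | $x_1$ | $x_2$ | ⋯ | $x_g$ | $T$ |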
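
The relation between the subject-level layout (Table 1) and the site-level layout (Table 2) can be sketched in a few lines of Python. The mapping below is an assumption that follows from counting organs: a bilateral subject contributes two sites and a unilateral subject contributes one. The function name `to_site_level` is hypothetical, and the example counts are those read from the VST and CRT rows of Table 5.

```python
def to_site_level(m, n):
    """Collapse one group's subject-level counts into site-level counts.

    m = (m_i0, m_i1, m_i2): bilateral subjects with 0, 1, 2 responding sites.
    n = (n_i0, n_i1): unilateral subjects with 0, 1 responding sites.
    Returns (x_i0, x_i1): sites without and with response.
    """
    m0, m1, m2 = m
    n0, n1 = n
    x1 = m1 + 2 * m2 + n1  # responding sites
    x0 = 2 * m0 + m1 + n0  # non-responding sites
    return x0, x1


# Counts read from Table 5 (VST and CRT designs).
for design, m, n in [("VST", (9, 3, 7), (2, 1)), ("CRT", (7, 0, 0), (0, 0))]:
    x0, x1 = to_site_level(m, n)
    print(design, "sites without response:", x0, "sites with response:", x1)
```

Collapsing to sites discards which sites belong to the same subject, which is why the site-level layout fits only the model that treats paired organs as independent.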
Table 3. The distribution of children aged ≥ 6 years at 14 days by treatment, number of cured ears, and disease status at study entry.

| Treatment | Bilateral at Entry: 0 Cured Ears | Bilateral at Entry: 1 Cured Ear | Bilateral at Entry: 2 Cured Ears | Unilateral at Entry: 0 Cured Ears | Unilateral at Entry: 1 Cured Ear |
|---|---|---|---|---|---|
| Cefaclor | 0 | 1 | 3 | 8 | 11 |
| Amoxicillin | 1 | 0 | 6 | 7 | 11 |
Table 4. p-values using different approaches for example 1.

| Approach | p-Value |
|---|---|
| A | 0.6629 |
| E | 0.6712 |
| M | 0.7218 |
| E + M | 0.8608 |
| C | 0.7667 |
| CI | 0.7197 |
Table 5. The number of subjects by design and number of eyes demonstrating response after treatment.

| Design | Bilateral: 0 Eyes with Response | Bilateral: 1 Eye with Response | Bilateral: 2 Eyes with Response | Unilateral: 0 Eyes with Response | Unilateral: 1 Eye with Response |
|---|---|---|---|---|---|
| VST | 9 | 3 | 7 | 2 | 1 |
| CRT | 7 | 0 | 0 | 0 | 0 |
Table 6. p-values using different approaches for example 2.

| Approach | p-Value |
|---|---|
| A | 0.0188 |
| E | 0.0171 |
| M | 0.0220 |
| E + M | 0.0249 |
| C | 0.0265 |
| CI | 0.0219 |
Table 7. The number of patients by genetic type and number of affected eyes.

| Design | Bilateral: 0 Eyes with Response | Bilateral: 1 Eye with Response | Bilateral: 2 Eyes with Response | Unilateral: 0 Eyes with Response | Unilateral: 1 Eye with Response |
|---|---|---|---|---|---|
| DOM | 15 | 6 | 7 | 0 | 0 |
| AR | 7 | 5 | 9 | 0 | 0 |
| SL | 3 | 2 | 14 | 0 | 0 |
Table 8. p-values using different approaches for example 3.

| Approach | p-Value |
|---|---|
| A | 0.0048 |
| E | 0.0042 |
| M | 0.0044 |
| E + M | NA |
| C | 0.0045 |
| CI | 0.0047 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liang, S.; Ma, C. Comparative Analysis of Exact Methods for Testing Equivalence of Prevalences in Bilateral and Unilateral Combined Data with and without Assumptions of Correlation. Axioms 2024, 13, 430. https://doi.org/10.3390/axioms13070430
