Technical Note

The Epidemiologic Comparison of Two Correlated Relative Risks: A Simple but Efficient Clinical Trial Design for Assessing Risk-Reduction and Treatment Significance

1 VA Cooperative Studies Program Coordinating Center, 2 Avenue de Lafayette, Boston, MA 02111, USA
2 Department of Radiation Oncology, School of Medicine, Case Western Reserve University, Cleveland, OH 44206, USA
3 Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
4 Provider Services, Signify Health, 4055 (S-700) Valley View Ln, Dallas, TX 75244, USA
5 Division of Mathematics, Analytics, Science, and Technology, Babson College, Wellesley, MA 02457, USA
* Author to whom correspondence should be addressed.
Medicina 2026, 62(1), 70; https://doi.org/10.3390/medicina62010070
Submission received: 27 July 2025 / Revised: 18 November 2025 / Accepted: 23 December 2025 / Published: 29 December 2025
(This article belongs to the Section Epidemiology & Public Health)

Abstract

In the context of platform design, umbrella trials are a type of master protocol in which multiple treatments are randomized and evaluated with respect to a common referent-control arm. In a simplified (1:1:1), non-adaptive case, this is equivalent to the epidemiologic comparison of two correlated relative risks for assessing risk-reduction on the log-difference scale. The use of shared controls has the potential to reduce study time and costs but introduces greater complexity to account for the covariance between relative effect estimates (i.e., dependence arising because treatment arms use the same referent group). Multiplicity adjustment is unnecessary on the log-difference scale (three-group design), as the analysis reduces to a single test statistic for interaction. An intuitive, risk-reduction (LDL-C statin) example is presented to illustrate the practical application of this method.

1. Introduction

An efficient and cost-effective clinical trial design involves comparing two or more treatment groups to a single, shared referent-control arm. This reduces the overall number of patients randomized to a comparison group by half, offering greater appeal to patients, drug companies, and investors [1]. In the basic instance of a new versus legacy therapy evaluated relative to a common control group, this epidemiologic approach is practical and methodologically rigorous. A key advantage of the (1:1:1) design is the simultaneous assessment of risk-reduction and treatment significance.
Parallel drug comparisons of this type are known as master-protocol platform studies, with the term “umbrella trial” denoting the assignment of patients to one of many therapies, typically based on their molecular characteristics [2]. When aptly applied, this strategy has clear and direct implications for real-world clinical trial practice, culminating in reduced study time, costs, and bias.
The shared reference group usually consists of a mutually distinct, random collection of patients, but it can also be assembled from pooled controls of previously conducted clinical trials, historic collections, or registries. In the case of an observational, targeted trial emulation, controls need to meet the eligibility criteria at the time of case definition [3]. Regardless of study type, the sharing of controls is appropriate only if treatment arms are comparable with respect to the referent base population in terms of key attributes (e.g., time period, geographic location, and demographic makeup).
An important feature of this design is the correlated structure of the data, requiring the incorporation of a covariance term into the denominator of the test statistic. The need to account for multiplicity adjustment is offset in the simple, three-group (risk-reduction) design, as the analysis is based on a single log-difference test for interaction [4]. A generalized linear model (GLM) with a log-link function may be used to analyze the correlated data, which can accommodate both univariable and adjusted models for risk-reduction [5].

2. Preliminaries

2.1. Definition of Two Correlated Relative Risks in Terms of a 3 × 2 Cross-Tabulation Table

Let \(a_{ij}\) denote the \(i\)th, \(j\)th cell sizes for a contingency table with \(i = 1\) to \(3\) rows and \(j = 1\) to \(2\) columns (see Table 1). Here, the rows represent the treatment exposures \(T_0, T_1, T_2\), while the columns correspond to an unfavorable vs. favorable outcome in the context of risk-reduction. Define the two relative risk (RR) estimates being compared with respect to a common referent-control group \(T_0\) as follows:
\[
\widehat{RR}_1 = \frac{a_{11}/(a_{11}+a_{12})}{a_{31}/(a_{31}+a_{32})} = \frac{a_{11}/a_{1+}}{a_{31}/a_{3+}} = \frac{\hat{\pi}_0}{\hat{\pi}_2}
\]
and
\[
\widehat{RR}_2 = \frac{a_{11}/(a_{11}+a_{12})}{a_{21}/(a_{21}+a_{22})} = \frac{a_{11}/a_{1+}}{a_{21}/a_{2+}} = \frac{\hat{\pi}_0}{\hat{\pi}_1},
\]
where \(a_{11}\), \(a_{21}\), and \(a_{31}\) are binomially distributed variables. Accordingly, \(\widehat{RR}_1\) represents the ratio of the event probability under \(T_0\) to that under \(T_2\), while \(\widehat{RR}_2\) compares \(T_0\) to \(T_1\).

2.2. Test Statistic and p-Value

Consider the following sample test statistic \(\hat{\zeta}_{RRR}\) for comparing the ratio of two correlated RR estimates on the log-difference scale (i.e., \(\log \widehat{RR}_1 - \log \widehat{RR}_2 = \log \widehat{RRR}\)), with respect to a common referent-control group:
\[
\hat{\zeta}_{RRR} = \frac{\log\left(\widehat{RR}_1/\widehat{RR}_2\right) - E\left[\log\left(\widehat{RR}_1/\widehat{RR}_2\right)\right]}{\sqrt{\operatorname{Var}\left(\log \widehat{RR}_1\right) + \operatorname{Var}\left(\log \widehat{RR}_2\right) - 2\operatorname{Cov}\left(\log \widehat{RR}_1, \log \widehat{RR}_2\right)}},
\]
where the denominator estimates the square root of the variance of \(\log(\widehat{RR}_1/\widehat{RR}_2)\). \(E(\cdot)\), \(\operatorname{Var}(\cdot)\), and \(\operatorname{Cov}(\cdot)\) denote the expectation, variance, and covariance of the quantities indicated within parentheses. The expectation of the log-difference, i.e., \(E[\log(\widehat{RR}_1/\widehat{RR}_2)]\), equals zero under the null hypothesis (H0) that the two RRs are equivalent (i.e., \(RR_1 = RR_2\)), versus the alternative (H1) that they differ (i.e., \(RR_1 \neq RR_2\)).
When the sample sizes for each group \(\left(n_{\widehat{RR}_1}, n_{\widehat{RR}_2}\right)\) are sufficiently large, \(\widehat{RR}_1\) and \(\widehat{RR}_2\) will tend to be conditionally independent, and one can simply estimate the denominator of \(\hat{\zeta}_{RRR}\) by \(\sqrt{SE\left(\log \widehat{RR}_1\right)^2 + SE\left(\log \widehat{RR}_2\right)^2}\), where \(SE(\cdot)\) denotes the standard error. That is, as \(n_{\widehat{RR}_1}, n_{\widehat{RR}_2} \to \infty\), sample estimates become more reliable and stable, with the covariance term approaching zero and \(\hat{\zeta}_{RRR}\) assuming a standard normal distribution. However, for smaller sample sizes, the relative effect estimates are less accurate and more prone to random fluctuation (variability). To warrant approximate normality of the test statistic for reasonably sized, non-asymptotic samples, it remains important to account for the sample covariance term in the denominator.
The corresponding p-value for interaction \(\left(P_{Int}\right)\) is computed as follows:
\[
P_{Int} \approx 2 \cdot \left[1 - \Phi\left(\hat{\zeta}_{RRR}\right)\right],
\]
where
\[
\Phi\left(\hat{\zeta}_{RRR}\right) = \int_{-\infty}^{\hat{\zeta}_{RRR}} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx.
\]
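The full calculation is short enough to sketch in a few lines of Python (an illustration only; the analysis in this paper was performed in SAS). The sketch applies the delta-method variance and covariance estimates derived in Section 2.3.2 to the cell counts of the worked example in Table 1:

```python
from math import log, sqrt, erf

def norm_sf2(z):
    """Two-sided p-value from a standard normal test statistic."""
    return 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))

def zeta_rrr(a11, a12, a21, a22, a31, a32):
    """Test statistic for the ratio of two correlated relative risks,
    where row (a11, a12) is the shared referent-control arm T0."""
    n1, n2, n3 = a11 + a12, a21 + a22, a31 + a32   # row totals a_{1+}, a_{2+}, a_{3+}
    rr1 = (a11 / n1) / (a31 / n3)                  # RR1: T0 vs. T2
    rr2 = (a11 / n1) / (a21 / n2)                  # RR2: T0 vs. T1
    var1 = 1/a11 - 1/n1 + 1/a31 - 1/n3             # delta-method Var(log RR1)
    var2 = 1/a11 - 1/n1 + 1/a21 - 1/n2             # delta-method Var(log RR2)
    cov = 1/a11 - 1/n1                             # Cov from the shared referent row
    z = log(rr1 / rr2) / sqrt(var1 + var2 - 2 * cov)
    return z, norm_sf2(z)

# Cell counts from the worked example (Table 1)
z, p = zeta_rrr(54, 6, 48, 12, 36, 24)   # z ~ 2.3275, p ~ 0.01994
```

With these counts, the function reproduces the test statistic and interaction p-value reported in Section 4.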

2.3. Analytic Details

2.3.1. Delta Approximation

The “Delta (δ) method” provides a first-order Taylor series approximation for the variance of a function involving random variables with known, finite moments [6]. By defining \(y\) as a function of two variables \(x_1\) and \(x_2\), each with a small coefficient of variation, it follows that
\[
\operatorname{Var}(y) \approx \left(\frac{\partial y}{\partial x_1}\right)^2 \operatorname{Var}(x_1) + 2\left(\frac{\partial y}{\partial x_1}\right)\left(\frac{\partial y}{\partial x_2}\right)\operatorname{Cov}(x_1, x_2) + \left(\frac{\partial y}{\partial x_2}\right)^2 \operatorname{Var}(x_2),
\]
where the partial derivatives of \(y\) with respect to \(x_1\) and \(x_2\) are evaluated at their respective mean values. The method assumes a local linear approximation of the underlying response surface, with optimal results realized when the function assumes an asymptotically Gaussian distribution.
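As a quick numeric sanity check (an illustrative sketch, not part of the paper's SAS workflow), the first-order delta approximation \(\operatorname{Var}(\log \hat{\pi}) \approx (1-\pi)/(n\pi)\) for a single binomial arm can be compared against a Monte Carlo estimate; \(n = 60\) and \(\pi = 0.9\) mirror the referent arm of the worked example:

```python
import random
from math import log

random.seed(1)
n, pi = 60, 0.9                 # one trial arm: n patients, event probability pi
reps = 100_000

vals = []
for _ in range(reps):
    x = sum(random.random() < pi for _ in range(n))   # draw X ~ Binomial(n, pi)
    vals.append(log(x / n))                           # log of the observed proportion

mean = sum(vals) / reps
mc_var = sum((v - mean) ** 2 for v in vals) / (reps - 1)  # Monte Carlo variance
delta_var = (1 - pi) / (n * pi)                           # first-order delta approximation
```

For this sample size the two estimates agree closely, with the Monte Carlo value running slightly higher, consistent with the small upward discrepancy between the simulated and delta-method variances seen later in Table 2.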

2.3.2. Sample Variance and Covariance Estimates

Applying the δ-method, it is easily seen that
\[
\operatorname{Var}\left(\log \widehat{RR}_1\right) \approx \left(\frac{a_{1+}}{a_{11}}\right)^2 \frac{\dfrac{a_{11}}{a_{1+}}\left(1 - \dfrac{a_{11}}{a_{1+}}\right)}{a_{1+}} + \left(\frac{a_{3+}}{a_{31}}\right)^2 \frac{\dfrac{a_{31}}{a_{3+}}\left(1 - \dfrac{a_{31}}{a_{3+}}\right)}{a_{3+}} = \frac{1}{a_{11}} - \frac{1}{a_{1+}} + \frac{1}{a_{31}} - \frac{1}{a_{3+}}.
\]
Accordingly,
\[
SE\left(\log \widehat{RR}_1\right) \approx \sqrt{\frac{1}{a_{11}} - \frac{1}{a_{1+}} + \frac{1}{a_{31}} - \frac{1}{a_{3+}}}.
\]
Likewise,
\[
SE\left(\log \widehat{RR}_2\right) \approx \sqrt{\frac{1}{a_{11}} - \frac{1}{a_{1+}} + \frac{1}{a_{21}} - \frac{1}{a_{2+}}}.
\]
As the only common term for \(\widehat{RR}_1\) and \(\widehat{RR}_2\) is \(\hat{\pi}_0 = a_{11}/a_{1+}\), it readily follows from the δ-method that
\[
\operatorname{Cov}\left(\log \widehat{RR}_1, \log \widehat{RR}_2\right) \approx \operatorname{Var}\left(\log \hat{\pi}_0\right) \approx \frac{\operatorname{Var}\left(\hat{\pi}_0\right)}{\hat{\pi}_0^2} = \frac{\hat{\pi}_0\left(1 - \hat{\pi}_0\right)}{\hat{\pi}_0^2\, a_{1+}} = \frac{1}{a_{11}}\left(1 - \frac{a_{11}}{a_{1+}}\right) = \frac{1}{a_{11}} - \frac{1}{a_{1+}}.
\]
This result reflects the variance component attributable to the shared, referent-control arm (i.e., the first two terms of V a r l o g R R 1 ^ ) .

2.4. Comparison of Two Correlated Odds Ratios

Rather than RRs, the relative effect measure of interest may be odds ratios (ORs). An important aspect of ORs is that the estimate is “invariant to rotation”, meaning that the disease and exposure ORs are equivalent. Importantly, ORs are a versatile measure of association, whether analyzing incidence-density or cumulative-incidence studies. When the outcome is rare in both the exposed and unexposed groups, the RR and OR estimates are approximately equal. Independent of the rare disease assumption, ORs are also fairly accurate estimates for rate ratios when the proportion of the population exposed and disease incidence remain constant over time.
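The rare-disease approximation is easy to illustrate numerically (hypothetical outcome probabilities, not data from this study):

```python
def rr_and_or(p1, p0):
    """Relative risk and odds ratio for two outcome probabilities."""
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return rr, odds_ratio

rare = rr_and_or(0.010, 0.005)   # rare outcome: OR closely tracks RR
common = rr_and_or(0.90, 0.60)   # common outcome: OR overstates RR (6.0 vs. 1.5)
```

With a rare outcome the two measures nearly coincide (RR = 2.00 vs. OR ≈ 2.01), whereas with the common outcome of the worked example (risks 0.90 vs. 0.60) the OR of 6.0 greatly exceeds the RR of 1.5.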
Analogous to the test statistic for correlated RR estimates, two ORs may be compared with respect to a common referent-control group on the log-difference scale. That is, with the covariance term implicit in the denominator,
\[
\hat{\zeta}_{ROR} = \frac{\log \widehat{OR}_1 - \log \widehat{OR}_2}{\sqrt{\operatorname{Var}\left(\log\left(\widehat{OR}_1/\widehat{OR}_2\right)\right)}},
\]
where the subscript in \(\hat{\zeta}_{ROR}\) denotes the log-ratio of \(\widehat{OR}_1\) and \(\widehat{OR}_2\). Again, using the 3 × 2 cross-tabulation table notation, the two sample ORs being compared with respect to a common referent-control group are defined as follows:
\[
\widehat{OR}_1 = \frac{a_{11}/a_{12}}{a_{31}/a_{32}}
\]
and
\[
\widehat{OR}_2 = \frac{a_{11}/a_{12}}{a_{21}/a_{22}}.
\]
Applying the δ-method and rearranging, we have the following [7]:
\[
\operatorname{Var}\left(\log \widehat{OR}_1\right) \approx \frac{1}{a_{11}} + \frac{1}{a_{12}} + \frac{1}{a_{31}} + \frac{1}{a_{32}},
\]
\[
\operatorname{Var}\left(\log \widehat{OR}_2\right) \approx \frac{1}{a_{11}} + \frac{1}{a_{12}} + \frac{1}{a_{21}} + \frac{1}{a_{22}},
\]
and
\[
\operatorname{Cov}\left(\log \widehat{OR}_1, \log \widehat{OR}_2\right) \approx \frac{1}{a_{11}} + \frac{1}{a_{12}}.
\]
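As with the RR version, the OR comparison can be sketched in a few lines of Python (illustrative only; the counts are the Table 1 cells, and the result matches the OR figures quoted in Section 4):

```python
from math import log, sqrt, erf

def zeta_ror(a11, a12, a21, a22, a31, a32):
    """Test statistic for the ratio of two correlated odds ratios,
    where row (a11, a12) is the shared referent-control arm T0."""
    or1 = (a11 / a12) / (a31 / a32)                # OR1: T0 vs. T2
    or2 = (a11 / a12) / (a21 / a22)                # OR2: T0 vs. T1
    var1 = 1/a11 + 1/a12 + 1/a31 + 1/a32           # delta-method Var(log OR1)
    var2 = 1/a11 + 1/a12 + 1/a21 + 1/a22           # delta-method Var(log OR2)
    cov = 1/a11 + 1/a12                            # Cov from the shared referent row
    z = log(or1 / or2) / sqrt(var1 + var2 - 2 * cov)
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))   # two-sided normal p
    return or1, or2, z, p

or1, or2, z_or, p_or = zeta_ror(54, 6, 48, 12, 36, 24)   # Table 1 cells
```

For the worked example this yields \(\widehat{OR}_1 = 6.00\), \(\widehat{OR}_2 = 2.25\), and an interaction p-value near 0.0186.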

2.5. Multinomial Distribution and Simulated Exact Statistics

Parameter estimates and the underlying distribution for ζ ^ R R R may be easily obtained by simulating observations from a multinomial distribution [8,9]. In effect, this provides exact statistics, which may be preferable when the sample sizes are very small and the normality of the test statistic is questionable. The simulated values are also useful for validating the large sample variance and covariance estimates obtained by the δ-method.
Conditioning on the total number of patients \(n\), the probability that a mutually exclusive set of \(k\) non-negative random variates \(Z_1, Z_2, \ldots, Z_k\) takes on a particular value \((z_1, z_2, \ldots, z_k)\) is given as follows:
\[
P_n\left(Z_1 = z_1, Z_2 = z_2, \ldots, Z_k = z_k\right) = \frac{n!}{\prod_{i=1}^{k} z_i!} \prod_{i=1}^{k} P\left(Z_i = z_i\right)^{z_i},
\]
where
\[
\sum_{i=1}^{k} P\left(Z_i = z_i\right) = 1, \qquad \sum_{i=1}^{k} z_i = n, \qquad 0 < P\left(Z_i = z_i\right) < 1.
\]
The estimated probability for the \(i\)th, \(j\)th cell of a 3 × 2 multinomial table for comparing two correlated relative effect estimates is given as \(a_{ij}/n\).
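The simulation is straightforward to reproduce on a smaller scale. The sketch below is a pure-Python illustration using 20,000 replicates rather than the 10 million used in the paper (the paper's simulation was performed in SAS PROC IML); with a fixed seed, the empirical variance and covariance of the log-RRs land near the delta-method values of Section 2.3.2:

```python
import random
from math import log
from bisect import bisect_left

random.seed(7)
cells = [54, 6, 48, 12, 36, 24]      # (a11, a12, a21, a22, a31, a32) from Table 1
n = sum(cells)                       # 180 patients in total

# Cumulative cell probabilities a_ij / n for categorical sampling
cum, running = [], 0.0
for c in cells:
    running += c / n
    cum.append(running)
cum[-1] = 1.0                        # guard against floating-point round-off

def draw_table():
    """Draw one 3 x 2 table from the multinomial distribution."""
    counts = [0] * 6
    for _ in range(n):
        counts[bisect_left(cum, random.random())] += 1
    return counts

log_rr1, log_rr2 = [], []
for _ in range(20_000):              # far fewer replicates than the paper's 10 million
    a11, a12, a21, a22, a31, a32 = draw_table()
    log_rr1.append(log((a11 / (a11 + a12)) / (a31 / (a31 + a32))))
    log_rr2.append(log((a11 / (a11 + a12)) / (a21 / (a21 + a22))))

def var(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

def cov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / (len(u) - 1)

v1, v2, c12 = var(log_rr1), var(log_rr2), cov(log_rr1, log_rr2)
```

The empirical estimates track the simulated exact statistics reported later in Table 2 (Var ≈ 0.0136 and 0.0062; Cov ≈ 0.0019), up to Monte Carlo noise from the reduced replicate count.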

3. Computational Methods

Analyses were performed and validated in SAS 9.4 (Cary, NC, USA). The GENMOD procedure for implementing GLMs was used to compute RRs (log-link function), ORs (logit-link function), and respective p-values for interaction. The interactive matrix language procedure (PROC IML) was used to simulate values from a multinomial distribution.

4. Relative Risk-Reduction Example

Consistently high blood levels of low-density lipoprotein cholesterol (LDL-C) underlie a condition known as atherosclerotic cardiovascular disease (ASCVD), which manifests as the accumulation of plaque in the arteries. Cardiologists recommend achieving LDL-C levels of less than 70 mg/dL following therapeutic intervention. A randomized clinical trial was undertaken to assess the benefit of a new (molecularly targeted) statin drug \(T_2\) over a previously approved (standard) agent \(T_1\), by way of a common referent-control arm of diet and exercise \(T_0\) (Table 1). A 33% relative risk-reduction on the log-difference scale was observed following 18 months on the new statin therapy combined with diet and exercise (\(P_{Int}\) = 0.01994), versus standard treatment. In comparison, the estimated OR reduction on the log-difference scale was 167%, with a p-value for interaction of 0.01857, illustrating an exaggerated result for the ratio of ORs, owing to the frequent outcome event.
The manually obtained results provided in Table 1 are easily validated within rounding error against the PROC GENMOD output shown in Appendix A. For example,
\[
\operatorname{Var}\left(\log \widehat{RR}_1\right) = 0.1139^2 = 0.01297,
\]
\[
\operatorname{Var}\left(\log \widehat{RR}_2\right) = 0.0776^2 = 0.00602,
\]
\[
\operatorname{Cov}\left(\log \widehat{RR}_1, \log \widehat{RR}_2\right) = \frac{0.01297 + 0.00602 - 0.1236^2}{2} = 0.00186,
\]
and
\[
\log\left(\frac{\widehat{RR}_1}{\widehat{RR}_2}\right) = \log\left(\frac{1.5000}{1.1250}\right) = \log(1.3333) = 0.28768.
\]
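The covariance back-calculation from the GENMOD-reported standard errors is simple arithmetic, sketched here for illustration:

```python
# Standard errors reported by PROC GENMOD for log RR1, log RR2, and log RRR
se_rr1, se_rr2, se_rrr = 0.1139, 0.0776, 0.1236

# Var(log RRR) = Var(log RR1) + Var(log RR2) - 2*Cov, so solving for Cov:
cov = (se_rr1**2 + se_rr2**2 - se_rrr**2) / 2    # ~ 0.00186
```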

5. Simulated Exact Results

A multinomial distribution was used to obtain 10 million 3 × 2 tables based on the cell values in Table 1, with the frequency (histogram) plot for the test statistic shown in Figure 1. Given the relatively small sample size of 60 patients per arm, this is seen to follow a slightly skewed standard normal distribution (compared with solid Gaussian line), with non-zero skewness (0.18678) and kurtosis (0.19295). In this specific case, the distribution is right (positive)-skewed, wherein the tail is longer on the right side. Also note the higher peak of the simulated distribution, representing the value with the greatest probability.
The simulated exact results are given in Table 2. The area to the right of the test statistic under this distribution gives the simulated exact p-value for the interaction. Comparatively, the variance and covariance estimates obtained by the δ-method are reasonably close to the simulated exact statistics. Given the mild, right skewness of the simulated distribution, the p-value for interaction obtained by the exact method (i.e., 0.01566) is slightly more significant than that obtained by the normal theory method (i.e., 0.01994). However, this may not always be the situation for other examples, and one cannot assume that the normal theory result will universally yield the more conservative p-value.

6. Discussion

6.1. Overview

Master protocol platform studies embody an optimal approach for conducting parallel clinical trials. Efficiencies are gained by using a shared control arm, which introduces a correlated structure to the data. The manuscript at hand focuses on a non-adaptive (1:1:1) design. This simplified approach is equivalent to the assessment of two relative-risk estimates, with each epidemiologic measure depending on a collective denominator (single comparison group). When the disease outcome is rare among the two drugs being compared, a clinical trial may be emulated as a population-based, retrospective design with a common referent group.
For moderately sized studies, as demonstrated with an example of n = 180 patients (60 per arm), the (1:1:1) design is reasonably robust to departures from normality and can be easily analyzed as a GLM with a log-link function for RRs, which transforms the mean (µ) to log(µ). This is in contrast with the logit-link function for ORs, which transforms the probability (µ) to the log-odds. The single p-value for interaction offsets the need for multiplicity adjustment. Furthermore, the procedure is readily implemented using standard statistical software, with the option of including covariates to account for confounding.
In addition to clinical trials, the analysis of correlated data occurs in many epidemiologic settings (e.g., matched-pairs designs, pre- and post-studies, twin research, and cross-over investigations) [8]. A 2 × 2 × 2 three-way contingency design entailing the analysis of two paired binomial responses (measured on two treatments) has been used to compare side-effects of general anesthesia [10]. An overlapping collection of ~3000 controls has been implemented in a large genome-wide association study of ~2000 cases for each of seven diseases [11]. While methods to correct for a common referent-control group in association studies have been amply described in the literature, the emphasis has mainly been on correlated proportions and ORs [12,13]. In contrast, the current effort focuses on measuring the risk-reduction encountered when comparing two correlated RR estimates with a shared, dependent control arm. As a marginal method (versus ORs), the technique is appropriate to use for common diseases and prospective clinical analyses.

6.2. Advantages

A commonly employed clinical trial design involves directly comparing a new treatment against a standard agent and assessing the absolute effect risk-reduction (i.e., event rate difference between groups). However, the results may be biased because this design does not compare therapies with respect to a common, referent-control arm. For example, it may be challenging to prove the clinical usefulness of the new treatment if the standard drug was approved many years ago and is no longer as effective, owing to manufacturing changes, practice deviations, interactions with newer concomitant medications, and variations in the disease process. Assessing relative effect risk-reduction on a log-difference scale, using a shared referent group, effectively minimizes this threat to internal validity.
Another advantage of this design is accelerated drug development (i.e., shorter study time) and reduced resource allocation. Minimizing redundant control groups decreases the sample size of a study while increasing study power.

6.3. Limitations

A limitation of the normal theory approach is that the sample test statistic relies on the δ-method for obtaining variance and covariance estimates. This technique assumes that the underlying function has continuous partial derivatives (to third order) and that the resulting estimator assumes an asymptotic Gaussian form [14]. In most computer applications, only a first-order Taylor series approximation is used to derive variance estimates. Consequently, the asymptotic normality assumption may be questionable in small sample cases, with the conditional central limit theorem failing to hold true [15]. Increasing the sample size or using a second-order or higher Taylor series approximation may help to reduce bias. The exact method may need to be used if the data are particularly sparse or the rate of approaching normality is ostensibly slower than \(O(1/n)\). Barring extreme degenerate examples, the test statistic in practice typically assumes a moderately well-behaved, bell-shaped distribution (for treatment groups of ~60 or more participants), and one may posit that the underlying data are sufficiently close in form to safely proceed with normal theory methods.
Under certain circumstances, convergence issues with the Bernoulli likelihood may occur when implementing the GLM approach with a log-link function. That is, the estimated probability of success vis-à-vis the Newton–Raphson algorithm may fall on or near the boundary of the parameter space (i.e., unity) [10]. One solution is to assume a Poisson likelihood and use the robust “generalized estimation equations method” to estimate the variance [16]. However, since Poisson regression allows predicted probabilities to exceed one, resulting confidence intervals will be slightly biased. An alternative workaround is to obtain estimates by applying the expectation–maximization (EM) algorithm [17]. The exact method based on re-parametrization of covariates is another promising approach. However, a rate-limiting aspect of the latter method is that the covariate vector of fitted probabilities equals unity and needs to be confirmed in advance [18].
The potential for misclassification bias may be a concern if shared controls are using either of the drugs under consideration at baseline, or begin their use after randomization (immortal time effect) [19]. Protocol deviations of this type must be carefully monitored during the course of the study and appropriately accounted for in the statistical analysis and reporting of results. Investigators should be vigilant of concomitant medications that may either intensify or diminish the referent effect. The selection of controls in a non-random fashion, or from a hospital source related to the outcome measure, poses another limitation that can be exacerbated with the use of a single-arm control group.
While reducing the time and cost of a study, combining external controls with randomized trial data can introduce complications, such as unmeasured confounding and collider bias. Chronological bias represents another concern, wherein aspects of the control group may change over time because of dependent temporal effects (e.g., practice changes, staff learning, unobserved time trends) [20]. Furthermore, the randomization of participants to the two treatment groups versus a common control arm “does not preclude confounding except for extremely large studies” [21].
Early termination or the censoring of participants poses a source of bias if differential in effect. Appropriate imputation methods suitable for correlated data may need to be implemented in such cases. State transition models or a counting process approach present other options for mitigating bias. When the effect is believed to be non-differential, the sample size may be increased in an adaptive fashion to offset bias toward the null. Deleting the affected participants from the analysis, if small in number, may be reasonable if appositely acknowledged.
The simplified (1:1:1) design for assessing risk-reduction on the log-difference scale does not require multiplicity adjustment when using shared controls. However, this may not be true for more complicated, multiple-arm platform or umbrella trials that similarly utilize a common referent arm. In general, it is best to consult a PhD-trained epidemiologist or statistician when designing a master protocol to ensure that the selected approach is valid for the application at hand and appropriately powered. This is especially important if one plans to use a flexible design that allows for adding or deleting treatment arms after trial commencement.

6.4. Future Directions

Future investigations are directed at innovative extensions of the log-difference, risk-reduction approach for conducting and analyzing clinical trials. This includes Bayesian hierarchical, mixed-model alternatives, and nonparametric adaptive methods, as well as efficient algorithms to compute conditional power. Additional consideration of designs using a shared attention-control arm would be informative. The latter would be explicitly designed to account for the bias of interacting with study staff, which is distinct from the compounds under study.
While the FDA recognizes the use of external controls in situations where conventional controls are not medically feasible or ethical (e.g., “rare conditions or indications lacking clinical equipoise for a concurrent control”), it remains imperative that nonconcurrent referents be representative of and generalizable to the targeted treatment population currently under study, as is true for all controls [22,23]. This includes “diseases with high and predictable mortality or signs and symptoms of predictable duration or severity”. The future development of novel techniques for adjusting nonconcurrent referents to the base population will be beneficial as such control groups become more commonly used in the log-difference, risk-reduction method at hand.
The selection of controls from an observational source can pose methodologic concerns as patients may “enter and exit the database at various times, ages, disease states, etc.” [23]. Additional research addressing this concern is needed.

7. Conclusions

The epidemiologic assessment of risk-reduction on the log-difference scale, using a shared referent-control arm, presents an efficient clinical trial design for comparing two correlated relative risk estimates of treatment effects. This method represents a simplified version of a multi-arm, parallel designed umbrella-platform trial without the need for multiplicity adjustment. The resulting p-value for the interaction is readily obtained using a GLM algorithm available in standard statistical software packages.

Author Contributions

Conceptualization, J.T.E. and Y.M.C.; methodology, J.T.E., G.N.D. and H.W.; software, J.T.E., G.N.D. and H.W.; validation, J.T.E., G.N.D. and H.W.; formal analysis, J.T.E., G.N.D. and H.W.; investigation, J.T.E., G.N.D., Y.M.C. and H.W.; resources, J.T.E.; data curation, J.T.E.; writing—original draft preparation, J.T.E., G.N.D., Y.M.C. and H.W.; writing—review and editing, J.T.E., G.N.D., Y.M.C. and H.W.; visualization, J.T.E., G.N.D., Y.M.C. and H.W.; supervision, J.T.E.; project administration, J.T.E. All authors have read and agreed to the published version of the manuscript.

Funding

G.N.D. was supported by National Institutes of Health grant T32GM140972.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The insightful comments of Sonia T. An, Janet Grubber, Kaitlin Cassidy, and Maria Androsenko during the writing of this manuscript are greatly appreciated. SAS software (version 9.4, Cary, NC, USA) was used to perform the computations. The opinions presented in this manuscript do not necessarily represent those of the VA or the United States Federal Government. The provided clinical example is hypothetical and not intended to reflect any real patients.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ASCVD	Atherosclerotic cardiovascular disease
Cov	Covariance
EM	Expectation–maximization
GLM	Generalized linear model
LDL-C	Low-density lipoprotein cholesterol
Log	Logarithm
OR	Odds ratio
RR	Relative risk
RRR	Ratio of relative risks
Var	Variance

Appendix A. SAS GENMOD Code and Output for Validating the Manual Results of Example 1

SAS Code
option ls=180;
data a;
   input group $ count response;
   cards;
T0 54 1
T0 6 0
T1 48 1
T1 12 0
T2 36 1
T2 24 0
;
run;
proc genmod data=a descending;
   class group;
   model response=group / dist=bin link=log;
   weight count;
   estimate "RR1" group 1 0 -1 / exp;
   estimate "RR2" group 1 -1 0 / exp;
   estimate "RRR" group 0 1 -1 / exp;
run;
Output (screenshot from the SAS system)Medicina 62 00070 i001

References

  1. Ren, Y.; Li, X.; Chen, C. Statistical consideration of phase 3 umbrella trials allowing adding one treatment arm mid-trial. Contemp. Clin. Trials 2021, 9, 106538. [Google Scholar] [CrossRef] [PubMed]
  2. Renfro, L.; Sargent, D. Statistical controversies in clinical research: Basket trials, umbrella trials, and other master protocols: A review and examples. Ann. Oncol. 2017, 28, 34–43. [Google Scholar] [CrossRef] [PubMed]
  3. Bunin, G.; Baumgarten, M.; Norman, S.; Strom, B.; Berlin, J. Practical aspects of sharing controls between case-control studies. Pharmacoepidemiol. Drug Saf. 2005, 14, 523–530. [Google Scholar] [CrossRef]
  4. Altman, D.; Bland, J. Interaction revisited: The difference between two estimates. Br. Med. J. 2003, 326, 219. [Google Scholar] [CrossRef]
  5. Nelder, J.; Wedderburn, R. Generalized Linear Models. J. R. Statist. Soc. A 1972, 135, 370–384. [Google Scholar] [CrossRef]
  6. Armitage, P. Statistical Methods in Medical Research; John Wiley and Sons: New York, NY, USA, 1971. [Google Scholar]
  7. Bagos, P. On the covariance of two correlated log-odds ratios. Stat. Med. 2012, 31, 1418–1431. [Google Scholar] [CrossRef] [PubMed]
  8. DelRocco, N.; Wang, Y.; Wu, D.; Yang, Y. New confidence intervals for relative risk of two correlated proportions. Stat. Biosci. 2023, 15, 1–30. [Google Scholar] [CrossRef]
  9. Leisch, F.; Weingessel, A.; Hornik, K. On the Generation of Correlated Artificial Binary Data, June 1998 ed.; SFB Adaptive Information Systems and Modelling in Economics and Management Science; WU Vienna University of Economics and Business: Vienna, Austria, 1998. [Google Scholar] [CrossRef]
  10. Carter, R.; Zhang, X.; Woolson, R. Statistical analysis of correlated relative risks. J. Data Sci. 2009, 7, 397–407. [Google Scholar] [CrossRef]
  11. The Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3000 shared controls. Nature 2007, 447, 661–678. [Google Scholar] [CrossRef] [PubMed]
  12. Zaykin, D.; Kozbur, D. P-value based analysis for shared controls design in genome-wide association studies. Genet. Epidemiol. 2010, 34, 725–738. [Google Scholar] [CrossRef] [PubMed]
  13. Lin, D.; Sullivan, P. Meta-analysis of genome-wide association studies with overlapping subjects. Hum. Genet. 2009, 85, 862–872. [Google Scholar] [CrossRef] [PubMed]
  14. Doob, J. The limiting distribution of certain statistics. Ann. Math. Stat. 1935, 6, 160–169. [Google Scholar] [CrossRef]
  15. Le Cam, L. Asymptotic Methods in Statistical Decision Theory; Springer: New York, NY, USA, 1986. [Google Scholar]
  16. Carter, R.; Lipsitz, S.; Tilley, B. Quasi-likelihood and goodness-of-fit in log binomial regression. Biostatistics 2005, 6, 39–44. [Google Scholar] [CrossRef] [PubMed]
  17. Marschner, I.; Gillett, A. Relative risk regression: Reliable and flexible methods for log-binomial models. Biostatistics 2012, 13, 179–192. [Google Scholar] [CrossRef] [PubMed]
  18. Zhu, C.; Hosmer, D.; Stankovich, J.; Wills, K.; Blizzard, L. Refinements on the exact method to solve the numerical difficulties in fitting the log binomial regression model for estimating relative risk. Commun. Stat. Theory Methods 2024, 53, 8359–8375. [Google Scholar] [CrossRef]
  19. Lévesque, L.; Hanley, J.; Kezouh, A.; Suissa, S. Problem of immortal time bias in cohort studies: Example using statins for preventing progression of diabetes. BMJ 2010, 340, b5087. [Google Scholar] [CrossRef] [PubMed]
  20. Tamm, M.; Hilger, R. Chronological bias in randomized clinical trials arising from different types of unobserved time trends. Methods Inf. Med. 2014, 53, 501–510. [Google Scholar] [CrossRef] [PubMed]
  21. Rothman, K. Epidemiologic methods in clinical trials. Cancer 1977, 39, 1771–1775. [Google Scholar] [CrossRef] [PubMed]
  22. Mooghali, M.; Dhruva, S.; Hakimian, H.; Rathi, V.; Kadakia, K.; Ross, J. Nonconcurrent control use in FDA approval of high risk medical devices. JAMA Netw. Open 2025, 8, e256230. [Google Scholar] [CrossRef] [PubMed]
  23. Jahanshahi, M.; Gregg, K.; Davis, G.; Ndu, A.; Miller, V.; Vockley, J.; Ollivier, C.; Franolic, T.; Sakai, S. The use of external controls in FDA regulatory decision making. Ther. Innov. Regul. Sci. 2021, 55, 1019–1035. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Histogram with standard normal distribution overlay (test statistic = \(\hat{\zeta}_{RRR}\)).
Table 1. Clinical trial analysis (total number of patients = 180; 60 per arm).

Treatment (T)                                   Cholesterol Level (mg/dL) at 18 Months Post-Baseline
                                                ≥70 (Unfavorable)       <70 (Favorable)
\(T_0\) § Diet and Exercise                     \(a_{11} = 54\)         \(a_{12} = 6\)
\(T_1\) Standard Statin + Diet and Exercise     \(a_{21} = 48\)         \(a_{22} = 12\)
\(T_2\) New Statin + Diet and Exercise          \(a_{31} = 36\)         \(a_{32} = 24\)

Results (computed using unrounded values):
\(\widehat{RR}_1\,(T_0 : T_2) = 1.5000\)
\(\widehat{RR}_2\,(T_0 : T_1) = 1.1250\)
\(\log(\widehat{RR}_1/\widehat{RR}_2) = \log(1.3333) = 0.28768\)
\(\operatorname{Var}(\log \widehat{RR}_1) = 0.01296\)
\(\operatorname{Var}(\log \widehat{RR}_2) = 0.00602\)
\(\operatorname{Cov}(\log \widehat{RR}_1, \log \widehat{RR}_2) = 0.00185\)
\(\hat{\zeta}_{RRR} = 2.3275\); \(P_{Int} = 0.01994\)
\(R_{LD} = (\widehat{RR}_1/\widehat{RR}_2 - 1) \times 100 = 33.333\%\)

Estimates rounded to 5 significant digits versus a fixed number of decimal places (Goldilocks method). § Common referent-control. Baseline = initiation of therapy. mg/dL = milligrams per deciliter. \(a_{RC}\) = cell frequency for respective row (R) and column (C). Cov = covariance. Var = variance. \(\widehat{RR}_G\) = relative risk estimate for indicated group (G). \(P_{Int}\) = p-value for interaction. \(R_{LD}\) = relative risk-reduction on the log-difference scale. \(\hat{\zeta}_{RRR}\) = test statistic for the ratio of relative risks.
Table 2. Exact statistics simulated from 10,000,000 multinomial observations.

Characteristic                                                    Simulated Exact Value
\(\operatorname{Var}(\log \widehat{RR}_1)\)                       0.01355
\(\operatorname{Var}(\log \widehat{RR}_2)\)                       0.00622
\(\operatorname{Cov}(\log \widehat{RR}_1, \log \widehat{RR}_2)\)  0.00192
\(\hat{\zeta}_{RRR}\)                                             2.2784
\(P_{Int}\)                                                       0.01566

Var = variance. \(\widehat{RR}_G\) = relative risk estimate for indicated group (G). \(\hat{\zeta}_{RRR}\) = test statistic for the ratio of relative risks. \(P_{Int}\) = p-value for interaction.

Share and Cite

MDPI and ACS Style

Efird, J.T.; Dupuis, G.N.; Choi, Y.M.; Wu, H. The Epidemiologic Comparison of Two Correlated Relative Risks: A Simple but Efficient Clinical Trial Design for Assessing Risk-Reduction and Treatment Significance. Medicina 2026, 62, 70. https://doi.org/10.3390/medicina62010070

