1. Introduction
A frequently used clinical trial design is to directly compare a new treatment against a standard therapeutic compound. However, many years may have passed since the standard compound was originally approved. During this time, variations in the disease process, manufacturing changes, or practice deviations may have occurred, such that the standard treatment is no longer efficacious, or is markedly less effective than when originally marketed.
Observing a statistically significant absolute effect (risk) reduction (i.e., the standard minus new therapy event rate) does not necessarily establish the clinical usefulness of the new treatment unless the clinical trial design includes a common comparison group as a referent-control. This contrast allows for the side-by-side evaluation of the new and standard agents in the context of relative effect reduction on the log-difference scale. Importantly, the use of a shared referent-control arm minimizes bias and threats to internal validity compared with assessing the absolute risk difference.
Power and sample size are important components in the planning of a clinical trial. However, the literature is largely silent on the topic of a conditionally independent shared comparison arm when assessing risk reduction on the log-difference, relative effect scale.
2. Preliminaries
2.1. Conditionally Independent, Large-Sample Distribution
Consider a test to compare two relative effect estimates (REEs), denoted as $\widehat{REE}_1$ and $\widehat{REE}_2$, on the log-difference scale with respect to a common referent-control group. Approximate log-normally distributed REEs include relative risks (RR), odds ratios (OR), and hazard ratios (HR). A variance-stabilizing, logarithmic transformation tends to make the difference of REEs closer to a normal distribution by reducing any potential relationship between the variance and mean. The application of this transformation is especially useful when dealing with probabilities near 0 or 1.
Assuming that (a) the standard error ($SE$) for the logarithm of $\widehat{REE}_1$ does not differ considerably from that for $\widehat{REE}_2$, (b) the corresponding sample sizes $n_1$ and $n_2$ are reasonably large, (c) $\widehat{REE}_1$ and $\widehat{REE}_2$ with a common referent-control arm are asymptotically consistent, but not necessarily unbiased, estimates of the true parameters, and (d) the large-sample distribution for the logarithm of $\widehat{REE}_1/\widehat{REE}_2$ is approximately Gaussian, with estimated event rates for both groups not being too close to 0 or 1, it follows from the Le Cam–Lévy–Ferguson “Conditional Martingale Limit Theorem” that we obtain the following equation:

$$Z = \frac{\log\!\left(\widehat{REE}_1/\widehat{REE}_2\right) - E\!\left[\log\!\left(REE_1/REE_2\right)\right]}{SE_{\log(\widehat{REE}_1/\widehat{REE}_2)}} \sim N(0,1),$$

where $E[\cdot]$ represents the expectation, and $N(0,1)$ is the standard normal distribution [1]. The single tilde, “~”, indicates asymptotic equivalence of the same magnitude, under the assumption that the underlying Martingale processes are stationary, ergodic, and $\mathcal{F}$-measurable [2], with finite variance.
The cumulative distribution function for $Z$ divided by that for a standard normal curve approaches unity for large sample sizes. Under asymptotic regularity conditions, $\widehat{REE}_1$ is conditionally independent of $\widehat{REE}_2$, given a common referent-control group. That is, the probability (A) for an observation in the numerator of $\widehat{REE}_1$ is not affected by the probability (B) for an observation in the numerator of $\widehat{REE}_2$, given the probability (C) for a shared observation in the denominators of $\widehat{REE}_1$ and $\widehat{REE}_2$. Mathematically stated, it follows that $(A \perp\!\!\!\perp B) \mid C$, which is equivalent to $P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)$.
The denominator of the limit described above, denoted as $SE_{\log(\widehat{REE}_1/\widehat{REE}_2)}$, yields the square root of the sample variance estimate of $\log(\widehat{REE}_1/\widehat{REE}_2)$, which may be approximated as follows [3]:

$$SE_{\log(\widehat{REE}_1/\widehat{REE}_2)} \approx \sqrt{SE_{\log \widehat{REE}_1}^2 + SE_{\log \widehat{REE}_2}^2}.$$
2.2. Confidence Interval Method for Estimating the Sample Variance
Lower and upper $(1-\alpha)$-level confidence intervals (CIs) for the indicated REEs are given as $LCI$ and $UCI$. That is, for $\widehat{REE}$ we obtain the following expression:

$$(LCI,\; UCI) = \exp\!\left[\log\!\left(\widehat{REE}\right) \mp z_{1-\alpha/2}\, SE_{\log \widehat{REE}}\right],$$

where $\alpha$ refers to the probability of rejecting the null hypothesis when it is true, and $z_{1-\alpha/2}$ corresponds to the $(1-\alpha/2) \times 100$th percentile of $N(0,1)$ [e.g., $z_{0.975} = 1.96$]. Using the above bounds and rearranging, it readily follows that we obtain the following equation:

$$SE_{\log \widehat{REE}} = \frac{\log(UCI) - \log(LCI)}{2\, z_{1-\alpha/2}}.$$
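To make the back-calculation concrete, the following minimal SAS sketch recovers $SE_{\log \widehat{REE}}$ from a set of 95% CI bounds; the bounds shown are assumed values for illustration only and are not taken from the examples in this manuscript.

```sas
/* Hedged sketch: recover the SE of log(REE) from assumed 95% CI bounds. */
data se_from_ci;
   alpha = 0.05;
   z     = probit(1 - alpha/2);      /* z_{1-alpha/2} = 1.9600 */
   lci   = 0.65;                     /* assumed lower 95% bound */
   uci   = 0.98;                     /* assumed upper 95% bound */
   se_log_ree = (log(uci) - log(lci)) / (2*z);
   put se_log_ree= 8.4;
run;
```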
2.3. Alternative Computational Formula for $SE_{\log \widehat{RR}}$
Let $n_{ij}$ denote the $r \times c$ cell sizes for a contingency table with corresponding probabilities of $\pi_{ij}$. Here, the row exposures and column outcomes are indexed by $i$ and $j$, respectively, with the referent-control row indexed by $i = 0$. The relative risk estimate ($\widehat{RR}$), also known as the risk ratio, is defined for treatment row $i$ as follows:

$$\widehat{RR}_i = \frac{n_{i1}/(n_{i1} + n_{i2})}{n_{01}/(n_{01} + n_{02})},$$

where $n_{i1}$ and $n_{01}$ follow a binomial distribution. By applying the delta method (Taylor series approximation for large sample sizes), we obtain the following equations:

$$\operatorname{Var}\!\left[\log \widehat{RR}_i\right] \approx \frac{1}{n_{i1}} - \frac{1}{n_{i1} + n_{i2}} + \frac{1}{n_{01}} - \frac{1}{n_{01} + n_{02}}.$$

Accordingly, we obtain the following equation:

$$SE_{\log \widehat{RR}_i} = \sqrt{\frac{1}{n_{i1}} - \frac{1}{n_{i1} + n_{i2}} + \frac{1}{n_{01}} - \frac{1}{n_{01} + n_{02}}}.$$
The advantage of this direct formula is the increased computational speed gained by forgoing the CI method for estimating $SE_{\log \widehat{RR}}$. In the case of ORs and HRs, one can derive similar computational approximations using the delta method.
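As an illustration, a minimal SAS data step (with assumed 2 × 2 cell counts) applies the direct delta-method formula:

```sas
/* Hedged sketch: delta-method SE of log(RR) from assumed cell counts.
   n11/n12 = treatment events/non-events; n01/n02 = referent events/non-events. */
data rr_se;
   n11 = 30;  n12 = 270;   /* assumed treatment row */
   n01 = 40;  n02 = 260;   /* assumed referent row  */
   rr        = (n11/(n11 + n12)) / (n01/(n01 + n02));
   se_log_rr = sqrt(1/n11 - 1/(n11 + n12) + 1/n01 - 1/(n01 + n02));
   put rr= 8.4 se_log_rr= 8.4;
run;
```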
2.4. Percentage Relative Effect (Risk) Reduction
When $\widehat{REE}_1 < \widehat{REE}_2$, the percentage relative effect reduction on the log-difference scale (%RLD) is computed as $\%RLD = \left[1 - \exp\!\left(\log \widehat{REE}_1 - \log \widehat{REE}_2\right)\right] \times 100\%$, and vice versa when $\widehat{REE}_1 > \widehat{REE}_2$ (i.e., a percentage relative effect increase). If $\widehat{REE}_1 = \widehat{REE}_2$, a null result is obtained for the %RLD. For example, if $\widehat{REE}_1 = 0.95$ and $\widehat{REE}_2 = 1.05$, then $\%RLD = \left[1 - \exp(\log 0.95 - \log 1.05)\right] \times 100\% = 9.52\%$.
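A one-line SAS computation makes the definition concrete; the REE values below are assumed for illustration and were chosen to reproduce the 9.52% figure that appears later in Example 1.

```sas
/* Hedged sketch: %RLD for assumed REE values (0.95 vs. 1.05). */
data rld;
   ree1 = 0.95;  ree2 = 1.05;
   pct_rld = (1 - exp(log(ree1) - log(ree2))) * 100;   /* = 9.52 */
   put pct_rld= 8.4;
run;
```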
2.5. Null Hypothesis and p-Value
In the case of a null hypothesis (H0), $REE_1$ and $REE_2$ are deemed to be equal, versus the alternative scenario in which they differ [3]. That is, H0: $REE_1 = REE_2$, which is equivalent to a null %RLD, compared with H1: $REE_1 \neq REE_2$. In the case of H0, both $\log(REE_1/REE_2)$ and the %RLD equal zero.
A p-value for the log-difference of two REEs is estimated as follows:

$$p = 2\left[1 - \Phi\!\left(|z|\right)\right],$$

where

$$z = \frac{\log\!\left(\widehat{REE}_1/\widehat{REE}_2\right)}{SE_{\log(\widehat{REE}_1/\widehat{REE}_2)}}$$

and $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal distribution.
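A minimal SAS sketch of the test follows; the REEs and their standard errors are assumed pilot values, and the pooled denominator uses the approximation from Section 2.1.

```sas
/* Hedged sketch: z statistic and two-sided p-value for the log-difference. */
data pvalue;
   ree1 = 0.95;  se1 = 0.20;   /* assumed REE1 and SE of log(REE1) */
   ree2 = 1.05;  se2 = 0.20;   /* assumed REE2 and SE of log(REE2) */
   z = (log(ree1) - log(ree2)) / sqrt(se1**2 + se2**2);
   p = 2*(1 - probnorm(abs(z)));
   put z= 8.4 p= 8.4;
run;
```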
2.6. Confidence Interval Decision Criteria
The disjoint CI percentage corresponding to an equivalent $\alpha$-level test is computed as follows:

$$\text{Disjoint CI \%} = \left[2\,\Phi\!\left(\frac{z_{1-\alpha/2}}{\sqrt{2}}\right) - 1\right] \times 100\%.$$

This yields slightly wider regions to compensate for the pooled (versus non-pooled) sample variance [4]. For example, given that $\alpha$ = 0.01, 0.05, and 0.10, the difference between the estimates $\widehat{REE}_1$ and $\widehat{REE}_2$ is declared as statistically significant if the 93.1452%, 83.4224%, and 75.5206% CIs for the respective estimates do not overlap.
Analogously, the difference between the estimates $\widehat{REE}_1$ and $\widehat{REE}_2$ may be declared as statistically significant when the following interval excludes unity:

$$\exp\!\left[\log\!\left(\widehat{REE}_1/\widehat{REE}_2\right) \pm z_{1-\alpha/2}\, SE_{\log(\widehat{REE}_1/\widehat{REE}_2)}\right].$$
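The following SAS sketch reproduces the disjoint CI percentages quoted above; it is a direct transcription of the formula rather than production code.

```sas
/* Hedged sketch: disjoint CI percentage for an alpha-level two-sided test. */
data disjoint_ci;
   do alpha = 0.01, 0.05, 0.10;
      pct = (2*probnorm(probit(1 - alpha/2)/sqrt(2)) - 1) * 100;
      put alpha= 4.2 pct= 8.4;    /* 93.1452, 83.4224, 75.5206 */
   end;
run;
```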
3. Power and Sample Size
3.1. Power
The power of a study indicates how frequently a statistical test will detect the falsehood of an underlying null hypothesis when it is false (i.e., $1-\beta$). Let $\beta$ denote the probability of failing to reject the null hypothesis when it is false (i.e., a type II error). It follows that we obtain the following equation:

$$z_{1-\beta} = \frac{\left|\log\!\left(\widehat{REE}_1/\widehat{REE}_2\right)\right|}{SE_{\log(\widehat{REE}_1/\widehat{REE}_2)}} - z_{1-\alpha/2}.$$

The area to the left of $z_{1-\beta}$ under a standard normal distribution yields the desired power at the $\alpha$-level of statistical significance for a two-sided test. When use of a one-tailed test is desired, $z_{1-\alpha}$, rather than $z_{1-\alpha/2}$, is used in this formula.
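As a worked illustration, the SAS sketch below evaluates the two-sided power formula for an assumed hypothesized log-difference and standard error.

```sas
/* Hedged sketch: normal-theory power for a two-sided alpha = 0.05 test. */
data power;
   alpha   = 0.05;
   logdiff = abs(log(0.95) - log(1.05));   /* assumed hypothesized effect */
   se_diff = 0.034;                        /* assumed SE at the planned n */
   z_beta  = logdiff/se_diff - probit(1 - alpha/2);
   power   = probnorm(z_beta);             /* area to the left of z_beta  */
   put power= 8.4;
run;
```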
3.2. Sample Size and Variability
The confidence interval method may be used to approximate the standard error for the logarithm of the respective relative effect estimates ($SE_{\log \widehat{REE}}$) corresponding to an initial $r \times c$ (row by column) contingency table, where $n$ denotes the total sample size (i.e., combined over rows). Cell frequencies of the pilot sample table are then increased multiplicatively (in an iterative fashion) to obtain the corresponding standard errors for different sample sizes and the corresponding power.
Using this method, it is important that the initial 3 × 2 contingency table be sufficiently large to accurately estimate the sample variability within a specified fraction of the population variance [
5]. As a rule of thumb, the proportional width of the $\alpha$-level CI for the population variance of a continuous variable is defined as follows:

$$W = \frac{n-1}{\chi^2_{\alpha/2,\,n-1}} - \frac{n-1}{\chi^2_{1-\alpha/2,\,n-1}}.$$

This value should be no larger than the sample variance (i.e., $W \le 1$ when expressed as a multiple of $s^2$), where $\chi^2_{p,\,n-1}$ is the $p \times 100$th percentile of the chi-squared distribution with $n-1$ degrees of freedom. When $\alpha$ = 0.05, this corresponds to a minimum sample size of 38. However, in practice, a more conservative sample size of 90 (30 per treatment row) is typically chosen as the starting point to account for the binary outcomes.
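A simplified SAS sketch of this iterative search is given below. It is a stand-in for, not a reproduction of, the Appendix A program, and all pilot cell counts are assumed for illustration.

```sas
/* Hedged sketch: inflate an assumed 3x2 pilot table (30 patients per row)
   until normal-theory power for the log-difference reaches 80%.          */
data sample_size;
   alpha = 0.05;  target = 0.80;
   a1 = 4;  b1 = 26;   /* assumed T1 (new):      events, non-events */
   a2 = 5;  b2 = 25;   /* assumed T2 (standard): events, non-events */
   a0 = 6;  b0 = 24;   /* assumed T0 (referent): events, non-events */
   logdiff = abs( log((a1/(a1 + b1))/(a0/(a0 + b0)))
                - log((a2/(a2 + b2))/(a0/(a0 + b0))) );
   do k = 1 to 10000 until (power >= target);
      se1   = sqrt(1/(k*a1) - 1/(k*(a1 + b1)) + 1/(k*a0) - 1/(k*(a0 + b0)));
      se2   = sqrt(1/(k*a2) - 1/(k*(a2 + b2)) + 1/(k*a0) - 1/(k*(a0 + b0)));
      power = probnorm(logdiff/sqrt(se1**2 + se2**2) - probit(1 - alpha/2));
   end;
   n_per_arm = k*30;   /* each pilot row holds 30 patients */
   put k= n_per_arm= power= 8.4;
run;
```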
3.3. Simulation of Observations from a Multinomial Distribution
Given $n$ distinct trials, the probability that a mutually exclusive set of $k$ non-negative random variates ($X_1, \ldots, X_k$) takes on a particular value ($x_1, \ldots, x_k$) is given as follows:

$$P(X_1 = x_1, \ldots, X_k = x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\; \pi_1^{x_1}\, \pi_2^{x_2} \cdots \pi_k^{x_k},$$

where $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} \pi_i = 1$. This probability is known as a multinomial distribution because of the following expression:

$$(\pi_1 + \pi_2 + \cdots + \pi_k)^n = \sum_{x_1 + \cdots + x_k = n} \frac{n!}{x_1!\, x_2! \cdots x_k!}\; \pi_1^{x_1}\, \pi_2^{x_2} \cdots \pi_k^{x_k},$$

where the latter is a multinomial series.
The multinomial distribution, as illustrated in Example 2, plays an important role in validating the asymptotic normality of the conditional test statistic.
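To make the connection to Example 2 concrete, the following SAS sketch draws 3 × 2 tables from a multinomial distribution and standardizes the log-difference of the two relative risks. It is not the Appendix B program; the arm-specific event rates, trial size, and replicate count are all assumed for illustration.

```sas
/* Hedged sketch: multinomial 3x2 tables and standardized log(RR1/RR2). */
data zscores;
   call streaminit(2025);
   n  = 900;                             /* patients per simulated trial (assumed) */
   p1 = 0.12;  p2 = 0.14;  p0 = 0.15;    /* assumed event rates: T1, T2, T0 */
   expected = log((p1/p0)/(p2/p0));      /* true log(RR1/RR2) */
   array cnt[6];                         /* cells: T1 evt/no, T2 evt/no, T0 evt/no */
   do rep = 1 to 1000;
      do j = 1 to 6;  cnt[j] = 0;  end;
      do i = 1 to n;                     /* 1:1:1 allocation folded into the cells */
         j = rand('TABLE', p1/3, (1-p1)/3, p2/3, (1-p2)/3, p0/3, (1-p0)/3);
         cnt[j] = cnt[j] + 1;
      end;
      rr1 = (cnt[1]/(cnt[1] + cnt[2])) / (cnt[5]/(cnt[5] + cnt[6]));
      rr2 = (cnt[3]/(cnt[3] + cnt[4])) / (cnt[5]/(cnt[5] + cnt[6]));
      se1 = sqrt(1/cnt[1] - 1/(cnt[1] + cnt[2]) + 1/cnt[5] - 1/(cnt[5] + cnt[6]));
      se2 = sqrt(1/cnt[3] - 1/(cnt[3] + cnt[4]) + 1/cnt[5] - 1/(cnt[5] + cnt[6]));
      z   = (log(rr1/rr2) - expected) / sqrt(se1**2 + se2**2);
      output;
   end;
   keep rep z;
run;

proc univariate data=zscores normal;   /* KS, Cramer-von Mises, Anderson-Darling */
   var z;
run;
```

With the assumed rates and roughly 300 patients per arm, zero cells are vanishingly rare; a production program would nonetheless guard against them.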
4. Computational Methods
Analyses were performed and validated using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).
5. Example 1—Comparison of Potassium and Sodium Salts
Potassium salt substitutes can be helpful for lowering sodium intake and controlling high blood pressure. Among a cohort of borderline hypertensive (but otherwise healthy) patients, a team of clinical epidemiologists aimed to determine if randomization to a potassium salt substitute (1500 mg/d) reduces the 24-month risk of major adverse cardiovascular outcomes and non-cancer death compared with a formulary of sodium salt (1500 mg/d), with a common referent-control arm of 2300 mg/d sodium salt.
In a 1:1:1 pilot clinical trial, a 9.52% relative effect (risk) reduction on the log-difference scale was observed at 24 months post-randomization (
Table 1). Based on this promising result, the team decided to conduct a larger population-based phase III clinical trial. To achieve at least 80% power for a similar %RLD (discounting dropouts), the new pivotal trial would need to randomize 36,835 patients per study arm at the α = 0.05 level of statistical significance (
Table 2; see
Appendix A for SAS code).
6. Example 2—Monte Carlo Simulation of the Conditional Martingale Limit Theorem
While a theoretical basis exists for the asymptotic normality and conditional independence of the test statistic [
1,
2], Monte Carlo methods can be used to validate the large-sample properties of this method.
A total of 1,000,000 trials, each comprising a sample of 900,000 patients (300,000 per arm), were drawn from a multinomial distribution using the cell probabilities given in Example 1 (with T0 denoting the common referent-control group). A standardized value was computed for each simulated estimate of $\log(\widehat{REE}_1/\widehat{REE}_2)$ by subtracting $E[\log(REE_1/REE_2)]$ from this estimate and dividing it by $SE_{\log(\widehat{REE}_1/\widehat{REE}_2)}$. Plotting these values as a histogram yields
Figure 1. The corresponding statistics (mean = 0, standard deviation = 1, and approximately zero values for skewness and kurtosis) indicate that the z-scores follow a standard normal distribution, as visually confirmed by the Normal (0,1) curve overlay. Furthermore, normality was not rejected by the Kolmogorov–Smirnov, Cramér–von Mises, or Anderson–Darling tests (see
Appendix B for SAS code).
For large samples, this example supports the use of normal-theory methods to estimate power for a 1:1:1 randomized design with a common referent-control arm. In effect, the covariance of $\log \widehat{REE}_1$ and $\log \widehat{REE}_2$ can be disregarded as the sample size increases toward infinity, allowing one to assume an independent, identically distributed (i.i.d.) standard normal test statistic.
7. Discussion
7.1. Overview
The comparison of two drugs with respect to a common comparator (referent-control) group is a powerful and efficient approach for conducting a clinical trial. Importantly, this method preserves the randomization of the originally assigned patient groups [
6]. Additionally, it allows for determining risk reduction on the log-difference scale. The current manuscript presents a novel, computer-based technique for estimating the power of the latter by multiplicatively increasing the cell sizes of an initial pilot sample. This yields the inflated variances needed to compute power for larger sample sizes while still allowing the use of traditional normal-theory methods. As reasoned below, the common referent-control design does not require multiplicity adjustment.
7.2. Multiplicity Adjustment
The family-wise type I error rate, which is the probability of at least one false positive test among all hypotheses being tested, is not increased with respect to a common referent-control group [
7]. The associated false positive comparison-wise error rate at the individual level is also not inflated, as the “expected proportion of incorrectly rejected hypotheses will not exceed the significance level used in the individual tests” [
7,
8].
While a treatment is the “family” over which one typically judges the need for multiplicity, there is ambiguity regarding how this entity is defined. Multiplicity control is not required if the underlying therapies have distinct mechanisms of action [
9]. That is, “if each decision to reject each individual null hypothesis depends on no more than one significance test, then none of the individual tests constitute a family”, such that an
-level adjustment for any single hypothesis is unnecessary (except if one performs “disjunction testing of a joint intersection null hypothesis”) [
10]. While cases exist wherein “the decision on the omnibus null will seem to contradict decisions on individual nulls”, it is important to remember that “rejecting or not rejecting a null is not certain proof that the null is false or true” [
10]. Therefore, multiplicity correction for CIs is needed when the omnibus null can only be evaluated through “a set of surrogate nulls” and cannot be “generalized to testing any diversity of unconnected nulls over the course of a study” [
11].
Adjustment is counterintuitive in the case of conjunctive (intersection–union) testing, in which all tests must be significant for the joint null hypothesis to be rejected [
9]. However, multiplicity adjustment for CIs is advisable if the comparative therapies are related, as is the case when evaluating “different dosages or regimens of a treatment compared with the same control arm” [
12]. This includes trials with collective conditions “if there are no less than two primary hypotheses, unless one assumes that there is an explicit hierarchy in the multiple hypotheses” [
12].
Multiplicity adjustment of CIs in the denominator of a relative-effects (log-difference) test of two compounds is generally not performed in practice, as this is equivalent to a stand-alone interaction test with a pooled variance structure [
3]. If requested by a regulatory agency, the intervals can be easily adjusted for multiplicity using the Hochberg step-up procedure [
13]. However, when “such adjustments are applied unnecessarily, potentially effective treatments may be discarded prematurely” [
9].
7.3. Limitations
A limitation of REEs is that they are ratios and tend to have a slightly skewed log-normal distribution, which creates issues when the parameters of a test statistic are estimated from the sample. According to Jensen’s inequality, the expectation of the log-transformed estimate will be less than or equal to the logarithm of its true value. To some degree, this bias is an advantage because it helps to offset the Cauchy infinite variance problem that may result from taking the ratio of two normally distributed variables [
1]. Barring extreme degenerate examples, it is noted that $\log(\widehat{REE}_1/\widehat{REE}_2)$ asymptotically assumes a moderately well-behaved, normal shape in the limit.
The literature offers little consistent advice regarding the lower limit for the sample size needed to satisfy the underlying large-sample assumptions of the test statistic. It is also unclear what level of inaccuracy is incurred when estimating a z-score test statistic when the sample size is small and the observations are not fully independent. By treating the logarithmic transformation of $\widehat{REE}_1/\widehat{REE}_2$ as a mean difference and applying Gauss’ hypergeometric series, one may heuristically conclude that the small-sample test statistic follows a pseudo Behrens–Fisher distribution with $\nu$ degrees of freedom [
14], approximated here in Welch–Satterthwaite form as follows:

$$\nu \approx \frac{\left(SE_{\log \widehat{REE}_1}^2 + SE_{\log \widehat{REE}_2}^2\right)^2}{\dfrac{SE_{\log \widehat{REE}_1}^4}{n_1 - 1} + \dfrac{SE_{\log \widehat{REE}_2}^4}{n_2 - 1}}.$$
An alternative, albeit lower-power, approach is the Lepage test, which treats the problem as a nonparametric test for central tendency and dispersion [
15]. In certain situations, nonparametric methods are better suited (less sensitive) for handling violations of independence.
Addressing the exact convergence rates for a conditional Martingale-type central limit theorem, which are ostensibly slower than the classical $n^{-1/2}$ rate, is a complex topic beyond the scope of the current manuscript and is best deferred to future research [
16]. Nonetheless, it is common practice in many applied scenarios to accept that the underlying data are “close enough” to normal to move forward with normal-theory methods.
Imprecise rounding is also a concern. As REEs are expressed on a logarithmic scale, with significant departures from linearity occurring near unity, it is best to avoid intermediate rounding when computing the test statistic. Fortunately, this rarely poses a problem when the operations are performed by a computer rather than by manual means.
In theory, power for a given sample size may be estimated by simulating datasets from a conditional multinomial distribution, with prescribed cell probabilities being reflective of the underlying hypothesized parameters. Power is then defined as the proportion of the simulated test statistics that fall within the “a priori” alpha rejection region. However, the simulation method is computationally intensive and introduces uncertainty in the values at the distribution tails. As large sample sizes are often needed to detect small effect sizes for risk reduction, the process may be resource-prohibitive or prone to computational errors, in which case normal-theory methods may be the better choice for computing power.
While robust estimation methods are available in most standard statistical packages to simulate the covariance of REEs and corresponding test statistics involving correlated data, boundary issues regularly result in algorithmic non-convergence. The simulation of power also is problematic for heavy-tailed, Cauchy-like distributions, which may require sample sizes on the order of millions to properly characterize outlying values. Furthermore, multinomial-type power computations are predicated on the observed pilot sample effect size. Thus, simulation cannot be used to determine power for a range of minimally detectable %RLD values based on a single effect size estimate. In such cases, separate pilot samples will be needed to determine the desired range of values.
7.4. Future Directions
Future investigations are directed at estimating power when sample sizes are small and the assumption of conditional independence is questionable. This will include cases where it is difficult to establish the consistency, stability, and normality of underlying data estimates and parameters. Innovative research efforts focusing on nonparametric, two-sample conditional tests and Bayesian methods also may be useful [
17,
18].
8. Conclusions
The use of a common referent-control arm is an important tool for minimizing bias in a randomized clinical trial. Assuming conditional independence, a large-sample method is presented to estimate power for risk reduction on the log-difference scale. This computationally straightforward approach offsets the uncertainty and potential bias associated with tests of absolute risk reduction.
Funding
The author reports that there is no external funding associated with the work featured in this article. The opinions presented in this manuscript do not necessarily represent those of the VA or the United States Federal Government. All examples provided herein are hypothetical and not intended to reflect any real individuals, communities, or non-profit/corporate/governmental entities.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained within the article.
Acknowledgments
The insightful comments from Genevieve Dupuis, Kaitlin Cassidy, and Janet M. Grubber during the writing and revision of this manuscript are greatly appreciated. SAS programming was validated by Maria Androsenko.
Conflicts of Interest
The author declares no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
ASCVD | Atherosclerotic cardiovascular disease
CI | Confidence interval
LCI | Lower confidence interval
Log | Logarithm
mg/d | Milligrams per day
P | p-value
REE | Relative effect estimate
RLD | Relative effect reduction on the log-difference scale
SD | Standard deviation
SE | Standard error
UCI | Upper confidence interval
Appendix A. SAS Code for Determining Power and Sample Size, as Shown in Table 2
Appendix B. SAS Code for Simulating the Conditional Central Limit Theorem (Example 2)
References
- Le Cam, L. Asymptotic Methods in Statistical Decision Theory; Springer: New York, NY, USA, 1986.
- Peligrad, M. Conditional central limit theorem via martingale approximation. arXiv 2011.
- Altman, D.; Bland, J. Interaction revisited: The difference between two estimates. Br. Med. J. 2003, 326, 219.
- Knol, M.; Pestman, W.; Grobbee, D. The (mis)use of overlap of confidence intervals to assess effect modification. Eur. J. Epidemiol. 2011, 26, 253–254.
- Teare, M.; Dimairo, M.; Shephard, N.; Hayman, A.; Whitehead, A.; Walters, J. Sample size requirements to estimate key design parameters from external pilot randomized controlled trials: A simulation study. Trials 2014, 15, 264–277.
- Kim, H.; Gurrin, L.; Ademi, Z.; Liew, D. Overview of methods for comparing the efficacies of drugs in the absence of head-to-head clinical trial data. Br. J. Clin. Pharmacol. 2013, 77, 116–121.
- Parker, R.; Weir, C. Non-adjustment for multiple testing in multi-arm trials of distinct treatments: Rationale and justification. Clin. Trials 2020, 17, 562–566.
- Bender, R.; Lange, S. Adjusting for multiple testing—When and how? J. Clin. Epidemiol. 2001, 54, 343–349.
- Molloy, S.; White, I.; Nunn, A.; Hayes, R.; Wang, D.; Harrison, S. Multiplicity adjustments in parallel-group multi-arm trials sharing a control group: Clear guidance is needed. Contemp. Clin. Trials 2022, 113, 106656.
- Rubin, M. When to adjust alpha during multiple testing: A consideration of disjunction, conjunction, and individual testing. Synthese 2021, 199, 10969–11000.
- García-Pérez, M.A. Use and misuse of correction for multiple testing. Methods Psychol. 2023, 8, 100120.
- Li, G.; Taljaard, M.; Van den Heuvel, E.; Levine, M.; Cook, D.; Wells, G.; Devereaux, P.J.; Thabane, L. An introduction to multiplicity issues in clinical trials: The what, why, when and how. Int. J. Epidemiol. 2017, 46, 746–755.
- Hochberg, Y. A sharper Bonferroni procedure for multiple tests of significance. Biometrika 1988, 75, 800–802.
- Kabe, D.G. On the exact distribution of the Fisher–Behrens–Welch statistic. Metrika 1966, 10, 13–15.
- Lepage, Y. A combination of Wilcoxon's and Ansari-Bradley's statistics. Biometrika 1971, 58, 213–217.
- Bolthausen, E. Exact convergence rates in some martingale central limit theorems. Ann. Probab. 1982, 10, 672–688.
- Lee, S.; Cha, S.; Kim, I. General framework for conditional two-sample testing. arXiv 2024.
- Yan, J.; Li, Z.; Zhang, X. Distance and kernel measures for global and local two-sample conditional distribution testing. arXiv 2024.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).