Article

Optimizing One-Sample Tests for Proportions in Single- and Two-Stage Oncology Trials

by
Alan David Hutson
Roswell Park Comprehensive Cancer Center, Department of Biostatistics and Bioinformatics, Elm and Carlton Streets, Buffalo, NY 14623, USA
Cancers 2025, 17(15), 2570; https://doi.org/10.3390/cancers17152570
Submission received: 9 July 2025 / Revised: 28 July 2025 / Accepted: 30 July 2025 / Published: 4 August 2025
(This article belongs to the Special Issue Application of Biostatistics in Cancer Research)

Simple Summary

Phase II oncology trials often use single-arm designs when randomized trials are too expensive or impractical, such as in rare diseases. These trials typically test whether a treatment’s success rate exceeds a specified benchmark. Standard statistical methods, like the exact binomial test or Simon’s two-stage design, are commonly used but tend to be conservative, with the actual probability of incorrectly rejecting a true null hypothesis (the Type I error) falling below the nominal level. To address this, a new method is proposed that blends the binomial distribution with simulated normal data to create an unbiased estimate of treatment success. This convolution-based method improves the precision of Type I error control and can lead to more efficient trial designs. It also introduces a new two-stage design that includes an early stopping point for futility, offering flexibility and reduced sample sizes without compromising statistical rigor. Compared to traditional methods, this approach can lower the cost and shorten the duration of trials, making it a promising tool for early-stage oncology research.

Abstract

Background/Objectives: Phase II oncology trials often rely on single-arm designs to test H0: π = π0 versus Ha: π > π0, especially when randomized trials are infeasible due to cost or disease rarity. Traditional approaches, such as the exact binomial test and Simon’s two-stage design, tend to be conservative, with actual Type I error rates falling below the nominal α due to the discreteness of the underlying binomial distribution. This study aims to develop a more efficient and flexible method that maintains accurate Type I error control in such settings. Methods: We propose a convolution-based method that combines the binomial distribution with a simulated normal variable to construct an unbiased estimator of π. This method is designed to precisely control the Type I error rate while enabling more efficient trial designs. We derive its theoretical properties and assess its performance against traditional exact tests in both one-stage and two-stage trial designs. Results: The proposed method results in more efficient designs with reduced sample sizes compared to standard approaches, without compromising the control of Type I error rates. We introduce a new two-stage design incorporating interim futility analysis and compare it with Simon’s design. Simulations and real-world examples demonstrate that the proposed approach can significantly lower trial cost and duration. Conclusions: This convolution-based approach offers a flexible and efficient alternative to traditional methods for early-phase oncology trial design. It addresses the conservativeness of existing designs and provides practical benefits in terms of resource use and study timelines.

1. Introduction

Common designs for phase II oncology trials typically focus on testing the hypothesis H0: π = π0 versus Ha: π > π0 in a one-arm, non-randomized setting. Although randomized trials are generally preferred, considerations such as cost, feasibility, and the rarity of certain cancer types often necessitate the use of single-arm designs. The estimated per-patient cost of conducting an oncology clinical trial was approximately $59,500, as reported by Battelle in 2013 [1]. More recent studies, particularly those involving cellular therapies, report substantially higher costs, in some cases exceeding $500,000 per treatment cycle [2,3,4]. Consequently, there is a critical need to optimize both phase II and phase III randomized trial designs to shorten trial duration and reduce required sample sizes. These efforts not only address escalating costs but also aim to expedite the availability of effective therapies to cancer patients. The focus of this work is on optimizing phase II one-arm oncology trials with a binary endpoint in terms of reduced sample size and increased efficiency.
Commonly employed binary endpoints in phase II trials include objective response, complete response, and progression-free, event-free, or overall survival at fixed time points, such as 6 months or 1 year. A key feature of non-randomized, single-arm phase II trials is that, in many cancer indications, the standard-of-care population response rate is sufficiently well-characterized to serve as a comparator. If no promising signal is observed, further development, including progression to a randomized phase II or III trial, is typically not pursued.
In single-stage designs, the hypothesis about a rate or proportion, H0: π = π0 versus Ha: π > π0, is most often tested using an exact binomial test with the Type I error rate constrained to be at most α. In contrast, one-arm two-stage designs commonly employ Simon’s two-stage design [5], using either the minimax or optimal design configuration. The minimax design minimizes the total sample size, while the optimal design minimizes the expected sample size under the null hypothesis, subject to the constraints that the Type I error rate is at most α and the Type II error rate is at most β. Both designs incorporate an interim futility analysis at a predetermined sample size to allow early termination for lack of efficacy. Historically, it is noteworthy that very similar sampling schemes were developed decades earlier in the field of quality control, referred to as double sampling plans [6], with Simon’s two-stage design representing a special case of these earlier methods [7].
The issue with both single-stage and two-stage designs is that the exact Type I error rate is often considerably lower than the nominal Type I error rate α, and the actual power is often larger than the target power 1 − β, due to the discreteness of the underlying binomial distribution under both the null and alternative hypotheses for a given design. This phenomenon is illustrated in the so-called saw-tooth plots in Figure 1, which display the exact Type I error across a range of potential null values for π0 with sample sizes n = 10, 20, 30, 40. As a result, these tests can be conservative in certain scenarios where the exact Type I error rate falls substantially below the target level α.
One approach to mitigate this conservatism is to incorporate a continuity correction [8]; however, this does not eliminate the saw-tooth behavior in Type I error control. Similarly, the power function may also exhibit a saw-tooth pattern and can be non-monotonic [9]. For a fixed sample size n, there are n discrete values of π0, corresponding to k = 1, 2, …, n (k = 0 is infeasible), at which the Type I error equals α, given by
\pi_0 = I_{\alpha}^{-1}(k, \, n - k + 1),
where I_{\alpha}^{-1} denotes the inverse of the regularized incomplete beta function, i.e., the value of π0 solving I_{\pi_0}(k, n - k + 1) = α.
Interestingly, Simon’s two-stage design can exhibit the same phenomenon, where the exact Type I error rate is often much lower than the desired Type I error rate α. In this case, however, we cannot produce a saw-tooth plot as a function of π0 alone, since Simon’s two-stage design also depends on the choice of the alternative response rate, denoted π1 (> π0). In Figure 2, we present the exact Type I error rate and power for testing H0: π = π0 versus Ha: π > π0, with π0 = 0.1 and π1 varying from 0.19 to 0.8, while fixing α = 0.05 and power = 0.80 for the minimax design. As π1 increases, the required total sample size decreases, thus reducing the dimensionality of the sample space for design choices. In fact, for this example, there are very few instances where the exact Type I error rate approaches the nominal α. For example, when π1 = 0.35, the exact Type I error rate is 0.027 with a final-stage sample size of n = 18. Increasing π1 to 0.36 raises the exact Type I error rate to 0.044, with a corresponding final-stage sample size of n = 14. For smaller values of n, the exact Type I error rate can drop as low as 0.015. Similarly, as the sample size decreases with increasing π1, the exact power may deviate considerably from the target power.
In this note, we illustrate how a straightforward convolution of a binomial random variable with a simulated normal random variable enables the construction of an unbiased estimator for the rate parameter π and facilitates inference for H0: π = π0 versus Ha: π > π0 with precise Type I error control. This in turn can reduce sample size requirements for both one-stage and two-stage designs.
In Section 2, we define the convolution estimator, derive its density and distribution functions, outline key theoretical properties including its expectation and variance, and present a toy example to demonstrate the p-value calculation.
We then provide a detailed comparison of the new convolution-based test with the exact binomial test in terms of Type I error control and power. In Section 3, we introduce a new two-stage design with a futility stopping rule that also achieves precise Type I error control. A direct comparison between the convolution-based two-stage design and Simon’s two-stage design is presented.
In Section 4, we provide real-world examples of both one-stage and two-stage designs, demonstrating how the convolution-based approach can reduce the cost and duration of clinical trials based on published design parameters. We conclude with final remarks.

2. Convolution Estimator

Our approach for constructing a test of the hypothesis H0: π = π0 versus H1: π > π0, with precise Type I error control, utilizes a convolution-based method in which synthetic continuous noise is added to discrete binomial data. Specifically, let Y ~ Binomial(n, π) denote the binomial response over n subjects, where π is the success probability. Let X ~ N(0, h²) be an independent normal random variable with mean 0 and standard deviation h. We define the continuous variable Z = Y + X and derive its probability density and cumulative distribution functions.
The probability density function (PDF) of Z, denoted f_Z(z), is obtained by convolving the binomial probability mass function of Y with the normal density of X. Since Y is discrete and X is continuous, this results in a finite mixture of normal densities:
f_Z(z) = \frac{1}{h\sqrt{2\pi}} \sum_{k=0}^{n} \binom{n}{k} \pi^{k} (1-\pi)^{n-k} \exp\left\{ -\frac{(z-k)^{2}}{2h^{2}} \right\}, \qquad z \in \mathbb{R}.
Each component of the mixture is centered at k = 0, 1, …, n, with mixing weights given by the binomial probabilities \binom{n}{k} \pi^{k} (1-\pi)^{n-k}.
Similarly, the cumulative distribution function (CDF) of Z, denoted F_Z(z) = P(Z ≤ z), is derived by conditioning on the values of Y:
F_Z(z) = \sum_{k=0}^{n} \binom{n}{k} \pi^{k} (1-\pi)^{n-k} \, \Phi\!\left( \frac{z-k}{h} \right), \qquad z \in \mathbb{R},
where Φ(u) is the standard normal CDF defined as
\Phi(u) = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} e^{-t^{2}/2} \, dt.
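To make these definitions concrete, a minimal Python sketch of the mixture density and CDF is given below (scipy-based; the function names convolution_pdf and convolution_cdf are my own, not from the paper):

```python
import numpy as np
from scipy.stats import binom, norm

def convolution_pdf(z, n, pi, h=0.01):
    """Density of Z = Y + X: a binomial-weighted mixture of N(k, h^2) densities."""
    k = np.arange(n + 1)
    weights = binom.pmf(k, n, pi)              # binomial mixing weights
    return float(np.sum(weights * norm.pdf(z, loc=k, scale=h)))

def convolution_cdf(z, n, pi, h=0.01):
    """CDF of Z = Y + X: a binomial-weighted mixture of normal CDFs Phi((z - k)/h)."""
    k = np.arange(n + 1)
    weights = binom.pmf(k, n, pi)
    return float(np.sum(weights * norm.cdf((z - k) / h)))
```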
To understand the properties of the test statistic, we compare the moments of Y and Z. The expectation and variance of Y are given by
E[Y] = n\pi, \qquad \mathrm{Var}(Y) = n\pi(1-\pi).
Since Y and X are independent, the corresponding moments of Z = Y + X follow directly:
E[Z] = E[Y + X] = E[Y] + E[X] = n\pi + 0 = n\pi,
\mathrm{Var}(Z) = \mathrm{Var}(Y + X) = \mathrm{Var}(Y) + \mathrm{Var}(X) = n\pi(1-\pi) + h^{2}.
Thus, the mean of Z remains identical to that of Y, while the variance of Z is increased by the additive term h², capturing the additional variability introduced by the normal perturbation. Throughout this note, we fix h = 1/100. Although this results in only a modest increase in the variance of Z, we will demonstrate that this small adjustment is sufficient to produce tests with substantially improved power while maintaining the precise Type I error level.
To test H0: π = π0 against H1: π > π0 at significance level α, we reject H0 if Z > c, where the critical value c is chosen to satisfy
1 - \sum_{k=0}^{n} \binom{n}{k} \pi_0^{k} (1-\pi_0)^{n-k} \, \Phi\!\left( \frac{c-k}{h} \right) = \alpha.
The solution for c is obtained via numerical methods. Similarly, the p-value is given by
p = 1 - F_{Z \mid \pi = \pi_0}(z) = 1 - \sum_{k=0}^{n} \binom{n}{k} \pi_0^{k} (1-\pi_0)^{n-k} \, \Phi\!\left( \frac{z-k}{h} \right),
where z is the observed value of the convolution-based test statistic.
Under the alternative hypothesis H1: π = π1 > π0, the corresponding CDF is
F_{Z \mid \pi = \pi_1}(z) = \sum_{k=0}^{n} \binom{n}{k} \pi_1^{k} (1-\pi_1)^{n-k} \, \Phi\!\left( \frac{z-k}{h} \right),
and the power of the test at π = π1 is given by
\mathrm{Power}(\pi_1) = 1 - \sum_{k=0}^{n} \binom{n}{k} \pi_1^{k} (1-\pi_1)^{n-k} \, \Phi\!\left( \frac{c-k}{h} \right).
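As an illustration, the critical value, p-value, and power calculations above can be carried out numerically as in the following sketch, which reuses convolution_cdf from the previous block (the root-finding bracket is my own choice):

```python
from scipy.optimize import brentq

def critical_value(n, pi0, alpha=0.05, h=0.01):
    """Solve 1 - F_{Z|pi0}(c) = alpha for c; F_Z is continuous, so the root is exact."""
    f = lambda c: (1.0 - convolution_cdf(c, n, pi0, h)) - alpha
    return brentq(f, -1.0, n + 1.0)          # Z essentially lives on [0 - 5h, n + 5h]

def convolution_pvalue(z, n, pi0, h=0.01):
    """One-sided p-value P(Z > z) under H0: pi = pi0."""
    return 1.0 - convolution_cdf(z, n, pi0, h)

def convolution_power(n, pi0, pi1, alpha=0.05, h=0.01):
    """Power at pi = pi1: P(Z > c), with c calibrated under pi0."""
    c = critical_value(n, pi0, alpha, h)
    return 1.0 - convolution_cdf(c, n, pi1, h)
```

For example, critical_value(10, 0.1) should return a value close to the critical value 2.9962 reported in Table 2.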

2.1. Toy Example

We simulated the random variables Y ~ Binomial(n = 20, π = 0.3) and X ~ N(0, h²) with h = 1/100, and computed Z = X + Y. For each run, we calculated the convolution-based p-value and the exact binomial p-value for testing H0: π = π0 versus H1: π > π0 (the p-values in Table 1 correspond to the null value π0 = 0.2).
We see from Table 1 that the convolution-based p-values are consistently close to the exact binomial p-values, but tend to be slightly smaller due to the smoothing effect introduced by the normal perturbation X. The addition of this small normal noise transforms the discrete binomial distribution into a continuous mixture distribution, which leads to slightly different tail probabilities. For runs with larger observed binomial counts y, both the convolution and exact tests yield smaller p-values, indicating stronger evidence against the null hypothesis H0: π = π0.
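A sketch of one such simulation run is shown below (the seed is arbitrary, the helper convolution_pvalue is from Section 2, and π0 = 0.2 is taken as the null value underlying Table 1):

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(2025)          # arbitrary seed; each run gives a new (y, x)
n, pi_true, pi0, h = 20, 0.3, 0.2, 0.01

y = rng.binomial(n, pi_true)               # discrete binomial count
x = rng.normal(0.0, h)                     # small normal perturbation
z = y + x                                  # convolution-based statistic

p_conv = convolution_pvalue(z, n, pi0, h)  # P(Z > z) under H0
p_exact = binom.sf(y - 1, n, pi0)          # exact binomial P(Y >= y) under H0
print(f"y = {y}, z = {z:.6f}, convolution p = {p_conv:.4f}, exact p = {p_exact:.4f}")
```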

2.2. Convolution Approach and Exact Binomial Test Comparison

Table 2 presents a Type I error control and power comparison between the convolution-based test and the exact binomial test for a sample size of n = 10, across varying null hypotheses π0 ∈ {0.1, 0.2, 0.3, 0.4, 0.5} and corresponding alternatives π1 ∈ {π0, π0 + 0.1, …, min(π0 + 0.4, 1)}. For each π0, a critical value c was determined so that the convolution-based test controls the Type I error exactly at level α = 0.05, as defined in Equation (3). The corresponding power was computed using Equation (5), noting that power equals the Type I error when π0 = π1. In contrast, the exact binomial rejection threshold k was chosen to ensure that the Type I error remained strictly below α.
Table 3 reports analogous results for a larger sample size of n = 20. As expected, power increases for both methods with larger n. Across both settings, the convolution-based test consistently achieves the nominal Type I error and provides greater power than the exact binomial test, particularly when the exact binomial test is overly conservative with respect to Type I error control. This improved performance is attributed to the smoothing introduced by the normal perturbation, which results in a more refined and responsive rejection region. These findings underscore the practical utility of the convolution-based test in discrete-data settings, especially when measurement or process noise is present and sample sizes are limited. The classic exact test and the convolution-based test are equivalent only at those values of π0 for which, at the given n, Equation (1) is satisfied.
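The exact binomial comparator in Tables 2 and 3 can be reproduced with a short helper that finds the smallest rejection threshold k keeping the Type I error at or below α and then evaluates the power at π1 (function name my own):

```python
from scipy.stats import binom

def exact_binomial_design(n, pi0, pi1, alpha=0.05):
    """Smallest k with P(Y >= k | pi0) <= alpha, and the resulting power at pi1."""
    for k in range(n + 1):
        if binom.sf(k - 1, n, pi0) <= alpha:   # binom.sf(k - 1, ...) = P(Y >= k)
            return k, binom.sf(k - 1, n, pi1)
    return None, 0.0
```

For n = 10 and π0 = 0.1, this yields k = 4 with power 0.1209 at π1 = 0.2, which agrees with the corresponding row of Table 2.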

3. Two-Stage Design

Similar to the one-stage test, we construct a two-stage test of the hypothesis H0: π = π0 versus H1: π > π0, offering precise control of the Type I error rate and allowing for early stopping due to futility. Let the total sample size be n = n1 + n2. After the first n1 subjects have completed their endpoint assessments, a futility stopping rule is applied.
As in the one-stage design, let Z1 denote the observed value of the convolution-based test statistic after the first n1 subjects, and let Z2 be the corresponding statistic based on an independent second cohort of n2 subjects. Under the null hypothesis H0, define the p-values as
p_1 = 1 - F_{Z \mid \pi = \pi_0}(Z_1), \qquad p_2 = 1 - F_{Z \mid \pi = \pi_0}(Z_2),
where p1 and p2 follow a uniform U(0, 1) distribution by the probability integral transform. The cumulative distribution function F_{Z|π=π0} is defined in Equation (4).
Our approach uses a p-value threshold at the interim analysis and applies Stouffer’s weighted z-score method for the final efficacy analysis [10]. Define the transformed statistics:
T_1 = \Phi^{-1}(p_1), \quad \text{based on the first } n_1 \text{ subjects},
T_2 = \Phi^{-1}(p_2), \quad \text{based on the second } n_2 \text{ subjects},
where Φ^{-1} is the quantile function of the standard normal distribution.
The combined test statistic is given by
T = \frac{w_1 T_1 + w_2 T_2}{\sqrt{w_1^{2} + w_2^{2}}},
where the weights are defined as w1 = n1/n and w2 = n2/n. With this weighting scheme, the statistic T follows a standard normal distribution under H0, i.e., T ~ N(0, 1).
The interim futility rule is to terminate the study early if p1 > pc, where the threshold pc may be specified by the user or selected to optimize statistical power or to minimize the expected sample size. Under H0, p1 is uniform, so the study continues to the second stage with probability pc, giving
\mathrm{ESN} = n_1 (1 - p_c) + (n_1 + n_2) \, p_c = n_1 + p_c \, n_2.
If the study does not stop early for futility, the final p-value is computed as p = Φ(T), where Φ denotes the standard normal CDF. The null hypothesis H0 is rejected if p < α*, where α* is an adjusted significance level chosen to ensure that the overall Type I error rate is controlled at the nominal level α.
To determine α*, let a = Φ^{-1}(pc). The conditional probability of rejecting H0, given that the futility boundary is not crossed, is
P(T < c \mid T_1 < a) = \frac{1}{\Phi(a)} \int_{-\infty}^{a} \Phi\!\left( \sqrt{2}\, c - x \right) \phi(x) \, dx,
where φ is the standard normal density function. To ensure that the overall Type I error rate is α, we solve for c in the equation
p_c \cdot P(T < c \mid T_1 < a) = \alpha,
and define the adjusted significance level as α* = Φ(c), which satisfies α* > α. Once α* is determined numerically, power can be calculated under H1 via simulation. For a fixed n, we can find combinations of n1 and n2 such that the overall Type I error rate is α = 0.05 and the power is greater than or equal to the desired power. Practically speaking, one can start at the n determined by the Simon two-stage design and reduce it in increments of 1 until the power constraint is no longer satisfied. The search is over values of n1 and pc, with n2 = n − n1. This is illustrated in the next section.
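A minimal numerical sketch of this calibration, following the displayed conditional-probability integral and the ESN expression above (scipy quadrature and root finding; function names my own), is as follows:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.optimize import brentq

def conditional_rejection_prob(c, a):
    """P(T < c | T1 < a) from the displayed integral with Phi(sqrt(2) c - x)."""
    integrand = lambda x: norm.cdf(np.sqrt(2.0) * c - x) * norm.pdf(x)
    value, _ = quad(integrand, -np.inf, a)
    return value / norm.cdf(a)

def adjusted_alpha(p_c, alpha=0.05):
    """Solve p_c * P(T < c | T1 < a) = alpha for c and return alpha* = Phi(c)."""
    a = norm.ppf(p_c)                       # futility boundary on the z-scale
    f = lambda c: p_c * conditional_rejection_prob(c, a) - alpha
    c_star = brentq(f, -10.0, 10.0)
    return norm.cdf(c_star)

def expected_sample_size(n1, n2, p_c):
    """ESN under H0: stop at n1 with probability 1 - p_c, else continue to n1 + n2."""
    return n1 * (1.0 - p_c) + (n1 + n2) * p_c
```

For instance, adjusted_alpha(0.20) is approximately 0.066 and expected_sample_size(8, 3, 0.20) equals 8.6, in line with the π1 = 0.4, n = 11 row of Table 5.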

Convolution Approach and Simon Two-Stage Design Comparison

There is no straightforward way to directly compare the convolution-based approach with Simon’s two-stage design. A key distinguishing feature is that the convolution-based method precisely controls the Type I error rate, whereas Simon’s two-stage design may be conservative in some scenarios, as illustrated earlier in Figure 2. In settings where the actual Type I error of Simon’s design falls well below the nominal level, the convolution-based method can achieve the same desired power with a smaller sample size, particularly when the difference between π1 and π0 is substantial.
To illustrate, we consider testing H0: π = π0 versus H1: π > π0 with π0 = 0.1 and π1 ranging from 0.3 to 0.6 in increments of 0.1. The results of various Simon two-stage designs are presented in Table 4, showing the corresponding total sample size, exact Type I error, power, expected sample size under the null (EN0), and probability of early stopping.
Similarly, Table 5 displays results from the convolution-based two-stage designs across the same values of π1. For each design, we considered a range of pc values from 0.2 to 0.7 in steps of 0.01, and we report the total sample size n, the stage-wise sample sizes n1 and n2, pc, the power, the expected sample size (ESN), the adjusted significance level α* for testing at the second stage, and the probability of early stopping, which equals 1 − pc. The designs in Table 5 represent a subset of possible scenarios to provide a concise summary.
The convolution-based approach demonstrates a clear advantage by achieving slightly smaller sample sizes across all settings. Moreover, it is not constrained by the range of early stopping probabilities observed in the Simon designs (approximately 0.549 to 0.810 in our example). Instead, the convolution-based method allows the user to specify any desired pc (and thus early stopping probability), offering greater flexibility to tailor the design to the specific goals and constraints of a given clinical trial.

4. Real World Examples

4.1. One-Stage Designs

4.1.1. Example 1

Our first example [11] is from a study of patients with concomitant advanced non-small cell lung cancer (NSCLC) and interstitial lung disease (ILD). This prospective, multicenter, single-arm phase 2 trial investigated the efficacy and safety of albumin-bound paclitaxel (nab-paclitaxel) in combination with carboplatin in patients with both advanced NSCLC and ILD. The primary endpoint was the overall response rate (ORR), testing H0: π = π0 versus H1: π > π0, with π0 = 0.2, alternative π1 = 0.4, α = 0.05, and power = 0.80. Based on an exact binomial test, this required a sample size of n = 35. The actual study enrolled n = 36 subjects between April 2014 and September 2017, corresponding to an average accrual rate of approximately 1.3 subjects per month.
Using the same design parameters, the convolution-based test would require n = 32 subjects to achieve a power of 0.8117, or n = 31 for a power of 0.7967, potentially reducing the trial duration by 4 to 5 months, with corresponding cost savings. The p-value was <0.001 under both the exact binomial and convolution-based approaches. If a prorated response of 17 out of 31 positive responses were observed, the p-value using the convolution-based method would still be <0.001.
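The sample-size reduction quoted here can be found with a simple search over n using the convolution_power helper from Section 2 (a sketch under the stated design parameters; the upper search limit is my own):

```python
def minimal_sample_size(pi0, pi1, alpha=0.05, power=0.80, h=0.01, n_max=200):
    """Smallest n at which the convolution-based test reaches the target power."""
    for n in range(2, n_max + 1):
        if convolution_power(n, pi0, pi1, alpha, h) >= power:
            return n
    return None

# With pi0 = 0.2, pi1 = 0.4, alpha = 0.05, this search should return a value near
# n = 32 (power about 0.81), compared with n = 35 for the exact binomial test.
```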

4.1.2. Example 2

Our next example study [12] enrolled n = 15 metastatic prostate cancer patients with AR-V7-expressing circulating tumor cells into a prospective phase II trial. The primary endpoint was PSA response, with hypothesis testing conducted as H0: π = π0 versus H1: π > π0, where π0 = 0.05, π1 = 0.264, α = 0.10, and target power = 0.80. The true Type I error rate for the exact binomial test was 0.0362. A positive outcome in the study was thus defined as ≥3 of 15 patients achieving a PSA response.
Using the same design parameters, the convolution-based test would require n = 11 subjects to achieve a power of 0.824, or n = 10 for a power of 0.793. In the actual trial, 2 out of 15 subjects achieved a PSA response, yielding a p-value of 0.14 using the exact binomial test. The convolution-based test yielded a p-value of 0.106. Although not statistically significant at the α = 0.10 level, this result demonstrates the relative efficiency of the convolution-based approach.

4.2. Two-Stage Designs

4.2.1. Example 1

Our next example [13] is based on a study evaluating vinorelbine in advanced non-small cell lung cancer (NSCLC) patients aged 70 years or older. The study employed a multicenter, two-stage phase II design following Simon’s optimal method. The primary endpoint was the objective response rate (ORR), with hypothesis testing structured as H0: π = π0 versus H1: π > π0, where π0 = 0.10, π1 = 0.25, h = 0.01, α = 0.05, and target power = 0.80.
The final sample size under Simon’s design was n = 43, with an interim futility analysis planned at n1 = 18 subjects. The actual Type I error rate achieved was 0.048. Table 6 presents three example two-stage designs generated using the convolution-based estimator. Notably, if one is willing to delay the interim futility analysis beyond n1 = 18, the total required sample size can be reduced from n = 43 to n = 35, offering a more efficient alternative to Simon’s optimal design. Even maintaining the interim analysis at n1 = 18, the convolution-based approach can still reduce the total sample size to n = 37.
In the observed trial data, 5 of the initial 18 subjects achieved a positive ORR, with a total of 10 ORR responses observed out of 43 by the end of the study. Simon’s optimal design specified early stopping for futility if 3 or fewer ORR responses were observed in the first stage, and declared efficacy if 8 or more total responses were observed at the second stage.
For each convolution-based design in Table 6, we retrospectively aligned the observed data to the alternative designs to estimate what would have occurred under those configurations (a computational sketch for the third configuration follows the list):
  • For n = 35, n1 = 27: Assuming 7 out of 27 responses, the futility p-value was 0.011, below the stopping threshold pc = 0.23. The final-stage p-value, assuming 8 out of 35 total responses, was 0.0158, significant at the adjusted level α* = 0.062.
  • For n = 36, n1 = 24: Assuming 6 out of 24 responses, the futility p-value was 0.002, below pc = 0.24. The final-stage p-value based on 8 of 36 responses was 0.0162, also significant at α* = 0.061.
  • For n = 37, n1 = 18: Assuming 5 out of 18 responses, the futility p-value was 0.015, which is less than pc = 0.56. The final p-value with 8 of 37 responses was 0.020, significant at α* = 0.051.
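As an example, the third configuration (n = 37, n1 = 18) can be evaluated with the sketch below, which reuses convolution_pvalue from Section 2 and applies the Stouffer combination with the stated weights wi = ni/n; because the small normal perturbations are random draws, the resulting p-values will vary slightly around the figures quoted above:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(17)            # arbitrary seed
pi0, h = 0.10, 0.01
n1, n2 = 18, 19                            # n = 37 configuration from Table 6
y1, y2 = 5, 3                              # 5/18 at the interim; 8/37 overall => 3 in stage 2

z1 = y1 + rng.normal(0.0, h)
z2 = y2 + rng.normal(0.0, h)
p1 = convolution_pvalue(z1, n1, pi0, h)    # continue if p1 <= p_c = 0.56
p2 = convolution_pvalue(z2, n2, pi0, h)

n = n1 + n2
w1, w2 = n1 / n, n2 / n                    # stated stage weights
T = (w1 * norm.ppf(p1) + w2 * norm.ppf(p2)) / np.sqrt(w1**2 + w2**2)
p_final = norm.cdf(T)                      # reject H0 for efficacy if p_final < alpha*
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, final p = {p_final:.3f}")
```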
This example again highlights the potential cost savings and efficiency gains achievable with the convolution-based approach compared to Simon’s two-stage design, while maintaining rigorous Type I error control and statistical power.

4.2.2. Example 2

Our next example [14] comes from a publication describing the LuDO-N trial, a phase II, open-label, multicenter, single-arm, two-stage clinical trial in children with high-risk neuroblastoma, utilizing an alternative administration schedule of Lutetium DOTATATE. The primary endpoint is the response rate, assessed by the Revised International Neuroblastoma Response Criteria one month after completion of therapy. Hypothesis testing is structured as H0: π = π0 versus H1: π > π0, where π0 = 0.2, π1 = 0.4, h = 0.01, α = 0.1, and the target power is 0.80.
The proposed design follows Simon’s two-stage minimax design. Recruitment is expected to be completed within 3–5 years. Based on the specified parameters, the Simon design requires a total sample size of n = 24, with a first-stage futility analysis at n1 = 14, yielding a true Type I error rate of 0.0874.
In contrast, an alternative convolution-based design would require a total of n = 19 subjects, with n1 = 16 in the first stage, using a futility threshold of pc = 0.21 and a final adjusted significance level of α* = 0.158. If the projected recruitment rate is 24 subjects over 5 years (approximately 4.8 subjects per year), the convolution-based design would reduce the total accrual period by roughly one year.

5. Conclusions

The convolution-based approach presented in this work offers a flexible and efficient alternative to traditional exact methods for designing and analyzing single-arm phase II oncology trials with binary endpoints. By convolving the binomial distribution with a simulated normal random variable, this method produces an unbiased estimator for the response rate π and achieves precise Type I error control. This leads to reduced sample size requirements in both one-stage and two-stage designs, while maintaining desirable operating characteristics.
A significant advantage of the convolution framework is its adaptability. It can be easily modified to incorporate an interim analysis for either futility or combined efficacy and futility, allowing for early decision-making and further optimization of trial resources. Moreover, the convolution-based method provides a smooth and continuous approximation to the binomial tail distribution, making it especially valuable in scenarios where measurement error or process variability exists, and a continuous p-value function is preferred.
Overall, this approach enhances the efficiency of early-phase oncology clinical trial design and provides a theoretically sound and practically implementable alternative to exact binomial and Simon’s two-stage methods. It holds particular promise for high-cost therapeutic areas, such as oncology and cellular therapies, where efficient trial execution is critical.

Funding

This work was supported by the following three NCI grants to Hutson: NRG Oncology Statistical and Data Management Center grant (grant no. U10CA180822); Immuno-Oncology Translational Network (IOTN) Moonshot grant (grant no. U24CA232979); Acquired Resistance to Therapy Network (ARTNet) grant (grant no. U24CA274159).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data utilized to illustrate this work are provided in the body of the text in Section 4.

Acknowledgments

We wish to thank the reviewers for their thoughtful remarks, which led to an improved version of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Prepared by Battelle Technology Partnership Practice. Biopharmaceutical Industry-Sponsored Clinical Trials: Impact on State Economies. Prepared for Pharmaceutical Research and Manufacturers of America (PhRMA). 2015. Available online: http://orclinicalresearch.com/wp-content/uploads/2018/12/battelle-2015-study.pdf (accessed on 8 July 2025).
  2. Leighl, N.B.; Nirmalakumar, S.; Ezeife, D.A.; Gyawali, B. An Arm and a Leg: The Rising Cost of Cancer Drugs and Impact on Access. Am. Soc. Clin. Oncol. Educ. Book. Am. Soc. Clin. Oncol. Annu. Meet. 2021, 41, e1–e12. [Google Scholar] [CrossRef] [PubMed]
  3. Kapinos, K.A.; Hu, E.; Trivedi, J.; Geethakumari, P.R.; Kansagra, A. Cost-Effectiveness Analysis of CAR T-Cell Therapies vs. Antibody Drug Conjugates for Patients with Advanced Multiple Myeloma. Cancer Control 2023, 30, 1–8. [Google Scholar] [CrossRef] [PubMed]
  4. Hoover, A.; Reimche, P.; Watson, D.; Tanner, L.; Gilchrist, L.; Finch, M.; Messinger, Y.H.; Turcotte, L.M. Healthcare cost and utilization for chimeric antigen receptor (CAR) T-cell therapy in the treatment of pediatric acute lymphoblastic leukemia: A commercial insurance claims database analysis. Cancer Rep. 2024. Epub ahead of print. [Google Scholar] [CrossRef] [PubMed]
  5. Simon, R. Optimal two-stage designs for phase II clinical trials. Control. Clin. Trials 1989, 10, 1–10. [Google Scholar] [CrossRef] [PubMed]
  6. Hamaker, H.C.; van Strik, R. The Efficiency of Double Sampling for Attributes. J. Am. Stat. Assoc. 1955, 50, 830–849. [Google Scholar] [CrossRef]
  7. Duncan, A.J. Quality Control and Industrial Statistics; Irwin: Homewood, IL, USA, 1986. [Google Scholar]
  8. Hutson, A.D. Modifying the Exact Test for a Binomial Proportion and Comparisons with Other Approaches. J. Appl. Stat. 2006, 33, 679–690. [Google Scholar] [CrossRef]
  9. Chernick, M.R.; Christine, Y.; Liu, C.Y. The Saw-Toothed Behavior of Power Versus Sample Size and Software Solutions: Single Binomial Proportion Using Exact Methods. Am. Stat. 2002, 56, 149–155. [Google Scholar] [CrossRef]
  10. Whitlock, M.C. Combining probability from independent tests: The weighted Z-method is superior to Fisher’s approach. J. Evol. Biol. 2005, 18, 1368–1373. [Google Scholar] [CrossRef] [PubMed]
  11. Asahina, H.; Oizumi, S.; Takamura, K.; Harada, T.; Harada, M.; Yokouchi, H.; Kanazawa, K.; Fujita, Y.; Kojima, T.; Sugaya, F.; et al. A prospective phase II study of carboplatin and nab-paclitaxel in patients with advanced non-small cell lung cancer and concomitant interstitial lung disease (HOT1302). Lung Cancer 2019, 138, 65–71. [Google Scholar] [CrossRef] [PubMed]
  12. Boudadi, K.; Suzman, D.L.; Anagnostou, V.; Fu, W.; Luber, B.; Wang, H.; Niknafs, N.; White, J.R.; Silberstein, J.L.; Sullivan, R.; et al. Ipilimumab plus nivolumab and DNA-repair defects in AR-V7-expressing metastatic prostate cancer. Oncotarget 2018, 9, 28561–28571. [Google Scholar] [CrossRef] [PubMed]
  13. Gridelli, C.; Perrone, F.; Gallo, C.; De Marinis, F.; Ianniello, G.; Cigolari, S.; Cariello, S.; Di Costanzo, F.; D’Aprile, M.; Rossi, A.; et al. Vinorelbine is well tolerated and active in the treatment of elderly patients with advanced non-small cell lung cancer. A two-stage phase II study. Eur. J. Cancer 1997, 33, 392–397. [Google Scholar] [CrossRef] [PubMed]
  14. Sundquist, F.; Georgantzi, K.; Jarvis, K.B.; Brok, J.; Koskenvuo, M.; Rascon, J.; van Noesel, M.; Grybäck, P.; Nilsson, J.; Braat, A.; et al. A Phase II Trial of a Personalized, Dose-Intense Administration Schedule of 177Lutetium-DOTATATE in Children With Primary Refractory or Relapsed High-Risk Neuroblastoma-LuDO-N. Front. Pediatr. 2022, 10, 836230. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Type I error control for the exact binomial test as a function of π0, for n = 10, 20, 30, 40.
Figure 2. Type I error control and power for the minimax Simon’s two-stage design as a function of π1, with π0 = 0.1 and π1 ranging from 0.19 to 0.80.
Table 1. Comparison of mixture-based and exact binomial p-values over three simulation runs.

Run   y    x          z = x + y   F_Z(z)   Convolution p-Value   Exact Binomial p-Value
1     4    0.007968   4.007968    0.5832   0.4168                0.5886
2     11   0.014024   11.01402    0.9999   0.0001                0.0006
3     3    0.008362   3.008362    0.3701   0.6299                0.7939
Table 2. Comparison of power between the mixture-based convolution test and the exact binomial test for sample size n = 10, across values of π0 and π1, with h = 0.01. The rejection threshold k ensures the binomial test controls the Type I error strictly below α = 0.05.

π0    π1    Critical c   Mixture Power   Rejection k   Binomial Power
0.1   0.1   2.9962       0.050000        4             0.012795
0.1   0.2   2.9962       0.251377        4             0.120874
0.1   0.3   2.9962       0.523352        4             0.350389
0.1   0.4   2.9962       0.757080        4             0.617719
0.1   0.5   2.9962       0.904088        4             0.828125
0.2   0.2   4.0086       0.050000        5             0.032793
0.2   0.3   4.0086       0.189362        5             0.150268
0.2   0.4   4.0086       0.415895        5             0.366897
0.2   0.5   4.0086       0.663109        5             0.623047
0.2   0.6   4.0086       0.855538        5             0.833761
0.3   0.3   5.0195       0.050000        6             0.047349
0.3   0.4   5.0195       0.171407        6             0.166239
0.3   0.5   5.0195       0.383292        6             0.376953
0.3   0.6   5.0195       0.638272        6             0.633103
0.3   0.7   5.0195       0.852383        6             0.849732
0.4   0.4   6.9878       0.050000        8             0.012295
0.4   0.5   6.9878       0.158735        8             0.054688
0.4   0.6   6.9878       0.358174        8             0.167290
0.4   0.7   6.9878       0.619691        8             0.382783
0.4   0.8   6.9878       0.856551        8             0.677800
0.5   0.5   7.9876       0.050000        9             0.010742
0.5   0.6   7.9876       0.154390        9             0.046357
0.5   0.7   7.9876       0.357879        9             0.149308
0.5   0.8   7.9876       0.645587        9             0.375810
0.5   0.9   7.9876       0.909147        9             0.736099
Table 3. Power comparison between the mixture-based convolution test and the exact binomial test for sample size n = 20, across values of π0 and π1, with h = 0.01. The rejection threshold k ensures the binomial test controls the Type I error strictly below α = 0.05.

π0    π1    Critical c   Mixture Power   Rejection k   Binomial Power
0.1   0.1   4.0143       0.050000        5             0.043174
0.1   0.2   4.0143       0.386941        5             0.370352
0.1   0.3   4.0143       0.772408        5             0.762492
0.1   0.4   4.0143       0.951708        5             0.949048
0.1   0.5   4.0143       0.994442        5             0.994091
0.2   0.2   7.0045       0.050000        8             0.032143
0.2   0.3   7.0045       0.281501        8             0.227728
0.2   0.4   7.0045       0.638410        8             0.584107
0.2   0.5   7.0045       0.892613        8             0.868412
0.2   0.6   7.0045       0.983738        8             0.978971
0.3   0.3   9.0186       0.050000        10            0.047962
0.3   0.4   9.0186       0.249643        10            0.244663
0.3   0.5   9.0186       0.593093        10            0.588099
0.3   0.6   9.0186       0.874692        10            0.872479
0.3   0.7   9.0186       0.983230        10            0.982855
0.4   0.4   11.9910      0.050000        13            0.021029
0.4   0.5   11.9910      0.229635        13            0.131588
0.4   0.6   11.9910      0.562559        13            0.415893
0.4   0.7   11.9910      0.865636        13            0.772272
0.4   0.8   11.9910      0.985944        13            0.967857
0.5   0.5   13.9918      0.050000        15            0.020695
0.5   0.6   13.9918      0.224232        15            0.125599
0.5   0.7   13.9918      0.568302        15            0.416371
0.5   0.8   13.9918      0.890702        15            0.804208
0.5   0.9   13.9918      0.995777        15            0.988747
Table 4. Summary of Simon two-stage designs with one-sided Type I error rate α = 0.05, power = 0.8, π0 = 0.1, h = 0.01. Designs are shown for various response probabilities π1.

π1    n    n1   r1   r2   Type I Error   Power   EN0    P(Early Stop)   Method
0.3   25   15   1    5    0.033          0.802   19.5   0.549           Minimax
0.3   26   12   1    5    0.036          0.805   16.8   0.659
0.3   27   11   1    5    0.040          0.806   15.8   0.697
0.3   29   10   1    5    0.047          0.805   15.0   0.736           Optimal
0.4   13   8    1    3    0.031          0.802   8.9    0.813           Minimax
0.4   15   4    0    3    0.043          0.818   7.8    0.656           Optimal
0.5   8    4    0    2    0.036          0.836   5.4    0.656           Minimax
0.5   9    3    0    2    0.041          0.828   4.6    0.729           Optimal
0.6   6    3    0    2    0.015          0.807   3.8    0.729           Minimax
0.6   8    2    0    2    0.025          0.819   3.1    0.810           Optimal
Table 5. Summary of convolution-based two-stage designs with one-sided Type I error rate α = 0.05, power = 0.8, π0 = 0.1, h = 0.01. Designs are shown for various response probabilities π1.

π1    n    n1   n2   pc     Power   ESN    α*      P(Early Stop)
0.3   23   16   7    0.34   0.810   18.4   0.055   0.66
0.3   23   17   6    0.32   0.807   18.9   0.056   0.68
0.3   23   17   6    0.30   0.806   18.8   0.057   0.70
0.3   23   15   8    0.31   0.806   17.5   0.056   0.69
0.3   23   14   9    0.37   0.806   17.3   0.054   0.63
0.3   23   14   9    0.45   0.806   18.0   0.052   0.55
0.3   23   14   9    0.46   0.806   18.1   0.052   0.54
0.3   23   16   7    0.69   0.806   20.8   0.050   0.31
0.3   23   16   7    0.36   0.805   18.5   0.054   0.64
0.3   23   15   8    0.37   0.805   18.0   0.054   0.63
0.3   23   13   10   0.50   0.805   18.0   0.051   0.50
0.3   23   13   10   0.53   0.805   18.3   0.051   0.47
0.3   23   12   11   0.70   0.805   19.7   0.050   0.30
0.4   11   8    3    0.20   0.801   8.6    0.066   0.80
0.4   11   9    2    0.21   0.801   9.4    0.064   0.79
0.5   7    5    2    0.20   0.809   5.4    0.066   0.80
0.5   7    5    2    0.25   0.805   5.5    0.060   0.75
0.5   7    5    2    0.30   0.801   5.6    0.057   0.70
0.6   5    3    2    0.29   0.810   3.6    0.057   0.71
0.6   5    3    2    0.38   0.804   3.8    0.053   0.62
0.6   5    3    2    0.40   0.801   3.8    0.053   0.60
0.6   5    3    2    0.52   0.801   4.0    0.051   0.48
Table 6. Convolution-based design scenarios for Example 1: two-stage design.

n    n1   n2   pc     Power   ESN    α*
35   27   8    0.23   0.803   28.8   0.062
36   24   12   0.24   0.802   26.9   0.061
37   18   19   0.56   0.801   28.6   0.051