Article

Inflation of Familywise Error Rate in Treatment Efficacy Testing Due to the Reallocation of Significance Levels Based on Safety Data

Clinical Research Center, Shizuoka Cancer Center, 1007 Shimonagakubo, Sunto-Nagaizumi, Shizuoka 411-8777, Japan
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(16), 2547; https://doi.org/10.3390/math13162547
Submission received: 23 June 2025 / Revised: 27 July 2025 / Accepted: 6 August 2025 / Published: 8 August 2025
(This article belongs to the Special Issue Sequential Sampling Methods for Statistical Inference)

Abstract

In randomized clinical trials comparing two standard treatments, a two-sided test for efficacy at a significance level α is typically used when neither treatment is expected to be superior at the design stage. This two-sided test comprises two one-sided tests, each conducted at a significance level of α / 2 . If safety data later suggest that one treatment is not clinically acceptable due to a higher rate of adverse events, investigators may reallocate the α / 2 significance level originally assigned to the one-sided efficacy test for that treatment to the other one-sided test. This results in a two-stage procedure. We examine the impact of such reallocation on the familywise error rate (FWER). Using theoretical derivations and simulation studies, we show that FWER can exceed the nominal level α when the treatment with fewer adverse events tends to show greater efficacy. Therefore, the two-stage procedure should be avoided when strict control of FWER is a priority. These findings emphasize the need for caution when reallocating significance levels based on auxiliary information and have implications beyond clinical trials, particularly in adaptive statistical methodologies.
MSC:
62F03; 62J15; 62L10; 62P10

1. Introduction

This paper considers a randomized clinical trial comparing the efficacy of two standard treatments. When neither treatment is expected to be superior at the design stage, a two-sided test for efficacy at a significance level α is typically employed [1]. If one treatment is shown to be more effective than the other, it will continue to be the standard treatment, while the other will not be adopted as the standard treatment.
The two-sided test at a significance level of α consists of two one-sided tests, each assessing whether one treatment is superior to the other, with a significance level of α / 2 . If, for some reason, one of the one-sided tests becomes irrelevant, the significance level of α / 2 assigned to that test will be wasted. Suppose that one treatment is found to have a very high rate of adverse events. Even though it has been shown to be more effective, clinicians may consider it unreasonable for that treatment to remain the standard while the other treatment does not. In such a case, investigators would not be interested in demonstrating that the treatment is more effective. Instead, they would seek to reallocate the significance level of α / 2 , originally assigned to testing whether the treatment is more effective, to testing whether the other treatment is more effective.
This procedure consists of two stages: The first stage reviews safety data and determines the reallocation of the significance level, while the second stage conducts the efficacy test. Two-stage procedures in hypothesis testing, although different from the one described above, have been widely discussed in the statistical literature [2,3,4,5,6]. If the two stages are not properly considered, using these procedures can lead to an increased type I error rate or familywise error rate (FWER), defined as the probability of rejecting one or more true null hypotheses under any combination of true and false null hypotheses [7].
For example, seamless phase II/III trials use data from the phase II stage in the phase III stage, and the two-stage procedure involves decisions made at both stages [2]. Senn [3] introduced a different two-stage procedure in a cross-over trial, where a preliminary test on a carry-over effect guides the choice of a primary test for the treatment effect. Similarly, Kahan [4] considered a factorial design and used an interaction test to determine whether to compare individual treatment combinations or contrasts such as A vs. not-A and B vs. not-B. Campbell and Dean [5] considered a two-stage procedure for Cox models, selecting a test for regression coefficients based on a test of the proportional hazard assumption. A two-stage procedure for group-sequential designs with two endpoints was also described in [6].
To the best of our knowledge, the reallocation of significance levels in two-sided hypothesis testing problems has not been mathematically investigated in the existing literature. This paper examines the validity of the two-stage procedure in such settings in terms of controlling FWER. The remainder of the paper is organized as follows. Section 2 provides a formal analytic description of the problem, followed by the calculation of FWER in Section 3 and the calculation of power in Section 4. Section 5 presents a statistical model for individuals, which satisfies the requirements outlined in Section 2. Section 6 presents a concluding discussion.

2. Reallocation of Significance Levels Based on Safety Data

We present a formal analytic description of the problem discussed in the Introduction. Two treatments are represented by $T_1$ and $T_2$. For simplicity, we assume that the safety data used to determine the reallocation of significance levels relate to a single adverse event and are summarized by the difference in the proportions of patients experiencing the adverse event under treatments $T_1$ and $T_2$. In addition, a rule for determining the reallocation of significance levels based on these data is established when planning the trial. Let $\theta$ be an effect measure of efficacy, where $\theta > 0$ indicates that $T_1$ is more efficacious than $T_2$, $\theta = 0$ indicates that $T_1$ and $T_2$ have the same efficacy, and $\theta < 0$ indicates that $T_2$ is more efficacious than $T_1$. One example of such a measure is $\theta = E(Y \mid T = T_1) - E(Y \mid T = T_2)$, where $Y$ is a continuous outcome and $T$ is the assigned treatment taking values $T_1$ and $T_2$. The null hypotheses of the two one-sided tests and the one two-sided test concerning $\theta$ are defined as $H_0^{T_1}: \theta \le 0$, $H_0^{T_1,T_2}: \theta = 0$, and $H_0^{T_2}: \theta \ge 0$. Note that $H_0^{T_1,T_2} = H_0^{T_1} \cap H_0^{T_2}$.
Let $Z_E$ be a test statistic for $\theta$ that follows a normal distribution with mean $\theta$ and variance 1. Let $Z_S$ be the standardized difference between the proportions of patients experiencing an adverse event under treatments $T_1$ and $T_2$. Suppose that $(Z_E, Z_S)$ follows a two-dimensional normal distribution with mean vector $(\theta, 0)$, marginal variances of 1, and correlation coefficient $r$. The mean of $Z_S$ is assumed to be 0 since it is not the focus of interest.
A two-stage procedure is determined with cutoff values $c_L$ and $c_U$ ($c_L \le c_U$) for safety data. If $c_L \le Z_S \le c_U$, the two-sided test for $H_0^{T_1,T_2}$ is conducted at the significance level of $\alpha$, which is equivalent to performing two one-sided tests for $H_0^{T_1}$ and $H_0^{T_2}$, each at the significance level of $\alpha/2$. If $Z_S > c_U$, the significance level of $\alpha/2$ assigned to the test for $H_0^{T_1}$ is reallocated to the test for $H_0^{T_2}$, so that only the test for $H_0^{T_2}$ is conducted at the significance level of $\alpha$. If $Z_S < c_L$, the significance level of $\alpha/2$ assigned to the test for $H_0^{T_2}$ is reallocated to the test for $H_0^{T_1}$, so that only the test for $H_0^{T_1}$ is conducted at the significance level of $\alpha$.
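The decision rule just described can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `two_stage_decision` and the string labels `"H0_T1"` and `"H0_T2"` are hypothetical. It assumes that the null hypothesis for $T_1$ (that $\theta \le 0$) is rejected for large values of the efficacy statistic and the null hypothesis for $T_2$ (that $\theta \ge 0$) for small values.

```python
from statistics import NormalDist


def two_stage_decision(z_e, z_s, c_l, c_u, alpha=0.05):
    """Return the set of null hypotheses rejected by the two-stage rule.

    Hypothetical labels: "H0_T1" stands for theta <= 0 (rejected when Z_E
    is large) and "H0_T2" for theta >= 0 (rejected when Z_E is small).
    """
    std = NormalDist()
    z_a = std.inv_cdf(1.0 - alpha)         # critical value at level alpha
    z_a2 = std.inv_cdf(1.0 - alpha / 2.0)  # critical value at level alpha/2
    rejected = set()
    if z_s > c_u:
        # More adverse events under T1: only H0_T2 is tested, at level alpha.
        if z_e < -z_a:
            rejected.add("H0_T2")
    elif z_s < c_l:
        # More adverse events under T2: only H0_T1 is tested, at level alpha.
        if z_e > z_a:
            rejected.add("H0_T1")
    else:
        # No reallocation: two one-sided tests, each at level alpha/2.
        if z_e > z_a2:
            rejected.add("H0_T1")
        elif z_e < -z_a2:
            rejected.add("H0_T2")
    return rejected


# Z_E = 1.8 is significant only when the full alpha is available.
print(two_stage_decision(1.8, -2.0, -1.5, 1.5))
print(two_stage_decision(1.8, 0.0, -1.5, 1.5))
```

With $\alpha = 0.05$, an efficacy statistic of 1.8 lies between the two critical values (about 1.645 and 1.960), so the outcome of the efficacy test depends entirely on which branch the safety statistic selects.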
The two-stage procedure described above maintains a type I error rate of $\alpha$ for each test. However, due to the correlation $r = \mathrm{cor}(Z_E, Z_S)$ between the efficacy test statistic and the safety index, it is unclear if the two-stage procedure adequately controls FWER. The next section mathematically evaluates FWER for the two-stage procedure.

3. Derivation of the Familywise Error Rate for the Two-Stage Procedure

This section evaluates FWER for the two-stage procedure. FWER for the two-stage procedure is a function of the effect measure $\theta$, the correlation coefficient $r$, the cutoff values $c_L$ and $c_U$, and the significance level $\alpha$. The function is denoted by $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)$. Let $\Phi(x)$ be the cumulative distribution function of the standard normal distribution, and let $z_\alpha$ be the upper $\alpha$ quantile of the standard normal distribution, that is, the value satisfying $1 - \Phi(z_\alpha) = \alpha$.
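As a small illustration of this notation, the upper-tail critical values can be obtained from the Python standard library. The helper name `upper_quantile` is an assumption made for this sketch.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal distribution (mean 0, variance 1)


def upper_quantile(alpha):
    """Value z with upper-tail probability alpha under the standard normal."""
    return std.inv_cdf(1.0 - alpha)


alpha = 0.05
z_alpha = upper_quantile(alpha)           # about 1.645
z_alpha_half = upper_quantile(alpha / 2)  # about 1.960
print(round(z_alpha, 3), round(z_alpha_half, 3))  # prints 1.645 1.96
```

The one-sided critical value at level $\alpha$ is smaller than the one at level $\alpha/2$, which is the ordering the derivations in this section rely on.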
In the following, we evaluate FWER in three distinct cases: $\theta > 0$, $\theta < 0$, and $\theta = 0$. When $\theta > 0$, FWER is given by

$$\mathrm{FWER}(\theta, r, c_L, c_U, \alpha) = P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ Z_E < -z_{\alpha/2}).$$

Since $-z_{\alpha/2} < -z_\alpha$, it follows that

$$P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ Z_E < -z_{\alpha/2}) < P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ Z_E < -z_\alpha),$$

which equals $P(c_L \le Z_S,\ Z_E < -z_\alpha)$. We then have

$$P(c_L \le Z_S,\ Z_E < -z_\alpha) \le P(Z_E < -z_\alpha) = P(Z_E - \theta < -z_\alpha - \theta) = \Phi(-z_\alpha - \theta) < \Phi(-z_\alpha) = \alpha.$$

Hence, for $\theta > 0$, we conclude that $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha) \le \alpha$.
When $\theta < 0$, FWER is given by

$$\mathrm{FWER}(\theta, r, c_L, c_U, \alpha) = P(Z_S < c_L,\ z_\alpha < Z_E) + P(c_L \le Z_S \le c_U,\ z_{\alpha/2} < Z_E).$$

Since $z_\alpha < z_{\alpha/2}$, it follows that

$$P(Z_S < c_L,\ z_\alpha < Z_E) + P(c_L \le Z_S \le c_U,\ z_{\alpha/2} < Z_E) < P(Z_S < c_L,\ z_\alpha < Z_E) + P(c_L \le Z_S \le c_U,\ z_\alpha < Z_E),$$

which equals $P(Z_S \le c_U,\ z_\alpha < Z_E)$. We then have

$$P(Z_S \le c_U,\ z_\alpha < Z_E) \le P(z_\alpha < Z_E) = P(z_\alpha - \theta < Z_E - \theta) = 1 - \Phi(z_\alpha - \theta) < 1 - \Phi(z_\alpha) = \alpha.$$

Hence, for $\theta < 0$, we conclude that $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha) \le \alpha$.
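The two bounds above do not depend on the correlation or on the cutoff values, and by the symmetry of the standard normal distribution both can be written as $\Phi(-z_\alpha - |\theta|)$. The following sketch evaluates that expression for a few effect sizes; the function name `fwer_upper_bound` is hypothetical, and the snippet illustrates the inequality rather than computing FWER exactly.

```python
from statistics import NormalDist


def fwer_upper_bound(theta, alpha=0.05):
    """Upper bound Phi(-z_alpha - |theta|) on FWER, valid for theta != 0."""
    if theta == 0:
        raise ValueError("the bound is derived for theta != 0 only")
    std = NormalDist()
    z_alpha = std.inv_cdf(1.0 - alpha)
    return std.cdf(-z_alpha - abs(theta))


# The bound falls quickly below alpha as |theta| moves away from zero.
for theta in (0.5, 1.5, 3.0):
    print(theta, round(fwer_upper_bound(theta), 4))
```

The bound approaches $\alpha$ only as $\theta$ approaches 0, which is why the case $\theta = 0$ needs separate treatment.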
When $\theta = 0$, $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)$ is given by

$$P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ z_{\alpha/2} < |Z_E|) + P(Z_S < c_L,\ z_\alpha < Z_E).$$

When $\theta = 0$ and $r = 0$, $Z_E$ and $Z_S$ are independent since $(Z_E, Z_S)$ follows a two-dimensional normal distribution. Therefore, $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)$ can be written as

$$P(c_U < Z_S)\,P(Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U)\,P(z_{\alpha/2} < |Z_E|) + P(Z_S < c_L)\,P(z_\alpha < Z_E),$$

which equals

$$\alpha \left\{ P(c_U < Z_S) + P(c_L \le Z_S \le c_U) + P(Z_S < c_L) \right\} = \alpha.$$

Hence, when $\theta = 0$ and $r = 0$, we conclude that $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha) = \alpha$.
When $\theta = 0$ and $r \ne 0$, $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)$ can be calculated via numerical integration. Table 1 presents the values of $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)$ for $\alpha = 0.05$ under various combinations of $r$, $c_L$, and $c_U$. When $Z_E$ and $Z_S$ are negatively correlated, that is, when treatments with fewer adverse events tend to show better efficacy, FWER exceeds $\alpha$. In contrast, when $Z_E$ and $Z_S$ are positively correlated, that is, when treatments with more adverse events tend to show better efficacy, FWER remains below $\alpha$. When the absolute values of $c_L$ and $c_U$ are large (i.e., $-c_L = c_U \ge 2.5$), the probability that no reallocation occurs is high, and as a result, FWER tends to be controlled at $\alpha$. In contrast, when the absolute values of $c_L$ and $c_U$ are small (i.e., $-c_L = c_U \le 2.0$), the probability of reallocation increases, and as a result, the maximum FWER can reach $2\alpha = 0.10$.
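One way to carry out such a numerical integration is sketched below. It is not the authors' R code from the Supplementary Materials. It assumes the standard conditional-distribution fact that, given a value $s$ of the safety statistic, the efficacy statistic is normal with mean $rs$ and variance $1 - r^2$ when $\theta = 0$, and it integrates the conditional rejection probability over $s$ with a composite Simpson rule. The names `simpson`, `fwer_at_null`, and the truncation limit `lim` are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def fwer_at_null(r, c_l, c_u, alpha=0.05, lim=8.0):
    """FWER of the two-stage procedure at theta = 0, assuming |r| < 1."""
    z_a = STD.inv_cdf(1.0 - alpha)
    z_a2 = STD.inv_cdf(1.0 - alpha / 2.0)
    sd = (1.0 - r * r) ** 0.5  # conditional standard deviation of Z_E

    def lower(s, z):  # P(Z_E < -z | Z_S = s)
        return NormalDist(r * s, sd).cdf(-z)

    def upper(s, z):  # P(Z_E > z | Z_S = s)
        return 1.0 - NormalDist(r * s, sd).cdf(z)

    # Z_S > c_U: only the lower one-sided test, at level alpha.
    high = simpson(lambda s: STD.pdf(s) * lower(s, z_a), c_u, lim)
    # c_L <= Z_S <= c_U: two-sided test at level alpha.
    mid = simpson(
        lambda s: STD.pdf(s) * (lower(s, z_a2) + upper(s, z_a2)), c_l, c_u
    )
    # Z_S < c_L: only the upper one-sided test, at level alpha.
    low = simpson(lambda s: STD.pdf(s) * upper(s, z_a), -lim, c_l)
    return high + mid + low


for r in (-0.8, 0.0, 0.8):
    print(r, round(fwer_at_null(r, -1.0, 1.0), 3))
```

The three regions are integrated separately because the rejection rule changes at the cutoffs, so the integrand is smooth within each region. The printed values can be compared with the corresponding row of Table 1.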
The observation regarding the maximum value of FWER can be formally proven. When $\theta = 0$ and $r \ne 0$, the maximum value of $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)$ is equal to $2\alpha$ since

$$
\begin{aligned}
\mathrm{FWER}(\theta, r, c_L, c_U, \alpha)
&= P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ z_{\alpha/2} < |Z_E|) + P(Z_S < c_L,\ z_\alpha < Z_E) \\
&= P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ Z_E < -z_{\alpha/2}) \\
&\quad + P(c_L \le Z_S \le c_U,\ z_{\alpha/2} < Z_E) + P(Z_S < c_L,\ z_\alpha < Z_E) \\
&\le P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ Z_E < -z_\alpha) \\
&\quad + P(c_L \le Z_S \le c_U,\ z_\alpha < Z_E) + P(Z_S < c_L,\ z_\alpha < Z_E) \\
&= P(c_L \le Z_S,\ Z_E < -z_\alpha) + P(Z_S \le c_U,\ z_\alpha < Z_E) \\
&\le P(Z_E < -z_\alpha) + P(z_\alpha < Z_E) = 2\alpha.
\end{aligned}
$$

Equality holds if and only if the correlation coefficient $r = -1$ and the cutoff values satisfy $-z_\alpha \le c_L \le c_U \le z_\alpha$. We now provide a proof of this statement. Note that $\mathrm{FWER}(\theta, r, c_L, c_U, \alpha) = 2\alpha$ holds if and only if the following three equalities are satisfied:

$$P(c_L \le Z_S \le c_U,\ |Z_E| > z_{\alpha/2}) = P(c_L \le Z_S \le c_U,\ |Z_E| > z_\alpha), \tag{1}$$

$$P(c_L \le Z_S,\ Z_E < -z_\alpha) = P(Z_E < -z_\alpha), \tag{2}$$

$$P(Z_S \le c_U,\ z_\alpha < Z_E) = P(z_\alpha < Z_E). \tag{3}$$

If $|r| < 1$, then equalities (2) and (3) cannot hold. Therefore, we must have $|r| = 1$. Suppose $r = 1$. Then $Z_E = Z_S$, and from (2) we have

$$P(c_L \le Z_E,\ Z_E < -z_\alpha) = P(Z_E < -z_\alpha),$$

which leads to a contradiction. Therefore, $r$ must be $-1$. In this case, $Z_S = -Z_E$, and from (2) we obtain

$$P(Z_E \le -c_L,\ Z_E < -z_\alpha) = P(Z_E < -z_\alpha),$$

which implies $-z_\alpha \le -c_L$, that is, $c_L \le z_\alpha$. Similarly, from (3) we obtain

$$P(-c_U \le Z_E,\ z_\alpha < Z_E) = P(z_\alpha < Z_E),$$

which implies $-c_U \le z_\alpha$, that is, $-z_\alpha \le c_U$. From (1), we have

$$P(-c_U \le Z_E \le -c_L,\ |Z_E| > z_{\alpha/2}) = P(-c_U \le Z_E \le -c_L,\ |Z_E| > z_\alpha),$$

which implies $-z_\alpha \le -c_U$ and $-c_L \le z_\alpha$, that is, $-z_\alpha \le c_L$ and $c_U \le z_\alpha$. Then, equalities (1)–(3) hold if and only if $r = -1$ and $-z_\alpha \le c_L \le c_U \le z_\alpha$. This completes the proof. The following theorem summarizes the discussion above.
Theorem 1.
FWER of the two-stage procedure satisfies the following properties: When $\theta \ne 0$, or $\theta = 0$ and $r = 0$, the FWER of the two-stage procedure can be controlled at $\alpha$. When $\theta = 0$ and $r \ne 0$, FWER may exceed $\alpha$, and it can reach the maximum value of $2\alpha$ when $r = -1$ and $-z_\alpha \le c_L \le c_U \le z_\alpha$.
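The extreme case in Theorem 1 can be checked directly. When the correlation equals $-1$, the safety statistic is exactly the negative of the efficacy statistic, so every event in the FWER expression reduces to an interval for the efficacy statistic alone. The sketch below, under that assumption and at $\theta = 0$, evaluates the resulting closed form; the function name `fwer_perfect_negative_corr` is hypothetical.

```python
from statistics import NormalDist


def fwer_perfect_negative_corr(c_l, c_u, alpha=0.05):
    """FWER at theta = 0 when r = -1, that is, when Z_S = -Z_E exactly."""
    std = NormalDist()
    z_a = std.inv_cdf(1.0 - alpha)
    z_a2 = std.inv_cdf(1.0 - alpha / 2.0)
    # Z_S > c_U means Z_E < -c_U; rejection also needs Z_E < -z_alpha.
    high = std.cdf(min(-c_u, -z_a))
    # Z_S < c_L means Z_E > -c_L; rejection also needs Z_E > z_alpha.
    low = 1.0 - std.cdf(max(-c_l, z_a))
    # c_L <= Z_S <= c_U means -c_U <= Z_E <= -c_L; rejection needs
    # |Z_E| > z_{alpha/2}, split into the negative and positive tails.
    mid_neg = max(0.0, std.cdf(min(-c_l, -z_a2)) - std.cdf(-c_u))
    mid_pos = max(0.0, std.cdf(-c_l) - std.cdf(max(-c_u, z_a2)))
    return high + low + mid_neg + mid_pos


for c in (1.0, 1.8, 2.5):
    print(c, round(fwer_perfect_negative_corr(-c, c), 4))
```

For symmetric cutoffs inside $[-z_\alpha, z_\alpha]$ the value is $2\alpha$. It drops below $2\alpha$ once the cutoffs move outside that interval and returns to $\alpha$ when they exceed $z_{\alpha/2}$ in absolute value.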

4. Derivation of Power for the Two-Stage Procedure

The power of the two-stage procedure to detect a true alternative hypothesis is a function of the effect measure $\theta$, the correlation coefficient $r$, the cutoff values $c_L$ and $c_U$, and the significance level $\alpha$. It is denoted by $\mathrm{POWER}(\theta, r, c_L, c_U, \alpha)$.
When $\theta > 0$, $\mathrm{POWER}(\theta, r, c_L, c_U, \alpha)$ is given by

$$P(Z_S < c_L,\ z_\alpha < Z_E) + P(c_L \le Z_S \le c_U,\ z_{\alpha/2} < Z_E).$$

When $\theta < 0$, $\mathrm{POWER}(\theta, r, c_L, c_U, \alpha)$ is given by

$$P(c_U < Z_S,\ Z_E < -z_\alpha) + P(c_L \le Z_S \le c_U,\ Z_E < -z_{\alpha/2}).$$
When $\theta = 0$, $\mathrm{POWER}(\theta, r, c_L, c_U, \alpha) = 0$, as all alternative hypotheses are false. The power of the two-stage procedure can also be calculated via numerical integration. Table 2 presents the power of the two-stage procedure for $\alpha = 0.05$ under various values of $r$, $c_L$, and $c_U$. The two values of the power function for $\theta = \pm 3.0$ are identical, and this equality holds for other values of $\theta$ as well. This pattern arises from the symmetry of the bivariate normal distribution.
When $\theta = 3.0$ and $-c_L = c_U = 2.5$ or $1.5$, the probability of conducting the test for $H_0^{T_1}: \theta \le 0$ is high, resulting in high power. The same applies when $\theta = -3.0$, in which case the test for $H_0^{T_2}: \theta \ge 0$ is likely to be conducted. In contrast, when $\theta = 3.0$ and $|c_L| = |c_U| = 0.1$, the probability of conducting the test for $H_0^{T_1}: \theta \le 0$ is low, resulting in low power. The same applies when $\theta = -3.0$, in which case the test for $H_0^{T_2}: \theta \ge 0$ is unlikely to be conducted. As the correlation coefficient $r$ increases, the power decreases since the probability of conducting the test for $H_0^{T_1}: \theta \le 0$ when $\theta > 0$, and for $H_0^{T_2}: \theta \ge 0$ when $\theta < 0$, decreases.
The power of the two-sided test for $H_0^{T_1,T_2}: \theta = 0$ is equal to 0.851, 0.323, 0.072, 0.072, 0.323, and 0.851 for $\theta = -3.0$, $-1.5$, $-0.5$, $0.5$, $1.5$, and $3.0$, respectively. When $|\theta| = 3.0$, the power of the two-stage procedure is, in some cases, lower than that of the two-sided test. This may be problematic. In contrast, when $|\theta| = 1.5$, the power of the two-stage procedure, in some cases, exceeds that of the two-sided test. However, this improvement in power is a direct consequence of the failure to control the FWER under $\theta = 0$; in other words, the gain in power reflects a loss of error rate control.
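The power expressions above can be evaluated numerically in the same way as FWER. The sketch below is one possible implementation and not the authors' R code. It assumes that, given a value $s$ of the safety statistic, the efficacy statistic is normal with mean $\theta + rs$ and variance $1 - r^2$, and it uses a composite Simpson rule. The names `simpson`, `power`, and `lim` are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def power(theta, r, c_l, c_u, alpha=0.05, lim=8.0):
    """Probability of rejecting the false one-sided null, assuming |r| < 1."""
    if theta == 0:
        return 0.0  # no alternative hypothesis is true
    z_a = STD.inv_cdf(1.0 - alpha)
    z_a2 = STD.inv_cdf(1.0 - alpha / 2.0)
    sd = (1.0 - r * r) ** 0.5  # conditional standard deviation of Z_E
    if theta > 0:
        # Correct rejection means Z_E above the critical value.
        def tail(s, z):
            return 1.0 - NormalDist(theta + r * s, sd).cdf(z)
        # Full alpha is available when Z_S < c_L.
        full = simpson(lambda s: STD.pdf(s) * tail(s, z_a), -lim, c_l)
    else:
        # Correct rejection means Z_E below minus the critical value.
        def tail(s, z):
            return NormalDist(theta + r * s, sd).cdf(-z)
        # Full alpha is available when Z_S > c_U.
        full = simpson(lambda s: STD.pdf(s) * tail(s, z_a), c_u, lim)
    # Only alpha/2 is available when c_L <= Z_S <= c_U.
    half = simpson(lambda s: STD.pdf(s) * tail(s, z_a2), c_l, c_u)
    return full + half


for r in (-0.8, 0.8):
    print(r, round(power(1.5, r, -0.1, 0.1), 3))
```

With cutoffs close to zero, reallocation happens almost always, so the power depends strongly on whether the safety statistic tends to select the one-sided test in the direction of the true effect.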

5. A Statistical Model Satisfying the Assumptions in Section 2

We present a statistical model for individuals that is appropriate for the setting described in Section 2. For $i, j = 1, \dots, n$, let $L_i^{T_1}$ and $L_j^{T_2}$ denote latent variables for individuals receiving treatments $T_1$ and $T_2$, respectively. These variables are assumed to be independently and identically distributed as standard normal variables.
For a patient $i$ receiving $T_1$, the efficacy outcome $Y_{i,E}^{T_1}$ is given by $L_i^{T_1} + d$ for some positive value $d$, and the safety outcome $Y_{i,S}^{T_1}$ takes the value 0 (no adverse event) or 1 (adverse event) with probability

$$P(Y_{i,S}^{T_1} = 1 \mid L_i^{T_1}) = \frac{\exp(L_i^{T_1} - c)}{1 + \exp(L_i^{T_1} - c)},$$

where we assume a logistic relationship between the latent variable $L_i^{T_1}$ and the response variable $Y_{i,S}^{T_1}$. Here, $c$ is a constant term. Similarly, for patient $j$ receiving $T_2$, the efficacy outcome $Y_{j,E}^{T_2}$ is given by $L_j^{T_2}$, and the safety outcome $Y_{j,S}^{T_2}$ takes the value 0 (no adverse event) or 1 (adverse event) with probability

$$P(Y_{j,S}^{T_2} = 1 \mid L_j^{T_2}) = \frac{\exp(L_j^{T_2} - c)}{1 + \exp(L_j^{T_2} - c)}.$$

When $c = 1.0$, $1.5$, and $2.0$, the probability that $Y_{i,S}^{T_1}$ or $Y_{j,S}^{T_2}$ equals 1 is 0.303, 0.221, and 0.115, respectively. In this study, we set $c = 2$.
In this setting, a test statistic for efficacy and a summary statistic for safety are given as

$$Z_E = \sqrt{\frac{n}{2}} \left( \frac{1}{n} \sum_{i=1}^{n} Y_{i,E}^{T_1} - \frac{1}{n} \sum_{j=1}^{n} Y_{j,E}^{T_2} \right),$$

$$Z_S = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_1 (1 - \hat{p}_1)/n + \hat{p}_2 (1 - \hat{p}_2)/n}}, \qquad \hat{p}_1 = \frac{1}{n} \sum_{i=1}^{n} Y_{i,S}^{T_1}, \quad \hat{p}_2 = \frac{1}{n} \sum_{j=1}^{n} Y_{j,S}^{T_2}.$$

By the central limit theorem, the pair $(Z_E, Z_S)$ is asymptotically normally distributed and approximately satisfies the assumptions in Section 2. The mean of $Z_E$, denoted by $\theta$, is equal to $d\sqrt{n/2}$, where $n$ is considered as a constant. The correlation coefficient between $Z_E$ and $Z_S$ was estimated via a Monte Carlo simulation. We generated data with $n = 200$ and conducted $10^5$ repetitions under $\theta = 0$, that is, $d = 0$. This resulted in $10^5$ independent pairs of $(Z_E, Z_S)$, which are shown in the scatter plot in Figure 1. The correlation coefficient of $(Z_E, Z_S)$ was estimated to be 0.319, with a 95% confidence interval of $(0.314, 0.325)$.
Since the number of individuals in each treatment group is $n = 200$, $Z_S$ can take at most $200 \times 200 = 40{,}000$ distinct values. As a result, certain values near zero are not represented in $Z_S$, leading to visible horizontal white lines in the scatter plot. As $n$ increases, these lines gradually disappear.
If the probability that $Y_{i,S}^{T_k}$ equals 1 is instead given by $\exp(-L_i^{T_k} - c) / \{1 + \exp(-L_i^{T_k} - c)\}$ for $k = 1, 2$, with $c = 2$, and all other settings remain as previously described, then the correlation coefficient between $Z_E$ and $Z_S$ was estimated to be $-0.319$, with a 95% confidence interval of $(-0.325, -0.314)$.
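The simulation described in this section can be sketched with the Python standard library as follows. This is not the authors' R code from the Supplementary Materials, and it uses far fewer repetitions than the $10^5$ reported above, so its estimate is noisier. The function names and the `sign` argument are hypothetical: `sign=1` is assumed to correspond to an adverse-event probability that increases with the latent variable, and `sign=-1` to one that decreases with it.

```python
import math
import random


def simulate_pair(n, d, c, sign, rng):
    """Draw one pair (Z_E, Z_S) under the latent-variable model."""
    sum_eff = [0.0, 0.0]  # sums of efficacy outcomes for T1 and T2
    events = [0, 0]       # adverse-event counts for T1 and T2
    for arm in (0, 1):
        shift = d if arm == 0 else 0.0  # only T1 carries the effect d
        for _ in range(n):
            latent = rng.gauss(0.0, 1.0)
            sum_eff[arm] += latent + shift
            # Logistic link between the latent variable and the event.
            p_event = 1.0 / (1.0 + math.exp(-(sign * latent - c)))
            events[arm] += rng.random() < p_event
    z_e = math.sqrt(n / 2.0) * (sum_eff[0] / n - sum_eff[1] / n)
    p1, p2 = events[0] / n, events[1] / n
    se = math.sqrt(p1 * (1.0 - p1) / n + p2 * (1.0 - p2) / n)
    z_s = (p1 - p2) / se if se > 0 else 0.0  # guard against zero variance
    return z_e, z_s


def estimate_correlation(reps=1000, n=200, d=0.0, c=2.0, sign=1, seed=1):
    """Sample correlation of (Z_E, Z_S) over `reps` simulated trials."""
    rng = random.Random(seed)
    pairs = [simulate_pair(n, d, c, sign, rng) for _ in range(reps)]
    mean_e = sum(e for e, _ in pairs) / reps
    mean_s = sum(s for _, s in pairs) / reps
    cov = sum((e - mean_e) * (s - mean_s) for e, s in pairs)
    var_e = sum((e - mean_e) ** 2 for e, _ in pairs)
    var_s = sum((s - mean_s) ** 2 for _, s in pairs)
    return cov / math.sqrt(var_e * var_s)


print(round(estimate_correlation(sign=1), 3))
```

Increasing `reps` narrows the sampling error of the estimate at the cost of run time, and changing `sign` to `-1` flips the direction of the association between efficacy and safety.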

6. Concluding Discussion

In this paper, we consider a two-stage procedure in clinical trial settings comparing two standard treatments, in which a two-sided test for efficacy at a significance level of α is planned at the design stage. The α / 2 allocated to one of the two one-sided tests that constitute the two-sided test can potentially be reallocated as follows: the α/2 originally assigned to the one-sided test for the treatment with a higher rate of an adverse event is reallocated to the other one-sided test. In Theorem 1, we show that FWER for this two-stage procedure can exceed the nominal significance level α when the treatment associated with a lower rate of adverse event tends to demonstrate greater efficacy. Therefore, this procedure should be avoided when strict control of FWER is a priority.
Consider a clinical example in which cancer patients treated with anticancer drugs tend to experience greater treatment efficacy if they develop adverse events than if they do not [8]. In such a scenario, applying the two-stage procedure in a trial comparing two such anticancer drugs may control FWER at the nominal level α ; however, as demonstrated in this paper, it can result in reduced statistical power in some cases.
The main assumptions in Section 2 are as follows:
(1) The test statistic for efficacy and the summary statistic for safety jointly follow a multivariate normal distribution.
(2) The summary statistic for safety relates to a single adverse event and is determined by the difference in the proportions of patients experiencing the adverse event under treatments $T_1$ and $T_2$.
(3) The cutoff values for the safety summary statistic are determined at the design stage and are used to guide the reallocation of the significance level.
Regarding Assumption (1), we consider it to be reasonable, as many commonly used test and summary statistics are asymptotically normally distributed under standard regularity conditions. Assumptions (2) and (3) were adopted to facilitate the theoretical derivation of the FWER. In practice, however, it may be difficult to pre-specify a single adverse event and fixed cutoff values at the design stage. More commonly, all safety data are reviewed after all data have been collected, but before efficacy data are analyzed, and the decision on whether to reallocate the significance level is taken following discussions among investigators. In such cases, the safety data may influence the choice of cutoff values, making the mathematical evaluation of FWER more complex. Therefore, for analytical tractability, we focused on a setting in which both the adverse event and the cutoff values are specified in advance. Given that our study demonstrated inflation of FWER even under this simplified setting—and that FWER control is not guaranteed under more flexible or data-driven reallocation procedures—we think that such procedures, including the one examined in this study, should be avoided when strict FWER control is required.
From a practical perspective, if a treatment is found to have a very high rate of adverse events, efficacy data may not be collected after patient dropout. In such cases, the sample sizes of both the intention-to-treat population and the full analysis set may become imbalanced between groups. In general, an imbalance in sample size can lead to reduced power in the two-stage procedure. However, this study focuses on fundamental theoretical aspects and does not address practical issues such as dropout-related sample size imbalance.
This study focused solely on a two-stage procedure. However, in three- or multi-stage procedures, inflation of FWER could also occur because correlations between the test statistic and other summary statistics, which drive the inflation, may also arise in these more complex settings.
The two-stage procedure examined in this study involves sequential decision-making and, therefore, bears some resemblance to group sequential designs and the associated bias correction for point estimation (e.g., [9,10]). However, a key difference is that group sequential designs involve repeated testing of a single null hypothesis, whereas our procedure conducts only a single hypothesis test in the second stage, guided by a decision taken in the first stage. The primary focus of this study is on controlling FWER in the second stage.
The implications of our findings extend beyond clinical trial settings. The issues addressed in this study commonly arise when significance levels are reallocated based on external or auxiliary information. Since intuitive reasoning about probabilities can be misleading, careful consideration is essential when employing complex testing procedures.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math13162547/s1, the R code for simulations.

Author Contributions

Conceptualization, A.N. and K.M.; methodology, A.N. and K.M.; formal analysis, A.N.; writing—original draft preparation, A.N.; writing—review and editing, A.N. and K.M.; visualization, A.N.; supervision, K.M.; project administration, A.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by JSPS KAKENHI, Grant Number JP23H03353.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article. The R code used for the simulation is provided in the Supplementary Materials.

Acknowledgments

We used ChatGPT (GPT-4o, OpenAI) to identify grammatical errors and improve our English expressions throughout the development of this work.

Conflicts of Interest

No authors have a conflict of interest related to the contents of this manuscript.

References

  1. Green, S.; Benedetti, J.; Smith, A.; Crowley, J. Clinical Trials in Oncology, 3rd ed.; Taylor & Francis: Oxfordshire, UK, 2012.
  2. Yu, M.; Man, R.; Zhu, H.; Wang, L. Enhancing the flexibility and power of adaptive seamless phase 2/3 design with copula modeling between short-term and long-term endpoints. Commun. Stat. Simul. Comput. 2024, 1–21.
  3. Senn, S. Viewpoint: Do not resurrect the two-stage procedure. In Pharmaceutical Statistics; John Wiley and Sons Ltd.: Hoboken, NJ, USA, 2022; pp. 808–814.
  4. Kahan, B.C. Bias in randomised factorial trials. Stat. Med. 2013, 32, 4540–4549.
  5. Campbell, H.; Dean, C.B. The consequences of proportional hazards based model selection. Stat. Med. 2014, 33, 1042–1056.
  6. Hung, H.M.J.; Wang, S.-J.; O’Neill, R. Statistical considerations for testing multiple endpoints in group sequential or adaptive clinical trials. J. Biopharm. Stat. 2007, 17, 1201–1210.
  7. Dmitrienko, A.; Tamhane, A.C.; Bretz, F. (Eds.) Multiple Testing Problems in Pharmaceutical Statistics; Chapman and Hall/CRC: Boca Raton, FL, USA, 2009.
  8. Haratani, K.; Hayashi, H.; Chiba, Y.; Kudo, K.; Yonesaka, K.; Kato, R.; Kaneda, H.; Hasegawa, Y.; Tanaka, K.; Takeda, M.; et al. Association of immune-related adverse events with nivolumab efficacy in non-small cell lung cancer. JAMA Oncol. 2018, 4, 374–378.
  9. Grayling, M.J.; Wason, J.M.S. Point estimation following a two-stage group sequential trial. Stat. Methods Med. Res. 2023, 32, 287–304.
  10. Grayling, M.J.; Wason, J.M.S.; Mander, A.P. Group sequential crossover trial designs with strong control of the familywise error rate. Seq. Anal. 2018, 37, 174–203.
Figure 1. Scatter plot of $10^5$ independent pairs of $(Z_E, Z_S)$.
Table 1. FWERs of the two-stage procedure at $\theta = 0$ and $\alpha = 0.05$.

| $c_L$ | $c_U$ | r = −0.99 | r = −0.80 | r = −0.60 | r = −0.40 | r = −0.20 | r = 0.20 | r = 0.40 | r = 0.60 | r = 0.80 | r = 0.99 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| −0.1 | 0.1 | 0.100 | 0.099 | 0.093 | 0.081 | 0.066 | 0.034 | 0.019 | 0.006 | 0.000 | 0.000 |
| −1.0 | 1.0 | 0.100 | 0.088 | 0.077 | 0.068 | 0.059 | 0.039 | 0.028 | 0.016 | 0.005 | 0.000 |
| −1.5 | 1.5 | 0.097 | 0.073 | 0.065 | 0.060 | 0.055 | 0.044 | 0.037 | 0.027 | 0.014 | 0.000 |
| −2.0 | 2.0 | 0.054 | 0.059 | 0.056 | 0.054 | 0.052 | 0.047 | 0.044 | 0.038 | 0.029 | 0.009 |
| −2.5 | 2.5 | 0.050 | 0.052 | 0.052 | 0.051 | 0.051 | 0.049 | 0.048 | 0.045 | 0.042 | 0.038 |

FWER denotes the familywise error rate. $c_L$ and $c_U$ represent the lower and upper cutoff values for $Z_S$, respectively. $r$ denotes the correlation coefficient between $Z_E$ and $Z_S$.
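Table 2. Power of the two-stage procedure at $\alpha = 0.05$.

| $\theta$ | $c_L$ | $c_U$ | r = −0.99 | r = −0.80 | r = −0.40 | r = −0.20 | r = 0.20 | r = 0.40 | r = 0.80 | r = 0.99 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3.0 | −0.1 | 0.1 | 0.540 | 0.536 | 0.513 | 0.500 | 0.475 | 0.464 | 0.451 | 0.452 |
| 3.0 | −1.5 | 1.5 | 0.851 | 0.836 | 0.811 | 0.803 | 0.795 | 0.794 | 0.795 | 0.786 |
| 3.0 | −2.5 | 2.5 | 0.851 | 0.851 | 0.848 | 0.847 | 0.846 | 0.846 | 0.845 | 0.845 |
| 1.5 | −0.1 | 0.1 | 0.428 | 0.367 | 0.293 | 0.261 | 0.198 | 0.164 | 0.078 | 0.001 |
| 1.5 | −1.5 | 1.5 | 0.323 | 0.325 | 0.325 | 0.318 | 0.298 | 0.285 | 0.259 | 0.256 |
| 1.5 | −2.5 | 2.5 | 0.323 | 0.323 | 0.323 | 0.323 | 0.320 | 0.318 | 0.317 | 0.317 |
| 0.5 | −0.1 | 0.1 | 0.126 | 0.122 | 0.096 | 0.080 | 0.047 | 0.031 | 0.003 | 0.000 |
| 0.5 | −1.5 | 1.5 | 0.077 | 0.084 | 0.079 | 0.076 | 0.065 | 0.058 | 0.036 | 0.010 |
| 0.5 | −2.5 | 2.5 | 0.072 | 0.073 | 0.073 | 0.073 | 0.071 | 0.070 | 0.067 | 0.066 |
| −0.5 | −0.1 | 0.1 | 0.126 | 0.122 | 0.096 | 0.080 | 0.047 | 0.031 | 0.003 | 0.000 |
| −0.5 | −1.5 | 1.5 | 0.077 | 0.084 | 0.079 | 0.076 | 0.065 | 0.058 | 0.036 | 0.010 |
| −0.5 | −2.5 | 2.5 | 0.072 | 0.073 | 0.073 | 0.073 | 0.071 | 0.070 | 0.067 | 0.066 |
| −1.5 | −0.1 | 0.1 | 0.428 | 0.367 | 0.293 | 0.261 | 0.198 | 0.164 | 0.078 | 0.001 |
| −1.5 | −1.5 | 1.5 | 0.323 | 0.325 | 0.325 | 0.318 | 0.298 | 0.285 | 0.259 | 0.256 |
| −1.5 | −2.5 | 2.5 | 0.323 | 0.323 | 0.323 | 0.323 | 0.320 | 0.318 | 0.317 | 0.317 |
| −3.0 | −0.1 | 0.1 | 0.540 | 0.536 | 0.513 | 0.500 | 0.475 | 0.464 | 0.451 | 0.452 |
| −3.0 | −1.5 | 1.5 | 0.851 | 0.836 | 0.811 | 0.803 | 0.795 | 0.794 | 0.795 | 0.786 |
| −3.0 | −2.5 | 2.5 | 0.851 | 0.851 | 0.848 | 0.847 | 0.846 | 0.846 | 0.845 | 0.845 |

Power denotes the probability of detecting a true alternative hypothesis. $\theta$ represents the effect measure. $c_L$ and $c_U$ are the cutoff values for $Z_S$. $r$ denotes the correlation coefficient between $Z_E$ and $Z_S$.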

Share and Cite

MDPI and ACS Style

Notsu, A.; Mori, K. Inflation of Familywise Error Rate in Treatment Efficacy Testing Due to the Reallocation of Significance Levels Based on Safety Data. Mathematics 2025, 13, 2547. https://doi.org/10.3390/math13162547
