Estimating Conditional Power for Sequential Monitoring of Covariate Adaptive Randomized Designs: The Fractional Brownian Motion Approach

Abstract: Conditional power based on classical Brownian motion (BM) has been widely used in sequential monitoring of clinical trials, including those with the covariate adaptive randomization (CAR) design. Due to some uncontrollable factors, the sequential test statistics under CAR procedures may not satisfy the independent increment property of BM. We confirm that BM is invalid when the error terms in the linear model with CAR design are not independent and identically distributed. To incorporate the possible correlation structure of the increments of the test statistic, we utilize fractional Brownian motion (FBM). We conducted a comparative study of the conditional power under BM and FBM. The conditional power under the FBM assumption was mostly higher than that under the BM assumption when the Hurst exponent was greater than 0.5.


Introduction
Clinical trials, aiming to evaluate the safety and efficacy of drugs and medical devices in target populations, play an important role in the development of public health and medicine [1]. Adaptive randomized clinical design improves the trial, based on the accumulated data and changing environment, making clinical trials more efficient, flexible, and ethically reasonable [2].
The treatment effects estimated from unadjusted models may not be appropriate when the covariates are not balanced. Meanwhile, some covariates, such as elevated values of biomarkers found to be associated with disease status in translational research, may be critical in determining the treatment effects in clinical trials [3]. For example, the biomarker HR23B is closely related to, and used to indicate, the effectiveness of histone deacetylase inhibitor-based therapy for tumors [4]. A useful tool for addressing covariate imbalance is the Covariate Adaptive Randomization (CAR) procedure, in which participants are assigned to treatment groups based on previous participants' assignments, previous participants' covariates, and the current participant's covariates, such that the imbalances across subgroups are minimized [5]. Although complete randomization is good at eliminating selection bias, the CAR design is a more reasonable compromise between reducing selection bias and balancing covariates across assignments [6]. The rigorous theory of covariate adaptive randomized clinical trials has been developed more recently [7]. Thereafter, progress has been made in research on statistical inference with CAR designs [8,9].
Traditionally, classical Brownian motion (BM) has been the fundamental theory for monitoring outcome effects in clinical trials, including those with CAR designs [10-14]. It has been proved that the sequential test statistics of covariate adaptive clinical trials follow Brownian motion asymptotically under some regularity conditions [15].
A condition for hypothesis testing in covariate adaptive randomized clinical trials is that the underlying error terms are independent and identically distributed (iid). In addition, independent increments are a property of classical Brownian motion [16]. However, the independent increment property of the test statistics may not be completely met in some situations. For example, some patients may enter the trial during the same season; some patients may be treated by the same hospital or the same physician. Therefore, the error terms from the model may be correlated and follow special covariance patterns. Because most previous theoretical research into sequential monitoring of CAR designs was based on the Brownian motion assumption, it is necessary to explore the stochastic properties of the sequential monitoring process when the error terms are not independent and identically distributed. We propose fractional Brownian motion (FBM) as a more valid tool to investigate the outcomes of clinical trials with correlated error structures.
FBM, denoted $B_H(t)$, is a Gaussian process with $E(B_H(t)) = 0$ and $\mathrm{Cov}(B_H(s), B_H(t)) = \frac{1}{2}\left(|s|^{2H} + |t|^{2H} - |t - s|^{2H}\right)$, where the Hurst coefficient $H$, in the range $0 < H < 1$, is a parameter of the FBM describing the long-term dependence of the process [17][18][19]. FBM is a Markov process when $H = 1/2$ [20,21]. The maximum likelihood estimation (MLE) method was proposed for estimating the Hurst coefficient underlying FBM in clinical trials [22]. The log-likelihood function of the observed values is given in [22].
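The covariance function above can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names `fbm_cov` and `increment_cov` are hypothetical and not part of the original study. It illustrates that the covariance reduces to $\min(s, t)$ when $H = 1/2$, and that disjoint increments are uncorrelated only in that case.

```python
def fbm_cov(s, t, hurst):
    """Cov(B_H(s), B_H(t)) for fractional Brownian motion."""
    two_h = 2.0 * hurst
    return 0.5 * (abs(s) ** two_h + abs(t) ** two_h - abs(t - s) ** two_h)


def increment_cov(s, t, u, v, hurst):
    """Covariance between the increments B_H(t) - B_H(s) and B_H(v) - B_H(u)."""
    return (fbm_cov(t, v, hurst) - fbm_cov(t, u, hurst)
            - fbm_cov(s, v, hurst) + fbm_cov(s, u, hurst))


if __name__ == "__main__":
    # H = 0.5 recovers Brownian motion: Cov(B(s), B(t)) = min(s, t).
    print(fbm_cov(0.3, 0.7, 0.5))
    # Adjacent increments over (0, 0.5] and (0.5, 1] for several H values.
    for h in (0.3, 0.5, 0.9):
        print(h, increment_cov(0.0, 0.5, 0.5, 1.0, h))
```

For $H > 1/2$ the adjacent increments are positively correlated, and for $H < 1/2$ they are negatively correlated, which is the dependence that the BM assumption rules out.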
In this paper, comprehensive simulations of the sequential monitoring of CAR procedures were conducted to investigate the breakdown of BM when the independent increment assumption was not met. We further calculated the conditional power (CP) under the null hypothesis, with the BM assumption and with the FBM assumption, respectively. Section 2 of this paper includes the test statistics and theoretical properties under covariate adaptive randomized clinical trials with correlated error structures. In Section 3, results from numerical simulations are provided to estimate the Hurst exponents for sequential monitoring of CAR procedures when error structures are not iid. Conditional powers are calculated and compared between BM and FBM assumptions in Section 4. Conclusions and discussions are found in Section 5.

Covariate-Adaptive Randomized Clinical Trials with Correlated Error Structures
Assume a two-arm randomized sequential clinical trial with a covariate adaptive design, described by the linear model

$$Y_i = \mu_1 I_i + \mu_2 (1 - I_i) + \beta_1 X_{i,1} + \cdots + \beta_p X_{i,p} + \varepsilon_i, \quad i = 1, 2, \ldots, N. \tag{1}$$

Here $\mu_1$ and $\mu_2$ are the parameters for group 1 and group 2, measuring the main effect of the treatment in each group. $I_i$ is the indicator variable for the treatment assignment of the $i$th patient: when $I_i = 1$, the patient is assigned to treatment group 1; when $I_i = 0$, the patient is assigned to treatment group 2. $Y_i$ is the response outcome, $X_{i,1}, \ldots, X_{i,p}$ are covariates, and $\beta_1, \ldots, \beta_p$ are the unknown parameters for the covariates, where $p$ is the number of covariates in addition to $\mu_1$ and $\mu_2$. All covariates are independent of each other and have expectation zero, $E(X_k) = 0$ for $k = 1, 2, \ldots, p$. The $\varepsilon_i$ denote the error terms, which are correlated, and $N$ is the total sample size. The covariates $X_{i,k}$ and the indicators $I_i$ are independent of the error terms $\varepsilon_i$.
Expression (1) can be written in matrix form. The hypotheses for testing the treatment effect are $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$. When the error terms are independent and normally distributed, the statistic for testing the main outcome effect at time point $t$, referred to as statistic (2), is the one given in [7], together with its distribution under the null hypothesis of equal treatment effects; $\lfloor \cdot \rfloor$ denotes the floor function. When using statistic (2), a sequential monitoring process with multiple interim analyses is considered. For example, the investigator may evaluate the test statistic after every 100 patients finish the study in a clinical trial of 4000 patients: $Z_1$ is the test statistic for patients 1-100; $Z_2$ is for patients 1-200; ...; $Z_{40}$ is for patients 1-4000. Each test statistic is computed from a different number of recruited patients at its time point.
The test statistic $Z$ at each time point in (2) follows a normal distribution under the null hypothesis [7]. The interim normalized $Z$ value can be transformed into the B-value $B_H(t) = Z_t t^{1/2}$. The sequential test statistic $B_H$ in model (3) converges asymptotically to Brownian motion [12]. Both conclusions rest on the assumption that the error terms in the model are iid. In some cases, the independent increment property of the test statistic may be invalid because the error terms $\varepsilon$ of linear model (1) are correlated. Nonetheless, practitioners may ignore the covariance in the error terms and still use the classical hypothesis test, which assumes independent and identically distributed errors. This scenario is called misspecification in our study. Quantities marked with an asterisk (*) in Equations (2)-(5) denote the mis-specified test equations and parameters, to distinguish them from the classical hypothesis test formulas that apply when the $\varepsilon$ are iid.
If the sequential test statistics $B^*_H(t)$ converged to a standard Brownian motion, then $\mathrm{Cov}(B^*_H(t_i), B^*_H(t_j))$ would converge to $\min(t_i, t_j)$. However, based on the theoretical derivation in Yang [15], the covariance of $B^*_H(t_i)$ and $B^*_H(t_j)$ does not converge to the minimum of $t_i$ and $t_j$ under the null hypothesis in misspecification scenarios. This demonstrates that $B^*_H(t)$ in the sequential test statistics (3) is not asymptotically a standard Brownian motion under the misspecification scenarios.

Simulations for Misspecification Scenarios
Since the sequential test statistics do not converge asymptotically to Brownian motion when the error terms are correlated, we propose the larger class of fractional Brownian motion for the stochastic structure of the test statistic. The maximum likelihood method was used to estimate the Hurst exponents of the FBM for the sequential monitoring processes under the misspecification assumption. If the mean estimate of $H$ deviates significantly from 0.5, the sequential monitoring process is confirmed as not converging to BM. Error terms in model (1) were assumed to follow specific correlated patterns. Increments of fractional Brownian motion, defined as $B_H(t+1) - B_H(t)$, were used as the error terms $\varepsilon$ in our simulations [24]. The fbm() function in the R software is one way to create a one-dimensional FBM series $B_H(t)$ [25]. The covariance of the increments of fractional Brownian motion at lag $k$ is $\gamma(k) = \frac{1}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$. The incorrect estimators (4) and (5) and the incorrect classical hypothesis test statistic (2) were used to build the sequential monitoring processes without considering the covariance terms.
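To make the construction of correlated error terms concrete, the sketch below generates an FBM path and its increments in Python. It is an illustration only: the study used the fbm() function in R, whereas this sketch swaps in a plain Cholesky factorization of the FBM covariance matrix, and all function names here are hypothetical.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Cov(B_H(s), B_H(t)) for fractional Brownian motion."""
    two_h = 2.0 * hurst
    return 0.5 * (abs(s) ** two_h + abs(t) ** two_h - abs(t - s) ** two_h)


def cholesky(a):
    """Lower-triangular factor L of a symmetric positive definite matrix (a = L L^T)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def simulate_fbm(times, hurst, rng):
    """Draw B_H at the given strictly positive times as L z, with z standard normal."""
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(times))]


def increments(path):
    """Successive differences of a path that starts from B_H(0) = 0."""
    return [path[0]] + [b - a for a, b in zip(path, path[1:])]


if __name__ == "__main__":
    rng = random.Random(2024)
    grid = [k / 40 for k in range(1, 41)]   # 40 equally spaced time points
    errors = increments(simulate_fbm(grid, 0.9, rng))
    print(len(errors), errors[:3])
```

The Cholesky approach costs on the order of $n^3$ operations, which is acceptable for a few dozen time points but would be replaced by a faster generator for long series.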
In Equation (1), $p_1$ and $p_2$ are the success probabilities of the Bernoulli distributions when the covariates $X_1$, $X_2$ are binary variables. $\mu_1$, $\mu_2$, $\beta_1$, $\beta_2$, $p_1$, $p_2$ were set to 0.5, 0.5, 1, 1, 0.5, 0.5, respectively. 1000 replications were used for all simulations. Patients were assumed to be sequentially randomized into two treatment groups by block randomization (BR) (using the blockrand() function in R), stratified permuted block randomization (SPB) [26], and the Pocock and Simon minimization design (PS) [27], in turn. Situations with no covariates, two discrete covariates, and two continuous covariates were examined under misspecification scenarios in the simulation studies.
Assume 4000 patients were recruited in a clinical trial with uniformly distributed entry times. An interim analysis was performed after every 100 new patients had finished the study, so that 40 interim results were obtained in total. Throughout the paper, the maximum likelihood method was used to estimate the Hurst exponent ($H$) from the normalized $B_H$ values of the sequential test (3). $H = 0.5$ indicates an uncorrelated process, corresponding to classical Brownian motion. $B_H$ has a long-range dependence property when $0.5 < H < 1$ [28,29].
The mean and standard deviation of the Hurst exponent estimates are tabulated in Table 1, in which another Hurst estimation method, proposed by Peltier and Lévy Véhel, was used to validate the MLE results [30]. The two Hurst estimation methods reached similar results. The distribution of the estimates of $H$ is close to a normal distribution; histograms are shown in Figures 1-3. The mean estimated $H$ values from the misspecification scenarios in Table 1 and Figures 1-3 deviated from 0.5. All tests of statistical significance supported this conclusion, with p-values less than 0.0001 from Student's t-test in the R software. According to the theoretical derivation and the simulation results, sequential test statistics do not follow a Brownian motion in the sequential monitoring of covariate adaptive randomized clinical trials when the error terms are correlated. Models with different covariate types reached similar conclusions.
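As an illustration of one of the allocation schemes named above, the following Python sketch implements a simple form of stratified permuted block randomization. It is an assumption-laden sketch rather than the procedure used in the study: the block size of 4, the definition of a stratum as the pair of binary covariate values, and the function name are all hypothetical choices.

```python
import random
from collections import Counter


def stratified_permuted_block(strata, block_size, rng):
    """Assign each patient to arm 1 or arm 0 using permuted blocks within strata.

    strata: sequence of hashable stratum labels, one per patient, in entry order.
    block_size: even block size (an illustrative choice, not taken from the paper).
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    open_blocks = {}            # stratum -> unused assignments of its current block
    arms = []
    for stratum in strata:
        block = open_blocks.get(stratum)
        if not block:           # no block yet, or the current block is used up
            block = [1] * (block_size // 2) + [0] * (block_size // 2)
            rng.shuffle(block)
            open_blocks[stratum] = block
        arms.append(block.pop())
    return arms


if __name__ == "__main__":
    rng = random.Random(7)
    # Two binary covariates with success probability 0.5 define four strata.
    patients = [(int(rng.random() < 0.5), int(rng.random() < 0.5)) for _ in range(400)]
    arms = stratified_permuted_block(patients, 4, rng)
    print(Counter(zip(patients, arms)))
```

Within every stratum the two arms can differ by at most half a block, which is the balancing property that distinguishes this scheme from complete randomization.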
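The Hurst estimation step can be sketched as follows. The likelihood of [22] is not reproduced in this section, so the Python sketch below assumes the generic zero-mean Gaussian log-likelihood of the observed B-values with the FBM covariance matrix at unit scale, and maximizes it by a grid search rather than a numerical optimizer. The function names and the grid are hypothetical.

```python
import math


def fbm_cov(s, t, hurst):
    """Cov(B_H(s), B_H(t)) for fractional Brownian motion."""
    two_h = 2.0 * hurst
    return 0.5 * (abs(s) ** two_h + abs(t) ** two_h - abs(t - s) ** two_h)


def cholesky(a):
    """Lower-triangular factor L of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def fbm_loglik(values, times, hurst):
    """Gaussian log-likelihood of B_H observed at the given positive times."""
    n = len(values)
    low = cholesky([[fbm_cov(s, t, hurst) for t in times] for s in times])
    # Forward substitution solves L w = values, so that values' S^-1 values = w'w.
    w = []
    for i in range(n):
        acc = values[i] - sum(low[i][k] * w[k] for k in range(i))
        w.append(acc / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(x * x for x in w)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def estimate_hurst(values, times, grid=None):
    """Grid-search maximizer of the log-likelihood over H in [0.05, 0.95]."""
    if grid is None:
        grid = [k / 100 for k in range(5, 96)]
    return max(grid, key=lambda h: fbm_loglik(values, times, h))


if __name__ == "__main__":
    times = [k / 40 for k in range(1, 41)]
    trend_path = list(times)      # a perfectly persistent, artificial path
    print(estimate_hurst(trend_path, times))
```

With $H = 0.5$ the log-likelihood reduces to that of independent normal increments, which gives a simple consistency check for the implementation.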

Conditional Power for Covariate Adaptive Randomized Clinical Trials with Correlated Error Structures
Test statistics derived from covariate adaptive randomized clinical trials do not follow a Brownian motion in the sequential monitoring process when the covariance pattern of the error terms is ignored in the test statistic formula [15]. The independent increment property is not satisfied under the misspecification assumption for covariate adaptive randomized clinical trials. Therefore, a more general stochastic process, fractional Brownian motion, was proposed for the misspecification scenarios. In this section, conditional powers were calculated and compared between the BM and the FBM methods.
Conditional power is the probability of rejecting the null hypothesis, given the interim data results up to the time of interim analysis [31]. Conditional power is used for the decisions regarding possible early termination, or proceeding to the original plan, or increasing the sample sizes in the clinical trial interim analyses. The evaluation of the conditional power can predict the long-term clinical trial study results based on the partially observed data.
Under the null hypothesis of no treatment effect in the clinical trial, let $Z_t$ be the sequential test statistic at time $t$. The normalized $Z_t$, obtained from the estimate of $\beta(t)$ in the covariate adaptive randomized clinical trial, can be transformed to the B-value $B_H(t) = Z_t t^{1/2}$. For times $t > s$, the conditional probability under Brownian motion is calculated as in [10,32]. When the treatment effects in the two groups are not equal, the Brownian motion has a drift parameter $\theta$; the increment of Brownian motion with drift over $(s, t]$ follows a normal distribution with mean $\theta(t - s)$ and variance $(t - s)$. The conditional power with drift is given in [10,33].
The conditional probability under the null hypothesis in terms of FBM, Equation (7), was proposed in Lai et al. [34]. The conditional distribution in Equation (7) is normal, $N(\mu_{H,n}, \sigma^2_{H,n})$, and $\Sigma$ is the variance-covariance matrix of $B_H(t_N)$ and $B_H(t_n), \ldots, B_H(t_2), B_H(t_1)$.
A two-stage simulation strategy was used to calculate the conditional power [35]. The recruitment time $t$ was divided into two parts by the time point ti.fix, which resembles the interim analysis time point in a real clinical trial. The first part, before ti.fix, is called the fixed part and contains the data already known. The second part, after ti.fix and until $t = 1$, is called the unknown part. The empirical result was obtained by analyzing the data from ti.fix to $t = 1$ over 1000 simulations. It was assumed that the empirical results reflect the outcome at the end of the whole clinical trial.
Based on Formula (7), CPs under the BM and FBM assumptions were calculated for the fixed part, from $t = 0$ to $t$ = ti.fix. In Formula (7), $s_1, s_2, \ldots, s_n$ are the observed values before $t_N$. The mean $\mu_{H,n}$ and standard deviation $\sigma_{H,n}$ are described in the definition of Equation (7). The conditional power under the BM assumption, with $H = 0.5$, was denoted CP (BM). The conditional power under the FBM assumption, with $H$ estimated by the MLE approach, was denoted CP (FBM).
CP (empirical) was assumed to predict the result at the end of the whole clinical trial ($t = 1$), with the formula CP (empirical) = Count($Z(t_N) > 1.96$)/1000, where 1.96 is the critical value for the hypothesis test with alpha = 0.05 at $t = 1$. CP (empirical), computed over 1000 replications, was treated as the standard criterion against which the theoretical values CP (BM) and CP (FBM) were compared [35]. Scatter plots of CP (BM), CP (FBM) and the empirical CPs visually demonstrated the consistency between the theoretical and empirical data.
Scenarios were also simulated in which the Brownian motion had an upward mean shift due to drift. Let $B(t)$ be a standard Brownian motion process. A Brownian motion process with drift is $U(t) = \theta t + \sigma^2 B(t)$, $t \geq 0$, where $\theta$ is the drift parameter. Assuming $\sigma^2 = 1$, different drift parameters ($\theta$ = 0, 0.5, 1, 1.5, 2) were examined in the conditional power simulations. $\mu_{H,n}$ in Equation (7) corresponded to the drift parameters in the simulations. In addition, different interim time points, ti.fix = 0.7, 0.8, 0.9, were evaluated.
In this section, a total sample size of 4000 participants with 40 interim time points was used for illustration, although the number of interim analyses in practice is much smaller. The error terms of the model were not independent and identically distributed but were assumed to follow increments of fractional Brownian motion with Hurst exponent $H = 0.9$. In this study, $B_H(t)$ was first calculated from the sequential monitoring test statistics (3) under misspecification scenarios, and $H$ was estimated from $B_H(t)$. Conditional powers were then computed using Equation (7) with $B_H(t)$ from the sequential processes.
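The two conditional power calculations can be illustrated with a short Python sketch. It is not a reproduction of Equations (6)-(7), which are not shown here: the BM part uses the standard B-value form of conditional power, and the FBM part assumes that the conditional distribution of $B_H(1)$ given the observed values follows from ordinary multivariate normal conditioning on the FBM covariance matrix, under the null hypothesis (no drift). All function names are hypothetical.

```python
import math
from statistics import NormalDist


def fbm_cov(s, t, hurst):
    """Cov(B_H(s), B_H(t)) for fractional Brownian motion."""
    two_h = 2.0 * hurst
    return 0.5 * (abs(s) ** two_h + abs(t) ** two_h - abs(t - s) ** two_h)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def cp_bm(b_s, s, theta=0.0, crit=1.96):
    """P(B(1) > crit | B(s) = b_s) for Brownian motion with drift theta."""
    mean = b_s + theta * (1.0 - s)
    return 1.0 - NormalDist().cdf((crit - mean) / math.sqrt(1.0 - s))


def cp_fbm(observed, times, hurst, crit=1.96):
    """P(B_H(1) > crit | B_H(times) = observed) under the null (no drift)."""
    s22 = [[fbm_cov(u, v, hurst) for v in times] for u in times]
    s12 = [fbm_cov(1.0, u, hurst) for u in times]
    weights = solve(s22, s12)                       # Sigma_22^{-1} Sigma_21
    mean = sum(w * b for w, b in zip(weights, observed))
    var = fbm_cov(1.0, 1.0, hurst) - sum(w * c for w, c in zip(weights, s12))
    return 1.0 - NormalDist().cdf((crit - mean) / math.sqrt(var))


if __name__ == "__main__":
    times = [0.2, 0.4, 0.7]
    observed = [0.3, 0.1, 0.9]
    print(cp_bm(observed[-1], times[-1]))
    print(cp_fbm(observed, times, 0.5))   # agrees with the BM value
    print(cp_fbm(observed, times, 0.9))
```

With $H = 0.5$ only the most recent observation matters, so the FBM calculation collapses to the BM one; for other values of $H$ the whole observed path enters through the weights.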
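The counting formula for CP (empirical) can be illustrated with a simplified two-stage simulation in Python. This sketch does not simulate a CAR trial: as a stated simplification, the unknown part after ti.fix is generated directly as a Brownian motion increment with drift, and the function names, seed, and replication count are hypothetical.

```python
import math
import random
from statistics import NormalDist


def empirical_cp(b_fix, t_fix, theta, n_rep, rng, crit=1.96):
    """Share of simulated trial completions whose final statistic exceeds crit.

    b_fix is the B-value observed at the interim time t_fix (the fixed part);
    the unknown part from t_fix to 1 is drawn as a BM increment with drift theta.
    """
    hits = 0
    for _ in range(n_rep):
        increment = theta * (1.0 - t_fix) + math.sqrt(1.0 - t_fix) * rng.gauss(0.0, 1.0)
        z_final = b_fix + increment        # at t = 1 the B-value equals Z
        if z_final > crit:
            hits += 1
    return hits / n_rep


def theoretical_cp_bm(b_fix, t_fix, theta, crit=1.96):
    """Closed-form conditional power under BM with drift, for comparison."""
    mean = b_fix + theta * (1.0 - t_fix)
    return 1.0 - NormalDist().cdf((crit - mean) / math.sqrt(1.0 - t_fix))


if __name__ == "__main__":
    rng = random.Random(123)
    for t_fix in (0.7, 0.8, 0.9):
        emp = empirical_cp(1.0, t_fix, 1.0, 1000, rng)
        print(t_fix, emp, theoretical_cp_bm(1.0, t_fix, 1.0))
```

With 1000 replications the Monte Carlo standard error of the empirical proportion is at most about 0.016, which bounds how closely the empirical and theoretical values can be expected to agree.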

Conclusions and Discussions
In this study, we investigated the sequential monitoring properties of covariate adaptive randomized clinical trials under misspecification scenarios. We also performed numerical simulations under various situations, in which the mean estimates of the Hurst coefficient obtained by MLE from the sequential test statistics under misspecification scenarios deviated from 0.5. The Brownian motion assumption holds only when $H = 0.5$. Therefore, the independent increment assumption was violated and Brownian motion was not appropriate for the sequential process. However, clinical researchers may not be aware of the covariance in the error terms and hence may use the classical test statistic under the misspecification scenarios, leading to a non-Brownian trajectory of the test statistics under sequential analysis. Therefore, it is necessary to estimate the Hurst coefficient.
We calculated conditional powers for covariate adaptive randomized clinical trials with mis-specified error structures under different covariate types, adaptive designs, drift parameters, and interim time points. Conditional powers based on the fractional Brownian motion assumption (CP (FBM)) were more consistent with the standard empirical value (CP (empirical)) than conditional powers under the classical Brownian motion assumption (CP (BM)). When the MLE of $H$ was greater than 0.5, most conditional powers under the FBM assumption were greater than those under the classical Brownian motion assumption. Fractional Brownian motion, which allows dependent increments, is a reasonable approach for sequential analyses of clinical trials. Even if the sequential procedure actually follows Brownian motion, the fractional Brownian motion technique remains applicable, since BM is a special case of FBM with $H = 0.5$.