Design and Analysis of Cancer Clinical Trials for Personalized Medicine

Jung, Sin-Ho

doi:10.3390/jpm11050376

Open Access Review

Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA

J. Pers. Med. 2021, 11(5), 376; https://doi.org/10.3390/jpm11050376

Submission received: 7 April 2021 / Revised: 22 April 2021 / Accepted: 22 April 2021 / Published: 4 May 2021

(This article belongs to the Section Epidemiology)

Abstract

Biomarkers play a key role in the development of personalized medicine. Cancer clinical trials with biomarkers should be designed and analyzed to reflect factors such as the phase of the trial, the type of biomarker, the study objectives, and whether the biomarker has already been validated. In this paper, we demonstrate the design and analysis of two phase II cancer clinical trials, one with a predictive biomarker and the other with a prognostic biomarker. A statistical testing method and its sample size calculation method are presented for each trial. We assume that the primary endpoint of these trials is a time-to-event variable, but the concept can be used for any type of endpoint with an associated testing method. The test statistics and their sample size formulas are derived using large-sample approximations based on the martingale central limit theorem. Using simulations, we find that the test statistics control the type I error rate accurately and that the sample sizes calculated using the formulas maintain the statistical power specified at the design stage.

Keywords:

enrichment trial; interaction; predictive biomarker; prognostic biomarker; progression-free survival; stratified randomization trial

1. Introduction

In many cancer clinical trials, different types of biomarkers are measured from the tumor, blood, or urine using molecular, biochemical, physiological, anatomical, or imaging methods at baseline or during treatment. The observed biomarkers are used for various purposes during the diagnosis and treatment of the disease. For example, cancer biomarkers are used to diagnose diseases (diagnostic biomarker), to predict the response to a specific treatment (predictive biomarker), to measure the aggressiveness of a disease for patients with no treatment or a non-targeted treatment (prognostic biomarker), to monitor the recurrence of a disease, and so on.

These biomarkers can be used to select a treatment of cancer patients. However, biomarkers should be validated before being used to select a treatment in clinical trials. If a biomarker has not been validated yet, it can be used as a stratification factor of a randomized clinical trial. In such a trial, the biomarker is used for its validation, rather than for treatment selection.

The design and analysis method of a clinical trial with a biomarker-guided treatment can be very different depending on the type of the used biomarker, the biomarker’s development stage, the study objective, and so on. Various design issues of randomized clinical trials with biomarkers have been widely discussed [1]. A series of statistical testing has been proposed for a randomized phase II trial with a potentially predictive biomarker which has not been strictly validated yet [2]. The efficacy of enrichment trials and stratified randomization trials with a time to event variable as the primary endpoint has been compared assuming that the treatment effect reverses between biomarker positive and negative groups and considering subset analysis within each biomarker status group [3].

Phase II trials screen out inefficacious treatments before proceeding to large-scale studies, such as phase III trials. As such, phase II trials should be completed in a short time period, so we must choose a small sample size and a short-term surrogate endpoint, such as tumor response or progression-free survival, as the primary endpoint, rather than a confirmatory endpoint, such as overall survival.

In this paper, we demonstrate two phase II cancer clinical trials, one with a predictive biomarker and the other with a prognostic biomarker, and present analysis and sample size calculation methods for these trials. We use a survival variable as the primary endpoint in this paper, but the same concept can be used for any kind of variable, including a binary variable such as tumor response. For the purpose of sample size calculation, we assume exponential survival distributions, which are the most commonly used in real trial designs, although the statistical tests do not depend on any specific survival distribution. This article reviews a biostatistical paper [4], with some modifications.

2. Materials and Methods

We consider a time to event (or survival) endpoint, progression-free survival (PFS). We use a generalized log-rank test for a trial with an imaging prognostic biomarker and a Cox proportional hazards model for a trial with a predictive biomarker, and derive their sample size formulas. To account for relatively small sample sizes of phase II trials, exact statistical methods are used for binary outcomes, but in general no exact methods are available for survival analysis. Therefore, using simulations on two real trial examples, we evaluate the small sample performance of the discussed statistical tests and their sample size formulas that are derived based on large sample approximation.

3. Results

3.1. A Phase II Trial with a Predictive Biomarker

Predictive biomarkers provide information on the likelihood of response to a specific chemotherapy. For example, tumors expressing high thymidylate synthase (TS) levels were shown to be resistant to pemetrexed in a preclinical study [5], but this has not yet been validated in a clinical study. Suppose that we want to investigate whether TS expression is a predictive marker for the clinical outcome of pemetrexed/cisplatin (PC) in patients with nonsquamous non–small-cell lung cancer (NSCLC) through a phase II trial. The control non-targeted treatment is gemcitabine/cisplatin (GC). Compared to GC, PC is expected to be similarly efficacious in the TS-positive group, but more efficacious in the TS-negative group.

To investigate this hypothesis, we want to randomize patients between the two treatment arms stratifying by TS-positivity vs. TS-negativity. This trial was designed and published with overall response as the primary endpoint [6], but in this paper, we demonstrate how to design and analyze a trial using PFS as the primary endpoint based on the estimates from the trial.

3.1.1. Statistical Testing

When this study is completed, PFS will be regressed on treatment allocation $z_{1}$ (=0 for GC arm; =1 for PC arm) and TS-positivity $z_{2}$ (=0 for TS-negative group; =1 for TS-positive group) using a proportional hazards model [7]

$$\lambda(t) = \lambda_{0}(t) \exp(\beta_{1} z_{1} + \beta_{2} z_{2} + \beta_{3} z_{1} z_{2}) \tag{1}$$

Please note that we have an interaction term $z_{1} z_{2}$ in the model.

From model (1), the hazard functions of the four patient groups defined by treatment and TS status are given as $\lambda(t \mid z_{1} = 0, z_{2} = 0) = \lambda_{0}(t)$, $\lambda(t \mid z_{1} = 1, z_{2} = 0) = \lambda_{0}(t) \exp(\beta_{1})$, $\lambda(t \mid z_{1} = 0, z_{2} = 1) = \lambda_{0}(t) \exp(\beta_{2})$, and $\lambda(t \mid z_{1} = 1, z_{2} = 1) = \lambda_{0}(t) \exp(\beta_{1} + \beta_{2} + \beta_{3})$. For TS-positive patients $(z_{2} = 1)$, the hazard ratio between PC and GC is $\lambda(t \mid z_{1} = 1, z_{2} = 1) / \lambda(t \mid z_{1} = 0, z_{2} = 1) = \exp(\beta_{1} + \beta_{3})$, so that we expect $\beta_{1} \approx -\beta_{3}$ if GC and PC are similarly efficacious for TS-positive patients. For the GC arm ($z_{1} = 0$), the hazard ratio between the TS-positive group and the TS-negative group is $\lambda(t \mid z_{1} = 0, z_{2} = 1) / \lambda(t \mid z_{1} = 0, z_{2} = 0) = \exp(\beta_{2})$, so that we will have $\beta_{2} = 0$ if GC is non-targeted against TS. With $\beta_{2} = 0$, $\lambda_{0}(t)$ is the hazard function for the GC arm. On the other hand, for the PC arm ($z_{1} = 1$), the hazard ratio between the TS-positive group and the TS-negative group is $\lambda(t \mid z_{1} = 1, z_{2} = 1) / \lambda(t \mid z_{1} = 1, z_{2} = 0) = \exp(\beta_{2} + \beta_{3}) = \exp(\beta_{3})$, since GC is a non-targeted treatment (i.e., $\beta_{2} = 0$). If TS-positive tumors are resistant to pemetrexed, we will have $\beta_{3} > 0$. Therefore, the hypotheses of interest are $H_{0} : \beta_{3} = 0$ and $H_{1} : \beta_{3} > 0$.

For patient $i\ (= 1, \dots, n)$, let $X_{i}$ be the minimum of censoring time and survival time, $\delta_{i}$ be the event indicator taking 1 if tumor progression has occurred and 0 otherwise, and $Z_{i} = (z_{1i}, z_{2i}, z_{1i} z_{2i})^{T}$ be the covariate vector. Partial score and information functions for the regression coefficients $\beta = (\beta_{1}, \beta_{2}, \beta_{3})^{T}$ are given as

$$U(\beta) = \sum_{i=1}^{n} \int_{0}^{\infty} \left\{ Z_{i} - \frac{\sum_{j=1}^{n} Y_{j}(t) Z_{j} e^{\beta^{T} Z_{j}}}{\sum_{j=1}^{n} Y_{j}(t) e^{\beta^{T} Z_{j}}} \right\} dN_{i}(t)$$

and

$$I(\beta) = \sum_{i=1}^{n} \int_{0}^{\infty} \left[ \frac{\sum_{j=1}^{n} Y_{j}(t) Z_{j}^{\otimes 2} e^{\beta^{T} Z_{j}}}{\sum_{j=1}^{n} Y_{j}(t) e^{\beta^{T} Z_{j}}} - \frac{\left\{ \sum_{j=1}^{n} Y_{j}(t) Z_{j} e^{\beta^{T} Z_{j}} \right\}^{\otimes 2}}{\left\{ \sum_{j=1}^{n} Y_{j}(t) e^{\beta^{T} Z_{j}} \right\}^{2}} \right] dN_{i}(t),$$

respectively, where $N_{i}(t) = \delta_{i} I(X_{i} \leq t)$ is the event process, $Y_{i}(t) = I(X_{i} \geq t)$ is the at-risk process, $I(\cdot)$ is an indicator function, and $z^{\otimes 2} = z z^{T}$ for a vector $z$.

Let $\hat{\beta} = (\hat{\beta}_{1}, \hat{\beta}_{2}, \hat{\beta}_{3})^{T}$ denote the solution to $U(\beta) = 0$. Then, $\hat{\beta}$ is approximately normal with mean 0 and variance–covariance $I^{-1}(0)$ under the global null hypothesis of $\beta = 0$ [8]. Hence, with a one-sided type I error rate of $\alpha$, we reject $H_{0} : \beta_{3} = 0$ in favor of $H_{1} : \beta_{3} > 0$ if $\hat{\beta}_{3} / \hat{\sigma}_{3} > z_{1-\alpha}$, where $\hat{\sigma}_{3}^{2}$ is the $(3, 3)$-component of $I^{-1}(0)$ and $z_{1-\alpha}$ is the $1 - \alpha$ quantile of the standard normal distribution.
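The rejection rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the estimate $\hat{\beta}_{3}$ and its standard error $\hat{\sigma}_{3}$ have already been obtained from a fitted Cox model; the function name `wald_interaction_test` and the numeric inputs are hypothetical and are not taken from the trial.

```python
from statistics import NormalDist


def wald_interaction_test(beta3_hat, se3, alpha=0.1):
    """One-sided Wald test of H0: beta3 = 0 against H1: beta3 > 0.

    beta3_hat and se3 are assumed to come from a fitted Cox model
    (hypothetical inputs here). Returns (z statistic, critical value, reject?).
    """
    z_stat = beta3_hat / se3
    critical = NormalDist().inv_cdf(1 - alpha)  # z_{1 - alpha}
    return z_stat, critical, z_stat > critical


# Hypothetical estimate and standard error, for illustration only.
z_stat, critical, reject = wald_interaction_test(beta3_hat=0.60, se3=0.30, alpha=0.1)
print(f"z = {z_stat:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Only the interaction coefficient enters the decision; the main effects $\beta_{1}$ and $\beta_{2}$ are estimated but not tested here.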

3.1.2. Sample Size Calculation

For the sample size calculation of this study, we need to specify the following design parameters.

Type I error rate and power, $(\alpha, 1 - \beta)$
Allocation proportion for GC arm, $p_{0}$, and for PC arm, $p_{1}$ $(p_{0} + p_{1} = 1)$
Proportions of TS-negative patients, $q_{0}$, and TS-positive patients, $q_{1}$, based on the prevalence in the study population $(q_{0} + q_{1} = 1)$
Assuming exponential distributions for PFS, the hazard rates, $\lambda_{z_{1} z_{2}}$, of the four patient groups, $\lambda_{00}$, $\lambda_{01}$, $\lambda_{10}$, and $\lambda_{11}$
Accrual period $a$ (or accrual rate $r$) and additional follow-up period $b$

Assuming exponential distributions for PFS with hazard rates $\lambda_{z_{1} z_{2}}$, model (1) is simplified to $\lambda = \lambda_{0} \exp(\beta_{1} z_{1} + \beta_{2} z_{2} + \beta_{3} z_{1} z_{2})$ with

$$\lambda_{00} = \lambda_{0}, \quad \lambda_{10} = \lambda_{0} \exp(\beta_{1}), \quad \lambda_{01} = \lambda_{0} \exp(\beta_{2}), \quad \lambda_{11} = \lambda_{0} \exp(\beta_{1} + \beta_{2} + \beta_{3}).$$

By solving these equations with respect to $(\lambda_{0}, \beta_{1}, \beta_{2}, \beta_{3})$, we have

$$\lambda_{0} = \lambda_{00}, \quad \beta_{1} = \log \lambda_{10} - \log \lambda_{00}, \quad \beta_{2} = \log \lambda_{01} - \log \lambda_{00}, \quad \beta_{3} = \log \lambda_{11} - \log \lambda_{10} - \log \lambda_{01} + \log \lambda_{00}. \tag{2}$$

Hence, we can calculate the value of $\beta_{3}$ under $H_{1}$ in terms of the hazard rates that are specified as design parameters above.
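As a small illustration of Equation (2), the following Python sketch maps four exponential hazard rates to the regression coefficients. The function name `cox_coefficients` and the input values are assumptions chosen for illustration (only the PC arm in the TS-negative group is given a lower hazard); they are not the trial's design values.

```python
import math


def cox_coefficients(lam00, lam10, lam01, lam11):
    """Solve Equation (2): map the hazard rates lam_{z1 z2} of the four
    patient groups to (lam0, beta1, beta2, beta3)."""
    lam0 = lam00
    beta1 = math.log(lam10) - math.log(lam00)
    beta2 = math.log(lam01) - math.log(lam00)
    beta3 = (math.log(lam11) - math.log(lam10)
             - math.log(lam01) + math.log(lam00))
    return lam0, beta1, beta2, beta3


# Illustrative (assumed) hazard rates: only group (z1, z2) = (1, 0) benefits.
lam0, beta1, beta2, beta3 = cox_coefficients(lam00=2.0, lam10=1.0, lam01=2.0, lam11=2.0)
print(f"lam0 = {lam0}, beta1 = {beta1:.4f}, beta2 = {beta2:.4f}, beta3 = {beta3:.4f}")
```

In this configuration $\beta_{2} = 0$ and $\beta_{3} = -\beta_{1}$, which is the pattern expected when the control treatment is non-targeted and the two arms are similarly efficacious in the biomarker-positive group.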

To derive a sample size formula, we need to calculate the limit of $\hat{\sigma}_{3}^{2}$ or $I^{-1}(0)$ as $n \to \infty$ in terms of the design parameters. Let $p_{kl} = P(z_{1} = k, z_{2} = l)$ for $k, l = 0$ or 1 denote the relative frequency of each cell of the $2 \times 2$ table defined by treatment and TS status. Under the stratified randomization scheme, we have $p_{kl} = p_{k} q_{l}$ for $k, l = 0$ or 1.

Appendix A shows that $I(0)$ converges to $DA$, where $D = nd$ denotes the expected number of events (or number of patients with tumor progression), $d = p_{00} d_{00} + p_{10} d_{10} + p_{01} d_{01} + p_{11} d_{11}$ denotes the probability that a patient has a progression during the study period,

$$d_{kl} = 1 - \frac{\exp(-\lambda_{kl} b) \{1 - \exp(-\lambda_{kl} a)\}}{a \lambda_{kl}}$$

denotes the probability that a patient in group $(z_{1}, z_{2}) = (k, l)$ has a progression for $k, l = 0, 1$, as derived from an exponential PFS distribution with hazard rate $\lambda_{kl}$ and a $U(b, a + b)$ censoring distribution, and

$$A = \begin{pmatrix} p_{0} p_{1} & p_{11} - p_{1} q_{1} & p_{0} p_{11} \\ p_{11} - p_{1} q_{1} & q_{0} q_{1} & q_{0} p_{11} \\ p_{0} p_{11} & q_{0} p_{11} & p_{11} (1 - p_{11}) \end{pmatrix}.$$

Hence, the limit of $\hat{\sigma}_{3}^{2}$ is $\sigma_{3}^{2} = A_{(3, 3)} / D$, where $A_{(3, 3)}$ is the $(3, 3)$ component of $A^{-1}$.
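The two ingredients of $\sigma_{3}^{2}$ can be computed directly. The sketch below is a minimal pure-Python illustration with hypothetical function names. It assumes the closed form $1 - e^{-\lambda b}(1 - e^{-\lambda a})/(a\lambda)$ for the event probability, which follows from averaging $e^{-\lambda c}$ over a censoring time $c \sim U(b, a + b)$, and it obtains the $(3, 3)$ element of $A^{-1}$ as a cofactor divided by the determinant.

```python
import math


def event_probability(lam, a, b):
    """P(progression observed) for exponential PFS with hazard rate lam and
    censoring time ~ U(b, a + b): 1 - (1/a) * int_b^{a+b} exp(-lam * c) dc."""
    return 1.0 - math.exp(-lam * b) * (1.0 - math.exp(-lam * a)) / (a * lam)


def a33_of_inverse(p1, q1):
    """(3, 3) element of A^{-1} under stratified randomization (p11 = p1 * q1)."""
    p0, q0, p11 = 1.0 - p1, 1.0 - q1, p1 * q1
    a11, a12, a13 = p0 * p1, p11 - p1 * q1, p0 * p11
    a22, a23, a33 = q0 * q1, q0 * p11, p11 * (1.0 - p11)
    det = (a11 * (a22 * a33 - a23 * a23)
           - a12 * (a12 * a33 - a23 * a13)
           + a13 * (a12 * a23 - a22 * a13))
    return (a11 * a22 - a12 * a12) / det  # cofactor of a33 divided by det(A)


# 1:1 allocation and 50% biomarker prevalence give A_(3,3) = 16.
print(a33_of_inverse(p1=0.5, q1=0.5))
# Illustrative (assumed) inputs: annual hazard 2.1, a = 2.875 years, b = 1 year.
print(round(event_probability(lam=2.1, a=2.875, b=1.0), 4))
```

With balanced allocation and prevalence the variance factor of the interaction estimate is 16, four times the factor of 4 that applies to a simple 1-to-1 two-arm comparison, which is one way to see why interaction tests need many events.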

From (2), $\bar{\beta}_{3} = \log \lambda_{11} - \log \lambda_{10} - \log \lambda_{01} + \log \lambda_{00}$ is the $\beta_{3}$ value specified under $H_{1}$. Since $(\hat{\beta}_{3} - \bar{\beta}_{3}) / \sigma_{3}$ has approximately the standard normal distribution under $H_{1}$, the power for a local alternative hypothesis $H_{1} : \beta_{3} = \bar{\beta}_{3}$ is given as

$$1 - \beta = P\left( \hat{\beta}_{3} / \hat{\sigma}_{3} > z_{1-\alpha} \mid \beta_{3} = \bar{\beta}_{3} \right) = P\left( \frac{\hat{\beta}_{3} - \bar{\beta}_{3}}{\sigma_{3}} > z_{1-\alpha} - \frac{\bar{\beta}_{3}}{\sigma_{3}} \,\Big|\, \beta_{3} = \bar{\beta}_{3} \right) = \bar{\Phi}\left( z_{1-\alpha} - \bar{\beta}_{3} / \sigma_{3} \right) \tag{3}$$

where $\bar{\Phi}(\cdot)$ is the survivor function of the standard normal distribution.

Noting that $\sigma_{3}^{2} = A_{(3, 3)} / D = A_{(3, 3)} / (n d)$, we obtain the required number of events

$$D = A_{(3, 3)} \left( \frac{z_{1-\alpha} + z_{1-\beta}}{\bar{\beta}_{3}} \right)^{2}$$

or the required sample size

$$n = \frac{A_{(3, 3)}}{d} \left( \frac{z_{1-\alpha} + z_{1-\beta}}{\bar{\beta}_{3}} \right)^{2} \tag{4}$$

by solving Equation (3).
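The algebra behind this step can be written out as follows. This is a sketch of the intermediate steps only, using $\sigma_{3}^{2} = A_{(3,3)}/D$ and the fact that $\bar{\Phi}(x) = 1 - \beta$ holds exactly when $x = -z_{1-\beta}$.

```latex
\begin{align*}
\bar{\Phi}\left(z_{1-\alpha} - \bar{\beta}_{3}/\sigma_{3}\right) = 1 - \beta
  &\iff z_{1-\alpha} - \frac{\bar{\beta}_{3}}{\sigma_{3}} = -z_{1-\beta} \\
  &\iff \sigma_{3}^{2} = \left(\frac{\bar{\beta}_{3}}{z_{1-\alpha} + z_{1-\beta}}\right)^{2} \\
  &\iff D = \frac{A_{(3,3)}}{\sigma_{3}^{2}}
        = A_{(3,3)} \left(\frac{z_{1-\alpha} + z_{1-\beta}}{\bar{\beta}_{3}}\right)^{2}.
\end{align*}
```

Dividing $D$ by the event probability $d$ then gives the sample size $n = D/d$.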

Formula (4) requires specification of the accrual period $a$ together with $(\alpha, 1 - \beta)$, $\bar{\beta}_{3}$, $b$, $p_{0}$, and $q_{0}$. In designing a clinical trial, however, we can usually estimate the accrual rate rather than the accrual period. Suppose that patients are expected to be enrolled in the study at a rate of $r$ during the accrual period, based on the number of patients recently treated by the study member sites. Assuming uniform patient accrual during period $a$, we have $n \approx a \times r$. Noting that $d = d(a)$ is a function of $a$, (4) is expressed as

$$a \times r = \frac{A_{(3, 3)}}{d(a)} \left( \frac{z_{1-\alpha} + z_{1-\beta}}{\bar{\beta}_{3}} \right)^{2} \tag{5}$$

By solving (5) with respect to $a$ using a numerical method, such as the bisection method, we obtain the required accrual period, say $a^{*}$, and the required sample size $n = a^{*} r$.

3.1.3. Example 1

We demonstrate our sample size calculation method with the NSCLC trial introduced above. We will randomize patients between the two treatment arms in a 1-to-1 fashion, i.e., $p_{0} = p_{1} = 1/2$, stratified by TS status. The expected TS-positivity is 50% (i.e., $q_{0} = q_{1} = 0.5$) because the median TS level was selected as the cutoff value for TS-positivity from a previous study [9]. Hence, we have $p_{z_{1} z_{2}} = p_{z_{1}} q_{z_{2}} = 1/4$ for $z_{1}, z_{2} = 0$ or 1. The 6-month PFS is expected to be about 35% for the GC arm regardless of TS level and for the PC arm with TS-positivity, and 55% for the PC arm with TS-negativity. For an exponential distribution, the $t$-year survival probability $S(t)$ is associated with its hazard rate $\lambda$ by $S(t) = \exp(-\lambda t)$, so that, for example, $\lambda_{00} = -\log(0.35)/0.5 = 2.100$. Therefore, the annual hazard rates under the alternative hypothesis are $\lambda_{00} = \lambda_{01} = \lambda_{11} = 2.100$ and $\lambda_{10} = 1.196$ under the exponential PFS assumption. For these hazard rates, we have the baseline hazard rate $\lambda_{0} = 2.100$, $\beta_{2} = 0$, and $\beta_{3} = -\beta_{1} = 0.563$ from (2). Suppose that about 10 patients per month are expected to enter the study, i.e., an annual accrual rate of $r = 120$. We plan to follow the patients for an additional $b = 1$ year after the last patient enters. Then, the one-sided $\alpha = 0.1$ test for $H_{0} : \beta_{3} = 0$ against $H_{1} : \beta_{3} = 0.563$ in model (1) requires $n = 345$ patients for a power of $1 - \beta = 0.9$. The expected number of events (i.e., number of patients with a disease progression) at the analysis will be $D = 333$. In an effort to lower the sample size for this phase II trial, we use a large $\alpha$ level compared to the standard two-sided $\alpha = 0.05$. We observe an empirical power of 0.897 from 10,000 simulation samples of size $n = 345$ generated at the design setting. This trial recruited 321 patients using overall response as the primary endpoint [6].
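The calculation in this example can be retraced with a short script. The following Python sketch is not the author's Fortran program. It assumes the closed-form event probability for exponential PFS with $U(b, a + b)$ censoring, takes $A_{(3,3)} = 16$ (the value obtained by inverting $A$ when $p_{0} = p_{1} = q_{0} = q_{1} = 1/2$), and solves Equation (5) by bisection. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def hazard_from_pfs(pfs, t):
    """Exponential hazard rate implied by a PFS probability `pfs` at time t."""
    return -math.log(pfs) / t


def event_probability(lam, a, b):
    """P(event observed) with exponential PFS and U(b, a + b) censoring."""
    return 1.0 - math.exp(-lam * b) * (1.0 - math.exp(-lam * a)) / (a * lam)


def required_events(beta3, a33, alpha, power):
    """Required number of events D, with a33 the (3, 3) element of A^{-1}."""
    z = NormalDist().inv_cdf
    return a33 * ((z(1 - alpha) + z(power)) / beta3) ** 2


def solve_accrual(hazards, weights, d_req, r, b, lo=0.01, hi=50.0):
    """Bisection on Equation (5): find a such that a * r * d(a) = d_req."""
    def gap(a):
        d = sum(w * event_probability(lam, a, b) for lam, w in zip(hazards, weights))
        return a * r * d - d_req  # increasing in a
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Design inputs taken from Example 1 (annual time scale).
lam_gc = hazard_from_pfs(0.35, 0.5)      # lam00 = lam01 = lam11
lam_pc_neg = hazard_from_pfs(0.55, 0.5)  # lam10
beta3 = math.log(lam_gc) - math.log(lam_pc_neg)  # Equation (2) in this setting
d_req = required_events(beta3, a33=16.0, alpha=0.1, power=0.9)
hazards = [lam_gc, lam_pc_neg, lam_gc, lam_gc]   # groups (00, 10, 01, 11)
a_star = solve_accrual(hazards, [0.25] * 4, d_req, r=120, b=1.0)
n = math.ceil(a_star * 120)
print(f"beta3 = {beta3:.3f}, D = {d_req:.1f}, a* = {a_star:.3f} years, n = {n}")
```

Under these assumptions the script gives an accrual period a little under 2.9 years and $n = 345$ after rounding up, in line with the sample size quoted above.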

A stratified randomized trial of a treatment with a predictive biomarker requires a large sample size for testing the interaction term. The sample size of a trial for testing the interaction term with 50% biomarker positivity may be compared to that of a trial for an arm-to-arm comparison with 1-to-3 randomization in the setting of the NSCLC trial, expecting a higher efficacy of PC only for the TS-negative group. Let us consider a randomized trial to compare two treatment arms with $(\alpha, 1 - \beta, r, b) = (0.1, 0.9, 120, 1)$ as above and a 6-month PFS of 35% for the control treatment and 55% for the experimental treatment. In this case, we need only $n = 122$ ($D = 102$) patients with 1-to-3 randomization, using a sample size formula for the standard log-rank test [10]. If TS had already been validated as a predictive biomarker of pemetrexed before this trial, then we could have chosen an enrichment trial for TS-negative patients, which would require a much smaller sample size. The efficiency of an enrichment design and that of a stratified randomization design have been compared for a predictive biomarker in terms of a continuous outcome [11].

3.2. A Phase II Trial with a Prognostic Biomarker

Prognostic biomarkers provide information on the overall cancer outcome in patients to facilitate cancer diagnosis regardless of selected treatments. In this section, we consider a phase II trial with an imaging prognostic biomarker. Chemotherapy B has been a standard regimen for patients with non-bulky stage I and II Hodgkin lymphoma. In a previous study on 6 cycles of B, each patient had FDG-PET (fluorodeoxyglucose positron-emission tomography) imaging after 2 cycles of B. It was found that the patients with a negative PET image (group 1) and those with a positive PET image (group 2) had a 3-year PFS of $S_{1}(3) = 0.86$ and $S_{2}(3) = 0.52$, respectively, and the hazard ratio, $\Delta = \lambda_{2} / \lambda_{1}$, was estimated as $\Delta_{0} = 4.3$.

In a new single-arm phase II trial, the patients with a negative PET image after 2 cycles of B will be treated with an additional 4 cycles of chemotherapy B as in the previous study, whereas those with a positive PET image after 2 cycles of B will be treated with 4 cycles of a more aggressive chemotherapy C plus radiation therapy (C+RT).

In this trial, we want to show that by treating PET-positive patients with the more aggressive therapy C+RT, their PFS will become closer to that of PET-negative patients who are treated with the standard chemotherapy B. To this end, we test $H_{0} : \Delta = \Delta_{0}$ against $H_{1} : \Delta < \Delta_{0}$. Although the PFS of group 2 will differ between $H_{0}$ and $H_{1}$, that of group 1 is expected to be identical since PET-negative patients receive the same treatment as in the previous study.

3.2.1. Statistical Testing

Let $n_{k}$ denote the sample size in group $k$, $n = n_{1} + n_{2}$ the total sample size, and $T_{ki}$ the time to progression for subject $i$ in group $k$ ($1 \leq i \leq n_{k}$; $k = 1, 2$). We observe $(X_{ki}, \delta_{ki})$, where $X_{ki}$ is the minimum of $T_{ki}$ and the censoring time, and $\delta_{ki}$ is an event (or progression) indicator taking 1 if the subject had a tumor progression and 0 otherwise. For group $k$, $T_{k1}, \dots, T_{k, n_{k}}$ are distributed with hazard function $\lambda_{k}(t)$. Under the proportional hazards assumption, $\Delta = \lambda_{2}(t) / \lambda_{1}(t)$ denotes the hazard ratio between the two patient groups.

Let $\hat{\Lambda}_{k}(t) = \int_{0}^{t} Y_{k}^{-1}(s) \, dN_{k}(s)$ denote the Aalen–Nelson estimator [12,13] for the cumulative hazard function $\Lambda_{k}(t) = \int_{0}^{t} \lambda_{k}(s) \, ds$, where $Y_{k}(t) = \sum_{i=1}^{n_{k}} I(X_{ki} \geq t)$ and $N_{k}(t) = \sum_{i=1}^{n_{k}} \delta_{ki} I(X_{ki} \leq t)$ are the at-risk process and the event process for group $k$, respectively, and let $N(t) = N_{1}(t) + N_{2}(t)$. It was shown [14] that

$$W(\Delta) = \int_{0}^{\infty} \frac{Y_{1}(t) Y_{2}(t)}{Y_{1}(t) + \Delta Y_{2}(t)} \left\{ \Delta \, d\hat{\Lambda}_{1}(t) - d\hat{\Lambda}_{2}(t) \right\}$$

is increasing in $\Delta$, and that $W(\Delta) / \sigma_{n}(\Delta)$ is asymptotically $N(0, 1)$ when $\Delta$ is the true hazard ratio, where

$$\sigma_{n}^{2}(\Delta) = \Delta \int_{0}^{\infty} \frac{Y_{1}(t) Y_{2}(t)}{\{Y_{1}(t) + \Delta Y_{2}(t)\}^{2}} \, dN(t).$$

Hence, we reject $H_{0} : \Delta = \Delta_{0}$ in favor of $H_{1} : \Delta < \Delta_{0}$ if $W(\Delta_{0}) / \sigma_{n}(\Delta_{0}) > z_{1-\alpha}$, with one-sided type I error rate $\alpha$. The test statistic with $\Delta_{0} = 1$ is the standard log-rank test [15].
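The statistic can be evaluated directly from the observed data. The Python sketch below is a minimal, unoptimized illustration with a hypothetical function name; it uses the equivalent form $W(\Delta_{0}) = \sum_{t} \{\Delta_{0} Y_{2}(t)\, dN_{1}(t) - Y_{1}(t)\, dN_{2}(t)\} / \{Y_{1}(t) + \Delta_{0} Y_{2}(t)\}$ summed over distinct event times, and the toy data are made up.

```python
import math


def generalized_logrank(times, events, groups, delta0):
    """Return (W, variance) of the generalized log-rank statistic.

    times  : observed times X (minimum of event time and censoring time)
    events : 1 if progression was observed, 0 if censored
    groups : group label, 1 or 2
    delta0 : hazard ratio lambda_2 / lambda_1 under H0
    """
    data = list(zip(times, events, groups))
    w = 0.0
    var = 0.0
    for t in sorted({x for x, e, _ in data if e == 1}):
        y1 = sum(1 for x, _, g in data if g == 1 and x >= t)  # at risk, group 1
        y2 = sum(1 for x, _, g in data if g == 2 and x >= t)  # at risk, group 2
        dn1 = sum(1 for x, e, g in data if g == 1 and e == 1 and x == t)
        dn2 = sum(1 for x, e, g in data if g == 2 and e == 1 and x == t)
        denom = y1 + delta0 * y2  # positive at every event time
        w += (delta0 * y2 * dn1 - y1 * dn2) / denom
        var += delta0 * y1 * y2 * (dn1 + dn2) / denom ** 2
    return w, var


# Toy data (made up): two subjects per group, all events observed.
times = [1, 3, 2, 4]
events = [1, 1, 1, 1]
groups = [1, 1, 2, 2]
w, var = generalized_logrank(times, events, groups, delta0=1.0)
print(f"W = {w:.4f}, variance = {var:.4f}, Z = {w / math.sqrt(var):.4f}")
```

With `delta0=1.0` the value of `w` is the usual observed-minus-expected count of events in group 1, which matches the remark that the statistic reduces to the standard log-rank test.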

3.2.2. Sample Size Calculation

We want to estimate the sample size $n$ under a local alternative hypothesis $H_{1} : \Delta = \Delta_{1}$ $(< \Delta_{0})$ with a desired power. For the sample size calculation of this trial, we need to specify the following design parameters.

Type I error rate and power, $(α, 1 - β)$
PET-negativity and PET-positivity, $p_{1}, p_{2}$
Distributions of PFS for PET negative and PET positive patient groups: exponential distributions with hazard rates $λ_{1}$ for PET negative group; $λ_{20}$ under $H_{0}$ and $λ_{21}$ under $H_{1}$ for PET positive group
Accrual period a (or accrual rate r) and additional follow-up period b

Using the specified hazard rates, we have $\Delta_{0} = \lambda_{20} / \lambda_{1}$ and $\Delta_{1} = \lambda_{21} / \lambda_{1}$. A sample size formula with $\Delta_{0} > 1$ and $\Delta_{1} = 1$ for designing non-inferiority trials was proposed [14]. This sample size formula was further extended to general $\Delta_{0}$ and $\Delta_{1}$ with $\Delta_{1} < \Delta_{0}$ [16].

Appendix B derives a sample size formula by adapting Jung and Chow’s formula [16] for this trial,

$$n = \frac{(\sigma_{0} z_{1-\alpha} + \sigma_{1} z_{1-\beta})^{2}}{\omega^{2}} \tag{6}$$

where

$$\omega = p_{1} p_{2} \int_{0}^{\infty} \frac{G(t) S_{1}(t) S_{21}(t) (\lambda_{1} \Delta_{0} - \lambda_{21})}{p_{1} S_{1}(t) + p_{2} \Delta_{0} S_{21}(t)} \, dt,$$

$$\sigma_{0}^{2} = \Delta_{0} p_{1} p_{2} \int_{0}^{\infty} \frac{G(t) S_{1}(t) S_{21}(t) \{p_{1} \lambda_{1} S_{1}(t) + p_{2} \lambda_{21} S_{21}(t)\}}{\{p_{1} S_{1}(t) + p_{2} \Delta_{0} S_{21}(t)\}^{2}} \, dt,$$

$$\sigma_{1}^{2} = p_{1} p_{2} \int_{0}^{\infty} \frac{G(t) S_{1}(t) S_{21}(t) \{p_{2} \lambda_{1} \Delta_{0}^{2} S_{21}(t) + p_{1} \lambda_{21} S_{1}(t)\}}{\{p_{1} S_{1}(t) + p_{2} \Delta_{0} S_{21}(t)\}^{2}} \, dt,$$

$S_{1}(t) = \exp(-\lambda_{1} t)$, $S_{21}(t) = \exp(-\lambda_{21} t)$, $G(t)$ is the survivor function of the $U(b, a + b)$ censoring distribution with

$$G(t) = \begin{cases} 1 & \text{if } t \leq b \\ -t/a + (a + b)/a & \text{if } b < t \leq a + b \\ 0 & \text{if } t > a + b \end{cases}$$

and $p_{k} = n_{k} / n$. The integrals for $\omega$, $\sigma_{0}^{2}$, and $\sigma_{1}^{2}$ are calculated using a numerical method.

The number of events $D$ expected at the analysis time under $H_{1}$ is calculated by $D = n (p_{1} d_{1} + p_{2} d_{2})$, where

$$d_{1} = 1 + \int_{0}^{\infty} S_{1}(t) \, dG(t) = 1 - \frac{\exp(-\lambda_{1} b) \{1 - \exp(-\lambda_{1} a)\}}{a \lambda_{1}}$$

and

$$d_{2} = 1 + \int_{0}^{\infty} S_{21}(t) \, dG(t) = 1 - \frac{\exp(-\lambda_{21} b) \{1 - \exp(-\lambda_{21} a)\}}{a \lambda_{21}}.$$
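The closed form for these event probabilities follows from integrating the exponential survivor function against the uniform censoring density. As a sketch of the intermediate steps, using $dG(t) = -a^{-1}\, dt$ on $(b, a + b)$ and zero elsewhere:

```latex
\begin{align*}
d_{1} &= 1 + \int_{0}^{\infty} S_{1}(t)\, dG(t)
       = 1 - \frac{1}{a} \int_{b}^{a+b} e^{-\lambda_{1} t}\, dt \\
      &= 1 - \frac{e^{-\lambda_{1} b} - e^{-\lambda_{1} (a+b)}}{a \lambda_{1}}
       = 1 - \frac{e^{-\lambda_{1} b} \left(1 - e^{-\lambda_{1} a}\right)}{a \lambda_{1}} .
\end{align*}
```

The same steps with $\lambda_{21}$ in place of $\lambda_{1}$ give $d_{2}$.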

Sample size formula (6) assumes that the accrual period $a$ is specified. Suppose that the accrual rate $r$ is specified instead of the accrual period $a$. Given $(\lambda_{1}, \Delta_{0}, \Delta_{1}, \alpha, 1 - \beta, p_{1}, b)$, $\omega = \omega(a)$ and $\sigma_{h} = \sigma_{h}(a)$ for $h = 0, 1$ are functions of $a$. Under the uniform accrual assumption, we have $n = a \times r$. Hence, (6) is expressed as

$$a \times r = \frac{\{\sigma_{0}(a) z_{1-\alpha} + \sigma_{1}(a) z_{1-\beta}\}^{2}}{\omega^{2}(a)}. \tag{7}$$

By solving (7) with respect to $a$, using a numerical method such as the bisection method, we obtain the required accrual period $a^{*}$ and the required sample size $n = a^{*} \times r$.
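The numerical steps for formulas (6) and (7) can be sketched as follows. This is an illustrative Python implementation, not the author's Fortran program: the integrals are approximated with the trapezoidal rule on $[0, a + b]$, the function names are hypothetical, and the inputs are patterned on the Hodgkin lymphoma design of this section, with hazard rates computed from 3-year PFS values of 86%, 52%, and 74% without rounding. The resulting sample size depends on how the inputs are rounded and on the integration scheme, so it may differ somewhat from a published figure.

```python
import math
from statistics import NormalDist


def censor_survivor(t, a, b):
    """Survivor function G(t) of the U(b, a + b) censoring distribution."""
    if t <= b:
        return 1.0
    if t <= a + b:
        return (a + b - t) / a
    return 0.0


def design_quantities(a, b, p1, lam1, lam21, delta0, steps=2000):
    """Trapezoidal approximations of omega, sigma0^2 and sigma1^2."""
    p2 = 1.0 - p1
    h = (a + b) / steps
    omega = v0 = v1 = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        s1_t, s21_t = math.exp(-lam1 * t), math.exp(-lam21 * t)
        den = p1 * s1_t + p2 * delta0 * s21_t
        base = weight * h * censor_survivor(t, a, b) * s1_t * s21_t
        omega += base * (lam1 * delta0 - lam21) / den
        v0 += base * (p1 * lam1 * s1_t + p2 * lam21 * s21_t) / den ** 2
        v1 += base * (p2 * lam1 * delta0 ** 2 * s21_t + p1 * lam21 * s1_t) / den ** 2
    return p1 * p2 * omega, delta0 * p1 * p2 * v0, p1 * p2 * v1


def required_n(a, b, p1, lam1, lam21, delta0, alpha, power):
    """Formula (6) evaluated at accrual period a."""
    omega, var0, var1 = design_quantities(a, b, p1, lam1, lam21, delta0)
    z = NormalDist().inv_cdf
    return (math.sqrt(var0) * z(1 - alpha) + math.sqrt(var1) * z(power)) ** 2 / omega ** 2


def solve_accrual(r, lo=0.1, hi=30.0, **design):
    """Bisection on Equation (7): find a such that a * r = n(a)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid * r < required_n(mid, **design):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Inputs patterned on the PET-guided design (annual time scale).
lam1 = -math.log(0.86) / 3    # PET negative
lam20 = -math.log(0.52) / 3   # PET positive under H0
lam21 = -math.log(0.74) / 3   # PET positive under H1
design = dict(b=3.0, p1=0.8, lam1=lam1, lam21=lam21,
              delta0=lam20 / lam1, alpha=0.1, power=0.9)
a_star = solve_accrual(r=60, **design)
print(f"a* = {a_star:.2f} years, n = {math.ceil(a_star * 60)}")
```

Bisection is adequate here because $a \times r$ increases with $a$ while the sample size required by (6) decreases as the accrual period, and with it the follow-up, gets longer.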

3.2.3. Example 2

We consider the PET-guided Hodgkin lymphoma trial introduced at the beginning of this section. Under $H_{0}$, we assume a 3-year PFS of 86% and 52% for the PET-negative and PET-positive groups, respectively, which correspond to annual hazard rates of $(\lambda_{1}, \lambda_{20}) = (0.050, 0.218)$ under an exponential PFS model, resulting in a hazard ratio of $\Delta_{0} = 4.3$. By treating the PET-positive patients with the aggressive treatment C+RT, we expect to increase their 3-year PFS to 74% (from 52%), resulting in an annual hazard rate of $\lambda_{21} = 0.100$ and a hazard ratio of $\Delta_{1} = 2.0$. The previous study observed a PET-positivity of about $p_{2} = 20\%$. Assuming an annual accrual rate of $r = 60$ patients and $b = 3$ years of additional follow-up after completion of accrual, we need $n = 191$ patients for $1 - \beta = 90\%$ power for detecting $H_{1} : \Delta_{1} = 2$ by the generalized log-rank test $W(\Delta_{0}) / \sigma_{n}(\Delta_{0})$ with one-sided $\alpha = 10\%$ under $H_{0} : \Delta_{0} = 4.3$. Under this specific alternative hypothesis, we expect about 46 events (progressions or deaths) at the data analysis. This trial was conducted with this study objective as a secondary objective [17]. Simulation studies were conducted to evaluate the performance of the calculated sample size under the above design settings under $H_{0}$ and $H_{1}$. Using 10,000 simulation samples of size $n = 191$ under each hypothesis, the empirical type I error rate and power are observed as 0.0984 (to be compared to $\alpha = 0.1$) and 0.8749 (to be compared to $1 - \beta = 0.9$), respectively.

4. Discussion

We have presented design and analysis methods of two phase II trials for biomarker-guided treatments.

The power of the statistical tests of the trials discussed above depends on the prevalence of biomarker positivity. Therefore, we need to check the observed prevalence during patient accrual, and to recalculate the sample size if the observed prevalence is very different from the one specified at the design stage. For both of our example trials, the initial sample size will be under-powered if the observed prevalence is farther from $1/2$ than the one specified at the design stage. In this case, we may plan a sample size recalculation reflecting the observed prevalence in the middle of the trial, and modify the sample size of the trial if necessary.

For the sample size calculations, we have assumed exponential survival distributions and an accrual pattern with a constant accrual rate, but we can easily extend the formulas to any survival distribution and any accrual pattern [18]. We have considered a survival endpoint as the primary endpoint in this paper, but the concept can be used to design and analyze biomarker-driven phase II trials with other types of endpoints, such as a binary outcome for tumor response.

In an effort to lower the sample size of a phase II trial from that of a phase III trial, we use a high type I error rate [19,20], such as one-sided $\alpha = 5\%$ or 10% (compared to two-sided $\alpha = 5\%$), a surrogate short-term outcome, such as tumor response or progression-free survival (compared to a confirmatory endpoint such as overall survival), a larger treatment effect, and a single-arm design (compared to a randomized design). Despite these efforts, we observe that a randomized trial stratified by a predictive biomarker requires a relatively large sample size for a phase II trial. This fact has been pointed out in the literature [3,11,21].

5. Conclusions

Through simulations on the two real study examples, we find that the proposed statistical tests control the type I error rate accurately and the calculated sample sizes maintain the appropriate power. The sample size calculations require some numerical methods for integration and solving equations. The author developed Fortran programs to implement the sample size formulas, which are available upon request.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Limit of I(0)

Since $n^{-1} \sum_{i=1}^{n} Y_{i}(t)$ uniformly converges to $G(t) S_{00}(t)$ under the global null hypothesis, as $n \to \infty$, we have

$$\frac{\sum_{j=1}^{n} Y_{j}(t) Z_{j}^{\otimes 2}}{\sum_{j=1}^{n} Y_{j}(t)} \longrightarrow A_{1} = \begin{pmatrix} p_{1} & p_{11} & p_{11} \\ p_{11} & q_{1} & p_{11} \\ p_{11} & p_{11} & p_{11} \end{pmatrix}$$

and

$$\frac{\left\{ \sum_{j=1}^{n} Y_{j}(t) Z_{j} \right\}^{\otimes 2}}{\left\{ \sum_{j=1}^{n} Y_{j}(t) \right\}^{2}} \longrightarrow A_{2} = \begin{pmatrix} p_{1}^{2} & p_{1} q_{1} & p_{1} p_{11} \\ p_{1} q_{1} & q_{1}^{2} & q_{1} p_{11} \\ p_{1} p_{11} & q_{1} p_{11} & p_{11}^{2} \end{pmatrix}$$

uniformly. Hence,

$$I(0) = \sum_{i=1}^{n} \int_{0}^{\infty} \left[ \frac{\sum_{j=1}^{n} Y_{j}(t) Z_{j}^{\otimes 2}}{\sum_{j=1}^{n} Y_{j}(t)} - \frac{\left\{ \sum_{j=1}^{n} Y_{j}(t) Z_{j} \right\}^{\otimes 2}}{\left\{ \sum_{j=1}^{n} Y_{j}(t) \right\}^{2}} \right] dN_{i}(t)$$

converges to $AD$, where $D = \sum_{i=1}^{n} \int_{0}^{\infty} dN_{i}(t)$ is the number of events and

$$A = A_{1} - A_{2} = \begin{pmatrix} p_{0} p_{1} & p_{11} - p_{1} q_{1} & p_{0} p_{11} \\ p_{11} - p_{1} q_{1} & q_{0} q_{1} & q_{0} p_{11} \\ p_{0} p_{11} & q_{0} p_{11} & p_{11} (1 - p_{11}) \end{pmatrix}$$
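The matrix $A$ is the covariance matrix of $Z = (z_{1}, z_{2}, z_{1} z_{2})^{T}$, which can be checked numerically. The Python sketch below, with hypothetical function names and arbitrary illustrative proportions, enumerates the four cells of the $2 \times 2$ table under stratified randomization and compares $E(Z^{\otimes 2}) - \{E(Z)\}^{\otimes 2}$ with the closed form.

```python
def limiting_information(p1, q1):
    """A = E[Z Z^T] - E[Z] E[Z]^T for Z = (z1, z2, z1 * z2), with
    P(z1 = k, z2 = l) = p_k * q_l (stratified randomization)."""
    cells = [(k, l, (p1 if k else 1 - p1) * (q1 if l else 1 - q1))
             for k in (0, 1) for l in (0, 1)]
    mean = [0.0] * 3
    second = [[0.0] * 3 for _ in range(3)]
    for k, l, prob in cells:
        z = (k, l, k * l)
        for i in range(3):
            mean[i] += prob * z[i]
            for j in range(3):
                second[i][j] += prob * z[i] * z[j]
    return [[second[i][j] - mean[i] * mean[j] for j in range(3)] for i in range(3)]


def closed_form(p1, q1):
    """Closed-form A as given in Appendix A, with p11 = p1 * q1."""
    p0, q0, p11 = 1 - p1, 1 - q1, p1 * q1
    return [[p0 * p1, p11 - p1 * q1, p0 * p11],
            [p11 - p1 * q1, q0 * q1, q0 * p11],
            [p0 * p11, q0 * p11, p11 * (1 - p11)]]


# Arbitrary illustrative proportions (assumed values).
numeric = limiting_information(p1=0.4, q1=0.3)
exact = closed_form(p1=0.4, q1=0.3)
print(max(abs(numeric[i][j] - exact[i][j]) for i in range(3) for j in range(3)))
```

Under stratified randomization $p_{11} = p_{1} q_{1}$, so the $(1, 2)$ element is zero: treatment and biomarker status are uncorrelated by design.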

Appendix B. Derivation of Jung and Chow’s Formula

Let $p_{k} = n_{k} / n$ denote the prevalence of group $k\ (= 1, 2)$. To derive a sample size formula, we must derive the limiting distribution of the test statistic $W(\Delta_{0}) / \sigma_{n}(\Delta_{0})$ under $H_{1}$. The survivor function of the $U(b, a + b)$ censoring distribution is given as

$$G(t) = \begin{cases} 1 & \text{if } t \leq b \\ -t/a + (a + b)/a & \text{if } b < t \leq a + b \\ 0 & \text{if } t > a + b \end{cases}$$

Please note that under $H_{1}$, $n_{1}^{-1} Y_{1}(t)$ and $n_{2}^{-1} Y_{2}(t)$ uniformly converge to $G(t) S_{1}(t)$ and $G(t) S_{21}(t)$, respectively.

Since

$$W(\Delta_{0}) = \Delta_{0} \int_{0}^{\infty} \frac{Y_{2}(t)}{Y_{1}(t) + \Delta_{0} Y_{2}(t)} \, dN_{1}(t) - \int_{0}^{\infty} \frac{Y_{1}(t)}{Y_{1}(t) + \Delta_{0} Y_{2}(t)} \, dN_{2}(t),$$

its variance under $H_{1}$ is

$$\hat{\sigma}_{1}^{2} = \Delta_{0}^{2} \int_{0}^{\infty} \frac{Y_{1}(t) Y_{2}^{2}(t)}{\{Y_{1}(t) + \Delta_{0} Y_{2}(t)\}^{2}} \, d\Lambda_{1}(t) + \int_{0}^{\infty} \frac{Y_{1}^{2}(t) Y_{2}(t)}{\{Y_{1}(t) + \Delta_{0} Y_{2}(t)\}^{2}} \, d\Lambda_{21}(t).$$

Therefore, $n^{-1} \hat{\sigma}_{1}^{2}$ converges to

$$\sigma_{1}^{2} = p_{1} p_{2} \int_{0}^{\infty} \frac{G(t) S_{1}(t) S_{21}(t) \{p_{2} \lambda_{1} \Delta_{0}^{2} S_{21}(t) + p_{1} \lambda_{21} S_{1}(t)\}}{\{p_{1} S_{1}(t) + p_{2} \Delta_{0} S_{21}(t)\}^{2}} \, dt.$$

On the other hand, under $H_{1}$, $n^{-1} \sigma_{n}^{2}(\Delta_{0})$ converges to

$$\sigma_{0}^{2} = \Delta_{0} p_{1} p_{2} \int_{0}^{\infty} \frac{G(t) S_{1}(t) S_{21}(t) \{p_{1} \lambda_{1} S_{1}(t) + p_{2} \lambda_{21} S_{21}(t)\}}{\{p_{1} S_{1}(t) + p_{2} \Delta_{0} S_{21}(t)\}^{2}} \, dt.$$

Also, the expected value of $n^{-1} W(\Delta_{0})$ under $H_{1}$ is given as

$$\omega = p_{1} p_{2} \int_{0}^{\infty} \frac{G(t) S_{1}(t) S_{21}(t) (\lambda_{1} \Delta_{0} - \lambda_{21})}{p_{1} S_{1}(t) + p_{2} \Delta_{0} S_{21}(t)} \, dt.$$

Please note that we derive $\sigma_{0}^{2}$, $\sigma_{1}^{2}$, and $\omega$ using the exact asymptotic results under $H_{1}$, while Jung (2018) derives them approximately under the nearby alternative hypothesis.

Hence, under $H_{1}$, $W(\Delta_{0})$ is approximately normal with mean $n \omega$ and variance $n \sigma_{1}^{2}$, so that given $n$, the power of the test statistic with one-sided $\alpha$ is

$$\begin{aligned} 1 - \beta &\approx P\left( \frac{W(\Delta_{0})}{\sigma_{0} \sqrt{n}} > z_{1-\alpha} \,\Big|\, H_{1} \right) \\ &= P\left( \frac{W(\Delta_{0}) - n \omega}{\sigma_{1} \sqrt{n}} \times \frac{\sigma_{1}}{\sigma_{0}} + \frac{\omega \sqrt{n}}{\sigma_{0}} > z_{1-\alpha} \,\Big|\, H_{1} \right) \\ &= \bar{\Phi}\left( \frac{\sigma_{0}}{\sigma_{1}} z_{1-\alpha} - \frac{\omega \sqrt{n}}{\sigma_{1}} \right). \end{aligned}$$

By solving this equation with respect to $n$, we obtain the sample size required for a power of $1 - \beta$ as

$$n = \frac{(\sigma_{0} z_{1-\alpha} + \sigma_{1} z_{1-\beta})^{2}}{\omega^{2}}$$

References

Freidlin, B.; McShane, L.M.; Korn, E.L. Randomized Clinical Trials with Biomarkers: Design Issues. J. Natl. Cancer Inst. 2010, 102, 152–160.
Freidlin, B.; McShane, L.M.; Polley, M.Y.C.; Korn, E.L. Randomized Phase II Trial Designs with Biomarkers. J. Clin. Oncol. 2012, 30, 3304–3309.
Mandrekar, S.J.; Sargent, D.J. Clinical Trial Designs for Predictive Biomarker Validation: Theoretical Considerations and Practical Challenges. J. Clin. Oncol. 2009, 27, 4027–4034.
Jung, S.H. Phase II Cancer Clinical Trials for Biomarker-Guided Treatments. J. Biopharm. Stat. 2018, 28, 256–263.
Ozasa, H.; Oguri, T.; Uemura, T.; Miyazaki, M.; Maeno, K.; Sato, S.; Ueda, R. Significance of Thymidylate Synthase for Resistance to Pemetrexed in Lung Cancer. Cancer Sci. 2010, 101, 161–166.
Sun, J.M.; Ahn, J.S.; Jung, S.H.; Sun, J.; Ha, S.Y.; Han, J.; Park, K.; Ahn, M.J. Pemetrexed plus Cisplatin versus Gemcitabine plus Cisplatin according to Thymidylate Synthase Expression in Nonsquamous Non–Small-Cell Lung Cancer: A Biomarker-Stratified Randomized Phase II Trial. J. Clin. Oncol. 2015, 33, 2450–2456.
Cox, D.R. Regression Models and Life Tables (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 1972, 34, 187–220.
Fleming, T.R.; Harrington, D.P. Counting Processes and Survival Analysis; Wiley: New York, NY, USA, 1991.
Sun, J.M.; Han, J.; Ahn, J.S.; Park, K.; Ahn, M.J. Significance of Thymidylate Synthase and Thyroid Transcription Factor 1 Expression in Patients with Nonsquamous Non–Small Cell Lung Cancer Treated with Pemetrexed-Based Chemotherapy. J. Thorac. Oncol. 2011, 6, 1392–1399.
Yateman, N.A.; Skene, A.M. Sample Size for Proportional Hazards Survival Studies with Arbitrary Patient Entry and Loss to Follow-up Distributions. Stat. Med. 1992, 11, 1103–1113.
Maitournam, A.; Simon, R. On the Efficiency of Targeted Clinical Trials. Stat. Med. 2005, 24, 329–339.
Aalen, O.O. Nonparametric Inference for a Family of Counting Processes. Ann. Stat. 1978, 6, 701–726.
Nelson, W. Hazard Plotting for Incomplete Failure Data. J. Qual. Technol. 1969, 1, 27–52.
Jung, S.H.; Kang, S.J.; McCall, L.; Blumenstein, B. Sample Size Computation for Noninferiority Log-Rank Test. J. Biopharm. Stat. 2005, 15, 957–967.
Peto, R.; Peto, J. Asymptotically Efficient Rank Invariant Test Procedures (with discussion). J. R. Stat. Soc. Ser. A Stat. Soc. 1972, 135, 185–206.
Jung, S.H.; Chow, S.C. On Sample Size Calculation for Comparing Survival Curves under General Hypotheses Testing. J. Biopharm. Stat. 2012, 22, 485–495.
Straus, D.J.; Jung, S.H.; Pitcher, B.; Kostakoglu, L.; Grecula, J.; Hsi, E.; Schöder, H.; Popplewell, L.; Chang, J.; Moskowitz, C.; et al. CALGB 50604: Risk-Adapted Treatment of Non-Bulky Early Stage Hodgkin Lymphoma based on Interim PET. Blood 2018, 132, 1013–1021.
Jung, S.H. Randomized Phase II Cancer Clinical Trials; Chapman & Hall/CRC: New York, NY, USA, 2013.
Simon, R. Optimal Two-Stage Designs for Phase II Clinical Trials. Control. Clin. Trials 1989, 10, 1–10.
Jung, S.H.; Sargent, D.J. Randomized Phase II Clinical Trials. J. Biopharm. Stat. 2014, 24, 802–816.
Simon, R.; Maitournam, A. Evaluating the Efficiency of Targeted Designs for Randomized Clinical Trials. Clin. Cancer Res. 2004, 10, 6759–6763.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
