This trial is an early-phase study evaluating the safety and efficacy of the combination of modulated dose chemotherapy and Pembrolizumab, with or without a second immunomodulatory agent, as neoadjuvant therapy for patients with stage IB-IIIA surgically resectable NSCLC in two cohorts. The cohorts are defined by histology: adenocarcinoma (Cohort A) and squamous cell carcinoma (Cohort B). Standard histology-based chemotherapy regimens differ between the two cohorts, and it is not known whether one cohort is expected to have systematically greater or lesser toxicity than the other. Thus, the cohorts are considered independently. Patients will receive 4 cycles of neoadjuvant combination therapy followed by surgical resection, with a primary objective of determining the optimal dose combination (ODC). The ODC will incorporate both safety and efficacy and will be defined as the combination with the highest response rate among combinations with an acceptable level of toxicity. The primary outcomes guiding accrual decisions are the frequency of treatment-related dose-limiting toxicities (DLTs) and the frequency of pathologic response, assessed between 12 and 28 weeks from the start of treatment.
2.2. Bayesian Dose-Finding Design
The intention of this design is to determine the cohort-specific ODC, where treatment combination allocation is based on a Bayesian continual reassessment method accounting for both toxicity and efficacy [9]. The study is designed to accrue eligible participants using cohorts of size one. Allocation to treatment combinations is implemented for each patient cohort independently, and the process is the same in both cohorts. With regard to safety, it is assumed that increasing the dose level of one agent while holding the other agent fixed will result in an increased probability of DLT. Using this assumption, modeling incorporates a set of four possible orderings for DLT probabilities among the treatment combinations in Table 2 and a working model for DLT probabilities corresponding to the four possible orders in Table 3. This process is considered separately for each of the two patient cohorts.
The continual reassessment method (CRM) is fit for toxicity within each ordering using the working model and the accumulated data. For each working model in each cohort, $m = 1, \dots, 4$ in Table 3, the DLT probabilities are modeled using a class of one-parameter power models $\psi_m(d_j, \theta_c) = \alpha_{mj}^{\exp(\theta_c)}$, where the $\alpha_{mj}$ are the working model values for order $m$ given in Table 3, $j$ indexes the dose combination and $c$ indexes the cohort.
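As a small illustration of the one-parameter power model, the sketch below assumes the common CRM parametrization in which each working-model value is raised to the power exp(θ). The function name and the skeleton values are hypothetical and are not taken from Table 3.

```python
import math

def power_model_dlt_probs(skeleton, theta):
    """DLT probabilities under one working model: skeleton_j ** exp(theta)."""
    exponent = math.exp(theta)
    return [a ** exponent for a in skeleton]

# Hypothetical working-model values for four combinations (not the trial's Table 3).
skeleton = [0.10, 0.20, 0.30, 0.45]

at_zero = power_model_dlt_probs(skeleton, 0.0)   # theta = 0 gives back the skeleton
shifted = power_model_dlt_probs(skeleton, 0.5)   # theta > 0 lowers every probability
print(at_zero)
print(shifted)
```

A single parameter therefore shifts the whole curve up or down while preserving the ordering encoded in the working model.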
DLT probability estimation embodies characteristics of the CRM [10], so we use its features to specify design parameters. The skeleton values for toxicity were selected using the algorithm of Lee and Cheung [11], with recommended specifications that yield good operating characteristics. CRM designs have been shown to be robust and efficient with the use of “reasonable” skeletons, where adjacent values have adequate spacing. The algorithm is available as a function, getprior, within the R [12] package dfcrm [13] and requires a spacing measure $\delta$ to generate reasonable spacing between adjacent combinations in the skeleton. Simulation results in Lee and Cheung [11] indicate that the optimal range of $\delta$ is [0.04, 0.10] for common target toxicity rates (i.e., 0.20–0.33). The value of $\delta$ used here lies in the optimal range and provides a set of reasonably spaced skeleton values. The skeletons should represent the various possible orderings of regimen–toxicity curves, according to the toxicity assumptions displayed in Table 2. The class of skeletons in Table 3 was generated using the algorithm, and the locations of these values were adjusted to correspond to the four orderings in Table 2 using the getwm function in the R package pocrm [14].
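The skeleton construction can be sketched in Python as follows. This is an approximation of the indifference-interval recursion for the power model as we understand it, not the dfcrm or pocrm source code. The spacing value 0.05, the prior guess of the MTD position, the number of combinations, and the example ordering are all assumptions chosen for illustration.

```python
import math

def lee_cheung_skeleton(halfwidth, target, prior_mtd, n_levels):
    """Skeleton for the power model; prior_mtd is the 1-based prior MTD position."""
    sk = [None] * n_levels
    nu = prior_mtd - 1
    sk[nu] = target
    # Work downward from the prior MTD guess.
    for k in range(nu, 0, -1):
        ratio = math.log(target + halfwidth) / math.log(sk[k])
        sk[k - 1] = (target - halfwidth) ** (1.0 / ratio)
    # Work upward from the prior MTD guess.
    for k in range(nu, n_levels - 1):
        ratio = math.log(target - halfwidth) / math.log(sk[k])
        sk[k + 1] = (target + halfwidth) ** (1.0 / ratio)
    return sk

def working_model(order, skeleton):
    """Place skeleton values on combinations; order lists 0-based combination
    indices from least to most toxic under one assumed ordering."""
    wm = [None] * len(order)
    for rank, combo in enumerate(order):
        wm[combo] = skeleton[rank]
    return wm

# Assumed inputs: spacing 0.05, target DLT rate 0.30, four combinations.
skeleton = lee_cheung_skeleton(0.05, 0.30, prior_mtd=2, n_levels=4)
# Hypothetical ordering in which combinations 1 and 2 swap toxicity ranks.
wm_swapped = working_model([0, 2, 1, 3], skeleton)
print([round(v, 3) for v in skeleton])
print([round(v, 3) for v in wm_swapped])
```

Each assumed ordering reuses the same skeleton values and only changes which combination receives which value.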
The prior distribution on the parameter $\theta_c$ for all working models is given by $\theta_c \sim N(0, 0.48^2)$, a normal distribution with mean 0 and standard deviation 0.48. The standard deviation for the prior distribution was chosen according to Algorithm 9.1 in Cheung [
15] using values of
and a grid width of 0.03. According to Cheung [
15], there are two practical advantages to choosing a normal distribution in this setting. First, posterior computations using Gauss–Hermite quadrature [16] under the above parametrization are accurate. Second, the Bayesian CRM utilizing a class of one-parameter models that includes the power model is invariant to the mean of a prior that forms a location-scale family. This property allows the prior mean to be set to zero and the prior to be completely specified by its standard deviation, simplifying the process of calibration. A uniform prior distribution,
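$p(m) = 1/4$ for $m = 1, \dots, 4$, is placed on each working model for each cohort so that all working models are considered equally likely a priori. Based on the observed toxicity data $D_c = \{(y_{jc}, n_{jc})\}$, where $y_{jc}$ is the number of DLTs, $n_{jc}$ is the number of subjects treated on combination $j$, and $c$ specifies the cohort, the likelihood for ordering $m$ is given by
$$L_{mc}(\theta_c \mid D_c) \propto \prod_{j} \left\{\alpha_{mj}^{\exp(\theta_c)}\right\}^{y_{jc}} \left\{1 - \alpha_{mj}^{\exp(\theta_c)}\right\}^{n_{jc} - y_{jc}},$$
where $\alpha_{mj}$ is the working model value for combination $j$ under ordering $m$ and $\theta_c$ is the model parameter. Using Bayes' theorem, the posterior probability for each working model given the data can then be calculated as
$$p(m \mid D_c) = \frac{p(m) \int L_{mc}(\theta_c \mid D_c)\, g(\theta_c)\, d\theta_c}{\sum_{m'=1}^{4} p(m') \int L_{m'c}(\theta_c \mid D_c)\, g(\theta_c)\, d\theta_c},$$
where $g$ denotes the prior density of $\theta_c$.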
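After accrual of each participant into the trial, the model associated with the largest posterior probability, denoted $m^*$, is selected and the DLT probability estimates, $\hat{\psi}_c(d_j)$, are updated under the chosen working model using the Bayesian form of the CRM [9] so that $\hat{\psi}_c(d_j) = \alpha_{m^* j}^{\exp(\hat{\theta}_c)}$, where $\hat{\theta}_c = \int \theta_c\, L_{m^* c}(\theta_c \mid D_c)\, g(\theta_c)\, d\theta_c \big/ \int L_{m^* c}(\theta_c \mid D_c)\, g(\theta_c)\, d\theta_c$ is the posterior mean of the model parameter, $L_{m^* c}$ is the likelihood under $m^*$, and $g$ is the prior density.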
If a tie occurs between the posterior model probabilities of two or more models, then the selected model would be randomly chosen from among the tied models. The estimated DLT probabilities are used to define a set of “acceptable” combinations with regard to safety. The maximum tolerated dose combination (MTDC) is defined as the combination with estimated DLT probability closest to the maximum allowable DLT rate of 30%. Any combination with estimated DLT rate less than or equal to that of the MTDC would be considered acceptable in terms of safety.
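The toxicity update described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it uses a plain trapezoid rule in place of Gauss–Hermite quadrature, a plug-in posterior mean for the DLT estimates, and resolves ties by taking the first model rather than randomizing. The working models and data are hypothetical.

```python
import math

def normal_pdf(theta, sd):
    return math.exp(-0.5 * (theta / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def likelihood(theta, wm, dlts, treated):
    """Binomial likelihood under the power model wm_j ** exp(theta)."""
    e = math.exp(theta)
    val = 1.0
    for a, y, n in zip(wm, dlts, treated):
        p = a ** e
        val *= p ** y * (1.0 - p) ** (n - y)
    return val

def integrate(f, lo=-4.0, hi=4.0, steps=2000):
    """Composite trapezoid rule (a stand-in for Gauss-Hermite quadrature)."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def toxicity_update(working_models, dlts, treated, prior_sd=0.48, target=0.30):
    marginals, theta_hats = [], []
    for wm in working_models:
        def post(t, wm=wm):
            return likelihood(t, wm, dlts, treated) * normal_pdf(t, prior_sd)
        m0 = integrate(post)
        m1 = integrate(lambda t: t * post(t))
        marginals.append(m0)
        theta_hats.append(m1 / m0)          # posterior mean of theta
    total = sum(marginals)                  # uniform model prior cancels out
    model_probs = [v / total for v in marginals]
    # Ties: this sketch keeps the first model; the design randomizes among ties.
    best = max(range(len(model_probs)), key=lambda i: model_probs[i])
    est = [a ** math.exp(theta_hats[best]) for a in working_models[best]]
    mtdc = min(range(len(est)), key=lambda j: abs(est[j] - target))
    acceptable = [j for j, p in enumerate(est) if p <= est[mtdc]]
    return model_probs, best, est, mtdc, acceptable

# Hypothetical working models and data for four combinations.
wms = [[0.10, 0.20, 0.30, 0.45], [0.10, 0.30, 0.20, 0.45]]
probs, best, est, mtdc, acceptable = toxicity_update(wms, [0, 0, 1, 0], [1, 2, 3, 0])
print(probs, best, [round(p, 3) for p in est], mtdc, acceptable)
```

The returned acceptable set is what the efficacy-based allocation step draws from.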
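The probability of response $\pi_{jc}$ at combination $j$ in cohort $c$ is modeled using a beta-binomial model, $z_{jc} \mid \pi_{jc} \sim \mathrm{Binomial}(n_{jc}, \pi_{jc})$ with $\pi_{jc} \sim \mathrm{Beta}(a, b)$, where $\mathrm{Beta}(a, b)$ is a beta distribution with parameters $a$ and $b$. Based on the number of responses $z_{jc}$ and the number of treated participants $n_{jc}$ on combination $j$ in cohort $c$, the posterior distribution of $\pi_{jc}$ follows a beta distribution so that $\pi_{jc} \mid z_{jc}, n_{jc} \sim \mathrm{Beta}(a + z_{jc},\, b + n_{jc} - z_{jc})$. Using a non-informative beta prior distribution in each cohort, the probabilities of pathologic response for each combination are estimated based on the posterior mean $\hat{\pi}_{jc} = (a + z_{jc})/(a + b + n_{jc})$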
, separately for each cohort. Once the set of acceptable combinations is determined in each cohort, the recommended combination varies depending on how many participants have entered the study to that point. For the first third of the trial (1/3 the maximum sample size), the combination recommendation in each cohort is based on randomization using a weighted allocation scheme. The recommended combination for the next entered participant is chosen at random from the set of acceptable combinations, with each acceptable combination weighted by its estimated response probability. Based on the estimates
$\hat{\pi}_{jc}$, we calculate the randomization probability $w_{jc} = \hat{\pi}_{jc} \big/ \sum_{k \in \mathcal{A}_c} \hat{\pi}_{kc}$, where $\mathcal{A}_c$ denotes the set of acceptable combinations in cohort $c$, and randomize the next participant in cohort $c$ to an acceptable combination $j$ with probability $w_{jc}$. This approach allows acceptable combinations with higher estimated response probabilities to have a higher chance of being randomly chosen as the next recommended combination. For the latter two-thirds of the trial (final 2/3 of maximum sample size), the recommended combination for the next entered participant is defined as the acceptable combination with the highest estimated response probability, so that the next participant is assigned the combination $j^*$ satisfying $\hat{\pi}_{j^* c} = \max_{j \in \mathcal{A}_c} \hat{\pi}_{jc}$. As each participant enters the study, a new recommended combination is obtained, and the next entered participant is allocated to the updated recommended combination. The trial is designed to stop once sufficient information about the optimal combination in each cohort is obtained, according to the stopping rules defined in the following section.
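The efficacy-based allocation rule can be sketched in Python as follows. The prior parameters a = b = 1 are an assumption made for illustration, as are the function names, the example counts, and the maximum sample size. Ties in the second stage go to the first combination in this sketch.

```python
import random

def response_posterior_means(responses, treated, a=1.0, b=1.0):
    """Beta posterior mean (a + z) / (a + b + n) for each combination.
    a = b = 1 is an assumed prior, not taken from the trial design."""
    return [(a + z) / (a + b + n) for z, n in zip(responses, treated)]

def randomization_probs(post_means, acceptable):
    """Weight each acceptable combination by its estimated response rate."""
    total = sum(post_means[j] for j in acceptable)
    return {j: post_means[j] / total for j in acceptable}

def recommend(post_means, acceptable, n_enrolled, max_n, rng=random):
    """Next combination: weighted randomization during the first third of
    accrual, then the acceptable combination with the highest estimate."""
    if n_enrolled < max_n / 3:
        probs = randomization_probs(post_means, acceptable)
        combos = list(probs)
        weights = [probs[j] for j in combos]
        return rng.choices(combos, weights=weights, k=1)[0]
    return max(acceptable, key=lambda j: post_means[j])

# Hypothetical response counts on three combinations.
means = response_posterior_means([1, 0, 2], [2, 1, 3])
weights = randomization_probs(means, [0, 1])
early = recommend(means, [0, 1], n_enrolled=5, max_n=30, rng=random.Random(1))
late = recommend(means, [0, 1, 2], n_enrolled=20, max_n=30)
print(means, weights, early, late)
```

Randomizing early spreads participants across acceptable combinations before the estimates are stable; the later stage then concentrates allocation on the best-performing acceptable combination.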