Bayesian Design for Identifying Cohort-Specific Optimal Dose Combinations Based on Multiple Endpoints: Application to a Phase I Trial in Non-Small Cell Lung Cancer

Immunotherapy and chemotherapy combinations have proven to be a safe and efficacious treatment approach in multiple settings. However, it is not clear whether approved doses of chemotherapy developed to achieve a maximum tolerated dose are the ideal dose when combining cytotoxic chemotherapy with immunotherapy to induce immune responses. This trial of a modulated dose chemotherapy and Pembrolizumab, with or without a second immunomodulatory agent, uses a Bayesian design to select the optimal treatment combination by balancing both safety and efficacy of the chemotherapy and immunotherapy agents within each of two cohorts. The simulation study provides evidence that the proposed Bayesian design successfully addresses the primary study aim to identify the optimal dose combination for each of the two independent patient cohorts. This conclusion is supported by the high percentage of simulated trials which select a treatment combination that is both safe and highly efficacious. The proposed trial was funded and was being finalized when the sponsoring company decided not to proceed due to negative findings in another patient population. The proposed trial design will continue to be relevant as multiple chemotherapy and immunotherapy combinations become the standard of care and future research will require evaluating the appropriate doses of various components of multiple drug regimens.


Introduction
The global incidence of lung cancer was 2.2 million in 2020, resulting in an estimated 1.7 million deaths [1]. In the United States, the 2021 estimated incidence of new diagnoses was 235,760 and the estimated number of deaths was 131,880. Lung cancer represents 12.4% of all new cancer cases in the US and remains the leading cause of cancer death for both men and women. Only 18% of patients are diagnosed with localized disease and an additional 22% are diagnosed with regional disease, with the remainder having distant spread at the time of diagnosis. The 5-year survival for patients with localized and regional disease is 59.8% and 32.9%, respectively [2].
Immune checkpoint inhibitors (ICIs), such as those that inhibit programmed cell death protein 1 (PD-1) or its ligand (PD-L1), have been approved as first- and second-line treatments for non-small cell lung cancer (NSCLC) and are currently being evaluated in the neoadjuvant and adjuvant settings. Pembrolizumab, a PD-1 inhibitor, is commonly used to treat patients with advanced NSCLC, either alone or in combination with chemotherapy. These agents have improved overall survival and can result in durable disease control and meaningful increases in long-term survival. Although treatment with anti-PD-1 or anti-PD-L1 antibodies can induce clinical responses in the setting of many advanced cancers, these ICIs fail to induce durable responses in a large proportion of patients. Thus, there remains a critical need to identify combinatorial approaches to augment anti-PD-1 responses and overcome immune resistance mechanisms, including those within the tumor microenvironment. Various agents have demonstrated potential to synergize with ICIs to enhance an immune response, including agents that target indoleamine 2,3-dioxygenase (IDO), vascular endothelial growth factor (VEGF), histone deacetylase (HDAC), or poly-ADP-ribose polymerase (PARP), among others. In addition, combinations with other checkpoint inhibitors, such as those that target CTLA4, TIGIT, and LAG-3, have shown promise. In advanced NSCLC, a combination of PD-1 and CTLA4 inhibitors has been shown to improve overall survival compared to chemotherapy alone, and this strategy is now FDA approved [3,4].
Immunotherapy and chemotherapy combinations have proven to be a safe and efficacious treatment approach in multiple settings and there is potential to further elucidate a synergistic relationship between these modalities [5][6][7]. It is not clear whether approved doses of chemotherapy, which were developed to achieve a maximum tolerated dose (MTD), are the ideal dose when combining cytotoxic chemotherapy with immunotherapy to induce immune responses. Lower doses of chemotherapy may maximize this synergistic effect and allow for a combination with less toxicity. Further exploring the role of immunotherapy combinations with chemotherapy offers even more potential to improve response rates and survival in a disease with significant morbidity and mortality. The Checkmate 9LA study evaluated a combination of ipilimumab, a CTLA4 inhibitor, with nivolumab, a PD-1 inhibitor, with only two cycles of chemotherapy rather than the standard four cycles. This trial showed superior overall survival with this combination compared to four cycles of chemotherapy alone. However, the doses of chemotherapy given for those two cycles were at the standard dose [3].
The clinical question of interest for this study of patients with NSCLC is to explore potential benefits of lower doses of chemotherapy as well as adding a second immunotherapy agent to the commonly used standard of care regimen of platinum-doublet chemotherapy and pembrolizumab. To address this clinical question, the trial was proposed with a Bayesian design to select the optimal treatment combination by balancing both safety and efficacy of the chemotherapy and immunotherapy agents within each of two cohorts. For an overview of Bayesian design of adaptive clinical trials, we refer the reader to Giovagnoli [8] and the references therein. There are no existing dose-finding methods available to address the multitude of challenges presented by research objectives of this study. Our team adapted relevant components of existing methods to develop an appropriate and flexible design strategy. There is an increased demand to tailor early-phase clinical trial designs to the trial's research objectives in order to treat study participants as efficiently as possible rather than reorienting the objectives to apply an "off-the-shelf" method, potentially missing the opportunity to answer promising and relevant research questions. Details of the design are provided in Section 2, followed by simulation results in Section 3. Discussion and conclusion follow in Sections 4 and 5, respectively.

Methods
This trial is an early-phase study evaluating the safety and efficacy of the combination of modulated dose chemotherapy and Pembrolizumab, with or without a second immunomodulatory agent, as neoadjuvant therapy for patients with stage IB-IIIA surgically resectable NSCLC in two cohorts. The patient cohorts are defined by histology: adenocarcinoma (Cohort A) and squamous cell carcinoma (Cohort B). Standard histology-based chemotherapy regimens vary for the two patient cohorts, and it is not known whether one cohort is expected to have systematically greater or lesser toxicity than the other cohort. Thus, the cohorts are considered independently. Patients will receive 4 cycles of neoadjuvant combination therapy followed by surgical resection, with a primary objective of determining the optimal dose combination (ODC). The ODC will incorporate both safety and efficacy and will be defined as the combination with the highest response rate among combinations with an acceptable level of toxicity. The primary outcomes guiding accrual decisions include the frequency of treatment-related dose-limiting toxicities (DLTs) and the frequency of pathologic response, assessed between 12 and 28 weeks from the start of treatment.

Treatment Combinations by Patient Cohort
It is anticipated that 65% of the participant population will be from Cohort A and 35% from Cohort B based on prevalence of squamous and non-squamous histology. Treatment details are provided in Table 1, where treatment combinations are labeled as Arms A1 through A6 for Cohort A and Arms B1 through B6 for Cohort B. The ODC in each cohort is the combination that is estimated to have an acceptable toxicity profile, as measured by DLTs, and a good response profile, as measured by pathologic response. Adverse events are assessed and graded using the National Cancer Institute's Common Terminology Criteria for Adverse Events (CTCAE). As data accumulate, each evaluable participant is classified as experiencing a DLT (yes/no) and experiencing a response (yes/no). Based on the expectedness of adverse events, the maximum allowable DLT rate is 30%. Any combination with an estimated DLT probability ≤ 30% is considered "acceptable" in terms of safety.

Bayesian Dose-Finding Design
The intention of this design is to determine the cohort-specific ODC, where treatment combination allocation is based on a Bayesian continual reassessment method accounting for both toxicity and efficacy [9]. The study is designed to accrue eligible participants using cohorts of size one. Allocation to treatment combinations is implemented for each patient cohort independently, and the process is the same in both cohorts. With regard to safety, it is assumed that increasing the dose level of one agent while holding the other agent fixed will result in an increased probability of DLT. Using this assumption, modeling incorporates a set of four possible orderings for DLT probabilities among the treatment combinations in Table 2 and a working model for DLT probabilities corresponding to the four possible orders in Table 3. This process is considered separately for each of the two patient cohorts. The continual reassessment method (CRM) is fit for toxicity within each ordering using the working model and the accumulated data. For each working model in each cohort, $m = 1, \ldots, 4$ in Table 3, the DLT probabilities are modeled using a class of one-parameter power models

$$\Pr(\text{DLT at combination } i) \approx p_{mci}^{\exp(\theta_{mc})},$$

where the $p_{mci}$ are the working model values for order $m$ given in Table 3, $i$ indexes the dose combination, and $c$ indexes the cohort.
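To make the power model concrete, the following minimal Python sketch evaluates the DLT probabilities implied by a skeleton for a few values of θ. The skeleton values below are hypothetical placeholders chosen only for illustration; they are not the working model values from Table 3, and the function name is an assumption of this sketch.

```python
import math

def power_model_dlt(skeleton, theta):
    """DLT probabilities under the one-parameter power model p_i ** exp(theta)."""
    return [p ** math.exp(theta) for p in skeleton]

# Hypothetical skeleton for one ordering (NOT the values from Table 3).
skeleton = [0.05, 0.10, 0.18, 0.30, 0.42, 0.55]

# theta = 0 reproduces the skeleton; theta > 0 lowers every probability,
# theta < 0 raises every probability, and the ordering is preserved.
for theta in (-0.5, 0.0, 0.5):
    probs = power_model_dlt(skeleton, theta)
    print(theta, [round(p, 3) for p in probs])
```

Because a single parameter shifts the whole curve while preserving the ordering of the skeleton, each candidate ordering needs its own skeleton, which is why the design carries one working model per ordering.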

DLT probability estimation embodies characteristics of the continual reassessment method (CRM) [10], so we use its features to specify design parameters. The skeleton values for toxicity were selected according to the algorithm of Lee and Cheung [11], using recommended specifications that yield good operating characteristics. CRM designs have been shown to be robust and efficient with the use of "reasonable" skeletons, where adjacent values have adequate spacing. The algorithm is available as a function, getprior, within the R [12] package dfcrm [13] and requires a spacing measure ρ to generate reasonable spacing between adjacent combinations in the skeleton. Simulation results in Lee and Cheung [11] indicate that the optimal range of ρ is [0.04, 0.10] for common target toxicity rates (i.e., 0.20-0.33). The value ρ = 0.04 lies in the optimal range and provides a set of reasonably spaced skeleton values. The skeletons should represent the various possible orderings of regimen-toxicity curves, according to the toxicity assumptions displayed in Table 2. The class of skeletons in Table 3 was generated using the algorithm, and the locations of these values were adjusted to correspond to the four orderings in Table 2 using the getwm function in the R package pocrm [14].
The prior distribution on the parameter θ for all working models is given by g(θ) = N(0, 0.48), a normal distribution with mean 0 and standard deviation 0.48. The standard deviation for the prior distribution was chosen according to Algorithm 9.1 in Cheung [15] using values of $\sigma_{\theta}^{LI} = 0.75$, $\lambda_1 = 0.6$, $\lambda_2 = 1.4$ and a grid width of 0.03. According to Cheung [15], there are two practical advantages to choosing a normal distribution in this setting. First, posterior computations using Gauss-Hermite quadrature [16] under the above parametrization are accurate. Second, a Bayesian CRM utilizing a class of one-parameter models that includes the power model is invariant to the mean of a prior that forms a location-scale family. This property allows the prior mean to be zero and the prior to be completely specified by its standard deviation, simplifying the process of calibration. A uniform prior distribution, $\tau(m) = 1/M$ with $M = 4$ working models, is placed on the working models for each cohort so that all working models are considered equally likely a priori. The observed toxicity data are $D_c = \{(y_{ci}, n_{ci});\ i = 1, \ldots, 6\}$, where $y_{ci}$ is the number of DLTs, $n_{ci}$ is the number of subjects treated on combination $i$, and $c$ specifies the cohort. Based on these data, the likelihood for ordering $m$ is given by

$$L_{mc}(\theta \mid D_c) \propto \prod_{i=1}^{6} \left\{ p_{mci}^{\exp(\theta)} \right\}^{y_{ci}} \left\{ 1 - p_{mci}^{\exp(\theta)} \right\}^{n_{ci} - y_{ci}}.$$

Using Bayes' theorem, the posterior probability for each working model given the data can then be calculated as

$$\Pr(m \mid D_c) = \frac{\tau(m) \int L_{mc}(\theta \mid D_c)\, g(\theta)\, d\theta}{\sum_{k=1}^{4} \tau(k) \int L_{kc}(\theta \mid D_c)\, g(\theta)\, d\theta}.$$

After accrual of each participant into the trial, the model associated with the largest posterior probability, denoted $m^*$, is selected and the DLT probability estimates, $\hat{\pi}_{ci}$, are updated under the chosen working model using the Bayesian form of the CRM [9] so that

$$\hat{\pi}_{ci} = p_{m^* ci}^{\exp(\hat{\theta}_{m^* c})}, \qquad \hat{\theta}_{m^* c} = \frac{\int \theta\, L_{m^* c}(\theta \mid D_c)\, g(\theta)\, d\theta}{\int L_{m^* c}(\theta \mid D_c)\, g(\theta)\, d\theta}.$$

If a tie occurs between the posterior model probabilities of two or more models, then the selected model is chosen at random from among the tied models. The estimated DLT probabilities are used to define a set of "acceptable" combinations with regard to safety.
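The model-selection step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the two skeletons are hypothetical (not the Table 3 values), integrals are approximated with a simple midpoint rule rather than Gauss-Hermite quadrature, and the DLT estimate plugs the posterior mean of θ into the power model. All function names are introduced here for illustration.

```python
import math
import random

PRIOR_SD = 0.48  # prior standard deviation for theta, as stated in the text

def log_likelihood(theta, skeleton, y, n):
    """Binomial log-likelihood (up to a constant) under the power model."""
    ll = 0.0
    for p, yi, ni in zip(skeleton, y, n):
        pi = p ** math.exp(theta)
        ll += yi * math.log(pi) + (ni - yi) * math.log(1.0 - pi)
    return ll

def integrate_model(skeleton, y, n, lo=-4.0, hi=4.0, steps=2000):
    """Return (marginal likelihood, posterior mean of theta) via the midpoint rule."""
    h = (hi - lo) / steps
    norm = PRIOR_SD * math.sqrt(2.0 * math.pi)
    marginal, first_moment = 0.0, 0.0
    for k in range(steps):
        t = lo + (k + 0.5) * h
        prior = math.exp(-0.5 * (t / PRIOR_SD) ** 2) / norm
        w = math.exp(log_likelihood(t, skeleton, y, n)) * prior * h
        marginal += w
        first_moment += t * w
    return marginal, first_moment / marginal

def select_model_and_estimate(skeletons, y, n, rng=random):
    """Posterior model probabilities, chosen model index, and DLT estimates."""
    results = [integrate_model(s, y, n) for s in skeletons]
    total = sum(m for m, _ in results)  # the uniform model prior cancels
    post = [m / total for m, _ in results]
    best = max(post)
    ties = [k for k, w in enumerate(post) if math.isclose(w, best)]
    m_star = rng.choice(ties)  # ties are broken at random
    theta_hat = results[m_star][1]
    pi_hat = [p ** math.exp(theta_hat) for p in skeletons[m_star]]
    return post, m_star, pi_hat

# Hypothetical skeletons for two orderings (NOT the values from Table 3).
skeletons = [
    [0.05, 0.10, 0.18, 0.30, 0.42, 0.55],
    [0.05, 0.18, 0.10, 0.42, 0.30, 0.55],
]
y = [0, 2, 0, 0, 0, 0]  # DLTs observed per combination (made-up data)
n = [3, 3, 3, 0, 0, 0]  # participants treated per combination (made-up data)
post, m_star, pi_hat = select_model_and_estimate(skeletons, y, n)
print([round(w, 3) for w in post], m_star, [round(p, 3) for p in pi_hat])
```

In this made-up data set, the DLTs cluster at the second combination, so the ordering that places the second combination above the third receives the larger posterior probability.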
The maximum tolerated dose combination (MTDC) is defined as the combination with estimated DLT probability closest to the maximum allowable DLT rate of 30%. Any combination with estimated DLT rate less than or equal to that of the MTDC would be considered acceptable in terms of safety.
The probability of response $\delta_{ci}$ at combination $i$ in cohort $c$ is modeled using a beta-binomial model

$$z_{ci} \mid \delta_{ci} \sim \text{Binomial}(n_{ci}, \delta_{ci}), \qquad \delta_{ci} \sim \text{Beta}(\tau_{ci}, \nu_{ci}),$$

where $\text{Beta}(\tau_{ci}, \nu_{ci})$ is a beta distribution with parameters $\tau_{ci}$ and $\nu_{ci}$. Based on the number of responses $z_{ci}$ and the number of treated participants $n_{ci}$ on combination $i$ in cohort $c$, the posterior distribution of $\delta_{ci}$ follows a beta distribution so that

$$\delta_{ci} \mid (z_{ci}, n_{ci}) \sim \text{Beta}(\tau_{ci} + z_{ci},\ \nu_{ci} + n_{ci} - z_{ci}). \qquad (5)$$

Using a non-informative Beta(0.5, 0.5) prior distribution in each cohort, the probabilities of pathologic response for each combination are estimated by the posterior mean $\hat{\delta}_{ci} = (z_{ci} + 0.5)/(n_{ci} + 1)$, separately for each cohort.
Once the set of acceptable combinations is determined in each cohort, the recommended combination depends on how many participants have entered the study to that point. For the first third of the trial (1/3 the maximum sample size), the combination recommendation in each cohort is based on randomization using a weighted allocation scheme. The recommended combination for the next entered participant is chosen at random from the set of acceptable combinations, with each acceptable combination weighted by its estimated response probability. Based on the estimates $\hat{\delta}_{ci}$, we calculate the randomization probability

$$R_{ci} = \frac{\hat{\delta}_{ci}}{\sum_{j \in \mathcal{A}_c} \hat{\delta}_{cj}}, \qquad i \in \mathcal{A}_c,$$

where $\mathcal{A}_c$ denotes the set of acceptable combinations in cohort $c$, and randomize the next participant in cohort $c$ to an acceptable combination $i$ with probability $R_{ci}$. This approach allows acceptable combinations with higher estimated response probabilities to have a higher chance of being randomly chosen as the next recommended combination.
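The safety set, the response estimates, and the randomization probabilities R ci can be sketched together in a few lines of Python. This is a minimal illustration with made-up inputs; the function names are hypothetical, and the acceptable set follows the MTDC-based definition given above, so it can include a combination whose estimate is slightly above 30% when that combination is the closest to the target.

```python
def acceptable_set(pi_hat, target=0.30):
    """Indices whose estimated DLT probability does not exceed that of the MTDC."""
    # MTDC: combination with estimated DLT probability closest to the target.
    mtdc = min(range(len(pi_hat)), key=lambda i: abs(pi_hat[i] - target))
    return [i for i, p in enumerate(pi_hat) if p <= pi_hat[mtdc]]

def response_estimates(z, n, a=0.5, b=0.5):
    """Posterior means of the response probabilities under a Beta(a, b) prior."""
    return [(zi + a) / (ni + a + b) for zi, ni in zip(z, n)]

def randomization_probs(delta_hat, acceptable):
    """Weight each acceptable combination by its estimated response probability."""
    total = sum(delta_hat[i] for i in acceptable)
    return {i: delta_hat[i] / total for i in acceptable}

# Made-up inputs for one cohort with six combinations.
pi_hat = [0.08, 0.15, 0.22, 0.33, 0.41, 0.52]  # estimated DLT probabilities
z = [1, 2, 0, 1, 0, 0]                          # responses per combination
n = [3, 4, 2, 2, 0, 0]                          # participants per combination

acc = acceptable_set(pi_hat)
delta_hat = response_estimates(z, n)
probs = randomization_probs(delta_hat, acc)
print(acc)
print({i: round(r, 3) for i, r in probs.items()})
```

Note that an untried combination has a posterior mean of 0.5 under the Beta(0.5, 0.5) prior, so untried acceptable combinations retain a substantial chance of being selected early in the trial.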
For the latter two-thirds of the trial (final 2/3 of maximum sample size), the recommended combination for the next entered participant is defined as the acceptable combination with the highest estimated response probability, so that the next participant in cohort $c$ is assigned the combination $i^* = \arg\max_i \hat{\delta}_{ci}$, with the maximum taken over the acceptable combinations. As each participant enters the study, a new recommended combination is obtained, and the next entered participant would be allocated to the updated recommended combination. The trial is designed to stop once sufficient information about the optimal combination in each cohort is obtained, according to the stopping rules defined in the following section.
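The two-stage allocation rule can be illustrated with the short Python sketch below. It assumes that "the first third" means fewer than one third of the cohort's maximum sample size has been enrolled, and it breaks ties in the maximization by taking the first maximizer; neither detail is specified in the text, and the function name is hypothetical.

```python
import random

def next_combination(delta_hat, acceptable, n_enrolled, max_n, rng=random):
    """Recommend the combination for the next participant in one cohort."""
    if not acceptable:
        return None  # empty acceptable set: no recommendation can be made
    if n_enrolled < max_n / 3:
        # Stage 1: randomize among acceptable combinations, weighted by
        # their estimated response probabilities.
        weights = [delta_hat[i] for i in acceptable]
        return rng.choices(acceptable, weights=weights, k=1)[0]
    # Stage 2: pick the acceptable combination with the highest estimate
    # (assumption: the first maximizer wins a tie).
    return max(acceptable, key=lambda i: delta_hat[i])

# Made-up estimates for six combinations; only the first four are acceptable.
delta_hat = [0.375, 0.50, 0.17, 0.45, 0.50, 0.50]
acceptable = [0, 1, 2, 3]
print(next_combination(delta_hat, acceptable, n_enrolled=5, max_n=39))
print(next_combination(delta_hat, acceptable, n_enrolled=20, max_n=39))
```

Randomizing early spreads participants across the acceptable combinations before the response estimates are trusted enough to drive a greedy choice.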

Sample Size and Stopping Rules
The maximum target sample size is 60, based on obtaining sufficient information to determine the optimal dose combination in each cohort, which is defined as the combination with the highest response rate among combinations with acceptable toxicity. Stopping rules are incorporated both for safety and for efficient use of participants, by stopping accrual to a cohort once a sufficient number of patients have been treated at the ODC. If the set of acceptable combinations is empty at any point, accrual to the study will be halted. This stopping guideline will trigger a review by the study investigators and the Data and Safety Monitoring Committee (DSMC) to determine whether the study should be modified or permanently closed to further accrual. Accrual to the study for a cohort will end if the recommended treatment combination for the next participant already has 12 patients treated at that combination. If this occurs, that treatment combination is determined to be the optimal dose combination for the cohort. Otherwise, accrual will continue until 60 patients are accrued to the study.
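The stopping rules above amount to a small decision function evaluated before each new participant is enrolled. The following Python sketch is one possible encoding; the function name and return labels are hypothetical, and the per-cohort maximum is passed in as a parameter.

```python
def accrual_decision(recommended, n_treated, acceptable, n_enrolled, max_n, cap=12):
    """Decide whether a cohort continues to accrue.

    recommended: index of the combination recommended for the next participant
    n_treated:   participants already treated at each combination
    acceptable:  indices of combinations currently acceptable for safety
    """
    if not acceptable:
        return "halt_for_review"       # empty safety set triggers a review
    if n_treated[recommended] >= cap:
        return "stop_select_odc"       # recommended combination is the ODC
    if n_enrolled >= max_n:
        return "stop_max_sample_size"  # maximum sample size reached
    return "continue"

# Made-up state: combination 2 already has 12 participants.
n_treated = [3, 5, 12, 0, 0, 0]
print(accrual_decision(2, n_treated, [0, 1, 2], n_enrolled=20, max_n=39))
print(accrual_decision(1, n_treated, [0, 1, 2], n_enrolled=20, max_n=39))
```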
Twelve patients receiving the optimal combination will allow for adequate data to assess the pathologic response rate. Based on a Beta(0.5, 0.5) prior, if 5 out of 12 patients receiving the ODC experience pathologic response, then the posterior distribution of $\delta_{ci}$ is Beta(5.5, 7.5) according to Equation (5). The probability that the response rate for the optimal combination exceeds that of the standard of care (0.28) is given by

$$\Pr(\delta_{ci} > 0.28 \mid z_{ci} = 5, n_{ci} = 12) = \int_{0.28}^{1} \frac{\Gamma(13)}{\Gamma(5.5)\,\Gamma(7.5)}\, \delta^{4.5} (1 - \delta)^{6.5}\, d\delta,$$

where $i$ and $c$ indicate the combination and cohort, respectively.
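The posterior tail probability for this example can be evaluated numerically. The Python sketch below integrates the Beta(5.5, 7.5) density over (0.28, 1) with a midpoint rule using only the standard library; it is an illustrative calculation, and the function name is hypothetical. A statistical library's beta survival function would serve the same purpose.

```python
import math

def beta_tail_prob(a, b, cutoff, steps=20000):
    """Approximate Pr(X > cutoff) for X ~ Beta(a, b) by the midpoint rule."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = (1.0 - cutoff) / steps
    total = 0.0
    for k in range(steps):
        x = cutoff + (k + 0.5) * h  # midpoints avoid the endpoint x = 1
        total += math.exp(log_norm + (a - 1.0) * math.log(x)
                          + (b - 1.0) * math.log(1.0 - x))
    return total * h

# Posterior after 5 responses in 12 participants with a Beta(0.5, 0.5) prior.
z, n = 5, 12
a_post, b_post = 0.5 + z, 0.5 + n - z  # Beta(5.5, 7.5)
print(round(beta_tail_prob(a_post, b_post, 0.28), 3))
```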

Simulation Results
A simulation study provides operating characteristics that convey the design's ability to address the aims of the study. In dose-finding clinical trials, operating characteristics provide the scientific justification for the selected design and sample size, similar to that of a power analysis in a phase III clinical trial [17].

Design of Simulation Study
Simulations were run in R to display the performance of the design described in Section 2, with results presented in Tables 4 and 5. Six scenarios are considered, allowing for a broad range of possible relationships between treatment dosage, DLT, and efficacy rates. In each scenario, 1000 simulated trials were run. For each treatment combination, Table 4 presents the true DLT and efficacy rates (row 1), percentage of selection as the ODC (row 2), and the average number of participants treated (row 3). In Table 4, optimal combinations are indicated in bold type, and unsafe combinations are indicated in red type. Table 5 displays the average sample size overall and by cohort. While the overall maximum sample size is 60 participants, it is assumed that 65% of participants are diagnosed with adenocarcinoma, and the remaining 35% are diagnosed with squamous cell carcinoma. This provides maximum sample sizes of 39 and 21 for Cohorts A and B, respectively. The following six scenarios were chosen to display the operating characteristics for this design, providing a wide variety of dose-toxicity-efficacy relationships.

1. All doses are safe. Intermediate chemo dose maximizes efficacy.
2. All doses are safe. More chemo yields better efficacy.
3. Highest chemo dose with immune agent 2 is unsafe. More chemo yields better efficacy.
4. Highest chemo dose with immune agent 2 is unsafe. Intermediate chemo dose maximizes efficacy.
5. Highest chemo dose with and without immune agent 2 are unsafe. Intermediate chemo dose maximizes efficacy.
6. Two cohorts have different safety and efficacy profiles.

Sample Size and Accrual
Accrual to the study for a cohort was designed to end once the recommendation for the next participant is a combination that already has 12 patients treated at that combination. Accrual is estimated to be 2-3 patients per month, allowing for accrual to be complete within two years. If the minimum follow-up period for participants already on study is not satisfied at the time a new participant is ready to be put on study, then the new participant may be randomly allocated to any combination that is in the acceptable safety set and has already accrued at least one participant. At the time of combination allocation for the next participant, model-based estimates are calculated for both DLT and response probabilities using the available observed data from all participants accrued to the study at that time. It is important to note that in this design approach, some model-based decisions may be made using slightly less efficacy data than DLT data due to the longer minimum observation window for efficacy. Adjusting for 10% dropout and ineligibility, the maximum sample size should not exceed 67 patients.

Summary of Operating Characteristics
The selected design performs well by providing a high rate of ODC selection for the optimal combinations and a low rate of ODC selection for less desirable treatment combinations, whether because of safety concerns or insufficient efficacy. Consider Scenario 1 in Cohorts A and B, where the optimal combination is the treatment combination of immune agent 2 and the intermediate dosage of chemotherapy (indicated in bold type). While all treatment combinations are considered safe, three treatment combinations with low DLT rates and high rates of efficacy are highlighted in gray. For Cohort A, these three treatment combinations comprise more than 70% of the recommended ODCs while treating, on average, 58.7% of the trial participants. In Cohort B, these three treatment combinations comprise 69.9% of the recommended ODCs while treating, on average, 59.6% of the trial participants. In contrast, consider Scenario 4, where the treatment combination with immune agent 2 and the highest level of chemotherapy is unsafe. Treatment combinations with the intermediate dosage of chemotherapy, both with and without immune agent 2, have the highest level of efficacy as well as acceptable toxicity. Very few simulated trials resulted in an ODC recommendation of the unsafe treatment combination (1.7% for both Cohorts A and B). The two optimal treatment combinations comprise 76.8% and 74.7% of the ODC recommendations for Cohorts A and B, respectively. Additionally, more than half of simulated trial participants are treated on the two optimal combinations (56.1% and 60.1% for Cohorts A and B, respectively).
The maximum sample size is 60 eligible participants; however, the simulation results in Table 5 indicate that across all scenarios considered, the maximum average trial size is 46 participants. The design used for the trial both performs well and uses resources efficiently by stopping the study once the design recommends a treatment combination in which 12 participants have been treated in the cohort.

Discussion
The design for this study was chosen by balancing the primary study aims and adaptation of existing methods in developing a flexible design strategy. Careful selection of the dose-finding method allows the study design to address the primary study aim without reorienting the study goals to fit a simpler design. In this case, the primary aim of the study is to identify the ODC for each of the two independent patient cohorts. Simulation studies are provided to evaluate the operating characteristics of this design and highlight the ability of the design to identify the ODC and other desirable treatment combinations in a high percentage of trials. Additionally, simulations guide the anticipated final sample size needed to draw meaningful conclusions about the efficacy of the selected ODC.
Treatment regimens varied for the two patient cohorts, and it was not anticipated that either cohort would have systematically greater or lesser toxicity than the other cohort. Because of this, the cohorts were considered independently. If prior information indicated that one cohort was expected to have greater or lesser toxicity, appropriate changes in the design would have been made to use this order information in identifying the ODC for each cohort. Study design options for treatment combinations are limited, especially when considering the additional complexity of clinical aims. While several methods are available to account for two or more groups of participants, these designs consider dose-finding for a single agent [18][19][20][21]. The approach outlined in this paper to tailor the study design to the complex research objectives provides the framework that demonstrates how adaptive designs can be modified within a single trial to address the objectives specific to the study while advancing early development of novel treatment regimens. This manuscript aims to provide an example for designing complex early-phase trials with multiple objectives in various cohorts.

Conclusions
This phase I study design aims to identify the optimal dose combination for each of two cohorts of patients with non-small cell lung cancer, based on multiple endpoints. Simulation studies indicate that the design is well suited to address the study aims while conserving study resources.
During the finalization of this trial protocol, the company sponsoring the study decided not to move forward with this trial due to recent negative findings in another patient group [22]. While this trial was not initiated, plans were near completion and this example highlights the benefits of using a Bayesian design for early phase clinical trials.
As multiple chemotherapy and immunotherapy combinations become the standard of care, future research will likely require evaluating the appropriate doses of the various components of the multiple drug regimen. The Bayesian phase I design described here allows for evaluation of both safe and efficacious doses for various drug combinations commonly used in NSCLC and incorporates standard histology-based chemotherapy regimens in the same trial.

Conflicts of Interest:
The authors declare no conflict of interest.