Modern Challenges for Early-Phase Clinical Trial Design and Biomarker Discovery in Metastatic Non-Small-Cell Lung Cancer

Oncology research has changed extensively owing to the possibility of categorizing each cancer type into smaller subgroups based on histology and, in particular, on the different genetic alterations that reflect tumor heterogeneity. The consequences of this heterogeneity are particularly evident in the management of metastatic non-small-cell lung cancer (NSCLC). This review discusses the benefits and challenges of incorporating precision medicine into early- through late-phase metastatic NSCLC clinical trials, drawing on examples of drug development programs in oncogene- and non-oncogene-addicted NSCLC. The clinical development of crizotinib, gefitinib and osimertinib shows that when a targeted drug is administered to a study population not selected by any biomarker, trials may produce negative results. However, the early detection of biomarker-driven biology helps to obtain a greater benefit for a selected population and can reduce the time required for drug approval. Early clinical development programs involving the immune checkpoint inhibitors nivolumab, pembrolizumab and avelumab taught us that, beyond safety and activity, the optimal selection of patients should be based on pre-specified biomarkers. Overall, the identification of predictive biomarkers is one of the greatest challenges of NSCLC research and should be supported by methodologically solid trial designs to maximize clinical outcomes.


Introduction
In oncology research, clinical trials fail more frequently than in any other disease area [1]. Although most phase 3 trials are designed on the basis of robust prior data sets, many fail to reach their primary endpoint after many years of work and substantial investment [2,3]. Thus, it is of paramount importance to understand why the results of these phase 3 trials were not as expected. A possible explanation is the high heterogeneity that characterizes each cancer type, both between patients and within individual patients.
In the last few years, the oncology research has deeply changed due to the possibility to categorize each cancer type in smaller subgroups based on histology and particularly

Modern Versus Old Early-Phase Clinical Trial Design: Lights and Shadows
Drug development in the precision oncology era is anything but an efficient process. With the highest attrition rate and a very long duration [9], only in recent years has the process benefitted from a changing paradigm. The observation that the variation in prognosis among individuals is often greater than the average effect of a therapeutic has prompted a change in the objective of drug development, towards identifying and assessing the efficacy of new drugs in homogeneously selected patients as early as possible in their development [10]. To this end, a new generation of clinical trials has been designed with the aim of demonstrating either the efficiency of the "personalized model" or the specific activity of a drug/combination in molecularly selected populations of patients (Figure 1). In this framework, the vertical dimension of "classic" drug development (i.e., the sequential succession of phases) has given way to a horizontal dimension in which the goals of the sequential phases are often compressed into the same protocol/platform, and the classic compartmental drug development is transforming into a "liquid development" where the classical phases cannot be distinguished.
Apparently, introducing a personalized approach from the early phases has reduced the attrition rate, raising the approval rate among drugs that entered clinical development from 8.4% to 26% and shortening the time from first-in-human (FIH) initiation to drug approval. However, this new model of drug development has not only potentially reduced the time and costs required for the entire drug development (by reducing the number of patients to be enrolled in clinical trials and discarding unpromising drugs early) but has also radically changed the way early-phase trials are designed.
Traditionally, the goal of phase 1 trials was to find, using toxicity as an endpoint, the recommended dose and schedule of a new drug for the phase 2 studies that were then used to evaluate antitumor activity. One cornerstone was that these trials usually had little or no intended therapeutic benefit (objective response rate [ORR] ≤ 5%) and that patients were not selected for tumor histotype (all-comers). The first experiences in matching patients with molecularly targeted drugs in phase 1 trials showed a dramatic increase in the average ORR registered in clinical trials [11,12] and fostered the debate over the therapeutic opportunity of phase 1 trials [13]. Therefore, ever more frequently, modern phase 1 clinical trials no longer enroll all-comers for whom no standard treatments are available but instead use very narrow selection criteria in order to maximise the chance of capturing early signs of activity. In this changing strategy, a prominent role is played by tumor biopsies, taken during screening before treatment starts and during trial participation, and by the correlated biomarker/pharmacodynamic endpoints.
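To illustrate how toxicity alone drives a traditional dose escalation, the following Python sketch simulates a simplified rule-based "3+3" scheme. The 3+3 rule is used here only as a common textbook example of a toxicity-driven design, not as the design of any trial cited in this review; the function name `three_plus_three`, the number of dose levels and the true dose-limiting toxicity probabilities are hypothetical, and the common refinement of expanding the previous dose level to six patients is omitted.

```python
import random
from collections import Counter


def three_plus_three(true_dlt_probs, rng):
    """Simulate one simplified 3+3 escalation.

    Returns the index of the highest dose level cleared by the rule,
    or -1 if the lowest level is already too toxic.
    """
    for level, p_dlt in enumerate(true_dlt_probs):
        # Treat a first cohort of three patients at this dose level.
        dlts = sum(rng.random() < p_dlt for _ in range(3))
        if dlts == 1:
            # Exactly one toxicity: expand the cohort to six patients.
            dlts += sum(rng.random() < p_dlt for _ in range(3))
        if dlts >= 2:
            # Two or more toxicities: stop; the previous level is the estimate.
            return level - 1
    # Every level was cleared: the MTD was not reached in the tested range.
    return len(true_dlt_probs) - 1


# Hypothetical true toxicity probabilities for five ascending dose levels.
probs = [0.05, 0.10, 0.20, 0.35, 0.50]
rng = random.Random(0)
runs = 10_000
selected = Counter(three_plus_three(probs, rng) for _ in range(runs))
for level in sorted(selected):
    print(f"level {level}: selected in {selected[level] / runs:.1%} of runs")
```

Note that nothing in this rule looks at antitumor activity or at a biomarker, which is the limitation the biomarker-selected phase 1 designs discussed here try to overcome.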
In parallel, the sample size of phase 1 clinical trials is increasing over time. In particular, after the maximum tolerated dose (MTD) is established, in modern phase 1 trials patients are very often enrolled in dose expansion cohorts (DECs), in which the main goal is to determine the preliminary activity of the drug.
Interestingly, it has been reported that the use of DEC increased over time from 12% in 2006 to 38% in 2011, and that efficacy/activity is the main outcome of 33% of these [14]. Moreover, data from DEC identified a new toxic effect that was not described during the dose escalation in 54% of the trials and led to a recommended phase 2 dose (RP2D) in 13% of them. More recently, based on similar data, it was reported that the use of DEC increased the possibility of success in phase 2 trials, but the size of the DEC (over 20 patients) is not correlated with this probability of success [15].
The sample size, primarily determined by the DEC size, is not a trivial point: the sample size of phase 1 trials has likewise increased over time, and this increase correlates with the use of DECs and with ORR as an endpoint, yet it is rarely justified by a proper statistical design [16].
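As a minimal sketch of what a statistically justified expansion cohort could look like, the Python code below searches for the smallest single-stage design based on exact binomial probabilities. The null response rate of 5% echoes the historical phase 1 ORR mentioned earlier; the target rate of 20%, the error rates and the function names are illustrative assumptions, and a single-stage exact test is only one of several designs that could be used.

```python
from math import comb


def upper_tail(n, r, p):
    """P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def single_stage_design(p0, p1, alpha=0.05, power=0.80, max_n=200):
    """Smallest (n, r): declare activity if at least r of n patients respond."""
    for n in range(1, max_n + 1):
        # Smallest response threshold that keeps the type I error <= alpha.
        r = next(r for r in range(n + 2) if upper_tail(n, r, p0) <= alpha)
        # Accept this n only if the power under p1 is high enough.
        if upper_tail(n, r, p1) >= power:
            return n, r
    return None


# Hypothetical hypotheses: uninteresting ORR 5%, desirable ORR 20%.
n, r = single_stage_design(p0=0.05, p1=0.20)
print(f"enroll {n} patients; declare activity if at least {r} respond")
print(f"type I error = {upper_tail(n, r, 0.05):.3f}")
print(f"power        = {upper_tail(n, r, 0.20):.3f}")
```

Stating the hypotheses and error rates up front in this way makes the cohort size auditable, in contrast to expansion cohorts that simply keep enrolling.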
Overall, these new features in the design, conduct and interpretation of early-phase oncology trials have generated unprecedented enthusiasm in part of the scientific community, with some calling for systematic approval after phase 1 [17]. On the other hand, the scientific community should carefully consider this trend of reducing drug development to the demonstration of activity in DECs. Firstly, as noted above, the sample size of these phase 1 trials is rarely based on a formal statistical hypothesis, so the results are often unreliable. Secondly, and perhaps most importantly, claiming the success of a drug on the basis of unreliable and soft endpoints such as ORR halts further development and impairs the possibility of demonstrating a meaningful benefit in terms of overall survival (OS) or progression-free survival (PFS), which requires even more time in the case of drugs that are, theoretically, highly effective. From this perspective, regulatory agencies should also play an important role. In fact, the FDA's accelerated approval and breakthrough designation pathways have permitted the approval of such drugs after phase 1, whilst other regulatory agencies (e.g., the European Medicines Agency [EMA]) have required more data from phase 2/3 trials. A concrete and joint effort in reviewing these approvals would certainly reduce the uncertainty under which these drugs are approved, the withdrawals often required once additional data emerge, and the heterogeneity in access to innovative drugs across the world.

Examples of Drug Development Programs in NSCLC: The Case of Gefitinib, Crizotinib and Osimertinib for Oncogene-Addicted NSCLC
Targeted therapy represents the pivotal treatment for oncogene-addicted NSCLC, and its use in clinical practice is widening [18]. While these innovative therapeutic options have improved the outcomes of the subgroup of patients with molecularly addicted NSCLC, targeted therapy differs from standard chemotherapy in its mechanisms of action, drug kinetics and dynamics, and efficacy and toxicity profiles [19]. Given these intrinsic differences, drug development methodology requires novel approaches, because classical early clinical trial designs are neither effective nor efficient for these agents.
With the advent of targeted therapies, biomarkers have become increasingly relevant, influencing modern drug development and trial design. The most relevant aspect is patient selection, which can influence trial results from the earliest phases onwards. Indeed, the identification of biomarkers makes it possible to use tumor and patient characteristics to predict drug efficacy, guiding treatment selection for each individual patient [20]. In particular, a validated predictive marker can prospectively identify individuals who are likely to have a positive clinical outcome, such as improved survival or decreased toxicity, from a specific treatment [21]. Moreover, targeted therapy has a toxicity profile significantly different from that of cytotoxic agents, typically resulting from on-target effects on normal tissues. As a result, the toxicity of targeted therapy does not necessarily correlate with the MTD, which was historically the main endpoint of phase 1 chemotherapy trials and crucial in identifying the RP2D. Indeed, the correlation between toxicity and activity may be less linear than with conventional cytotoxic agents. In up to a third of phase 1 trials of molecularly targeted drugs, the MTD is not reached, and therapeutic activity may be seen at the low dose levels used in the early stages of clinical trials with targeted drugs [22]. Therefore, with the new molecular agents, the overall occurrence of adverse events, rather than the highest tolerated dose alone, should inform dose escalation and define the dose to be chosen in subsequent trial phases [23]. As a consequence, trial designs and the corresponding endpoints need to be adapted to the specific agents being investigated.
Here, we report some examples of targeted drug development, showing how these programs have been affected by the timing of biomarker detection, the adverse events reported and, later, by the expertise about disease biology/heterogeneity acquired over time.
In recent years, phase 1 trials have enrolled more patients than those traditionally required for subsequent phase 2-3 trials. The rationale for such large phase 1 trials is the evaluation of activity endpoints in specific subsets of patients selected according to the presence of predictive biomarkers. This phenomenon blurs the lines between phase 1, 2 and 3 trials and allows drug approval to be accelerated. One of the first drugs approved in this manner in thoracic oncology was crizotinib, a dual MET and ALK tyrosine kinase inhibitor [24]. In 2006, the FIH phase 1 study of crizotinib started with a dose escalation (from 50 mg once daily to 300 mg twice daily) in 37 patients with advanced cancers. Dose-limiting fatigue in the cohort receiving 300 mg twice daily led to the selection of 250 mg twice daily in 28-day cycles as the RP2D. This part of the trial was followed by screening for ALK or MET aberrations in specific histotypes, which identified promising results in two patients with ALK-rearranged advanced NSCLC. For this reason, in 2008, the cohort of ALK-rearranged NSCLC patients was expanded. The ORR in 19 patients with pre-treated ALK-positive NSCLC was 53%; these data were confirmed in 82 patients, with an ORR of 57%. The most frequent adverse events were grade 1-2 gastrointestinal events, visual alterations and transaminase increases. All toxicities reversed on cessation of crizotinib [25]. An expanded cohort of 149 patients was enrolled in the phase 1 PROFILE 1001 trial, in which crizotinib confirmed its activity, achieving an ORR of 60.8% with responses that were rapid (median time to first documented response, 7.9 weeks) and durable (median duration of response, 49.1 weeks). The greatest proportions of responses were noted in treatment-naïve patients, those with the lowest performance status score, and Asian patients.
In this larger population treated with crizotinib, mainly grade 1-2 side effects were reported, and patients recovered after stopping crizotinib treatment [26,27]. The marked activity of crizotinib observed in the phase 1 study led to phase 2-3 trials. PROFILE 1005 was a phase 2, open-label, single-arm trial of the efficacy and safety of crizotinib in pre-treated patients with advanced ALK-rearranged NSCLC. Crizotinib confirmed strong efficacy in ALK-rearranged NSCLC patients, showing an ORR of approximately 60% and a tolerable safety profile [28]. The crizotinib example highlights how an extended expansion cohort, by enrolling more and more patients after only a handful of initial responses had been observed, allowed activity endpoints such as response to be estimated with more precision and showed that a strong candidate biomarker existed [29]. These studies led to the rapid regulatory approval of crizotinib in 2011 [30].
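The gain in precision from a larger expansion cohort can be made concrete with a confidence interval for the ORR. The Python sketch below uses the Wilson score interval, chosen here for simplicity and not necessarily the method used in the crizotinib reports. The responder counts (10 of 19 and 47 of 82) are back-calculated from the reported ORRs of 53% and 57% and are therefore assumptions.

```python
import math


def wilson_interval(responders, n, z=1.96):
    """Wilson score confidence interval for a proportion (95% when z = 1.96)."""
    p = responders / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Responder counts are assumptions derived from the reported percentages.
for responders, n in [(10, 19), (47, 82)]:
    low, high = wilson_interval(responders, n)
    print(f"n = {n}: ORR {responders / n:.1%}, "
          f"95% CI {low:.1%} to {high:.1%} (width {high - low:.1%})")
```

The point estimates are similar in the two cohorts, but the interval narrows considerably with 82 patients, which is the sense in which the expansion allowed the response rate to be estimated with more precision.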
Gefitinib is a different example of a targeted drug development program. It was evaluated as a single agent in four phase 1 clinical trials. In the first [31], gefitinib was administered once daily for 14 consecutive days, followed by 14 days off treatment. Dose escalation started at 50 mg and continued to 925 mg or until a consistent dose-limiting toxicity (DLT) was reached. The most frequent adverse events were grade 1-2 rash, nausea and diarrhoea. Of 16 patients with NSCLC, 4 achieved a partial response. In the second study, gefitinib was administered at doses from 150 to 1000 mg/day for 28 consecutive days. At 1000 mg/day, 5 out of 12 patients experienced a DLT, developing grade 3 diarrhoea. In this study, 19 patients had stable disease [32]. In the third study, 71 patients were enrolled, 39 of whom had NSCLC. At doses > 800 mg, 45% of patients required dose reductions. One partial response and 6 cases of prolonged stable disease were observed [33]. The fourth phase 1 study investigated the tolerability and toxicity of gefitinib in Japanese patients with solid tumors. Overall, 31 patients were included and received oral gefitinib on 14 consecutive days every 28 days. Dose escalation was from 50 mg/day to a maximum of 925 mg/day or DLT; 2 patients had a DLT at 700 mg/day. The adverse events were consistent with previous studies. A partial response was observed in 5/23 NSCLC patients [34]. Two randomized phase 2 clinical studies (IDEAL-1 and IDEAL-2) evaluated the safety and activity of two doses of gefitinib (250 mg or 500 mg) in pre-treated NSCLC patients. IDEAL-1 [35] enrolled 210 patients who had been pre-treated with one or two chemotherapy regimens, at least one of which contained platinum. IDEAL-2 [36] included 221 patients who had been pre-treated with two or more regimens containing platinum and docetaxel.
In both studies, the two doses of gefitinib produced similar results in terms of ORR (approximately 20% in IDEAL-1 and 10% in IDEAL-2), disease control rate (DCR, about 50% in IDEAL-1 and 40% in IDEAL-2), and OS (about 8 months in IDEAL-1 and 7 months in IDEAL-2). Overall, adverse events were more common in patients treated with 500 mg/day. Based on these results, the dose chosen for gefitinib administration in NSCLC patients was 250 mg/day. In these phase 2 studies, some efforts were made to identify predictive factors of response. In the IDEAL-1 trial, a multivariate analysis demonstrated that histology, female gender and performance status were associated with better outcomes, while in the IDEAL-2 trial only female gender was associated with greater gefitinib efficacy. For these reasons, in 2003 the FDA granted accelerated approval to gefitinib for advanced NSCLC after platinum-based and docetaxel treatments. Only the subsequent discovery of EGFR mutations led to several phase 2 trials in which a correlation between EGFR mutations and gefitinib efficacy was validated. For example, among 43 patients, ORRs of 95% and 73.9% and PFS of 8.9 and 9.1 months were observed in those with exon 19 deletions and L858R mutations, respectively [37]. In the iTARGET trial [38], chemo-naïve patients with non-squamous NSCLC were selected based on clinical characteristics typically associated with EGFR mutations (non-smokers, adenocarcinoma histology, females, Asian patients). In this study, mutations were identified in 35% of patients, and the ORR was 78% and 59% for patients carrying the L858R mutation and the exon 19 deletion, respectively, whereas it was 0% in patients without these two alterations. Several other trials confirmed these results, showing that gefitinib efficacy was associated with the presence of EGFR mutations, regardless of ethnicity, gender, performance status or smoking history.
The development of gefitinib in NSCLC is a clear example of how difficult it was to conduct clinical trials with molecularly targeted agents when little was known about predictive factors and selection criteria. The key point in choosing a methodology of clinical research with target-based agents is the identification of those patients who are expected to benefit most. The preliminary identification of effective molecular targets in the early phases of drug development allows the selection of only those patients who are more likely to respond to a specific targeted agent. Otherwise, unrecognized molecular heterogeneity can lead to a falsely negative study that fails to detect a truly effective new therapy, and thus to the rejection of a drug that would have been useful in a molecularly selected population [39]. The gefitinib development program highlights the practical challenges of conducting biomarker research. The sooner a predictive biomarker is identified, the more focused and efficient the drug development program can become. The major problem in the development of gefitinib was that the identification of the drug target progressed in parallel with clinical development, extending the time to first-line drug approval. Ideally, biomarkers need to be identified in preclinical studies to enable a more efficient clinical development [40].
A third example of a drug development program involves osimertinib. This drug was designed to inhibit EGFR in a covalent, irreversible manner, with preferential activity against sensitizing and T790M resistance mutations compared with the wild-type form of the receptor. Preclinically, osimertinib inhibits signaling pathways and cellular growth of EGFR-mutant and T790M-resistant cell lines in vitro [41]. In vivo, this translates into sustained tumor regression in EGFR-mutant tumor xenograft and transgenic single- and double-mutant models [41]. Osimertinib was clinically developed directly in EGFR-mutated NSCLC patients through the AURA series of trials. In the phase 1 AURA trial [42], the safety and efficacy of osimertinib were assessed in 253 patients with EGFR-mutated metastatic NSCLC who had progressed while receiving first- or second-generation EGFR-TKIs. Building on the knowledge gained from clinical studies with gefitinib, the AURA study incorporated biomarker-defined patient selection, which improved the chances of demonstrating clinical activity. The study comprised a dose-escalation cohort of 31 patients and a dose-expansion cohort enrolling 222 patients, who received oral osimertinib at doses from 20 to 240 mg/day. No DLTs were observed, and the ORR was 51%. In the expansion cohorts, tumor biopsies were required for central determination of EGFR T790M status; the ORR was 61% among the 127 patients who tested positive. The median PFS was 8.2 months for all patients, while it was 9.6 and 2.8 months in EGFR T790M-positive and -negative patients, respectively. The dose of 80 mg once daily was selected for further trials. The phase 1 AURA study also included two cohorts of treatment-naïve patients [43]. A total of 30 patients received osimertinib at 80 mg/day and 30 at 160 mg/day. The ORR was 67% in the 80 mg group, 87% in the 160 mg group and 77% across doses. Median PFS was 22.1, 19.3 and 20.5 months, respectively.
The results confirmed 80 mg/day as the dose of osimertinib to investigate in the subsequent first-line trial. In the AURA phase 2 extension cohort [44], 201 patients were enrolled, with an ORR of 62%. Median PFS was 12.3 months and the 1-year survival rate was 79%. ORR and PFS were similar between the second-line and the third-line or later cohorts and across the common EGFR sensitizing mutations. In the phase 2 AURA2 trial, 210 patients were enrolled and treated with 80 mg/day of osimertinib [45], with an ORR of 70%. Median PFS was 9.9 months and the 1-year survival rate was 81%. In both trials, lung cancer symptoms, quality of life and physical functioning domains improved. The final selection of the dose to be used for commercialization, as well as for phase 3 development, was made in February 2014, only 11 months after the first dose in humans. The development of osimertinib followed a biomarker-driven, adaptive approach that involved close collaboration between industry partners and global regulatory bodies, including the FDA and the European Medicines Agency (EMA). On 13 November 2015, osimertinib received FDA approval for patients whose tumors have a specific EGFR mutation (T790M) and whose disease has progressed after treatment with other EGFR tyrosine kinase inhibitors (TKIs), following a clinical development period of just over 2.5 years from the first patient dosed to the first approval. This was made possible by multiple factors, the most relevant of which were the experience acquired during gefitinib development and the increasing expertise in the biology of EGFR-mutant NSCLC and in the chemistry of EGFR-TKIs [46]. This has been one of the fastest development journeys for a drug in history.
Based on these few historical examples, we have learned that when a targeted drug is administered to a study population not selected by any biomarker, trials may produce negative results. However, the early detection of biomarker-driven biology helps to obtain a greater benefit for a selected population and can reduce the time required for drug approval (Table 1). Although toxicity remains the most commonly used phase 1 endpoint for defining the RP2D, biomarker effects, pharmacokinetics and optimal biological doses (intended as the doses required to obtain target inhibition and, consequently, drug efficacy) need to be taken into account [20]. Ideally, phase 1 trials for target-based drugs should always be designed to determine whether the target can be inhibited in vivo at a tolerable dose and to estimate the dose or drug concentration required to achieve and maintain maximum inhibition of the target in vivo [47].
For all the above reasons, there is an urgent need to keep designing biomarker-driven early-phase oncology trials, addressing key questions that provide mechanistic insight into target modulation, drug sensitivity and resistance, as well as guiding decision making such as dose-escalation and schedule optimization [22].

Examples of Drug Development Programs in NSCLC: The Case of Nivolumab, Pembrolizumab and Avelumab for Non-Oncogene Addicted NSCLC
Immune checkpoint inhibitors (ICIs) have dramatically changed the therapeutic landscape of NSCLC, with immunotherapy, alone or in combination with chemotherapy, being the standard up-front approach for almost all patients with non-oncogene-addicted tumors [48]. However, several issues related to the conduct of clinical trials with ICIs are still unresolved. In particular, major challenges remain regarding the optimal methodology for ICI trials, dose finding, the safety profile and the radiological assessment of response to these drugs.
Clinical trials involving ICIs, and in particular phase 1 trials, need to answer multiple questions within a short timeframe to accelerate the classical drug development process from early to regulatory phases. Interestingly, adaptive design trials, especially in the immunotherapy field, have revolutionized the classical paradigm of drug development based on phase 1, phase 2 and phase 3 steps. From adaptive design trials, we have learned that several questions could receive an answer at one time to speed up the process [49].
Drug development programs involving ICIs also need to consider their different pattern of toxicities, which can be peculiar and radically different from those observed during chemotherapy or targeted therapy. The conventional approach of phase 1 clinical trials, based on classical dose escalation and standard DLT assessment, may fail to capture the late onset of immunotherapy-related toxicities [50]. In this context, early trials involving ICIs should include a longer DLT period for a better selection of the optimal phase 2 dose.
The major challenge in early immunotherapy trials is the identification of predictive biomarkers of efficacy/toxicity and of the assay methods used to test them.
Several candidate biomarkers potentially able to identify patients who are more likely to benefit from ICIs, such as tumor mutational burden (TMB) [51,52], tumor-infiltrating lymphocytes [53] and circulating factors [54], have been investigated in the past few years [55]. Nevertheless, among them, only PD-L1 expression has been demonstrated to be associated with benefit from ICIs [56,57]. In this light, and in partial analogy with the abovementioned scenario of oncogene-addicted tumors, the early study and identification of effective predictive biomarkers would allow a more effective drug development program, focused mainly on those specific populations more likely to benefit from ICIs.
Herein, we report three examples of immunotherapy drug development, underlining caveats and peculiarities in the early clinical steps.
The tolerability and activity of nivolumab, a fully human IgG4 PD-1 immune checkpoint inhibitor antibody, were first reported in patients with NSCLC, melanoma and renal cell carcinoma treated in a phase 1 multidose clinical trial [58], in which patients were not selected by PD-L1 expression. In 2015, the OS results in NSCLC patients treated with nivolumab were published, revealing encouraging survival rates and durable responses [59]. Nivolumab therapy was generally well tolerated, with only 14% of patients experiencing grade 3 to 4 treatment-related adverse events. Based on these results, phase 3 clinical trials were conducted using nivolumab at a dose of 3 mg/kg administered intravenously every 2 weeks, chosen on the basis of pharmacokinetic exposure, safety and efficacy [59]. The results of CheckMate 017 and CheckMate 057 led to the introduction of nivolumab in the treatment of pre-treated squamous and non-squamous NSCLC, respectively [60,61]. An update of the phase 1 trials showed that 5-year OS was 16% for squamous and 15% for non-squamous patients [62], although the reported median OS of 9.9 months could not adequately capture the long-term benefit demonstrated by the plateaus in the tails of the survival curves.
In the phase 1 trial with nivolumab, tumor specimens from 42 patients were evaluated with the murine anti-human PD-L1 monoclonal antibody 5H1, and PD-L1 positivity was defined as expression in 5% or more of tumor cells. None of the 17 patients with PD-L1-negative tumors had an objective response, while 9 of 25 (36%) patients with PD-L1-positive tumors responded (p = 0.006) [63]. A group evaluating immunohistochemical features in patients with colorectal carcinoma, renal cell carcinoma, NSCLC, melanoma or prostate cancer in the phase 1 nivolumab trial, including PD-1, PD-L1 and PD-L2 expression, patterns of immune cell infiltration and lymphocyte subpopulations, demonstrated that tumor PD-L1 expression correlated most strongly with objective response to anti-PD-1 therapy, based on the assessment of 41 pre-treatment tumor specimens [63].
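The association between PD-L1 status and response in these 42 specimens can be checked with an exact test on the 2x2 table. The Python sketch below implements a two-sided Fisher exact test from first principles; whether the original analysis used exactly this test is an assumption, and the function name is illustrative. With 9 responders among 25 PD-L1-positive patients and none among 17 PD-L1-negative patients, it yields a p-value of about 0.006, in line with the reported figure.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: PD-L1 positive, PD-L1 negative; columns: responders, non-responders.
p_value = fisher_exact_two_sided(9, 16, 0, 17)
print(f"two-sided Fisher exact p = {p_value:.4f}")
```

With only 42 specimens the test is significant, but the absence of responders in the PD-L1-negative group rests on 17 patients, which is one reason later cohorts still observed activity in PD-L1-low or -negative tumors.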
In CheckMate 012, a multicohort phase 1 study in previously untreated patients with NSCLC, results from a cohort of 20 patients showed a good safety profile and durable responses [64]. Among the 10 patients with a PD-L1 expression level of 5% or more, the ORR was 50%, the PFS rate at 24 weeks was 70% and median PFS was 10.6 months. Although an increasing PD-L1 expression level was associated with greater benefit in the expanded cohort, clinical activity was also observed in patients with low or negative PD-L1 expression. Based on these results, PFS among patients with stage IV or recurrent NSCLC with a PD-L1 expression level of 5% or more was chosen as the primary endpoint of the phase 3 CheckMate 026 trial, which evaluated nivolumab versus chemotherapy alone as first-line treatment [65]. However, this phase 3 trial failed to reach its primary endpoint. In fact, the median PFS was 4.2 months in the nivolumab arm versus 5.9 months in the chemotherapy arm (hazard ratio [HR] 1.15; 95% confidence interval [CI] 0.91-1.45; p = 0.25), in contrast with the positive results of KEYNOTE-024 [66]. The use of different assays to assess tumor PD-L1 expression, the criteria related to previous radiotherapy and differences in patient characteristics could contribute to explaining this discrepancy [67].
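As a quick reading aid for the CheckMate 026 result, the Python sketch below back-calculates the standard error of the log hazard ratio from the reported 95% CI and derives an approximate two-sided p-value. The normal approximation on the log scale is an assumption; the trial's own p-value presumably came from its pre-specified analysis (for example, a stratified log-rank test), so only approximate agreement with p = 0.25 should be expected.

```python
import math


def wald_from_hr_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Approximate SE of log(HR), z statistic and two-sided p from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported for PFS in CheckMate 026.
se, z, p = wald_from_hr_ci(hr=1.15, ci_low=0.91, ci_high=1.45)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, approximate p = {p:.2f}")
```

The interval spans 1 and the point estimate lies above 1, so the data are compatible with no PFS difference and, numerically, favor chemotherapy rather than nivolumab in this population.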
Differences in the biomarker assays and in the PD-L1 expression cut-off points (the 22C3 assay with a 50% cut-off for pembrolizumab versus the 28-8 clone with a 5% cut-off for nivolumab) could contribute to the discordant results of CheckMate 026 and KEYNOTE-024. In fact, patients considered to have high PD-L1 expression in the pembrolizumab trial could be significantly different from those considered "high" in the nivolumab trial. Another important difference between these trials is the percentage of patients who had received prior radiotherapy, which was very high in CheckMate 026 (37.6%), whereas in KEYNOTE-024 prior radiation therapy of >30 Gy within 6 months of the first dose of trial treatment was an exclusion criterion. Indeed, radiation treatment may play an immunosuppressive role, although one not yet fully clarified, leading to decreased activity of ICIs in irradiated areas [68]. Moreover, a higher percentage of never-smokers was included in the nivolumab trial than in the pembrolizumab trial (11% versus 3%). The never-smoker population has lower mutational loads, potentially leading to a decreased response to ICIs [69].
The nivolumab "saga" in early development demonstrated how important it is to fine-tune patient selection on the basis of a strong biomarker hypothesis, particularly with a view to subsequent phase 2 and 3 trials.
Pembrolizumab represents an example of optimal co-development of a drug with its companion diagnostic for detecting PD-L1. Early clinical data suggested that tumors expressing PD-L1 could respond better to pembrolizumab and, consequently, an assay for detecting PD-L1 was urgently needed [70]. The expansion cohort of the phase 1 trial KEYNOTE-001 demonstrated, in patients with locally advanced or metastatic PD-L1-positive (defined as TPS ≥ 50%) NSCLC, a higher ORR (45% for PD-L1 ≥ 50%, 16.5% for PD-L1 1-49% and 10.7% for PD-L1 < 1%) with a longer duration of response compared with historical data [71]. KEYNOTE-001 incorporated a biomarker-based enrichment design: the selection of the cut-off for PD-L1 positivity and its validation provided the rationale for phase 2 and 3 trials testing pembrolizumab in patients with PD-L1-positive NSCLC (threshold of 1% for KEYNOTE-010 and 50% for KEYNOTE-024). KEYNOTE-001 used the so-called prototype assay, named to distinguish it from later versions of the PD-L1 assay: PD-L1 positivity was defined as membranous staining in at least 1% of cells (neoplastic and intercalated mononuclear inflammatory cells) within tumor nests, or a distinctive staining pattern caused by the infiltration of mononuclear inflammatory cells in the stroma that formed a banding pattern adjacent to tumor nests [70]. Four immunohistochemical PD-L1 scoring methods, based on the proportion of PD-L1-expressing tumor cells, were considered in NSCLC: receiver operating characteristic (ROC) analysis demonstrated that all four scores were equally predictive [70-72]. Moreover, KEYNOTE-001 represents a good example of an adaptive design trial: it led to FDA approval of pembrolizumab in both NSCLC and melanoma in a short timeframe (<4 years) [73] and generated a large amount of data in parallel, with multiple modifications implemented through a series of amendments.
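As an illustration of how a ROC analysis summarizes the predictive value of a single PD-L1 scoring method, the sketch below computes the area under the ROC curve (AUC) as the probability that a randomly chosen responder has a higher score than a randomly chosen non-responder. The TPS values and response labels are entirely hypothetical and are not taken from KEYNOTE-001; the function name `roc_auc` is likewise an assumption for this example.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], responded: Sequence[bool]) -> float:
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    Counts the fraction of responder/non-responder pairs in which the
    responder has the higher score; ties count as one half.
    """
    pos = [s for s, r in zip(scores, responded) if r]
    neg = [s for s, r in zip(scores, responded) if not r]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical TPS values (%) and response status for ten patients.
tps = [0, 1, 5, 10, 30, 50, 60, 75, 90, 100]
responded = [False, False, False, True, False, False, True, True, False, True]

auc = roc_auc(tps, responded)
print(f"AUC for this hypothetical scoring method: {auc:.3f}")
```

Repeating the calculation for each candidate scoring method on the same patients and comparing the resulting AUCs is one simple way to judge whether the methods are similarly predictive; an AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation.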
The KEYNOTE-010 phase 2/3 trial demonstrated the benefit of pembrolizumab versus docetaxel in patients with previously treated NSCLC and TPS ≥ 1% tumors [74]. A planned post-hoc analysis from the KEYNOTE-010 study indicated that pembrolizumab continued to improve OS versus docetaxel, regardless of whether PD-L1 expression had been assessed in newly collected or archival tissue samples [75]. Moreover, pembrolizumab became the first immunotherapy approved in the first-line setting of advanced NSCLC with PD-L1 TPS ≥ 50%, based on the KEYNOTE-024 results, which demonstrated longer PFS and OS compared with chemotherapy [76] and showed how far-sighted the early bet on PD-L1 TPS ≥ 50% had been. The Blueprint PD-L1 immunohistochemistry (IHC) Assay Comparison Project, divided into a preliminary phase and a real-world phase, demonstrated that the 22C3, 28-8 and SP263 assays may be interchangeable for tumor cell PD-L1 expression, whereas the SP142 assay showed lower sensitivity in determining TPS on tumor cells [76,77]. True harmonization among different assays with different cut-offs provided by multiple companies did not appear to be fully achievable [78]. Nowadays, the outstanding clinical benefit of pembrolizumab in the selected patient population might be overcoming the need for harmonization. The development program of pembrolizumab and the concurrent identification of a strong companion diagnostic led to a rapid authorization process, winning (at least for now) the race to the first-line setting.
Avelumab is a human IgG1 monoclonal antibody with a wild-type Fc region targeting PD-L1 that was approved for the treatment of metastatic Merkel cell carcinoma. In a phase 1a dose-escalation and dose-expansion trial, avelumab showed antitumor activity and durable responses in a large cohort of patients with advanced solid tumors [79]. In 2017, the results of the phase 1b dose-expansion cohort in patients with previously treated metastatic NSCLC were published, demonstrating an acceptable safety profile and promising antitumor activity [80]. Notably, patients were not selected on the basis of PD-L1 expression. However, patients with PD-L1-positive tumor cells at the 1% cut-off had longer PFS with avelumab than patients with PD-L1-negative tumors, based on tumor classification using the novel antibody clone (73-10) [80]. The subsequent phase 3 study, the JAVELIN Lung 200 trial, did not meet its primary endpoint of OS, although clinical activity and safety were noted [81]. The authors attributed the failure to meet the primary endpoint to the high frequency of post-study ICI use, the higher proportion of randomly assigned patients who did not receive any study treatment in the docetaxel group than in the avelumab group (8% versus 1%), the methods of biomarker assessment, and patient and drug characteristics. However, the pre-specified exploratory analyses showed that patients with higher PD-L1 expression had longer PFS and OS. Notably, the PD-L1 assay used in this trial (73-10 assay) has higher sensitivity for detecting PD-L1 positivity than other assays (such as 22C3 for pembrolizumab and SP142 for atezolizumab) [82]. In the phase 1 expansion cohort of the JAVELIN Solid Tumor trial, avelumab monotherapy in the first-line setting of advanced NSCLC demonstrated clinical activity and an acceptable safety profile [83].
Although around 28% of patients were not evaluable for tumor PD-L1 expression and the number of patients in the PD-L1 subgroup was low (n=38), patients with ≥ 50% and ≥ 80% PD-L1-positive tumors showed ORRs of 22.6% and 26.3%, respectively [79]. Based on these encouraging results, a phase 3 clinical trial (JAVELIN Lung 100 study) is currently ongoing, assessing first-line avelumab monotherapy compared with platinum-based doublet chemotherapy in patients with PD-L1-positive NSCLC (NCT02576574).
As we learned from these trials, early clinical development programs involving immunological agents have to take into account, beyond safety and activity, the optimal selection of patients based on pre-specified biomarkers. To date, pembrolizumab, compared with other ICIs such as nivolumab and avelumab, represents an example of how crucial the identification of a strong biomarker hypothesis is from the early phases. The co-development of a companion diagnostic, together with a methodical approach to clinical development, accelerated the approval process for the pembrolizumab indications. In parallel with optimal biomarker-based drug development, an adaptive design could be adopted to speed up the process from early to registrational/pivotal phases [49] (Table 2).
Classic drug development from phase 1 to phase 3 trials. * Only in tumors with expression of programmed death ligand-1 (PD-L1) tumor proportion score (TPS) ≥ 1%. FDA: Food and Drug Administration.

Conclusions
New challenges and considerations affect the conduct and feasibility of trials in oncology research and specifically in metastatic NSCLC. We are moving from an empirical approach, based on large trials comparing treatments, to a tailored approach in which trials are designed to ask biologically relevant questions. For this profound change in clinical research to succeed, research methodology and infrastructure should be thoroughly restructured in order to better understand the biology of the disease, the mechanism of action of new agents, the new methodological approaches to apply and the prediction of toxicity or activity. The examples of drug development programs in oncogene- and non-oncogene-addicted NSCLC that we reported highlight how these large shifts in the standard paradigm of cancer clinical research can affect real-world practice. In the future, concepts such as phase 1 expansion cohorts replacing phase 2 studies, regulatory approvals based on nonrandomized trials and tumor-agnostic approvals will be firmly adopted in the field of drug development. The FDA has already shown its willingness to approve agents that were evaluated in only a small number of patients, even on the basis of single-arm trials, as traditional large registration trials may never be feasible or ethically appropriate within some NSCLC subtypes [8]. A further novel approach by the FDA was the request for a post-approval study testing a lower dose of sotorasib, a Kirsten rat sarcoma (KRAS) G12C inhibitor. This request highlights how the dose-finding process that was usually part of phase 1 trials can also be performed post-marketing. In fact, a multicenter randomized clinical trial comparing the safety and efficacy of sotorasib at 960 mg once daily versus a lower daily dose of 240 mg in patients with advanced NSCLC is ongoing [84].
The continuing development of next-generation sequencing will increase the assignment of patients to matched treatments according to multiple driver mutations, biomarkers or pathways [85]. This concept of personalized medicine will reduce the use of drugs in non-responders. However, it may increase diagnostic budgets, because a whole patient population has to be tested in order to select the groups of eligible patients who might benefit, potentially leading to higher costs.
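The trade-off between enrichment and screening burden can be sketched with simple arithmetic: the number of patients who must be tested grows as the biomarker becomes rarer or as fewer samples turn out to be evaluable. All numbers below (enrollment target, prevalence, evaluable fraction and cost per test) are hypothetical assumptions chosen only for illustration, as is the function name `patients_to_screen`.

```python
import math


def patients_to_screen(
    target_enrolled: int,
    biomarker_prevalence: float,
    evaluable_fraction: float = 1.0,
) -> int:
    """Expected number of patients to test in order to enroll the target.

    Assumes each screened patient is independently evaluable with
    probability `evaluable_fraction` and biomarker-positive with
    probability `biomarker_prevalence`.
    """
    if not 0 < biomarker_prevalence <= 1 or not 0 < evaluable_fraction <= 1:
        raise ValueError("prevalence and evaluable fraction must be in (0, 1]")
    return math.ceil(target_enrolled / (biomarker_prevalence * evaluable_fraction))


# Hypothetical scenario: 300 biomarker-positive patients needed,
# 30% prevalence, 90% of samples evaluable, 500 currency units per test.
screened = patients_to_screen(300, 0.30, 0.9)
cost_per_test = 500  # hypothetical
print(f"Patients to screen: {screened}")
print(f"Total screening cost: {screened * cost_per_test}")
```

In this hypothetical scenario more than three patients are tested for every patient enrolled, so the diagnostic cost is driven largely by patients who never receive the matched treatment.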
Overall, the identification of predictive biomarkers is one of the greatest challenges of NSCLC research and should be pursued through methodologically sound trial designs to maximize clinical outcomes.