Somatic Copy-Number Alterations in Plasma Circulating Tumor DNA from Advanced EGFR-Mutated Lung Adenocarcinoma Patients

Background: To assess the clinical relevance of genome-wide somatic copy-number alterations (SCNAs) in plasma circulating tumor DNA (ctDNA) from advanced epidermal growth factor receptor (EGFR)-mutated lung adenocarcinoma patients. Methods: We included 43 patients with advanced EGFR T790M-positive lung adenocarcinoma who were treated with osimertinib after progression under previous EGFR-TKI therapy. We performed genomic profiling of ctDNA in plasma samples from each patient obtained pre-osimertinib and after patients developed resistance to osimertinib. SCNAs were detected by shallow whole-genome plasma sequencing and EGFR mutations were assessed by droplet digital PCR. Results: SCNAs in resistance-related genes (rrSCNAs) were detected in 10 out of 31 (32%) evaluable patients before start of osimertinib. The presence of rrSCNAs in plasma before the initiation of osimertinib therapy was associated with a lower response rate to osimertinib (50% versus 81%, p = 0.08) and was an independent predictor for shorter progression-free survival (adjusted HR 3.33, 95% CI 1.37–8.10, p = 0.008) and overall survival (adjusted HR 2.54, 95% CI 1.09–5.92, p = 0.03). Conclusions: Genomic profiling of plasma ctDNA is clinically relevant and affects the efficacy and clinical outcome of osimertinib. Our approach enables the comprehensive assessment of SCNAs in plasma samples of lung adenocarcinoma patients and may help to guide genotype-specific therapeutic strategies in the future.


Introduction
Osimertinib is the standard treatment for patients with advanced epidermal growth factor receptor (EGFR)-mutated non-small cell lung cancer (NSCLC) and for those with EGFR T790M-mediated resistance [1][2][3][4]. Despite high response rates, patients eventually develop resistance to osimertinib and progress clinically. Resistance mechanisms in osimertinib-treated patients appear to be complex and are currently not fully understood. Both EGFR-dependent and EGFR-independent mechanisms of resistance may be important [5].
We selected plasma for molecular profiling because a liquid biopsy is less burdensome than a tissue biopsy and because repeated tissue sampling is clinically not feasible in many advanced NSCLC patients [25,26]. Blood samples are easily obtainable and can be taken repeatedly, even at short intervals. In addition, the genetic heterogeneity of the progressing tumor may lead to an incomplete picture of the tumor genome if only single tissue biopsies are obtained. Furthermore, through serial blood sampling and analysis, blood-based approaches may allow real-time monitoring of the total tumor burden and the detection of emerging mutations that arise during treatment. Blood samples can be collected during routine care at the time of diagnosis, before first-line therapy, and at subsequent time points when the tumor is progressing on therapy.
In this study, we performed genome-wide copy-number profiling of circulating tumor DNA (ctDNA), with a special focus on focal events, using shallow whole-genome sequencing of plasma samples collected from each patient prior to osimertinib initiation and at the time of osimertinib resistance, in order to detect molecular alterations relevant for therapy efficacy.

Patient Cohort and Sample Collection
Samples of 43 patients with advanced EGFR-mutated lung adenocarcinoma who progressed under first- or second-generation EGFR-TKI therapy were collected between August 2015 and January 2019. All patients developed the T790M resistance mutation and were treated with osimertinib. Patients had a confirmed activating EGFR mutation at the time of initial diagnosis in tissue biopsy. The first plasma sample was collected at the time of radiologic progression on a first- or second-generation EGFR TKI ("pre-osimertinib" sample). A second plasma sample was collected from all patients at the time of clinical progression under osimertinib. In addition, a set of 10 self-reporting healthy individuals (age range 20-30 years) was analyzed.

Blood Collection and Cell-Free DNA Extraction from Plasma
Blood processing was performed as previously described [27]. Briefly, EDTA-containing vacutainer tubes or cell-free DNA Blood Collection Tubes (Roche, Pleasanton, CA, USA) were used for blood collection. For plasma isolation, whole blood was centrifuged at 200× g for 10 min, followed by 1600× g for 10 min. Subsequently, the supernatant was collected and centrifuged at 1900× g for 10 min. ctDNA was extracted from 2 mL of plasma using the QIAamp circulating nucleic acid kit (Qiagen, Hilden, Germany), according to the manufacturer's instructions.

ddPCR
EGFR exon 19 deletion, L858R, T790M and C797S mutations were assessed using custom-made ddPCR assays from Life Technologies (Carlsbad, CA, USA). EGFR L861Q mutations were detected by means of a ddPCR assay from Bio-Rad (Hercules, CA, USA). Primer sequences and PCR protocols were previously specified [27][28][29][30]. All ddPCR assays were performed in triplicate and analyzed with QuantaSoft analysis software (Bio-Rad). Results were reported as copies of mutant allele per mL of plasma. The threshold for positivity was >1 copy/mL for all assays.

Shallow Whole-Genome Plasma Sequencing
Shallow whole-genome plasma sequencing was performed as previously described [31]. Briefly, a total of 5-10 ng of input DNA from plasma DNA extractions was used based on ctDNA quantification using the Qubit dsDNA HS Assay kit (Life Technologies, Carlsbad, CA, USA). Shotgun libraries were prepared using the TruSeq Nano DNA HT Sample preparation kit (Illumina, San Diego, CA, USA). Due to the high fragmentation of plasma DNA, the fragmentation step was omitted and for the selective amplification of the library fragments, 20 PCR cycles were used.
Plasma DNA libraries were quantified and normalized with quantitative PCR, using primers complementary to Illumina-specific adaptor sequences (forward: AATGATACGGCGACCACCGAGAT; reverse: CAAGCAGAAGACGGCATACGA). Libraries were pooled equimolarly and sequenced on an Illumina MiSeq or NextSeq instrument (Illumina) either in paired-end mode (2 × 75 bp) or in single-read mode (150 bp). Sequencing reads were analyzed using the plasma-Seq algorithm, which is based on read count analysis to establish genome-wide SCNAs [31]. Briefly, sequencing reads were mapped to the PAR-masked genome, counted in non-overlapping 50 kb windows and normalized by the total number of reads. After GC normalization, read counts were further normalized to healthy controls to avoid position effects. The resulting normalized log2 ratios were segmented. Since driver genes and copy number alterations related to resistance are frequently located on focal amplifications, we specifically called focal events as previously described in detail [32].
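The read-count steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the plasma-Seq implementation: the function names and the toy data are hypothetical, the GC-normalization and segmentation steps are omitted, and the median across controls is an assumed choice of reference.

```python
import math
from collections import Counter
from statistics import median

BIN_SIZE = 50_000  # non-overlapping 50 kb windows, as stated in the text


def count_reads(read_positions, bin_size=BIN_SIZE):
    """Count reads per (chromosome, window index) from (chrom, start) tuples."""
    counts = Counter()
    for chrom, pos in read_positions:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def normalize_by_total(counts):
    """Express each window as a fraction of all counted reads."""
    total = sum(counts.values())
    return {window: n / total for window, n in counts.items()}


def log2_ratios(sample, controls):
    """log2(sample / median of controls), only for windows covered in every control."""
    ratios = {}
    for window, value in sample.items():
        reference = [c[window] for c in controls if c.get(window, 0) > 0]
        if len(reference) == len(controls) and value > 0:
            ratios[window] = math.log2(value / median(reference))
    return ratios


# Toy data (hypothetical): the sample has doubled coverage in window ("chr7", 1).
control = normalize_by_total(
    Counter({("chr7", 0): 100, ("chr7", 1): 100, ("chr7", 2): 100}))
sample = normalize_by_total(count_reads(
    [("chr7", 10)] * 100 + [("chr7", 60_000)] * 200 + [("chr7", 120_000)] * 100))
ratios = log2_ratios(sample, [control, control])
```

In the toy data, the doubled window sits exactly one log2 unit above its neighbors, which is the kind of local contrast that the focal-event criteria evaluate.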
For focal amplifications, the following criteria were used:
3. Segment should contain a gene, but not >100 genes;
4. Log2-ratio must be 0.2 higher than the weighted mean of the log2-ratios of the neighboring 20 Mb on both sides if it contains a known tumor driver gene;
5. Log2-ratio must be 0.58 higher (a log2-ratio of 0.58 translates to about three copies) than the weighted mean of the log2-ratios of the neighboring 20 Mb on both sides if it does not contain a known tumor driver gene;
6. Segment should not contain segmental duplications in >50% of its size;
7. Segment should not overlap with known entries in DGVar.
For focal deletions, the following criteria were used:
1. Segment should be <20 Mb;
2. Segment should contain a gene known to be affected by deletions;
4. Segment should contain a gene, but not >100 genes;
5. Log2-ratio must be 0.2 lower than the weighted mean of the log2-ratios of the neighboring 20 Mb on both sides;
6. Segment should not contain segmental duplications in >50% of its size;
7. Segment should not overlap with known entries in DGVar.
Focal identification was performed using R.
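As an illustration of how such criteria can be applied to a segmented profile, the following Python sketch filters candidate amplification segments. It is not the authors' R code: the `Segment` fields and the function name are hypothetical, only the amplification criteria that are stated explicitly in the text are encoded, and "0.2 higher" is interpreted here as a difference of at least 0.2, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    log2_ratio: float          # log2 ratio of the candidate segment
    neighbor_log2_mean: float  # weighted mean log2 ratio of the flanking 20 Mb on both sides
    n_genes: int               # number of genes within the segment
    has_driver_gene: bool      # contains a known tumor driver gene
    segdup_fraction: float     # fraction of the segment covered by segmental duplications
    overlaps_dgvar: bool       # overlaps a known DGVar entry


def is_focal_amplification(seg: Segment) -> bool:
    """Apply the amplification criteria listed in the text to one segment."""
    # At least one gene, but not more than 100 genes.
    if not 1 <= seg.n_genes <= 100:
        return False
    # Required elevation over the neighborhood depends on driver-gene content.
    required = 0.2 if seg.has_driver_gene else 0.58
    if seg.log2_ratio - seg.neighbor_log2_mean < required:
        return False
    # Segmental duplications must not cover more than 50% of the segment.
    if seg.segdup_fraction > 0.5:
        return False
    # Known structural variants in DGVar are excluded.
    return not seg.overlaps_dgvar


# Hypothetical example: a 12-gene segment containing a driver gene.
candidate = Segment(0.5, 0.1, 12, True, 0.1, False)
print(is_focal_amplification(candidate))
```

The two thresholds reflect the rule stated above: a segment without a known driver gene needs a larger elevation over its neighborhood before it is called.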
To calculate the tumor fraction (TF) in plasma DNA samples, data were analyzed with the previously published ichorCNA algorithm using a 1 Mb bin size [33]. As a cut-off to reliably detect focal SCNAs, we set the detection threshold to a TF of 3% [33]. Focal SCNAs containing genes that, based on a literature review, may be associated with resistance to osimertinib were defined as resistance-related SCNAs (rrSCNAs). To assess the background noise, 10 self-reporting healthy individuals with a similar number of reads (average number of reads for cases 6,569,227, range 4,652,614-8,877,744; average number of reads for controls 6,917,896, range 6,833,913-6,981,005) were analyzed.
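To make the interplay between TF and copy number concrete, the expected log2 ratio of a region can be approximated with a simple two-component mixture of tumor and normal DNA. This is a simplified model and not the ichorCNA method: it assumes diploid normal cells, a clonal alteration, and a near-diploid tumor genome, and the function name is hypothetical.

```python
import math


def expected_log2_ratio(tumor_fraction, tumor_copies, normal_copies=2):
    """Expected log2 ratio of a region under a tumor/normal mixture model."""
    mixed = tumor_fraction * tumor_copies + (1 - tumor_fraction) * normal_copies
    return math.log2(mixed / normal_copies)


# At a TF of 3%, compare a single-copy gain with high-level amplifications.
for copies in (3, 8, 20):
    print(copies, round(expected_log2_ratio(0.03, copies), 3))
```

Under these assumptions, a single-copy gain at a TF of 3% shifts the log2 ratio by only about 0.02, whereas a 20-copy amplification shifts it by about 0.34. In pure tumor DNA (TF = 1), three copies give log2(3/2) ≈ 0.58, the value used in the focal amplification criteria.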

Statistical Analyses
Progression-free survival (PFS) was determined by investigator assessment and was defined as the duration between the first osimertinib dose and progression of disease or death for any cause, whichever occurred first. The overall survival (OS) was calculated from first osimertinib dose to death from any cause.
Tumor response was assessed by contrast-enhanced computed tomography of the chest and abdomen by the local radiologist according to institutional practice. The scan intervals were usually 6 to 8 weeks, at the treating physician's discretion. The response rate (RR) was defined as the percentage of patients showing complete response (CR) or partial response (PR) at restaging after osimertinib initiation.
Patient and tumor characteristics included age, sex, presence or absence of extrathoracic metastases, tissue genotype at diagnosis, previous EGFR-TKI therapy, and tumor fraction in plasma DNA samples. Chi-square test and Fisher's exact test were used to evaluate the association of rrSCNAs with clinical parameters including response to osimertinib. Logistic regression models were used to assess the independent effect of covariables on response. Age and the values of TF in plasma DNA samples were compared by Mann-Whitney-U tests. The Kaplan-Meier method was used to estimate survival probabilities. Differences between survival curves were analyzed by means of the log-rank test. Univariate and multivariate Cox-proportional hazards regression models were used to compare survival outcome according to rrSCNAs. For the multivariate analyses, we used stepwise backward elimination models that included age (as continuous variable), gender (male, female), presence or absence of extra-thoracic metastases (thoracic, extra-thoracic), tissue genotype at diagnosis (EGFR deletions in exon 19, L858R, L861Q), previous EGFR TKI therapy (afatinib, erlotinib, gefitinib, >1 EGFR TKI), TF (as continuous variable), and rrSCNA (present, absent).
All reported p-values are 2-sided and considered significant at the 0.05 level. Statistical analyses were performed using SPSS Statistics software, version 25 (SPSS, IBM Corporation, Armonk, NY, USA).
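For readers unfamiliar with the Kaplan-Meier method named above, the estimator can be sketched in plain Python. The analyses in this study were run in SPSS; the function below and its toy follow-up data are hypothetical and serve only to show how censored observations enter the estimate.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up per patient (e.g., months from first osimertinib dose)
    events: True if progression/death was observed, False if censored
    """
    survival = 1.0
    at_risk = len(times)
    curve = []
    for t in sorted(set(times)):
        at_this_time = [e for tt, e in zip(times, events) if tt == t]
        n_events = sum(at_this_time)
        if n_events:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= len(at_this_time)
    return curve


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
# The median is the first time at which the estimate drops to 0.5 or below.
median_time = next((t for t, s in curve if s <= 0.5), None)
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate differs from a simple proportion of events.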

Patient Characteristics
Plasma samples of 43 advanced EGFR T790M-positive lung adenocarcinoma patients were collected pre-osimertinib and at the time of progression under osimertinib. All patients progressed under treatment with first- or second-generation TKIs and were EGFR T790M-positive based on plasma genotyping by ddPCR prior to the initiation of second-line treatment with osimertinib.
The characteristics of 31 evaluable patients enrolled in this study are summarized in Table 1. All patients had adenocarcinoma histology, stage IV disease at diagnosis and were pretreated with EGFR-TKIs. The median time from initial diagnosis of lung adenocarcinoma until start of osimertinib therapy was 22 months (range 4 to 60 months).
In addition to the assessment of SCNAs by shallow whole-genome plasma sequencing, we identified activating EGFR mutations and EGFR resistance mutations by means of ddPCR. Prior to the initiation of osimertinib therapy, activating EGFR mutations were detected in the plasma of 34/43 (79%) patients, the T790M mutation in all 43 patients, and the C797S mutation in none of the patients (Table 2).

Assessment of SCNAs in Plasma Samples
We performed shallow whole-genome sequencing of ctDNA to assess SCNAs in plasma samples from each patient, collected pre-osimertinib, when the T790M mutation was detectable in plasma and clinical progression on EGFR-TKIs had occurred, and subsequently at the time of clinical progression on osimertinib (Table 2). Profiles of SCNAs in all 86 samples are shown in Figure S1. In addition, a table of all identified SCNAs is available in the supplement (see Table S1). Examples of copy number profiles from a patient who responded to osimertinib and a non-responding patient are shown in Figure 1.
The median TF calculated with ichorCNA for all evaluable samples (n = 80) was 4.9% (range 3.0-42.6%). There was no difference between the median TF at osimertinib initiation (median 5.1%, range 3.0-42.4%) and at the time of progression (median 4.7%, range 3.2-42.6%) (p = 0.79; see Figure S2A). Twenty-one of 39 (54%) pre-osimertinib samples and 18 of 41 (44%) samples at the time of progression under osimertinib had a TF ≥5% (Table 2).
We observed a significant positive correlation between the copy number of the activating EGFR mutations and the TF in plasma samples assessed pre-osimertinib (Spearman Rho 0.46, p = 0.002) (see Figure S3A) and a trend towards higher TF with increasing T790M copy number assessed pre-osimertinib (Spearman Rho 0.30, p = 0.054) (see Figure S3B).
Next, we specifically called focal SCNAs, which often contain clinically relevant genes. These genes were selected based on a literature search. Since ichorCNA-based copy number calling is based on 1 Mb bins and focal genomic amplifications are often narrow, we applied our plasma-Seq-based focal amplification calling algorithm, which is based on a 50 kbp bin approach [32]. Focal amplifications mostly involve high copy numbers, which lead to a regionally increased TF for the affected region; therefore, these events can be detected with a higher resolution than gross SCNAs (single-copy loss/gain) [32]. This revealed focal SCNAs in 44 of 81 (54%) samples, many of which included well-characterized driver genes in lung cancer that were previously associated with resistance to osimertinib (rrSCNAs; resistance-related SCNAs). Of all 44 cases with focal SCNAs, 25 (57%) had a TF ≥5% and 19 (43%) had a TF <5%. The median TF was higher in samples with focal SCNAs than in those without (median 5.6% versus 4.5%, p = 0.02) (see Figure S2B). While a 5% cut-off did not have a significant impact on PFS (p = 0.58), patients with a TF ≥10% had a significantly shorter PFS compared to patients with a TF <10% (p = 0.007) (Figure 2A).
We identified rrSCNAs in EGFR (n = 6), ERBB2 (n = 1), CDK4 (n = 2), CDK6 (n = 1), MDM2 (n = 3), CDKN2A (n = 1), AKT2 (n = 1), and RB1 (n = 1) in pre-osimertinib plasma samples (see Table S2). A simultaneous change in two or more resistance-related genes was observed in four samples (Table 2). rrSCNAs in EGFR (n = 3), ERBB2 (n = 1), CDK4 (n = 2), MDM2 (n = 2), RB1 (n = 1), and AKT2 (n = 1) were present both in pre-osimertinib specimens and in samples taken at the time of osimertinib resistance (Table 2). In some patients, rrSCNAs in EGFR (n = 2), ERBB2 (n = 2), CDK4 (n = 1), MET (n = 1), and PIK3CA (n = 1) were identified only at the time of progression under osimertinib (Table 2).
As the presence of rrSCNAs should be considered in combination with the TF in the plasma sample, the relationship of the TF in samples before osimertinib initiation and at progression under osimertinib is shown in Figure S4.
As some samples had borderline tumor fractions around the detection limit of ichorCNA, we analyzed 10 control samples with a similar number of reads to assess the background noise. The median TF of the controls was 1.1% (range 0-4.2%) and five of them were called as 0. In contrast, the TF of the NSCLC cases ranged from 1-42% with a median of 5% (p < 0.001). Profiles of the control samples are shown in Figure S5.

Clinical Relevance of rrSCNAs
The presence of rrSCNAs cannot be completely excluded in the 12 cases with a TF <5% in which no SCNAs were detected (Table 3). Therefore, these patients were excluded from all outcome analyses. We observed no association between pre-osimertinib rrSCNAs and age, gender, absence or presence of extra-thoracic metastases, tissue genotype at diagnosis, and previous EGFR-TKI therapy (Table 1). However, the TF in plasma samples was significantly higher in samples with rrSCNAs (n = 10) compared to those without detectable rrSCNAs (n = 21) (17.0% versus 5.1%, p < 0.0001) (Table 1).
The osimertinib response rate was 71% (22 of 31 patients) and the disease control rate (DCR) was 74% (23 of 31 patients). The median TF in plasma DNA samples was not different in patients who responded to osimertinib compared to those who did not respond (5.4% versus 7.0%, p = 0.36). Patients without detectable rrSCNAs in pre-osimertinib samples had a better response to osimertinib than patients with detectable rrSCNAs (81% versus 50%, p = 0.08) (Table 4).
Patients with exon 18 or L861Q mutations had a shorter PFS and OS (Table 5). The presence of rrSCNAs in pre-osimertinib samples predicted shorter PFS (median 2.8 months versus 10.4 months; HR 3.33, 95% CI 1.37-8.10, p = 0.008) (Figure 2C) and OS (median 6.7 months versus 18.7 months; HR 2.54, 95% CI 1.09-5.92, p = 0.03) (Figure 2D). Multivariate analyses using stepwise backward elimination models demonstrated that the presence of rrSCNAs was the only significant predictor of shorter PFS and OS after adjusting for clinical parameters (Table 5).
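The response-rate comparison reported above can be approximately reproduced with a Pearson chi-square test on a 2 × 2 table. The cell counts used here are inferred from the reported percentages (50% of 10 patients with rrSCNAs and 81% of 21 patients without, which sum to the 22 responders reported) and are therefore an assumption. The Statistical Analyses section names both chi-square and Fisher's exact tests without stating which one produced this p-value, so the sketch below is a plausibility check rather than a reconstruction of the authors' analysis.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test without continuity correction for [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: rrSCNAs present / absent; columns: responders / non-responders.
# Counts are inferred from the reported percentages (assumption).
stat, p_value = chi_square_2x2(5, 5, 17, 4)
print(round(stat, 2), round(p_value, 2))
```

With these inferred counts the test gives a p-value that rounds to 0.08, consistent with the reported value.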

Discussion
Shallow whole-genome sequencing was recently shown to be useful to characterize the landscape and evolution of SCNAs in plasma ctDNA of prostate cancer and colorectal cancer patients [31,34,35]. In our study, we used shallow whole-genome plasma sequencing to evaluate SCNAs of resistance-related genes in ctDNA of EGFR-mutated lung adenocarcinoma patients who had developed the T790M mutation, progressed after treatment with first- or second-generation EGFR TKIs and were subsequently treated with osimertinib. We detected various rrSCNAs that were described in previous reports to mediate osimertinib resistance. In particular, it has previously been shown that amplifications of wildtype or mutant EGFR [36][37][38], ERBB2 amplifications [39], MET amplifications [40], and rrSCNAs of CDKN2A or CDK4/6 [21,40] are associated with rapid progression on osimertinib and acquired drug resistance. In our study, we observed no MET amplification in plasma samples prior to osimertinib and only one MET amplification at the time of osimertinib resistance, which is in contrast to published frequencies of MET amplification [41]. Notably, the activation of MET signaling can also result from polysomy of chr7 or gain of chr7q, which were identified in a variety of other patients but were not considered focal rrSCNAs.
We observed various patterns of rrSCNAs in plasma samples from each patient. In pattern 1, no rrSCNAs were detected before or after osimertinib. In pattern 2, the same rrSCNAs were present before start of osimertinib treatment and at the time of osimertinib resistance. In pattern 3, rrSCNAs were observed only before start of osimertinib treatment but not at the time of osimertinib resistance. In pattern 4, rrSCNAs were observed only at the time of osimertinib resistance. Moreover, rrSCNAs appeared in combination with the osimertinib resistance EGFR C797S mutation but also independently of C797S. Thus, osimertinib resistance mechanisms are more diverse and complex compared to resistance to first- and/or second-generation TKIs, which is mainly caused by the T790M mutation.
A limitation, however, is that, compared to ddPCR, SCNAs can only be detected in samples with elevated tumor fractions (3-5% and higher). A hard threshold cannot be set because the detection of focal SCNAs depends not only on the TF but also on the amplitude of the SCNA. Although molecular profiling from tumor tissue may lead to higher resolution due to a higher TF, re-biopsies are generally difficult to obtain due to their invasive nature and the poor performance status of many advanced NSCLC patients. Another limitation may be that we used a highly selected patient cohort, and our conclusion may not apply to a general NSCLC cohort.
Regarding activating and resistance EGFR mutations, we confirmed that the development of the C797S mutation is one of the most common EGFR-dependent resistance mechanisms against osimertinib and that C797S typically occurs simultaneously with the activating EGFR mutations and T790M in plasma ctDNA [7,10]. At the time of osimertinib resistance, the T790M mutation was still present in 42% and was undetectable in 58% of the patients, which is in line with other reports [7,40].
The findings of our present study suggest that rrSCNAs in plasma ctDNA detected before second-line therapy with osimertinib are clinically relevant in patients with advanced EGFR-mutated lung adenocarcinoma. The presence of rrSCNAs before the start of osimertinib is associated with shorter PFS and OS of these patients. Therefore, the detection of rrSCNAs before starting osimertinib treatment could be helpful for guiding treatment in the future. According to our results, patients in whom no rrSCNAs are detectable should continue with osimertinib alone. However, patients with detectable rrSCNAs in plasma ctDNA may need to change treatment due to their poor outcome. They may benefit from adding chemotherapy or other treatments to osimertinib.
This treatment strategy is strengthened by the results of two phase III trials in the first-line setting, in which the combination of gefitinib with chemotherapy resulted in longer PFS in both trials and longer OS in one of these trials compared to gefitinib alone [42,43]. Therefore, the combination of osimertinib with chemotherapy requires further study in clinical trials in patients with detectable rrSCNAs. Other combination therapies with osimertinib have already entered clinical trials, e.g., the combination of osimertinib with the VEGF inhibitor bevacizumab (NCT02803203) or with the EGFR inhibitors necitumumab (NCT02496663) or dacomitinib (NCT03810807).

Conclusions
Our study contributes to a comprehensive view of the evolution of the tumor genome during the treatment of EGFR-mutated lung adenocarcinoma patients. Furthermore, our results indicate that shallow whole-genome plasma sequencing in EGFR-mutated lung adenocarcinoma patients provides clinically relevant information. Patients with detectable rrSCNAs in plasma ctDNA before starting osimertinib have shorter survival and may require other treatments such as the combination of osimertinib with chemotherapy, chemoimmunotherapy or other drugs. These treatment options should be explored within clinical trials in the future and may further improve the outcome of patients with advanced EGFR-mutated lung adenocarcinoma.

Supplementary Materials: Figure S1. Shallow whole-genome plasma sequencing profiles before initiation of osimertinib and at the time of progression to osimertinib. Figure S2. Tumor fraction in plasma DNA samples collected before initiation of osimertinib and at the time of progression to osimertinib (A) and in samples with or without SCNAs (B). Figure S3. Correlation between the log10-transformed copy number of the activating EGFR mutations (A), T790M (B) and the TF in plasma samples. Figure S4. Relationship of the tumor fraction in plasma samples (n = 13) with detectable rrSCNAs before start of osimertinib therapy and at the time of progression to osimertinib. Figure S5. Shallow whole-genome plasma sequencing profiles of control samples. Table S1. SCNAs detected by shallow whole-genome plasma sequencing. Table S2. SCNAs in resistance-related genes detected by shallow whole-genome sequencing.

Informed Consent Statement: The study protocol as well as the consent form were approved by the local Ethics Committee (EC No: 1132/2016) and all patients gave their written informed consent for providing blood samples for plasma genotyping.
Data Availability Statement: All data generated or analyzed during this study are included in this published article (and its supplementary information files). Shallow whole genome sequencing data have been deposited at the European Genome-phenome Archive (EGA; http://www.ebi.ac.uk/ega/), which is hosted by the EBI, under the accession number EGAS00001004539.