Analysis of Copy Number Variations in Solid Tumors Using a Next Generation Sequencing Custom Panel

Abstract: Somatic copy number variations (CNVs; i.e., amplifications and deletions) have been implicated in the origin and development of multiple cancers, and some of these aberrations are designated targets for therapies. Although FISH is still considered the gold standard for CNV detection, the increasing number of potentially druggable amplifications to be assessed makes a gene-by-gene approach time- and tissue-consuming. Here we investigated the potential of next generation sequencing (NGS) custom panels to simultaneously determine CNVs across FFPE solid tumor samples. DNA was purified from cell lines and FFPE samples and analyzed by NGS using a 20-gene custom panel on the GeneReader Platform®. CNVs were identified using an in-house algorithm based on the UMI read coverage. Retrospective validation of the in-house algorithm for identifying CNVs showed a 97.1% concordance rate with the NGS custom panel. The prospective analysis was performed in a cohort of 243 FFPE samples from patients arriving at our hospital, which included 74 NSCLC tumors, 148 CRC tumors, and 21 other tumors. Of them, 33% presented CNVs by NGS and in 14 cases (5.9%) the CNV was the only alteration detected. We have identified CNV alterations in about one-third of our cohort, including FGFR1, CDK6, CDK4, EGFR, MET, ERBB2, BRAF, or KRAS. Our work highlights the need to include CNV testing as a part of routine NGS analysis in order to uncover clinically relevant gene amplifications that can guide the selection of therapies.


Introduction
Somatic genetic alterations in solid tumors have clinical relevance and some of them predict response to targeted therapies. Among them, mutations and copy number variations (CNVs) have been extensively investigated. CNVs, by definition, are intermediate structural variations and include amplifications and deletions of a particular segment of chromosomal DNA between 1 Kb and 5 Mb.
Point mutations have been considered in clinical guidelines as the most relevant genetic alterations, and many efforts have been devoted to identifying them by molecular testing [1]. However, clinically significant alterations can range from nucleotide-level insertions/deletions to entire chromosomes [2], and it is currently known that somatic CNVs are associated with the development and progression of numerous cancers by impacting the gene expression level [3]. Traditionally, gold standard methods for the detection of CNVs include fluorescence in situ hybridization (FISH), multiplex ligation-dependent probe amplification (MLPA), comparative genomic hybridization microarrays, and SNP arrays [4]. However, these techniques have disadvantages such as tissue consumption, limited coverage, low resolution, and high cost.
Nowadays, the clinical use of high-throughput methods, and in particular targeted next-generation sequencing (NGS), allows the identification of mutations and CNVs using a limited quantity of biological material and in a rapid and cost-effective manner. Algorithms for detection of clinically relevant variations from NGS data rely on one or more of the following methods: discordant paired-end reads, split reads, or depth of coverage (DOC). The performance of each of these detection methods depends on the sequencing data available [2,5]. CNV calling using targeted NGS data most commonly uses the depth of coverage assessment approach, which is based on the assumption that the DOC signal is proportional to the number of copies of chromosomal segments present in that specimen. The workflows perform an additional coverage analysis on a number of target regions defined for each of the CNV target genes or exons. The observed coverage is compared to coverage profiles of control samples known to not have any CNVs in the relevant genes.
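The depth-of-coverage principle described above can be illustrated with a minimal sketch. It assumes that per-gene coverage values are already available for the test sample and for a set of CNV-negative controls. The function name, the toy coverage values, and the choice of the median as the control baseline are illustrative assumptions and do not correspond to the workflow of any specific tool.

```python
from statistics import median


def doc_copy_ratio(sample_cov, control_covs):
    """Estimate per-gene copy ratios from depth of coverage (DOC).

    sample_cov: dict mapping gene -> coverage in the test sample.
    control_covs: list of dicts with the same genes, from CNV-negative controls.
    Returns a dict mapping gene -> ratio (about 1.0 means no change).
    """
    def normalize(cov):
        # Divide by total coverage so that samples sequenced at
        # different depths become comparable.
        total = sum(cov.values())
        return {gene: value / total for gene, value in cov.items()}

    sample_norm = normalize(sample_cov)
    controls_norm = [normalize(c) for c in control_covs]

    ratios = {}
    for gene, value in sample_norm.items():
        # Baseline: median normalized coverage of this gene in the controls
        # (an assumption made for this sketch).
        baseline = median([c[gene] for c in controls_norm])
        ratios[gene] = value / baseline
    return ratios


# Toy data (hypothetical values, three genes only).
controls = [
    {"EGFR": 1000, "MET": 1000, "KRAS": 1000},
    {"EGFR": 2000, "MET": 2000, "KRAS": 2000},
]
sample = {"EGFR": 3000, "MET": 1000, "KRAS": 1000}
ratios = doc_copy_ratio(sample, controls)
```

In this toy case the EGFR ratio rises well above 1, while the ratios of the other genes fall below 1 simply because they share the same normalization total. With only a few targets, a strong gain in one gene can therefore mimic losses elsewhere, which is one reason why the choice of normalization matters in small panels.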
CNV analysis based on targeted NGS is technically challenging. The small size and non-contiguous nature of target regions prevent the application of algorithms designed to analyze whole genome sequencing to targeted gene panel data. The main intrinsic biases to be solved are the variation in GC content between genes, the presence of highly homologous regions, poor mappability, and technical issues such as library preparation, capture, and sequencing efficiencies. In addition, unlike FISH analysis, targeted panels are affected by whole-arm or whole-chromosome gains/losses, which influence both the handling of the data and the results obtained. In this type of analysis, we can only distinguish focal from whole-arm/chromosome events when several genes of the panel are located in the same arm/chromosome. If they are all amplified with similar copy numbers, a whole-arm or whole-chromosome amplification can be suspected.
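The reasoning about focal versus whole-arm events can be sketched as follows. The example assumes that an estimated copy number per gene is available as input, which is an assumption of this sketch. The gene-to-arm map covers only four genes with well-known cytogenetic locations, and the gain cut-off and similarity tolerance are hypothetical values chosen for illustration, not validated thresholds.

```python
from collections import defaultdict

# Chromosome-arm locations of four genes (7p11.2, 7q21.2, 7q31.2, 7q34).
GENE_ARM = {"EGFR": "7p", "CDK6": "7q", "MET": "7q", "BRAF": "7q"}


def classify_gains(copy_number, gain_cutoff=3.0, tolerance=1.0):
    """Label each gained gene as focal, possible arm-level, or undetermined.

    copy_number: dict mapping gene -> estimated copy number.
    gain_cutoff, tolerance: illustrative values, not validated cut-offs.
    """
    by_arm = defaultdict(list)
    for gene, arm in GENE_ARM.items():
        if gene in copy_number:
            by_arm[arm].append(gene)

    labels = {}
    for arm, genes in by_arm.items():
        values = [copy_number[g] for g in genes]
        all_gained = all(v >= gain_cutoff for v in values)
        similar = max(values) - min(values) <= tolerance
        for g in genes:
            if copy_number[g] < gain_cutoff:
                continue  # not gained, nothing to label
            if len(genes) < 2:
                # A single gene on the arm cannot separate the two scenarios.
                labels[g] = "undetermined"
            elif all_gained and similar:
                labels[g] = "possible arm-level"
            else:
                labels[g] = "focal"
    return labels


arm_like = classify_gains({"EGFR": 2.0, "CDK6": 4.1, "MET": 4.4, "BRAF": 3.9})
focal_like = classify_gains({"EGFR": 2.0, "CDK6": 2.1, "MET": 9.0, "BRAF": 2.0})
single = classify_gains({"EGFR": 6.0})
```

In the first call all three 7q genes are gained to a similar extent, so an arm-level event is suspected. In the second only MET is gained, which points to a focal amplification. In the third no other gene on 7p is available, so the event cannot be classified.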
The objective of this study was to investigate the potential of NGS custom panels for multiplex detection of CNVs across formalin-fixed, paraffin embedded (FFPE) solid tumor samples.

Patients and Cell Lines
The study was conducted in accordance with the Declaration of Helsinki under an approved protocol of the research ethics committee of the Quiron Salud hospital group (nº52/2018), and samples were de-identified for patient confidentiality. Informed written consent was obtained from all subjects.
First, we performed a retrospective validation with 70 samples: 20 cell lines (14 in-house EGFR TKI-resistant cell lines and 6 purchased from the American Type Culture Collection) and 50 NSCLC FFPE samples (40 at baseline, 10 after progression to selective inhibitors). All of them had been previously genotyped by other commercial NGS panels or, where possible, by FISH.
Next, from September 2019 to December 2020, we prospectively screened by NGS analysis 243 FFPE samples from different types of solid tumors, corresponding to patients that visited our oncology service (Supplementary Table S1).

Tissue Dissection and DNA Purification
Pathological evaluation of the FFPE samples was performed prior to tissue collection for NGS analysis. The percentage of tumor infiltration was evaluated and, in samples with less than 25% tumor infiltration, tumor content was enriched by macro- or microdissection of selected areas with a high percentage of tumor.
For DNA purification of cell lines and FFPE samples, we used the DNeasy Blood & Tissue Kit and the GeneRead DNA FFPE Kit, respectively (QIAGEN, Hilden, Germany), following the manufacturer's instructions. In both cases, DNA concentration was measured by Qubit®. Samples with a DNA concentration ≥ 2.5 ng/µL were diluted to this concentration.
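The dilution step follows the usual C1·V1 = C2·V2 relationship. A minimal sketch is given below, with a hypothetical helper name and example values that are not taken from the study.

```python
def diluent_volume(stock_conc, stock_volume, target_conc=2.5):
    """Volume of diluent (µL) needed to bring a DNA stock to target_conc.

    stock_conc: measured concentration in ng/µL.
    stock_volume: volume of stock to be diluted, in µL.
    target_conc: desired concentration in ng/µL (2.5 ng/µL in the text).
    """
    if stock_conc < target_conc:
        raise ValueError("stock is already below the target concentration")
    # C1 * V1 = C2 * V2  ->  V2 = C1 * V1 / C2
    final_volume = stock_conc * stock_volume / target_conc
    return final_volume - stock_volume


# Hypothetical example: 5 µL of a 10 ng/µL extract.
to_add = diluent_volume(10.0, 5.0)
```

For the hypothetical extract above, 15 µL of diluent brings 5 µL of stock to a final volume of 20 µL at 2.5 ng/µL.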

NGS Sequencing Analysis
NGS was performed with the GeneReader Platform® (QIAGEN, Hilden, Germany), an all-in-one platform (from sample preparation to bioinformatic analysis of the data obtained) with a biomedical and clinical focus. The GeneRead™ QIAact panels integrate Unique Molecular Index (UMI) technology in combination with a specially formulated enrichment chemistry to achieve efficient sequencing of GC-rich regions, enabling variant detection of targeted genomic regions by NGS on the GeneReader system. The genes and exons included in the custom panel are described in Supplementary Table S2. The custom panel was based on a 16-gene commercially available panel, which was modified according to the clinical needs of the oncology department of our hospital.
Libraries were quantified using a QIAxcel® Advanced System (QIAGEN, Hilden, Germany) and the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Carlsbad, CA, USA), diluted to 100 pg/µL, and pooled. Clonal amplification was performed on 625 pg of pooled libraries with the GeneRead Clonal Amp Q Kit using the GeneRead QIAcube and an automated protocol. Following bead enrichment, pooled libraries were sequenced using the GeneRead UMI Advanced Sequencing Q Kit in a GeneReader instrument.
QIAGEN Clinical Insight Analyze (QCI-A) software was used to perform the secondary analysis of FASTQ reads, align the read data to the hg19 reference genome sequence, call sequence variants, and generate a report for visualization of the sequencing results. Variants were imported into the QIAGEN Clinical Insight Interpret (QCI-I) web interface for data interpretation and generation of the final custom report. A sample was considered not evaluable if the percentage of base positions in regions of interest with UMI coverage >100× was <20%. Regions that were untested or had poor coverage were manually inspected in all samples. If they affected hotspots for clinically relevant mutations or CNVs, the sample was reported as "not evaluable" for the corresponding genes.
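The evaluability criterion can be written as a short check. This is a minimal sketch that assumes per-base UMI coverage values for the regions of interest are available as a list. The function name and the toy data are hypothetical, and the sketch does not reproduce the QCI-A implementation.

```python
def is_evaluable(umi_coverage, min_depth=100, min_fraction=0.20):
    """Return True if enough region-of-interest positions exceed min_depth.

    umi_coverage: per-base UMI coverage values in the regions of interest.
    A sample is treated as not evaluable when the fraction of positions
    with coverage > min_depth is below min_fraction.
    """
    depths = list(umi_coverage)
    if not depths:
        return False  # nothing covered, nothing to evaluate
    fraction = sum(d > min_depth for d in depths) / len(depths)
    return fraction >= min_fraction


# Toy examples (hypothetical coverage values).
borderline = is_evaluable([150] * 2 + [50] * 8)   # 20% of positions pass
poor = is_evaluable([150] * 1 + [50] * 9)         # 10% of positions pass
```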

CNV Analysis Prediction: Algorithm
CNVs were identified using an in-house algorithm. First, for each sample, we calculated the sum of median UMI read coverage for the 20 genes in the panel. Then, the UMI read coverages for each gene were normalized using this sum (so-called Ngx). Next, we calculated the mean and standard deviation of the normalized coverages for each gene across all samples analyzed. A gene was considered amplified if Ngx ≥ mean +2 SD. Deletions were considered true if Ngx was ≤ mean −2 SD.
The mean value was generated from the validation cohort data, and the resulting CNV calls were compared with those automatically obtained by the QCI-A software as well as with the FISH results (in cases where it was feasible).
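The steps of the algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the input is a dictionary of median UMI read coverage per gene for each sample, the standard deviation is computed as the sample SD (the text does not specify which estimator was used), and the function names and the four-gene toy data are hypothetical. The actual panel contains 20 genes.

```python
from statistics import mean, stdev


def normalize(sample):
    """Ngx: median UMI coverage of each gene divided by the per-sample sum."""
    total = sum(sample.values())
    return {gene: cov / total for gene, cov in sample.items()}


def reference_stats(reference_samples):
    """Per-gene mean and SD of Ngx across a reference set of samples."""
    normalized = [normalize(s) for s in reference_samples]
    stats = {}
    for gene in normalized[0]:
        values = [n[gene] for n in normalized]
        stats[gene] = (mean(values), stdev(values))  # sample SD (assumption)
    return stats


def call_cnvs(sample, stats, k=2.0):
    """Flag genes whose Ngx lies at least k SD above or below the mean."""
    calls = {}
    for gene, ngx in normalize(sample).items():
        mu, sd = stats[gene]
        if ngx >= mu + k * sd:
            calls[gene] = "amplification"
        elif ngx <= mu - k * sd:
            calls[gene] = "deletion"
    return calls


# Toy reference set (hypothetical median UMI coverages, four genes only).
reference = [
    {"EGFR": 100, "MET": 100, "KRAS": 100, "BRAF": 100},
    {"EGFR": 120, "MET": 80, "KRAS": 110, "BRAF": 90},
    {"EGFR": 80, "MET": 120, "KRAS": 90, "BRAF": 110},
]
stats = reference_stats(reference)

amp_calls = call_cnvs({"EGFR": 180, "MET": 100, "KRAS": 100, "BRAF": 100}, stats)
del_calls = call_cnvs({"EGFR": 100, "MET": 50, "KRAS": 100, "BRAF": 100}, stats)
```

With these toy values, the first sample is called as an EGFR amplification and the second as a MET deletion. Because each gene is normalized by the sum over all genes, a very strong gain in one gene lowers the normalized values of the others. This effect is more pronounced in a four-gene toy example than it would be in a 20-gene panel.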

CNV Analysis in the Validation Cohort
First, a retrospective study was performed to validate the CNV identification by NGS. A total of 70 samples, previously genotyped by other methodologies, were selected for the validation cohort, including 20 cell lines and 50 FFPE non-small cell lung cancer (NSCLC) tumors, 10 biopsies at progression to selective inhibitors and 40 baseline.
For cell lines, a 100% concordance was observed for EGFR, ERBB2, and MET CNVs between our in-house NGS algorithm and FISH. Additionally, in some cases, amplification of other genes such as NRAS, KIT, CDK4, CDK6, or RICTOR was observed with the CNV algorithm. However, these CNVs could not be validated due to the lack of specific probes for FISH analysis (Supplementary Table S3).
Similar results were obtained for the retrospective analysis of the 50 NSCLC FFPE samples. In 32/50 cases (64%), we had previous data of positivity for EGFR, ERBB2, MET, or FGFR1 CNVs by FISH or the commercial NGS GeneRead™ QIAact Lung DNA Panel (Supplementary Figure S1, Supplementary Table S4), while the remaining 18/50 (36%) were negative for these four genes. The data showed a 96% concordance between our in-house NGS algorithm, FISH, and the previous results obtained with the commercial NGS panel. In only two cases (4%), discrepancies were observed between FISH and the results of our in-house CNV algorithm. Both cases (patients 43 and 44) corresponded to EGFR-mutated patients after progression to tyrosine kinase inhibitors (TKIs). In the first, ERBB2 amplification was detected by NGS; however, FISH was negative given the specified criteria (1.5 ratio and an average of 4.7 copies per gene) [6]. In the second case, MET amplification was also detected by NGS, but FISH was negative according to criteria (0.9 MET/CEN ratio and 3.8 average copies) [7].
Among the 50 patients, 25 showed additional CNVs in other genes such as PIK3CA, NRAS, KRAS, etc. (not included in FISH or in the CNV analysis of the commercial panel).
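The overall concordance for the validation cohort follows from pooling the cell line and FFPE subsets. The short arithmetic check below uses only the counts stated in the text; the helper name is hypothetical.

```python
def concordance(concordant, total):
    """Percentage of samples with agreeing results."""
    return 100.0 * concordant / total


cell_lines = concordance(20, 20)           # all 20 cell lines concordant
ffpe = concordance(50 - 2, 50)             # 2 discordant FFPE cases
overall = concordance(20 + 48, 20 + 50)    # pooled validation cohort
discordant = 100.0 - overall

print(f"{cell_lines:.1f}% {ffpe:.1f}% {overall:.1f}% {discordant:.1f}%")
# prints "100.0% 96.0% 97.1% 2.9%"
```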

Mutation and CNV Status in the Prospective Cohort
We employed NGS (GeneRead™ QIAact Custom Solid Tumor Panel, Qiagen) for prospective mutation and CNV analysis in 243 FFPE solid tumor samples. In four of them (4/243, 1.6%), the quality of the sequencing was not adequate to give reliable results (Figure 1). Mutations and/or CNVs were found in the majority of the cases with NGS results (93.3%), leaving four lung (1.7%), eight CRC (3.3%), and four other pathologies (1.7%) without any clinically relevant alteration detected (Figure 1).
Of the samples with NGS results, 79/239 patients (33%) showed CNV alterations in one or more of the 20 genes of the panel, which encode druggable or potentially druggable protein products (Figure 2). As shown in Figure 1, in 82.3% of patients with CNVs (65/79), the CNVs were accompanied by relevant mutations, either in the same gene or in other genes of the panel. Overall, we identified 108 CNVs, either gene amplifications or deletions. Amplification was the most prevalent CNV (101/108, 93.5%), while deletions represented 6.5% (7/108) of the total CNVs.

CNV in Baseline vs. Progression Samples
An additional analysis considering baseline vs. progression samples showed differences in the CNV pattern, depending on the type of tumor. For NSCLC, 7/20 patients after progression to therapy showed CNV alterations (Figure 3). Three of them corresponded to EGFR-mutated patients after progression to TKIs. EGFR amplification was observed in all three cases. Additionally, concomitant ERBB2 amplification (patient 58) and KRAS and TP53 amplifications (patient 9) were observed as potential acquired resistance mechanisms. The remaining four patients progressed to platinum-based chemotherapy. FGFR1 amplification was observed in three cases (patients 3, 52, 56), while CDK6 amplification was detected in the fourth patient.
Conversely, a more characteristic pattern was observed in the 43/142 samples at diagnosis harboring CNVs, with ERBB2, MET, CDK6, or FGFR1 amplification among the most frequent.

Discussion
Although FISH is still considered the gold standard for CNV detection, the increasing number of potentially druggable amplifications to be assessed makes a gene-by-gene approach time- and tissue-consuming. In this study, we present the results of CNV testing obtained in routine clinical testing after the implementation of a 20-gene NGS custom panel (Supplementary Table S2). CNVs were determined using an in-house algorithm that, in contrast to other widely used detection tools [10,11], is simple, fast, and does not require advanced bioinformatics knowledge or computer programs. In addition, our algorithm can be used in any NGS platform as long as the UMI read coverage for each gene can be obtained.
First, we performed a validation study in cell lines (n = 20) and FFPE NSCLC cases (n = 50) with known EGFR, ERBB2, and MET FISH results, achieving a 97.1% concordance with the NGS custom panel. In addition, CNV alterations were observed in genes such as CDK4, CDK6, RICTOR, KIT, and NRAS, which could not be corroborated by FISH due to the lack of specific probes. This finding highlights the ability of NGS to detect amplifications in genes with potential therapeutic or prognostic implications not routinely analyzed by FISH. NRAS amplification is frequently found in melanoma patients and could predict poor prognosis [12]. Moreover, cell models with NRAS amplification have been shown to be sensitive to the MEK inhibitor binimetinib, indicating that this gain could be a new therapeutic target [12]. Melanoma patients with KIT amplification have been reported to derive clinical benefit from imatinib [13] and frequent copy number gains of KIT have been described in squamous cell carcinoma of the lung [14]. Furthermore, CDK4/6 amplifications have been associated with longer PFS in hormone receptor-positive, HER2-negative metastatic breast cancer patients treated with CDK4/6 inhibitors [15]. Finally, RICTOR amplification has been proposed as a mechanism of resistance to TKIs and potential therapeutic target in this setting [16].
The only two discordant cases observed (2/70, 2.9%) corresponded to samples at progression to EGFR TKIs, positive for MET and ERBB2 amplification by NGS but negative by FISH. We attribute the discrepancy between NGS and FISH to a "cut-off" issue. Both cases were polysomic by FISH with a high copy number (4.7 and 3.8) but with ratios of 1.5 and 0.9, which were below the FISH cut-off. In contrast, the NGS values obtained by our in-house algorithm, which takes into account gene vs. average coverage, were close to but above the cut-off. In view of the clinical characteristics, these two cases are likely to be truly amplified (and "false negatives" of FISH). They were both EGFR-mutated patients in progression to EGFR TKIs, showing MET and ERBB2 amplification and with no other mechanisms of resistance. In addition, the patient with MET amplification showed a good response to a bispecific anti-EGFR/MET monoclonal antibody.
After retrospective validation, we prospectively tested 243 FFPE samples from patients arriving at our hospital. Of them, 33% presented CNVs by NGS and in 14 cases (5.9%) the CNV was the only alteration detected, highlighting the need to include CNV analysis in multiplex testing of somatic alterations. Comprehensive studies of CNVs in solid tumors are unfortunately scarce in the literature. Despite this fact and the relatively small size of our cohort, the spectrum of CNVs observed in our study was consistent with previous reports. Thus, ERBB2, MET, or EGFR amplifications were frequently found in CRC, while FGFR1 and EGFR CNVs were present in a small but significant percentage of lung tumors [17–20].
FGFR1 was the most frequently amplified gene in NSCLC samples (28%), being also present in 17% of CRC samples analyzed. Interestingly, among NSCLC patients progressing to platinum-based chemotherapy, FGFR1 amplification was observed in 3/4 (75%) of cases. To date, there are no data in the literature about the potential role of this alteration as a mechanism of resistance to chemotherapy. FGFR1 CNVs have been reported as a relatively frequent event in different tumor types [21–23]. Several kinase inhibitors with activity against receptors of the FGFR family have been developed, such as Abemaciclib, AZD4547, or Regorafenib, and some of them are currently being tested in clinical trials in patients with FGFR1 amplification [24,25].
A relatively high incidence of CDK4/6 amplifications (around 35% of cases) was apparent in our cohort, suggesting widespread alteration in cell cycle control. As mentioned, CDK4/6 amplifications have been associated with better outcome in breast cancer patients treated with CDK4/6 inhibitors. In lung cancer, recent in vitro research has suggested that the CDK4/6 inhibitor palbociclib in combination with taxanes might be useful in SqCLC [15,26].
EGFR amplification was detected in 20% and 12.8% of NSCLC and CRC samples, respectively. As described, EGFR amplification was associated with EGFR mutations in the case of NSCLC samples. Among the EGFR-mutated patients, 2/3 cases at baseline and 3/3 cases after progression to EGFR TKIs harbored concomitant EGFR amplification. While this alteration has been associated with acquired resistance to targeted therapies [27], its role at baseline is controversial [19,28,29]. Biomarker-directed therapies are approved for MET amplifications in NSCLC and for ERBB2 amplifications in CRC, while clinical trials are in progress for EGFR amplifications in NSCLC and for BRAF amplifications in other types of tumors.

Conclusions
In summary, we have identified amplifications of druggable or potentially druggable targets in about one-third of the 243 patients of our cohort, including FGFR1, CDK6, CDK4, EGFR, MET, ERBB2, or BRAF. However, in some cases, such as BRAF, ERBB2, EGFR, or MET, gene amplification could confer resistance to certain therapies [27,30–33]. Our work highlights the need to include CNV testing as a part of routine NGS analysis in order to uncover clinically relevant gene amplifications that can guide the selection of therapies.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/jmp2020013/s1, Table S1: Characteristics of patients included in the prospective cohort, Table S2: Description of genes and exons included in the GeneRead™ QIAact Custom Solid Tumor Panel, Table S3: Results of CNV alterations observed in the twenty cell lines included in the validation cohort, Table S4: Characteristics of the commercial GeneRead™ QIAact Lung DNA Panel (Qiagen), Figure S1: Heatmap of the FFPE NSCLC patients included in the retrospective cohort (n = 50).