Cell Models for Chromosome 20q11.21 Amplification and Drug Sensitivities in Colorectal Cancer

Background and objectives: The chromosome locus 20q11.21 is a commonly amplified locus in colorectal cancer, with a prevalence of 8% to 9%. Several candidate cancer-associated genes are transcribed from the locus. The therapeutic implications of the amplification in colorectal cancer remain unclear. Materials and Methods: Preclinical cell line models of colorectal cancer included in the Cancer Cell Line Encyclopedia (CCLE) collection were examined for the presence of amplifications in 20q11.21 genes. Correlations of the presence of 20q11.21 amplifications with gene essentialities and drug sensitivities were surveyed on salient databases for determination of therapeutic leads. Results: A significant subset of colorectal cancer cell lines in the CCLE (12 of 63 cell lines, 19%) bear amplifications of genes located at 20q11.21. Cancer-associated genes of the locus include ASXL1, DNMT3B, BCL2L1, TPX2, KIF3B and POFUT1. These genes are all amplified in the 12 cell lines, but they are variably over-expressed at the mRNA level, compared to non-amplified lines. 20q11.21 amplified cell lines are sensitive to various tyrosine kinase inhibitors and are resistant to chemotherapy drugs targeting the mitotic apparatus and microtubules. CRISPR and RNAi dependencies screening revealed, besides the β-catenin and KRAS genes, a few recurrent gene dependencies in more than one cell line, including YAP1 and JUP. Conclusions: Cell line models of colorectal cancer with 20q11.21 gene amplifications display dependencies on the presence of specific genes and resistance or sensitivity to specific drugs and drug categories. Observations from in vitro models may form the basis for clinical drug development in this subtype of colorectal cancer. Genetic lesions conferring synthetic lethality to certain drugs or categories of drugs could be discovered with this approach.


Introduction
Development of targeted therapies for cancer based on the underlying molecular defects that drive carcinogenesis has improved cancer patient outcomes in recent years. New candidate drugs are often first tested in preclinical in vitro models consisting of cell lines cultured in highly artificial conditions [1]. Thus, their relevance for capturing the in vivo environment where the drugs under development will be acting is debatable. Moreover, tumors in situ consist not only of the tumor cells but also of a variety of supporting cells, as well as tumor-attacking or tolerogenic immune cells and soluble factors (cytokines, chemokines and hormones) that both promote and antagonize tumor cell survival and proliferation [2]. Initial anti-cancer drug testing for identification of candidates for further development is performed with the aid of high-throughput screens, which, by their design, do not take into consideration the molecular characteristics of the tested cell lines. However, many of the tested drugs have a known molecular target, and their efficacy could be enhanced if the target is expressed and critical for the survival of neoplastic cells. In contrast, these drugs may be ineffective, even if pharmacologically and pharmacodynamically appropriate concentrations have been achieved, when the target is not expressed or is not critical in a given cancer [3]. Taking these factors into consideration, a molecularly informed drug development model that starts in the pre-clinical, in vitro phase could aid the further clinical advancement of drugs by pairing each drug with its target. Such a matched approach from early in the development process could reduce the high attrition rates that hamper cancer drug discovery [4].
Amplification of specific loci is common in cancers and is often observed with tumor specificity, being more frequent in specific types of cancers. Perhaps one of the best-known amplifications with clinical therapeutic implications is observed in a subset of breast cancers at chromosome 17q. The locus includes the oncogene ERBB2, which encodes the HER2 receptor protein, and its amplification confers sensitivity to monoclonal antibodies and small-molecule tyrosine kinase inhibitors blocking the receptor [5]. In another cancer with high prevalence, colorectal adenocarcinoma, a commonly amplified region is at chromosome locus 20q11.21, which contains several potential oncogenic drivers and is amplified in about 10% of cases in the colorectal cancer cohort from TCGA [6]. In other cancers, amplifications of the 20q11.21 locus are rare: they are encountered in 2.5% of head and neck carcinomas and in 1% to 1.5% of bladder carcinomas, small-cell lung carcinomas, esophagogastric and hepatobiliary cancers, which are the other cancers with the highest prevalence of the amplification [7]. The amplified 20q11 region in colorectal cancers extends in many cases to neighboring loci, while in other cases, it is more restricted [7]. Genes that are located in the commonly amplified region include ASXL1, DNMT3B, BCL2L1, TPX2, KIF3B and POFUT1. The expression of the resulting mRNA in 20q11.21 amplified colorectal cancers is variable. Some of the amplified genes are rarely over-expressed at the mRNA level, while others, such as ASXL1, KIF3B and POFUT1, are over-expressed in most amplified cancers. ASXL1 is over-expressed in 88.9% of 20q11.21 amplified colorectal cancers in the TCGA cohort, while POFUT1 is over-expressed in 90.5% of 20q11.21 amplified cases in the same cohort [7]. Genes that are over-expressed when amplified are putative drivers in the oncogenic process and may promote selection of the amplicon.
This investigation examines drug sensitivity of colon and other cancer cell lines with amplification of genes at 20q11.21 for drugs that target driver genes of the amplicon. Dependencies of these cell lines to genomic alterations are also examined.

Methods
Cancer cell lines included in the current investigation constitute part of the Cancer Cell Line Encyclopedia (CCLE) collection [8]. The cBioPortal for Cancer Genomics platform was used to probe CCLE for molecular abnormalities in colorectal cancer cell lines with amplification of genes located at 20q11.21 [9]. cBioPortal (http://www.cbioportal.org, accessed on 3 April 2021) is a user-friendly, open-access platform for genomic analysis of tumors and cancer cell lines [9]. Additionally, genomic data of colorectal cancer patients from The Cancer Genome Atlas (TCGA) study cohort [6] were analyzed using cBioPortal. Subsets of cell lines and colorectal cancers with or without amplifications of genes at the 20q11.21 locus were identified. TCGA employs whole-exome sequencing to discover mutations, copy number alterations, and fusions in cohorts of patients with various types of cancer. Analysis of copy number alterations in TCGA is performed with the GISTIC (Genomic Identification of Significant Targets in Cancer) algorithm, in which a score of 2 or above denotes putative amplification of a gene [10]. An Aneuploidy Score (AS) is presented as a measure of chromosomal instability of cancers and is defined, for each patient sample included in the study, as the number of chromosome arms that display copy number alterations (gains or losses) [11]. A chromosome arm is considered copy number-altered based on the length of alterations, as calculated by the ABSOLUTE algorithm from Affymetrix 6.0 SNP arrays [12]. An arm was called altered when somatic copy number alterations covered more than 80% of its length. Alterations in 20% to 80% of a given arm length were considered inadequate to call, while chromosomal arms with somatic copy number alterations in less than 20% of the arm length were considered not altered. TCGA also includes mRNA expression analysis. The RSEM algorithm is used for normalization of mRNA expression [13].
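The calling rules described above can be illustrated with a minimal Python sketch. It is not the TCGA or ABSOLUTE implementation: the function names, the arm labels and the per-arm altered fractions are hypothetical, and only the thresholds (GISTIC score of 2 or above; more than 80%, 20% to 80%, less than 20% of arm length) are taken from the text.

```python
from typing import Dict, Optional

ALTERED_MIN = 0.80    # more than 80% of the arm altered -> arm called altered
UNALTERED_MAX = 0.20  # less than 20% of the arm altered -> arm called not altered


def is_amplified(gistic_score: int) -> bool:
    """A GISTIC score of 2 or above denotes putative amplification."""
    return gistic_score >= 2


def call_arm(fraction_altered: float) -> Optional[bool]:
    """Return True (altered), False (not altered) or None (inadequate to call)."""
    if fraction_altered > ALTERED_MIN:
        return True
    if fraction_altered < UNALTERED_MAX:
        return False
    return None  # 20% to 80% of the arm length: not called


def aneuploidy_score(arm_fractions: Dict[str, float]) -> int:
    """Count the chromosome arms called altered in one sample."""
    return sum(1 for f in arm_fractions.values() if call_arm(f) is True)


# Hypothetical sample: fraction of each arm covered by copy number alterations.
sample = {"20q": 0.95, "8q": 0.85, "18q": 0.90, "17p": 0.50, "1p": 0.05}
print(aneuploidy_score(sample))  # 3 arms exceed the 80% threshold
```

In this sketch an arm in the 20% to 80% range contributes nothing to the score, which is one possible reading of "inadequate to call".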
The OncoKB knowledgebase is a database that classifies cancer-related genes as oncogenes or tumor suppressors [14]. OncoKB was scanned for any genes from the 20q11.21 locus included in the database, and this information was used to guide further analyses of putative drug dependencies.
The Genomics of Drug Sensitivity in Cancer (GDSC) dataset (www.cancerrxgene.org, accessed on 6 April 2021) was queried to obtain data on drug sensitivity of cell lines from colorectal cancer and other cancers with the 20q11.21 amplification [15]. Dependencies of these cell lines on specific genes were obtained from the DepMap portal, which contains data from CRISPR and RNA-interference screens of CCLE cell lines [16,17]. These screens identify essential genes, that is, genes whose knock-out or knock-down has a significant effect on the survival and proliferation of a cell line in vitro [18][19][20].
Statistical comparisons of categorical data were carried out using Fisher's exact test or the χ² test. The Mann-Whitney U test was used to compare median values. All statistical comparisons were considered significant if p < 0.05.
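As an illustration of the categorical comparison, the following is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2×2 table. It is not the software used in this study; the function name and the counts in the example are hypothetical, and only the significance threshold (p < 0.05) comes from the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts: rows = amplified / non-amplified lines,
# columns = feature present / feature absent.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 4), "significant" if p_value < 0.05 else "not significant")
```

With larger tables or expected counts, the χ² test mentioned above would be the usual alternative; it is not sketched here.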

Cell Lines with 20q11.21 and Drug Sensitivity/Resistance In Vitro
Twelve cell lines among sixty-three colorectal cancer cell lines (19%) included in CCLE have amplifications of genes at the 20q11.21 locus, as assessed in cBioPortal. All three genes from this locus that are listed as cancer-related in the OncoKB knowledgebase (ASXL1, DNMT3B, BCL2L1) are amplified in the 12 cell lines (Table 1). Other genes of the locus with potential pathogenic importance in colorectal cancers, such as TPX2, KIF3B and POFUT1, are also amplified in the 12 cell lines. Another type of cancer with several cell lines displaying amplification of 20q11.21 genes is gastroesophageal carcinoma, where 14% to 15% of cell lines in CCLE carry amplifications of genes in the locus (Table 2). Interestingly, in contrast to colorectal cancer, gastroesophageal adenocarcinomas display 20q11.21 gene amplifications in only about 2.5% of clinical patient samples in the TCGA gastroesophageal and gastric adenocarcinoma cohorts [21]. Similarly, non-small cell lung cancer (NSCLC) cell lines display a 15% to 18% prevalence of amplifications in 20q11.21 genes, while the prevalence of such amplifications in patients with either adenocarcinomas or squamous lung carcinomas is significantly lower [22,23]. The 20q11.21 locus amplification constitutes a recurrent copy number alteration confirmed in a pan-cancer cell line analysis performed in the GDSC database (feature cnaPANCAN363, www.cancerrxgene.org, accessed on 6 April 2021). Cancer cell lines with this recurrent copy number alteration, independently of primary type, display resistance to several currently used chemotherapy drugs, including the microtubule poisons vincristine, vinblastine and docetaxel, the topoisomerase II inhibitors teniposide and epirubicin and the DNA poisons temozolomide and actinomycin D (Table 3).
In addition to microtubule inhibitor chemotherapeutics, several targeted mitotic inhibitors, including the Aurora A kinase inhibitor Alisertib, the kinesin protein family member 11 inhibitor Eg5 9814 and the CDC42BPA (CDC42 binding protein kinase alpha, also known as MRCK) inhibitor BDP-00009066, are associated with resistance in cell lines with 20q11.21 amplifications. Two inhibitors of DOT1L, an H3 histone methyltransferase with specificity for lysine 79 (H3K79), EPZ004777 and EPZ5676, are also associated with resistance in these cell lines. A specific analysis of colorectal cancer cell lines with the 20q11.21 amplification shows that these cell lines are more resistant to several of those drugs (higher median IC50) than colorectal cancer cell lines without the amplification, although, due to the smaller size of the cohort, differences are not statistically significant except for Alisertib and EPZ004777 (Table 3). While it is associated with drug resistance, the cnaPANCAN363 copy number alteration (20q11.21 amplification) does not confer sensitivity to any of the 185 drugs tested in the Genomics of Drug Sensitivity in Cancer (GDSC) database. However, individual colorectal cancer cell lines with the amplification display sensitivities to tested drugs (Table 4). Drug categories that show sensitivity in more than one cell line with the cnaPANCAN363 feature include receptor tyrosine kinase inhibitors, inhibitors of intracellular kinases (PI3K, mTOR, MEK and PKC) and inhibitors of the lipid metabolism enzymes sphingosine kinase and stearoyl-CoA desaturase. No specific drug displays sensitivity in more than three amplified colorectal cancer cell lines, suggesting that the mechanism of sensitivity may not be related to the 20q11.21 amplification that they all share.

Increased mRNA Expression of Genes from 20q11.21 and Targeted Drugs
mRNA expression of genes at 20q11.21 in 12 colorectal cancer cell lines with 20q11.21 amplification was compared with the corresponding expression in 12 randomly selected colorectal cancer cell lines from CCLE without amplification in the locus. Among the six genes located at 20q11.21 with potential cancer pathogenesis interest, BCL2L1, POFUT1 and KIF3B were over-expressed at the mRNA level in 20q11.21 amplified cell lines compared with non-amplified cell lines (Table 5). This pattern overlaps only partially with the over-expression of the genes in 20q11.21 amplified clinical samples of colorectal cancer patients, where POFUT1 but also ASXL1, and, in fewer cases, KIF3B and TPX2, are often over-expressed (Table 5). Based on the mRNA over-expression of genes at 20q11.21, the sensitivity of cell lines to BCL-xL inhibitors, Notch inhibitors (the Notch pathway is activated by the POFUT1 enzyme) and mitotic spindle inhibitors was evaluated in the GDSC. A notable drug in these categories displaying sensitivity in amplified cell lines compared to non-amplified cell lines is the BCL2 family inhibitor WEHI-539 (median IC50 in amplified lines 11.6 µM versus 42.5 µM in non-amplified cell lines, p = 0.04). In contrast, other BCL2 family inhibitors examined, such as venetoclax and navitoclax, showed no sensitivity in amplified cell lines or even a trend for resistance compared to non-amplified lines. Colorectal cancer cell lines with the 20q11.21 amplification displayed resistance to the microtubule polymerization stabilizer epothilone B, compared with non-amplified cell lines (median IC50 in amplified lines 0.028 µM versus 0.003 µM in non-amplified cell lines, p = 0.01). Z-LLNle-CHO, a γ-secretase inhibitor of the Notch cascade, displayed a non-significant trend towards resistance in amplified cell lines (median IC50 in amplified lines 7.36 µM versus 1.96 µM in non-amplified cell lines, p = 0.23).
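The comparison of IC50 values between amplified and non-amplified lines can be sketched with an exact two-sided Mann-Whitney U test. This is a standard-library illustration under stated assumptions, not the analysis pipeline of the study: the function names are hypothetical, the IC50 values below are invented placeholders rather than GDSC data, and full enumeration is only practical for small groups.

```python
from itertools import combinations
from statistics import median
from typing import Sequence


def u_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """U for sample x: number of (xi, yj) pairs with xi > yj; ties count 0.5."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def mann_whitney_exact(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact two-sided p-value by enumerating every relabeling of the pooled data."""
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    mean_u = n_x * n_y / 2  # expected U under the null hypothesis
    observed = abs(u_statistic(x, y) - mean_u)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(gx, gy) - mean_u) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Placeholder IC50 values in µM (hypothetical, for illustration only).
amplified = [12.0, 9.5, 30.1, 11.2, 8.7]
non_amplified = [40.2, 55.0, 38.9, 10.4, 61.3]
print(median(amplified), median(non_amplified))
print(mann_whitney_exact(amplified, non_amplified))
```

For two groups of 12 lines each, the enumeration covers about 2.7 million relabelings, so a normal approximation or a statistics library would be the more practical choice.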

CRISPR Microarray Dependencies of 20q11.21 Amplified Colorectal Cancer Cell Lines
An evaluation of the CRISPR preferentially essential genes and RNA-interference screening of colorectal cancer cell lines with the 20q11.21 amplification disclosed a few recurrent genes in more than one line, that include CTNNB1, encoding for β-catenin, the WNT pathway transcription factor TCF7L2 and oncogene KRAS (Table 6). In addition, BCL2L1, the gene JUP, encoding for Junction Plakoglobin (also called γ-catenin), and YAP1 (Yes-associated protein), encoding for a Hippo pathway nuclear regulator, are among additional recurrent dependency genes in 20q11.21 amplified cell lines. A similar CRISPR KO screen from project SCORE that excluded known core fitness genes of colorectal cancer, disclosed a few non-overlapping recurrent genes in colorectal cancer cell lines with 20q11.21 amplification, including DONSON (downstream of the SON gene, DNA replication fork stabilization factor), SNAP23 (synaptosome-associated protein 23) and HMGCS1 (3-hydroxy-3-methylglutaryl-CoA synthase 1).
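The notion of a recurrent dependency (a gene that is preferentially essential in more than one amplified line) can be expressed as a simple count. The sketch below assumes the per-line gene lists have already been exported from a dependency portal; the cell line labels and the assignment of genes to lines are hypothetical placeholders and do not reproduce Table 6.

```python
from collections import Counter
from typing import Dict, Set


def recurrent_dependencies(
    deps_by_line: Dict[str, Set[str]], min_lines: int = 2
) -> Dict[str, int]:
    """Return genes that are dependencies in at least `min_lines` cell lines,
    ordered by decreasing count and then alphabetically."""
    counts = Counter(gene for genes in deps_by_line.values() for gene in genes)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gene: n for gene, n in ranked if n >= min_lines}


# Hypothetical input: placeholder line names, illustrative gene assignments.
deps = {
    "LINE_A": {"CTNNB1", "KRAS", "YAP1"},
    "LINE_B": {"CTNNB1", "TCF7L2", "JUP"},
    "LINE_C": {"CTNNB1", "KRAS", "JUP"},
    "LINE_D": {"BCL2L1"},
}
print(recurrent_dependencies(deps))
```

Raising `min_lines` gives a stricter definition of recurrence; the threshold of two lines mirrors the "more than one line" wording above.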

Discussion
Amplifications of loci on the long arm of chromosome 20 are common in colorectal cancers. Genes at locus 20q11.21 are listed among the most commonly amplified genes in colorectal cancers. Colorectal cancers with the 20q11.21 amplification present at a similar stage to 20q11.21 non-amplified colorectal cancers and have a similar overall survival [7]. However, when metastatic, colorectal cancers with 20q11.21 amplification have a better survival compared with non-amplified metastatic counterparts. Moreover, 20q11.21 amplified colorectal cancers rarely harbor mutations in DNA damage response and mismatch repair-related genes compared with 20q11.21 non-amplified colorectal cancers, and have a lower tumor mutation burden [7]. The 20q11.21 locus harbors several cancer-associated genes with potential oncogenic properties. These include the epigenetic regulators ASXL1 and DNMT3B, the apoptosis regulator BCL2L1, the microtubule and mitotic spindle-associated proteins TPX2 and KIF3B and the enzyme fucosyl-transferase POFUT1. Proteins encoded by cancer-associated genes at 20q11.21 are expressed in most cases of colorectal cancers in the Human Protein Atlas [24]. Variability in the intensity of staining is observed, which may reflect underlying differences in translation, in addition to gene dosage. Amplification of one or more of these genes, as a consequence of 20q11.21 locus amplification, may be the event that favors selection of the amplification and results in its comparatively high prevalence in colorectal cancer and colorectal cancer cell lines. In addition, such a driver event could be a therapeutic target for the subset of cancers carrying the amplification. Targeting a driver defect in the specific subsets of cancers that bear it would be expected to be effective in these patients and to avoid treatment toxicity in the rest of the patients with no amplification of the locus.
Moreover, it would help the development of targeted drugs, as a therapeutic benefit would be difficult to discern if the target population in trials is diluted by non-responders. With these considerations, the current investigation sought to take advantage of databases of in vitro cancer models in an attempt to provide a framework of sensitivities for colorectal cancers with the 20q11.21 amplification. The main findings of the current study are manifold. First, it was shown that a subset of colorectal cancer cell lines bear the 20q11.21 amplification, capturing the corresponding colorectal cancer biology of a subset of patients. Second, as a group, cancer cell lines with 20q11.21 amplifications tend to be more resistant to microtubule inhibitors, topoisomerase II inhibitors and some DNA alkylators. In addition, resistance to targeted Aurora A kinase inhibitors and kinesin inhibitors is observed in these cell lines. In contrast, the amplification does not endow 20q11.21 amplified colorectal cancer cell lines with sensitivity to any of the drugs checked in the database. However, individual cell lines with the amplification display sensitivity to assayed drugs, including kinase inhibitors and lipid metabolism inhibitors, albeit only in a few cell lines in each case. Recurrent dependencies of cell lines with the amplification include the genes for YAP1, JUP (γ-catenin), BCL2L1, DONSON, SNAP23 and HMGCS1. These dependencies may provide clues for additional therapeutic interventions.
Amplification of 20q11.21 is observed in a significant minority of cell lines beyond colorectal cancer, such as esophagogastric and non-small cell lung cancers, despite the low prevalence of the amplification in patient samples from these cancers. This may suggest that the genes of the amplified segment confer an advantage in these cancers in vitro that is not essential in vivo. In contrast, in colorectal cancers, 20q11.21 amplifications appear to be advantageous both in vitro and in vivo. Drug sensitivities and resistance mechanisms stemming from over-expression of genes amplified from the locus would be expected to present independently of cell line origin. With this rationale, and in order to increase statistical power from an increased number of cell lines, this study compared all 20q11.21 amplified cell lines from CCLE included in the GDSC project to those without the 20q11.21 amplification. Comparisons were then focused on the colorectal cell line subsets. Results were concordant in the two comparisons, although, as expected, due to smaller numbers, they were mostly not statistically significant in the latter set of comparisons.
A different approach that could help with better targeting of therapies to appropriate subsets of patients is categorizing colorectal cancers into genomic subsets. Colorectal cancers have been categorized according to genomic profiles into four consensus molecular subtypes (CMS1 to 4) [25]. Cancers with 20q11.21 amplifications represent a subset of the most common, canonical CMS2 cancers [7]. Characteristics of the CMS2 group include left colon laterality in 77% of cases, a high level of chromosomal instability, a high frequency of APC mutations leading to WNT pathway activation and lower levels of MSI lesions. KRAS mutations, BRAF mutations and SMAD4 mutations are less frequent in CMS2 colorectal cancers than in other subtypes [7,26]. Thus, CMS2 cancers and, among them, cancers with the 20q11.21 amplification, would be expected to respond to EGFR-targeting therapies. Data from drug sensitivity analysis of several 20q11.21 amplified cell lines concur with this assumption of sensitivity to EGFR and downstream kinase inhibitors (Table 4). However, the molecular consensus classification does not provide any additional guidance for currently available therapies, and the need for new options based on biomarkers of efficacy remains. It is reassuring that none of the chemotherapy drugs identified as being associated with resistance in 20q11.21 cancers are used clinically in colorectal cancer. The resistance to targeted mitotic inhibitors, including Aurora A kinase and kinesin inhibitors, is of therapeutic interest and suggests that TPX2 and KIF3B amplifications may be involved in dysregulation of mitosis, leading to the observed resistance to mitotic inhibitors.
Dependencies of 20q11.21 amplified cell lines on particular genes and their products as derived from CRISPR and RNAi arrays could inform development of therapies based on synthetic lethalities. Yes-associated protein 1 (YAP1), a transcription factor of the Hippo pathway, comes up in the dependency screening of 20q amplified colorectal cancer cell lines, confirming a key role of the pathway in colorectal cancer. YAP1 co-operates with transcription factors TAZ and TEAD in transcription of genes involved in proliferation following tissue damage and promoting regeneration in the gut [27]. In colorectal cancer, aberrant signals from cancer-associated pathways, such as WNT and activated KRAS, activate Hippo to promote tumor growth and metastasis [28]. It is intriguing that the apoptosis inhibitor BCL2 is among the target genes of YAP1, a fact that could contribute to YAP1 dependency in 20q11.21 amplified cancers, given that the related BCL2L1 protein is dysregulated in these cancers [29]. Thus, interruption of Hippo signaling could impede, at least partially, aberrant cancer cell signals. CMS2 cancers are characterized by WNT pathway activation and thus a downstream activation of Hippo. Similarly, the JUP gene encoding for the β-catenin homolog, junction plakoglobin (also called γ-catenin), is also shown to be a dependence gene in a subset of 20q11.21 amplified colorectal cancer cell lines. γ-catenin has parallel roles with β-catenin in cell adhesion and WNT signaling [30]. In addition to adherens junctions, γ-catenin has a role in desmosomes [31]. These data suggest that the network of proteins associated with alternative fates of Wnt signaling is an important node in 20q11.21 amplified colorectal cancers and a candidate for therapeutic interventions.
DONSON, one of the discovered dependencies, present in 2 of 7 tested amplified cell lines, is a gene whose function is incompletely understood; it plays a role in DNA replication and in the stabilization and protection of stalled replication forks. Mutations of the gene are associated with the microcephaly-micromelia syndrome [32]. In cancer, DONSON could be helpful in preventing apoptosis during aberrant DNA replication. However, the mechanism through which 20q11.21 amplified cancers, and cancers with increased chromosomal instability in general, could be associated with DONSON dependence remains to be unveiled.
SNAP23, another dependency gene present in 2 of 7 tested amplified cell lines, encodes for one of the proteins of the cellular machinery for membrane fusion and exocytosis. It is also involved in cell signaling, promoting malignant cell motion, and through this mechanism, it may favor metastasis [33].
HMGCS1 is an enzyme of the mevalonate pathway and catalyzes the condensation of acetyl-CoA with acetoacetyl-CoA to form 3-hydroxy-3-methyl-glutaryl-CoA (HMG-CoA), the precursor of cholesterol production [34]. HMGCS1 plays a role in breast cancer stem cells, and its downregulation decreased the stem cell fraction of both luminal and basal breast cancer cells [35]. Cholesterol biosynthesis is targeted by statins, a class of cholesterol-lowering drugs, and attempts at repurposing these drugs for cancer are in progress [36]. Interestingly, drugs targeting other lipid metabolism enzymes show activity in some 20q11.21 amplified cell lines, suggesting lipid metabolism as a possible target in these cancers. Repurposing of drugs already used for other indications for well-defined subsets of colorectal cancers would present significant advantages from financial and patient safety perspectives.