KIT Mutations Correlate with Higher Galectin Levels and Brain Metastasis in Breast and Non-Small Cell Lung Cancer

Simple Summary
Galectins are a family of β-galactoside-binding proteins whose levels are altered at various stages of different types of cancer. This study compares the mutation status of 50 frequently mutated genes in two common cancers with the serum levels of galectin proteins. The goal is to reveal potential relationships between the mutation status of these genes and serum galectin levels. We found that mutations in the KIT gene (which codes for the proto-oncogene c-KIT protein) are associated with increased circulating levels of certain galectins. We also found that patient samples originating from brain tissue have a higher likelihood of having a mutation in the KIT gene. Understanding the relationship between cancer-critical gene mutations and serum galectin levels could provide a feasible and non-invasive avenue to better understand the tumor’s unique genetic profile.

Abstract
To investigate a potential role for galectins as biomarkers that enable diagnosis or prognostication of breast or non-small cell lung cancer, the serum levels of galectins -1, -3, -7, -8, and -9 of cancer patients, determined by ELISA, were compared to the mutation status of 50 known cancer-critical genes, which was determined using multiplex PCR in tumors of the same patients. Mutations in the KIT proto-oncogene, which codes for the c-Kit protein, a receptor tyrosine kinase, correlated with higher levels of galectins -1, -3, -8, and -9 in breast cancer patients and galectin-1 in non-small cell lung cancer patients. Mutations in the KIT gene were more likely to be found in brain metastases from both of these primary cancers. The most common KIT mutation in our panel was p.M541L, a missense mutation in the transmembrane domain of the c-Kit protein. These results demonstrate an association between KIT oncogenic signaling and elevated serum galectins in patients with metastatic disease. Changes in protein trafficking and the glycocalyx composition of cancer cells may explain the observed alterations in galectin expression. This study can be useful for the targeted selection of receptor tyrosine kinase and galectin inhibitor anti-cancer treatments.


Introduction
Lung cancer has one of the lowest 5-year survival rates of any cancer in the United States at 21% and is the greatest cause of cancer deaths in both men and women. Although the 5-year relative survival rate of breast cancer is 90% for all subtypes, its high incidence still results in tens of thousands of deaths annually in the United States [1]. Once metastasis occurs, the survival rate greatly drops, with a majority (66.7%) of all solid tumor cancer deaths being caused by metastasis [2]. Sixteen to twenty percent of those diagnosed with lung cancer, and 5.1% with breast cancer, develop metastasis to the brain [3,4]. The high incidence and metastatic ability make these cancers of primary interest for study.

Cancer-Critical Genes
Multiple genes in the human genome, when mutated, enable the development and progression of neoplastic tissue. Oncogenes are genes that, when mutated, create a product with a gain-of-function (GOF) that allows it to contribute to the dysregulation of the cell. Conversely, tumor suppressor genes lose their ability to protect the cell from dysregulated growth and proliferation when they are mutated. This paper refers to both groups of genes collectively as "cancer critical" genes.
A recent comprehensive study of 9423 tumor exomes identified 299 cancer driver genes [5]. This study evaluates 50 of those cancer-critical genes, providing a broad screening of genes commonly mutated in multiple cellular pathways. The nine pathways involved are the RTK/RAS/MAP, TGFβ, PI3K, Wnt, GPCR, p53, JAK/STAT, Notch, and cell cycle pathways. These genes and their respective pathways are highlighted in Figure 1. The graphic does not cover all possible cancer-critical genes or the full signaling pathways; instead, it is designed to highlight the genes used in this study and their potential contribution to unregulated proliferation. Seven genes were not a part of any canonical cancer-causing pathway or were members of multiple pathways (e.g., SRC); these are grouped together as "Other" in Figure 1. These genes are involved in processes such as DNA repair, genomic stability, and epigenetic modification.
The KIT gene codes for c-Kit, a class III receptor tyrosine kinase (RTK) which binds to extracellular Stem Cell Factor (SCF) and activates the PI3K, JAK/STAT, and MAPK pathways in hematopoietic cells, resulting in proliferation and differentiation [7][8][9]. This RTK is also highly expressed by glandular and myoepithelial breast cells [10]. Furthermore, c-Kit is known to play roles in several cancers via gain-of-function and loss-of-function mutations, most notably in gastrointestinal stromal tumors (GIST), and also in melanoma and thyroid carcinoma [11][12][13][14]. Several mutations in KIT have been associated with the development of cancers. These are typically missense mutations that either result in gain-of-function or have an unknown effect [15].
Targeted cancer therapy offers more precise cancer treatment with fewer cytotoxic effects on non-cancer cells [25]. This level of therapy requires knowing the cancer's specific genetic makeup to identify druggable targets. For example, osimertinib, a tyrosine kinase inhibitor, is a targeted therapy for patients with NSCLC with specific sensitizing mutations (p.Thr790Met and p.Leu858Arg) in the EGFR gene [26][27][28][29][30][31][32]. A more thorough understanding of the cellular biology of cancer will reveal the therapeutic targets involved in growth and is a promising strategy for reducing mortality from cancer and its metastases.

Galectins and Their Role in Cancer
Galectins (formerly known as S-type lectins) are a family of lectin proteins which share a domain with high-affinity binding for β-galactoside sugars. Galectins are divided into three subfamilies based on their structures: prototypical, chimeric, and tandem-repeat [33]. Among other functions, galectins are players in the innate immune system, triggering immune responses as well as resolving inflammation [34]. Galectins also modulate adaptive immune responses; for example, Gal-9 and Gal-1 act to dampen activated T cell responses [35]. Galectins have several functions outside of the immune system as well. They interact with cellular proteins by binding to protein glycosylation sites [36,37]. They can form lattice networks with cell membrane receptors and modulate the functions and transport of those receptors [38]. Galectins also have intracellular interactions through which they enhance oncogenic signals and promote tumor proliferation [39].
In breast cancer, galectins have several roles. Galectins-1 and -3 specifically have been implicated in the progression of lesions into metastatic disease through their roles in cell-to-cell and cell-to-extracellular matrix interactions [47]. Galectin-7 has been found to have interactions with p53 that can induce chemoresistance [45,48]. Increased levels of galectin-8 were shown to lead to lower survival rates [49]. Lastly, galectin-9 has increased expression in breast cancer, and its interactions with Tim-3 may provide an escape from cytotoxic T cells [50,51].
In regard to NSCLC, galectin-1 is overexpressed in these cell lines and in tissue samples from lung cancer patients [52,53]. Knockdown of Gal-1 in lung adenocarcinoma results in reduced tumor growth in vivo and inhibited migration, invasion, and colony formation in vitro [53]. Galectin-3 is also more highly expressed in NSCLC and augments tumorigenesis, invasion, metastasis, and tumor immunity [54]. Galectins -7 and -8 have been shown to have higher mRNA expression levels in NSCLC [55]; however, no studies have examined their roles. Galectin-9 expression in NSCLC is found to be a favorable prognostic marker due to interactions between tumor-infiltrating lymphocytes and tumor cells [55,56].
The tissue levels of galectins are well known to be altered in breast and lung cancer [42,52,57,58,59,60,61,62]. This results in measurably altered serum levels of these galectins [52,63,64,65]. The mechanism for the altered serum levels of galectins in cancer patients remains unclear; however, there are potential explanations. Galectins are secreted via a currently enigmatic non-classical pathway, and their trafficking is controlled at different points within the cell [66][67][68][69][70]. The abnormal cellular processes of cancer cells quite possibly dysregulate the processes involved in galectin secretion. Additionally, while normal cellular glycosylation is required for proper functioning, cancer cells have deviant glycosylation [71][72][73]. This could disturb the type and number of glycoconjugates to which galectins bind. The alterations in both trafficking and the glycomic profile, in turn, could lead to altered galectin levels in these neoplastic tissues, resulting in their demonstrably different serum levels.
Given the dysregulation of galectins in the cancer environment and their potential as therapeutic targets, several galectin inhibitors have been developed [41,78,81,82]. GR-MD-02 (a galectin-3 inhibitor) is currently in clinical trials to evaluate its usefulness in cancer patients with melanoma, NSCLC, and squamous cell head and neck cancer (NCT02117362, NCT02575404).
Finally, galectins are known players in cancer metastasis. Galectins -1 and -3 are particularly well studied in this respect. Galectin-3 was identified as a metastasis-related protein as early as 1998 [88]. Galectin-1 is upregulated in more advanced breast cancers of higher TNM stages and correlates with metastasis to regional lymph nodes [58,89]. Molecularly, the lectin interacts with laminin and fibronectin to promote aggregation [90,91]. Galectin-1 is able to upregulate MMP-2 and MMP-9 and reorganize cytoskeletal elements by activating Cdc42 to increase the number of filopodia in oral squamous cell carcinoma cells [92]. Knockdown of galectin-1 reduced prostate cancer migration by suppressing androgen receptors and Akt signaling [93].
While many studies show changes in galectin levels during cancer, no comprehensive work has been done to correlate galectin levels with cancer-critical gene mutations in cancer patients [65,77,78,80,94,95]. This paper seeks to provide an initial exploration into serum galectin levels and their correlation with cancer-critical gene mutations in breast and non-small cell lung cancer patients.

Patient Samples
Seventy-four cancer patient serum samples were obtained from the Prisma Health Cancer Institute (PHCI) biorepository (Greenville, SC, USA). The collection years ranged from 2012 to 2018. The PHCI biorepository inventory includes live cryopreserved, snap-frozen, and formalin-fixed paraffin-embedded tissues, as well as blood (whole blood, plasma, and serum). Patient donor permission was obtained via participant informed consent prior to the collection and storage of specimens. The biorepository standard operating procedures include specimen handling and tracking (i.e., collection, processing, storage) and facilities management and operations (i.e., equipment maintenance and monitoring). The PHCI biorepository has been acknowledged in various publications, having provided all specimen types in its inventory for previously conducted research projects [65,96,97].
Thirty-five samples were from breast cancer patients (F:M 1:0, median age 60.6) and the other 39 were from NSCLC patients (F:M 17:22, median age 65.1, min-max 47-79). Ten samples were obtained for each of stages I, II, and III of both breast and lung cancer, along with 5 of stage IV breast cancer and 9 of stage IV lung cancer. In the breast cancer samples, 31 were ductal, 2 were lobular, and 2 were coded non-specifically as "adenocarcinoma" histology. In the lung cancer samples, 24 were adenocarcinoma, 13 were squamous cell, and 2 were large cell histology. The samples from the patients contained a random mix of primary tumors and metastatic tissue.
Patient information was collected from the PHCI database. The information included demographic data, such as age, race, gender, and smoking status, as well as tumor data, including TNM staging, grade, histology, site, and cancer stage. This information is available in Supplemental Materials (Tables S1 and S2).

Galectin Profiling
Serum from each patient was used to determine circulating galectin levels by enzyme-linked immunosorbent assay (ELISA) [65]. This study used a subset of the data described by Blair et al. (2021). Galectin-1, -3, and -9 concentrations were obtained using ELISA kits from R&D Systems (Minneapolis, MN, USA). Galectin-7 and -8 concentrations were determined using ELISA kits from Invitrogen (Carlsbad, CA, USA). Each sample was assayed four times. ELISA kit quality control information can be found in Supplemental Materials as Table S3.

Cancer HotSpot Panel
The AmpliSeq Cancer Hotspot panel (v2), from which all variants were identified, was validated as a laboratory-developed test (LDT) under the Clinical Laboratory Improvement Amendments (CLIA) guidelines. The accuracy of all variant calls was validated at 99.8%. Sensitivity was established at a lower limit of 5% allele frequency, down to 30% tumor content (cell admixture). The precision of variant detection was shown to be 99.8% between operators and 98.9% within an operator. False variant calls, a measure of specificity, were less than 1% in the CLIA validation.
Each sequencing run had minimum criteria for variant calls. Coverage across the entire panel had to be greater than 90% at 300X for the sequencing run to be analyzed further. A minimum read depth of 100X and a 5% allele frequency had to be observed for individual variants to be reported. Finally, homopolymer indels and variants within 10 bp of amplicon ends were filtered to minimize the likelihood of false positives.
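The run-level and variant-level criteria above can be sketched as a simple filter. This is a minimal illustration only, not the laboratory's actual pipeline: the `Variant` record, its field names, and the function names are hypothetical, and treating "within 10 bp" as a distance of 10 bp or less is an assumption about the boundary.

```python
from dataclasses import dataclass

# Thresholds taken from the text; every name here is illustrative.
MIN_PANEL_COVERAGE_FRACTION = 0.90  # fraction of the panel covered at >= 300X
MIN_READ_DEPTH = 100                # per-variant read depth (X)
MIN_ALLELE_FREQUENCY = 0.05         # 5% allele frequency
AMPLICON_END_MARGIN_BP = 10         # variants this close to an amplicon end are filtered


@dataclass
class Variant:
    """Hypothetical record for one variant call."""
    gene: str
    read_depth: int
    allele_frequency: float
    distance_to_amplicon_end_bp: int
    is_homopolymer_indel: bool


def run_is_analyzable(fraction_covered_at_300x: float) -> bool:
    """Run-level check: coverage must be strictly greater than 90% at 300X."""
    return fraction_covered_at_300x > MIN_PANEL_COVERAGE_FRACTION


def variant_is_reportable(v: Variant) -> bool:
    """Variant-level checks: depth, allele frequency, and the two artifact filters."""
    return (
        v.read_depth >= MIN_READ_DEPTH
        and v.allele_frequency >= MIN_ALLELE_FREQUENCY
        # Assumption: "within 10 bp" means a distance of 10 bp or less.
        and v.distance_to_amplicon_end_bp > AMPLICON_END_MARGIN_BP
        and not v.is_homopolymer_indel
    )
```

For example, a hypothetical call such as `Variant("KIT", 250, 0.48, 40, False)` would pass every variant-level check, whereas the same call at a read depth of 80X would not be reported.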
Each sequencing run included the AcroMetrix Oncology Hotspot Control, which is designed to control the hundreds of amplicons targeted by next-generation sequencing (NGS) panels. It contains over 500 mutations from the COSMIC database and has five variant types of varying nucleotide lengths. The 53 genes represented in the AcroMetrix Oncology Hotspot Control are: ABL1, AKT1, ALK, APC, ATM, BRAF, CDH1, CDKN2A, CSF1R, CTNNB1, EGFR, ERBB2, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KDR, KIT, KRAS, MAP2K1, MET, MLH1, MPL, MSH6, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RB1, RET, SMAD4, SMARCB1, SMO, SRC, STK11, TP53, VHL. This control was calibrated using the analysis parameters detailed in the CLIA validation. The resulting analysis yielded 351 detected variants, and these variants served as the reference set for quality control of each sequencing run. As a quality control measure for variant detection, a minimum of 344 variants (95%) must be identified from each sequencing run for variants from clinical specimens to be reported. A detailed quality control log was maintained, which documented the results from each run and was a part of the routine CLIA compliance.
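The control-based acceptance rule can be illustrated with a short sketch. It assumes each variant is represented by a unique string identifier and that only variants belonging to the 351-member reference set count toward the minimum of 344; the function name and identifier format are hypothetical.

```python
MIN_REFERENCE_VARIANTS_DETECTED = 344  # minimum stated in the text


def control_run_passes(reference_variants: set[str],
                       detected_variants: set[str],
                       minimum: int = MIN_REFERENCE_VARIANTS_DETECTED) -> bool:
    """Return True when enough reference-set variants were recovered in the run.

    Calls that are not part of the reference set are ignored, so spurious
    detections cannot push a run over the threshold (an assumption).
    """
    recovered = reference_variants & detected_variants
    return len(recovered) >= minimum


if __name__ == "__main__":
    # Synthetic identifiers standing in for the 351 reference variants.
    reference = {f"VAR{i:03d}" for i in range(351)}
    detected = {f"VAR{i:03d}" for i in range(344)}
    print(control_run_passes(reference, detected))
```

Using set intersection keeps the check independent of the order in which variants are called.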

Data Analysis
All statistical analyses were performed using JMP® software (SAS Institute, Cary, NC, USA). The distributions of the serum galectin levels were analyzed for normality. For each gene, the serum level distribution of each galectin in patients with the mutated gene was compared by t-test to that of patients with the non-mutated version of the same gene.
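As a rough illustration of the mutated versus non-mutated comparison, the sketch below computes a two-sample t statistic using only the Python standard library. The text does not say whether a pooled or unequal-variance test was used, so the Welch (unequal-variance) form here is an assumption, and the function and argument names are hypothetical. It is not the JMP procedure itself.

```python
import math
from statistics import mean, variance


def welch_t(mutated: list[float], wild_type: list[float]) -> tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(mutated), len(wild_type)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    v1, v2 = variance(mutated), variance(wild_type)  # sample variances
    se_squared = v1 / n1 + v2 / n2
    if se_squared == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(mutated) - mean(wild_type)) / math.sqrt(se_squared)
    df = se_squared ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    return t, df
```

A p-value would then be read from the t distribution with `df` degrees of freedom; with SciPy available, `scipy.stats.ttest_ind(mutated, wild_type, equal_var=False)` performs the same unequal-variance test directly.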
Contingency analyses were performed on the mutation status of genes against other categorical variables, such as tissue site and histology. The odds ratios were calculated for the KIT mutations and brain metastases in both cancers. Both cancers were analyzed separately. Values of p less than 0.05 were considered statistically significant.
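The odds ratio for a 2x2 contingency table of KIT status against brain metastasis can be sketched as follows. The function and argument names are hypothetical, and any counts passed in are illustrative rather than study data. Returning `None` when a cell in the denominator is zero is one way to flag that the ratio cannot be calculated.

```python
from typing import Optional


def odds_ratio(mutated_brain: int, mutated_other: int,
               wild_type_brain: int, wild_type_other: int) -> Optional[float]:
    """Odds ratio for KIT mutation versus brain metastasis.

    OR = (mutated_brain * wild_type_other) / (mutated_other * wild_type_brain).
    Returns None when the denominator is zero, because the ratio is undefined.
    """
    denominator = mutated_other * wild_type_brain
    if denominator == 0:
        return None
    return (mutated_brain * wild_type_other) / denominator
```

For instance, a hypothetical table with 3 mutated brain samples, 2 mutated non-brain samples, 1 wild-type brain sample, and 10 wild-type non-brain samples gives (3 × 10) / (2 × 1) = 15.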

Results
The levels of circulating galectins -1, -3, -7, -8, and -9 in breast and lung cancer patients were measured by ELISA of patient serum [65]. Tumor tissues from the same patients were analyzed for mutations in 50 cancer-critical genes by multiplex polymerase chain reaction (PCR). The mutation status of these genes was compared to the circulating levels of galectins in the cancer patients.

Serum Galectin Levels
Tables 1 and 2 contain the serum galectin levels for the cancer patient groups. Some samples were excluded from further analysis due to concerns about the reliability of their results.
Table 3 ranks the genes by frequency of mutation among breast cancer patients and lists the specific mutations. PIK3CA and TP53 were the most frequently mutated genes in this patient group. Table 4 provides the mutations and their frequencies in the lung cancer patient sample group. TP53 and KDR (VEGFR2) were the most frequently mutated genes in this group.

Associations with Galectins
The screening of galectin levels by gene mutation status found several associations between serum galectin levels and cancer-critical gene mutations. Most notable are the associations between multiple galectin levels and the KIT gene. Figure 2 shows a heat map of the t-test results comparing the serum galectin levels of patients with a mutated gene to those of patients with the wild-type gene.

Associations with Brain Metastases
Figure 5 shows the contingency analysis of KIT mutation status by the site of the tumor tissue biopsy. Tumor samples taken from the brains of cancer patients were significantly more likely to have a mutation in the KIT gene. Table 5 shows the odds ratio between having a KIT mutation and brain metastasis. Our sample population did not contain a breast cancer sample with a brain metastasis and wild-type KIT; therefore, no ratio could be calculated for that group.
In summary, PIK3CA and TP53 were the most frequently mutated genes in breast cancer patients, while TP53 and KDR (VEGFR2) were the most frequently mutated genes in lung cancer patients. Levels of galectins -1, -3, -8, and -9 were elevated in patients with mutations in the KIT gene. In addition, samples from brain metastases of breast and lung cancer patients had more KIT gene mutations than samples from the primary tumor.

Discussion
Galectins -1, -3, -8, and -9 were found to be at higher levels in sera of breast cancer patients with a mutation in the KIT gene than other cancer patients without the mutation.
The most common KIT mutation in our panel was p.Met541Leu (rs3822214). This mutation occurs in the transmembrane region of the protein and has not been implicated as a mutation of clinical concern [98]. Since the mutation occurs in the transmembrane region, some have theorized that the mutation is loss-of-function and impairs the insertion of the receptor into the membrane [99]. However, studies have shown that the p.Met541Leu mutation increases the RTK's affinity for its ligand, SCF [100,101]. One study found that chronic myelogenous leukemia (CML) patients with this mutation had altered white blood cell counts and overall survival [101].
Our study joins others in finding increasing potential clinical significance for this missense mutation [102]. We investigated the allele frequency of the mutation in these patients and found that it indicates a heterozygous germline mutation. This is supported by studies which find that this mutation is common (8.1% allele frequency) in the Caucasian population [103]. For comparison, the mutation appears in 8.57% of our breast cancer patients and 15.38% of our lung cancer patients, or 12.16% overall.
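The per-cancer and pooled frequencies can be checked with simple arithmetic. The sketch below assumes carrier counts of 3 of 35 breast cancer patients and 6 of 39 lung cancer patients; these counts are inferred from the stated percentages and are not reported directly in the text.

```python
def percent(carriers: int, total: int) -> float:
    """Carrier frequency as a percentage, rounded to two decimals."""
    return round(100 * carriers / total, 2)


# Assumed counts, inferred from the reported 8.57% and 15.38%.
BREAST_CARRIERS, BREAST_TOTAL = 3, 35
LUNG_CARRIERS, LUNG_TOTAL = 6, 39

breast_pct = percent(BREAST_CARRIERS, BREAST_TOTAL)
lung_pct = percent(LUNG_CARRIERS, LUNG_TOTAL)
# Pool the counts rather than averaging the two percentages,
# because the two groups differ in size.
pooled_pct = percent(BREAST_CARRIERS + LUNG_CARRIERS,
                     BREAST_TOTAL + LUNG_TOTAL)
```

Under these assumed counts, the pooled frequency is 9 of 74 patients, or about 12.16%.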
Galectins and RTKs, such as c-Kit, are known to have an abundant number of interactions [104]. There is no literature to indicate specific interactions between galectins -1, -3, -8, and -9 and c-Kit, although it is known that galectins do interact with other members of this class of RTKs, such as platelet-derived growth factor receptor (PDGFR) via spatial organization and trafficking [105][106][107].
The association between the c-Kit mutation and increased levels of certain galectins has several possible interpretations (Figure 6). The mutation could lead to altered receptor glycosylation, which would in turn affect the galectin serum levels. Alternatively, galectin expression could be upregulated by the GOF c-Kit mutations via the activated intracellular pathways. The nature of the interaction is of interest and worthy of future studies.
A query of the TCGA database via UALCAN shows that galectins -1 and -3 have decreased expression in breast invasive carcinoma, suggesting that the observed increase in these galectins could be of non-tumor origin. The database also shows that galectins -8 and -9 have increased expression in breast cancer tissue [110].
This study also found that tissue samples taken from the metastasis in patients' brains were more likely to have a mutated KIT gene. It is unclear why a mutated c-Kit protein would result in this outcome and, in fact, one study has shown that the loss of c-Kit expression has been associated with advanced stages of breast cancer [111]. It is possible that the mutation reduces the stability of the c-Kit protein.

Impact of Findings
These findings serve to further enhance the understanding of the role of galectins in the cancer setting. Serum levels of certain galectins are known to be increased in cancer [65]. Our study shows that certain galectins could have increased serum levels when certain cancer-critical genes are mutated in the tumor sample, indicating a potential relationship.
Additionally, given the high frequency of the p.Met541Leu c-Kit mutation in the general population, its cause for concern in other studies [101,102] and its correlation with brain metastasis in cancer patients of this study, the p.Met541Leu mutation is a potential marker for more aggressive cancer and has promise for future studies.
The practical application of this research is the discovery of further molecular changes correlated with specific tumor mutations. The Ampliseq hotspot panel provides a gene panel that can be used to investigate many genes of interest, not only in breast and lung cancers but in other cancers and diseases as well.
Further investigations could find blood serum markers that better correlate with the mutation status of cancer-critical genes. This approach has applications in both diagnostics and treatment, as the mutation status of specific proteins often translates to their response to cancer treatments. For example, p.Met541Leu KIT-expressing cells have been shown to have increased sensitivity to imatinib, a c-Kit inhibitor [112]. This is a practical goal, as cancer treatment is tailored to specific mutations and the galectin levels can be targeted by galectin inhibitors.

Study Limitations
Our sample size of 35 breast cancer samples and 39 lung cancer samples reflects the availability of the hotspot panel sequencing data and the pilot nature of this study. Due to the method of sample selection, a traditional power calculation was not performed. The size of the sampling does limit the generalizability of the study. However, we view this work as an exploratory study and a way to find and flag potential genes and gene mutations of interest.
Additionally, we did not control for other patient variables, such as comorbidities and detailed treatment, due to the boundaries of our approved research scope. In regard to the treatment information, our previous work found no observable differences in galectin levels between treated patients and untreated or not recently treated patients in this sample group [65]. As such, our study should be interpreted accordingly, as a heterogeneous pool of cancer patients representing the population from which they were obtained.

Conclusions
Based on our findings, we propose areas for future studies. The first is a mechanistic analysis of potential binding between galectins and glycosylated c-Kit protein. The second is the establishment of the role of c-Kit in the regulation of expression and secretion of galectins. The third is an investigation into the relationship between mutated c-Kit proteins and metastatic brain tumors. Further, c-Kit and its ligand, SCF, are known to be expressed preferentially in small cell lung cancers [113]. As small cell lung cancers were not examined in this study, the next step would be to examine the c-Kit mutation status and galectin levels in SCLC to determine if there is a correlation. Finally, the concept of a hotspot gene panel to find correlations between mutations and circulating biochemical markers can be expanded to cover more cancer types and molecular markers. Such studies should not only provide new insight into the key aspects of c-Kit and galectin interactions but may also provide an important framework for creating rational approaches to prevent the development of metastasis in other cancers.

Institutional Review Board Statement:
Ethical review and approval were waived for this study, which is nonhuman subject research and therefore does not require IRB approval. The study was approved by the Tissue Utilization Committee per the IRB-approved biorepository protocol and SOPs (IRB #Pro00069834).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available in this article.

Acknowledgments:
The authors thank the Prisma Health Cancer Institute's biorepository staff for the patient samples. We thank Rebecca Russ-Sellers for her biostatistical help. We also thank Guy Benian for his review of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.