Lower Expression of CFTR Is Associated with Higher Mortality in a Meta-Analysis of Individuals with Colorectal Cancer

Simple Summary
The ion channel gene CFTR is a tumor suppressor in colorectal cancer. It is well-established that individuals with cystic fibrosis, caused by biallelic germline mutations in CFTR, are at increased risk of developing colorectal cancer. A population of colorectal cancer patients with no known relationship to cystic fibrosis expresses reduced levels of CFTR in their tumors. This study aimed to determine if this population experienced increased mortality compared to those expressing higher levels of CFTR. Three independent datasets containing 1177 colorectal cancer patients were analyzed using Cox proportional hazards regression. Analysis of each study individually and meta-analysis of all three revealed an association between reduced CFTR expression and increased mortality. This association is potentially clinically significant because individuals with low CFTR expression may benefit from more aggressive treatment. Additionally, molecular therapies developed to treat cystic fibrosis by increasing CFTR activity may be applicable for colorectal cancer tumors expressing low levels of CFTR.

Abstract
Individuals with cystic fibrosis (CF), caused by biallelic germline mutations in the cystic fibrosis transmembrane conductance regulator (CFTR), have higher risk and earlier onset of colorectal cancer (CRC). A subset of CRC patients in the non-CF population expresses low levels of tumoral CFTR mRNA which may also cause decreased CFTR activity. To determine the consequences of reduced CFTR expression in this population, we investigated association of tumoral CFTR expression with overall and disease-specific mortality in CRC patients. CFTR mRNA expression, clinical factors and survival data from 1177 CRC patients reported in the Cancer Genome Atlas (TCGA) and Gene Expression Omnibus studies GSE39582 and GSE17538 were included. Log-transformed and z-normalized [mean = 0, standard deviation (SD) = 1] CFTR expression values were modeled as quartiles or dichotomized at the median. Univariate and multivariable Cox proportional hazards regression models were used to estimate hazard ratios (HR) and 95% confidence intervals (CI) for overall and disease-specific mortality in individual studies and meta-analyses. Analyses of each of the three individual datasets showed a robust association of decreased CFTR expression with increased mortality. In meta-analyses adjusted for stage at diagnosis, age and sex, CFTR expression was inversely associated with risk of overall death [pooled HR (95% CI): 0.70 (0.57–0.86)] and disease-specific death [pooled HR (95% CI): 0.68 (0.47–0.99)]. Associations did not differ by stage at diagnosis, age, or sex. Meta-analysis of overall death stratified by microsatellite instable (MSI) versus microsatellite stable (MSS) status indicated potential interaction between MSI/MSS status and CFTR expression (p-interaction: 0.06). The findings from these three datasets support the hypothesis that low CFTR expression is associated with increased CRC mortality.


Introduction
The cystic fibrosis transmembrane conductance regulator (CFTR) gene encodes an ion channel expressed on the apical surface of luminal epithelia, including the lungs and intestine. The CFTR protein regulates transepithelial transport of Cl⁻ and HCO₃⁻ ions to maintain water and salt homeostasis at the epithelial surface [1]. Biallelic germline loss of CFTR causes the hereditary life-shortening disease, cystic fibrosis (CF) [2]. The primary cause of mortality in CF is respiratory failure [3]; however, CF also causes clinically significant dysfunction in the intestine throughout the lives of CF patients including dysbiosis, inflammation, meconium ileus, and obstruction of the ileum and colon in adults and children [1,4]. As the lifespan of CF patients has improved, it has become apparent that another gastrointestinal manifestation of CF is increased risk of developing colorectal cancer (CRC) [5,6].
Initial evidence for the CF-CRC link came from a 20-year epidemiological study of more than 41,000 persons with CF in the United States (U.S.) [5]. The standardized incidence ratio for colon cancer was 6 times greater in non-transplanted CF patients compared to the U.S. general population, although the overall cancer risk was similar in these groups [5]. A recent meta-analysis of six additional population-based studies confirmed an increased risk of colon cancer among those with CF [6]. Clinical evidence for a direct connection between CF and CRC comes from endoscopic screening studies that found that adenomatous polyps appear earlier, and are more numerous and aggressive in individuals with CF than in the general population [7]. In line with this finding, the average age of CRC diagnosis among those with CF is 40 years, i.e., about 30 years lower than the average age in the general population [8].
This finding has been recapitulated in our mouse models of CF that directly tested the effects of loss of CFTR on intestinal tumorigenesis. In genetically modified mice with intestinal-specific deletion of Cftr [9], 61% of Cftr-deficient mice developed intestinal adenomas by the end of one year, as compared to none in the Cftr wild-type mice. When we added in an additional mutation in the Apc gene, a gene frequently mutated in human CRC, invasive carcinomas were observed, although they rarely develop in mice with only the Apc mutation [10].
These studies demonstrated that loss of CFTR via systemic germline mutations contributes to CRC in people with CF. However, loss of CFTR is also implicated in sporadic CRC without known connections to CF. Initially, we performed mutagenesis screens in genetically engineered mouse models that were designed to generate random somatic genetic alterations in the intestinal epithelium. Several of these screens identified CFTR as a candidate CRC driver gene, arguing that loss of CFTR activity may contribute to oncogenesis in sporadic human CRC [11][12][13].
In agreement with our animal studies, we found a significant association between CFTR mRNA expression in primary CRC tumors and 3-year disease-free survival of 90 persons diagnosed with stage II CRC [hazard ratio (HR) (95% confidence interval (CI)): 3.6 (1.20-10.77)], after adjusting for tumor location, differentiation, stage and microsatellite instability (MSI) status [10]. Our findings are consistent with findings from another study that reported lower CFTR mRNA and protein expression in CRC tumors versus normal tissue, and in metastatic CRC versus non-metastatic CRC [14].
These studies in humans and mice establish CFTR as a tumor suppressor gene in CRC. To better understand the contribution of reduced CFTR expression to the outcomes in individuals with CRC in the population without CF, we expanded on our earlier study and investigated the association between CFTR expression and survival of individuals with CRC in three independent cohorts: the COADREAD [COAD (COlon ADenocarcinoma) and READ (REctum ADenocarcinoma)] study from The Cancer Genome Atlas Program (TCGA) [15], and two studies with data deposited in the NCBI Gene Expression Omnibus (GEO) database, GSE39582 [16] and GSE17538 [17][18][19][20]. Together, these studies included 1177 persons diagnosed with CRC at stages II, III and IV, 374 of whom died. The data from these studies were analyzed individually and combined in meta-analyses to determine the risk of overall and disease-specific death associated with CFTR expression. These analyses were carried out to validate and extend the results of our pilot study, and to test our hypothesis that reduced CFTR expression is associated with worse outcomes in persons with CRC.

Study Design
We examined overall mortality and disease-specific mortality of persons with CRC in relation to CFTR expression in the three studies: TCGA COADREAD, GSE39582 and GSE17538. The two GSE studies were selected because they have data on CFTR expression and overall survival (OS), and at least 100 participants.

TCGA COADREAD Study
mRNA expression data and related clinical information for CRC patients in the TCGA study were obtained from the cBioPortal for Cancer Genomics [21] data folder "Coadread_tcga_pan_can_atlas_2018". mRNA expression data were extracted from the "data_RNA_Seq_v2_mRNA_median_all_sample_zscores" file, which reports expression as z-scores [mean = 0 and standard deviation (SD) = 1]. mRNA sequence data were generated using an Illumina HiSeq 2000 sequencer (Illumina Inc, San Diego, CA, USA) and processed using the RNAseqV2 pipeline, which uses MapSplice for alignment and RNA-Seq by Expectation-Maximization (RSEM) for quantification.
Clinical data for these patients were obtained from the "data_clinical_patient" file. This file contains information about cancer type (colon and rectal), age and stage at diagnosis, sex, and information about overall survival (OS) and disease-specific survival (DSS) and time of follow-up for each outcome. OS and DSS data were used to derive the overall and disease-specific death data used in our analyses in all three studies. After merging data for individuals with CRC who had information about CFTR mRNA expression and OS (N = 577), we excluded individuals who were diagnosed at stage I (N = 103), and those with missing follow-up time or follow-up time equal to 0 (N = 21), resulting in 453 CRC cases (stages II-IV) available for the analysis of OS. For the analysis of disease-specific death, 20 additional individuals were excluded because they did not have information about causes of death, leaving 433 CRC cases for this analysis. MSI and microsatellite stable (MSS) data were obtained from the "data_clinical_sample" file. In this study MSI was defined as an MSIsensor score ≥ 3.5, and MSS was defined as MSIsensor score < 3.5 [22].
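The exclusion steps described above can be sketched as a simple filter. This is only an illustration: the record layout (`stage`, `followup_time`) and the function name are hypothetical and do not correspond to the column names in the cBioPortal files.

```python
def eligible_for_os(record):
    """Return True if a case passes the OS exclusion criteria described in the text.

    `record` is a hypothetical dict with keys:
      'stage'         -- stage at diagnosis as an integer 1-4
      'followup_time' -- follow-up time (any unit), or None if missing
    """
    if record["stage"] == 1:          # stage I cases are excluded
        return False
    t = record["followup_time"]
    return t is not None and t > 0    # drop missing or zero follow-up


# Toy records (made up for illustration only).
cases = [
    {"stage": 1, "followup_time": 30.0},
    {"stage": 2, "followup_time": 0.0},
    {"stage": 3, "followup_time": None},
    {"stage": 4, "followup_time": 12.5},
]
kept = [c for c in cases if eligible_for_os(c)]

# Case-flow arithmetic reported for TCGA COADREAD.
os_cases = 577 - 103 - 21    # stage I and missing/zero follow-up removed
dss_cases = os_cases - 20    # cases without cause-of-death information removed
```

The last two lines reproduce the reported counts of 453 cases for the OS analysis and 433 for the disease-specific analysis.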
Somatic mutation data was obtained from the UCSC Xena genomics platform (https://xena.ucsc.edu/, accessed on 21 January 2023) "dataset: somatic mutation (SNP and indel)-MC3 public version". Cases reporting mutation status were merged with mRNA expression and clinical data from the cBioPortal to give a total of 300 cases.

GSE39582 and GSE17538 Studies
GSE39582 and GSE17538 were downloaded from the NCBI GEO database. In both studies, mRNA expression was measured by the Affymetrix Human Genome U133 Plus 2.0 Array (Thermo Fisher, Waltham, MA, USA) in a log2 scale and normalized using Robust Multi-array Average (RMA) [23]. We normalized mRNA expression to z-scores for all analyses.
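As a minimal sketch of the z-score normalization step, the function below rescales log2 expression values to mean 0 and SD 1 within a study. The function name and the use of the sample SD (n - 1 denominator) are assumptions; the text does not state which SD estimator was used.

```python
from statistics import mean, stdev


def z_normalize(log2_values):
    """Rescale log2 expression values to z-scores (mean 0, SD 1).

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mu = mean(log2_values)
    sd = stdev(log2_values)
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mu) / sd for v in log2_values]


# Toy log2 expression values (made up for illustration only).
z = z_normalize([5.0, 6.0, 7.0, 8.0])
```

Because the rescaling is done separately within each study, the resulting values are comparable in rank and spread across platforms, but not in absolute expression level.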

Patient and Clinical Information in the GSE17538 Study
This study includes 178 colon cancer cases from the Moffitt Cancer Center, and 55 from the Vanderbilt University Medical Center (N = 233). Information about age, sex and stage at diagnosis, and data on OS and DSS, were obtained from the GSE17538-GPL570_series_matrix file. After excluding cases diagnosed at stage I (N = 29), 204 cases diagnosed at stages II-IV remained for the OS analysis, and 153 for the DSS analysis. MSI/MSS status was not reported in this study.

Patient and Clinical Information in the GSE39582 Study
The study includes only persons with colon cancers (N = 566). Information about stage at diagnosis, age, sex, and MSI/MSS status, and data on OS (overall deaths, time of follow-up) were obtained from GSE39582_series_matrix. MSS was defined as "mmr-p" and MSI as "mmr-d". No race information was available in GSE39582. Exclusion of colon cancer cases diagnosed at stage I (N = 37), and those with missing follow-up time or follow-up time equal to 0 (N = 9), resulted in 520 cases available for the OS analysis.
The data from all three studies, TCGA COADREAD, GSE17538 and GSE39582, are publicly available, so approvals by ethics committees were not required.

Statistical Analysis
Unless otherwise mentioned, all analyses were conducted using SAS (version 9.4, SAS Institute Inc, Cary, NC, USA). All statistical comparisons were performed at a two-sided significance level of 0.05 unless otherwise stated.
The demographic and clinical characteristics in each study were summarized. To estimate the risk of overall and disease-specific death across CFTR expression categories, we used Cox proportional hazards regression to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). CFTR expression was analyzed as quartiles or dichotomized at the median to define high and low expressing groups. In the analysis of quartiles, p-trend was estimated by including the CFTR expression categories as an ordinal variable in the Cox proportional hazards model. The proportional hazards assumption was tested in each study using an interaction term between CFTR expression and follow-up time in relation to overall and disease-specific mortality, and it was not violated in any study. We conducted a univariate analysis and an analysis adjusted a priori for age (continuous), sex (male versus female), and stage at diagnosis (II-IV). We did not adjust for race because the information was incomplete: GSE39582 did not collect information on race and 35% of TCGA COADREAD cases were missing data on race. Additionally, we conducted analyses further adjusted for MSI/MSS status in TCGA and GSE39582, the studies where this information was available, and for the colon/rectal subsites in TCGA.
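The two categorizations of CFTR expression described above can be sketched as follows. The function names, the handling of values that fall exactly on a cut point (assigned to the lower group), and the "inclusive" quantile method are assumptions; the text does not specify them.

```python
from statistics import median, quantiles


def dichotomize_at_median(z_scores):
    """Label each case 'high' or 'low' relative to the study median.

    Assumption: values equal to the median are labeled 'low'.
    """
    cut = median(z_scores)
    return ["high" if z > cut else "low" for z in z_scores]


def assign_quartiles(z_scores):
    """Assign each case to quartile 1 (lowest) through 4 (highest)."""
    q1, q2, q3 = quantiles(z_scores, n=4, method="inclusive")

    def quartile(z):
        if z <= q1:
            return 1
        if z <= q2:
            return 2
        if z <= q3:
            return 3
        return 4

    return [quartile(z) for z in z_scores]


# Toy z-scores (made up for illustration only).
z_scores = [-1.5, -0.5, 0.5, 1.5, -1.0, 1.0, 0.0, 2.0]
groups = dichotomize_at_median(z_scores)
quartile_labels = assign_quartiles(z_scores)
```

The integer quartile labels (1 to 4) are the form in which the categories would enter a model as an ordinal variable for a trend test.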
To account for the possibility that death from causes other than CRC may be a competing event, we re-ran analyses of disease-specific deaths using the Fine-Gray sub-distribution hazard competing risk regression model [24].
OS and DSS across categories of CFTR expression were visualized using Kaplan-Meier plots and compared using a log-rank test (R statistical software, version 4.1.2, packages "survminer" and "survival").
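For readers unfamiliar with the Kaplan-Meier estimator, a minimal pure-Python sketch is shown below. It is not the "survival" package implementation used in this study; the function name and the toy data are illustrative assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time for each case
    events -- 1 if the case died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all cases whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored cases leave the risk set
    return curve


# Toy data (made up): five cases, two of them censored.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
```

In this toy example the estimate drops to 4/5 at time 5, and then to (4/5) × (1/3) at time 12, when two of the three cases still at risk die.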
Risk of overall and disease-specific deaths associated with CFTR expression was also calculated using CFTR expression dichotomized at median (low versus high) in each study and in the meta-analyses (R statistical software, version 4.1.2, package "metafor"). The meta-analysis of overall deaths combined all three studies, whereas the meta-analysis of disease-specific deaths included only the two studies where this information was available, TCGA COADREAD and GSE17538. Meta-analyses were adjusted for stage at diagnosis, age and sex, and conducted using the fixed-effect model, since the HRs were in the same direction and p-values for heterogeneity between studies were not significant (Q-test; 0.54 for the analysis of overall death and 0.14 for the analysis of disease-specific death).
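The fixed-effect pooling can be illustrated with a short inverse-variance sketch on the log-HR scale. This is not the "metafor" implementation; the function names are hypothetical, the standard errors are back-calculated from 95% CIs under a normal approximation, and the input values are placeholders rather than the study-level estimates from this paper.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% CI


def se_from_ci(lower, upper):
    """Approximate SE of log(HR) from the limits of a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def fixed_effect_pool(studies):
    """Inverse-variance fixed-effect pooling of hazard ratios.

    studies -- list of (hr, ci_lower, ci_upper) tuples
    Returns the pooled HR, its 95% CI, and Cochran's Q statistic.
    """
    log_hrs = [math.log(hr) for hr, _, _ in studies]
    weights = [1 / se_from_ci(lo, hi) ** 2 for _, lo, hi in studies]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_hrs))
    return {
        "hr": math.exp(pooled),
        "lower": math.exp(pooled - Z_95 * pooled_se),
        "upper": math.exp(pooled + Z_95 * pooled_se),
        "q": q,  # compare with a chi-square distribution on k - 1 df
    }


# Placeholder inputs (hypothetical, not values reported in this study).
result = fixed_effect_pool([(0.80, 0.60, 1.07), (0.60, 0.40, 0.90), (0.75, 0.55, 1.02)])
```

Under the fixed-effect model each study is weighted by the inverse of its variance, so larger studies with narrower CIs contribute more to the pooled estimate.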
Exploratory analyses were conducted stratified by age (dichotomized at median), sex (male/female), stage at diagnosis (II-IV), and MSI/MSS (MSI versus MSS) in the meta-analyzed data. To perform this analysis, we first stratified each dataset by the variable of interest and then meta-analyzed the association between CFTR expression and death (overall or disease-specific) in each stratum. MSI/MSS status was only available in the TCGA COADREAD and GSE39582, so only these two studies contributed data to this analysis.

Descriptive Analysis
We analyzed data from three independent studies that contained CFTR expression data and associated survival data: TCGA COADREAD, GSE17538, and GSE39582. Median age and male/female ratios were similar in these three studies (Table 1), as were the ranges between z-scores for highest and lowest CFTR gene expression: 5.95 for TCGA, 4.61 for GSE17538, and 5.93 for GSE39582 (Supplementary Table S1). There was some difference in the proportions of people diagnosed with Stage IV versus Stage II, with the GSE17538 study having a higher proportion of Stage IV and lower proportion of Stage II compared to the TCGA COADREAD and GSE39582 studies (Table 1).

CFTR Expression and the Risk of Overall Death
We used three Cox proportional hazards regression models to analyze the association of CFTR expression with overall death. Model 1 was an unadjusted model, Model 2 was adjusted for stage at diagnosis, age and sex, and Model 3 was additionally adjusted for MSI/MSS status (Table 2). In Model 1 in the TCGA study, higher CFTR expression modeled as quartiles was significantly associated with lower risk of death across quartiles (p-trend = 0.04) (Table 2). There was also an indication of higher CFTR expression associated with lower risk in Model 2, but this association did not reach statistical significance. The TCGA study included both colon and rectal cancer cases. No significant associations were observed when these analyses were limited to only those with colon cancer (Supplementary Table S2). The sample size was too limited to examine rectal cancer separately, as there were only 12 deaths among 87 rectal cancer cases.
In the GSE17538 and GSE39582 studies, there was a significantly lower risk of overall death for those with higher CFTR expression in all models (Table 2). For instance, for the highest versus lowest quartile, in Model 2, HRs (95% CI) were 0.31 (0.16-0.57, p-trend < 0.01) in GSE17538, and 0.64 (0.42-0.97, p-trend = 0.03) in GSE39582.
We also conducted Model 2 analysis after dichotomizing patients into high and low categories using median CFTR expression. There was an indication of association between higher CFTR expression and lower risk of death for all three studies. However, the association was significant only in GSE17538, HR (95% CI): 0.57 (0.37, 0.87) (Supplementary Table S3).
In agreement with the findings from Cox proportional hazards regression models, Kaplan-Meier analysis indicated that OS was better in patients expressing the highest levels of CFTR (quartile 4) compared to patients expressing the lowest levels of CFTR (quartile 1). Log-rank p-values were 0.078, 0.042 and 0.043 for TCGA, GSE17538, and GSE39582, respectively (Figure 1).
Finally, we performed a meta-analysis of the findings from all three studies to calculate the association of CFTR expression with overall death. In this meta-analysis, high CFTR expression (dichotomized at median) was associated with decreased overall death with HR (95% CI): 0.70 (0.57-0.86) (Figure 2). The associations did not differ significantly across age, sex or stage (Supplementary Table S4).
There was a suggestive interaction between CFTR expression and MSI/MSS status, with an inverse association observed for MSS tumors only (p-interaction = 0.06), but the sample size for those with MSI status was too limited for analysis of this group (Supplementary Table S4).

CFTR Expression and the Risk of Disease-Specific Death
Information about disease-specific survival (DSS) was available in TCGA COADREAD and GSE17538. There was an indication of inverse associations between CFTR expression and disease-specific death in all models in both studies (Table 3, Supplementary Table S5). However, the associations were statistically significant only in GSE17538. For example, in Model 2, for the highest versus lowest quartile, the HR (95% CI) was 0.22 (0.09-0.47, p-trend < 0.01). Analyses that treated non-CRC deaths as competing events showed similar findings to the cause-specific hazard analysis (Supplementary Table S6). In the meta-analysis that pooled the findings from both studies, HR (95% CI) was 0.68 (0.47, 0.99) for high versus low CFTR expression (Figure 3). In meta-analyses stratified by stage at diagnosis, sex, or age, the associations did not differ significantly across subgroups (Supplementary Table S6). Information about disease-specific death and MSI/MSS status was only available for the TCGA dataset, and the limited sample size (6 disease-specific deaths among those with MSI) precluded stratification by MSI/MSS status.
Table 3. Hazard ratio (HR) and 95% confidence interval (CI) for disease-specific death in relation to CFTR mRNA expression, presented as quartiles among individuals with CRC/colon cancer in the TCGA COADREAD and GSE17538 studies.
The number of cases for each study in Table 3 is lower than in the corresponding study in Table 2 because the information about disease-specific death was not available for all individuals.

Somatic Mutations in CFTR in CRC
Somatic mutations in CFTR may contribute to decreased CFTR mRNA levels and/or to increased mortality. Mutational status based on whole exome sequencing (which captures a subset of CFTR mutations) was reported in the TCGA COADREAD study but not GSE17538 and GSE39582. Of the cases analyzed in our study, 300 reported data on somatic mutations. In these 300 cases, 22 non-synonymous CFTR mutations were found in 17 cases, i.e., 5.7% of cases. Eight mutations were categorized as loss of function (LOF) (loss or gain of stop codon, splice and frameshift mutations), while 14 were missense mutations. Two of the LOF mutations, R851* and E1104*, are also found in the CFTR2 database where they are characterized as CF-causing mutations (Table 4). We did not detect a significant correlation between CFTR mutation status (i.e., the presence of at least one CFTR mutation) and CFTR expression, although this analysis needs to be repeated in a larger study in the future. For all cases carrying any of the 22 mutations, the Point-Biserial correlation coefficient (PBCC) between CFTR mutation status and CFTR expression was r = −0.08 and for cases carrying LOF mutations, PBCC r = −0.0470. The number of cases with mutations and number of deaths among these cases were insufficient to carry out survival analysis. An additional complication was that most of the cases containing CFTR mutations (13 of 17 cases) belong to either the microsatellite instable (MSI) or polymerase epsilon (POLE) CRC subtypes. Both subtypes are characterized by deficiencies in DNA repair and tumors typically have mutation burdens of 1000 or more mutations per tumor, so it is difficult to estimate the functional impact of any one mutation.
1 This study was limited to cases with primary tumors, CFTR expression reported, diagnosed at stages 2-4, and mutation status reported.
2 Only loss or gain of stop codon, frame shift, splice site and missense mutations in protein coding regions were included.
3 The CFTR2 database (https://cftr2.org/, accessed on 21 January 2023) maintains a comprehensive list of CF-causing CFTR variants for the CF community.
4 CRC subtypes refer to clinically relevant subtypes based on mechanism of genomic instability. CIN, chromosomal instable; MSI, microsatellite instable; POLE, mutation in polymerase epsilon.
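The point-biserial correlation used in this section is the Pearson correlation between a binary variable (mutation status) and a continuous variable (CFTR expression). A minimal sketch is given below; the function name and toy data are illustrative assumptions and do not reproduce the study data.

```python
import math


def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a continuous one.

    binary -- 1 if the case carries at least one mutation, else 0
    values -- continuous measurement (e.g., expression z-score) per case
    """
    n = len(values)
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must contain at least one case")
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    mean_all = sum(values) / n
    # Population SD (n denominator) pairs with the sqrt(n1 * n0 / n^2) factor.
    sd = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    return (mean1 - mean0) / sd * math.sqrt(n1 * n0 / n ** 2)


# Toy data (made up): the two mutated cases have the lowest values.
r = point_biserial([1, 1, 0, 0], [1.0, 2.0, 3.0, 4.0])
```

A negative coefficient, as in this toy example, indicates lower values among mutated cases; a coefficient near zero, such as the r = −0.08 reported above, indicates little or no linear relationship.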

Discussion
Epidemiological studies have identified a strong association between CF and CRC risk, with CF patients having a much higher CRC rate and earlier age of onset. There are also more than 10 million CF carriers (individuals with a single mutated allele) in the U.S. population. CF carriers have decreased CFTR protein function compared to the general population [25] and may also be at increased risk. In addition, there is a subset of individuals with CRC who express reduced levels of CFTR in their tumors. It is unclear, however, if expression of CFTR is associated with overall and disease-specific survival in sporadic CRC with no known relationship to CF. To address this question, we carried out an analysis of the association of CFTR expression with overall and disease-specific risk of death in 1177 persons with CRC from three studies: TCGA COADREAD, GSE17538, and GSE39582. Analysis of each individual study showed an inverse relationship between CFTR expression and mortality, with the strongest association in GSE17538. In meta-analyses of these studies, we found that high versus low CFTR mRNA expression was significantly associated with a 30% decreased risk of overall death and 32% decreased risk of disease-specific death after adjusting for age, sex and stage at diagnosis when dichotomizing CRC patients by median CFTR expression. The associations did not differ by age, sex, or stage. CFTR expression was also similar across stages, suggesting that the changes in CFTR expression occur at earlier stages of CRC.
This analysis is consistent with the findings of our initial study of 90 patients with stage II CRC [10]. In that study, we reported that lower CFTR expression in CRC tumors was associated with a 29% decrease in 3-year disease-free survival for the 27% of patients with lower CFTR expression versus the 73% with higher expression [10]. The patient population in that initial study is comparable to the populations in TCGA COADREAD, GSE17538 and GSE39582 included in the current study in age and sex distribution [age: 73.4 (34.6-95.1 years); male: 42 (46.7%)] [26]. However, the initial study included only 90 persons with CRC and examined only cases diagnosed at stage II, while the current study examined 1177 cases diagnosed at stages II-IV. Thus, our current study with a much larger sample size extends our initial work by showing that similar associations exist for stages II-IV for overall and disease-specific death.
Further, we found a potential interaction between MSI/MSS status and CFTR expression (p = 0.06). The association between reduced CFTR expression and increased overall death appears to exist among those with MSS but not with MSI status. This result suggests that a differential effect of high versus low CFTR expression in MSS cases most likely drives the association of reduced CFTR expression with increased mortality. The finding of the association for the MSS cases is consistent with the findings from several studies which report that diminished CFTR activity leads to the activation of Wnt/β-catenin signaling, a fundamental pathway in CRC development [10,27,28]. Because the sample size for MSI cases is limited (31 deaths among 126 cases), we cannot conclude whether there is an association in this group. Nor could we examine the interaction with MSI/MSS status in relation to disease-specific death, since the CRC cases with MSI/MSS status and disease-specific death were available in the TCGA study only (number of CRC deaths = 6).
The biological basis for association between low CFTR expression and poor survival of individuals with CRC is unknown. However, it is known that loss of CFTR due to germline mutations in individuals with CF causes severe manifestations in the gastrointestinal tract that are potentially oncogenic, including intestinal obstruction, dysbiosis, and inflammation [1,4]. Experimentally, loss of CFTR in animal and cell culture models is associated with activation of the oncogenic Wnt/β-catenin [27,28] and NF-κB pathways [29,30].
A number of factors may contribute to differential expression of CFTR in CRC tumors. In CF carriers with CRC, CFTR expression may be limited due to the presence of a single mutant allele. However, the estimated frequency of CF carriers among individuals with CRC is ~6% [31], which is likely not sufficient to explain the association between CFTR expression and mortality.
CFTR mRNA levels could also be affected by somatic genetic alterations. The TCGA COADREAD study reports somatic mutations identified from whole exome sequencing. In the cases analyzed in our study, 22 mutations were reported in 17 of 300 cases, or 5.7%, including 8 classified as LOF and 14 classified as missense. We did not detect a correlation between CFTR mutation status and CFTR expression, although this analysis needs to be repeated in a larger study. Because of the relatively small number of mutations and the lack of correlation with expression, it is unlikely that these mutations are driving the inverse association that we report between CFTR expression and mortality.
CFTR levels may also be affected by many other processes including alterations in signaling pathways that control transcription factors regulating CFTR expression, large chromosomal changes, and epigenetic changes including DNA methylation or histone acetylation. CFTR transcription in the normal intestine is regulated by a complex array of factors acting at upstream and intronic enhancers as well as at promoters. In particular, cis regulatory elements within intron 1 and intron 11 are involved in intestinal-specific regulation. A number of factors promote intestinal-specific transcription through interaction with these elements [32]. Among these, CDX2 is of particular interest because loss of CDX2 is associated with CRC [33,34]. In addition, hypermethylation of the CFTR promoter and consequent downregulation have been associated with several cancers, including lung, breast and CRC [35][36][37]. However, the relationship between these factors, CFTR low expression and CRC survival remains to be determined.
Our study is the first to examine the association between CFTR expression and CRC survival, and it is large, comprising 374 overall deaths among 1177 CRC cases. A strength of our study is that we were able to account for stage, while a limitation is that we could not account for treatment or for lifestyle factors such as diet, smoking and obesity. We also could not analyze the effect of race because data on race were available in only two of the three studies and were severely limited in the TCGA COADREAD study, where 35% of cases were missing information on race. Given the higher CRC incidence and CRC deaths among the Black population, it will be important in the future to determine if decreased CFTR expression is associated with poor survival in this population as well. Finally, mRNA expression in the TCGA and GEO studies was determined using different platforms: RNA sequencing in TCGA COADREAD and RNA microarrays in the GSE studies. However, we normalized mRNA expression as z-scores, and the range between the highest and lowest z-scored CFTR expression values on the log scale was comparable across all three studies. We also modeled z-scored expression as quartiles or dichotomized at the median because this presentation is less sensitive to differences in measurement method and to outliers than modeling expression as a continuous variable. Importantly, the associations were in the same direction in all three studies and there was no evidence of heterogeneity across studies, which lends further credibility to our findings.
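The normalization and categorization steps described above can be illustrated with a minimal sketch. This is not the code used in our analysis: the function names, the log2 transform with a pseudocount of 1, the use of the sample standard deviation, and the example values are all assumptions made for illustration.

```python
import math
import statistics


def z_normalize_log(values):
    """Log-transform raw expression values, then z-score them within one study.

    Assumptions (not taken from the paper): log2 with a pseudocount of 1,
    and the sample standard deviation as the scaling factor.
    """
    logged = [math.log2(v + 1.0) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    return [(x - mean) / sd for x in logged]


def dichotomize_at_median(z_scores):
    """Label each case 'high' if above the within-study median, else 'low'."""
    median = statistics.median(z_scores)
    return ["high" if z > median else "low" for z in z_scores]


def quartile_labels(z_scores):
    """Assign each case to quartile 1 (lowest) through 4 (highest)."""
    q1, q2, q3 = statistics.quantiles(z_scores, n=4)
    labels = []
    for z in z_scores:
        if z <= q1:
            labels.append(1)
        elif z <= q2:
            labels.append(2)
        elif z <= q3:
            labels.append(3)
        else:
            labels.append(4)
    return labels


# Hypothetical raw expression values for one study (illustrative only).
example_values = [10, 20, 40, 80, 160, 320, 640, 1280]
example_z = z_normalize_log(example_values)
example_groups = dichotomize_at_median(example_z)
example_quartiles = quartile_labels(example_z)
```

Because the z-scores and cut points are computed within each study, the resulting categories depend only on the rank order of cases in that study, which is why they are less sensitive to platform differences than raw values.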
We examined the association of loss of mRNA expression with survival, and association of somatic CFTR mutations with CFTR expression. However, it was beyond the scope of this study to investigate the effects of germline mutations in CF carriers. In future work it will be important to identify CRC cases with CFTR germline mutations to provide a more complete picture of the impact of loss of CFTR activity on survival. In addition, new CF modulator therapies which are mutation-specific may be applicable to these cases. Finally, identification of the spectrum of mutations found in CRC may aid in understanding which CFTR functions are important for tumor suppression.

Conclusions
In summary, consistent with our hypothesis, we have found an association between higher CFTR expression and lower risk of overall and disease-specific death in individuals with CRC. This association is potentially clinically significant because individuals with low CFTR expression may benefit from more aggressive treatment. In addition, the last 10 years have seen the development of highly effective modulator therapies for the treatment of CF. These small molecule therapies restore function of mutant CFTR proteins and thus treat the underlying cause of CF rather than the symptoms and effects [38]. Some of these therapies may be applicable for treatment of CRC tumors expressing low levels of CFTR. CF modulators may be useful in several CRC situations:
1. In CRC tumors that harbor known CF-causing mutations, modulators may be available to restore function of specific mutations. For example, therapies currently under development to promote readthrough of premature stop codons could be used to rescue CFTR function in cases harboring nonsense mutations [39].
2. Rare CF-causing mutations are tested for response to modulators using patient-derived cell lines and organoids. Similar reagents could be used to test uncharacterized CFTR mutations in CRC for response to modulators [40].
3. Modulators such as ivacaftor that increase CFTR ion channel activity by increasing time in the open conformation could potentially compensate for low expression of CFTR mRNA by increasing the activity of the relatively small amount of protein that is made [41].
Our study highlights the importance of understanding mechanisms of downregulation of CFTR in CRC. Our future studies will broaden our work to address this question.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers15030989/s1, Table S1: CFTR mRNA expression in the TCGA COADREAD, GSE17538, and GSE39582 studies; Table S2: Hazard ratio (HR) and 95% confidence interval (CI) for overall and disease-specific death in relation to CFTR mRNA expression, presented as quartiles in individuals with colon cancer in the TCGA COADREAD study; Table S3: Hazard ratio (HR) and 95% confidence interval (CI) for overall death in relation to CFTR mRNA expression, dichotomized at median among individuals with CRC/colon cancer in TCGA COADREAD, GSE39582, and GSE17538 studies; Table S4: Meta-analysis: Hazard ratios (HRs) and 95% confidence interval (CI) for overall death of individuals with CRC in relation to CFTR expression (dichotomized at median) stratified by stage at diagnosis, age, sex, and MSI/MSS status; Table S5: Hazard ratio (HR) and 95% confidence interval (CI) for disease-specific death in relation to CFTR mRNA expression, dichotomized at median among individuals with CRC in the TCGA COADREAD