KLF4 Induces Mesenchymal–Epithelial Transition (MET) by Suppressing Multiple EMT-Inducing Transcription Factors

Simple Summary Cancer is expected to be the leading cause of death due to noncommunicable diseases in the 21st century. Cancer-related mortality is largely due to metastasis. Cancer cells undergoing metastasis exhibit Epithelial–Mesenchymal Plasticity where they can transition from an epithelial to mesenchymal (EMT) or from a mesenchymal to epithelial (MET) phenotype. These transitions are crucial for the success of various stages of metastasis. Both these processes are modulated by multiple EMT-inducing and MET-inducing factors acting in concert. While EMT inducers are well-recognized, MET inducers are relatively poorly investigated. Here, we investigated the role of KLF4 through mechanism-based mathematical models and transcriptomic data analysis and identified it to be a potential MET inducer by suppressing one or more EMT inducers directly and/or indirectly. Abstract Epithelial–Mesenchymal Plasticity (EMP) refers to reversible dynamic processes where cells can transition from epithelial to mesenchymal (EMT) or from mesenchymal to epithelial (MET) phenotypes. Both these processes are modulated by multiple transcription factors acting in concert. While EMT-inducing transcription factors (TFs)—TWIST1/2, ZEB1/2, SNAIL1/2/3, GSC, and FOXC2—are well-characterized, the MET-inducing TFs are relatively poorly understood (OVOL1/2 and GRHL1/2). Here, using mechanism-based mathematical modeling, we show that transcription factor KLF4 can delay the onset of EMT by suppressing multiple EMT-TFs. Our simulations suggest that KLF4 overexpression can promote a phenotypic shift toward a more epithelial state, an observation suggested by the negative correlation of KLF4 with EMT-TFs and with transcriptomic-based EMT scoring metrics in cancer cell lines. We also show that the influence of KLF4 in modulating the EMT dynamics can be strengthened by its ability to inhibit cell-state transitions at the epigenetic level. Thus, KLF4 can inhibit EMT through multiple parallel paths and can act as a putative MET-TF. KLF4 associates with the patient survival metrics across multiple cancers in a context-specific manner, highlighting the complex association of EMP with patient survival.


Introduction
Cancer is expected to surpass all noncommunicable disease-related deaths in the 21st century, making it a major global public health threat [1]. Nearly all cancer-related deaths can be attributed to the process of metastasis [2]. Metastasizing cells can migrate and invade, traits that enable them to disseminate throughout the body [3]. However, metastasis is a highly inefficient process with attrition rates as high as >99.5% [4]; thus, only between the KLF4 levels and enrichment of EMT. We also incorporated the impact of the epigenetic influence mediated by KLF4 and SNAIL in a population dynamics scenario and demonstrated that KLF4-mediated 'epigenetic locking' can enable resistance to EMT, while SNAIL-mediated effects can drive a stronger EMT. Finally, we propose KLF4 as a potential MET-TF that can repress many EMT-TFs simultaneously and inhibit EMT through multiple parallel paths. These observations are supported by the observed association of KLF4 with patient survival metrics across multiple cancers.

KLF4 Inhibits the Progression of EMT
We began by examining the role of KLF4 in modulating EMT dynamics. To do this we investigated the dynamics of the interaction between KLF4 and a core EMT regulatory circuit (denoted by the black dotted rectangle in Figure 1A) comprised of four players: three EMT-inducing transcription factors (EMT-TFs)-ZEB1/2, SNAIL, and SLUG-and an EMT-inhibiting microRNA family (miR-200).
Cancers 2021, 13,5135 3 of 18 consequently, its overexpression can induce a partial or complete MET, similar to the observations for GRHL2 [52][53][54]. An analysis of in vitro transcriptomic datasets and cancer patient samples from The Cancer Genome Atlas (TCGA) revealed a negative correlation between the KLF4 levels and enrichment of EMT. We also incorporated the impact of the epigenetic influence mediated by KLF4 and SNAIL in a population dynamics scenario and demonstrated that KLF4-mediated 'epigenetic locking' can enable resistance to EMT, while SNAIL-mediated effects can drive a stronger EMT. Finally, we propose KLF4 as a potential MET-TF that can repress many EMT-TFs simultaneously and inhibit EMT through multiple parallel paths. These observations are supported by the observed association of KLF4 with patient survival metrics across multiple cancers.

KLF4 Inhibits the Progression of EMT
We began by examining the role of KLF4 in modulating EMT dynamics. To do this we investigated the dynamics of the interaction between KLF4 and a core EMT regulatory circuit (denoted by the black dotted rectangle in Figure 1A) comprised of four players: three EMT-inducing transcription factors (EMT-TFs)-ZEB1/2, SNAIL, and SLUG-and an EMT-inhibiting microRNA family (miR-200). Green arrows denote activation, and red bars indicate inhibition. Solid arrows represent transcriptional regulation; a dotted line represents microRNA-mediated regulation. The circuit shown within the dotted rectangle is the control case (i.e., core EMT network without KLF4). (B) Bifurcation diagrams indicating ZEB1/2 mRNA levels for increasing the external signal (I) levels for the coupled EMT-KLF4 circuit (solid blue and dotted red curve) and the core EMT circuit (solid green and dotted black curve). (C) Temporal dynamics of the ZEB1/2 mRNA levels in a cell starting in an epithelial phenotype when exposed to a high level of an external EMT signal (I_ext = 100,000 molecules) (pink-shaded region) for the circuits shown in (A)). (D-F) Phase diagrams for the KLF4-EMT network driven by an external signal (I_ext) for (D) varying strengths of repression on SNAIL by KLF4, (E) for varying strengths of repression on KLF4 by SNAIL, and (F) for varying strengths of repression on SLUG by KLF4. Green arrows denote activation, and red bars indicate inhibition. Solid arrows represent transcriptional regulation; a dotted line represents microRNA-mediated regulation. The circuit shown within the dotted rectangle is the control case (i.e., core EMT network without KLF4). (B) Bifurcation diagrams indicating ZEB1/2 mRNA levels for increasing the external signal (I) levels for the coupled EMT-KLF4 circuit (solid blue and dotted red curve) and the core EMT circuit (solid green and dotted black curve). (C) Temporal dynamics of the ZEB1/2 mRNA levels in a cell starting in an epithelial phenotype when exposed to a high level of an external EMT signal (I_ext = 100,000 molecules) (pink-shaded region) for the circuits shown in (A)). (D-F) Phase diagrams for the KLF4-EMT network driven by an external signal (I_ext) for (D) varying strengths of repression on SNAIL by KLF4, (E) for varying strengths of repression on KLF4 by SNAIL, and (F) for varying strengths of repression on SLUG by KLF4.
First, we calculated a bifurcation diagram of ZEB1/2 mRNA levels in response to an external EMT-inducing signal I_ext, which can represent various possible intracellular/extracellular stimuli that can drive EMT ( Figure 1B). The ZEB1/2 mRNA levels served as a readout of the EMT phenotypes. We observed that, with an increase in the levels of I_ext, the cells could switch from an epithelial state (ZEB1/2 mRNA < 100 molecules) to a hybrid E/M phenotype (100 < ZEB1/2 mRNA molecules < 600) and, finally, to a mesenchymal state (ZEB1/2 mRNA > 600 molecules). This behavior was observed both in the absence of KLF4 (curve with a green solid line and black dashed line in Figure 1B) and in the presence of KLF4 (curve with a blue solid line and red dashed line in Figure 1B), and the bifurcation curves looked similar in shape. However, in the presence of KLF4, we observed that the system required a stronger EMT signal to be pushed out of an epithelial state and, also, for the acquisition of a completely mesenchymal state (i.e., the bifurcation diagram in the presence of KLF4 was shifted to the right as compared to that of the core network without KLF4). This observation suggests that KLF4 can inhibit the progression of EMT.
To test the robustness of the model prediction for the role of KLF4 in EMT, we performed a sensitivity analysis in which we varied the numerical value of every kinetic parameter used in the model by ±10% one at a time and captured the changes in the range of the I_ext for which the hybrid E/M state existed in the bifurcation diagram. Except for a few parameter cases involving ZEB1/2 and miR200 interactions, this change was found to be less than 10% for a corresponding 10% change in the individual parameter values. Specifically, for variations in the kinetic parameters corresponding to the interactions of KLF4 with the core EMT circuit, this change did not extend beyond 1% ( Figure S1A). Thus, the observed behavior of KLF4 in its ability to delay or inhibit EMT is robust to small parametric variations.
Next, we determined the temporal response of cells to a fixed concentration of the external EMT-inducing signal I ext . In the absence of KLF4, cells in the epithelial state transitioned first to a hybrid E/M state and then to a mesenchymal state in response to an external signal (red curve in Figure 1C). However, in the presence of KLF4, this transition was much more gradual and relatively slower (blue curve in Figure 1C). In addition, the steady-state value of ZEB1/2 mRNA levels was lower in the presence of KLF4 as compared to the control case. This decrease can be attributed to the KLF4-mediated inhibition of both SLUG and SNAIL that can activate ZEB1/2. Additionally, it was consistent with the trends in ZEB1/2 mRNA level bifurcation diagram (the blue curve lies below the green curve at all the values of I_ext in Figure 1B).
KLF4 inhibits both SLUG and SNAIL and is inhibited by both of them. Thus, we probed the impact of the interactions between KLF4 and both of these EMT-TFs in terms of influencing EMT progression. First, we varied the strength of the repression of SNAIL by KLF4. When this repression was strong (i.e., low λ KS or low K 0 S values), the cells required a stronger EMT-inducing signal to be pushed out of the epithelial state. Conversely, when KLF4 inhibited SNAIL weakly (higher λ KS or K 0 S values), EMT could be induced at lower values of I_ext ( Figures 1D and S1B). Next, we varied the repression of KLF4 by SNAIL. At a stronger repression (i.e., low λ SK or low S 0 K values), the cells could exit the epithelial state at a weaker external EMT-inducing signal. Conversely, when SNAIL inhibited KLF4 weakly (higher λ SK or S 0 K values), a stronger stimulus was required for the cells to exit the epithelial state ( Figures 1E and S1C). Put together, these results highlighted that, while a weaker impact of KLF4-through either a stronger repression of KLF4 by SNAIL or by a weaker repression of SNAIL by KLF4-potentiated the progression of EMT, a stronger impact of KLF4 prevented cells from undergoing EMT. Similar results were seen for the feedback loop between SLUG and KLF4 ( Figures 1F and S1D,E), but the impact on the EMT dynamics was weaker upon altering the inhibition of SLUG by KLF4 than that of SNAIL by KLF4. Upon altering either λ KSl or K 0 Sl, we did not observe any change in concentration of I ext needed to induce EMT, as seen for the case with SNAIL (compare Figure S1D with Figure 1D and Figure S1E with Figure S1B). This difference may be explained by reports suggesting that SNAIL is a more potent EMT inducer than SLUG [9,46]. This hypothesis is strengthened by observations that SLUG self-activation does not alter the qualitative dynamics of the KLF4-SNAIL interactions ( Figure S2A-C). A stronger KLF4 self-activation, on the other hand, can increase resistance to undergo EMT ( Figure S2D).

KLF4 Promotes an Epithelial Phenotype
We next examined whether the impact of KLF4 in inhibiting EMT can be a generic emergent property of the topology of the regulatory network that it forms with SLUG, SNAIL, and ZEB1/21, instead of the behavior of a specific parameter set. Thus, to map the possible phenotypic space of the network shown in Figure 1A, we simulated its dynamics using the computational framework RACIPE [55]. RACIPE considers the topology of the gene regulatory network as an input and converts that network topology information into a set of coupled ordinary differential equations (ODEs). These ODEs are solved over a wide range of biologically relevant parameter values to identify the different possible steady-state gene expression levels (phenotypes) that the network is capable of giving rise to.
After obtaining the possible phenotypes from the RACIPE analysis, we plotted the distributions of the steady-state levels of different nodes in the gene regulatory network obtained across the ensemble of parameter sets. KLF4, SLUG, ZEB1, and miR-200 showed bimodal distribution, while SNAIL showed a unimodal distribution ( Figure S3A); thus, SNAIL was excluded from the subsequent clustering analysis. We plotted the steady states obtained from RACIPE derived as a heatmap (Figure 2A, left). Hierarchical clustering performed on the heatmap revealed two predominant clusters-one with low ZEB1/2, low SLUG, high miR200, and high KLF4 and another with high ZEB1/2, high SLUG, low miR200, and low KLF4 levels. These clusters can be mapped onto epithelial and mesenchymal phenotypes, respectively (green and orange bars in Figure 2A). Next, we perturbed the production rates of KLF4 to mimic the over-and under-expression of KLF4 and assessed its effect on the frequency of the observed epithelial and mesenchymal phenotypes. Overexpression of KLF4 led to an increased frequency of the epithelial cluster and a decrease in the mesenchymal cluster (Figure 2A, right). As expected, opposite patterns were observed upon the inhibition and over-expression of ZEB1, an EMT-TF ( Figure S3B). We quantified the change in the fraction of the epithelial phenotype when KLF4, ZEB1, and SLUG are either overexpressed or inhibited one at a time. The overexpression of KLF4 or downregulation of either SLUG or ZEB1 increased the frequency of the epithelial phenotype ( Figure 2B).
To investigate the classification of phenotypes in a more quantitative manner, and to characterize heterogeneity in EMT in a given cell population, we defined an EMT score (= ZEB1 + SLUG − miR-200 − CDH1) for an extended regulatory network that included E-cadherin and its connections with SLUG and ZEB1 ( Figure S3C). The distribution of the EMT scores plotted for the solutions seen across the parameter sets in RACIPE revealed a trimodal distribution, indicating the existence of three distinct EMT phenotypes: epithelial, mesenchymal, and hybrid E/M ( Figure 2C, top panel). A simulated knockdown of KLF4 drove an increase in the mesenchymal subpopulation with a concurrent decrease in the epithelial states ( Figure 2C, middle panel). Opposite trends were observed for the case of simulated KLF4 overexpression ( Figure 2C, bottom panel). These results highlighted that the emergent dynamics of the KLF4-EMT network can allow for multiple phenotypes to coexist in an isogenic population; and the KLF4 levels can modulate the distribution of phenotypic heterogeneity in that population towards a more epithelial or a more mesenchymal state. To interrogate the role of KLF4 in EMT/MET further, we performed a pairwise correlation analysis in the Cancer Cell Line Encyclopedia (CCLE) cohort among three distinct transcriptomic-based EMT scoring metrics (76GS, KS, and MLR) and the expression levels of KLF4 and those of the other canonical epithelial and mesenchymal factors. KLF4 correlated negatively with the KS and MLR EMT scoring metrics (higher KS or MLR scores denote a mesenchymal phenotype [56]) but positively with the 76GS scores (higher 76GS scores denote a more epithelial phenotype [56]) ( Figure 2D,i). Most EMT-TFs were found to be correlated positively with each other (SNAI1/2, ZEB1/2, and TWIST1) and negatively with KLF4 and the other MET drivers, such as ESRP1/2, OVOL1/2, and GRHL2 [57], which were all positively corelated with KLF4 ( Figure 2D,i). Consistent correlations were recapitulated in the RACIPE simulation data for the KLF4-EMT network ( Figure 2D,ii), thus underscoring that the gene regulatory network considered in Figure 1A can explain these observed experimental trends for the existence of 'teams' [58] of EMT and MET inducers. Interestingly, GRHL2 seemed to correlate more strongly with ZEB1, ZEB2, and TWIST1 and the MLR and KS scores as compared to KLF4 ( Figure 2D,i), thus encouraging us to compare the influence of KLF4 and GRHL2 in terms of their ability to induce MET via simulations. We compared the over expression (OE) and down expression (DE) scenarios of GRHL2 and KLF4 in terms of influencing the distribution of the epithelial and mesenchymal phenotypes and noted a stronger enrichment of mesenchymal upon the downregulation of GRHL2 than that seen upon the downregulation of KLF4 ( Figure 2E and S3D). Thus, our results suggest that KLF4, similar to GRHL2, can induce a partial or full MET ( Figure 2F). To interrogate the role of KLF4 in EMT/MET further, we performed a pairwise correlation analysis in the Cancer Cell Line Encyclopedia (CCLE) cohort among three distinct transcriptomic-based EMT scoring metrics (76GS, KS, and MLR) and the expression levels of KLF4 and those of the other canonical epithelial and mesenchymal factors. KLF4 correlated negatively with the KS and MLR EMT scoring metrics (higher KS or MLR scores denote a mesenchymal phenotype [56]) but positively with the 76GS scores (higher 76GS scores denote a more epithelial phenotype [56]) ( Figure 2D,i). Most EMT-TFs were found to be correlated positively with each other (SNAI1/2, ZEB1/2, and TWIST1) and negatively with KLF4 and the other MET drivers, such as ESRP1/2, OVOL1/2, and GRHL2 [57], which were all positively corelated with KLF4 ( Figure 2D,i). Consistent correlations were recapitulated in the RACIPE simulation data for the KLF4-EMT network ( Figure 2D,ii), thus underscoring that the gene regulatory network considered in Figure 1A can explain these observed experimental trends for the existence of 'teams' [58] of EMT and MET inducers. Interestingly, GRHL2 seemed to correlate more strongly with ZEB1, ZEB2, and TWIST1 and the MLR and KS scores as compared to KLF4 ( Figure 2D,i), thus encouraging us to compare the influence of KLF4 and GRHL2 in terms of their ability to induce MET via simulations. We compared the over expression (OE) and down expression (DE) scenarios of GRHL2 and KLF4 in terms of influencing the distribution of the epithelial and mesenchymal phenotypes and noted a stronger enrichment of mesenchymal upon the downregulation of GRHL2 than that seen upon the downregulation of KLF4 ( Figures 2E and S3D). Thus, our results suggest that KLF4, similar to GRHL2, can induce a partial or full MET ( Figure 2F).

KLF4 Is Inhibited during EMT
Next, using various publicly available transcriptomic datasets, we examined if KLF4 is inhibited as cells undergo EMT. In mouse mammary cells EpRas induced to undergo EMT by TGFβ treatment for 14 days [59], KLF4 levels were reduced (GSE59922; Figure 3A). Similarly, when EMT was induced in HMEC cells via the overexpression of SNAIL or SLUG [9], KLF4 levels went down (GSE40690; Figure 3B). Reinforcing trends were seen in MCF-7 cells forced to undergo EMT via the overexpression of SNAIL [60] (GSE58252; Figure 3C), in OVCA4209 cells in which GRHL2 was knocked down [61] (GSE118407; Figure 3D), and in MCF10A cells cultured without growth factors and shown to undergo EMT [62] (GSE85857; Figure 3E). Further, as compared to the primary HMEC (human mammary epithelial cells), immortalized and Ras-transformed HMECs were enriched for EMT-associated genes (GSE110677) and had lower KLF4 levels ( Figure 3F). Together, these datasets across cancer types reveal a robust reduction in KLF4 expression with the onset of EMT.

KLF4 Is Inhibited during EMT
Next, using various publicly available transcriptomic datasets, we examined if KLF4 is inhibited as cells undergo EMT. In mouse mammary cells EpRas induced to undergo EMT by TGFβ treatment for 14 days [59], KLF4 levels were reduced (GSE59922; Figure  3A). Similarly, when EMT was induced in HMEC cells via the overexpression of SNAIL or SLUG [9], KLF4 levels went down (GSE40690; Figure 3B). Reinforcing trends were seen in MCF-7 cells forced to undergo EMT via the overexpression of SNAIL [60] (GSE58252; Figure 3C), in OVCA4209 cells in which GRHL2 was knocked down [61] Figure 3E). Further, as compared to the primary HMEC (human mammary epithelial cells), immortalized and Ras-transformed HMECs were enriched for EMT-associated genes (GSE110677) and had lower KLF4 levels ( Figure 3F). Together, these datasets across cancer types reveal a robust reduction in KLF4 expression with the onset of EMT. To substantiate these in vitro observations with clinical data, we compared KLF4 levels across cancers in The Cancer Genome Atlas (TCGA). KLF4 expression was found to be lower in cancers with a more mesenchymal phenotype, as measured by their higher KSbased EMT scores [63]. Mesenchymal-like cancers, such as uveal melanoma (UVM), uterine carcinosarcoma (UCS), glioblastoma (GBM), and low-grade glioma (LGG), tend to have lower KLF4 expressions (shown in red in Figure 3G). Conversely, the KLF4 levels were higher in epithelial-like cancer types, such as the head and neck squamous cell carcinoma (HNSC), esophageal carcinoma (ESCA), stomach adenocarcinoma (STAD), and cervical carcinoma (CESC) samples (shown in blue; Figure 3G). ZEB1 and SNAIL, on the other hand, showed opposite trends to KLF4: enriched in cancers with a higher KS score: LGG, GBM, UCS, SARC (sarcoma), and PCPG (pheochromocytoma and paraganglioma) but reduced in those with a lower one: HNSC, COAD (colorectal adeno-carcinoma), CESC, BLCA (bladder carcinoma), and READ (rectum adenocarcinoma) (Figures 3H and To substantiate these in vitro observations with clinical data, we compared KLF4 levels across cancers in The Cancer Genome Atlas (TCGA). KLF4 expression was found to be lower in cancers with a more mesenchymal phenotype, as measured by their higher KS-based EMT scores [63]. Mesenchymal-like cancers, such as uveal melanoma (UVM), uterine carcinosarcoma (UCS), glioblastoma (GBM), and low-grade glioma (LGG), tend to have lower KLF4 expressions (shown in red in Figure 3G). Conversely, the KLF4 levels were higher in epithelial-like cancer types, such as the head and neck squamous cell carcinoma (HNSC), esophageal carcinoma (ESCA), stomach adenocarcinoma (STAD), and cervical carcinoma (CESC) samples (shown in blue; Figure 3G). ZEB1 and SNAIL, on the other hand, showed opposite trends to KLF4: enriched in cancers with a higher KS score: LGG, GBM, UCS, SARC (sarcoma), and PCPG (pheochromocytoma and paraganglioma) but  (Figures 3H and S4A). Hence, an inverse correlation of KLF4 with multiple EMT-TFs seen in vitro is consistently observed in TCGA samples.

Epigenetic Changes, including KLF4 Promoter Methylation, Can Alter Population Distributions along the EMT Spectrum
A decrease in KLF4 expression has been reported to be associated with the hypermethylation of the KLF4 promoter during EMT in renal fibrosis in vitro and in vivo [64]. Thus, we examined the correlation of KLF4 expression with its methylation status in TCGA data. We observed a reduced methylation of KLF4 in many cancers with reduced KS scores, such as HNSC, ESCA, and COAD. Consistent with this observation, KLF4 expression and methylation status were negatively correlated ( Figure 4A), reminiscent of the observations in the renal cancer cell lines and tissues and suggesting a possible epigenetic mechanism driving its suppression during EMT. Consistently, a DNA methyltransferase inhibitor increased KLF4 expression in renal cancers [65]. SNAIL expression was also negatively correlated with the corresponding promoter methylation levels in TCGA; however, ZEB1 did not show a clear pattern ( Figure S4B,C). These observations drove us to investigate the impact of the epigenetic influence operating in the KLF4 and SNAIL feedback loop.

Epigenetic Changes, including KLF4 Promoter Methylation, Can Alter Population Distributions along the EMT Spectrum
A decrease in KLF4 expression has been reported to be associated with the hypermethylation of the KLF4 promoter during EMT in renal fibrosis in vitro and in vivo [64]. Thus, we examined the correlation of KLF4 expression with its methylation status in TCGA data. We observed a reduced methylation of KLF4 in many cancers with reduced KS scores, such as HNSC, ESCA, and COAD. Consistent with this observation, KLF4 expression and methylation status were negatively correlated ( Figure 4A), reminiscent of the observations in the renal cancer cell lines and tissues and suggesting a possible epigenetic mechanism driving its suppression during EMT. Consistently, a DNA methyltransferase inhibitor increased KLF4 expression in renal cancers [65]. SNAIL expression was also negatively correlated with the corresponding promoter methylation levels in TCGA; however, ZEB1 did not show a clear pattern ( Figure S4B,C). These observations drove us to investigate the impact of the epigenetic influence operating in the KLF4 and SNAIL feedback loop. Epigenetic changes can drastically alter the rates of transition among the different cell phenotypes by controlling the accessibility of the promoters for 'master regulators'. In the context of EMT, we have previously shown that epigenetic feedback mediated by ZEB1 while repressing miR-200 (i.e., blocking the access of EMT inducers to the miR-200 promoter) can drive irreversible EMT, while that mediated by GRHL2 (i.e., blocking access to the ZEB1 promoter for EMT inducers) in inhibiting ZEB1 can enable irreversible Epigenetic changes can drastically alter the rates of transition among the different cell phenotypes by controlling the accessibility of the promoters for 'master regulators'. In the context of EMT, we have previously shown that epigenetic feedback mediated by ZEB1 while repressing miR-200 (i.e., blocking the access of EMT inducers to the miR-200 promoter) can drive irreversible EMT, while that mediated by GRHL2 (i.e., blocking access to the ZEB1 promoter for EMT inducers) in inhibiting ZEB1 can enable irreversible MET, i.e., a resistance of cells to undergo EMT [66,67]. Here, we assessed the impact of the KLF4-mediated epigenetic silencing of SNAIL (i.e., the ability of KLF4 to cause methylation of the SNAIL promoter directly or indirectly) and vice versa (SNAIL-mediated epigenetic silencing of KLF4) with a population dynamics model capturing a cell population with diverse EMT states (epithelial, mesenchymal, and hybrid E/M). This phenomenological model encapsulates the epigenetic influence by modulating the threshold for the impact of a transcription factor on the expression of its downstream target [68]. Such dynamic thresholds capturing the epigenetic influence often enable the self-stabilization of gene expression states, i.e., the longer a transcription factor has been active, the easier it becomes for it to stay 'on'. We introduced two epigenetic variables: α 1 and α 2 . The higher the value of α 1 , the stronger is the influence of the KLF4-mediated effective epigenetic silencing of SNAIL. The higher the value of α 2 , the stronger is the influence of the SNAIL-mediated effective epigenetic silencing of KLF4 (see Methods for details).
As a first step towards understanding the dynamics of this epigenetic 'tug of war' between KLF4 and SNAIL, we characterized how the bifurcation diagram of the KLF4-EMT-coupled circuit changed at various values of α 1 and α 2 . When the epigenetic silencing of SNAIL mediated by KLF4 was higher than that of KLF4 mediated by SNAIL ((α 1 , α 2 ) = (0.75, 0.1)), a larger EMT-inducing signal (I_ext) was required to push cells out of an epithelial state, because SNAIL was being strongly repressed by KLF4 as compared to the control case in which there is no epigenetic influence (compare the blue/red curve with the black/yellow curve in Figure 4B). Conversely, when the epigenetic silencing of KLF4 predominated ((α 1 , α 2 ) = (0.25, 0.75)), it was easier for cells to exit an epithelial state, presumably because the KLF4 repression of EMT was now being inhibited more potently by SNAIL relative to the control case (compare the blue/red curve with the black/green curve in Figure 4B). Thus, these opposing epigenetic 'forces' can 'push' the bifurcation diagram in different directions along the x-axis without impacting any of its major qualitative features.
To consolidate these results, we next performed stochastic simulations for a population of 500 cells at a fixed value of I_ext = 90,000 molecules. We observed a stable phenotypic distribution with 6% epithelial (E), 28% mesenchymal (M), and 66% hybrid E/M cells ( Figure 4C, top) in the absence of any epigenetic regulation (α 1 = α 2 = 0). In the case of a stronger epigenetic repression of SNAIL by KLF4 (α 1 = 0.75, α 2 = 0.1), the population distribution changed to 32% epithelial (E), 3% mesenchymal (M), and 65% hybrid E/M cells ( Figure 4C, middle). Conversely, when SNAIL repressed KLF4 more dominantly (α 1 = 0.25 and α 2 = 0.75), the population distribution changed to 1% epithelial (E), 58% mesenchymal (M), and 41% hybrid E/M cells ( Figure 4C, bottom). A similar analysis was performed for collating steady-state distributions for a range of α 1 and α 2 values, revealing that high α 1 and low α 2 values favored the predominance of an epithelial phenotype ( Figure 4D, top), but low α 1 and high α 2 values facilitated a mesenchymal phenotype ( Figure 4D, bottom). Intriguingly, when the strength of the epigenetic repression from KLF4 to SNAIL and vice versa was comparable, the hybrid E/M phenotype dominated ( Figure 4D, middle). Put together, varying extents of epigenetic silencing mediated by EMT-TF SNAIL and a MET-TF KLF4 can fine tune the epithelial-hybrid-mesenchymal heterogeneity patterns in a cell population.

KLF4 Correlates with Patient Survival
To determine the effects of KLF4 on clinical outcomes, we investigated the correlation between KLF4 and patient survival. We observed that high KLF4 levels correlated with better relapse-free survival ( Figure 5A,B) and better overall survival ( Figure 5C,D) in two specific breast cancer datasets-GSE42568 (n = 104 breast cancer biopsies) [69] and GSE3494 (n = 251 primary breast tumors) [70]. However, the trend was reversed in terms of the overall survival data ( Figure 5E,F) in ovarian cancer-GSE26712 (n = 195 tumor specimens) [71] and GSE30161 (n = 58 cancer samples) [72] and lung cancer ( Figure 5G,H)-GSE30219 (n = 293 lung tumor samples) [73] and caArray (n = 504 samples) [74], where high KLF4 levels correlated with worse patient outcomes.
Cancers 2021, 13, 5135 10 of 18 -GSE30219 (n = 293 lung tumor samples) [73] and caArray (n = 504 samples) [74], where high KLF4 levels correlated with worse patient outcomes. Given that high KLF4 expression associates with a more epithelial phenotype, these results, when extrapolated to indicate the extent of EMT/MET, suggest that EMT associates with a worse survival in breast cancer but not necessarily in ovarian cancer and lung cancer, as far as these limited datasets being analyzed are concerned. These results are reminiscent of previous observations that EMT need not universally correlate with worse patient survival outcomes and can depend on the cancer type being investigated [63,75]. Therefore, the association of KLF4 with survival seems to be tumor type-specific, and future studies are needed to decipher the interplay between KLF4 and EMT/MET states as a prognostic marker of clinical outcomes in a cancer-specific manner.

Discussion
We hereby propose KLF4 as a potential MET-inducing transcription factor (MET-TFs) based on in silico model predictions and their experimental validation across multiple in vitro and cancer patient sample datasets. This observation adds to the increasing literature on the role of KLF4 in inhibiting EMT and/or driving MET in different biological contexts. For instance, in the colon epithelial cell line RKO, KLF4 upregulates the levels of various epithelial-specific keratins, such as KRT8 and KRT18 [76]. Similarly, in nasopharyngeal carcinoma, KLF4 can transcriptionally activate E-cadherin and reduce the motility and invasion of cells. This reduction is at least partly rescued by shRNA-mediated E-cadherin knockdown in KLF4-expressing cells, suggesting a functional role of E-cadherin in regulating these traits [77]. The direct transcriptional activation of CDH1 (E-cadherin) by KLF4 has also been noted in MCF10A cells [78]. Further, the overexpression of KLF4 in MDA-MB-231 breast cancer cells restored E-cadherin levels, induced an epithelial morphology, and suppressed migration and invasion [78], similar to previous observations in Given that high KLF4 expression associates with a more epithelial phenotype, these results, when extrapolated to indicate the extent of EMT/MET, suggest that EMT associates with a worse survival in breast cancer but not necessarily in ovarian cancer and lung cancer, as far as these limited datasets being analyzed are concerned. These results are reminiscent of previous observations that EMT need not universally correlate with worse patient survival outcomes and can depend on the cancer type being investigated [63,75]. Therefore, the association of KLF4 with survival seems to be tumor type-specific, and future studies are needed to decipher the interplay between KLF4 and EMT/MET states as a prognostic marker of clinical outcomes in a cancer-specific manner.

Discussion
We hereby propose KLF4 as a potential MET-inducing transcription factor (MET-TFs) based on in silico model predictions and their experimental validation across multiple in vitro and cancer patient sample datasets. This observation adds to the increasing literature on the role of KLF4 in inhibiting EMT and/or driving MET in different biological contexts. For instance, in the colon epithelial cell line RKO, KLF4 upregulates the levels of various epithelial-specific keratins, such as KRT8 and KRT18 [76]. Similarly, in nasopharyngeal carcinoma, KLF4 can transcriptionally activate E-cadherin and reduce the motility and invasion of cells. This reduction is at least partly rescued by shRNA-mediated E-cadherin knockdown in KLF4-expressing cells, suggesting a functional role of E-cadherin in regulating these traits [77]. The direct transcriptional activation of CDH1 (E-cadherin) by KLF4 has also been noted in MCF10A cells [78]. Further, the overexpression of KLF4 in MDA-MB-231 breast cancer cells restored E-cadherin levels, induced an epithelial morphology, and suppressed migration and invasion [78], similar to previous observations in these cells for another MET-TF, GRHL2 [14]. Consistently, KLF4 overexpression decreased levels of vimentin and Slug and increased those of E-cadherin in OVCAR3 ovarian cancer cells [79]. These observations are reminiscent of the effect of KLF4 knockdown in a prostate stem cell line where the cells lost their epithelial markers, such as E-cadherin, ZO-1, and cytokeratin 8, and showed elevated levels of vimentin, SNAIL, SLUG, and ZEB1 [80]. Supporting these in vitro observations, in pancreatic cancer samples, KLF4 correlated positively with E-cadherin and negatively with vimentin and Cav-1, a direct transcriptional target of KLF4 that can inhibit EMT in pancreatic cancer [81].
KLF4 can also promote stemness in various cancers where it promotes epithelial differentiation, thereby challenging the tacitly assumed association between EMT and cancer stem cells (CSCs) [82]. In breast cancer, KLF4 knockdown reduced ALDH1 + CSCs and mammosphere formation in vitro in MCF7 and MDA-MB-231 cells [41]. In vivo tumorigenesis and metastasis were also compromised in KLF4-depleted NOD/SCID mice [41,83]. In hepatocellular carcinoma, KLF4 directly activated EpCAM, increased the number of EpCAM + /CD133 + liver cancer stem cells in vitro, and amplified the tumorigenesis in vivo [84]. Similarly, in osteosarcoma cells, KLF4 suppression prevented sphere formation and attenuated the levels of many stem cell-related markers, including ALDH1A1 [85]. Conversely, KLF4-overexpressing cells were more chemoresistant and metastatic [86], and osteosarcoma stem cells had increased levels of KLF4 [87]. Considered together, these observations suggest that KLF4 may associate with more epithelial-like and/or hybrid E/M CSC phenotypes [88], similar to NRF2 [89]. Moreover, KLF4 can have additional cellular functions too, such as cell cycle regulation and metabolic reprogramming [90,91]. Thus, a more comprehensive analysis of the role of KLF4 on various axes of cellular plasticity is needed.
However, the association of KLF4 with metastasis is relatively ambiguous and context dependent. High KLF4 was shown to prevent metastasis in breast cancer [92] and pancreatic cancer models [93]. KLF4 can also have additional roles in modulating metastasis beyond regulating EMT and/or stemness, such as in its pro-metastasis role in the phenotypic switching of perivascular cells and formation of a premetastatic niche [94]. Another potential confounding factor is the association of KLF4 with cell cycle regulation. KLF4 can drive cell cycle arrest [76], but it also simultaneously represses p53 and activate p21 CIP [42], acting as an 'incoherent signal' that can drive antagonistic outputs depending on the relative strengths of the regulations. Therefore, future studies investigating the coupled dynamics of EMT, stemness, cell cycle, and the connection of KLF4 to these regulatory pathways are needed to elucidate the impact of KLF4 in modulating these interconnected axes driving metastasis. Additionally, further developments of small-molecule inhibitors or inducers of KLF4 [95,96] will be helpful in identifying the protumor or antitumor roles of KLF4 in patients.
Overall, our analysis highlights KLF4 as a potential MET-TF, adding to the list of known MET-TFs, such as GRHL1/2/3 and OVOL1/2. Recent studies showing that MET is not simply the inverse of EMT [12,97,98] necessitate attention to identify more MET-TFs and the interconnected networks they form with EMT-TFs to gain a comprehensive understanding of the emergent dynamics of epithelial-mesenchymal plasticity in metastaticinvasion cascade.

Mathematical Modeling
As per the schematic shown in Figure 1A, the dynamics of all the four molecular species (miR-200, SNAIL, ZEB, and SLUG) were described by a system of coupled ODEs. The generic chemical rate equation given below describes the level of a protein, mRNA, or microRNA (X): where the first term g X signifies the basal rate of production, and the terms multiplied to g X represent transcriptional/translational/post-translational regulations due to interactions among the species in the system, as defined by the shifted Hills function (H S (A, A 0 , n, λ)).
The rate of degradation of the species (X) is defined by the term k X X based on first-order kinetics. The complete set of equations and parameters are presented in the File S1. The bifurcation diagrams were drawn using the continuation software package MATCONT [99].

RACIPE (Random Network Simulation)
The gene regulatory network shown in Figure 1A was simulated via RACIPE. The overexpression and down-expression of KLF4, ZEB1, and SLUG were done by setting the fold change value to 10. Ten thousand parameter sets were simulated for 100 different initial conditions to obtain the ensemble of steady-state solutions. The steady-state solutions were Z-normalized for each gene over all the steady-state values as the observed steadystate expression-mean steady-state expression)/standard deviation of the steady-state expression. The resultant normalized steady-state solutions were plotted as a heatmap. Significance in the differences between distinct groups were accessed by performing a Students t-test on three replicates of 10,000 parameter sets each.
Next, we incorporated CDH1 to the circuit in Figure 1A and simulated the GRN by RACIPE. A similar circuit was also simulated by incorporating GRHL2 but without KLF4. Along with the base circuits, the overexpression and down-expression were also done for KLF4 and GRHL2 50-fold in their respective circuits. The RACIPE steady states were z-normalized as above, and the EMT score for each steady state was calculated as ZEB1 + SLUG − miR-200 − CDH1. The resultant trimodal distribution was quantified by fitting 3 gaussians. The frequencies of the epithelial and mesenchymal phenotypes were quantified by computing the area under the corresponding gaussian fits. Significance in the difference between the distinct groups was accessed by performing a Students' t-test on three replicates of 10,000 parameter sets each.

Gene Expression Datasets
The gene expression datasets were downloaded using the GEOquery R Bioconductor package [100]. Preprocessing of these datasets was performed for each sample to obtain the gene-wise expression from the probe-wise expression matrix using R (version 4.0.0).

External Signal Noise and Epigenetic Feedback on KLF4 and SNAIL
The external noise and epigenetic feedback calculations were performed as described earlier [67].

•
Noise on External signal: The external signal I that we use here can be written as the stochastic differential equation: .
• Epigenetic feedback: We tested the epigenetic feedback on the KLF4-SNAIL axis. The dynamic equation of epigenetic feedback on the inhibition by KLF4 on SNAIL is: Similarly, the epigenetic feedback on the SNAIL inhibition on KLF4 is modeled via: where ζ is a timescale factor and chosen to be 100 (hours). α represents the strength of epigenetic feedback. A larger α corresponds to stronger epigenetic feedback. α has an upper bound because of the restriction that the numbers of all the molecules must be positive. For inhibition by KLF4 on SNAIL, a high level of KLF4 can inhibit the expression of SNAIL due to this epigenetic regulation. Meanwhile, for SNAIL's inhibition on KLF4, high levels of SNAIL can suppress the synthesis of KLF4.

Kaplan-Meier Analysis
KM Plotter [74] was used to conduct the Kaplan-Meier analysis for the respective datasets. The number of samples in the KLF4-high vs. KLF4-low categories is given in File S1.

Patient Data
The gene expression levels for the batch effect normalized RNA-seq were obtained for 12,839 samples from The Cancer Genome Atlas's (TCGA) pan-cancer (PANCAN) dataset via the University of California, Santa Cruz's Xena Browser. The samples were filtered using unique patient identifiers, and only samples that overlapped between the two datasets were kept (11,252 samples). The samples were further filtered to remove patients with missing data for the gene expression or cancer type (10,619 samples). These samples were ultimately used in all the subsequent analyses. The DNA methylation data were obtained from the TCGA PANCAN dataset via the University of California, Santa Cruz's Xena Browser. The methylation data were profiled using the Illumina Infinium HumanMethylation450 Bead Chip (450K) [101].

EMT Score
The Epithelial-Mesenchymal Transition (EMT) score was calculated using the Kolmogo rov-Smirnov (KS) scoring metric [63]. For a given patient, the cumulative distribution functions (CDFs) of the Epithelial and Mesenchymal gene signatures were compared. First, the distance between Epithelial and Mesenchymal signatures was calculated using the maximum distance between their cumulative distribution functions. This represented the test statistic in the subsequent two-sample test used to calculate the EMT score. The score was ultimately determined by the hypothesis testing of two alternative hypotheses, with the null hypothesis being that there was no difference in the CDF of the Epithelial and Mesenchymal signatures. The first hypothesis was that the CDF of the Mesenchymal signature was greater than the CDF of the Epithelial signature. The second hypothesis was that the CDF of the Epithelial signature was greater than the CDF of the Mesenchymal signature. This scoring metric ranged from −1 to +1, where a sample with a positive EMT score was Mesenchymal, whereas negative EMT scores were associated with an Epithelial phenotype. The MLR and 76GS scores were calculated as reported earlier [56,102].

Methylation Status
After the four genes of interest were identified as KLF4, SLUG (SNAI2), SNAIL (SNAI1), and ZEB1, the methylation and expression data were obtained from the TCGA PANCAN dataset via the University of California, Santa Cruz's Xena Browser. The methylation data was quantified from the Illumina Infinium HumanMethylation450 Bead Chip (450K) for each of the four genes and identified the unique CpG associated with each of these genes: 10 unique CpG sites for KLF4, 10 for SLUG, 13 for SNAIL, and 42 for ZEB1. A β-value representing how methylated each CpG site was, with a β-value of one representing a fully methylated CpG site, was provided for all patients and at every CpG site. To quantify the methylation status of each gene, the β-values for all the associated CpG sites were averaged [101]. The gene expression was then visualized by using R's ggplot2 package to display violin plots for each gene that was ordered by the gene expression and colored by the EMT score. Subsequently, the methylation data and expression data were again plotted using R's ggplot2 package, with labels created via R's ggrepel package.

Conclusions
Our study indicates KLF4 as a potential MET-inducing transcription factor by suppressing multiple EMT transcription factors directly or indirectly-SNAIL, SLUG, and ZEB1. The simulations for our mechanistic model indicated a delay in the onset of EMT in the presence of KLF4. Further, KLF4 was found to be negatively correlated with EMT and EMT-TFs in multiple transcriptomic datasets. Finally, the epigenetic influence mediated by KLF4 on EMT can change the phenotypic distribution in a heterogeneous cancer cell population. Thus, KLF4 can inhibit EMT through multiple parallel paths and can act as a putative MET-TF.