Quantifying the Patterns of Metabolic Plasticity and Heterogeneity along the Epithelial–Hybrid–Mesenchymal Spectrum in Cancer

Cancer metastasis is the leading cause of cancer-related mortality and the process of the epithelial-to-mesenchymal transition (EMT) is crucial for cancer metastasis. Both partial and complete EMT have been reported to influence the metabolic plasticity of cancer cells in terms of switching among the oxidative phosphorylation, fatty acid oxidation and glycolysis pathways. However, a comprehensive analysis of these major metabolic pathways and their associations with EMT across different cancers is lacking. Here, we analyse more than 180 cancer cell datasets and show the diverse associations of these metabolic pathways with the EMT status of cancer cells. Our bulk data analysis shows that EMT generally positively correlates with glycolysis but negatively with oxidative phosphorylation and fatty acid metabolism. These correlations are also consistent at the level of their molecular master regulators, namely AMPK and HIF1α. Yet, these associations are shown to not be universal. The analysis of single-cell data for EMT induction shows dynamic changes along the different axes of metabolic pathways, consistent with general trends seen in bulk samples. Further, assessing the association of EMT and metabolic activity with patient survival shows that a higher extent of EMT and glycolysis predicts a worse prognosis in many cancers. Together, our results reveal the underlying patterns of metabolic plasticity and heterogeneity as cancer cells traverse through the epithelial–hybrid–mesenchymal spectrum of states.


Introduction
The epithelial-to-mesenchymal transition (EMT) is a cellular programme that gives rise to the loss of epithelial phenotypes (reduced expression of cadherins involved in cell-cell attachment, loss of apical-basal polarity) with a concomitant gain in mesenchymal traits such as migration and invasion [1]. This programme has long been known to be involved in embryogenesis and wound healing in adults [2]. Cancer cells are known to activate the EMT during the metastatic progression of a tumour, allowing the cells to invade and establish secondary tumours. The EMT is not a binary process; instead, cells can acquire and maintain one or more hybrid epithelial/mesenchymal (E/M) states [3]. The plasticity of cancer cells along the epithelial-hybrid-mesenchymal landscape is highly dynamic, complex, and multi-dimensional in nature [4]. Often, other relevant biological traits, such Biomolecules 2022, 12, 297 2 of 18 as immune evasion, stemness, anoikis resistance and therapy resistance, are coupled with this dynamic nature of cancer cells [5][6][7][8][9][10].
The metabolic status of cancer cells has also been reported to be coupled with the EMT. Specifically, many EMT-inducing transcription factors (EMT-TFs) regulate the expression of various metabolic genes involved in glucose, lipid, glutamine, amino acid, and nucleotide metabolism [11][12][13]. Furthermore, a change in the metabolic state of cancer cells can induce a change in their EMT status [13]. The exact modalities by which these two processes are related are, however, relatively unclear, and have only begun to be investigated. For instance, the activation of glycolytic enzymes by EMT has been reported in breast and prostate cancer cells [14]. A similar study reports EMT-driven activation of glycolysis in non-small cell lung cancer (NSCLC) cells by the transcriptional activation of glucose transporter 3 (GLUT3) [15]. In the reverse direction, the upregulation of glycolysis has been shown to promote stemness and the EMT in pancreatic cancer cells [16]. On the other hand, TGF-β-induced EMT has been shown to inhibit glycolysis and instead activate oxidative phosphorylation (OXPHOS) via the repression of pyruvate dehydrogenase kinase 4 (PDK4) [17], an enzyme that prevents the conversion of pyruvate to acetyl-CoA. Overexpressing PDK4 inhibited the EMT [17], demonstrating the mutual regulation of these two axes of plasticity. Similarly, TGF-β-induced EMT in colon cancer cells was shown to suppress glycolysis by the nuclear translocation of pyruvate kinase M2 (PKM2) [18], a cytosolic enzyme required for pyruvate formation. Finally, the expression of SNAI1, an EMT-TF, was shown to repress another glycolytic enzyme, fructose-1,6-biphosphatase (FBP1) [19]. Similar context-dependent trends were observed in the case of the association of the lipid metabolism with the EMT. While TNFα or TGF-β induced EMT activation promoted the lipid synthesis pathway in prostate cancer cells [20], the overexpression of SNAI1 can also lead to the inactivation of lipid synthesis enzymes [21]. Overall, these studies suggest the context-dependency of EMT-mediated coupling to metabolic networks.
Additionally, several lipid metabolism enzymes such as acetyl-CoA synthetases (ACSL1 or ACSL4, steroyl-CoA desaturase (SCD) etc.) can activate EMT [22][23][24]. Notably, two pathways with links to EMT that have been particularly well studied are glycolysis and mitochondrial metabolism. For instance, the glycolytic enzyme phosphoglucose isomerase (PGI) can also act as a cytokine and activate EMT via ZEB1 and ZEB2 stabilization in breast cancer cells [25]. However, again, this trend may not be ubiquitous across cancer subtypes and different microenvironments. The glycolytic enzyme FBP1, for instance, blocks the induction of the SNAI1-driven EMT in breast cancer cells and the loss of this enzyme favours the EMT, as shown in vitro [26]. The downregulation of several mitochondrial metabolic genes and mutations in TCA cycle enzymes has also been associated with EMT activation. Mutations in fumarate hydratase, an enzyme that converts fumarate to malate in the TCA cycle, can induce EMT by inhibiting the activity of miR-200 [27]. Similarly, mutations in the TCA enzymes succinate dehydrogenase (SDH) and isocitrate dehydrogenase (IDH) also induce EMT via the epigenetic suppression of miR-200, leading to alterations in the miR200-ZEB1 axis [27,28] that regulates the EMT status of cells [29]. Moreover, silencing of another TCA cycle enzyme, citrate synthase (CS), induces EMT-like cellular changes in vitro and promotes metastasis in vivo [30]. However, more recent studies reveal that CS is upregulated in several other tumour types and that its inactivation impedes the EMT programme in tumour cell lines [31]. While these studies point towards a causal link between EMT and the metabolic pathways of glycolysis, fatty acid oxidation and oxidative phosphorylation, the overall interconnection landscape among these pathways is quite confounding, thereby necessitating further research.
Here, we sought to analyse the association of three main aspects of cellular metabolism-glycolysis, oxidative phosphorylation and fatty acid synthesis-with the process of the EMT in more than 180 publicly available microarray/RNA-seq datasets comprising cell lines and patient tumors. We found that oxidative phosphorylation is predominantly negatively correlated with the process of EMT, and its primary regulator AMPK is primarily correlated (positively) specifically with an epithelial programme. Con-versely, glycolysis and its key regulator HIF1α predominantly are positively correlated with a mesenchymal programme and the induction of EMT. However, glycolysis also showed a positive correlation with the epithelial programme in many datasets, highlighting its complex interaction with the EMT programme. Fatty acid oxidation was correlated negatively with the acquisition of a mesenchymal phenotype and positively with the epithelial nature of cancer cells. However, alternative modalities of association of metabolic axes with the EMT programme were also observed. The analysis of EMT induction in single cell RNA-seq data showed largely consistent trends with the generic patterns seen in the analysis of bulk samples. Additionally, survival analysis using the EMT and metabolic scores to predict disease prognosis across multiple cancer subtypes reveals that patients with enriched EMT and glycolytic phenotype are associated with worse overall survival in multiple cancer types.

Software and Datasets
Python (version 3.6) and R (version 4.0.2) were used for conducting all computational and statistical analyses. Microarray datasets were downloaded from NCBI GEO using GEOquery R Bioconductor package. FASTQ files for RNA sequencing datasets were downloaded from the ENA (European Nucleotide Archive) database. A complete list of datasets used, and the associated metadata has been provided in Supplementary Table S1.

Pre-Processing of Microarray Datasets
Gene wise expression for each sample was obtained after appropriate pre-processing of microarray datasets. Probe wise expression matrices downloaded using GEOquery were log2 normalised and annotation files corresponding to microarray platforms were utilized for mapping the probes to respective genes. In cases where more than one probe mapped on to a single gene, the mean of expression values of all these probes was used for such genes.

Pre-Processing of RNA-seq Datasets
Adapter contamination and overall quality of sequences were inspected using FastQC. Sequences were aligned with the hg38 human (or mm10 mouse) reference genome using the STAR alignment software. Finally, the raw counts for each gene were calculated with these aligned sequences using htseq-count. These raw counts were then normalised for gene length and transformed to TPM (transcripts per million) values, which were then log2 normalised to obtain the final values.

EMT Scoring Methods
EMT scores were calculated using five different methods for each dataset. Each method requires gene expression data as input. Each method uses a distinct gene set or a distinct algorithm.

76GS
The 76-gene EMT scoring method (76GS) was developed using transcriptomic data from NSCLC cell lines and patient samples [32,33]. As the name suggests, it utilizes 76 gene signatures. The weighted sum of gene expression values of 76 genes was calculated for each sample, where the weighting factors are correlation coefficients with CDH1 expression levels. The values obtained through this method have no specific range. EMT score for each sample was subtracted by the mean obtained from all samples such that the resultant mean score was zero. As per this new scale, negative scores indicate an M phenotype, and positive scores indicate an E phenotype.  [35] repository. For each sample, ssGSEA (single sample gene set enrichment analysis) [36] analysis was performed using this geneset to obtain the normalized enrichment score (NES). All calculations were performed using the GSEAPY python library.

Epithelial and Mesenchymal Scores
These metrics use the KS epithelial and mesenchymal gene signatures to quantify E & M status separately. A rank-based single sample scoring method called Singscore [37] was used for quantifying the enrichment level of these gene sets in a given sample. The final value obtained from this method has a range of [−1, 1]. For the epithelial score, a higher value indicates a more epithelial phenotype. The mesenchymal score also operates in a similar manner.

Metabolic Pathways Scoring Methods
ssGSEA scores were calculated using the hallmark oxidative phosphorylation and glycolysis gene sets (MSigDB) to obtain OXPHOS and glycolysis signatures respectively (Supplementary Table S2). AMPK and HIF-1 signatures were quantified using expression levels of their downstream target genes as previously reported [38]. In total, 33 downstream genes for AMPK and 23 downstream genes for HIF-1 were used. The final scores were obtained using the Singscore method [37] performed on these gene sets. The FAO scores were calculated based on equations previously reported [39], which use the expression levels of 14 FAO enzyme genes.

Survival Analysis
Overall survival data were obtained from the TCGA cohort of patients from 32 different cancer types. All samples were divided into EMT high and EMT low, Glycolysis high and Glycolysis low, OXPHOS high and OXPHOS low groups based on the median of the respective ssGSEA scores of the samples. Kaplan-Meier analysis was performed using R package "survival". A log-rank test was used to calculate the p-values. The reported hazard ratio (HR) and confidence interval (95% CI) were estimated using Cox regression using "coxph" function. Statistical analysis, survival analysis and plots were all conducted in R version 4.0.3. Heatmaps were plotted using the "heatmap" function and using RdBu color scale from the RcolorBrewer package.

EMT Scoring Metrics Are Largely Consistent across Datasets
Multiple transcriptomic-based scoring metrics have been used to quantify the EMT status of biological samples [40]. We used five different approaches to quantify the EMT status of biological samples in a set of 182 datasets. The 76GS [32,33] and KS [34] EMT scoring methods use two different sets of gene lists (including epithelial and/or mesenchymal genes) to score a sample along the epithelial-hybrid-mesenchymal spectrum. We further used ssGSEA (single-sample Gene Set Enrichment Analysis) and/or Singscore to calculate the activity of epithelial and mesenchymal gene lists (see Methods) to estimate the epithelial and mesenchymal nature of the samples, respectively.
For 114 out of 182 datasets, the KS score (a higher KS score implies a more mesenchymal state) is positively correlated with the enrichment of hallmark EMT signature; in only eight datasets, this correlation is significantly negative ( Figure 1A, left). The KS metric also positively correlated with a mesenchymal signature ( Figure 1A, right) and negatively with an epithelial signature ( Figure S1). On the other hand, the 76GS EMT score (a higher 76GS score indicates a more epithelial state) largely correlates with this epithelial signature ( Figure 1A, middle), but negatively correlates with the abovementioned mesenchymal signature ( Figure S1). The enrichment of epithelial and mesenchymal signatures also showed expected trends with that of the hallmark EMT signature ( Figure S1). After evaluating these pairwise comparisons, we identified overlaps among the datasets that showed the expected trends, given the pro-epithelial trends for 76GS [32,33] and pro-mesenchymal comparisons for KS [34] (positive correlation for KS vs. hallmark EMT or mesenchymal, positive correlation for 76GS vs. epithelial, negative correlation for other pairwise comparisons: 76GS vs. KS, 76GS vs. hallmark EMT, 76GS vs. mesenchymal, KS vs. epithelial, epithelial vs. mesenchymal). We used Venn diagrams ( Figure 1D) to assess the overall consistency (in terms of specific assignments of E versus M) for each of the EMT scoring methods applied to our set of datasets. Each ellipse in the Venn diagram has a total number equal to the expected correlation between any two EMT scoring metrics; for instance, the green ellipse with a sum of 114 corresponds to number of datasets showing a positive correlation for KS vs. hallmark EMT ( Figure 1A). The regions with overlap between the ellipses indicate the number of datasets in which multiple such pairwise correlations yielded the same result. For example, the number 65 in the region of intersection of all four ellipses indicates that 65 datasets show complete consistency among the four pairwise correlations; in those 65 datasets, KS scores correlated positively with hallmark EMT scores but negatively with 76GS scores, and 76GS scores correlated positively with epithelial signature scores but negatively with the Hallmark EMT scores. Similarly, in 25 (the sum of the numbers in the overlap regions for three ellipses) datasets, 3 out of 4 pairwise correlations are in agreement. Additional combinations of such pairwise correlations also show consistent trends ( Figure S2). Together, these results show that all these EMT scoring metrics are largely consistent with one another across a large cohort of datasets.
Next, we wanted to quantify the consistency of pairs of metrics if they were significantly correlated. To quantify that trend, we computed a "probability" score for a given pair of metrics by considering the number of datasets correlated significantly (p < 0.05) either positively (r > 0.3) or negatively (r < −0.3). We computed the ratio of the number of positively or negatively correlated datasets (depending on the trend seen) to the total number of datasets that showed a significant (p < 0.05) association (irrespective of the direction of association). The higher this ratio, the better the concordance between these two metrics in a given direction. We found that (a) KS score vs. mesenchymal, (b) KS score vs. epithelial and (c) mesenchymal signature vs. hallmark EMT signatures were most consistent with one another in positive, negative and positive directions, respectively (Figure 1C). When these probabilities were further weighted by a fraction of significant cases out of all datasets considered, we saw that (a) KS score vs. epithelial, (b) 76GS score vs. KS score and (c) 76GS vs. epithelial were most consistent with each other in negative, negative and positive directions, respectively, as expected ( Figure 1D). KS score vs. mesenchymal and KS score vs. hallmark EMT correlations also maintained their trends as seen in earlier scenarios (compare Figure 1D with Figure 1C). Put together, these results show that the EMT metrics considered here show highly concordant trends with respect to one another in a majority of the datasets.

OXPHOS Is More Likely to Be Negatively Correlated with A Mesenchymal Program and Positively with An Epithelial Program
Having assessed the consistency of different EMT scoring metrics among themselves on a cohort of datasets, we next wanted to understand how the different axes of metabolism associated with the EMT metrics. Oxidative phosphorylation is the predominant process by which cells generate energy for survival. To study how the biological process of oxidative phosphorylation associates with different EMT metrics, we correlated the ssGSEA activity scores calculated for the hallmark oxidative phosphorylation gene set with different EMT metrics.
Upon correlating the OXPHOS signature with the hallmark EMT scores, we found that although there were significant correlations both in the positive and negative directions, there were many more datasets correlated negatively (56 vs. 24) with hallmark EMT than positively ( Figure 2A). This overall observation is in accordance with many experimental studies that point towards a negative association between oxidative phosphorylation and the EMT [41][42][43][44]. However, this relationship is not exclusive, i.e., these quantities could be positively correlated in a variety of contexts, as reported in other experimental studies [45][46][47]. Among all pairs of correlations between OXPHOS signature and EMT metrics, the OXPHOS-hallmark EMT pair showed the strongest propensity of negative association with one another, among all pairwise comparisons that showed a significant trend in either direction ( Figure 2B). Furthermore, the OXPHOS-hallmark EMT pair was also the top scoring pair when weighted with the fraction of significant cases ( Figure 2C), further highlighting the finding that the acquisition of the mesenchymal features was more likely to result in a decline in activity of the OXPHOS gene set.
Oxidative phosphorylation in cells has been reported to be positively regulated by AMPK activity levels in cells. To assess AMPK activity, we considered a list of AMPK target genes used as a proxy for the activity of the phosphorylated active form of AMPK (see Methods). We found that the AMPK signature was more likely to be positively correlated with the epithelial signature (41 datasets showing positive correlations vs. 12 showing negative correlations) ( Figure 2D). However, the AMPK signature did not show a strong skew towards being either positively or negatively correlated with the separate mesenchymal signature ( Figure 2E). Together, these trends could indicate towards the fact that the active form of AMPK is likely more strongly correlated with the presence of an epithelial signature rather than with the absence of a mesenchymal signature, especially if we deconvolute the EMT into two-dimensional process where loss of epithelial traits and gain of mesenchymal traits can be treated semi-independently. Through quantifying the trends of association of the AMPK signature with various EMT metrics, we noticed that the probability of a positive correlation between AMPK and epithelial metrics (epithelial signature, 76GS scores) was higher than that of a negative correlation between AMPK and mesenchymal ones (KS score, hallmark EMT, mesenchymal signature) ( Figure 2F). These observations suggest that AMPK is strongly coupled with epithelial traits of cells, rather than with their mesenchymal ones. However, we noticed OXPHOS is strongly negatively correlated with both hallmark EMT signature as well as the mesenchymal signature (Figure 2A,B). This difference seen between trends in AMPK and OXPHOS may be due to additional context-specific factors, apart from AMPK, that might also mediate the crosstalk between the EMT and OXPHOS [48], thus leading to an overall stronger negative association of OXPHOS with the hallmark EMT program ( Figure S3).

Glycolysis Is More Likely to Be Positively Correlated with A (Partial) EMT Programme
Next, we wanted to check how the glycolytic process was associated with the EMT programme in the datasets we had considered. To assess this association, we correlated the enrichment (ssGSEA) scores for the hallmark EMT and hallmark glycolysis signatures across our datasets. We observed that glycolysis was more likely to be significantly positively correlated with EMT than to be significantly negatively correlated (66 vs. 13, respectively) ( Figure 3A). One would therefore expect that glycolysis should be negatively correlated with the epithelial programme or with the 76GS EMT scoring metric that assigns epithelial samples a higher score. This is, however, not what we observed. Instead, glycolysis was also found to be positively correlated with scoring metrics that report an enriched epithelial program (Epithelial gene list, as well as 76GS scores), to a comparable extent with which it correlated with a mesenchymal program ( Figure 3B). Similar trends are also seen when the association probability values were further weighted by number of significant datasets in which a given trend was observed ( Figure 3C). Here, the association of glycolysis with the hallmark EMT programme was stronger, albeit not to a large degree, than that seen for the epithelial gene list and 76GS score, suggesting its putative association with a partial EMT state and/or other context-specific factors not included in our analysis. HIF1α is a known mediator of the glycolytic pathway [49] and in modulating the EMT status of cells [50]. Thus, next we probed how the HIF1α signature was associated with epithelial and mesenchymal programmes. Intriguingly, we found that both the volcano plots showed a skew towards the positive side ( Figure 3D,E), suggesting that HIF1α activation may be associated with a partial EMT state exhibiting both epithelial and mesenchymal features [51]. It should be noted that in the case of mesenchymal programme, the HIF1α signature was somewhat more strongly skewed towards to the positive side in comparison to the positive skew present in the case of epithelial programme (47 out of 60 datasets vs. 31 datasets out of 46, respectively) ( Figure 3D,E). The positive association of glycolysis, as well as its known regulator HIF1α, with both the epithelial and mesenchymal axes may indicate that glycolysis is a hallmark feature of hybrid E/M states. Recent observations about glycolysis accompanying collective cell migration endorse this association of glycolytic shift in partial EMT state(s) [52,53]; however, how metabolic heterogeneity maps on to leader-follower dynamic switching remains to be investigated in more detail [54][55][56]. Nevertheless, stronger trends, as measured by weighted probability scores for HIF1α vs. hallmark EMT and HIF1α vs. mesenchymal compared to HIF1α vs. 76GS or HIF1α vs. epithelial, indicate the enrichment of HIF1α in being associated with a relatively more mesenchymal phenotype ( Figure 3F and Figure S4). The degree of coupling of the gain of mesenchymal traits with the loss of epithelial traits in a given scenario [57] may play a key role in associating HIF1α with a partial or complete EMT.

FAO Is More Likely to Positively Correlate with An Epithelial Program and Negatively with A Mesenchymal Program
Fatty acid oxidation (FAO) is a catabolic process in which fatty acids are broken down and is another key mechanism by which cancer cells can generate energy for survival [58]. Genes involved in fatty acid oxidation have been characterised previously [58]. We used one such gene set as a proxy for the activity of the FAO pathway in our datasets (see Methods). We found that as with OXPHOS, FAO was most likely to be negatively correlated with the hallmark EMT programme (52 significantly negative vs. 11 significantly positive cases) ( Figure 4A). The epithelial programme alone was more likely to be correlated positively (37 positive vs. 17 negative) ( Figure 4B) while the mesenchymal programme alone was likely to be correlated negatively (30 negative vs. 18 positive) ( Figure 4C). These results show that FAO is more likely to be associated negatively with the acquisition of a mesenchymal phenotype. Upon calculation of the probability of positive/ negative correlations when the correlation is significant as well as the overall weighted probability, we observed that the mesenchymal metrics (hallmark EMT, mesenchymal signature and KS) were all skewed towards the negative side, while the more epithelial metrics (76GS and epithelial signature) were positively correlated with FAO ( Figure 4D). Collectively, these results show that while OXPHOS and FAO are more likely to be associated negatively with the mesenchymal programme and the EMT process, glycolysis is more likely to be positively associated with the mesenchymal characteristics of cells. These associations, at least in part, are supported by the activity of molecular regulators such as AMPK and HIF1α.

Different Modalities of Association between Pairs of Metabolic Pathways and the EMT
After exploring the three major metabolic axes (OXPHOS, glycolysis and FAO) independently in terms of their association with EMT, we wanted to investigate the pair-wise associations between the metabolic pathways in the context of EMT. For datasets under consideration, we first computed the fractions of datasets that had none, one, two or all three axes of metabolism associated significantly with the hallmark EMT signature ( Figure 5A). Most datasets (~60%) had a maximum of one axis of metabolism correlated with the hallmark EMT programme ( Figure 5A). In about 25% of the datasets, hallmark EMT was not correlated with any of the metabolic axes, probably indicative of biological scenarios where these metabolic axes are not coupled directly with the EMT spectrum. In the remaining 40% of datasets, where two or more than two axes were significantly correlated with the hallmark EMT signature, we investigated if certain combinations of associations were more likely than others in context of their correlations with the EMT. To answer this question, we first plotted all 45 datasets that had significant correlations with the EMT axis and either OXPHOS or glycolysis ( Figure 5B). Among those, 21 (46.67%) datasets showed a positive correlation between glycolysis and hallmark EMT programme, while OXPHOS was negatively correlated with the hallmark EMT programme ( Figure 5B, green box). This co-occurring association of OXPHOS and glycolysis with EMT in inverse directions has been reported earlier experimentally [59,60]. However, this co-occurrence is not the only mode of association between these three axes. The next most predominant modality of association is the scenario in which both glycolysis and OXPHOS are positively correlated with EMT; this trend is shown in 14 (31.11%) datasets ( Figure 5B, red box). This could be indicative of the EMT associating positively with a hybrid metabolic state in which both OXPHOS and glycolysis are high [59]. The other two cases-both OXPHOS and glycolysis correlating negatively with EMT ( Figure 5B, yellow box) and OXPHOS being positively associated while glycolysis being negatively associated ( Figure 5B, blue box)-were 15.55% and 6.67%, respectively. Collectively, this analysis shows that besides predominant modalities of association between OXPHOS and glycolysis in the context of EMT, other modalities also exist although less frequently.
Next, we wanted to explore how FAO and OXPHOS are associated with each other in the context of their correlations with EMT. The produced scatter plots, similar to those prepared for glycolysis and OXPHOS, show that only three quadrants are populated with different propensities ( Figure 5C). The most predominant modality was the scenario in which the EMT was negatively correlated with both OXPHOS and FAO in 30 (73.17%) datasets ( Figure 5C, yellow box). In 7 (17.07%) datasets, OXPHOS and FAO were both positively correlated with the hallmark EMT signature ( Figure 5C, red box). In the remaining four datasets, OXPHOS was positively correlated with the EMT while FAO was significantly negatively correlated with the EMT (Figure 5C, blue box). These results show that FAO and OXPHOS are more likely to be coordinated in a similar manner, either positively or negatively correlated to the EMT. Similarly, when we compared glycolysis with FAO, we found that the major modality of action was the scenario in which glycolysis was positively correlated with the EMT while FAO was negatively correlated with the EMT programme ( Figure 5D, blue box). This association was observed in 25 (69.44%) datasets. The other two observed modalities of regulation were the cases in which both FAO and glycolysis were both positively ( Figure 5D, red box) or both negatively ( Figure 5D, yellow box) correlated with the EMT. Overall, our analysis uncovers the different possibilities and propensities by which these three axes of metabolism might associate with one another in terms of their connection with EMT.

Heterogeneity in Associations between Different Axes of Metabolism in Relation to the EMT Is Also Reflected in Single Cell RNA-seq Data
Until now, our analysis had focused on bulk samples. We next examined whether there was also evidence for the observed heterogeneity of association of metabolic axes with the EMT present in single-cell data. To do so, we analysed single-cell data for different cell lines induced to undergo the EMT by TGFβ for 7 days, followed by (partial) MET by withdrawal of TGFβ for 3 days [61], to determine (a) if there was a shift along any of the metabolic axis upon the induction of EMT/MET, and (b) whether the different modalities of associations were seen across different biological conditions. For the cell line DU145, when the EMT was induced for 7 days, there was a distinct rise in the hallmark EMT scores of the cells at day 7 compared to day 0. Upon withdrawal of TGFβ for 3 days, the hallmark EMT score decreased ( Figure 6A). As the hallmark EMT scores for cells increased at day 7 compared to day 0, a significant increase in the level of glycolysis, but a significant shift towards reduced levels of OXPHOS as well as FAO was noted, as expected from the most dominant trend seen in bulk datasets ( Figure 6A). In the case of OVCA420 cells, the hallmark EMT scores increased upon EMT induction followed by a decrease on withdrawal of TGFβ, as expected ( Figure 6B). Interestingly, along with decrease in FAO and OXPHOS, there was also a decrease in glycolysis scores upon EMT induction. Upon withdrawal of TGFβ, while the OXPHOS increased again, there was no significant change in glycolysis or FAO enrichment ( Figure 6B). These results demonstrate possible heterogeneity in the modalities by which pairs of metabolic axes associate contingent upon the biological context; in this case, the cell line considered and extent to which EMT was induced (a stronger response in OVCA420 vs. that in DU145).

Survival Analysis Reveals Association of EMT and Glycolysis with Worse Patient Survival
Finally, we evaluated the association of the EMT and metabolic axes (glycolysis, OX-PHOS) gene lists with patient overall survival (OS) across TCGA cancer types (Table S3). The enrichment of hallmark EMT signatures was associated with worse OS in many cancers: bladder cancer (BLCA), cervical squamous cell carcinoma (CESC), colon adenocarcinoma (COAD), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), low grade glioma (LGG), liver hepatocellular carcinoma (LIHC), mesothelioma (MESO) and uveal melanoma (UVM) (Figure 7A, top). Further, the enrichment of glycolysis associated with worse OS in multiple cancer types: CESC, cholangiocarcinoma (CHOL), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), KIRP, LGG, UVM, lung adenocarcinoma (LUAD) and pancreatic adenocarcinoma (PAAD) ( Figure 7A, middle). Conversely, in KIRP and MESO, the enrichment of OXPHOS correlated with a better survival, in contrast to trends in HNSC and UVM ( Figure 7A, bottom). Thus, overall, an enrichment of glycolysis with/without significant changes in OXPHOS associated with worse patient survival ( Figure 7A). Next, we tested combinatorial scenarios, i.e., EMT high or low, glycolysis high or low, and OXPHOS high or low, based on segregation of ssGSEA scores by median across the samples for a given cancer. We observed that across many cancers, the simultaneous enrichment of the EMT and glycolysis, irrespective of changes in OXPHOS (columns E+G+O-and E+G+O+ in Figure 7B), was associated with worse survival. Together, these trends show that an enrichment of glycolysis together with a (partial or complete) EMT can often indicate a more aggressive disease outcome.
trends show that an enrichment of glycolysis together with a (partial or complete) EMT can often indicate a more aggressive disease outcome.

Discussion
Metabolic reprogramming in cancer cells is a key step in the adaptation and survival of cancer cells in the changing milieu of the tumour microenvironment. Metabolic reactions are more likely to act on a smaller time scale compared to transcriptional and translational processes. This difference in time scales makes metabolic remodelling an attractive mode for the instantaneous adaptation of cancer cells. However, changes to the metabolic programmes in cells also happens at a timescale longer than such immediate adaptations. Long-term changes in metabolic programmes can happen due to crosstalk with other dynamic biological process in cells. In the context of the EMT, cells switch from an epithelial to a more mesenchymal phenotype through multiple intermediate states, which facilitate metastasis. Changes in cellular motility and stresses arising from new microenvironments during metastasis may require more energy to adapt and survive and thus necessitate altered metabolism. Such metabolic alterations that affect the overall energy balance in the cell ultimately determine its fitness. Thus, it is not surprising that EMT and metabolism have been shown to influence one another [59,62].
In this study, we have focused on three major energy producing metabolic processes: glycolysis, oxidative phosphorylation (OXPHOS) and fatty acid oxidation (FAO). Cancer cells typically facilitate glycolysis as their primary energy source, irrespective of the presence of oxygen [63]. This process is referred to as the Warburg effect or aerobic glycolysis. Conversely, the role of oxidative phosphorylation has also been observed to be important in cancer cells and cannot be ignored [64,65]. EMT-induced metabolic alterations have been an active field of research with the identification of numerous mechanisms by which the metabolic state of cells is altered. Several EMT-inducing signals and EMT-TFs have been shown to activate glycolysis [15,19,66]. Further, glycolysis has been shown, in turn, to promote EMT, thus forming a positive feedback loop [67][68][69][70]. In addition, several studies have shown that EMT-TFs can also inhibit mitochondrial respiration and oxidative phosphorylation [71,72]. Thus, EMT has been consistently shown to be associated with activation of glycolysis and inhibition of OXPHOS, as also noted in scenarios of EMT induction [41]. However, in contrast to these studies, other pieces of evidence suggest that cancer cells with an activated EMT may also have increased levels of OXPHOS in some cases [21,65]. Such conflicting findings [21,65] regarding glycolysis vs. OXPHOS may be due to differing tumor microenvironments in these studies or due to differences in the cell lines or patient samples used. Another possible explanation is that cancer cells may exhibit a hybrid metabolic phenotype [38] where both glycolysis and OXPHOS states co-exist, which allows high metabolic adaptability.
Consistent with the existing literature, our analysis of more than 180 gene expression datasets revealed that the process of EMT is more likely to be positively associated with the glycolytic process and negatively with the OXPHOS programme. We also report here that FAO is also likely to be negatively associated with EMT progression, similar to OXPHOS. These broad trends are, however, not binding, but rather probabilistic in nature given different biological contexts. In other words, the absence of a universal rule indicates that cancer cells may favour glycolysis or oxidative metabolism depending on factors present in the tumor microenvironment (TME), such as availability of glucose, hypoxia, reactive oxygen species, etc. Such an ability to shift the metabolic balance dynamically may provide an advantage amidst the shifting energy demands inherent in an evolving TME. Upon analysis of pairs of metabolic pathways in the context of their associations with EMT, we observed that while glycolysis and OXPHOS are more likely to be antagonistic in their associations with the EMT, OXPHOS and FAO were more likely to be both associated negatively with the EMT. The other modalities of associations were also observed, albeit in lower propensities. The observed heterogeneities were also seen in single cell RNA-seq data upon the induction of EMT.
Transcriptomic analysis of metabolic genes for given pathways, as performed here, is one of the ways to estimate the level of activity of a pathway and its corresponding associations with the EMT programme. However, the analysis of metabolomic data would give a more precise picture of the actual metabolic state of cells. Furthermore, there is a need to better characterise the molecular players and associated mechanisms that could allow for heterogeneity in the various modalities of associations between the different metabolic pathways and if there are feedback loops/networks that might allow a switch from one modality of association to another. Identifying such a mechanistic basis would be important to develop this understanding for any therapeutic strategies. Based on the current study, we cannot state if the observed associations are likely to hold in the context of EMT induction only or also hold in the context of the mesenchymal to epithelial transition (MET). The EMT/MET dynamics have been shown to be hysteretic (non-symmetric) in nature [73][74][75]; whether that feature extends to metabolic reprogramming remains to be seen.
Despite these limitations, our work sheds light upon the underlying patterns in terms of metabolic plasticity and heterogeneity along the epithelial-hybrid-mesenchymal spectrum in cancer cells. Understanding this coupling between the EMT/MET and metabolic plasticity will enable effective targeting of cells in heterogeneous tumor populations.