EMT-Related Genes Have No Prognostic Relevance in Metastatic Colorectal Cancer as Opposed to Stage II/III: Analysis of the Randomised, Phase III Trial FIRE-3 (AIO KRK 0306; FIRE-3)

Simple Summary Despite huge advances in local and systemic therapies, the 5-year relative survival rate for patients with metastatic CRC is still low. To avoid over- or undertreatment, proper risk stratification with regard to treatment strategy is highly needed. As EMT (epithelial-mesenchymal transition) is a major step in metastatic spread, this study analysed the prognostic effect of EMT-related genes in stage IV colorectal cancer patients using the study cohort of the FIRE-3 trial, an open-label multi-centre randomised controlled phase III trial of stage IV colorectal cancer patients. Overall, the prognostic relevance of EMT-related genes seems stage-dependent. EMT-related genes have no prognostic relevance in stage IV CRC as opposed to stage II/III. Abstract Introduction: There is no standard treatment after resection of colorectal liver metastases and the role of systemic therapy remains controversial. To avoid over- or undertreatment, proper risk stratification with regard to postoperative treatment strategy is highly needed. We recently demonstrated the prognostic relevance of EMT-related (epithelial-mesenchymal transition) genes in stage II/III CRC. As EMT is a major step in CRC progression, we now aimed to analyse the prognostic relevance of EMT-related genes in stage IV CRC using the study cohort of the FIRE-3 trial, an open-label multi-centre randomised controlled phase III trial of patients with metastatic CRC. Methods: Overall and progression free survival were considered as endpoints (n = 350). To investigate the prognostic relevance of EMT-related genes on either endpoint, we compared predictive performance of different models using clinical data only to models using gene data in addition to clinical data, expecting better predictive performance if EMT-related genes have prognostic value. In addition to baseline models (Kaplan Meier (KM), (regularised) Cox), Random Survival Forest (RSF), and gradient boosted trees (GBT) were fit to the data. Repeated, nested five-fold cross-validation was used for hyperparameter optimisation and performance evaluation. Predictive performance was measured by the integrated Brier score (IBS). Results: The baseline KM model showed the best performance (OS: 0.250, PFS: 0.251). None of the other models were able to outperform the KM when using clinical data only according to the IBS scores (OS: 0.253 (Cox), 0.256 (RSF), 0.284 (GBT); PFS: 0.254 (Cox), 0.256 (RSF), 0.276 (GBT)). When adding gene data, performance of GBT improved slightly (OS: 0.262 vs. 0.284; PFS: 0.268 vs. 0.276), however, none of the models performed better than the KM baseline. Conclusion: Overall, the results suggest that the prognostic relevance of EMT-related genes may be stage-dependent and that EMT-related genes have no prognostic relevance in stage IV CRC.


Introduction
Colorectal cancer (CRC) is the third most common cancer worldwide, being the second leading cause of cancer-related deaths [1] with more than 1.9 million new cases each year. About 20% of patients present with synchronous liver metastases and up to 50% of patients develop distant metastases during their disease with the liver being the most frequent site of metachronous spread [1][2][3][4][5]. Despite huge advances in local and systemic therapies, the 5-year relative survival rate for patients with metastatic CRC (mCRC) still ranges between 14-17% [6]. To date, the personalised approach to treat mCRC as recommended by national (German S3-Leitlinie, NCCN) and international (ESMO, ESMO-Asia) guidelines is limited to the analysis of microsatellite (MSI) status and mutational analysis of RAS (rat sarcoma oncogene) and B-RAF [7][8][9][10]. However, extended molecular testing has the potential to identify druggable targets beyond standardised treatment options, establish biomarkers that allow better and more precisely risk stratification, predict prognoses, and improve clinical decision-making in a precision medicine approach. There is still no standard treatment after the resection of colorectal liver metastases (CRLM) and the role of systemic therapy remains controversial. Whereas adjuvant treatment is recommended after surgery in stage III CRC with nodal spread, there is no standard recommendation for systemic treatment after surgery in stage IV CRC with distant spread, a rationale that might not seem conclusive [9]. To avoid over-or undertreatment, a proper risk stratification regarding the postoperative treatment strategy is highly needed (precision oncology). Successful stratification of risk groups based on tumour biology reflected in longer diseasefree survival in high-risk groups receiving additive systemic treatment and avoidance of unnecessary adverse effects of systemic treatment in low-risk groups could lead to a paradigm shift in the treatment strategy of stage IV CRC.
In this respect, we recently demonstrated the prognostic relevance of EMT-related (epithelial-mesenchymal transition) genes in stage II/III CRC. Further, we proposed an EMT-related gene signature that identified patients at risk of relapse in multiple CRC cohorts. This EMT-related gene signature was a strong predictive indicator for recurrence in stage II/III CRC patients and associated with overall survival [11]. With the aim to optimise patient outcome, the respective EMT signature might help to stratify patients according to their tumour biology, and contribute to personalised treatment in the future.
EMT is a key program that enables stationary epithelial cells to lose their cell-cell adherence and acquire mesenchymal properties, including enhanced mobility, invasiveness, increased resistance to apoptosis, and degradation and production of extracellular matrix components, that are all essential for invasion and metastasis. In this respect, EMT is associated with an aggressive phenotype, pivotal for tumour progression and the prerequisite for metastatic spread [12,13]. EMT is regulated at different molecular levels that lead to the loss of E-Cadherin as the critical event with a subsequent activation of all major cancer cell intrinsic signaling pathways. Whereas EMT and EMT-related gene signatures have been demonstrated to be associated with prognosis and therapeutic resistance in non-metastatic stages of CRC and various other tumour entities [11,[14][15][16][17], the prognostic role of EMT-related genes and our previously proposed EMT-related gene signature in mCRC remains uncertain.
As EMT is a major step in CRC progression and metastatic spread, it was the aim of this study to analyse the prognostic relevance of EMT-related genes in stage IV CRC patients using the study cohort of the FIRE-3 trial. The FIRE-3 trial was an open-label multi-centre randomised controlled phase III trial for first-line treatment of patients with RAS wild-type (wt) mCRC patients [18]. In this respect, we aimed to assess whether the prognostic value of the previously identified EMT-related genes and our proposed EMT-related gene signature in stage II/III CRC can be validated in the metastatic setting of CRC and potentially be used for risk stratification and guidance of systemic treatment in an individualised therapy approach in stage IV CRC.

Patients
Clinical data was available from 752 patients. In 416 cases, genetic information was also available. After removing missing values and a single subject with primary tumour location on both sides (rather than left or right), 350 patients with complete data remained for the analysis, of which 237 were male and 113 were female. For each patient, the corresponding treatment, tumour location, metastatic status (solitary liver metastasis (nonresectable metastases confined to the liver at time of diagnosis) versus distant metastasis in more than one organ), BRAF V600E status (wild-type (wt) versus mutated (mut)), survival parameters (overall survival (OS), progression-free survival (PFS)) along with 191 variables containing gene expression data were collected (Table 1).

Gene-Expression Analysis
Using formalin-fixed paraffin-embedded (FFPE) samples of primary tumour tissue, gene expression analysis was carried out using ALMAC's Xcel TM gene-expression array at ALMACs laboratories [23]. All analyses were approved by the ethics committee of the Ludwig-Maximilians-University, Munich (#186-15).

EMT-Related Dataset
The EMT-related dataset investigated in this study was derived from public databases as previously described [11]. In this respect, transcriptome profiles and clinical information of 1780 stage II/III CRC patients from 15 public datasets were investigated. Coefficient variant analysis was used to select reference genes for normalising gene expression levels. Univariate, LASSO, and multivariate Cox regression analyses were combined to develop the originally studied EMT related dataset [11].

Outcomes
Endpoints investigated in this study included progression-free survival (PFS) (time from randomisation to disease progression or death from any cause) and overall survival (OS) (time from randomisation to death from any cause).

Statistical Analysis
For each endpoint (OS and PFS), we trained different models and investigated their predictive performance. In order to investigate the prognostic relevance of gene data, we compared the predictive performance of the models using clinical data only to model using gene data in addition to clinical data, expecting better predictive performance if gene data has prognostic value. The models used for comparison were Kaplan-Meier (KM) [24], (regularised) Cox Regression (Cox) [25], Random Survival Forest (RSF) [26,27], and Gradient Boosted Trees with Cox Loss (GBT) [28]. KM served as a baseline for the prediction without taking any covariate information into account, the Cox models served as a baseline for a model with covariates but without non-linear effects and interactions. The predictive performance of the models was measured by the integrated Brier Score (IBS) evaluated at the median survival time [29].
For regularised Cox, RSF, and GBT, hyperparameter optimisation (HPO) was performed via random search [30] and three-fold cross validation (CV) on the respective training data (see Appendix A for details). Performance evaluation was based on a repeated five-fold CV with five repetitions. All analyses including training the models were performed in the R programming environment (version 4.1.3). For the setup of survival tasks, model training, HPO, and performance evaluation, we used mlr3proba [31] with the mlr3 [32] ecosystem. All code used for the analysis is available from GitHub: https://github. com/adibender/EMT-gene-fire3-prognostic-relevance (accessed on 1 January 2022).

Results
The results of the benchmark experiments are given in Figure 1  slightly when comparing performance with and without genetic data (OS: 0.262 vs. 0.284; PFS: 0.268 vs. 0.276), the performance is still worse than the mean performance of RSF or unregularised Cox with clinical data only. Furthermore, none of the models using additional information (clinical and/or genetic) can outperform the KM baseline (OS: 0.250, PFS: 0.251). Performance of regularised Cox regression is identical to KM, as all coefficients are penalised to zero.

Discussion
Advances in genomic and transcriptomic analyses have shifted cancer therapy towards a precision medicine approach and allowed better understanding of the molecular alterations of CRC with regard to tumour initiation, progression, and resistance [33]. While the TNM staging system in combination with molecular markers (RAS, BRAF, MSI) is the backbone of therapeutic decisions and used as a guideline for survival estimates, there is a wide variation in prognosis among CRC patients with the same TNM stage and a survival paradox of stage II/III CRC patients on account of the inherent heterogeneity that traditional clinicopathological and molecular features fail to explain. In this respect, identification of innovative markers and risk factors based on tumour biology that can guide the administration of systemic treatment (targeted therapies, postoperative additive treatments) in CRC need to be introduced into the clinical arena.
In this respect, we recently demonstrated that EMT-related genes have prognostic relevance in stage II/III CRC. Further, we developed an innovative prognostic model based on our proposed EMT-related gene signature predicting recurrence of stage II/III CRC patients, offering a potential explanation with regard to tumour biology beyond traditional clinicopathological characteristics for the mechanisms underlying the observed survival paradox [11]. We now aimed to assess the prognostic relevance of EMT-related genes in a metastasized CRC setting using the study cohort of the FIRE-3 trial, a multi-centre randomised controlled phase III trial of mCRC patients [18].
Analyses using (regularised) Cox and RSF showed no improvement in predictive performance according to IBS when using gene data in addition to clinical data (see Figures 1 and 2; Table 2), and therefore no prognostic effect of EMT-related genes in stage IV CRC. GBT performed slightly better when using gene data in addition to clinical data, but still performed worse than the RSF or unregularised Cox using clinical data only. Furthermore, according to our results, none of the models using covariate information (clinical and/or gene data) could outperform the KM with respect to predictive performance.
We have previously shown that FOLFIRI plus cetuximab was associated with improved OS in patients with RASwt mCRC relative to those treated with FOLFIRI plus bevacizumab [18]. This FOLFIRI plus cetuximab conferred OS benefit, however, was in the absence of differences in investigator-assessed objective response or PFS. In an attempt to elucidate the underlying relationship for these unexplained results, metrics of tumour dynamics were assessed, and centralised radiological review revealed that FOLFIRI plus cetuximab induced superior objective response, frequency of early tumour shrinkage and depth of response compared with FOLFIRI plus bevacizumab. In this respect, early tumour shrinkage and depth of response were associated with OS in both treatment groups [19]. These results highlight the importance of new innovative metrics that reflect tumour biology to predict therapeutic response and outcome. Indeed, this is underscored by the increasing evidence that evaluation of response according to RECIST criteria may not adequately capture the quality and quantity of response to targeted therapies in mCRC [24]. As depth of response is an on-treatment parameter occurring approximately 3.5 months after the beginning of treatment, this parameter might rather be used for retrospective analyses than initial clinical decision making. On the other hand, early tumour shrinkage is a useful parameter to guide decision making in the early phase of treatment, however, the parameter by itself does not consider the subgroup of patients that show no early shrinkage, though they are slow responders who could still benefit from continuation of treatment. In this respect, there is still a need for further metrics to supplement the RECIST criteria that are easy to obtain, feasible in a clinical setting, and can predict therapeutic efficacy not only with improved precision, but also at an early stage. To this aim, we assessed the relevance of our proposed EMT-related signature that fulfilled the above-described requirements and was able to predict recurrence and survival in stage II/III CRC patients, in a metastatic setting.
To our knowledge, we are first to assess the prognostic relevance of EMT-related genes in stage IV CRC. The results of this study indicate that in an advanced stage, EMT-related characteristics may give insights into the underlying tumour biology and mechanisms that have preceded metastatic spread but do not add further value to the determination of prognosis.
In fact, the role of EMT in metastatic tumours is still under debate. EMT is a critical process for tumour progression in which epithelial cells lose their epithelial features and acquire mesenchymal characteristics, such as invasion and motility. During EMT, cancer cells are considered to gain a more aggressive phenotype and are more prone to develop metastatic spread. Studies suggest that induction of EMT is critical for the initial steps of metastatic spread but not for metastatic seeding and outgrowth at the distant site. Other studies point out that the mere presence of tumour cells displaying EMT characteristics in the primary tumour does not prove that EMT is even absolutely required for metastatic spread. In line with this, not all cells that have undergone EMT will successfully metastasize. Further studies suggest that EMT may be a temporal function of metastatic progression that may play a pivotal role and predict prognosis in earlier stages of CRC, as also described by our study groups, whereas successful spread may be dependent on various factors including transcription factors, miRNAs, and noncoding RNAs, epigenetic regulators, environmental factors, and multiple other signaling molecules [34].
We have previously assessed the relevance of consensus-molecular subgroups (CMS), grouping CRC according to their gene-signature in four different subtypes, and found that OS in CMS4 (defined by EMT and an activated tissue growth factor (TGF)-β path-way making this subgroup more chemo-resistant) favored FOLFIRI plus cetuximab over FOLFIRI plus bevacizumab. However, similar to this study and from a clinical standpoint, CMS classification appeared not to be of superior value with regard to patient selection and optimal treatment [14]. In summary, evaluation of EMT-related genes has prognostic relevance in non-metastatic CRC, however, after the tumour has spread, they do not appear to add further value.
There are also limitations of this study. The material that the gene-expression analysis in this investigation was based on, was predominantly derived from primary tumours, as liver tissue was not available in the majority of cases, due to irresectability. In this respect, EMT-related characteristics in the primary tumour may not have prognostic relevance after the primary tumour has spread, however, although it has been shown that CMS classification changes from primary tumour to metastases, it remains unclear whether the expression of EMT related genes changes within metastases, which would be an interesting research question for future studies [35]. We used a range of different methods to be able to capture different types of effects that genes could exhibit on OS or PFS (linear or non-linear effects, interactions, deviation from proportional hazards). The results indicate no prognostic effect of gene data, as defined by predictive performance measured by the IBS, however, given the usually small effect sizes associated with individual genes, the sample size for this study might have been too low in order to detect them, if present. Performance of GBT could probably be improved through additional tuning, however, given the results obtained from other learners, it is doubtful that performance of the KM could be improved upon.
Overall, EMT-related genes did not show prognostic relevance in stage IV CRC and are not of additional value compared to common parameters used regarding patient selection and clinical decision making. In this respect, further studies that investigate novel metrics that reflect tumour biology to predict therapeutic response and outcome in mCRC and that can supplement the currently limited decision-making tools are warranted.

Conclusions
We have assessed the prognostic relevance of EMT-related genes in stage IV CRC. The results of this study indicate that in an advanced stage, EMT-related characteristics may give insights into the underlying tumour biology and mechanisms that have preceded metastatic spread but do not add further value to the determination of prognosis compared to common parameters used regarding patient selection and clinical decision making.