A Keratin 7 and E-Cadherin Signature Is Highly Predictive of Tubo-Ovarian High-Grade Serous Carcinoma Prognosis

During tubo-ovarian high-grade serous carcinoma (HGSC) progression, tumoral cells undergo phenotypic changes in their epithelial marker profiles, which are essential for dissemination processes. Here, we set out to determine whether standard epithelial markers can predict HGSC patient prognosis. Levels of E-CADH, KRT7, KRT18, and KRT19 were quantified in 18 HGSC cell lines by Western blot and in a Discovery cohort tissue microarray (TMA) (n = 101 patients) using immunofluorescence. E-CADH and KRT7 levels were subsequently analyzed in the TMA of the Canadian Ovarian Experimental Unified Resource cohort (COEUR, n = 1158 patients) and in public datasets. Epithelial marker expression was highly variable in HGSC cell lines and tissues. In the Discovery cohort, high levels of KRT7 and KRT19 were associated with an unfavorable prognosis, whereas high E-CADH expression indicated a better outcome. The combined expression of KRT7 and E-CADH robustly predicted overall survival (OS, p = 0.004) and progression-free survival (PFS, p = 5.5 × 10−4) by Kaplan–Meier analysis. In the COEUR cohort, the E-CADH-KRT7 signature was a strong independent prognostic biomarker (OS, HR = 1.6, p = 2.9 × 10−4; PFS, HR = 1.3, p = 0.008) and predicted a poor patient response to chemotherapy (p = 1.3 × 10−4). Our results identify a combination of two epithelial markers as highly significant indicators of HGSC patient prognosis and treatment response.


Introduction
Int. J. Mol. Sci. 2021, 22, 5325
Tubo-ovarian high-grade serous carcinoma (HGSC) is the most frequent, aggressive, and lethal histotype among ovarian carcinomas (OC). HGSC originates most frequently from the fimbrial mucosa of the Fallopian tubes and rapidly disseminates throughout the peritoneal cavity [1,2]. Standard management of HGSC consists of cytoreductive surgery and platinum and taxol-based chemotherapy, but most patients will relapse within five years [3]. Molecular subtypes of HGSC have been identified in recent years, but their association with patient prognosis and response to therapy remains uncertain [4][5][6][7][8]. Therefore, there may be greater value in finding consensus biomarkers that do not belong to a specific molecular subtype but that are able to stratify patients by prognosis and likely treatment responsiveness in order to tailor new therapeutic strategies for patients who are unlikely to respond to standard care [8]. Tumor profiling and biomarker signatures also provide indications of tumor phenotypes and activated signaling pathways that could be targeted by specific and personalized treatments [4,9,10].
Epithelial cell plasticity has been extensively described in the literature with respect to the capacity of tumor cells to transition between epithelial and mesenchymal phenotypes or to regain stemness potency [9]. The epithelial phenotype is characterized by a plethora of markers, depending on the observed tissue and the type of epithelium. E-cadherin (E-CADH) and keratins (KRTs), such as the type II keratin 7 (KRT7) and the type I keratins 18 (KRT18) and 19 (KRT19), are commonly used to characterize the tumoral cells in HGSC tumors [10,11]. E-CADH has frequently been found to be downregulated in malignant epithelial tumors compared to their normal tissue counterparts, and high levels of E-CADH protein have been associated with a favourable prognosis in HGSC [11][12][13][14]. KRTs are known constituents of the cytoskeleton intermediate filaments in epithelial cells and are routinely used by pathologists because epithelial tumors largely maintain the keratin profiles associated with their respective cells of origin [15][16][17]. KRT7, KRT18, and KRT19 protein expression in tumors was reported to predict prognosis in several cancer types [18][19][20][21][22]. However, despite reported functions of KRTs in tumor progression, relatively little is known about their value as prognostic markers in the context of HGSC [17,23,24,25]. Interestingly, we previously showed that KRT7 and KRT19 mRNA were elevated in OC compared to borderline tumors [26]. The relative levels of KRTs and E-CADH in HGSC may have an impact on tumor biology and patient prognosis.
In the present study, we focus on epithelial marker variability in HGSC and whether it is associated with aggressiveness of the disease. As they are already standard markers routinely used by pathologists, E-CADH, KRT7, KRT18, KRT19 are selected as epithelial markers, while vimentin (VIM) is selected as an indicator of mesenchymal phenotype. We analyse marker expression profiles and their relations with treatment responsiveness and prognosis of HGSC. This extensive analysis includes 18 HGSC cell lines, tissues from patient cohorts including the largest HGSC cohort in Canada, and publicly available datasets.

HGSC Cell Lines and Tissues Display a High Level of Epithelial Marker Plasticity
Protein level variability of epithelial and mesenchymal markers was analyzed in 18 HGSC cell lines (Figure 1A). The TOV3291G, TOV1369, OV2295(R2), and TOV3133G cell lines showed an "epithelial-like" profile, as they expressed all epithelial markers but not VIM. Conversely, the TOV1946, OV1946, and TOV2223G cell lines were defined as "mesenchymal-like", owing to positive VIM expression and no epithelial marker expression. Other cell lines showed a mixed profile, such as the OV4453, OV4485, OV2085, and OV2295 cell lines, which expressed keratins while showing negative or weak E-CADH expression and various levels of VIM. We explored this marker plasticity in HGSC tissues from 101 patients represented in a Discovery TMA by immunofluorescence staining. The mean fluorescence intensity (MFI) of each marker was calculated in tumoral structures from each tissue using a robust and previously validated digital image analysis (Visiopharm®) [27,28]. Across patient tissues, we observed great variability in marker intensity within epithelial structures, ranging from the mesenchymal-like to the epithelial-like phenotype, with most tissues exhibiting a mixed epithelial marker profile. Interestingly, E-CADH showed a negative correlation with VIM, whereas KRT7 and KRT19 did not correlate with this mesenchymal marker (Figure 1B). Using CPTAC protein expression from the TCGA Ovarian Serous Cystadenocarcinoma dataset (n = 274), we observed that protein expression of E-CADH was negatively correlated with markers of EMT, whereas KRT7 and KRT19 were not correlated with markers of EMT (Figure 1C). Our results highlight that HGSC tumors show variability in epithelial marker expression across patients and suggest that KRT plasticity is independent of EMT status.

E-CADH, KRT7, and KRT19 Predict Patient Prognosis in the Discovery Cohort
Using the clinically annotated Discovery cohort (Table 1), we then sought to determine whether epithelial marker plasticity could be an indicator of prognosis in patients with HGSC. High expression of KRT7 and KRT19 was significantly associated with shorter progression-free survival (PFS) (p = 5.5 × 10−4 and p = 0.004, respectively) and shorter overall survival (OS) (p = 5.2 × 10−4 and p = 0.016, respectively), whereas high E-CADH expression was associated with longer PFS (p = 0.007) and longer OS (p = 0.043) (Figure 2A-C). KRT18 and VIM expression were not correlated with prognosis (Figure 2D,E). By univariate Cox regression analysis, KRT7 and KRT19 expression showed a significant association with an unfavourable outcome, whereas E-CADH was associated with a favourable prognosis (Tables S2 and S3). Multivariate analysis indicated that KRT7 and KRT19 were independent prognostic biomarkers when adjusted for stage and residual disease (Tables S2 and S3).
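The Kaplan-Meier comparisons reported above rest on the product-limit estimator. The following is a minimal sketch in plain Python, not the SPSS workflow used in the study; the function names and the toy follow-up data are hypothetical and only illustrate how a survival curve and its median are obtained for one marker group.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each event time.

    times  -- follow-up time per patient (for example, months)
    events -- 1 if the event was observed, 0 if the patient was censored
    """
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First time at which S(t) falls to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data for one group (months, event indicator).
example_curve = kaplan_meier([5, 8, 12, 12, 20, 30], [1, 0, 1, 1, 0, 1])
print(example_curve, median_survival(example_curve))
```

Running the estimator separately on the high-expression and low-expression groups gives the two curves that a log-rank test would then compare.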

E-CADH and KRT7 Combination Is the Best Prognosis Predictor in the Discovery Cohort
Combinations of KRT7, KRT19, and E-CADH expression were evaluated to determine the most relevant signature to predict patient outcome. As KRT7 and KRT19 expression were highly correlated in patient HGSC from the Discovery cohort (Figure 2B), there was little added value in combining both markers compared to the individual markers regarding the prognosis evaluation (Figure 3A). However, the E-CADH-KRT7 and E-CADH-KRT19 combinations improved the level of prognosis prediction (Figure 3B,C, Table S2). The E-CADH-KRT7 combination was selected to pursue the study, since KRT7 alone showed greater levels of significance than KRT19 in evaluating patient prognosis (Figure 2, Tables S2 and S3). In addition, the E-CADH-KRT7 combination was a better independent prognosis predictor than E-CADH-KRT19 by multivariate analysis (Table S2).
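As an illustration of how two dichotomized markers can be combined into a single signature, the sketch below assigns each patient to one of three groups. The group labels and the three-way split are assumptions made for this example, since the exact grouping rule is not spelled out at this point in the text. The direction of each marker follows the results above: high KRT7 is unfavourable and high E-CADH is favourable.

```python
def signature_group(ecadh_high, krt7_high):
    """Combine two dichotomized markers into one hypothetical signature label.

    ecadh_high -- True if E-CADH expression is above its cut-off
    krt7_high  -- True if KRT7 expression is above its cut-off
    """
    if krt7_high and not ecadh_high:
        return "unfavourable"   # both markers point to a poor outcome
    if ecadh_high and not krt7_high:
        return "favourable"     # both markers point to a better outcome
    return "intermediate"       # the two markers disagree


# Hypothetical patients: (E-CADH high?, KRT7 high?)
patients = [(True, False), (False, True), (True, True), (False, False)]
print([signature_group(e, k) for e, k in patients])
```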

E-CADH and KRT7 Signature Improves Patient Prognosis Stratification by Stage and Residual Disease in the COEUR Cohort
Association of the markers with clinical parameters of the COEUR cohort indicated that KRT7 levels were significantly elevated in tumors at late stages compared to tumors at early stages, and in tumors with high levels of residual disease (RD) compared to tumors with absent or low RD after cytoreductive surgery (Figure 5A). Conversely, E-CADH expression was decreased in tumors with high levels of RD compared to those with low RD after surgery (Figure 5B). This observation led us to evaluate the prognostic signature of the E-CADH-KRT7 combination in patients stratified by stage and RD, which were the strongest clinical parameters to evaluate HGSC patient prognosis (Figure 5C,E) in the COEUR cohort. The addition of KRT7 and E-CADH expression criteria enhanced the discrimination of patients by estimated median survival months among the groups stratified by early/late tumor stages or by low/high RD (Figure 5D,F). By ROC curve analysis, the addition of KRT7/E-CADH levels to clinical parameters such as stage, RD, age, and/or BRCA mutation status systematically improved the performance in predicting patient overall survival (Figure S3). Interestingly, the combination of KRT7/E-CADH levels, stage, and RD was highly predictive, with an AUC of 0.722 (p = 8.27 × 10−14, n = 473), and the addition of BRCA mutation status increased the performance to an AUC of 0.743 (p = 9.78 × 10−10, n = 231), though this was obtained on a more restricted number of patients.
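To make the AUC values above concrete, the sketch below computes the area under the ROC curve for a composite score built from dichotomized parameters. It is an illustration only: summing the 0/1 parameters into one score is an assumption, and the function names, the outcome definition, and the toy data are hypothetical rather than taken from the study.

```python
def composite_score(stage_late, rd_high, marker_ratio_high):
    """Sum of dichotomized parameters (0 or 1 each); summation is assumed."""
    return int(stage_late) + int(rd_high) + int(marker_ratio_high)


def roc_auc(scores, outcomes):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative case count as half a win, which
    is equivalent to the trapezoidal area under the empirical ROC curve.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores and outcomes (1 = event occurred, 0 = no event).
scores = [composite_score(1, 1, 1), composite_score(1, 1, 0),
          composite_score(1, 0, 1), composite_score(0, 0, 1),
          composite_score(0, 0, 0)]
print(roc_auc(scores, [1, 1, 0, 0, 0]))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for reading the 0.722 and 0.743 values reported above.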
[Figure caption fragment] … KRT7 signature (right) associations with 12 months' time to recurrence after chemotherapy. p values are indicated. The number of patients and the estimated 75th percentile of months to recurrence are indicated for each group.


KRT7 Is a Major Predictor of HGSC Patient Prognosis at the Gene Expression Level
E-CADH (CDH1) and KRT7 gene expression were then analysed in publicly available datasets. E-CADH expression was not correlated with prognosis at the gene expression level in the TCGA dataset or in the Kaplan-Meier plotter dataset (Figure S4A,B), corroborating the literature about E-CADH gene expression and ovarian cancer [29]. This observation can be explained by the weak correlation between E-CADH mRNA and its protein expression (Spearman Rho = 0.26) as observed in the TCGA dataset, suggesting that E-CADH mRNA expression does not reflect the level of E-CADH protein in HGSC tumors (Figure S4C).
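The mRNA-protein comparison above relies on Spearman's rank correlation, which is the Pearson correlation of the ranked values. The following sketch shows that computation in plain Python; the function names and the toy z-scores are hypothetical and are not drawn from the TCGA data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical mRNA and protein z-scores for five tumors.
mrna = [-1.2, -0.3, 0.1, 0.8, 1.5]
protein = [-0.4, -0.9, 0.6, 0.2, 1.1]
print(spearman_rho(mrna, protein))
```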
Conversely, high KRT7 gene expression was significantly associated with poor patient prognosis in the Kaplan-Meier plotter dataset and in the CSIOVDB dataset comprising 3431 OC patients classified by molecular subtypes [7] (Figure S5A,B). KRT7 showed decreased mRNA expression and an elevated level of DNA methylation in the mesenchymal subtype compared to the epithelial-A and epithelial-B subtypes, indicating that KRT7 overexpression is not a feature of the mesenchymal profile (Figure S5C,D). Analysis of the TCGA dataset showed a high correlation between KRT7 mRNA and its protein expression in HGSC tumors (Spearman Rho = 0.71) (Figure S5E). When we focused on the Kaplan-Meier plotter dataset and categorized patients by stage or RD, the addition of KRT7 mRNA expression improved the stratification of patient prognosis (Figure S6A). A high KRT7 level was also predictive of a poorer 12-month response to treatment, particularly for the group of patients treated with platinum-based chemotherapy (Figure S6B). Together, our results indicate that KRT7 is a major HGSC prognostic biomarker at the protein and gene expression levels and is predictive of a poorer response to chemotherapy.

KRT7 Is a Prognosis Biomarker of Breast, Gastric, and Non-Small-Cell Lung Carcinomas
Using the Kaplan-Meier plotter dataset, we analyzed KRT7 gene expression in breast, gastric, and non-small-cell lung carcinomas. In breast cancer, high KRT7 gene expression was associated with a poorer prognosis (Figure S7A). When we classified patients by breast cancer subtypes, we observed that higher KRT7 levels were associated with the poorest outcome in basal and HER2+ subtypes but not in the more differentiated luminal A and luminal B subtypes (Figure S8). Elevated KRT7 mRNA was also significantly associated with poorer outcomes in the intestinal subtype of gastric carcinoma (Figure S7B) and in lung adenocarcinoma, the most common subtype of non-small-cell lung cancer (Figure S7C). These last findings highlight the interest in evaluating the prognostic value of KRT7 in solid epithelial cancers.

Discussion
In our study, heterogeneous profiles of epithelial markers were observed in HGSC cell lines and tumor tissues. In the Discovery cohort and the COEUR validation cohort, we showed that KRT7 expression is a strong and independent negative prognostic biomarker and that its combination with E-CADH expression further improved prognostic and treatment response prediction in HGSC patients. In recent years, researchers and clinicians have put much effort into finding new prognostic and predictive biomarkers of HGSC, but only a few have been clinically validated [30][31][32]. A recent publication from Millstein et al. proposed a 101-gene expression signature to predict high-grade serous ovarian cancer overall survival [33]. Their signature, combined with clinical parameters such as age and stage, reached an AUC of 0.75 for five-year OS by ROC curve analysis. Here, our two-marker signature reached an AUC of 0.743 for predicting OS when associated with clinical parameters. Moreover, KRT7 is routinely used by gynecological pathologists to distinguish ovarian neoplasms from metastatic colonic adenocarcinoma [15,16], and E-CADH is commonly used in pathology to confirm epithelial cell lineage. Importantly, our results were validated using the same 75th percentile threshold for dichotomization across all studied cohorts for both markers and across the TCGA and Kaplan-Meier plotter datasets for KRT7, indicating that this threshold does not depend on the data of a specific cohort. Though our results were obtained by immunofluorescence, KRT7 and E-CADH expression analyses by immunohistochemistry are regularly conducted in pathology departments, and this prognostic signature could be implemented in the clinical setting at a lower cost than genomic analyses.
The ability of the E-CADH-KRT7 combination to discriminate patient prognosis with higher reliability than KRT7 or E-CADH alone probably lies in the fact that the two markers are involved in different pathways that participate in HGSC progression. E-CADH downregulation is associated with increased cell motility and cell invasion capacity and can be regulated by several EMT regulators, including the transcription factors TWIST, SNAIL, SLUG, and ZEB1 and the cytokine TGF-β1, among others [11,34]. The combination of E-CADH and SNAIL protein expression was shown to be associated with ovarian cancer prognosis in a cohort of 174 patients [35]. Another study assessed the protein expression of E-CADH, N-CADH, P-cadherin, ZEB1, HMGA2, RAB25, CD24, NCAM, SOX11, and VIM in 100 tubo-ovarian serous carcinoma effusions and found a limited prognostic role of the markers alone or in combination [36]. Marker combinations from these last studies were less significant than the E-CADH-KRT7 signature in predicting patient prognosis, probably because E-CADH and the other markers are redundantly involved in EMT pathways. As expected, we observed a negative correlation between E-CADH and VIM in the Discovery cohort and between E-CADH and EMT markers in HGSC samples from the TCGA. In contrast, our results indicate that KRT7 protein expression is not associated with EMT markers or the mesenchymal OC subtype, suggesting that KRT7 upregulation is not a feature of EMT in HGSC.
The literature is scarce and mixed regarding KRT7 regulation in cancer. A recent publication reported that KRT7 overexpression in ovarian cancer cell lines was associated with increased proliferation, migration, and EMT marker expression through the regulation of the TGF-β/Smad2/3 pathway [23]. These results, obtained on cell lines, differ from our findings, which show that KRT7 expression is not correlated with EMT markers in HGSC tissues, and emphasize the need to more fully characterize KRT7 functions in HGSC progression. KRT7 regulation may be in part related to Forkhead box family members. In the ovarian cancer cell line SKOV3, KRT5 and KRT7 were upregulated by Forkhead box M1 (FOXM1), and KRT5 and KRT7 deficiency prevented migration [24]. FOXM1 has been widely implicated in cancer progression, and several molecules targeting the FOXM1 pathway are currently under investigation [37,38]. In esophageal squamous cell carcinoma and in gastric carcinoma, where KRT7 overexpression is associated with a poor prognosis, KRT7 was transcriptionally upregulated by FOXA1 [20,39,40]. Other KRT7 regulation mechanisms involve the long non-coding RNA KRT7-AS, which forms an RNA-RNA duplex with KRT7 and stabilizes KRT7 expression at the mRNA and the post-transcriptional levels. KRT7-AS promoted gastric and colorectal cancer cell progression by increasing KRT7 expression [41,42]. As these observations are disparate and were obtained from limited cell line models, the mechanisms governing KRT7 expression need to be better understood. KRT7 regulation may involve several complex and tissue-specific pathways that might represent interesting therapeutic targets.
Our analysis of KRT7 in publicly available cancer databases has highlighted that KRT7 gene expression is a prognostic marker of poor outcome in several cancer types including breast, non-small-cell lung, and gastric carcinomas. This last observation corroborates results from a recent publication where KRT7 overexpression was associated with poor prognosis in gastric cancer patients [21]. Other published works have indicated that elevated KRT7 is also associated with an unfavourable outcome in pancreatic cancer [43], esophageal squamous cell carcinoma [20] and colorectal carcinoma [22]. However, high KRT7 expression was associated with better OS in papillary renal cell carcinoma [44] and the KRT7/KRT19 expressing subtype was associated with better outcomes in clear cell renal cell carcinoma [45]. Together, these observations emphasize the major but complex and tissue-specific implications of KRT7 function in cancer progression.
Roles have been described for KRTs other than KRT7 in the immune system and inflammation, DNA damage response and resistance to apoptosis, shear stress resistance during extravasation, and apico-basal polarization [10,46]. In addition, KRTs, notably KRT7, KRT18, and KRT19, are widely used to detect circulating tumor cells in the blood or detached tumor cells in ascites [46]. Soluble protein fragments of keratins, including KRT7, KRT8, KRT18, and KRT19, can be detected in the circulation of cancer patients and are used to monitor disease progression and patient prognosis in certain tumour types [46][47][48][49]. A high level of KRT7 in the serum of patients with non-small cell lung cancer complicated by superior vena cava syndrome was associated with a poor prognosis [47]. The utility of KRT7 as a liquid biopsy diagnostic, prognostic, and predictive biomarker should be further evaluated in HGSC patients.
Our study was limited to several markers, and the inclusion of other epithelial markers may be of interest to improve HGSC prognosis evaluation. Notably, EPCAM protein expression was associated with stemness, aggressive features, and chemoresistance in ovarian cancer [50][51][52]. Potential KRT7 interaction with EPCAM and the relevance of adding EPCAM expression to the E-CADH-KRT7 signature should be further evaluated. In addition, the impact of epithelial plasticity and KRT7 functions on chemoresistance needs to be explored, as we observed that patients with high KRT7 and low E-CADH levels had a poorer response to treatment, notably to platinum + taxol-based chemotherapy. Further studies are needed to understand the biological causes and significance of specific epithelial cell phenotypes in HGSC tumor progression and metastasis. We believe that the E-CADH-KRT7 combination is a very promising signature to predict HGSC patient prognosis and standard treatment response and could prove valuable in clinical decision making.

Patient Cohorts and Datasets
Ethics statement and patient cohorts for protein expression analysis. Ethical approval (CER CHUM, REB Project Number: BD04.002, 13 November 2021) was obtained from the Centre hospitalier de l'Université de Montreal (CHUM) institutional ethics committee (Comité d'éthique de la recherche du CHUM).
HGSC specimens were collected during primary cytoreductive surgery of patients and subsequently formalin-fixed and paraffin-embedded (FFPE). Informed patient consent was obtained. HGSC tissue micro-arrays (TMAs) from the CHUM and the Terry Fox Research Institute (TFRI)-COEUR have previously been described [27,53,54,55]. Clinicopathological characteristics of the cohorts are summarized in Table 1. The COEUR cohort included eight TMA blocks constructed from 1158 HGSC tumor samples. Patients were recruited between 1991 and 2017 from 10 tumor banks across Canada, including the CHUM [54]. Seventy-seven patient cases were common between the Discovery and the COEUR cohorts. Two 0.6 mm tumor tissue punches per patient were included in the COEUR TMA. Two gynaecologic-oncologic pathologists (MK and KR) performed a double central review of the FFPE TMA blocks with integrated use of diagnostic immunohistochemical markers [55]. E-CADH and KRT7 expression analysis was performed using mRNA expression (z-scores RNA Seq V2 RSEM, n = 307 samples) and proteomic data (z-scores mass spectrometry, n = 174 samples) from the TCGA Ovarian Serous Cystadenocarcinoma dataset (Firehose Legacy, n = 606). Proteomic data were obtained from the Clinical Proteomic Tumor Analysis Consortium (CPTAC), NCI/NIH on 1 November 2017: https://proteomics.cancer.gov/programs/cptac. Clinical data, CPTAC proteomic data, and mRNA expression were downloaded from the cBioPortal for Cancer Genomics in November 2017: http://www.cbioportal.org [56].

Kaplan-Meier Plotter Dataset
The Kaplan-Meier plotter comprises data from 1816 ovarian cancer patients with a mean follow-up of 40 months (http://kmplot.com, accessed on 1 September 2019) [57]. The database was primarily set up using gene expression data and survival information of ovarian cancer patients downloaded from Gene Expression Omnibus (GEO) (n = 1251) and TCGA (n = 565) (Affymetrix HG-U133A, HG-U133A 2.0, and HG-U133 Plus 2.0 microarrays). Analysis was done in September 2019 with the JetSet best probe sets "201131_s_at" against E-CADH (CDH1 gene) and "209016_s_at" against KRT7 and was restricted to serous histology, grades 2 (n = 325) + 3 (n = 1024), all TP53 statuses, and all available datasets. Stage, debulking status, and chemotherapy regimen were varied according to the analysis.

Ovarian Cancer Database of the Cancer Science Institute Singapore (CSIOVDB)
CSIOVDB includes data on 3431 human ovarian carcinomas including carcinoma of the ovary (91.49%), fallopian tube, peritoneum, and metastasis to the ovary from GEO, ArrayExpress, TCGA, ExpO, and private/in-house data (http://csiovdb.mc.ntu.edu.tw/CSIOVDB.html, accessed on 1 May 2021). HGSC is the most highly represented carcinoma in CSIOVDB (73.75%) [29]. The database has 1516 and 1868 samples with progression-free survival (PFS) and overall survival (OS), respectively. Output of a gene query includes expression profiles in histological and molecular subtypes, survival correlations and integration with the DNA methylation from TCGA.

Immunofluorescence Staining
Immunostaining was performed on 4 µm TMA sections (Discovery and COEUR) using the Benchmark XT autostainer (Ventana Medical Systems, Roche, Rotkreuz, Switzerland). Antibody staining conditions were based on the manufacturer's datasheet for each marker. Antigen retrieval was performed using Cell Conditioning 1 (Tris-EDTA buffer, pH 7.8, for KRT7, KRT18, KRT19, and VIM) or 2 (citrate buffer, pH 6.0, for E-CADH) (Ventana Medical Systems) and slides were then incubated for one hour with the primary mouse monoclonal antibodies against E-CADH (1/100,

Digital Image Analysis (DIA)
TMA slides were scanned with the VS-110 microscope using a 20X 0.75 NA objective and a resolution of 0.3225 µm (Olympus Canada Inc., Richmond Hill, ON, Canada) linked to the OlyVIA® image viewer software (xvViewer.exe). Scanned images were imported into Visiopharm® (VP) software (Hoersholm, Denmark), and fluorescent staining of the different markers was quantified by automated DIA as previously detailed [27]. VP algorithms used (1) DAPI staining to enable the delimitation of the "whole tissue" and (2) KRT8/18 staining to discriminate the regions of interest (ROIs) "epithelium" and "stroma" in each TMA punch. Marker expression was quantified in each image pixel of each ROI to calculate the mean fluorescence intensity (MFI) within the ROI. A visual review was also performed to exclude damaged tissue and necrotic or red cell-infiltrated sections.
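The MFI calculation described above amounts to averaging pixel intensities over the pixels that fall inside an ROI. The sketch below shows that arithmetic on a tiny hypothetical image; it is not the Visiopharm algorithm, and the function name, the nested-list image format, and the values are assumptions made for illustration.

```python
def mean_fluorescence_intensity(intensity, roi_mask):
    """Mean pixel intensity over the pixels where the ROI mask is truthy.

    intensity -- 2D list of pixel intensities for one marker channel
    roi_mask  -- 2D list of the same shape, 1 inside the ROI and 0 outside
    Returns None when the ROI contains no pixels.
    """
    total = 0.0
    count = 0
    for intensity_row, mask_row in zip(intensity, roi_mask):
        for value, inside in zip(intensity_row, mask_row):
            if inside:
                total += value
                count += 1
    if count == 0:
        return None
    return total / count


# Hypothetical 2 x 3 marker channel and an "epithelium" mask.
channel = [[10, 20, 30],
           [40, 50, 60]]
epithelium = [[1, 0, 1],
              [0, 1, 0]]
print(mean_fluorescence_intensity(channel, epithelium))  # (10 + 30 + 50) / 3 = 30.0
```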

High Grade Serous Carcinoma Cell Lines
All 18 cell lines used in our study were derived from HGSC solid tumors or peritoneal ascites and were described in publications from our group. Their characteristics are summarized in Table S1 [58][59][60]. Cell lines were cultivated at 37 °C under hypoxic conditions (7% O2 and 5% CO2) and grown in OSE medium (Wisent, St.-Bruno, QC, Canada) supplemented with 10% Foetal Bovine Serum, 0.5 µg/mL amphotericin B (Wisent), and 50 µg/mL gentamicin (Gibco®, Life Technologies Inc., Waltham, MA, USA).

Western Blot
The same antibodies were used for immunofluorescence and Western blot detection (referenced in the section "Immunofluorescence staining"), except for E-CADH. Total protein extracts (30 µg) were electrophoresed in 4-15% pre-cast gels (Bio-Rad, Mississauga, ON, Canada). Proteins were transferred onto PVDF membranes that were blocked in PBS with 5% milk. Membranes were then probed with primary antibodies in 5% milk PBS-Tween at the following dilutions: 1/10,000 for beta-actin, 1/1000 for KRT19, 1/2000 for KRT7 and KRT18, 1/500 for E-cadherin (24E10, #3195, Cell Signaling Technology Inc., Danvers, MA, USA) and vimentin. Protein expression was detected with HRP-conjugated secondary antibodies and visualized by enhanced chemiluminescence (ECL, Bio-Rad). Beta-actin was used as a loading control. As the 18 cell lines could not be loaded together on the same gel, 3291G protein extracts were used as a normalization control.
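One common way to use a loading control together with a shared reference extract is a double ratio, sketched below. This is an assumption about how such a normalization could be computed from band densitometry values; the text above does not state that bands were quantified this way, and the function name and the numbers are hypothetical.

```python
def normalized_expression(band, actin, ref_band, ref_actin):
    """Band intensity relative to a reference extract run on the same gel.

    band, actin         -- marker and beta-actin intensities for the sample
    ref_band, ref_actin -- the same two intensities for the reference extract
    The sample is first corrected for loading (band / actin), then expressed
    relative to the loading-corrected reference.
    """
    if actin <= 0 or ref_band <= 0 or ref_actin <= 0:
        raise ValueError("control intensities must be positive")
    return (band / actin) / (ref_band / ref_actin)


# Hypothetical densitometry values for one cell line and the reference lane.
print(normalized_expression(band=200.0, actin=100.0, ref_band=50.0, ref_actin=100.0))  # 4.0
```

Because every gel carries the same reference extract, values computed this way are comparable across gels even when absolute exposure differs.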

Statistics and Survival Analyses
Statistical analyses were conducted using IBM SPSS Statistics 23 and GraphPad Prism 5. Correlation studies between gene and protein expression were performed using non-parametric Spearman correlation. The non-parametric Mann-Whitney test was used to compare marker expression between two groups. A p value < 0.05 was considered statistically significant (* = p < 0.05; ** = p < 0.01; *** = p < 0.001).
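For readers unfamiliar with the Mann-Whitney test, the sketch below computes the U statistic and an approximate two-sided p value. It uses a plain normal approximation without tie or continuity correction, which is a simplification and will not reproduce the exact values given by SPSS or Prism; the function name and the toy data are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """Return (U, p) for two independent samples.

    U counts the pairs (x from a, y from b) with x > y; ties count 0.5.
    The two-sided p value comes from a normal approximation that ignores
    tie and continuity corrections, so it is only a rough guide for small
    or heavily tied samples.
    """
    if not a or not b:
        raise ValueError("both groups must be non-empty")
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    n1, n2 = len(a), len(b)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Hypothetical marker MFI values in two patient groups.
early_stage = [1.0, 2.0, 3.0]
late_stage = [4.0, 5.0, 6.0]
print(mann_whitney_u(early_stage, late_stage))
```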
For survival analyses, Kaplan-Meier, univariate, and multivariate Cox proportional hazard regression models and Receiver Operating Characteristic (ROC) curves were used. Multivariate analyses were performed in association with standard available prognostic indicators, including FIGO stages (1/2 vs. 3/4), RD at cytoreductive surgery (<1 cm vs. ≥1 cm/miliary), and age of patients at diagnosis (continuous values). BRCA mutation status was not considered for multivariate analyses, as this variable was not available for a large number of patients. Candidate biomarkers were systematically analysed by their continuous and dichotomized expression values by Cox regression analyses. Using the Discovery cohort, candidate expressions were dichotomized into groups of low and high expression by the 25th percentile, the median, or the 75th percentile. For each candidate, the most significant cut-off was selected for the present study. KRT7 and E-CADH protein expression were dichotomized using the 75th percentile MFI cut-off in all the subsequent survival analyses, including patient categorization using the E-CADH-KRT7 combination. For AUC measures derived from ROC analyses, scores were a combination of the following dichotomized parameters: FIGO stage (0: 1/2 stages; 1: 3/4 stages), RD (0: <1 cm; 1: ≥1 cm/miliary), age (0: <65 years old; 1: ≥65 years old), BRCA mutation status (0: BRCA1/2 mutation; 1: BRCA1/2 wild type), and KRT7/E-CADH MFI expression ratio (0: <median ratio; 1: ≥median ratio).
Data Availability Statement: Data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy of patient clinical data.
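The percentile-based dichotomization described under Statistics and Survival Analyses can be sketched as follows. The interpolation rule used for the percentile and the choice to treat values strictly above the cut-off as "high" are assumptions; statistical packages differ on both points, so this sketch may not match the SPSS output exactly. The function names and MFI values are hypothetical.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics.

    q is given on a 0-100 scale. Other definitions exist and can yield
    slightly different cut-offs on small samples.
    """
    data = sorted(values)
    if not data:
        raise ValueError("values must be non-empty")
    position = (len(data) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    fraction = position - lower
    return data[lower] + (data[upper] - data[lower]) * fraction


def dichotomize(values, q=75):
    """Return (cut-off, flags), where a flag is True for "high" expression.

    Values strictly above the cut-off are called high (an assumption).
    """
    cutoff = percentile(values, q)
    return cutoff, [v > cutoff for v in values]


# Hypothetical KRT7 MFI values for five patients.
krt7_mfi = [120.0, 80.0, 200.0, 150.0, 95.0]
cutoff, is_high = dichotomize(krt7_mfi, q=75)
print(cutoff, is_high)
```

Repeating the call with q set to 25 or 50 gives the alternative cut-offs that were compared in the Discovery cohort before the 75th percentile was retained.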