RNA Immune Signatures from Pan-Cancer Analysis Are Prognostic for High-Grade Serous Ovarian Cancer and Other Female Cancers

Immune cell infiltrates within the tumor microenvironment can influence treatment response and outcome in several cancers. In this study, we developed RNA-based immune signatures from pan-cancer analysis that could serve as potential markers across tumor types and tested them for association with outcome in high-grade serous ovarian cancer (HGSOC) and other female cancers. Pan-cancer RNA-Seq cluster analysis of immune-related gene expression profiles in The Cancer Genome Atlas (TCGA) from 29 different solid tumors (4446 specimens) identified distinct but concordant gene signatures. Among these immune signatures, Cytotoxic Lymphocyte Immune Signature (CLIS), T-cell trafficking (TCT), and the TCT to M2 tumor-associated macrophage (M2TAM) ratio (TCT:M2TAM) were significantly (p < 0.05) associated with overall survival (OS), using multivariable Cox proportional hazards regression models, in a discovery cohort and two independent validation cohorts of HGSOC patients. Notably, the TCT:M2TAM ratio was highly significant (p ≤ 0.000001) in two HGSOC cohorts. Immune signatures were also significant (p < 0.05) in the presence of tumor cytoreduction, BRCA1/2 mutation, and COL2A1 expression. Importantly, the CLIS and TCT signatures were also validated for prognostic significance (p < 0.05) in TCGA cohorts for endometrial and high tumor mutational burden (Hi-TMB) breast cancer. These immune signatures also have the potential for being predictive in other cancers and for patients following different treatment strategies.


Introduction
The tumor microenvironment (TME) plays an important role in tumor growth and treatment response. A key component of the TME involves immune cell infiltrates that orchestrate the innate and adaptive immune response. The interaction of immune cells with tumor cells can lead to tumor eradication or progression depending on the type and activity of immune cells that are present in the TME. Studies of the molecular mechanisms involved in immune surveillance within tumors has led to development of promising immunotherapies that have benefited a subset of patients. Furthermore, the immune landscape can also influence response to chemotherapy.
The relationship between immune status determined by gene expression-based biomarkers, treatment response, and OS for EOC patients is indeed complex, since no significant association between published expression-based immune signatures and OS was reported using a univariable Cox model in a cohort of 260 patients from TCGA [8,21]. Konecny et al. [22] developed a distinct RNA-based immune response signature for EOC mimicking the immunoreactive subtype from TCGA and found it was significantly associated with OS in two independent cohorts, but not in TCGA EOC dataset. In this study, using the experimental approach outlined in Figure 1, we conducted pan-cancer analysis from 29 different solid tumors (4446 specimens) in TCGA to develop immune signatures that could be applied in a clinical setting to several different cancers, including EOC.

Pan-Cancer Analysis of TCGA RNA-Seq Data
A heatmap (Figure 2A) of the pan-cancer analysis of genes expressed in immune cells in TCGA solid tumors showed that both gene-cancer type interactions and cancer-independent characteristics were seen in the clustered expression profile. Various related subsets of genes appeared to express in a similar coordinated fashion across many tumor types and specimens. The re-clustering of a gene focus set from Figure 2A of 126 primarily adaptive and monocyte-related genes appeared to cluster into distinguishable immune compartments ( Figure 2B with key to color codes shown in Figure S1). In contrast to Figure 2A, the gene focus set in Figure 2B showed relatively little cancer-specific clustering.

Pan-Cancer Analysis of TCGA RNA-Seq Data
A heatmap (Figure 2A) of the pan-cancer analysis of genes expressed in immune cells in TCGA solid tumors showed that both gene-cancer type interactions and cancer-independent characteristics were seen in the clustered expression profile. Various related subsets of genes appeared to express in a similar coordinated fashion across many tumor types and specimens. The re-clustering of a gene focus set from Figure 2A of 126 primarily adaptive and monocyte-related genes appeared to cluster into distinguishable immune compartments ( Figure 2B with key to color codes shown in Figure S1). In contrast to Figure 2A, the gene focus set in Figure 2B showed relatively little cancer-specific clustering.  Figure S1. An explanation of TCGA codes is provided in Table S1.
Several important pan-cancer features were observed from the cluster analysis ( Figure 2B). Tumors mostly segregated into immune-active and immune-silent groups. Immune-active tumors did not generally cluster by cancer type and high levels of adaptive immune activity were seen in all tumor types. Exceptions were observed for glioblastoma multiforme (GBM) and lower grade glioma (LGG), which exhibited poor immunogenic activity. Several immune subsystems of co-expressing genes that are potentially useful as immune signatures were identified, including a wider class of cytotoxic lymphocytes represented by 57 genes that included CD8 T effector, natural killer (NK), and T helper cells; T cell Trafficking (TCT) is represented by three genes (associated with the CXCR3 ligands, CXCL9 CXCL10, CXCL11); and M2TAM is represented by four genes. Table 1 shows the patient characteristics of three independent HGSOC cohorts used for testing the derived immune signatures. The Cleveland Clinic-Charité cohort is enriched for long-term survivors as a percentage relative to the other cohorts (42% versus 21-29%). The immune signatures derived from the pan-cancer analysis were first tested in the HGSOC TCGA discovery cohort. The results ( Table 2 and Figure S2 . This is the first time, to the best of our knowledge, that any RNA-based immune signature has demonstrated association with OS endpoints in TCGA HGSOC dataset. Validation of the defined signatures in two other HGSOC cohorts (Cleveland Clinic-Charité and Mayo Clinic) revealed that CLIS and TCT were significantly associated with OS (p < 0.05) in the presence of age and stage with similar HR estimates ( Table 2 and Figure S2). Contrary to several other studies, primary surgical cytoreduction (optimal or sub-optimal) was not found to be significant by itself or in the presence of various factors, including immune status, in association with OS in TCGA cohort. As a result, primary cytoreduction was not included in the multi-variable Cox PH models for TCGA. Primary surgical cytoreduction (optimal or sub-optimal) was highly significant as a co-variable in the Mayo Clinic cohort (p < 0.00001) and had mixed significance (p = 0.014 to 0.067) in the Cleveland Clinic-Charité cohort. These results indicated that higher levels of cytotoxic lymphocytes and TCT were associated with reduced risk and better outcomes (HR < 1). Interestingly, the TCT:M2TAM signature, which was highly significant for TCGA and the Mayo Clinic cohort, exhibited p-values that were two to four orders of magnitude more significant than for the individual signatures (Table 2 and Figure 3). In contrast, p-values for the M2TAM immune signature in particular were not significantly associated with OS either individually or with other co-variables and would not have initially suggested that M2TAM would be so highly associated with OS when combined with TCT as a ratio.

Association of the Immune Signatures in the Presence of other Cofactors
Beyond their prognostic value, we investigated whether the derived immune signatures were significantly associated with disease-free survival (DFS) in the presence of other known predictive biomarkers, e.g., BRCA1/2 somatic mutational status [9,10] and COL2A1 expression [23]. In TCGA, BRCA1/2 somatic mutational status was significantly associated with DFS in the presence of age, stage and each RNA-based immune signature (Table S2). The statistical significance and HR estimates for each immune signature were not substantially impacted by the presence of BRCA1/2 somatic mutational status. This analysis was not carried out for the other HGSOC cohorts as BRCA1/2 mutational status was not available. In the TCGA and Cleveland Clinic-Charité DFS cohorts, COL2A1 expression was jointly significant with the immune factors CLIS and TCT in the presence of the other clinical factors. In fact, the statistical significance of the immune factors increased and HR estimates were stable for CLIS and TCT when COL2A1 expression was added to the multivariable PH model (Table S3). Thus, immune status is important in explaining variation in DFS and OS across multiple cohorts even when patient age, tumor stage, BRCA1/2 somatic mutational status, tumor cytoreduction, and COL2A1 expression is known.

Significance of Pan-Cancer-Based Immune Signature in other Cancers
To test the applicability of the study-derived immune signatures in other cancers, we probed TCGA RNA-Seq data for uterine corpus endometrial cancer (UCEC) and high tumor mutational burden (Hi-TMB) breast cancer (BRCA) patients [24,25]. The patient characteristics for these two cohorts are shown in Table S4. Similar to HGSOC, the immune signatures for CLIS (p = 0.001 in UCEC and p = 0.002 in Hi-TMB breast cancer) and TCT (p = 0.036 in UCEC and p = 0.032 in Hi-TMB breast cancer) were significantly associated with outcome in multivariable Cox PH models (Table 3) suggesting the potential utility of the immune signatures for other female cancers. Hi-TMB breast cancer OS multivariable Cox PH models also included progesterone receptor (PR) status, since progesterone is a known predictor of outcomes in breast cancer and was significantly associated with OS with each immune signature.

Discussion
Cytotoxic T lymphocytes play a significant role in the tumor cytotoxic response. However, different immune components within the TME can collectively impact tumor growth characteristics. Several studies have characterized immune subtypes based on the RNA expression of immune-related genes, but the clinical value of these gene signatures has not been reliably described among different patient cohorts. Therefore, the establishment of pan-cancer RNA-based immune signatures with broad predictive and/or prognostic applicability for different tumor types would be helpful in guiding treatment strategies Using a pan-cancer approach of solid tumors, we derived RNA-based immune-gene signatures that represented different immune activities and successfully tested the prognostic value of three important signatures in different tumors. These signatures included CLIS, TCT, and TCT:M2TAM, representing the ratio of two plastic immune activities that exert divergent effects on tumor progression (tumor inhibitory vs. tumor promoting, respectively). Our results demonstrate that CLIS, TCT, and TCT:M2TAM signatures were significantly associated with OS in the TCGA HGSOC discovery cohort. The robustness of these signatures was confirmed in two validation cohorts of HGSOC. In addition, CLIS and TCT signatures had a value beyond HGSOC, showing significance with the OS of UCEC and Hi-TMB breast cancer patients. Thus, CLIS and TCT were significantly associated with OS in all patient cohorts tested, independent of the RNA measurement platform (RNA-Seq or microarray) and in the presence of several important co-variables.
Interestingly, the TCT:M2TAM activity was shown to be a more significant prognostic indicator than individual signatures in TCGA and the Mayo Clinic HGSOC cohorts. The improved HR and significance of the TCT:M2TAM in the larger TCGA and Mayo Clinic cohorts as well as the similar directionality of the HR and low p value (0.067) for the smaller Cleveland Clinic-Charité cohort illustrate the potential for combining immune regulatory activity with anti-tumor immune activity to explain a larger portion of the variation observed in outcomes at least for HGSOC. Indeed, previous studies have shown that tumor progression can be influenced by the differential expression of M1 and M2 macrophages [26]. It has also been reported that the depletion of tumor-associated macrophages can improve T cell migration [27] and that the repolarization of M2-to-M1 macrophage within the TME using a supramolecule improves anti-tumor effects in pre-clinical models of breast cancer and melanoma [28]. Thus, our results suggest that immune activity is not unidimensional relative to outcomes and that a more complete view of the TME is needed to provide insights for improving patient outcomes with novel chemotherapy and/or immunotherapy strategies.
Our results demonstrating the association of immune signatures with the adjuvant-chemotherapyresponse is supported by recent studies demonstrating the augmentation of a pre-existing immune response in HGSOC patients treated with neo-adjuvant chemotherapy [29,30]. Furthermore, our results reveal that the significance of CLIS and TCT was maintained when other predictive markers of HGSOC, BRCA1/2 mutational status [3], primary cytoreduction [5], or COL2A1 expression [23] were included in multivariable Cox PH models for predicting DFS in HGSOC cohorts.
In addition to being prognostic for HGSOC, CLIS and TCT signatures were also prognostic for other female malignancies, UCEC and Hi-TMB breast cancer. The strength of the association for the different cancer types evaluated was greatest when immune activity was measured on the continuum rather than as High/Low, implying that risk was proportional to levels of CLIS and TCT. Interestingly, CLIS showed consistent HR estimates to those reported for CD8+ T cells measured by immunohistochemistry [7] and CLIS, TCT, and TCT:M2TAM were more significantly associated with survival in TCGA HGSOC cohort than other reported [31,32] immune signatures.
Immunotherapy has made substantial progress in treating several cancers [13,[33][34][35]. However, the absence of suitable biomarkers has made it difficult to identify patients that would benefit from immunotherapy treatment. The pan-cancer immune signatures defined in this study that were representative of both the adaptive and innate/inflammatory immune tumor microenvironment and predictive of outcome to chemotherapy may also be useful in guiding treatment decisions for selecting patients being considered for immunotherapy. The predictive capabilities of inflammatory/immunosuppressive immune signatures suggest consideration of these signatures in addition to the routinely employed biomarkers, e.g., PD1/PDL1 expression for immunotherapy treatment protocols.

Pan-Cancer RNA-Seq Data Analysis for Derivation of Immune Signatures
The experimental approach is outlined in Figure 1. RNA-Seq profiles in TCGA from 29 different solid tumor types (Table S1) were evaluated to identify patterns of gene co-expression associated with major immune subsystems. A stratified random sampling approach was employed to examine 4446 pan-cancer tumor specimens. Up to 200 specimens from each type were randomly selected to prevent bias in statistical analysis.
For derivation of the immune signatures, normalized TCGA expression data were mean centered by gene and hierarchically clustered (centroid-based) by gene and tumor specimens using correlation as a measure of similarity. The immunome model was based on the LM22 classification of > 500 genes [36] supplemented with additional genes from other studies [37,38]. All immune signatures were derived while blinded to clinical outcome information.

Patient Cohorts
The applicability of the immune signatures was tested in TCGA RNA-Seq HGSOC dataset (discovery cohort), n = 189 [8] as the prototypical tumor type and validated in two other HGSOC patient cohorts, Cleveland Clinic, and Charité Medical University of Berlin (n = 48), and Mayo Clinic, n = 174 [22], as well as TCGA uterine corpus endometrial cancer (UCEC) cohort (n = 370) and a subset of TCGA breast cancer (BRCA) cohort (n = 194) that consisted of patients with high TMB. Hi-TMB was defined as TMB > 2 nonsynonymous somatic variants per Mb. In TCGA and Cleveland Clinic-Charité HGSOC cohorts we selected patients who had cytoreductive surgery followed by ≥ 6 courses of adjuvant platinum-based chemotherapy. For the Mayo Clinic patients, all patients received adjuvant platinum-based chemotherapy. Ethics approval and consent to participate. For our study cohort (also known as the Cleveland Clinic-Charité cohort), tumor specimens were obtained at cytoreductive surgery according to institutional review board approved protocols (IRB 4614), with fully informed consent. The study was performed in accordance with the Declaration of Helsinki.

RNA-Seq and Microarray Analysis
The RNA-Seq normalized count data for TCGA cohorts was downloaded from TCGA repositories within dbGAP (now part of the Cancer Genomics Cloud). The RNA-Seq data was aligned to the genome (hg19) using Mapsplice, quantified relative to the transcriptome using RNA-Seq by expectation-maximization (RSEM) [39], and consolidated to the gene level. Counts were further normalized using upper quartile methods across generally detected genes to be more compatible with the discovery cohort.
Tumor specimens for the Cleveland Clinic/Charité cohort were obtained at primary cytoreductive surgery from patients providing complete informed consent according to Institutional Review Board approved protocols. RNA samples were prepped using 100 ng total RNA input following the Illumina TruSeq Total RNA prep protocol and were sequenced on Illumina HiSeq flow cells in a 50-b paired-end fashion with each sample having a minimum of 50 M clusters (paired reads). The same bioinformatic methods used for TCGA RNA-Seq were applied except that alignment was performed using STAR v2.4.
RNA expression data for the Mayo Clinic (Agilent array-based) cohort were extracted from the Gene Expression Omnibus (GEO) file GSE53963_series_matrix.txt [22]. When multiple probes existed for a gene, the probe with the greatest detection level across the cohort was preferentially selected.

Statistical and Survival Analysis
All immune signatures used a weighted average of individual gene expression values within the signature (log 2 scale) to create a score per signature. Immune signatures were standardized (through a linear transform of log 2 values) relative to each cohort so that a one-unit change in an immune signature score implied an approximate one standard deviation change within the study population. The log 2 signature values were linearly transformed so that the 15th and 85th percentile values were recoded to −1 and 1, respectively to compare hazard ratio (HR) estimates between studies and signatures.
For survival analysis, Cox proportional hazards (PH) regression models were implemented using PROC PHREG from SAS v9.4. Confidence intervals for HR estimates used the Wald 95% limits. For this analysis, patient age, tumor stage, and various immune factors were considered simultaneously. Additionally, we considered other factors in several multivariable survival models as appropriate for either OS or DFS: for example, BRCA1/BRCA2 somatic mutational status (mutated vs. wild-type), primary cytoreductive surgery (optimal or sub-optimal or unknown), and COL2A1 expression (high and low) for HGSOC, and progesterone receptor (PR) status (positive vs. negative) for breast cancer. For TCGA analysis, patient age, tumor stage, grade, and primary cytoreductive surgery status are available via the Broad Firehose portal [40].

Conclusions
In conclusion, our study demonstrates convincingly that multiple RNA-based immune signatures identified using a pan-cancer approach are prognostic for OS in HGSOC as well as other female cancers. These findings also suggest that these signatures potentially have predictive/prognostic value in other solid tumors. Thus, a comprehensive understanding of the genomic differences and the association of relevant RNA-based biomarkers with immune response could significantly influence clinical decisions, including the development of novel targeted strategies and immune checkpoint blockade therapy.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6694/12/3/620/s1, Figure S1: The color coding and detailed description of the LM22 immune classes (36) for Figure 2, Figure S2: Hazard ratio (HR) estimates ( ) with 95% confidence intervals for individual immune signatures in the multivariable Cox proportional hazards (PH) models from three high-grade serous ovarian cancer (HGSOC) cohorts shown in Table 2. Immune signatures were standardized to more easily compare hazard ratios. Each multivariable Cox models includes patient age and tumor stage. All immune-related signatures shown here have higher levels associated with reduced risk and thus better outcomes, Table S1: The Cancer Genome Atlas (TCGA) Solid Tumor Types; Table  S2: Multivariable Cox proportional hazards (PH) models from The Cancer Genome Atlas (TCGA) high-grade serous ovarian cancer (HGSOC) cohort with BRCA1/2 somatic mutation (Som-Mut) versus wild type (WT) status with immune signatures, patient age, and tumor stage for disease-free survival (DFS). Primary cytoreductive surgery (optimal or sub-optimal) was not used since it was not statistically significant for TCGA DFS by itself or with any of the multivariable models shown below, Table S3: Multivariable Cox proportional hazards (PH) models from The Cancer Genome Atlas (TCGA) and Cleveland Clinic-Charité high-grade serous ovarian cancer (HGSOC) cohort with COL2A1 status (High or Low expression) with immune signatures, patient age, tumor stage, and primary cytoreduction status for disease-free survival (DFS), Table S4: Patient characteristics for uterine corpus endometrial cancer (UCEC) and high tumor mutational burden (Hi-TMB) breast cancer (BRCA) cohorts in TCGA.