The Transcriptomic Landscape of Gastric Cancer: Insights into Epstein-Barr Virus Infected and Microsatellite Unstable Tumors

Background: Epstein-Barr Virus (EBV) positive and microsatellite unstable (MSI-high) gastric cancer (GC) are molecular subgroups with distinctive molecular profiles. We explored the transcriptomic differences between EBV+ and MSI-high GCs, and the expression of current GC immunotherapy targets such as PD-1, PD-L1, CTLA4 and Dies1/VISTA. Methods: Using Nanostring Technology and comparative bioinformatics, we analyzed the expression of 499 genes in 46 GCs, classified either as EBV positive (EBER in situ hybridization) or MSI-high (PCR/fragment analysis). PD-L1 protein expression was assessed by immunohistochemistry. Results: Of the 46 GCs, 27 tested MSI-high/EBV−, 15 tested MSS/EBV+ and four tested MSS/EBV−. The Nanostring CodeSet could segregate GCs according to MSI and, to a lesser extent, EBV status. Functional annotation of differentially expressed genes associated MSI-high/EBV− GCs with mitotic activity and MSS/EBV+ GCs with immune response. PD-L1 protein expression, evaluated in stromal immune cells, was lower in MSI-high/EBV− GCs. High mRNA expression of PD-1, CTLA4 and Dies1/VISTA and distinctive PD-1/PD-L1 co-expression patterns (PD-1high/PD-L1low, PD-1high/PD-L1high) were associated with the MSS/EBV+ molecular subtype and with gastric cancer with lymphoid stroma (GCLS) morphological features. Conclusions: EBV+ and MSI-high GCs present distinct transcriptomic profiles. GCLS/EBV+ cases frequently present co-expression of multiple immunotherapy targets, a finding with putative therapeutic implications.


Introduction
Gastric Cancer (GC) is a heterogeneous disease at the morphological and molecular levels [1]. Numerous somatic gene mutations, copy-number variations, translocations/inversions, as well as [...] targeted immunotherapies [22]. Therefore, in this context, the presence of EBV positivity and MSI-high status in GC may serve to select patients for immune checkpoint inhibitor therapy [8,9,13].
Although a growing number of publications have focused on the study of EBV+ and MSI-high molecular subtypes separately or as a group, an in-depth analysis of the contribution of EBV infection and MSI status to the transcriptomic landscape of GC is still lacking. In this study, we performed an unbiased analysis, aimed at uncovering differences within the gene expression profiles of EBV+ and MSI-high molecular subtypes, and at analyzing the expression profile of current targets for immunotherapy in GC.

Results
We studied 46 GCs characterized for two molecular features: (1) EBV infection and (2) MSI-high status (Figure 1). Fifteen of the 46 GCs (32.6%) displayed EBV positivity (EBV+) and the remaining 31/46 (67.4%) were negative (EBV−). Concerning MSI status, 27/46 (58.7%) were MSI-high while 19/46 (41.3%) were microsatellite stable (MSS). All EBV+ GC cases were MSS, while most (27/31) EBV− cases were MSI-high (Figure 1).

EBV+ and MSI-High GCs Displayed Distinct Transcriptomic Signatures
We analyzed the transcriptomic landscape of the 46 GC cases to unveil differences between EBV+ and MSI-high GC subtypes. For this, we used a previously published Nanostring nCounter CodeSet [23], which comprised 499 genes associated with oncogenic signaling pathways, GC molecular subtype signatures and immune response. After data analysis and normalization, the expression of the 499 genes was plotted in a heatmap, and non-hierarchical clustering and principal component analysis (PCA) were performed (Figure 2).
Concerning MSI-high status, we observed that 26/27 MSI-high cases were clustered in clusters 2A and 2B, and 17/19 MSS cases were clustered together in cluster 2C (Figure 2a). Using a PCA, we observed a clear separation between MSI-high and MSS cases (circles vs. squares, Figure 2b). Concerning EBV infection, we observed that most EBV+ cases (13/15) were clustered together in cluster 2C, with the remaining two EBV+ cases found within/close to cluster 2B (Figure 2a). With the PCA, almost all EBV+ cases were separated from EBV− cases (black vs. gray, Figure 2b).
As we correlated the information from the heatmap, the dendrogram and the PCA, we observed a clear separation of two MSS/EBV+ cases in the PCA, which likely reflects overall higher expression in these two cases (black squares with asterisk, Figure 2). Furthermore, the one MSI-high/EBV− case found in the MSS-rich cluster 2C grouped with the MSI-high cases in the PCA, highlighting the differences between the clustering strategies (gray circle with asterisk, Figure 2). These results suggested that the transcriptomic landscape assessed can clearly distinguish GC cases with an MSS phenotype from those with an MSI-high phenotype and, to a lesser degree, EBV+ from EBV− cases.
To further reinforce these findings, we used the partitioning method k-means with several values of k to understand how strongly the gene expression profile followed the MSI-high or EBV+ classification. The best results were observed for k = 3, as the clusters calculated were the most homogeneous in terms of molecular subtypes, particularly for the MSS/MSI-high status. In fact, all MSI-high cases fell into cluster I and were perfectly separated from MSS cases, which fell into clusters II and III (Table 1). For EBV− cases, the clustering was more heterogeneous, spreading across two different clusters (I and III). Of note, if we disregard the two samples in cluster II, which correspond to those previously shown to display an abnormally high global expression profile (black squares with asterisk, Figure 2), homogeneity in cluster III becomes evident.
Table 1. Number of GC cases per k-means cluster (k = 2 to 5), by MSI and EBV status.
k    Cluster    MSI-high    MSS    EBV+    EBV−
2    I          25          1      1       26
2    II         2           18     14      6
3    I          27          0      0       27
3    II         0           2      2       0
3    III        0           17     13      4
4    I          13          0      0       13
4    II         14          0      0       14
4    III        0           17     13      4
4    IV         0           2      2       0
5    I          0           12     9       3
5    II         12          0      0       12
5    III        0           2      2       0
5    IV         12          0      0       12
5    V          3           5      4       4
Our observations show that the gene expression profile assessed with the Nanostring CodeSet was sufficient to illustrate the molecular separation of MSS and MSI-high molecular subtypes.
This analysis also provides the rationale to derive a smaller specific gene expression signature that strongly discriminates MSS and MSI-high phenotypes and, to a lesser extent, EBV infection status. To better understand this, we next studied each molecular subtype independently, aiming at uncovering the biological meaning of the gene expression profiles associated with each phenotype.
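The homogeneity comparison across values of k can be made concrete with a simple purity score, that is, the fraction of cases that belong to the majority class of their cluster. The sketch below is only an illustration: purity is an assumed metric, not the criterion the authors report, and the names `MSI_MSS`, `EBV_K3` and `purity` are hypothetical. The counts are those listed in Table 1.

```python
# Per-cluster counts from Table 1 as (MSI-high, MSS) pairs, keyed by k.
MSI_MSS = {
    2: [(25, 1), (2, 18)],
    3: [(27, 0), (0, 2), (0, 17)],
    4: [(13, 0), (14, 0), (0, 17), (0, 2)],
    5: [(0, 12), (12, 0), (0, 2), (12, 0), (3, 5)],
}

# Per-cluster counts for k = 3 as (EBV+, EBV-) pairs.
EBV_K3 = [(0, 27), (2, 0), (13, 4)]


def purity(clusters):
    """Fraction of cases that fall in the majority class of their cluster.

    Purity is an assumed, illustrative metric; it tends to rise as k grows,
    so it should not be used alone to choose k.
    """
    total = sum(sum(counts) for counts in clusters)
    majority = sum(max(counts) for counts in clusters)
    return majority / total


for k, clusters in MSI_MSS.items():
    print(f"k={k}: MSI/MSS purity = {purity(clusters):.3f}")
print(f"k=3: EBV purity = {purity(EBV_K3):.3f}")
```

Under this score, k = 3 is the smallest k in Table 1 with a perfect MSI-high/MSS separation, while the EBV-based purity at k = 3 stays below 1 because four EBV− cases share cluster III with EBV+ cases.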

MSI-High GC Cases Displayed a Mitotic Signature, While MSS GC Cases Showed an Immune Response Signature
We first compared the transcriptomic landscape of MSS and MSI-high cases and detected 193 differentially expressed genes (DE-genes; False Discovery Rate (FDR) ≤ 0.05 and fold-change ≥ 1.5 or ≤ 0.67). By performing non-hierarchical clustering using specifically these DE-genes, we observed that all MSS cases (n = 19/19) and the majority of MSI-high cases (n = 25/27) were clustered together (cluster 3A and cluster 3B, respectively) (Figure 3a), while two MSI-high outlier cases clustered together with the MSS cases (cluster 3A, circles with cross, Figure 3a). However, with PCA these two cases were placed closer to the remaining MSI-high cases than to the MSS cases, with over 50% of data variability considered (Figure 3b). In fact, PCA exhibited a clear separation of cases according to MSS/MSI-high status.
We next searched for biological annotations among the 193 DE-genes between MSS and MSI-high cases [24,25]. We observed enrichment in more general terms such as signal peptide and disulfide bond, as well as in regulation of cell proliferation and chemotaxis (Table 2). Of the 193 DE-genes, 55 were upregulated in MSI-high cases and annotated to terms associated with cell division. The remaining 138 DE-genes, downregulated in MSI-high cases, were related to chemotaxis and immune response. The 55 upregulated DE-genes in MSI-high cases were significantly associated with three annotation clusters: cluster 1, associated with cell cycle and cell division; cluster 2, with cytoskeleton; and cluster 3, with more specific terms such as 'mitotic spindle organization' (Table 3). For the 138 downregulated DE-genes, eight clusters were significantly enriched, associated with the terms: signal peptide and disulfide bond; chemotaxis; leukocyte and lymphocyte activation; chemokine activity; response to stimuli; regulation of cell death; cell migration and motility; and polysaccharide binding (Table 3).
Taken together, these results showed that MSI-high cases likely present increased cell division and decreased immune response and cell migration.

EBV+ GC Cases Were Associated with Immune Response Signature
Next, we compared the transcriptomic landscape of EBV+ and EBV− GC cases. We detected 142 DE-genes, and non-hierarchical clustering revealed two major clusters: cluster 4A, with 22/31 EBV− cases; and cluster 4B, constituted by all 15 EBV+ cases plus the remaining 9 EBV− cases (Figure 4a). This separation between EBV+ and EBV− cases was partially recapitulated by PCA: 4/9 EBV− cases that clustered together with EBV+ cases in the non-hierarchical clustering were similarly grouped with PCA (gray diamonds with asterisk, Figure 4). This showed an overall less homogeneous clustering according to EBV status, indicating that the MSI-high status was a stronger marker in GC.
Figure 4. Black diamonds correspond to EBV+ cases (n = 15) and white diamonds to EBV− cases (n = 31). Diamonds with an asterisk mark the correspondence between panels (a) and (b), as described in the main text.
We then performed biological annotation of the 142 DE-genes and observed a significant enrichment in terms such as chemotaxis and immune response (Table 4). Of the 142 DE-genes, 105 were upregulated in EBV+ cases vs. EBV− cases and belonged to six enriched clusters: signal peptide and disulfide bond; chemotaxis and motility; leukocyte and T-cell activation; chemokines and chemokine activity; immune system development; and T-cell differentiation (Table 5). The 37 genes downregulated in this comparison were enriched in two clusters: cell division, mitosis, and cell cycle (Table 5).
Our results show that EBV infection is not as determinant as MSI-high status. Nevertheless, the presence of this virus in GC samples was associated with an immune T-cell inflamed phenotype, in line with current literature [4,6,8].

MSS/MSI Phenotype Classification Was the Major Molecular Classifier in GC
Given that MSI and EBV status were part of the molecular classification for GC proposed by TCGA [4], we next assessed the differential expression profile of our GC cases taking into account both molecular classifications. We detected 166 DE-genes associated with biological annotations such as interleukin-8-like chemokine, signal peptide and chemotaxis (Table 6). Of the 166 DE-genes, 117 were upregulated and 49 downregulated in MSS/EBV+ cases vs. MSI-high/EBV− cases. Next, we plotted the expression of these DE-genes for all 46 GC cases and observed that, despite adding both molecular classifiers, MSI-high and MSS cases remained well separated (Figure 5a). As observed before, two MSI-high/EBV− GC cases were clustered together with MSS cases (cluster 5A, Figure 5a, gray circles with asterisk and cluster 3A, Figure 3a, circles with cross). Nevertheless, with PCA, the same two cases were clustered closer to MSI-high/EBV− GC cases. PCA separated two other MSI-high/EBV− cases, which were already loosely clustered in cluster 5B ([...]).
Separate annotation of up- and downregulated DE-genes in MSS/EBV+ cases revealed significant enrichment for five annotation clusters for upregulated genes: signal peptide and disulfide bond; chemotaxis and locomotion; leukocyte and T-cell activation; chemokines and chemokine activity; and actin fibers (Table 7). Downregulated genes were separated into two clusters: cell cycle and mitosis, and cytoskeleton. These results were comparable to those previously observed when considering the molecular classifiers independently. To understand whether this was due to a common set of DE-genes across analyses, we next compared the DE-genes obtained for each comparison: MSS with MSI-high cases; EBV+ with EBV− cases; and MSS/EBV+ with MSI-high/EBV− cases.
Most DE-genes were shared by the three analyses (n = 133, Figure 6), thus justifying the similar biological annotation enrichments obtained. Unlike the EBV-based classification, many DE-genes derived specifically from the MSI-high/MSS-based classification and were lost when combining both molecular subtypes (n = 34, Figure 6). Functional enrichment of this particular set of DE-genes, although without any FDR-significant results, pointed toward enrichment in extracellular matrix terms. Altogether, our results pinpointed the MSI-high/MSS phenotype as the major molecular classifier in our GC cohort, independently of EBV status classification.
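The comparison of the three DE-gene lists amounts to computing the regions of a three-set Venn diagram. The following is a minimal sketch using Python sets. The function name `venn3_regions` and the placeholder gene identifiers (`g1` to `g6`) are hypothetical and do not correspond to genes reported in this study.

```python
def venn3_regions(msi, ebv, combined):
    """Split three DE-gene lists into the seven regions of a Venn diagram."""
    a, b, c = set(msi), set(ebv), set(combined)
    return {
        "all_three": a & b & c,
        "msi_only": a - b - c,
        "ebv_only": b - a - c,
        "combined_only": c - a - b,
        "msi_and_ebv_only": (a & b) - c,
        "msi_and_combined_only": (a & c) - b,
        "ebv_and_combined_only": (b & c) - a,
    }


# Toy input with placeholder gene identifiers (not real data).
regions = venn3_regions(
    msi=["g1", "g2", "g3", "g4"],
    ebv=["g1", "g2", "g5"],
    combined=["g1", "g3", "g5", "g6"],
)
for name, genes in regions.items():
    print(name, sorted(genes))
```

With the real lists, the size of `all_three` would correspond to the shared DE-genes and `msi_only` to the genes specific to the MSI-high/MSS comparison.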

PD-L1 and PD-1 Displayed Opposite mRNA Expression Patterns and Were Differently Associated with GC Molecular Subtypes and Morphological Features
In GC, several clinical trials have been targeting immune checkpoint regulators, such as CTLA4, PD-1, PD-L1 and VISTA/Dies1 [26]. Given the observed associations between immune response terms and MSS/EBV+ cases, we further assessed the mRNA expression of these immune checkpoint regulators across all cases from the three GC groups represented in our series (15 MSS/EBV+, 4 MSS/EBV− and 27 MSI-high/EBV−, Figure 7). CTLA4, PD-1 and VISTA/Dies1, but not PD-L1, mRNA expression was significantly enriched in MSS/EBV+ cases (Figure 7a). Therefore, we analyzed PD-L1 protein expression in cancer cells and in the immune cells infiltrating the tumor microenvironment (TME) to understand this difference. In cancer epithelial cells, PD-L1 protein expression did not differ between GC groups (Figure 7b). However, in the immune cells of the TME, MSI-high/EBV− cases often presented low expression of PD-L1, while MSS/EBV+ cases showed variable PD-L1 expression across all categories, from low to high (Figure 7c, Fisher's Exact Test p-value = 7.71 × 10⁻³).
These results prompted us to re-analyze the mRNA expression of the four immune checkpoint regulators on a case-by-case basis. While CTLA4 and VISTA/Dies1 followed the expression pattern of PD-1, PD-L1 varied in an inverse manner in a large set of cases: for example, GC cases with the highest expression of PD-L1 displayed the lowest expression of PD-1 (Figure 7d).
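A case-by-case comparison of this kind can be sketched by dichotomizing each marker into high and low and labeling every case with the resulting PD-1/PD-L1 combination. The cut-off used here (the cohort median of each marker) is an assumption made for illustration only, since the threshold separating high from low expression is not stated in this text. The function name and the toy log2 counts are likewise hypothetical.

```python
from statistics import median


def coexpression_scenarios(pd1, pdl1):
    """Label each case as PD-1 high/low and PD-L1 high/low.

    `pd1` and `pdl1` map case identifiers to normalized log2 counts.
    The median split is an assumed threshold, not the authors' stated method.
    """
    pd1_cut = median(pd1.values())
    pdl1_cut = median(pdl1.values())
    labels = {}
    for case in pd1:
        pd1_level = "high" if pd1[case] > pd1_cut else "low"
        pdl1_level = "high" if pdl1[case] > pdl1_cut else "low"
        labels[case] = f"PD-1{pd1_level}/PD-L1{pdl1_level}"
    return labels


# Toy values (not real data) covering the four possible scenarios.
toy_pd1 = {"case1": 9.0, "case2": 4.0, "case3": 8.5, "case4": 3.0}
toy_pdl1 = {"case1": 3.0, "case2": 7.0, "case3": 7.5, "case4": 2.0}
scenarios = coexpression_scenarios(toy_pd1, toy_pdl1)
for case, label in scenarios.items():
    print(case, label)
```

Counting how many cases of each molecular subtype fall under each label would then give a contingency table suitable for the kind of association test described in the statistical analysis section.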
To validate this observation, we assessed the number of GC cases for each of the four PD-L1/PD-1 co-expression scenarios observed: (1) high expression of PD-L1 and low expression of [...]
Altogether, this gene-oriented analysis showed that PD-L1 and PD-1 exhibit particular co-expression patterns in an MSS/MSI-high and EBV infection-dependent manner. Moreover, our results suggest that the evaluation of the tumor immune infiltrate by histopathological analysis strengthened the stratification of GC cases and helped identify more homogeneous biological subgroups in terms of co-expression of immune checkpoint regulators.

Discussion
In this study, we explored the transcriptomic profile of EBV+ and MSI-high GC, using a Nanostring CodeSet with 499 genes involved in oncogenic signaling, immune response and molecular gene expression signatures. This small gene expression panel could segregate GCs of our cohort according to MSI-high status and, to a lesser extent, EBV infection, and was sufficient to reproduce the taxonomy developed by TCGA.
EBV infection and MSI-high status represent two alternative pathways of gastric carcinogenesis and two mutually exclusive GC molecular subtypes [11,12,27]. Herein, we confirmed that all EBV+ cases showed an MSS phenotype, and vice versa, that all MSI-high cases were negative for EBV infection.
When we focused on determining clusters of biologically-related annotation terms, underlying the DE-genes found for the two GC molecular subtypes, we found that MSI-high tumors showed an enrichment in genes related to DNA replication and mitotic cell cycle, as previously reported [4,5]. MSI-high status leads to the accumulation of numerous frameshift mutations throughout the genome [28] and may determine the inactivation of key tumor suppressor genes, including those involved in DNA damage repair, cell cycle control and apoptotic signaling [29]. Accordingly, as demonstrated in the MSI-high colorectal cancer model [30], mutations providing proliferative and survival advantage are selected during MSI-high GC initiation and/or progression, conferring a proliferative state. In contrast, EBV+ tumors showed a downregulation of genes involved in mitotic pathways.
By functional annotation of genes discriminating EBV+ tumors, we identified a gene signature involved in immune pathways, confirming the data already reported in the literature [4,6,8,31,32]. The immune signature was enriched for genes related to T-cell differentiation, cytotoxic signaling, pro-inflammatory cytokines/chemokines, leukocyte migration and genes of the immune checkpoint inhibitor pathways. These features reflect the immunogenicity of EBV infection and provide evidence of the biological significance of immune cell infiltration in EBV+ tumors [33]. Accordingly, the DE-genes downregulated in MSI-high cases, hence upregulated in MSS cases, were also found to be associated with immune response and cell migration. Therefore, in this study we demonstrated, using unbiased bioinformatics analyses, that the transcriptomic landscape of GCs with EBV+ and MSI-high phenotypes is different, associating each molecular entity with enrichment of different biological terms, i.e., immune response and mitotic activity, respectively.
In this study, gene expression analysis was performed through the Nanostring Technology Platform, which has shown excellent robustness and sensitivity for the analysis of formalin-fixed paraffin-embedded (FFPE) samples [34]. Moreover, several authors have shown the reproducibility of the results obtained in GC tissues through Nanostring technology, using distinct molecular platforms [23,34]. Importantly, our results confirmed the transcriptomic data obtained in the TCGA study, thus further contributing to the validation of the Nanostring CodeSet. In future studies, it would be interesting to confirm the enrichment of mitotic pathways in MSI GC cohorts by investigating mitotic activity/index through histopathological analysis, as already demonstrated in the colorectal cancer model [30,35].
We also investigated the expression of molecules involved in immune checkpoint inhibitors pathways and current targets for immunotherapy in GC [26]. By assessing PD-L1 protein expression by IHC, the current predictive biomarker used for selecting GC patients eligible for Pembrolizumab immunotherapy [16], we found that PD-L1 protein expression, evaluated in cancer cells, showed no significant differences between the two molecular subgroups. However, when we evaluated PD-L1 expression in immune cells of the TME, most MSI-high/EBV− cases presented low expression, when compared to MSS/EBV+ tumors. This result shows the value of evaluating protein expression in tissue sections to improve knowledge of topographic distribution of molecular markers.
We also analyzed PD-1, CTLA4 and Dies1/VISTA mRNA expression and observed that all were significantly enriched in MSS/EBV+ cases, in comparison with MSI-high/EBV− cases. This result highlighted the high correlation of PD-1, CTLA4 and Dies1/VISTA increased mRNA expression with EBV+ cases, but not with EBV− cases. However, as we analyzed PD-L1 mRNA expression, we did not observe any significant difference between the two molecular subgroups. This result prompted us to analyze the co-expression of PD-1 and PD-L1 at the mRNA level, the two most promising biomarkers for GC immunotherapy. Studies have shown that patients with mRNA co-expression of PD-1/PD-L1 were those with better prognosis [36]. However, few GC cases in our cohort presented expression of both markers simultaneously and, interestingly, most were GCLS cases positive for EBV infection. Nevertheless, it has been shown that patients treated with anti-PD-1 therapy respond well even in the absence of PD-L1 expression [37], a fact that may reflect the different co-expression patterns observed in our cohort.
Taking all these observations into account, we further explored the pattern of co-expression of PD-1 with Dies1/VISTA and CTLA4. In fact, combination immunotherapies simultaneously targeting PD-1 and Dies1/VISTA or CTLA4 are being explored as new strategies for the treatment of GC [38,39]. We observed that, of the 19 PD-1 high-expressing GCLS cases, 18 also displayed high Dies1/VISTA and/or CTLA4 mRNA expression. This observation suggests that the recognition of GCLS morphological features may contribute, in >70% of cases, to the selection of patients who would benefit from a combination immunotherapy targeting PD-1 and either Dies1/VISTA or CTLA4 (Figure 8). The results herein described raise the hypothesis that Dies1/VISTA and CTLA4 may be the silent PD-1 partners in GC, explaining the good response observed in patients harboring PD-L1-negative tumors treated with anti-PD-1 therapy [18][19][20]37]. Overall, our results suggest that most GCLS patients may benefit from anti-PD-1 therapy combined with either anti-Dies1/VISTA or anti-CTLA4 (Figure 8). These novel data warrant further studies. Evaluating the protein expression of multiple immunotherapy targets besides PD-L1 in different GC cohorts would be crucial to integrate gene and protein expression data, as well as to explore the topographic distribution (i.e., cancer cells versus TME immune cells) of different biomarkers. Of note, Dies1/VISTA expression has already been assessed in a large GC series, and its expression was mostly detected in >80% of TME immune cells [26].
Our study also revealed that, beyond MSS/MSI-high and EBV infection, the morphological entity GCLS was strongly associated with high PD-1 expression. In fact, ~80% of all GCLS cases presented high PD-1 mRNA expression (Figure 8).
Altogether, our data demonstrated that a small transcriptomic panel can separate MSI-high/EBV− from MSS/EBV+ GC cases, and this may have clinical utility. Our analysis also indicates that EBV+ GCs with GCLS morphological features form the biological subgroup most likely to respond to immunotherapy, as they present higher expression of PD-1, the key immunotherapy target in GC, together with Dies1/VISTA and CTLA4. These observations support the ongoing GC clinical trials and highlight GCLS as a useful feature to stratify patients for targeted immunotherapies.

Case Series
Tissue samples were obtained retrospectively from 46 patients with GC who had undergone gastrectomy as primary treatment at Centro Hospitalar São João (Porto, Portugal). For mRNA extraction, frozen tissue was available from 23 cases, whereas FFPE tissue was used in the remaining 23 cases. The series was enriched with GC cases harboring EBV infection (n = 15) and MSI-high status (n = 27), whereas the remaining four cases were EBV− and MSS. EBV infection and MSI status were investigated as described below. Histopathological analysis was performed on H&E sections and the tumors were classified as GCLS or CA, based on the abundance of the lymphoid infiltrate.

EBV In Situ Hybridization
The presence of EBV infection was studied by chromogenic in situ hybridization (ISH) for EBV-encoded RNA (EBER-ISH, INFORM EBER probe, Ventana Medical Systems, Tucson, AZ, USA). One 3 µm section was processed in the automatic Ventana Benchmark Ultra platform with enzymatic digestion (ISH protease) and the iViewBlue detection kit.

PCR/Fragment Analysis for MSI Status
Genomic DNA was extracted from frozen or FFPE tissues (four sections, each 10 µm thick), using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA), in accordance with the manufacturer's instructions. DNA purity and quantification were assessed using the NanoDrop 2000 UV-Vis spectrophotometer (NanoDrop products, Wilmington, DE, USA). Five mononucleotide markers (BAT-25, BAT-26, NR-24, NR-21 and NR-27) were used as a pentaplex panel to determine MSI status (Multiplex PCR, Qiagen, Valencia, CA, USA). Tumors with instability involving at least two of the five loci were classified as MSI-high.
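The classification rule for the pentaplex panel can be written as a short function. This is a sketch under one stated assumption: tumors with fewer than two unstable loci are labeled MSS, which matches the two-group MSI-high/MSS design of this series but is not spelled out in the rule above. The function name `classify_msi` is hypothetical.

```python
PENTAPLEX_MARKERS = ("BAT-25", "BAT-26", "NR-24", "NR-21", "NR-27")


def classify_msi(unstable_loci):
    """Return 'MSI-high' if at least two of the five loci are unstable.

    Assumption: zero or one unstable locus is reported as 'MSS'.
    """
    unstable = set(unstable_loci)
    unknown = unstable - set(PENTAPLEX_MARKERS)
    if unknown:
        raise ValueError(f"Unknown marker(s): {sorted(unknown)}")
    return "MSI-high" if len(unstable) >= 2 else "MSS"


print(classify_msi(["BAT-25", "BAT-26", "NR-21"]))  # MSI-high
print(classify_msi(["NR-27"]))                      # MSS (assumed label)
print(classify_msi([]))                             # MSS
```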

Gene Expression Profiling by Nanostring nCounter Assay
Total RNA was extracted from frozen or FFPE tissues (four sections, each 10 µm thick), using the miRNeasy Mini Kit (Qiagen, Valencia, CA, USA), in accordance with the manufacturer's instructions. High tumor content of the samples was ensured by morphological evaluation of mirror H&E sections of frozen samples and by microdissection of tumor areas in sections from FFPE blocks. For the Nanostring nCounter assay, we used a custom-designed panel comprising 474 previously published genes [23] and additional genes associated with immune response (CCL22, CCR7, CD3D, CD3E, CD3G, CD8A, CD8B, CD19, CD20, CD45, CD68, CXCL10, CXCL11, FOXP3, GZMA, GZMB, IL4, IL13, PD-1, PD-L1, TNFA, VISTA/Dies1). Nanostring probe hybridization was performed as a service at the Genome Institute of Singapore. Raw counts obtained for each sample were normalized using nSolver software version 3.0 (NanoString Technologies). We performed: (1) background subtraction using eight negative control probes included in the Nanostring CodeSet; (2) positive control normalization using six positive control probes also included in the Nanostring CodeSet; (3) housekeeping normalization using the standard method in the software and five independent housekeeping genes included in the Nanostring CodeSet. Normalized log2-scaled counts were used to construct the heatmaps and dendrograms and to perform the PCA described in this study, using the R environment and the packages "ggplot2" and "ggfortify" [40][41][42][43]. Next, each sample was labeled according to its MSS/MSI-high phenotype and/or EBV infection status to perform comparison analyses, also using the nSolver software (NanoString Technologies). Calculated ratios and FDR were used to define the set of up/downregulated DE-genes for each comparison: genes with a ratio above 1.5 and FDR < 0.05 were classified as differentially expressed upregulated genes; genes with a ratio below 0.67 and FDR < 0.05 were classified as differentially expressed downregulated genes.
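The thresholds used to call DE-genes can be expressed as a small decision rule. The sketch below only restates the cut-offs given above (ratio above 1.5 or below 0.67, FDR < 0.05). It does not reproduce how nSolver computes ratios or FDR values, and the function name and example values are hypothetical.

```python
def classify_de(ratio, fdr, up=1.5, down=0.67, alpha=0.05):
    """Classify one gene from its expression ratio and FDR.

    Thresholds follow the text: ratio > 1.5 (up) or ratio < 0.67 (down),
    both requiring FDR < 0.05.
    """
    if fdr >= alpha:
        return "not DE"
    if ratio > up:
        return "upregulated"
    if ratio < down:
        return "downregulated"
    return "not DE"


# Hypothetical (ratio, FDR) pairs, for illustration only.
examples = {
    "gene_a": (2.3, 0.001),
    "gene_b": (0.40, 0.020),
    "gene_c": (1.2, 0.001),
    "gene_d": (3.0, 0.200),
}
for gene, (ratio, fdr) in examples.items():
    print(gene, classify_de(ratio, fdr))
```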

PD-L1 Immunohistochemistry
Staining for PD-L1 was performed in FFPE 3 µm sections with a rabbit monoclonal antibody (clone E1L3N, 1:1000; Cell Signaling Technology) on the automatic Ventana Benchmark Ultra platform, using the OptiView Universal DAB detection kit and the OptiView Amplification kit from the same manufacturer. PD-L1 immunoexpression was evaluated semi-quantitatively for tumor epithelial and stromal immune cells, according to the immunoreactivity scoring system (IRS) described by Boger et al. [44]. PD-L1 expression in tumor epithelial cells was dichotomized as positive (detected) or negative (absent) by an immunoreactivity score (IRS) of 2. PD-L1 expression in immune cells of the TME was defined as low (1-5% of positive cells), intermediate (6-20% of positive cells) or high (>20% of positive cells).

Functional Annotation and Statistical Analysis
Functional annotation was performed using the online tool DAVID 6.7 [24,25]. In particular, we used the option 'Functional Annotation Clustering', always with the stringency 'high', and the option 'Functional Annotation Chart'. Selected clusters and/or biological terms were considered enriched and reported in this study if presenting an FDR < 0.05. Normalized log2-scaled counts for the genes CTLA4, PD-1, VISTA/Dies1 and PD-L1 were collected from nSolver software (NanoString Technologies), as previously described. Boxplots were plotted using R [40] and show p-values derived from a Wilcoxon test (Mann-Whitney), also performed using R, on all samples (including outliers). After building a contingency table for the number of cases in each detailed condition, a Fisher's exact test was performed using R. This test was selected rather than the chi-square test due to the low number of samples available in our cohort.
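To illustrate why an exact test suits small cohorts, the following sketch computes a two-sided Fisher's exact p-value for a 2 × 2 table directly from the hypergeometric distribution. It is a simplification: the analyses in this study were run in R and may involve larger contingency tables, which this function does not handle. The function name and the example table are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Small illustrative table (not data from this study).
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # 0.4857
```

No large-sample approximation is involved, which is the property that makes the test appropriate when expected cell counts are low.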

Conclusions
In this study, we have shown that the expression profile of GC cases for the assessed 499 genes was strongly correlated with the established molecular subtypes currently used for GC molecular stratification. Altogether, our results have clearly associated: (1) MSI-high/EBV− GC cases with mitosis and cell cycle biological terms; (2) MSS/EBV+ GC cases with immune response mediated by T-cells. Importantly, we have also shown that the MSI status is a much more relevant molecular classifier than EBV infection. We have also revealed that PD-L1 and PD-1 have opposite mRNA expression patterns in GC, in correlation with MSI phenotype, EBV status and prominent immune infiltrate, as revealed by the GCLS morphological feature, a highly relevant finding as both genes are nowadays actively pursued as targets for immunotherapy in GC. Moreover, our study has shown that Dies1/VISTA and CTLA4 are highly expressed in the majority of EBV+ and GCLS cases, strengthening the relevance of clinical trials using antibodies raised against these two proteins in combination with the promising anti-PD-1 therapy.
Author Contributions: I.G. participated in the concept, design, and planning of the study, selected the cohort, performed/participated in all described experimental procedures, participated in the classification of cases and staining patterns, co-designed and participated in the interpretation of the bioinformatic analyses and co-wrote the manuscript. J.C. co-designed and participated in the interpretation of the bioinformatic analyses and co-wrote the manuscript. D.M. performed the RNA extraction from GC cases. D.L., A.R.M. and M.F. participated in the bioinformatic analyses. C.O. participated in the concept, design, and planning of the study, co-designed the bioinformatic analyses and reviewed the manuscript. F.C. participated in the concept, design or planning of the study, co-designed the bioinformatic analyses, participated in the classification of cases and staining patterns and reviewed the manuscript. P.O. participated in the study design, co-designed, performed and interpreted all bioinformatics analyses and co-wrote the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.