1. Introduction
Merkel cell carcinoma (MCC) is a neuroendocrine carcinoma of the skin with a poor prognosis. The survival rate of MCC varies significantly; population-based studies from New Zealand, Finland, and the United States revealed 5-year disease-specific survival rates of 45%, 59%, and 73%, respectively [
1,
2,
3]. In most MCCs (approximately 80% of MCC tumors in the northern hemisphere), the genome of Merkel cell polyomavirus (MCPyV) is integrated in the tumor cell genome [
4,
5]. Our group and others have previously identified several clinicopathological factors that negatively influence survival in MCC, such as tumor MCPyV-negativity, lack of tumor-infiltrating lymphocytes, male sex, larger primary tumor size, presence of lymph node or systemic metastasis at diagnosis, and immunosuppression [
1,
2,
3,
4,
6,
7,
8,
9,
10,
11]. Despite the generally poor survival of MCC, there is a significant proportion of patients in our Finnish population-based cohort who are still alive over a decade after the initial diagnosis. The treatment of MCC generally consists of surgical removal accompanied by sentinel lymph node biopsy and potential lymph node evacuation followed by radiotherapy. In disseminated MCC, immune-checkpoint inhibitors, namely PD-1 or PD-L1 inhibitors, are frequently used [
12,
13]. To improve the poor prognosis of MCC by means of developing effective targeted therapies, an improved understanding of the mechanisms that drive tumor progression is necessary.
In a recent study by Harms et al., targeted DNA and transcriptional profiling of 63 and 26 pre-defined, cancer-relevant genes, respectively, was performed. The study revealed that oncogene-activating mutations in PIK3CA, IDH2, and JAK2 were associated with poorer disease-specific survival in MCPyV-positive cases. Improved disease-specific survival was associated with higher expression of the pro-inflammatory transcripts IDO1, IFNG, and GZMA in MCPyV-positive cases, whereas high expression of the oncogene transcripts BRAF, RET, and UBE2C was associated with poorer survival in MCPyV-negative cases [
10]. A previous transcriptomic study by Paulson et al. revealed that tumors from patients with a good prognosis exhibited overexpression of immune response-associated genes, particularly genes associated with cytotoxic CD8+ lymphocytes. However, genes overexpressed in tumors from patients with a poor prognosis were not examined in this study [
11].
In this study, we aimed to identify genes, processes, and pathways that play crucial roles in MCC-specific death and MCC-specific survival and that may be targeted in the treatment of MCC, used to guide therapy selection, or used as prognostic markers. We investigated transcriptomes from the primary MCC tumors of 102 patients to identify the genes that were most significantly upregulated among survivors and among patients who died from MCC. We subsequently cross-referenced these genes with gene ontology and pathway databases to identify the biological processes and signaling pathways associated with these differentially expressed genes.
3. Results
3.1. Overview of Patients
Detailed patient clinical data are shown in
Table 1. MCC-specific death occurred in 28/102 (27%) cases and occurred within 5 years of initial MCC diagnosis in 25/28 (89%) cases. The median survival time before MCC-specific death was 2.3 years (range 0.3 to 14.8 years). There was one extreme outlier who survived 14.8 years before MCC-specific death; the second longest survival time was 6.3 years. The median follow-up time before death from a cause unrelated to MCC, or until the last follow-up date at which the patient was still alive, was 4.4 years (range 0.04 to 33.2 years). Kaplan–Meier analysis revealed that MCPyV-negativity, male sex, and the presence of lymph node or systemic metastasis at diagnosis correlated with poorer MCC-specific survival (all
p < 0.001).
Table 2 shows the results of the Cox regression multivariate analysis; all three variables were significantly associated with poorer MCC-specific survival in the multivariate analysis.
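As an illustration of the survival estimate underlying such an analysis, the following is a minimal sketch of the Kaplan–Meier product-limit estimator in plain Python. The function name `kaplan_meier` and the toy cohort are hypothetical and are not taken from this study; the sketch only shows how censored follow-up times (death from an unrelated cause, or alive at last follow-up) enter the estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (years)
    events : 1 if MCC-specific death was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort (not data from this study).
toy_times = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0]
toy_events = [1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients reduce the number at risk without lowering the curve, which is why long follow-up of survivors still contributes to the estimate.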
3.2. Identification of DAGs and SAGs
The differential expression test of genes revealed a total of 50 DAGs and 29 SAGs with a BH corrected
p-value < 0.05. A summary of the upregulated genes, along with their corresponding logFC (log fold change) values, false discovery rates (FDR), and
p-values, can be found in
Table 3. A heatmap illustrating the expression of DAGs and SAGs across all samples can be found in
Figure 1.
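For readers unfamiliar with the Benjamini–Hochberg (BH) correction, the following is a minimal sketch of how BH-adjusted p-values can be computed. The function name `benjamini_hochberg` and the toy p-values are hypothetical; the differential expression software used in the study is not shown here, and this sketch is not a reproduction of it.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (not data from this study).
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
```

In this toy example only the first gene remains below the 0.05 threshold after adjustment, although three of the four raw p-values were below it.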
3.3. GO Enrichment and KEGG Signaling Pathway Analysis of DAGs and SAGs
The top 10 GO BP and GO MF terms most significantly enriched by the DAGs and SAGs, respectively, can be found arranged according to their
p-values in
Table 4 and
Table 5, along with their corresponding q-values and a list of the specific genes causing the enrichment. The q-value is an adjusted
p-value calculated using the BH method for correction for multiple hypotheses testing. The top 10 KEGG pathway terms most significantly enriched by DAGs and SAGs can be found in the same format in
Table 6.
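Term enrichment of this kind is commonly assessed with an over-representation test based on the hypergeometric distribution. The sketch below assumes that approach; the text above does not state which test the enrichment tool applied. The function name `enrichment_p_value`, the term size of 350, and the background of 20,000 genes are hypothetical values chosen for illustration only.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """One-sided hypergeometric tail: P(X >= hits).

    hits            : genes from the input list annotated to the term
    list_size       : number of genes in the input list
    term_size       : genes annotated to the term in the background
    background_size : total number of background genes
    """
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Five of 50 genes annotated to one term; term and background sizes are assumed.
p_toy = enrichment_p_value(hits=5, list_size=50, term_size=350, background_size=20000)
```

The resulting raw p-values, one per tested term, would then be adjusted for multiple testing to obtain the q-values.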
Cancer-relevant processes, functions, and pathways enriched by the DAGs included the PI3K-Akt signaling pathway, the MAPK signaling pathway, and angiogenic processes. Regarding angiogenesis, the most significantly enriched GO terms included regulation of vascular associated smooth muscle cell migration, positive regulation of vascular endothelial cell proliferation, vascular endothelial growth factor receptor 2 binding, and vascular endothelial growth factor receptor binding. Among the most significantly enriched KEGG pathway terms was the VEGF signaling pathway, and among the DAGs was vascular endothelial growth factor A (VEGFA). Five of the DAGs were associated with the KEGG pathway term PI3K-Akt signaling pathway (MYB, AKT3, IGF2, ITGA6, and VEGFA) and four of the DAGs were associated with the KEGG pathway term MAPK signaling pathway (MECOM, AKT3, IGF2, and VEGFA).
The DAGs also exhibited enrichment of developmental pathways and processes, such as the GO BP terms chordate embryonic development (CHD7, IGF2, XYLT1, VEGFA, SULF2), in utero embryonic development (CHD7, IGF2, VEGFA), and skeletal system development (CHD7, COL11A2, IGF2, XYLT1, SULF2).
The processes, functions, and pathways most significantly enriched by the SAGs were mostly related to immune response, particularly to antigen processing and MHC-dependent antigen presentation, and to T-cell mediated immune response. Examples include the KEGG pathway term antigen processing and presentation (CD74, TAP2, PSME1, HLA-DRA, HLA-E) and the GO BP terms T cell receptor signaling pathway (PSME1, HLA-DRA, LCP2, CARD11) and positive regulation of lymphocyte proliferation (SASH3, CD74, HLA-E).
Complete lists of all the GO BP, GO MF, and KEGG pathway terms that had
p-values of <0.05, enriched by the DAGs and SAGs, are presented in
Supplementary Tables S1 and S2.
3.4. Survival of ≥3 Years as Inclusion Criterion for the Good Prognosis Group
Including only patients who were still alive 3 years after diagnosis in the good prognosis group resulted in an increased number of differentially expressed genes. The number of genes upregulated in the good prognosis group increased to 82, including 22/29 of the original SAGs; the number of genes upregulated in the poor prognosis group increased to 118, including 40/50 of the original DAGs. Considering the SAGs of this control analysis, the vast majority of the most significantly enriched GO and KEGG pathway terms were still immune response-related, especially related to antigen processing and MHC-dependent antigen presentation. The GO and KEGG pathway terms most significantly enriched by the DAGs in this control analysis still included embryonic developmental processes. The KEGG pathway terms PI3K-Akt signaling pathway and MAPK signaling pathway were still enriched by the same DAGs, with the exception of FGFR2 instead of VEGFA in both cases; however, the
p-values were no longer significant owing to the increased number of differentially expressed genes. The DAGs in this control analysis also exhibited enrichment of several terms related to mitosis as well as chromatin binding, the latter in part caused by the inclusion of several genes encoding histone proteins; the main finding that was lost in this control analysis was the DAG VEGFA, and thereby several GO and KEGG pathway terms related to angiogenesis. Complete lists of the differentially expressed genes yielded by this control analysis, as well as the GO and KEGG pathway terms significantly enriched by them, are provided in
Supplementary Tables S3–S5.
3.5. Differential Expression Test of DAGs Based on MCPyV Status
The differential expression test of genes between MCPyV-positive and MCPyV-negative tumors of patients who died from MCC yielded 100 genes significantly upregulated in MCPyV-negative tumors. Of these genes, eight (VSIG8, NEDD4L, ITGA6, KIF23, IGF2, H1-3, DST, COL21A1) were among the DAGs. A complete list of these 100 genes is provided in
Supplementary Table S6.
3.6. Correlation between Tumor-Infiltrating Lymphocytes and SAGs Related to Antigen Processing and Presentation
The correlation analysis between the abundance of CD3-positive, tumor-infiltrating lymphocytes and the expression levels of the five SAGs causing the enrichment of the KEGG pathway antigen processing and presentation (CD74, TAP2, PSME1, HLA-DRA, and HLA-E) revealed a significant positive Pearson’s product–moment correlation in each case. Detailed results are presented in
Supplementary Table S7.
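The Pearson product–moment correlation coefficient used in this type of analysis can be sketched as follows. The function name `pearson_r` and the toy values are hypothetical; in particular, the scale on which lymphocyte abundance was scored is not specified here, so the integer scores below are an assumption made purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical lymphocyte scores and expression values (not data from this study).
toy_til = [0, 1, 2, 3, 4]
toy_expr = [1.0, 2.1, 2.9, 4.2, 5.0]
r_toy = pearson_r(toy_til, toy_expr)
```

A coefficient close to 1 indicates that expression rises nearly linearly with lymphocyte abundance; the significance of the coefficient would be assessed separately.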
3.7. Comparison of FFPE Data to Data from Fresh Frozen Tissues
Out of the 250 genes most significantly differentially expressed between MCPyV-positive and MCPyV-negative tumors in GSE39612, 136 genes were present in our FFPE data. The correlation analysis between the logFC-values of these 136 genes based on MCPyV status in GSE39612 and our data yielded a Pearson correlation coefficient of 0.82 (
p < 0.001). A list of these 136 genes, together with their logFC-values in GSE39612 and our data, as well as a scatter plot of their expression, is provided in
Supplementary Table S8.
4. Discussion
We found a dichotomous gene expression profile between the tumors of the poor and good survival groups. Many of the DAGs, which were upregulated in the poor survival group, function in various oncogenic pathways and processes. The SAGs, which were upregulated in the good survival group, were to a large extent immune-response-related genes.
We found a clear association of DAGs with angiogenesis. Angiogenesis plays a crucial role in the progression of solid tumors, and increased tumor vascularization is a factor predicting poor prognosis in MCC [
22]. VEGFA, a proangiogenic growth factor that was among the DAGs, has been found to be expressed in the majority of MCC tumors based on immunohistochemistry results. Overexpression of VEGFA has been shown to correlate with metastatic tumor spread of MCC [
23,
24].
Five of the DAGs were associated with the KEGG pathway term PI3K-Akt signaling pathway, and four of the DAGs were associated with the KEGG pathway term MAPK signaling pathway. Furthermore, although not recognized in the KEGG pathway database, the DAGs SULF2, RBFOX3, and COL11A1 have been reported to upregulate the PI3K-Akt signaling pathway and the DAGs ITGA6, SMYD3, and ENC1 have been reported to upregulate the MAPK signaling pathway in other malignancies [
25,
26,
27,
28,
29,
30,
31]. MCPyV small tumor antigen has previously been demonstrated to activate p38 MAPK signaling in MCC, and immunohistochemical findings have demonstrated high degrees of activating AKT phosphorylation in MCC [
32,
33]. The PI3K-Akt and MAPK signaling pathways both serve oncogenic roles in several malignancies, such as stimulating cellular proliferation, inhibiting apoptosis, promoting tumor invasion and metastasis, and stimulating angiogenesis, which in the case of the MAPK pathway is partially mediated by upregulation of VEGFA [
34,
35].
Among the DAGs were also insulin-like growth factor 2 (IGF2), a protumorigenic growth factor, and IGFBP5, which can either inhibit or potentiate insulin-like growth factor-signaling depending on the context [
36]. Insulin-like growth factor binding, insulin-like growth factor I binding, and insulin-like growth factor II binding were among the most enriched GO MF terms. MCC has previously been shown to express insulin-like growth factor-I receptor, but to the best of our knowledge, insulin-like growth factor signaling in MCC has not otherwise been reported [
37].
Considering clinical implications, the aforementioned pathways and processes include several potentially viable targets for pharmacological intervention. Kervarrec et al. suggested that VEGFA may be a potential therapeutic target in MCC following promising results from drug trials in mouse models using the monoclonal antibody bevacizumab [
38]. The PI3K-Akt and the MAPK signaling pathways are two well-known oncogenic pathways, for which there are numerous established methods of pharmacological inhibition used in the treatment of other malignancies. These include PI3K, Akt, and mTOR inhibitors for the PI3K-Akt pathway and BRAF and MEK inhibitors for the MAPK pathway [
39,
40]. Furthermore, the insulin-like growth factor pathway constitutes another oncogenic pathway that can be targeted by blocking the IGF-1 receptor or its ligands IGF-1 and IGF-2 [
41].
Other notable DAGs included MYB, DST, and KIF23. MYB (c-MYB) encodes an oncogenic transcription factor that regulates processes such as cell proliferation and apoptosis in several other malignancies. Small-molecule inhibitors of c-MYB, such as celastrol and plumbagin, have shown promising results in cell cultures and mouse models [
42]. DST encodes a barrier protein that supports melanoma cell growth in vitro and in vivo, likely by interfering with immune cell infiltration or by enhancing angiogenesis [
43]. KIF23 encodes a microtubule-associated motor protein involved in the regulation of cytokinesis. Upregulation of KIF23 increases cell proliferation and worsens prognosis in other malignancies, such as gastric cancer. Knockdown of KIF23 resulted in marked inhibition of proliferation in gastric cancer [
44]. Other DAGs, the silencing of which has been demonstrated to inhibit cell proliferation or increase apoptosis in other malignancies, include MLF1, MELTF, CIT, RRBP1, CHD7, and MEIS2 [
45,
46,
47,
48,
49,
50].
Another notable finding concerning the DAGs was the enrichment of developmental pathways and processes. This may suggest that the cancer cells of more aggressive MCC revert to a more primitive, stem-cell-like state. An embryonic stem-cell-like gene expression pattern has been found to correlate with poor tumor differentiation and poor prognosis in other malignancies [
51]. So-called cancer stem cells that bear specific cell-surface markers and possess the abilities of self-renewal, induction of metastasis, evasion of apoptosis, and resistance to conventional cancer treatments have been described in several other malignancies, but have not yet been characterized in MCC. Advances have been made in specifically targeting these cells, including through immunotherapy and gene therapy [
52].
For SAGs, we found a clear association with pathways and processes related to immune response. Almost all of the most significantly enriched GO BP, GO MF, and KEGG pathway terms were immune-response-related and were especially related to antigen processing, MHC-dependent antigen presentation, and T-cell mediated immune response. These findings underline the importance of a functional immune system, capable of creating a hostile tumor microenvironment, for MCC-specific survival. These findings are also consistent with previous findings on the abundance of tumor-infiltrating lymphocytes, notably CD8-positive T-cells, as a strong predictor of good survival in MCC [
6,
8,
9,
11].
The positive correlation between the abundance of CD3-positive, tumor-infiltrating lymphocytes and the expression levels of the SAGs causing the enrichment of the KEGG pathway antigen processing and presentation suggests that the abundance of CD3-positive lymphocytes functions as a surrogate marker for an immunogenic gene expression signature.
Notable SAGs involved in processes not related to immune response included DUSP2, LLGL1, STAT1, and OGFR. DUSP2 encodes a phosphatase that deactivates protumorigenic MAP kinases; loss of DUSP2 predicts a poor prognosis in bladder cancer [
53]. LLGL1 encodes a cytoskeleton-associated protein involved in maintaining cell polarity; the loss of LLGL1 is associated with a loss of cellular adhesion, dissemination of cells, and distant metastases in several cancers including gastric cancer and malignant melanoma [
54,
55]. STAT1 encodes a protein that serves tumor-suppressive functions in many cancers and has been recognized as a potential biomarker for patient selection for treatment with anti-PD-1/anti-PD-L1 antibodies in breast cancer, as p-STAT1 correlates with higher PD-L1 and HLA class I expression [
56,
57]. Upregulation of opioid growth factor signaling through OGFR (opioid growth factor receptor) suppresses proliferation in several other malignancies, including lung and ovarian cancer [
58,
59].
Owing to the strong correlation between MCPyV-negativity and MCC-specific death, we repeated the differential expression test of genes based on MCPyV status within the poor prognosis group. Of the 50 DAGs, eight were significantly upregulated in MCPyV-negative tumors, suggesting that their prognostic relevance resulted at least in part from an association with MCPyV-negativity. The remaining 42 DAGs were not significantly differentially expressed based on MCPyV status within the poor prognosis group, suggesting a prognostic relevance regardless of viral status.
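The partition described above amounts to a set intersection and difference. The following minimal sketch uses the eight overlapping genes named in Section 3.5 together with a few other DAGs mentioned in the text; the function name `split_by_viral_association` is hypothetical, the DAG list is deliberately partial, and `GENE_X` is a placeholder rather than a real result.

```python
def split_by_viral_association(dags, upregulated_in_negative):
    """Split DAGs into those also upregulated in MCPyV-negative tumors and the rest."""
    dag_set = set(dags)
    negative_set = set(upregulated_in_negative)
    associated = sorted(dag_set & negative_set)
    independent = sorted(dag_set - negative_set)
    return associated, independent


# Partial DAG list for illustration: the eight overlapping genes plus three others.
partial_dags = [
    "VSIG8", "NEDD4L", "ITGA6", "KIF23", "IGF2", "H1-3", "DST", "COL21A1",
    "VEGFA", "MYB", "AKT3",
]
# Genes upregulated in MCPyV-negative tumors; GENE_X is a hypothetical placeholder.
negative_up = [
    "VSIG8", "NEDD4L", "ITGA6", "KIF23", "IGF2", "H1-3", "DST", "COL21A1",
    "GENE_X",
]
associated, independent = split_by_viral_association(partial_dags, negative_up)

# With the full list of 50 DAGs, the remainder is 50 minus the overlap.
remaining_of_50 = 50 - len(associated)
```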
It should be noted that the average quality of the extracted RNA was fairly poor (average RIN of 2.1), as is often the case with FFPE samples. However, it has been shown that 3′ tag counting, such as that used in this study, markedly decreases the number of false positives when studying differential expression of genes in samples of varying RIN, at the expense of decreased sensitivity [
60]. This study utilized a sequencing pipeline optimized for FFPE samples. Specifically, QuantSeq sequences only the 3′ end of the RNA transcript, thus significantly reducing the impact of partial RNA fragmentation. This is in contrast to, for example, Illumina’s TruSeq, which sequences most of the transcript. In a study comparing Lexogen’s QuantSeq and Illumina’s TruSeq, there was a strong correlation between the methods concerning the average expression values for all expressed genes. Both methods identified a similar number of expressed protein-coding genes, with QuantSeq identifying approximately 94% of the protein-coding genes found by TruSeq [
15]. The correlation analysis between GSE39612 and our transcriptomic data revealed a strong correlation between the logFC-values based on the MCPyV status of the 136 genes studied. This suggests a high specificity of our FFPE data as compared with data from fresh frozen tissues when studying differentially expressed genes. Another limitation of the study design is that, because we analyzed bulk transcriptomic data, we cannot determine whether a given expression signature originates from a specific subset of cells in the tumor or represents gene expression of the tumor overall.