The Identification of RNA Modification Gene PUS7 as a Potential Biomarker of Ovarian Cancer

Simple Summary: RNA modifications are involved in a variety of diseases, including cancers. Given the lack of efficient and reliable biomarkers for early diagnosis of ovarian cancer (OV), this study was designed to explore the role of RNA modification genes (RMGs) in the diagnosis of OV. The study first selected PUS7 (Pseudouridine Synthase 7) as a diagnostic biomarker candidate through the analysis of differentially expressed genes using TCGA and GEO data. Then, we evaluated its specificity and sensitivity using Receiver Operating Characteristic (ROC) analysis in TCGA and GEO data. The protein expression, mutation, protein interaction networks, correlated genes, related pathways, biological processes, cell components, and molecular functions were analyzed for PUS7 as well. The upregulation of PUS7 protein in OV was confirmed by the staining images in HPA and tissue arrays. In conclusion, the findings of the present study point towards the potential of PUS7 as a diagnostic marker and therapeutic target for ovarian cancer.
Abstract: RNA modifications are reversible, dynamically regulated, and involved in a variety of diseases such as cancers. Given the lack of efficient and reliable biomarkers for early diagnosis of ovarian cancer (OV), this study was designed to explore the role of RNA modification genes (RMGs) in the diagnosis of OV. Herein, 132 RMGs were retrieved from PubMed, and 638 OV and 18 normal ovary samples from The Cancer Genome Atlas (TCGA) and GSE18520 cohorts were collected for differential analysis. Finally, PUS7 (Pseudouridine Synthase 7), one of the differentially expressed RMGs (DEGs-RMGs), was identified as a diagnostic biomarker candidate and evaluated for its specificity and sensitivity using Receiver Operating Characteristic (ROC) analysis in TCGA and GEO data. The protein expression, mutation, protein interaction networks, correlated genes, related pathways, biological processes, cell components, and molecular functions of PUS7 were analyzed as well. The upregulation of PUS7 protein in OV was confirmed by the staining images in HPA and tissue arrays. Collectively, the findings of the present study point towards the potential of PUS7 as a diagnostic marker and therapeutic target for ovarian cancer.


Introduction
Ovarian cancer (OV) is the leading cause of death among gynecologic malignancies in most developed countries [1,2]. It accounts for an estimated 239,000 new cases and 152,000 deaths worldwide annually [3]. A woman's lifetime risk of developing ovarian cancer is approximately 1 in 78, and her lifetime risk of dying of ovarian cancer is approximately 1 in 108 [4]. Four out of five OV patients are diagnosed at an advanced stage [5], and of these, only 30% survive more than 5 years [4].
The lack of a practical screening strategy and the asymptomatic characteristic of OV contribute to the late presentation of the disease. Hence, the efficient and early detection of OV is pivotal to improving the survival of ovarian cancer patients.
Post-transcriptional modifications affect RNA stability, localization, structure, splicing, or function [6]. Different RNAs have been detected to contain numerous types of modifications [7,8]. For example, mRNA modifications include N6-methyladenosine (m6A), inosine (I), 5-methylcytosine (m5C), and 5-hydroxymethylcytosine (hm5C). Deregulated RNA modifications are reported to be associated with several pathological processes such as tumorigenesis, cardiovascular diseases, and neurological disorders [9]. RNA modification enzymes have been generally considered important decorations for RNAs [10], and dysregulation and mutation in RNA modification genes are involved in the development of numerous cancers including lung cancer, bladder cancer, leukemia, prostate cancer, breast cancer, etc. [11]. For example, Alpha-Ketoglutarate Dependent Dioxygenase (FTO) was deciphered as a prognosticator for lung squamous cell carcinoma and promoted cell proliferation and invasion [12]. Methyltransferase Like 3 (METTL3), acting as an oncogene in lung cancer, upregulated EGFR and TAZ expression and promoted growth, survival, and invasion of human lung cancer cells [13]. NOP2/Sun RNA Methyltransferase 2 (NSUN2) was reported to be overexpressed in breast cancer and to be associated with cancer progression [14]. Elongator Acetyltransferase Complex Subunit 3 (ELP3), responsible for mcm5s2 modification, has been found to be upregulated in breast cancer and to facilitate cancer cell metastasis [15]. tRNA methyltransferase 9B (TRM9L/TRMT9B) has been shown to be downregulated in breast cancer [16]. Similarly, in renal cell carcinomas, G3BP Stress Granule Assembly Factor 1 (G3BP1) has been shown to promote tumor progression and metastasis [17]. Taken together, RNA modification genes play pivotal roles in human cancers.
Pseudouridine synthases (PUS) are divided into six families (TruA, TruB, TruD, RsuA, RluA, and Pus10) [18]. PUS7 is the only member of the TruD family that is involved in the modification of tRNAs, at position Tyr35 in pre-tRNA, at position 13 in cytoplasmic tRNA, and at numerous nucleotides in mRNAs. PUS7 is the only pseudouridine synthase to possess a consensus sequence (UGUAR) for substrate recognition [19]. PUS7 was also reported to be associated with human myeloid malignancies in embryonic stem cells [20]. However, no reports so far have described the role of PUS7 in OV.
In this study, PUS7 was identified as a novel and potential biomarker for early diagnosis, using transcriptional profiles in the GEO and TCGA databases, ROC, HPA, and Oncomine analyses. Protein-protein interaction (PPI); GSEA pathway; and GO analyses, including the biological process (BP), cell component (CC), and molecular function (MF) terms, were also performed to provide in-depth insights into PUS7.

Data Collection
The RMGs were collected from PubMed using the keyword "RNA modification". The transcriptome profiles, including datasets GSE18520 and TCGA, were obtained from GEO (https://www.ncbi.nlm.nih.gov/gds, accessed on 15 October 2019) [21] and UCSC Xena (https://xena.ucsc.edu/, accessed on 16 October 2019) [22], respectively. A total of 53 OV and 10 normal cases in GSE18520 (platform: GPL570), and 585 OV and 8 normal cases in TCGA (Affymetrix Human Genome U133 Plus 2.0 Array), were used for the following analyses.

Differential Expression Analysis
GEO2R, an interactive web tool that allows users to compare gene expression between different groups of samples in a GEO dataset, was used to identify the differentially expressed genes (DEGs). SangerBox was used to analyze the TCGA expression profile of ovarian cancer. A p value < 0.05 and |log2 FC| > 1 were used as the cut-off criteria to screen out DEGs. The DEGs of the two datasets are listed in Supplementary Table S1. Subsequently, the RMGs and DEGs that overlapped between GSE18520 and TCGA were selected using Venny 2.1 and were used for further analysis. The volcano plots of DEGs in GSE18520 and TCGA, and the heatmaps of DEGs-RMGs in GSE18520 and TCGA, were obtained through the SangerBox web tool.
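The screening logic described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs and the overlap step, not the GEO2R, SangerBox, or Venny implementation; the function names (`select_degs`, `overlapping_rmgs`), the gene labels, and all numeric values are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Per-gene (log2 fold change, p value). Toy values, not data from the study.
Stats = Dict[str, Tuple[float, float]]


def select_degs(stats: Stats, p_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Set[str]:
    """Keep genes with p < p_cutoff and |log2 FC| > lfc_cutoff."""
    return {
        gene
        for gene, (log2_fc, p_value) in stats.items()
        if p_value < p_cutoff and abs(log2_fc) > lfc_cutoff
    }


def overlapping_rmgs(rmgs: Iterable[str], *deg_sets: Set[str]) -> Set[str]:
    """Intersect the RMG list with every DEG set (the shared Venn region)."""
    common = set(rmgs)
    for degs in deg_sets:
        common &= degs
    return common


# Hypothetical cohorts standing in for the two expression datasets.
cohort_a: Stats = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (0.4, 0.0001),   # significant, but fold change too small
    "GENE_C": (-2.1, 0.20),    # large change, but not significant
    "GENE_D": (-1.5, 0.01),
}
cohort_b: Stats = {
    "GENE_A": (1.2, 0.01),
    "GENE_B": (1.6, 0.002),
    "GENE_D": (1.3, 0.03),
}
rmg_list = ["GENE_A", "GENE_B", "GENE_E"]  # hypothetical RMG list

degs_a = select_degs(cohort_a)
degs_b = select_degs(cohort_b)
shared = overlapping_rmgs(rmg_list, degs_a, degs_b)
print(sorted(shared))
```

Note that the absolute-value filter keeps both up- and downregulated genes, so a gene can pass in both cohorts while changing in opposite directions; direction has to be checked separately.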

PUS7 Protein Level Analysis of OV Tissues in HPA and Tissue Array
The protein expression of PUS7 was analyzed using HPA data [23]. A tissue chip (HOvaC070PT01) was purchased from SHANGHAI OUTDO BIOTECH CO., LTD. A total of 12 OV samples and 2 healthy ovary samples, and 65 OV samples and 5 healthy ovary samples, were retrieved from HPA and the tissue array, respectively. One case with an equivocal staining result was excluded, and the baseline characteristics of the remaining 64 cases of OV tissues in the tissue array are described in Supplementary Table S2. The immunohistochemistry (IHC) staining intensity was graded from 0 to 3 (0, negative; 1, weak; 2, moderate; and 3, strong). In the HPA database, the staining quantity was graded from 0 to 3 (0, none; 1, <25%; 2, 25-75%; and 3, >75%) according to the percentage of positive cells. In the tissue array, the staining quantity was graded from 0 to 4 (0, none; 1, <25%; 2, 25-50%; 3, 50-75%; and 4, >75%). The staining scores were calculated by multiplying the staining intensity by the staining quantity.
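As a worked illustration of the scoring scheme, the following Python sketch multiplies the intensity grade by the quantity grade for either scale. The function names and the `scale` labels are hypothetical. The text does not say which grade a value exactly on a boundary (for example, 50%) receives, so the sketch assumes that boundary values fall into the lower grade.

```python
def quantity_grade(percent_positive: float, scale: str) -> int:
    """Grade the percentage of positive cells.

    scale="hpa":   0 none; 1 <25%; 2 25-75%; 3 >75%
    scale="array": 0 none; 1 <25%; 2 25-50%; 3 50-75%; 4 >75%
    Assumption: a value exactly on a shared boundary gets the lower grade.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive < 25:
        return 1
    if scale == "hpa":
        return 2 if percent_positive <= 75 else 3
    if scale == "array":
        if percent_positive <= 50:
            return 2
        return 3 if percent_positive <= 75 else 4
    raise ValueError("scale must be 'hpa' or 'array'")


def staining_score(intensity: int, percent_positive: float, scale: str) -> int:
    """Staining score = intensity grade (0-3) x quantity grade."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * quantity_grade(percent_positive, scale)


# The same slide reading scores differently on the two scales.
print(staining_score(3, 80, "hpa"))
print(staining_score(3, 80, "array"))
```

Because the two quantity scales have different maxima (3 vs. 4), scores from the HPA data and the tissue array are not directly comparable.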

PUS7 Gene Expression Analysis Using TCGA and GEO Datasets
The PUS7 expression analysis was carried out using TCGA and GSE119056 expression profiling data. An ROC analysis (a method frequently used to assess binary classification) was subsequently performed to evaluate how effectively the expression level of a gene of interest discriminates between OV and healthy samples. The area under the curve (AUC) value ranges from 0.5 to 1.0, corresponding to 50 to 100% discrimination ability.
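To make the AUC concrete, the sketch below computes it as the probability that a randomly chosen tumor sample has higher expression than a randomly chosen normal sample, with ties counted as one half. This is a standard equivalent of the area under the empirical ROC curve; it is not the GraphPad Prism procedure used in the study. The function names and all expression values are hypothetical.

```python
from typing import Sequence, Tuple


def roc_auc(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """AUC = P(tumor value > normal value), counting ties as 0.5."""
    if not tumor or not normal:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))


def sensitivity_specificity(tumor: Sequence[float], normal: Sequence[float],
                            cutoff: float) -> Tuple[float, float]:
    """Call a sample 'tumor' when its expression is >= cutoff."""
    true_pos = sum(1 for v in tumor if v >= cutoff)
    true_neg = sum(1 for v in normal if v < cutoff)
    return true_pos / len(tumor), true_neg / len(normal)


# Toy expression values, not data from the study.
tumor_expr = [7.9, 8.4, 6.1, 9.0, 7.2]
normal_expr = [5.8, 6.3, 6.0, 7.2]

auc = roc_auc(tumor_expr, normal_expr)
sens, spec = sensitivity_specificity(tumor_expr, normal_expr, cutoff=7.0)
print(auc, sens, spec)
```

An AUC of 0.5 means the two groups are indistinguishable by expression alone, and 1.0 means every tumor value exceeds every normal value.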

PUS7 Gene Expression Analysis Using Oncomine Database
The gene expression of PUS7 was explored using the Oncomine database (https://www.oncomine.org/resource/main.html, accessed on 25 October 2019) [24]. The Oncomine database applies a combination of threshold values (p-value) and fold change (FC, tumors vs. controls) with p ≤ 0.05 and fold change >1.

Pathways and BP, CC, and MF Analyses
Gene set enrichment analysis (GSEA) was carried out to identify potential cellular pathways involving PUS7. The TCGA-OV dataset was divided into a high group (top 25%) and a low group (remaining 75%) based on PUS7 mRNA expression. A nominal p-value < 0.01 and a false discovery rate (FDR) q-value < 0.05 were considered significant for enriched gene set analysis. Using 312 genes positively correlated (R > 0.3, p < 0.05) with PUS7 derived from the cBioPortal analysis, the BP, CC, and MF analyses were carried out through the Database for Annotation, Visualization, and Integrated Discovery (DAVID, https://david.ncifcrf.gov/, accessed on 19 November 2020) [29] and visualized with bubble diagrams based on p values < 0.05.
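The grouping step that precedes GSEA can be sketched as follows. This only illustrates splitting samples into a top-25% group and the remaining 75% by expression; it does not run GSEA itself. The function name, sample identifiers, and values are hypothetical, and rounding the group size down (with a minimum of one sample) is an assumption.

```python
from typing import Dict, List, Tuple


def split_high_low(expression: Dict[str, float],
                   high_fraction: float = 0.25) -> Tuple[List[str], List[str]]:
    """Rank samples by expression (highest first) and cut at high_fraction."""
    if not 0 < high_fraction < 1:
        raise ValueError("high_fraction must be between 0 and 1")
    ranked = sorted(expression, key=lambda sample: expression[sample],
                    reverse=True)
    # Assumption: round the group size down, but keep at least one sample.
    n_high = max(1, int(len(ranked) * high_fraction))
    return ranked[:n_high], ranked[n_high:]


# Toy PUS7 expression per sample, not data from the study.
pus7_expr = {
    "S1": 5.1, "S2": 9.3, "S3": 7.7, "S4": 6.0,
    "S5": 8.8, "S6": 4.2, "S7": 6.5, "S8": 7.0,
}

high_group, low_group = split_high_low(pus7_expr)
print(high_group, low_group)
```

The two sample lists would then serve as the phenotype labels for an enrichment run.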

Statistical Analysis
The statistical analyses were performed using SPSS ver. 26.0. The Student's t-test and the rank-sum test were used to evaluate the difference in PUS7 expression between the OV and normal samples. The ROC curve was constructed from the PUS7 expression profiles of the OV and normal samples using GraphPad Prism 8.0. A p value < 0.05 was considered statistically significant.

The Identification of DEGs-RMGs of OV Data in TCGA and GEO
A total of 132 RMGs (Supplementary Table S3) were retrieved from PubMed. TCGA AffyU133a expression profiles and GSE18520 cohorts of ovarian cancer were downloaded from UCSC Xena and the GEO databases, respectively. A total of 1142 and 5215 DEGs (Supplementary Table S1) between the OV and normal samples were obtained in the TCGA dataset and the GSE18520 dataset through SangerBox-limma and GEO2R analysis, respectively, and the volcano plots of DEGs are presented in Figure 1A,B. The RMGs and DEGs from the two cohorts were intersected to screen out the overlapping RMGs and DEGs for diagnostic biomarker analysis. As a result, two genes, WDR77 and PUS7, were identified as differentially expressed RMGs (Figure 1B). WDR77 was excluded because its expression trend between OV and normal samples was opposite in TCGA and GSE18520 (Figure 2A,B). In contrast, PUS7 showed consistently higher expression in OV than in normal cases; thus, PUS7 could be a potential diagnostic biomarker and was subjected to further analyses.

Expression Validation and Mutation Analysis for PUS7 in Ovarian Cancer
To validate the overexpression of PUS7 in OV relative to normal samples, an Oncomine analysis was performed on ovarian cancer with different pathological types. It showed that PUS7 expression is elevated in OV samples, with fold change >1 and p < 0.05 (as presented in Figure 3A,B and Table 1). Moreover, Figure 3C,D displays the corresponding ROC curves of PUS7 in the TCGA and GSE18520 datasets, indicating the remarkable potential of PUS7 to discriminate OV from normal tissue. The IHC results showed the overexpression of PUS7 at the protein level (Figure 4A,B). To further explore the overexpression of PUS7 at the protein level in OV samples, a tissue array analysis was performed. Typical staining images in the tissue array are shown in Figure 4C, confirming the protein upregulation of PUS7 in OV tissues (Figure 4D). Since mutations in RNA modification genes have been reported to be associated with several types of human cancers, the mutation analysis of PUS7 was performed in cBioPortal, demonstrating the fusion of PUS7 with SRSF Protein Kinase 2 (SRPK2) in serous ovarian cancer (Table 2).
Figure 3C,D. The ROC analysis of PUS7 between OV and normal samples in TCGA and GSE18520 cohorts. The ROC curve is plotted as sensitivity% vs. 100% - specificity%. A p < 0.05 was considered a significant difference.

The Pathway Enrichment Analysis of PUS7 in Ovarian Cancer
To investigate the pathways that PUS7 may be involved in or may regulate in ovarian cancer, a GSEA pathway analysis was performed using TCGA data, which were separated into a high PUS7 group (top 25%) and a low PUS7 group (bottom 75%). The top eight pathways in which PUS7 participates are DNA replication, the cell cycle, mismatch repair, spliceosomes, homologous recombination, RNA polymerase, aminoacyl tRNA biosynthesis, and one carbon pool by folate in ovarian cancer (Figure 6). Among the eight pathways, the top two are DNA replication and the cell cycle, both of which are linked to ovarian cancer cell proliferation. These results may imply that the overexpression of PUS7 in ovarian cancer promotes ovarian cancer proliferation via regulation of DNA replication and the cell cycle.
Figure 6. The pathway enrichment analysis of PUS7 in ovarian cancer. GSEA pathway analysis using TCGA ovarian cancer data, which were separated into a high PUS7 group (top 25%) and a low PUS7 group (bottom 75%). The top eight pathways in which PUS7 participates were DNA replication, the cell cycle, mismatch repair, spliceosomes, homologous recombination, RNA polymerase, aminoacyl tRNA biosynthesis, and one carbon pool by folate in ovarian cancer.

Gene Ontology (GO) Analyses of PUS7 in Ovarian Cancer
To further clarify the GO terms of BP (biological processes), CC (cellular components), and MF (molecular functions) of PUS7, a total of 312 genes (Supplementary Table S4) positively correlated with PUS7 (R > 0.3, p < 0.0001) in the TCGA ovarian cancer data, as retrieved through the cBioPortal database, were subjected to DAVID analysis. The results showed that the biological processes in which PUS7 mainly participates include the regulation of DNA-templated transcription, rRNA processing, tRNA export from the nucleus, the regulation of glucose transport, the intracellular transport of viruses, mitotic nuclear envelope disassembly, viral processes, RNA processing, the regulation of cellular response to heat, gene silencing by RNA, and the positive regulation of gene expression (Figure 7A). The cellular components affected by PUS7 include the nucleoplasm, nucleolus, nucleus, small subunit processome, nuclear envelope, and nuclear membrane (Figure 7B). The molecular functions of PUS7 include poly(A) RNA binding, nucleic acid binding, helicase activity, ATP binding, ATP-dependent RNA helicase activity, structural constituents of a nuclear pore, DNA binding, RNA binding, protein binding, single-stranded DNA binding, nucleocytoplasmic transporter activity, DNA replication origin binding, ATP-dependent helicase activity, and nucleotide binding (Figure 7C).
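The selection of positively correlated genes can be illustrated with a short Python sketch. It assumes a Pearson coefficient, since the text does not state which coefficient cBioPortal reported, and it applies only the R > 0.3 cut-off; the p-value filter is omitted because it would need a t distribution that the standard library does not provide. All names and values are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def positively_correlated(target: Sequence[float],
                          candidates: Dict[str, Sequence[float]],
                          r_cutoff: float = 0.3) -> List[str]:
    """Return candidate genes whose correlation with target exceeds r_cutoff."""
    return sorted(
        gene for gene, values in candidates.items()
        if pearson_r(target, values) > r_cutoff
    )


# Toy expression across five samples, not data from the study.
pus7 = [1.0, 2.0, 3.0, 4.0, 5.0]
others = {
    "GENE_X": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated
    "GENE_Y": [5.0, 4.0, 3.0, 2.0, 1.0],   # anti-correlated
    "GENE_Z": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly correlated
}

selected = positively_correlated(pus7, others)
print(selected)
```

The resulting gene list is the kind of input that would be submitted to an annotation tool such as DAVID for GO term enrichment.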

Discussion
It was estimated that there were 22,530 new cases and 13,980 deaths due to ovarian cancer in the United States in 2019 [34]. Ovarian cancers are often diagnosed late, when the disease has progressed to advanced stages. Hence, an efficient and reliable diagnostic marker is needed to facilitate clinical diagnosis and to prolong survival in OV. RNA modifications are reported to play vital roles in human diseases, including cancer. For example, m6A, a new star of RNA modifications, is associated with tumorigenesis, tumor proliferation, and differentiation, and its regulators function as oncogenes or anti-oncogenes in malignant tumors [35]. In particular, m6A plays a pivotal role in ovarian cancer progression [36]. Recent advances in human Mendelian diseases have brought focus to human PUS genes as a type of RMG in clinical medicine [37]. PUS7-mediated pseudouridylation could "activate" a class of tRNA-derived small RNAs to regulate protein synthesis and stem cell fate [20]. Additionally, PUS7 is reported to be a potential biomarker for glioma [38].
In this study, we investigated dysregulated RMGs in ovarian cancer and identified PUS7 as a novel potential biomarker for the diagnosis of OV. ROC analysis is an efficient method that has been commonly used to determine the accuracy and specificity of medical imaging techniques and non-imaging diagnostic tests in various settings involving disease screening, prognosis, diagnosis, staging, and treatment [39]. Herein, ROC analysis aimed at discriminating cancer from normal tissue was performed to evaluate the sensitivity and specificity of PUS7 in GEO and TCGA data. AUC is a global measure of the ability of a test to discriminate whether a specific condition is present [40]. In this study, an AUC over 0.9 was obtained in the ROC analysis (AUC = 0.9404, p < 0.0001), suggesting the strong discriminating power of PUS7 in ovarian cancer. In addition to PUS7 upregulation in the TCGA and GEO datasets, the Oncomine database analysis and IHC results further supported the promising diagnostic role of PUS7 in OV.
PUS7 has not previously been reported in ovarian cancer. To rationalize the role of PUS7 in OV, we explored the proteins interacting with PUS7, which may partially help explain PUS7 function in tumor diagnosis, tumorigenesis, and development. The PPI and gene network analyses identified PUS7 interacting partners, including NOC3L and PUS1, which have also not been reported in ovarian cancer, although several reports have revealed that NOC3L regulates the proliferation and tumorigenesis of gastric cancer [41] and that NOC3L is associated with an increased risk of gastric cancer in the Chinese Han population [42]. For PUS1, previous reports demonstrated that it is related to sideroblastic anemia [43], and no association of PUS1 with cancer has been shown, suggesting the novelty of the protein interaction. To further explore the signaling pathways of PUS7 in ovarian cancer, a GSEA pathway analysis was performed, which showed that DNA replication and the cell cycle are the top two pathways associated with PUS7. These results point towards a role of PUS7 in ovarian cancer proliferation via regulation of DNA replication and the cell cycle. However, this hypothesis requires further experimental validation.

Conclusions
In conclusion, the findings of the present study revealed PUS7 as a novel and prospective biomarker at the RNA and protein levels for ovarian cancer. Further analysis indicated that PUS7 may interact with NOC3L and PUS1 to regulate ovarian cancer proliferation via modulation of DNA replication and the cell cycle.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/biology10111130/s1, Table S1: The differentially expressed genes in ovarian cancer samples compared with normal tissue according to TCGA and GSE18520 data. Table S2: The baseline characteristics of ovarian cancer samples in a tissue array. Table S3: RNA modification-related genes. Table S4: The genes correlated to PUS7.