Comprehensive Analysis Identifies Ameloblastin-Related Competitive Endogenous RNA as a Prognostic Biomarker for Testicular Germ Cell Tumour

Simple Summary
Testicular germ cell tumour is a common tumour in young males, and although it is one of the most curable cancers, many patients still experience recurrence after chemotherapy. Tumour recurrence is not detected with high sensitivity by established blood tumour markers. Ameloblastin is an extracellular matrix protein and has been shown to be associated with tumour progression. We validated ameloblastin's expression in testicular tissue, and used comprehensive bioinformatics analysis of 156 patients with testicular germ cell tumour to show that the level of ameloblastin was associated with the time of tumour recurrence after the first cure. Analysis of the genes differentially expressed with ameloblastin in the tumour identified a ceRNA (competing endogenous RNA) regulatory network associated with tumour diagnosis, and an independent prognostic factor for the tumour, PELATON (Plaque Enriched LncRNA In Atherosclerotic And Inflammatory Bowel Macrophage Regulation), which could provide evidence for prediction of tumour prognosis.

Abstract
Testicular Germ Cell Tumour (TGCT) is one of the most common tumours in young men. Increasing evidence shows that the extracellular matrix has a key role in the prognosis and metastasis of various human cancers. This study analysed the relationship between the matrix protein ameloblastin (AMBN) and potential biological markers associated with TGCT diagnosis and prognosis. The relationship between AMBN and TGCT prognosis was determined by bioinformatic analysis using the expression profiles of three RNAs (long non-coding RNAs (lncRNAs), microRNAs (miRNAs) and mRNAs) from The Cancer Genome Atlas (TCGA) database, and available clinical information of the corresponding patients. Prediction and validation of competitive endogenous RNA (ceRNA) regulatory networks related to AMBN were performed. AMBN and its associated ceRNA regulatory network were found to be related to the recurrence of TGCT, and LINC02701 may be used as a diagnostic factor in TGCT. Furthermore, we identified PELATON (Plaque Enriched LncRNA In Atherosclerotic And Inflammatory Bowel Macrophage Regulation) as an independent prognostic factor for TGCT progression-free interval.


Introduction
Testicular Germ Cell Tumour (TGCT) is the most common malignancy in men between 20 and 40 years of age [1]. It is estimated that the number of new European TGCTs will

Data Preparation
Clinical information and raw data (RNA sequencing data profiles) were analysed for the TGCT patients from the TCGA database (https://portal.gdc.cancer.gov/, accessed date 15 September 2021). Patients' primary diagnoses included: embryonal carcinoma, seminoma, teratoma, mixed germ cell tumour, yolk sac tumour and teratocarcinoma (for more information see Tables S1 and S2). TPM RNAseq data from the UCSC XENA Project (https://xenabrowser.net/datapages/, accessed date 15 September 2021) [31], which included the TCGA and GTEx RNA sequencing data, were analysed together to increase the reliability of data analysis. Available mRNA sequencing (mRNA-seq) data were obtained from the TCGA database for 154 TCGA samples and from the UCSC XENA database for 165 normal testis samples. The FPKM (fragments per kilobase per million) data were converted to TPM (transcripts per million reads) format.
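The FPKM-to-TPM conversion mentioned above can be illustrated with a minimal Python sketch. It assumes the standard per-sample rescaling, TPM_i = FPKM_i / Σ_j FPKM_j × 10^6; the function name and the expression values are hypothetical and are not taken from the TCGA data.

```python
def fpkm_to_tpm(fpkm):
    """Convert one sample's FPKM values to TPM.

    Each value is divided by the sample's FPKM total and scaled to one
    million, so the TPM values of a sample always sum to 1e6.
    """
    total = sum(fpkm.values())
    if total <= 0:
        raise ValueError("sample has no expressed genes")
    return {gene: value / total * 1e6 for gene, value in fpkm.items()}


# Hypothetical FPKM values for a single sample (illustration only).
sample = {"AMBN": 2.0, "GFAP": 6.0, "INSL3": 12.0}
tpm = fpkm_to_tpm(sample)
```

Because every sample is rescaled to the same total, TPM values are easier to compare across samples from different sources than FPKM values.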

Immunofluorescence of Testis Tissues from Rat
After isolation from a 12-week-old male Sprague-Dawley rat, the testis was fixed in 4% paraformaldehyde (PFA) for 24 h, then kept overnight in cryoprotectant solution [32]. The sample was embedded in optimal cutting temperature (OCT) compound (Leica, Buffalo Grove, IL, USA), and frozen in liquid nitrogen before equilibration at −20 °C for sectioning. Three 5 µm-thick sections of the testis tissue were used. The animal was part of an experiment that was approved by the National Animal Research Authority (approval number 25785). Sections were mounted onto positively charged glass slides, then antigen-retrieved in Tris-EDTA buffer. The anti-AMBN and anti-vimentin antibodies were diluted in 2% normal goat serum, and the secondary antibody in 4% normal goat serum. The immunolabelled samples were observed under a confocal fluorescence microscope (Leica TCS SP8, Leica Microsystems CMS GmbH, Mannheim, Germany) with an HC PL APO CS2 40×/1.30 oil immersion objective lens, running LAS X software version 3.5.6.21594. Excitation was at 488 nm, with 5 nm bandpass filters from 500-590 nm. Identical settings were used to scan a negative control sample that had been processed for imaging without a primary anti-AMBN antibody. For anti-vimentin (Alexa Fluor 568), the excitation and bandpass emission wavelengths were 552 nm and 580-625 nm. DRAQ5 imaging of the same field of view used excitation at 638 nm and a bandpass emission filter at 670-750 nm. Images were prepared for publication using Photoshop CS6 (Adobe Inc., Berkeley, CA, USA).

The RNAseq Data Analysis of AMBN
Statistical analysis of mRNA levels of AMBN in 33 types of cancer, such as adrenocortical carcinoma, glioblastoma multiforme and kidney chromophobe (Table S3), was performed on RNAseq data in TPM format processed by the Toil pipeline for TCGA and GTEx, and visualised using the ggplot2 (version 3.3.3) package. Due to insufficient sample size, TGCT samples were not grouped by histology. Mutations in AMBN in TGCT patients were analysed using cBioPortal (https://www.cbioportal.org/, accessed date 15 September 2021) [33].

Differential Gene Expression Analysis
Patients were categorised into low and high groups based on AMBN expression, using the median expression as the cut-off value (low expression group: 0-50%; high expression group: 50-100%). Differential expression analysis was performed using the R package DESeq2 (version 1.26.0) [34]. The thresholds were adj p < 0.05 and |log fold change (FC)| > 0.5 for lncRNAs, p < 0.05 and |logFC| > 0.3 for miRNAs, and adj p < 0.05 and |logFC| > 0.5 for mRNAs, yielding the DERNAs (differentially expressed lncRNAs, miRNAs and mRNAs). Correlation analysis of DERNAs with AMBN was performed using the stats package (version 3.6.3), and the volcano plot and co-expression heat map were visualised using ggplot2 (version 3.3.3).
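The grouping and filtering rules in this section can be sketched in a few lines of Python. This is only an illustration of the median split and of the three threshold pairs quoted above, not the DESeq2 workflow itself. The dictionary keys ("padj", "pvalue", "logFC"), the sample identifiers and the numbers are assumptions made for the example, and samples exactly at the median are assigned to the low group as an arbitrary tie-breaking choice.

```python
from statistics import median

# Cut-offs quoted in the text: (p-value field used, p cut-off, |logFC| cut-off).
THRESHOLDS = {
    "lncRNA": ("padj", 0.05, 0.5),
    "miRNA": ("pvalue", 0.05, 0.3),
    "mRNA": ("padj", 0.05, 0.5),
}


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression."""
    cutoff = median(expression.values())
    return {s: ("high" if v > cutoff else "low") for s, v in expression.items()}


def is_differential(row, rna_type):
    """Apply the RNA-type-specific significance and effect-size thresholds."""
    field, p_cut, lfc_cut = THRESHOLDS[rna_type]
    return row[field] < p_cut and abs(row["logFC"]) > lfc_cut


# Hypothetical AMBN expression values for four samples.
groups = split_by_median({"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0})

# Hypothetical test result for one RNA.
row = {"padj": 0.20, "pvalue": 0.04, "logFC": 0.35}
```

With these values the same hypothetical row passes the miRNA thresholds but not the lncRNA or mRNA thresholds, which shows how much more permissive the miRNA criteria are.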

Survival Analysis and Construction of Gene-Specific Prognosis Models for TGCT
The public patient data material used here contains copy number data complemented with relevant clinical information. Statistical analysis of survival data was performed using the R package survival (version 3.2-10) to analyse Kaplan-Meier survival curves between the high-level and low-level AMBN groups. The visualisation was performed using the survminer package (version 0.4.9). Hazard ratios (HR) and 95% confidence intervals (CI) were analysed with Cox proportional hazards regression models to identify factors associated with the study's primary endpoint. Receiver operating characteristic (ROC) curves of different factors were analysed using the pROC package (version 1.17.0.1) to compare the predictive accuracy and risk scores of the genes of interest. ROC curves were visualised using the ggplot2 package (version 3.3.3).
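For readers unfamiliar with the Kaplan-Meier estimator, a minimal Python sketch of the product-limit calculation is given below. It is not the code used in the study (which relied on the R survival package); the function name and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up time of each patient
    events : 1 if progression was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months) for five patients.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at times where progression was actually observed.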

Immune Infiltrate Levels Related to AMBN
The R GSVA package (version 1.34.0) was used to analyse the RNA sequencing data of the TGCT patients for 24 types of tumour-infiltrating cells in relation to the expression of AMBN [35]. The identified tumour-infiltrating cells included: activated dendritic cells (aDC); B cells; CD8 T cells; Cytotoxic cells; DC; Eosinophils; iDC (immature DC); Macrophages; Mast cells; Neutrophils; NK CD56bright cells; NK CD56dim cells; NK cells; pDC (Plasmacytoid DC); T cells; T helper cells; Tcm (T central memory); Tem (T effector memory); Tfh (T follicular helper); Tgd (T γ δ); Th1 cells; Th17 cells; Th2 cells; regulatory T cell (Treg). The cell markers were taken from Bindea et al. [36]. In addition, the correlation between immune cells infiltrating the tumour tissue and the level of AMBN was analysed.

Functional Enrichment Analysis
Functional enrichment analysis was performed on the factors obtained from the differential analysis using the clusterProfiler package (version 3.14.3) [37]. Gene ontology (GO) enrichments (including biological process (BP), cellular component (CC) and molecular function (MF)), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichments were obtained, and ID transformations were performed using the org.Hs.eg.db package (version 3.10.0). Enrichment was considered significant when false discovery rate (FDR) < 0.25 and p.adjust < 0.05. Gene Set Enrichment Analysis (GSEA) analysis [38]
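GO and KEGG over-representation analysis of this kind is commonly based on a hypergeometric test: given a background of N genes of which K belong to a term, it asks how likely it is to see at least k term members among n selected genes. The Python sketch below illustrates that calculation under this assumption; the function name and the counts are hypothetical, and no multiple-testing adjustment is shown.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background : genes in the background universe (N)
    n_in_term    : background genes annotated to the term (K)
    n_selected   : differentially expressed genes tested (n)
    n_overlap    : selected genes annotated to the term (k)
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the term,
# 4 genes selected, 2 of which belong to the term.
p = enrichment_p_value(20, 5, 4, 2)
```

In practice the raw p-values from all tested terms would then be adjusted for multiple testing before applying cut-offs such as those quoted above.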

Methylation and Expression Analysis of GFAP
The human disease methylation database UALCAN (http://ualcan.path.uab.edu/, accessed date 15 September 2021) [45] was used to evaluate the methylation levels of glial fibrillary acidic protein (GFAP) between TGCT and normal human testis tissues, and among different TGCT pathological stages. Simultaneously, MEXPRESS (https://mexpress.be, accessed date 15 September 2021) [46] was used to analyse the relationship between gene expression of GFAP and its DNA methylation status.

Statistical Analysis
All data were analysed using SPSS 28.0 software (SPSS, Chicago, IL, USA). The Mann-Whitney U test (Wilcoxon rank-sum test) and independent t-test were used to calculate the differences between the data of normal human testis tissue vs. the cancer tissue, and between the data of high AMBN group vs. low AMBN group in the TGCT. One-way analysis of variance (ANOVA) with Kruskal-Wallis test and chi-square test were used to assess between-group differences (normal vs. cancer, and high AMBN vs. low AMBN). Non-parametric correlation tests (Spearman) were used for correlation analysis. Univariate Cox regression analysis was performed to analyse the relationship between candidate genes and progression-free interval (PFI). p < 0.05 indicated that the difference was statistically significant (*, p < 0.05; **, p < 0.01; ***, p < 0.001).
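The Spearman correlation used here is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal Python sketch follows; the function names and the example values are hypothetical and only illustrate the calculation.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values of two genes in four samples.
rho = spearman([1.2, 3.4, 2.2, 5.0], [10.0, 8.0, 9.0, 1.0])
```

Because only the ordering of the values matters, the coefficient is robust to the skewed distributions typical of expression data.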

Down-Regulation of AMBN Expression and Clinical Value in TGCT
The TCGA and GTEx databases were used to investigate the expression of AMBN mRNA in various normal and cancer tissues. AMBN mRNA expression was found only in kidney chromophobe (KICH), uterine carcinosarcoma (UCS) and TGCT tissues, as well as in normal testicular tissue (Figures 1A and S1A); it was not transcribed in the other 29 cancer tissues and their corresponding normal tissues analysed. In the current study, TGCT was not analysed according to the classification of seminoma and non-seminoma. Due to sample size limitations, TGCT was analysed only in groups according to different clinical parameters. Immunohistochemical staining of rat tissues confirmed that the AMBN protein was present in normal testicular tissue (Figure 1B; the corresponding negative control image is in Figure S1B). In the rat testis, positive staining for AMBN protein was present beside the nucleus of the cells in the seminiferous tubules. The mRNA expression of AMBN was found to be significantly down-regulated in TGCT compared to normal tissue (Figure 1A).
We investigated the relationship between the expression level of AMBN mRNA and several clinical factors/results in TGCT patients to identify the clinical significance of AMBN in this cancer. According to the ROC curve shown in Figure 1C, AMBN mRNA expression has low accuracy for the specific diagnosis of TGCT (AUC 0.675, CI: 0.613-0.736). In addition, analysis of AMBN mRNA levels for different clinical stages suggested that patients with intermediate to advanced TGCT had significantly lower levels of AMBN in their cancer tissues than patients with earlier stages (Figure 1D, Table 1). Although there was a significant difference in the distribution of cancer stages between the high and low AMBN groups, the overlap of AMBN levels between patients in the early and late stages precludes the use of AMBN levels to identify the stage of cancer. However, in the survival analysis of progression-free interval (PFI) in all TGCT patients, we found that low levels of AMBN were associated with a longer interval to tumour recurrence after the initial cure (Figure 1E). AMBN levels were not associated with overall survival. In further prognostic analyses targeting different subgroups of clinical information classification, patients who developed lymphatic invasion and had low levels of AMBN in tumour tissue were found to have longer cancer recurrence intervals (Figure 1F). Interestingly, in TGCT patients without concomitant lymphatic invasion, mRNA levels of AMBN did not correlate with the time to cancer recurrence after the first treatment (Figure 1G). This phenomenon was not found in other clinical factors/results subgroups. To identify mutations in AMBN in TGCT patients, we analysed the genome and copy number of AMBN. The deletion mutations of the AMBN gene in the TCGA TGCT dataset are shown in an OncoPrint plot (Figure 1H).

Analysis of Differentially Expressed Genes (DEGs)
Based on the observed differences in PFI in tumour tissues with respect to AMBN mRNA level, a natural follow-up was to investigate its role and putative effect. The symbols and |logFC| of all DEGs were used to perform GO, KEGG and GSEA analyses to explore the functions of these genes. As shown in Figure 2G-J, the functions of differential genes and enriched pathways are focused on steroid hormone metabolism, for example, "C21-steroid hormone metabolic process", "androgen metabolic process", "androgen biosynthetic process", "steroid dehydrogenase activity", "ovarian steroidogenesis" and "steroid hormone biosynthesis". The "steroid hormone biosynthesis" pathway has drawn our attention since serum hCG levels provide an important marker for diagnosing TGCT [3].
In the phenotype-related GSEA results (Figure S2), the function of DEGs was focused on Hallmark_Spermatogenesis, Hallmark_Androgen_Response, Hallmark_Xenobiotic_Metabolism, Hallmark_Estrogen_Response_Early, Hallmark_Estrogen_Response_Late and Korkola_Embryonic_Carcinoma_Vs_Seminoma_Dn, and notably DEGs were negatively associated with a variety of inflammatory cell-related gene sets.

Correlation between Immune Cell Infiltration and AMBN mRNA Expression in TGCT
We evaluated the relationship between AMBN mRNA expression levels and immune cell infiltration in TGCT. First, the single-sample GSEA algorithm was applied to analyse the correlation between the level of infiltration of 24 immune cell types and AMBN mRNA expression. The results showed a slight negative correlation between the infiltration of pDC, Neutrophils, Macrophages and iDC and the level of AMBN (Figure 3A,B). Subsequently, we analysed whether there were differences in the levels of infiltration of the 24 immune cell types between the high and low AMBN mRNA expression groups. The results showed that the enrichment scores of 8 immune cell types (DC, iDC, Macrophages, Neutrophils, pDC, Tcm, Th1 cells, Th17 cells) were statistically different between the AMBN low and high expression groups (Figure 3C). Of these, only Tcm infiltration was lower in the cancer tissue of the low AMBN group than in that of the high AMBN group.

Construction of a lncRNA-miRNA-mRNA Triplet Regulatory Network and Its Functional Enrichment
DElncRNAs obtained from previous analyses were used as study subjects, and their target miRNAs were predicted using Tarbase (https://carolina.imis.athena-innovation.gr/diana_tools/web/index.php, accessed date 15 September 2021). Five candidate miRNAs were subsequently included in the following analysis after the resulting predicted miRNAs were intersected with 38 DEmiRNAs. The DElncRNAs unrelated to these five candidate miRNAs were omitted, and the remaining 22 candidate lncRNAs were incorporated into the final regulatory network mapping. The miRWalk and TargetScan databases were used to predict downstream target mRNAs of the five candidate miRNAs. The final 235 candidate mRNAs were incorporated into the construction of the tertiary regulatory network after comparing the predicted mRNAs with the DEmRNAs. The AMBN-related lncRNA-miRNA-mRNA triple regulatory network in TGCT was mapped with Cytoscape (Figure 4A). Twenty-two hub RNAs were screened for the hub triple regulatory network using the Cytoscape plugin cytoHubba, including 5 lncRNAs (LINC02701, PELATON (Plaque Enriched LncRNA In Atherosclerotic And Inflammatory Bowel Macrophage Regulation), PAQR9-AS1, FLJ13224, LINC02026), 5 miRNAs (hsa-miR-5587-5p, hsa-miR-4740-5p, hsa-miR-4689, hsa-miR-5587-3p, hsa-miR-3153), and 12 mRNAs (LMX1A, ZIC4, PSG1, PSG4, INSL3, INSL4, HSD3B1, MAGEA3, MAGEA6, TM4SF20, PDZK1IP1, GFAP) (Figure 4B). Enrichment analysis of the functions associated with the 22 RNAs (including GO and KEGG) was performed to explore the function of this regulatory network. The results showed that mRNAs involved in functions such as ovarian steroidogenesis, steroid hormone biosynthesis, cortisol synthesis and secretion, and aldosterone synthesis and secretion were particularly abundant in this regulatory network (Figure 4C).
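The filtering logic described above amounts to repeated set intersections: predicted miRNA targets are kept only if they are also differentially expressed, lncRNAs without a surviving miRNA are dropped, and predicted mRNA targets are kept only if they are DEmRNAs. The Python sketch below illustrates this under stated assumptions. The function name and the input dictionaries are hypothetical; apart from the LINC02701, hsa-miR-3153 and GFAP pairing reported in this study, the identifiers (lncRNA_X, miR_Y, gene_Z) are placeholders rather than real predictions.

```python
def build_cerna_triplets(lnc_to_mirna, mirna_to_mrna, de_mirnas, de_mrnas):
    """Return sorted (lncRNA, miRNA, mRNA) triplets supported by all filters.

    lnc_to_mirna  : predicted miRNA targets of each DE lncRNA
    mirna_to_mrna : predicted mRNA targets of each miRNA
    de_mirnas     : differentially expressed miRNAs
    de_mrnas      : differentially expressed mRNAs
    """
    predicted = {m for targets in lnc_to_mirna.values() for m in targets}
    candidate_mirnas = predicted & de_mirnas
    triplets = []
    for lnc, mirnas in sorted(lnc_to_mirna.items()):
        # lncRNAs with no candidate miRNA contribute nothing and drop out.
        for mirna in sorted(mirnas & candidate_mirnas):
            for mrna in sorted(mirna_to_mrna.get(mirna, set()) & de_mrnas):
                triplets.append((lnc, mirna, mrna))
    return triplets


# Placeholder inputs; only the first pairing is reported in the text.
lnc_to_mirna = {
    "LINC02701": {"hsa-miR-3153", "miR_Y"},
    "lncRNA_X": {"miR_Y"},
}
mirna_to_mrna = {
    "hsa-miR-3153": {"GFAP", "gene_Z"},
    "miR_Y": {"GFAP"},
}
triplets = build_cerna_triplets(
    lnc_to_mirna, mirna_to_mrna, de_mirnas={"hsa-miR-3153"}, de_mrnas={"GFAP"}
)
```

The resulting edge list is the kind of table that can be loaded into a network tool such as Cytoscape for drawing and hub screening.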

Validation of the ceRNA Network Model
Differential expression analysis between normal and tumour tissues was used to verify the expression levels of the lncRNAs and mRNAs of the hub triple regulatory network. The results showed that, except for HSD3B1, the 16 lncRNAs and mRNAs were significantly differentially expressed between normal and tumour tissues (Figure 5A,B). The RNAs with significantly higher expression levels in cancer tissues than in normal tissues were PELATON, INSL4, PSG1 and PSG4. The RNAs with significantly lower expression levels in cancer tissues than in normal tissues were LINC02701, PAQR9-AS1, FLJ13224, LINC02026, LMX1A, ZIC4, INSL3, MAGEA3, MAGEA6, TM4SF20, PDZK1IP1 and GFAP. Although the expression level of HSD3B1 in cancer tissues was higher than that in normal tissues, the difference was not statistically significant.

Furthermore, considering that the cellular localisation of lncRNAs determines their underlying mechanism, we analysed the subcellular localisation of the five DElncRNAs using lncLocator. As shown in Figure 5C, LINC02701, FLJ13224 and LINC02026 are mainly located in the cytoplasm, PAQR9-AS1 is mainly located in the nucleus, and PELATON is in the cell membrane. The pairings between DERNAs were predicted separately using miRanda. Ultimately, after screening, we found that LINC02701 could target GFAP expression via hsa-miR-3153 (Figure 5D,E).

Prognostic Analysis of the ceRNA Network Model
To determine whether these RNAs were associated with the diagnosis and prognosis of TGCT, we analysed and plotted the ROC curves of these three RNAs in TGCT, the KM curves of PFI, and the correlation with AMBN (Figure 6). An AUC < 0.7 indicates low accuracy, 0.7 < AUC < 0.9 general accuracy, and AUC > 0.9 high accuracy. LINC02701 was shown to give limited diagnostic accuracy for TGCT. Meanwhile, the analysis found that low levels of LINC02701 were associated with a longer time to cancer recurrence after the first cure of TGCT. Expression correlation analysis demonstrated a positive correlation between the expression of LINC02701/hsa-miR-3153 and the mRNA expression of AMBN.
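The AUC values discussed here can be understood as the probability that a randomly chosen tumour sample scores higher than a randomly chosen normal sample, with ties counted as one half. The Python sketch below computes the AUC in this pairwise way and applies the accuracy bands quoted above. The function names and the scores are hypothetical, and assigning the boundary values 0.7 and 0.9 to the lower band is an assumption, since the text leaves them undefined.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for tumour, 0 for normal. Ties count as half a correct pair.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_band(auc):
    """Map an AUC to the accuracy bands described in the text."""
    if auc > 0.9:
        return "high"
    if auc > 0.7:
        return "general"
    return "low"  # assumption: boundary values fall into the lower band


# Hypothetical marker scores for three tumour and two normal samples.
auc = roc_auc([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0])
band = accuracy_band(auc)
```

For a marker that is lower in tumour than in normal tissue, the scores would need to be negated (or the AUC reflected as 1 − AUC) before applying the bands.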
We divided the TGCT patients into two groups using the median GFAP level. In the analysis of clinical prognostic parameters between the two groups, we found that although the number of lymphatic invasions differed significantly, the number of patients with different cancer stages was similar in the two groups (Table 2). However, GFAP was not identified as an independent factor for the occurrence of lymphatic metastasis in TGCT by multifactorial Cox regression analysis (Table S4). Interestingly, we found that PELATON was an independent prognostic factor for TGCT PFI (Table S5) and single-gene logistic regression found that PELATON was associated with lymphatic metastasis and tumourigenic location in TGCT (Table S6).
It was reported that abnormal DNA methylation is strongly associated with oncogenesis [47], so we also analysed the DNA methylation of GFAP in TGCT. Although the mean β value was higher in stages 2 and 3 than in stage 1, only stage 1 was significantly different from stage 2 (Figure 7A, p < 0.05). The β value indicates the level of DNA methylation, ranging from 0 (unmethylated) to 1 (fully methylated). Different β value ranges are defined to indicate hypermethylation (β value: 0.7-0.5) or hypomethylation (β value: 0.3-0.25). Furthermore, 20 methylation sites in the DNA sequence of GFAP were positively correlated with its mRNA expression level (Figure 7B).
Table 2. Baseline information sheet for GFAP, assessing the difference in the composition ratios of the high and low GFAP mRNA expression subgroups in the different TGCT clinical variables.


Discussion
Our analysis revealed an association between AMBN and the prognosis of TGCT: the level of AMBN mRNA within the TGCT tissue was associated with the recurrence interval after the first cure. By analysing sequencing information from the TGCT patients with high and low levels of AMBN, we identified a potential ceRNA tertiary regulatory network: LINC02701-hsa-miR-3153-GFAP. LINC02701 is associated with the recurrence interval after the first cure, and GFAP may be a diagnostic factor for TGCT. In addition, we found significant differences in the levels of PELATON between patients with high and low levels of AMBN, and PELATON was identified as an independent prognostic factor for TGCT progression-free interval. In the present study, we first explored the changes in AMBN mRNA expression in local tissues. Compared to normal testicular tissue, AMBN levels were decreased in tumour tissue. Furthermore, low AMBN levels could be detected at stages 2 and 3 in TGCT. Stage T2 and Stage T3 mean that the cancer tissue has spread to the surrounding blood vessels, lymphatic vessels or surrounding soft tissues, or even that the tumour has grown into the spermatic cord [48]. In a study on osteosarcoma, Toshinori et al. [49] found that AMBN inhibited colony formation and migration of osteosarcoma cells through the Src-Stat3 pathway, thereby affecting the severity of osteosarcoma. Moreover, Xu et al. [18] demonstrated the feasibility of AMBN as a prognostic biomarker in univariate and multivariate Cox regression analysis and ROC analysis in prostate cancer.
Cancer cell growth is influenced by the infiltration of immune cells into the tumour micro-environment, and multiple immune cell phenotypes have prognostic significance [50]. AMBN has been found to stimulate the expression and secretion of several inflammatory factors in human primary osteoblasts (NHOs) and mesenchymal stem cells (MSCs) [51]. In addition, AMBN up-regulates the inflammatory response of human macrophages, which plays a vital role in innate immunity [52]. CD34+ cells were found to be present in the normal testicular stroma [53], and are precursor cells for a variety of immune cells [54,55]. Tamburstuen et al. [56] found AMBN mRNA expression in CD34+ cells, which may imply a more direct relationship between AMBN levels and immune infiltration in TGCT tissue. It cannot be ruled out that CD34+ cells were one of the cell types that stained positively for AMBN in the testis; however, the ability to express AMBN may be lost during differentiation to more mature immune cells [54]. In TGCT, AMBN levels were found to be negatively correlated with the degree of dendritic cell, macrophage and neutrophil infiltration by our immune infiltration analysis. The negative correlation of AMBN with immune cell factors was also revealed in the GSEA functional clustering analysis, although previous studies have reported that immune infiltration has a minor effect on the overall survival of patients with TGCT [57,58]. If AMBN is expressed during spermatogenesis, a reduction in AMBN may be due to cancer-induced changes in cell differentiation [59], tissue infiltration, or other changes in the cellular micro-environment.
Dendritic cells are associated with better prognosis and lower cancer recurrence in various cancers [60]. Most immune cells, such as DC, had a higher level of infiltration in the cancer tissue of patients in the low-level AMBN group than in the high-level group. The higher level of DC infiltration in the TGCT tissue of the low AMBN group may explain the interesting phenomenon that the relationship between AMBN levels and disease severity is the opposite of the relationship between AMBN levels and PFI.
ceRNA regulatory networks are involved in the occurrence and development of many human cancers, but very few studies have focused on whether ceRNAs can be used as a diagnostic basis for TGCT or to predict TGCT prognosis. In this study, we sought to establish an AMBN-related ceRNA network in TGCT and link it to the diagnosis or prognosis of TGCT. Our enrichment results also show significant differences in the factors related to "ovarian steroidogenesis" between the high and low AMBN mRNA expression groups. It has been demonstrated that increased oestrogen production is associated with reduced spermatogenesis in men with testicular cancer [61], and AMBN levels may indirectly reflect reduced spermatogenesis in men with testicular cancer after treatment. A valid regulatory pathway, LINC02701-hsa-miR-3153-GFAP, was identified based on subcellular localisation and sequence alignment validation. LINC02701 has not been previously reported to be associated with cancer, but LINC02701 expression was significantly raised in patients with Parkinson's disease compared to normal controls [62]. Additionally, LINC02701 was reported in a recent SARS-CoV-2 study to be associated with the antiviral function of cells and involved in the coordinated expression of signalling pathways in the immune response [63]. The diagnostic role of miRNAs in TGCT has been explored, and miR371 was found to be a sensitive and specific diagnostic marker in TGCT [64], whereas hsa-miR-3153 appeared to have decreased plasma levels in EGFR mutation NSCLC patients with primary resistance to TKI [65].
GFAP was initially considered to be a glial cell-specific protein and is essential for the normal function of glial cells [66]. Moreover, it is one of the specific diagnostic factors for glioblastoma multiforme [67]. GFAP is expressed in a variety of cell types [68,69], with recent studies demonstrating the expression of GFAP in the testis [70,71]. Our findings support this, and indicate that GFAP levels in TGCT cancer tissues could also be used as a diagnostic factor. The mRNA level of GFAP was found to correlate with the occurrence of lymphatic metastasis in TGCT. Restrepo et al. [72] found that raised levels of GFAP promoter methylation in gliomas led to reduced GFAP expression, and that there was frequently loss of GFAP expression with increasing malignancy. We found a similar phenomenon in our analysis of GFAP promoter methylation in TGCT: levels of GFAP promoter methylation were higher in cancer tissue than in normal tissue, and continued to increase as clinical staging progressed. This implies that aberrant methylation of GFAP may be responsible for its prognostic impact on TGCT, and may also explain the lower mRNA levels of GFAP in TGCT tissues than in normal tissues found by our analysis. Gao et al. [73] found that miR-342-5p could regulate mouse neural stem cell proliferation and differentiation by targeting GFAP. We predicted the targeting relationship between hsa-miR-3153 and GFAP using multiple analysis methods. Coincidentally, the mature miRNA sequence of hsa-miR-3153 could also be paired with MSX2 mRNA as a target. MSX2 has been shown to interact with AMBN in vivo, with overexpression of either AMBN or MSX2 affecting the other's expression level, and with knockdown of MSX2 also affecting the levels of AMBN [74]. More focused analyses should be performed to investigate the interaction between the two regulatory pathways, hsa-miR-3153-MSX2 and hsa-miR-3153-GFAP. Owing to the limitations of miRNA data in the database, this was not feasible in this study. The complex interrelationship between AMBN, the ceRNA network (LINC02701-hsa-miR-3153-GFAP) and MSX2 may be another reason for the interesting phenomenon that the expression level of AMBN is lower in advanced TGCT than in earlier stages, but high levels of AMBN in cancer tissue predict a shorter time to recurrence after first cure.
PELATON was first detected in inflammatory bowel disease [75]. Hung et al. [76] subsequently verified that it is specific to macrophages and monocytes and that its levels are elevated in unstable atherosclerotic plaques. In recent studies, it was shown to be a ferroptosis suppressor and one of the oncogenes of glioblastoma multiforme (GBM) [77]. Our analysis also demonstrates that PELATON can be an independent prognostic factor for TGCT progression-free interval and is associated with lymphatic invasion and the first location of TGCT.
Although the ceRNA-based LINC02701/GFAP axis associated with AMBN has been constructed and appears to be a potentially helpful biomarker for diagnosing TGCT and predicting PFI, several limitations must be noted. First, the predicted binding affinities between the lncRNAs, miRNAs and mRNAs need to be experimentally validated. Second, the function and mechanism of the LINC02701/GFAP axis in TGCT need to be further investigated experimentally. Third, because of sample size limitations in the TCGA database, our analysis did not group TGCT according to non-seminoma and seminoma, but rather directly grouped TGCT patients according to their clinical parameters for comparative analysis. Future work will address more refined tumour classification by cancer cell type.

Conclusions
In conclusion, we found that AMBN could be a novel predictor of cancer recurrence for TGCT. Furthermore, we established a network of ceRNAs (LINC02701-hsa-miR-3153-GFAP) associated with diagnosing and predicting PFI in TGCT and an independent prognostic factor (PELATON) for PFI in TGCT.