PLEK2, RRM2, GCSH: A Novel WWOX-Dependent Biomarker Triad of Glioblastoma at the Crossroads of Cytoskeleton Reorganization and Metabolism Alterations

Simple Summary Cytoskeleton reorganization affects the malignancy of glioblastoma. The WWOX gene is a tumor suppressor in glioblastoma and was found to modulate the cytoskeletal machinery in neural progenitor cells. To date, the role of this gene in the cytoskeleton or glioblastoma has been studied separately. Therefore, the purpose of this study was to investigate WWOX-dependent genes in glioblastoma and indicate cytoskeleton-related processes they are involved in. The most relevant WWOX-dependent genes were found to be PLEK2, RRM2, and GCSH, which have been proposed as novel biomarkers. Their biological functions suggest that there is an important link between cytoskeleton and metabolism, orchestrating tumor proliferation, metastasis, and resistance. Searching for such new therapeutic targets is important due to the constant lack of effective treatment for glioblastoma patients. Abstract Glioblastoma is one of the deadliest human cancers. Its malignancy depends on cytoskeleton reorganization, which is related to, e.g., epithelial-to-mesenchymal transition and metastasis. The malignant phenotype of glioblastoma is also affected by the WWOX gene, which is lost in nearly a quarter of gliomas. Although the role of WWOX in the cytoskeleton rearrangement has been found in neural progenitor cells, its function as a modulator of cytoskeleton in gliomas was not investigated. Therefore, this study aimed to investigate the role of WWOX and its collaborators in cytoskeleton dynamics of glioblastoma. Methodology on RNA-seq data integrated the use of databases, bioinformatics tools, web-based platforms, and machine learning algorithm, and the obtained results were validated through microarray data. PLEK2, RRM2, and GCSH were the most relevant WWOX-dependent genes that could serve as novel biomarkers. Other genes important in the context of cytoskeleton (BMP4, CCL11, CUX2, DUSP7, FAM92B, GRIN2B, HOXA1, HOXA10, KIF20A, NF2, SPOCK1, TTR, UHRF1, and WT1), metabolism (MTHFD2), or correlation with WWOX (COL3A1, KIF20A, RNF141, and RXRG) were also discovered. For the first time, we propose that changes in WWOX expression dictate a myriad of alterations that affect both glioblastoma cytoskeleton and metabolism, rendering new therapeutic possibilities.

One of the genes affecting both cytoskeleton and GBM is WW domain-containing oxidoreductase (WWOX), a haploinsufficient tumor suppressor described in many cancers, including GBM, for which it impairs malignant phenotype [17]. In the brain tissue, WWOX can be summarized as a global modulator of transcription and an important regulator of differentiation and maintenance [18]; this is complemented with the prognostic relevance of WWOX [19]. It is also an important modulator of metabolic pathways, regulating the synthesis of amino acids and lipids but also glycolysis or Krebs cycle. Its role in various metabolic disorders [20] or in cancer metabolism [21] was previously reported. Recently, our comprehensive review summarized the current knowledge about WWOX in the central nervous system (CNS), including brain tumors, i.e., astrocytoma, neuroblastoma, and glioblastoma [22]. For GBM, it has been found that WWOX downregulation may be a result of promoter hypermethylation or loss of heterozygosity (LOH); the latter is related to tumor progression and contributes to 20% of gliomas [22]. Determining the role of WWOX in GBM is thought to be in the initial state [19]; hence, profound research is needed, especially in the cytoskeleton-related context, which remains enigmatic. Available data indicate that WW domain of WWOX collaborates, e.g., with dystroglycan, a transmembrane protein that interacts with utrophin and dystrophin, which also communicate with actin [23]. Such finding proves the implication of WWOX in complex machinery involving both extracellular matrix (ECM) and cytoskeleton [23]. Furthermore, it has been found that WWOX recognizes the PPxY motif of ezrin [24], a protein that links transmembrane signaling to the reorganization of cytoskeleton [25], influencing cell migration and tumor progression [24]. In the case of brain tissue, the ability of WWOX to bind and regulate glycogen synthase kinase 3β (GSK-3β) was proved to be important for Alzheimer's disease [26], in which a frequent WWOX downregulation is found among patients [27]. Lastly, WWOX silencing changed the expression profile of genes (e.g., DCLK, NEFL, NEFM, MAP2/4/6) involved in the cytoskeleton organization of neural progenitor cells [18].
It can be assumed that as a global modulator, WWOX will manifest its function directly or indirectly. Therefore, this research aims to examine the role of WWOX-dependent genes in terms of cytoskeleton reorganization in GBM. Simultaneously, the lethal nature of the de novo IDH wild-type glioblastoma does not leave too many solutions, and hence it is justified to investigate not only GBM but also low-grade tissue from which glioblastoma IDH mutated could arise afterwards.

WWOX Influenced a Number of Pathways and Biological Processes
Optimal WWOX expression cut-point was determined to separate high-and lowexpressing groups of patients. The obtained cut-off value, 222.6, had significantly separated groups, showing better prognosis for patients with high WWOX expression level ( Figure 1).
Performing gene set enrichment analysis (GSEA) with the use of the selected cut-point confirmed the presence of gene sets such as BIOCARTA/KEGG/REACTOME canonical pathways; chemical and genetic perturbations (CGP); gene ontology (GO): biological processes (BP) and molecular function (MF); cancer gene neighborhoods (CGN); and oncogenic and immunologic signatures, hallmarks, and positional gene sets. Collectively, 74 gene sets were taken into account, which was equivalent to 3046 core enrichment genes (Supplementary File S1). Ultimately, duplicate removal yielded 1898 genes.

Global Distinction Indicated Both Clinical Features and Modules Worth Considering
Together with their corresponding normal brain tissue, the GBM and LGG cohorts were distributed across Uniform Manifold Approximation and Projection (UMAP) dimensions using the 1898 genes from GSEA. Out of eight characteristics acquired from The Cancer Genome Atlas (TCGA) clinical data, half showed promising distribution. First, samples were divided on the basis of the tumor_type; normal brain samples were also included to allow simultaneous visualization with tumor samples using one variable (Figure 2A). Subsequently, samples were grouped according to neoplasm_histologic_grade ( Figure 2B) and histological_type ( Figure 2C), and finally age_at_initial_pathologic_diagnosis divided by median age ( Figure 2D). The remaining clinical features (excluded from further analyses due to the lack of a clear pattern) are presented in Figure S1.  The heatmaps were elaborated to split 1898 genes into modules and ease interpretation of specific comparison. The content of the modules is summarized in Supplementary File S2. Representative modules were selected on the basis of the contrasting pattern between the groups and the level of change in expression. For tumor_type ( Figure 3A), the GBM was distinguished from normal brain (abbreviated NT) using modules 5 and 6. Module 1 was chosen to explain the differences between LGG and NT. GBM and LGG demonstrated different patterns for all modules; hence, two sets were adopted for independent examination-the first set comprised modules 2, 4, 5, and 7 combined together, while the second contained modules 1, 3, and 6. For neoplasm_histologic_grade ( Figure 3B), the G4 vs. G2 or G4 vs. G3 comparisons were explained using sets of modules 2 + 4 + 5 + 7 and 1 + 3 + 6. Regarding histological_type ( Figure 3C), we focused on the most drastic changes: the set of modules 1 + 5 + 7 allowed astrocytoma (A) to be distinguished from both oligoastrocytoma (OA) and oligodendroglioma (OD). Additionally, using combined modules 1, 4, 5, and 6, the following comparisons were also included to identify treatmentrelated differences: (1) glioblastoma_multiforme combined with untreated_primary_gbm vs. normal_nervous_system; (2) glioblastoma_multiforme combined with untreated_primary_gbm vs. treated_primary_gbm; (3) treated_primary_gbm vs. normal_nervous_system. Lastly, for age_at_initial_pathologic_diagnosis, we indicated two sets of modules (1 + 3 and 2 + 4 + 5) that demonstrated opposite expression profile ( Figure 3D). Henceforth, sets will be referred to as module numbers linked with "+".

Differentially Expressed Genes Were Identified for Each Comparison
Each comparison with corresponding set was summarized through volcano plot. The control in specific comparison was always the phenotype more resembling the natural/normal conditions, e.g., for high-grade vs. low-grade, the latter was used as a reference. A dedicated log 2 fold-change (log 2 FC) was applied to specific analysis since the number of genes in a given comparison varied depending on the chosen modules. Using a different cut-off log 2 FC value, it was possible to indicate a small but strict group of genes presenting remarkable difference in expression. Differentially expressed genes (DEGs) were considered significant when at least log 2 FC = 0.6 (i.e., fold-change ≈1.5) and p < 0.01; however, in many comparisons, even log 2 FC = 3.5 was used to provide the most stringent group of DEGs.
For GBM vs.

The Most Relevant Genes Were Subjected to ROC Analysis
Sequential elimination using Multiple Support Vector Machine Recursive Feature Elimination (mSVM-RFE) was performed to select the best three features (genes) per comparison. Due to small size of the groups, all comparisons for histological_type were based on the topTags() function. Ultimately, 14 out of 16 comparisons contained DEGs; these revealed 42 genes with supposed potential. However, some genes were identified in more than one comparison, resulting in a group of 30 genes. These were henceforth described as top genes. The most relevant genes, along with their ascending average rank (AvgRank) or with topTags() hierarchy, are listed in Table 1.
LGG vs. NT 1 G3 vs. G2 1 + 3 + 6 Elimination of insignificant or overfitted genes was possible via determination of area under the curve (AUC) in receiver operating characteristic (ROC) analysis. Some comparisons contained no valuable predictor, and data validation helped to decide whether to exclude them from subsequent step. The ROC analysis for all top genes is collected in Figure 5. The importance of top genes regarding glioma or cytoskeleton regulation is summarized in Table S1.

Validation Verified the Usefulness of Top Genes
Independent microarray cohorts were used to certify RNA-seq findings; identical top genes were considered in specific clinical comparison ( Figure 6).

Several Substantial Genes Were Correlated with WWOX and Possessed Prognostic Value
Prior to focusing on the best nine explanatory genes, we correlated all top genes with WWOX to select subsidiary genes more related to WWOX but with slightly worse predictive properties. As forecasted, among all top genes (Table 3), and besides nine predictors, there were a few additional genes worth considering, e.g., COL3A1, KIF20A, RNF141, and RXRG. This resulted in a total of 13 genes that were then subjected to survival analysis, together with WWOX (Figure 7). Kaplan-Meier curves revealed that despite being at the forefront, UHRF1 was not found to be of prognostic importance and was excluded ( Figure 7N).  Table 3. Correlation analysis of the top genes with WWOX. The remaining 12 genes were visualized in terms of their expression across all RNA-seq samples. The selection of three the most relevant genes was based on their predictive value (AUC), effect on survival (DFS), correlation with WWOX, and transparency of expression difference across UMAP dimensions. These were found to be PLEK2, RRM2, and GCSH. Together with WWOX, their expression difference across UMAP is visualized in Figure 8; the same for the remaining genes can be found in Figure S2. The GBM cohort demonstrated markedly increased PLEK2 and lowered GCSH expression compared to LGG, as well as a change in RRM2 expression similar to that of PLEK2. According to Gene Expression Profiling Interactive Analysis (GEPIA2), the last two are positively correlated, i.e., R = 0.62, p < 0.0001 ( Figure 9A); a negative correlation can be seen for PLEK2 and GCSH, i.e., R = −0.32, p < 0.0001 ( Figure 9B). Lastly, GCSH and RRM2 are not well correlated, indicated by a low correlation coefficient, i.e., R = −0.23, p < 0.0001 ( Figure 9C). Nevertheless, each of the three genes significantly correlated with WWOX (GCSH positively, PLEK2 and RRM2 negatively) as shown in Table 3. Finally, the prognostic signatures of PLEK2, RRM2, and GCSH were adjusted using survival curves with "Estimation of Stromal and Immune cells in Malignant Tumors using Expression data" (ESTIMATE) score as covariate ( Figure 9D-F).

Discussion
Glioblastoma is one of the deadliest human cancers [28]. Its malignant behavior results from, e.g., cytoskeleton reorganization that controls EMT and metastasis [11,29]. GBM is also affected by WWOX, a cytoskeleton-related protein that interacts with ezrin, dystroglycan, or GSK−3β; it also influences on expression of DCLK, NEFL, NEFM, and MAP2/4/6 genes. As a tumor suppressor, it is lost in nearly a quarter of gliomas through a LOH event [22,23,26]. This study is the first to evaluate the role of WWOX in cytoskeleton dynamics of glioblastoma. Low-grade glioma (from which IDH-mutated GBM could arise) was included in the analysis to determine whether WWOX loss might also switch expression profile at the earlier stages of carcinogenesis; this may potentially lead to an improved on-time diagnosis.
The main research milestones were (1) acquisition of a list of core enrichment genes using Evaluate Cutpoints and GSEA; (2) selection of the most important clinical features and comparisons using Monocle3; (3) establishment of the top genes and determining their role in the context of glioma and/or cytoskeleton; (4) validation of the results and selection of the three most relevant genes based on their predictive properties (AUC), impact on patients survival (DFS), correlation with WWOX, and transparency of spatial expression analysis across UMAP dimensions. Performing the edgeR → mSVM-RFE → ROC workflow independently for each clinical feature is justified, since the final verdict (indicating the most relevant gene) depended on the analyzed trait. Moreover, additional correlation analysis between the top genes and WWOX identified four important genes; likewise, multiple testing of the top genes allowed us to spot overfitted genes.
Our study revealed a noticeable subdivision of samples based on tumor type (or accompanied normal tissue), grading, histological type, or age. It was foreseeable that a higher grade of LGG (grade 3) or the presence of an astrocytoma histology are more indicative of GBM than lower grade or other types, as IDH mutated glioblastomas derive from astrocytomas [45]. However, it is not possible to make a complete distinction from oligodendroglioma as there are some glioblastomas with oligodendroglioma component (termed GBM-O) that might be pathologically defined as anaplastic oligoastrocytomas with necrosis [46,47]. The mean age has been also included as one of the patterns distinguishing glioblastomas [48], and the clinical practice confirms that application of surgical resection, radiotherapy, or chemotherapy is hampered in older age [49].
The final stages of this study involved the selection of the best three genes-these were PLEK2, RRM2, and GCSH. They correlated with WWOX, and favorably (GCSH) or unfavorably (PLEK2 and RRM2) affected DFS. Across UMAP dimensions, GCSH was downregulated in GBM compared to LGG, but PLEK2 and RRM2 were upregulated. Finally, RRM2 and PLEK2 were correlated with each other, as were GCSH and PLEK2, but not RRM2 and GCSH; all three genes were also prognostic when ESTIMATE covariate was applied. Regarding phenotype comparisons for which these genes were representative, both GCSH and PLEK2 distinguished cancer grades: GCSH expression was lower in G4 than in G3, while PLEK2 was higher in G4 compared to G2. The changes in cytoskeleton were linked to increasing cancer grade, e.g., in colon cancer or glioblastoma. At first, Pachenari et al. reported that the proportion of actin and tubulin differs between low and high grade and that microtubules reorganize cytoskeleton to facilitate benign to malignant phenotype transition [65]. Secondly, Reiss-Zimmermann et al. found G4 glioblastoma to be softer than G3 astrocytoma [66]; this stems from cell stiffness, which is related to cytoskeleton reorganization [67]. On the other hand, RRM2 discriminated untreated GBM from normal nervous system; this is embedded in the TCGA clinical data of histological_type but it can also be considered as a typical comparison between glioblastoma and normal brain tissue. As previously mentioned, RRM2 has been already proposed as an overexpressed biomarker with functional significance in glioma [55]. Hence, PLEK2 and GCSH should also be considered as prognostic biomarkers in GBM, since their spatial expression analysis was even clearer than that of literature-supported RRM2.
To understand whether these three genes can undergo targeted therapy, it is necessary to analyze their biological function. In addition, GCSH expression is clearly lowered with increasing glioma grade, while that of PLEK2 and RRM2 is clearly elevated; therefore, GCSH does not fit the premise of targeted therapy, i.e., biological pathway inhibition [68]. Additionally, GCSH encodes one of four proteins responsible for glycine metabolism, which is only a single pathway of the whole complicated metabolic machinery [69]; its alteration should be considered in a broader perspective. Complexity is even greater as the cytoskeleton is implicated in carbohydrate metabolism [70], and conversely, actin and tubulin require energy from nucleotide hydrolysis to maintain structural dynamics [71]. Moreover, the other top gene, i.e., MTHFD2, was found to regulate metabolism through the folate cycle [72]. It was the best explanatory gene for the differences between LGG and normal brain tissue, suggesting that metabolic changes are not only restricted to glioblastoma.
RRM2 encodes β subunit of ribonucleotide reductase (RR), an enzyme acquiring 2 -deoxyribonucleotides from ribonucleotide 5 -diphosphates, which are crucial for the synthesis or repair of DNA [73]. It forms a dimer that may bind DNA [74]-each RRM2 monomer contains the tyrosyl radical and non-heme iron [73]. A sufficient supply of deoxyribonucleotides is required for uncontrolled DNA replication in cancer [75]; it is therefore not surprising that RRM2 is often a target of molecular therapy [73,76,77]. Nowadays, there are several RRM2 inhibitors, i.e., radical scavengers, iron chelators, subunit polymerization inhibitors, or expression silencers [76,[78][79][80]. A wide range of anti-RRM2 approaches focus on inhibition of proliferation, differentiation, and division but also invasion [75]. It has been found that RRM2 knockdown inhibits cell proliferation via DNA damage-driven senescence induction [77].
Considering PLEK2, a protein with two pleckstrin homology (PH) domains and one disheveled-Egl10-pleckstrin (DEP) domain [81] that is able to bind acidic phospholipids of cell membranes [82] or influence actin dynamics [83], targeted therapy is less developed yet already viable. Han et al. screened for small molecules potentially able to bind PLEK2 and identified compounds binding DEP domain; the lead compound, i.e., NUP−17d diminished proliferation similarly to ruxolitinib [82]. Despite moving to the cell membrane through its ability to bind phosphoinositides [84], it is also the effector of increased proliferation driven by JAK/STAT and PI3K/AKT signaling [82,83]. These two pathways were found in GSEA during discrimination of WWOX high/low groups. For JAK/STAT, Zhao et al. performed quantitative PCR preceded by chromatin immunoprecipitation (ChIP) to confirm the presence of STAT5 consensus-binding sites in PLEK2 promoter region [83]. Likewise, PLEK2 recruits phosphatidylinositol 3,4-bisphosphate and proteins, e.g., AKT, phosphoinositide-dependent kinase−1 (PDK1), PDK2, and mTOR, forming a complex that augments PI3K signaling [82]. Since PLEK2 induces cell spreading and guides tumor progression and metastasis [81], the development of new targeted therapy aimed at this oncogenic molecule is of utmost importance.
Finally, GCSH (or H-protein) is an integral core protein of glycine cleavage system (GCS), the major pathway of glycine degradation [85]. In short, GCS consists of a fourprotein complex that catalyzes glycine degradation into carbon dioxide, ammonia, and a methylene group that is accepted by tetrahydrofolate [86]. The functioning of GCS is reversible yet requires an intermediary state when the ternary complex is formed, i.e., compound of P-protein (or glycine decarboxylase; GLDC), the aminomethyl moiety of glycine, and H-protein [87]. This proves that in order to maintain glycine synthesis/degradation for subsequent processing, GCSH must be intact. Here, our study indicates that GCSH expression dramatically declines with increasing glioma grade. This is even more intriguing since cancer may benefit from GCS, as stated by Zhang et al. during lung adenocarcinoma research, wherein overexpression of GLDC increased tumor formation [88]. However, activity of GCS is mainly restricted to normal tissues of the brain, liver, and kidney [89], which suggests that context is tissue-dependent. Moreover, the brain is one of the few tissues with naturally high GLDC expression [90], suggesting the inverse tendency in typically non-GCS-expressing tissues that become cancerous and then overexpress GLDC. This can be supported by the findings of Zhuang et al. on hepatocellular carcinoma (HCC), wherein restoration of GCS proper activity (via GLDC overexpression due to its low level in malignant HCC) suppressed cancer progression via inhibition of both invasion and metastasis [91]. As mentioned above, the liver is (alongside the brain) one of few tissues maintaining the activity of GCS, and hence GLDC and other proteins of this complex. Therefore, this complex may be inhibited during carcinogenesis; this is in contrast to tissues with a naturally low or absent activity of GCS, where it would be increased. The main purpose of such a switch remains elusive, but to investigate this, glycine metabolism must be considered together with serine as they both are biosynthetically linked; they contribute to the one-carbon metabolism that cycles units of carbon from various amino acids [92]. Many discrepancies around glycine function have arisen, i.e., its uptake has been correlated with cancer proliferation [93], while excess glycine in the diet inhibited tumorigenesis in vivo [94,95]. This is supposedly due to the overall complexity of the metabolism but also insufficient understanding of glycine metabolism in carcinogenesis [96]. In theory, glycine is able to provide all precursors required to support nucleic acid synthesis [97], which is crucial for maintaining cancer cell growth [92]. Interestingly, serine is the main donor of one-carbon units [96] and a central hub of metabolic pathways in cancer [92], which might suggest its supremacy over glycine. Labuschagne et al. report that serine, rather than glycine, supports carbon metabolism and later proliferation [97]. The same authors highlighted that many tumors prefer serine consumption over glycine, and the latter does not substitute serine in nucleotide synthesis; what is more, the paradox of high glycine uptake during rapid proliferation can be a consequence rather than a cause of an such event [97]. This corresponds to our results, suggesting that GBM may preferably switch to serine consumption via GCSH downregulation. As GCSH was the gene that can distinguish G4 from G3 brain tumors, we speculate this occurs (beyond IDH wild-type GBM development) over the transformation from G3 astrocytoma to G4 glioblastoma. The only concern is how such deadly and incurable tumor handles with excess glycine, which if not metabolized may be converted to toxic by-products such as aminoacetone or methylglyoxal [86]. The explanation may again be serine-dependent; it has been found that not only excess glycine drives conversion to serine and inhibits glycine flux to purines but also high serine leads to glycine efflux [97]. Moreover, GBM is thought to be adapted to environmental conditions via serine-dependent redox homeostasis that enhances tumor survival [98]. Although glioblastoma metabolism remains poorly understood, it appears that antimetabolic therapy focused on serine pathway inhibition may be worth considering.

Data Collection of Cancer Patients and Cut-Point Determination
The entire methodology is summarized in Figure 10. Expression data of RNA-seq together with corresponding clinical annotation were collected from both GBM and LGG cohorts of TCGA-dedicated FireBrowse Repository (level 3 RNA-seqV2, RSEM normalized, data version of 28 January 2016 available at http://firebrowse.org/, accessed on 10 December 2020). Patients missing expression or clinical data were discarded from the study; no additional exclusion criteria were applied. The available data of paired normal brain tissues were additionally retrieved via the R-dedicated package TCGA-Assembler [99]. A total of 672 samples were included in the study. Determining the suitable WWOX expression cut-point to stratify the population into two groups was achieved with the help of the R-based Evaluate Cutpoints tool designed in our Department [100].

Identification of Significant Differences between Phenotypes
In order to determine which genes are WWOX-related in high-grade glioma, we conducted GSEA (https://www.gsea-msigdb.org/gsea, accessed on 20 December 2020) on eight major collections acquired from http://software.broadinstitute.org/gsea/msigdb (accessed on 20 December 2020). Functional analysis was performed on 20,502 genes using the tTest metric with a weighted statistic to score hits/misses and permutation type concerning phenotype. From the whole and excluding duplicates, 1898 genes belonging to selected gene sets were chosen with a significance threshold of FDR < 0.25.

Global Profiling and Determination of Gene-Containing Modules
Phenotype heterogeneity between GBM/LGG and normal brain tissue was investigated using the Monocle3 R package [101]. The dimensionality of the reduced space of 100 (num_dim) was chosen for the PCA pre-processing step (preprocess_cds()). Dimension reduction (reduce_dimension()) and individual clustering (cluster_cells()) within spaces were applied with the UMAP algorithm for dimensionality reduction method, upon which they were applied to base clustering (reduction_method). The clusters of individuals were compared with graph_test() function in accordance with Moran's I spatial autocorrelation analysis with knn neighbor graph and 0.05 q-value. Moreover, the genes varying across the clusters were grouped into modules through Louvain community analysis (find_gene_modules()) with parameters set to default. The modules were clustered on all of the individuals, enabling simultaneously comparison between tumors (e.g., GBM vs. LGG) and tumor vs. non-tumor samples (e.g., GBM/LGG vs. NT). The results were clustered with the Ward D2 method and visualized with pheatmap(). Expression differences for a given gene in the global projection were visualized using plot_cells(). The whole pipeline was per-formed according to the Monocle3 tutorial (https://cole-trapnell-lab.github.io/monocle3/, accesssed on 26 December 2020).

Examination of Differentially Expressed Genes
Bioconductor's edgeR package allowed us to find DEGs using embedded DGEList() constructor and fitting a negative binomial generalized log-linear model to the read counts for each gene through glmFit() and glmLRT() with subsequent makeContrasts() between groups of specific comparison (default parameters). The most differentially expressed genes were extracted in the form of the table using topTags() ranked by absolute logFC with p < 0.01 adjusted using Benjamini and Hochberg correction method. DEGs were visualized on volcano plots using ggrepel package and geom_text_repel() function.

Relevant Features Investigation
Multiple SVM-RFE approach by Duan et al. [102], which extended techniques of resampling compared to the original idea invented by Guyon et al. [103], was used to obtain top features (genes) across folds. Feature ranking was performed using svmRFE() function with k-fold cross validation (CV) of k = 10 to include a multiplicity of mSVM-RFE and halve.above = 100. R-package e1071 was included in the environment to allow SVM fitting. After setting up 10-fold CV, we performed feature ranking on all training sets and obtained top features across all folds using WriteFeatures() with a list of genes ordered by ascending AvgRank value (the lower number the better). The best genes (three per each significant comparison) were subjected to further investigation.

Evaluation of Statistical Model Accuracy
The ROC curves were evaluated on genes acquired from SVM machine learning algorithm. The pROC package was used for ROC analysis (estimating AUC, 95% CI, ACC) and for plotting curve plot through ggroc using ggplot2 in R environment.

Validation of the Results
Validation of the findings on the basis of independent adequate cohorts was performed through Agilent 244K G4502A microarray normalized expression data acquired from the University of California Santa Cruz (UCSC) Xena repository. No LGG cohort exists on other platforms, e.g., AffyU133a (GBM microarray data alone was not sufficient to reflect comparisons).

Correlation Analysis and Survival Curves
GEPIA2 (http://gepia2.cancer-pku.cn, accessed on 29 December 2020) was used to correlate the top genes with WWOX using Spearman's rank correlation coefficient. Analysis of genes' prognostic models on DFS endpoint in both GBM and LGG cohorts (merged together) was conducted using survminer and forestmodel R-packages. Survival curves were set on median group cut-off and forest plots were generated with Cox proportional hazards model. For the best three genes selected from the study, we developed the models with confounder-adjusted survival rate via inverse probability weighting (IPW) with the use of RISCA and hrIPW R-packages. The covariate was ESTIMATE combined score [104], which was divided into high and low groups on the basis of the median value. The logrank p-value was calculated using ipw.log.rank() function of RISCA, while hazard ratio estimation was acquired from hrIPW.

Conclusions
The change in WWOX gene expression dictates a myriad of alterations through WWOX-dependent genes that affect both glioblastoma cytoskeleton and metabolism. The greatest differences appear between glioblastoma and other gliomas (particularly in relation to tumor grade G4 vs. G3/G2), where GCSH and PLEK2 can be used to distinguish them since they exhibit opposite expression profiles in this regard. Together with the already described glioma biomarker, RRM2, we considered PLEK2 and GCSH as a novel WWOXdependent biomarker triad of glioblastoma, whose subsequent investigation is advisable. RRM2 and PLEK2 appear to be prognostic and therapeutic biomarkers; modulating their activity through targeted therapy might inhibit uncontrolled DNA replication or metastasis and proliferation, respectively (presumably driven by JAK/STAT and PI3K/AKT). Regarding GCSH, targeted therapy is not justified as its expression decreases with increasing glioma grade; hence, it is not a therapeutic biomarker. However, a broader view on onecarbon metabolism implied that more emphasis should be given to the serine pathway as a possible antimetabolic target. Nevertheless, the usefulness of PLEK2, RRM2, and GCSH as diagnostic or predictive biomarkers is yet to be confirmed. Other genes may also be worth investigating, including those playing a key role in the cytoskeleton (BMP4, CCL11, CUX2, DUSP7, FAM92B, GRIN2B, HOXA1, HOXA10, KIF20A, NF2, SPOCK1, TTR, UHRF1, and WT1), metabolism (MTHFD2), or correlation with WWOX (COL3A1, KIF20A, RNF141, and RXRG).

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/ 10.3390/cancers13122955/s1, Figure S1: Residual global profiling of TCGA glioma samples accompanied by normal brain tissue through various clinical data, Figure S2: Spatial expression analysis of the remaining 9 most relevant genes, Table S1: Role of the top genes in glioma and cytoskeleton , Supplementary File S1: Gene sets acquired from enrichment analysis (Supplementary_File_S1.xlsx), Supplementary File S2: Content of modules from Monocle3 (Supplementary_File_S2.xlsx).
Author Contributions:Ż.K. and A.K.B. conceptualized the article.Ż.K., D.K. and A.K.B. established the methodology.Ż.K. and DK were responsible for software. A.K.B. and E.P. supervised the article. Z.K. and D.K. visualized the results.Ż.K. wrote the original draft.Ż.K., D.K., and E.P. reviewed and edited the article. All authors read and approved the final manuscript.
Funding: This research was funded by Medical University of Lodz, grant number: 503/0-078-02/503-01-001-19-00. The funding body had no role in study design, collection, analysis, data interpretation, or writing of the manuscript.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Publicly available datasets were analyzed in this study. This data can be found here: https://gdac.broadinstitute.org/ and https://xenabrowser.net/ (accessed on 10 December 2020).