Extracellular Matrix- and Integrin Adhesion Complexes-Related Genes in the Prognosis of Prostate Cancer Patients’ Progression-Free Survival

Prostate cancer is a heterogeneous disease, and one of the main obstacles in its management is the inability to foresee its course. Therefore, novel biomarkers are needed to guide the treatment options. The extracellular matrix (ECM) is an important part of the tumor microenvironment that largely influences cell behavior. ECM components are ligands for integrin receptors, which are involved in every step of tumor progression. An underlying characteristic of integrin activation and ligation is the formation of integrin adhesion complexes (IACs), intracellular structures that carry information conveyed by integrins. By using The Cancer Genome Atlas data, we show that the expression of ECM- and IACs-related genes is changed in prostate cancer. Moreover, machine learning methods revealed that they are a source of biomarkers for progression-free survival of patients who are stratified according to the Gleason score. Namely, low expression of FMOD and high expression of PTPN2 genes are associated with worse survival of patients with a Gleason score lower than 9. The FMOD gene encodes a protein that may play a role in the assembly of the ECM, and the PTPN2 gene product is a protein tyrosine phosphatase activated by integrins. Our results suggest potential biomarkers of prostate cancer progression.


Introduction
Prostate cancer is among the most common cancers with regard to incidence and mortality [1,2]. According to the Global Cancer Observatory, in 2020, there were 1,414,259 new prostate cancer cases diagnosed (7.3% of all sites) and 375,304 deaths from this disease (3.8% of all sites) [3]. Surgical intervention (radical prostatectomy) and radiotherapy are the usual treatment options for localized prostate cancer [4,5]. However, biochemical recurrence, which is defined by a rise in the blood level of prostate-specific antigen (PSA), occurs within 10 years in a fraction of patients treated with radical prostatectomy (20-40% of cases) and radiotherapy (30-50% of cases) [6]. Biochemical recurrence is usually a sign of progressive disease, which is accompanied by symptoms or evidence of disease progression on imaging [7]. Although the five- and ten-year survival rates in prostate cancer are favorable in comparison to some other, more aggressive cancer types, the recurrence of the disease is fatal for a substantial number of patients. The probability of developing prostate cancer increases markedly with age, and it is considered that 30-40% of men older than 50 years of age have prostate cancer, but not all cases are clinically significant [8]. In line with these observations, one of the greatest obstacles in prostate cancer treatment is the inability to foresee the course of the disease and to distinguish the tumors that will be indolent and require no or minimal intervention from those that are more malignant and will progress fast. Therefore, novel biomarkers of disease progression and therapeutic targets are needed [9].

Biomedicines 2023, 11, 2006
partitioning and survival trees for the establishment of prognostic subgroups. Considering the heterogeneity of prostate cancer, we trust that our approach better describes its characteristics. Additionally, survival trees are easier for clinicians to interpret and present than Cox regression results.

Materials and Methods
The main methodological workflow of this article is presented in Figure 1 and described in the following sections. Briefly, after the TCGA PRAD (prostate adenocarcinoma) dataset was downloaded, differentially expressed genes (DEGs) were identified. Subsequently, enrichment analysis was performed on the DEGs. All of these steps were performed with the TCGAbiolinks R package [27,28]. After that, the rpart R package (version 4.1.19) [29,30] was used to perform recursive partitioning and the progression-free survival analysis. Furthermore, the R commander (version 2.8-0) and EZR (version 1.61) packages [31,32] were used to establish the Kaplan-Meier estimate for the individual nodes determined by rpart. We performed the survival analysis with all the matrisome and adhesome genes, and not only the DEGs, because rpart defines risk subgroups, and changes in gene expression within a subgroup of patients could be masked by global levels of gene expression in pooled prostate cancer samples.
Figure 1. The workflow of this study. The conducted steps are shown in green rectangles. The software used, and the method that each performs, are shown in red rectangles. ECM, extracellular matrix; IAC, integrin adhesion complex; EZR, Easy R.

ECM- and IACs-Related Genes' Retrieval
Matrisome is the ensemble of genes encoding the extracellular matrix (ECM) and ECM-associated proteins, which was predicted bioinformatically in the genome of various model organisms by using the characteristic domain-based organization of ECM proteins [33,34]. The matrisome genes (N = 1027) were retrieved from: http://matrisome.org/ (accessed on 1 September 2022) [33,34]. These genes can be further divided into genes encoding core matrisome proteins and matrisome-associated proteins.
The consensus adhesome consists of the 60 most common proteins that are extracted from quantitative proteomic datasets, in which IACs were induced by the canonical ligand fibronectin. These proteins are likely to represent the core cell adhesion machinery and were retrieved from [37].
The final combined list of matrisome, adhesome, and consensus adhesome genes had 1286 genes in total, and is provided in the Supplementary Material.

Data Preparation
The TCGAbiolinks R package [27,28] was used to download, prepare, and analyze The Cancer Genome Atlas (TCGA) [38] prostate adenocarcinoma (PRAD) dataset. This dataset contains gene expression data for 497 prostate cancer patients and corresponding non-transformed prostate tissues for a subset of 52 patients. The same R package was used to pre-process, normalize, and filter the dataset and prepare it for the differential gene expression, functional enrichment, and survival analyses.

Differential Gene Expression and Functional Enrichment Analyses
To identify differentially expressed genes (DEGs) in prostate cancer in comparison to non-transformed prostate tissue, we set the following criteria in the TCGAbiolinks R package: |log2FC| ≥ 1 (corresponding to |fold change| ≥ 2) and FDR (false discovery rate) p-value < 0.01. These conditions yielded 2037 DEGs. Among these 2037 genes, we singled out the ECM- and IACs-related genes with changed expression in prostate cancer.
The functional enrichment analysis for the Gene Ontology Cellular Component (GO CC) category using 2037 DEGs was performed by using the TCGAbiolinks R package.
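The two DEG criteria can be illustrated with a minimal Python sketch. It is not the TCGAbiolinks procedure used in this study: the gene names and values are invented toy data, the fold change is taken as a plain ratio of group means, and the Benjamini-Hochberg procedure is assumed as the FDR correction because the text does not state which one was applied.

```python
from math import log2

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_degs(genes, log2fc_min=1.0, fdr_max=0.01):
    """genes: list of (name, mean_tumor, mean_normal, raw_p_value)."""
    fdr = benjamini_hochberg([g[3] for g in genes])
    degs = []
    for (name, tumor, normal, _), q in zip(genes, fdr):
        lfc = log2(tumor / normal)  # |log2FC| >= 1 means |fold change| >= 2
        if abs(lfc) >= log2fc_min and q < fdr_max:
            degs.append((name, round(lfc, 2), "up" if lfc > 0 else "down"))
    return degs

# Hypothetical toy data: (gene, mean tumor, mean normal, raw p-value).
toy = [
    ("GENE_A", 400.0, 100.0, 0.0001),  # 4-fold up, small p-value
    ("GENE_B", 50.0, 210.0, 0.0004),   # >4-fold down, small p-value
    ("GENE_C", 120.0, 100.0, 0.0002),  # significant, but fold change too small
    ("GENE_D", 300.0, 100.0, 0.2),     # 3-fold up, but not significant
]
print(select_degs(toy))
```

Only genes that pass both thresholds are retained, which is why a gene with a large fold change but a weak p-value, or the reverse, is excluded.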

Clinical Data Retrieval
The clinical data in Table 1 were downloaded from the cBioPortal [39] and NCI Genomic Data Commons (GDC, TCGA) portals [40]. The downloaded data were combined in a single file according to the patients' unique TCGA codes. In total, there were 493 patients with clinical information available. The event that we considered was progression-free survival (PFS, N = 93). We chose PFS because, fortunately, only a small percentage of patients had the event required for overall survival analysis, which makes an overall survival analysis in prostate cancer suboptimal. Some variables in our analysis contained missing data. However, the decision trees that we obtained in the survival analysis by using recursive partitioning hold an advantage over traditional statistical methods because they are less affected by missing data [41].

The Survival Analysis
Variables from Table 1 (age, Gleason score, TNM staging, and residual tumor information) were supplemented with gene expression data for matrisome and adhesome genes, and their prognostic value was determined through recursive partitioning. The American Joint Committee on Cancer (AJCC) recommends recursive partitioning for the analysis in prognostic studies [42,43]. We used the rpart package [29,30] in the programming language R (version 4.2.1) [44] to create survival trees. Rpart is an abbreviation for Recursive PARTitioning, and it is a frequently used method for the construction of survival trees. Survival trees obtained through the rpart method enable visual inspection and comparison of prognostic factors [42,43]. The basic principles of the rpart method are elaborated in more detail in our previous publications [26,45].
Briefly, we first calculated the importance of individual variables. Second, we generated the survival tree, which is defined by its decision nodes and terminal nodes (leaves). The analysis began with all patients, who were then divided into prognostic subgroups at each decision node. At the first decision node (the root node), a logical check was conducted. If the criterion imposed by that node was met, the left side of the tree was followed, and if not, the right side was followed. This was repeated at each decision node until a terminal node was reached. At each decision node, a variable was used to subdivide patients into two subgroups with maximum differences in their hazard ratios (HR). If no further improvement in subdivision was possible, the terminal nodes were reached. Patients in the root node had a hazard ratio of 1, and the hazard ratio for patients in each further node was assigned relative to this value.
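The traversal rule described above (follow the left branch when a node's criterion is met, otherwise the right branch, until a leaf is reached) can be sketched in a few lines of Python. This is an illustration rather than the rpart implementation: the tree is written by hand to mirror the Gleason score and FMOD splits reported in the Results, the class and function names are hypothetical, and the FMOD cutoff is a placeholder because the actual threshold is not stated in the text.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Leaf:
    label: str  # name of the prognostic subgroup

@dataclass
class DecisionNode:
    description: str
    criterion: Callable[[dict], bool]
    left: Union["DecisionNode", Leaf]   # followed when the criterion is met
    right: Union["DecisionNode", Leaf]  # followed otherwise

FMOD_CUTOFF = 1000.0  # HYPOTHETICAL value, for illustration only

tree = DecisionNode(
    "Gleason score < 9",
    lambda p: p["gleason"] < 9,
    left=DecisionNode(
        "FMOD expression >= cutoff",
        lambda p: p["FMOD"] >= FMOD_CUTOFF,
        left=Leaf("low Gleason, high FMOD"),
        right=Leaf("low Gleason, low FMOD"),
    ),
    right=Leaf("high Gleason"),
)

def assign_subgroup(node, patient):
    """Apply the logical check at each decision node until a leaf is reached."""
    while isinstance(node, DecisionNode):
        node = node.left if node.criterion(patient) else node.right
    return node.label

# Hypothetical patient record.
print(assign_subgroup(tree, {"gleason": 7, "FMOD": 2500.0}))
```

In a fitted tree, both the splitting variable and its threshold at each node are chosen by the algorithm; here they are fixed only to show how a patient is routed to a terminal node.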
Overfitting is a frequent problem in machine learning which, in this case, can lead to extensive fragmentation of the tree, for which it is hard to find a biological meaning. To avoid overfitting, we set the complexity parameter (CP) to 0.0592 and 0.0636 for the ECM and the IAC genes, respectively.
Table 1. Clinical information of The Cancer Genome Atlas patients. The number (N) and the percentage (in parentheses) of patients that belong to a certain category are shown. Some categories contain unknowns (NAs). The table was modified and adapted from our recent publication [26].
The log-rank test was used to analyze the difference in survival between patients in terminal nodes, and the results were presented as survival curves showing the Kaplan-Meier survival estimate [46]. The analysis was performed by using the EZR package [32], an add-on to R commander (a basic-statistics graphical user interface to R) [31]. The differences in survival were statistically significant (log-rank test p-value < 0.001).
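For readers unfamiliar with the two estimators named above, the following Python sketch shows the standard Kaplan-Meier product-limit estimate and the two-group log-rank test. It is not the EZR code used in this study; the function names and the follow-up times are invented for illustration, and the p-value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        removed = sum(1 for tt, _ in data if tt == t)  # events plus censored
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
        i += removed
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical follow-up times (months); 1 = progression, 0 = censored.
early = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
late = ([10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
print(kaplan_meier([1, 2, 3], [1, 0, 1]))
print(logrank_test(early[0], early[1], late[0], late[1]))
```

The log-rank statistic compares the observed number of events in one group with the number expected if both groups shared the same hazard, accumulated over all event times.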

The Expression of Matrisome and Adhesome Genes Appears to Be Aberrant in Prostate Cancer
Gene expression analysis of prostate tissue from prostate cancer patients, performed as described in the Materials and Methods Section, revealed 2037 differentially expressed genes (DEGs) when compared to non-transformed prostate tissue. The result of the functional enrichment analysis for the Gene Ontology Cellular Component (GO CC) category using these 2037 genes and the TCGAbiolinks R package is provided in Figure 2, which shows the top-20 GO CC terms. The GO terms 'extracellular matrix' (N = 35 genes) and 'integrin complex' (N = 12 genes) were among those that were highly enriched in the GO CC category (Figure 2). In the GO Biological Process (GO BP) category, we detected the 'cell adhesion' term (N = 62 genes) among the top-20 categories. The ECM- and IACs-related DEGs are listed in Table 2.
Figure 2. The enrichment analysis of differentially expressed genes (N = 2037) in prostate cancer. The TCGA PRAD dataset was used with the following criteria: |log2FC| ≥ 1 (corresponding to |fold change| ≥ 2) and FDR (false discovery rate) p-value < 0.01. The enrichment analysis was performed by using the TCGAbiolinks R programming language package.
Table 2. Matrisome and adhesome genes up- (red; N = 71) and down-regulated (green; N = 177) in prostate cancer in comparison to healthy prostate tissue according to TCGA PRAD data. The numbers in parentheses represent the fold change (FC; threshold |FC| ≥ 2x). The adjusted p-value is <0.01 for each gene. The genes are shown in descending order of FC values. ECM glycoproteins, collagens, and proteoglycans belong to the core matrisome category, and ECM-affiliated proteins, ECM regulators, and secreted factors belong to the category of matrisome-associated proteins [33,34].
Among the ECM- and IACs-related DEGs are many genes encoding proteins that give structure to the ECM, such as collagens, various ECM glycoproteins, and ECM proteoglycans (Table 2). Additionally, the expression of ECM regulators, which are involved in the organization of the ECM, is perturbed. The genes encoding secreted factors that stimulate the crosstalk between tumor and host cells, (lymph)angiogenesis, and the hijacking and recruitment of immune cells also change expression. With such an extensive perturbation in the ECM composition, it is hard to speculate which characteristics of the ECM are changed. However, it is known from the literature that the tumor ECM generally increases in density and mechanical stiffness [47] due to the changed quantity of ECM structural proteins and the extent of crosslinking.
Integrins are a link between the ECM and the intracellular machinery and are highly sensitive to changes in the ECM. It is interesting to note that in prostate cancer, integrins and adhesome genes mainly show decreased expression (Table 2). It would be important to relate these differences to phenotypes of prostate cancer and to decipher whether there are compensatory mechanisms, such as an increase in the expression of some of the integrin ligands (e.g., collagens).
The expression of PTPN2 and FMOD, the genes that we show to be involved in the prognosis of PFS of prostate cancer patients, did not differ globally between prostate cancer and non-transformed tissue according to the criteria used (|log2FC| ≥ 1 and FDR p-value < 0.01).


ECM- and IACs-Related Genes Are Involved in Prognosis of Progression-Free Survival in Prostate Cancer Patients
Recursive partitioning is the method recommended by the AJCC for the analysis of prognostic studies [42,43]. Therefore, we used the rpart method to determine the prognostic value of the following variables (Table 1): age, Gleason score, TNM staging, residual tumor information, and the gene expression data for the ECM- and IACs-related genes. The ECM- and IACs-related genes were analyzed separately. The importance of individual variables is shown in Figures 3A and 4A. The rpart analysis confirmed the result of our previous publication, which found the Gleason score to be the strongest prognostic factor in prostate cancer among the studied variables [26]. In Figure 3A, the five most informative variables in addition to the Gleason score were the expressions of the FMOD, MMP11, COL1A1, COL3A1, and COL5A2 genes. Among them, only FMOD emerged on the survival tree. In Figure 4A, the five most informative variables in addition to the Gleason score were the expressions of PTPN2, RPL23A, MRTO4, PTPN1, and BRIX1. Among those, only the PTPN2 gene expression variable emerged on the survival tree. From the variable importance analysis, it was evident that even the most informative individual variable (the Gleason score) had a score of only 36 (matrisome data) and 27 (adhesome data), compared with a score of 100 for the whole model. Therefore, a multivariate approach to survival analysis is needed to describe the patients' prognosis correctly.
AJCC guidelines for prognostic studies suggest that the prognostic value of a single variable be evaluated by considering the other variables [42,43]. The rpart method follows this criterion because it uses all variables in the analysis. The results of the rpart algorithm performed on our data are presented as survival trees (Figures 3B and 4B).
Figures 3B and 4B show that, by using two variables, each survival tree subdivided the patients through two decision nodes into three terminal nodes (leaves).
The variables used in the decision nodes in Figures 3B and 4B are the Gleason score and the FMOD and PTPN2 gene expressions. FMOD (Figure 3B) and PTPN2 (Figure 4B) each refined the prognosis of patients with a Gleason score < 9. The importance of the variables in Figures 3B and 4B was determined by their position in the survival tree: the topmost variable (the Gleason score) holds the largest amount of information, the variable below it holds the second largest, and so on. Figures 3B and 4B each show three prognostic subgroups. For Figure 3B, they were: (a) low Gleason score and high FMOD expression, (b) low Gleason score and low FMOD expression, and (c) high Gleason score. The HR gradually increased from the left to the right of the survival tree. By using the complexity parameter (CP) = 0.0592, we did not find a variable that further refined the high Gleason score patients (≥9). However, when the CP was set at 0.0371, we obtained a separation in that group of patients according to the expression of the MFAP3 gene. Namely, high MFAP3 expression (≥1389) was associated with worse survival (HR 2.8 vs. 0.37). In Figure 4B, we also established three prognostic subgroups: (a) low Gleason score and low PTPN2 expression, (b) low Gleason score and high PTPN2 expression, and (c) high Gleason score. In this survival tree, the HR also gradually increased from the left to the right.
To conclude, by using the Gleason score information supplemented with the expression of the FMOD and PTPN2 genes, a stratification of prostate cancer patients into several prognostic subgroups with significantly different hazard ratios (low, medium, and high risk of progression) was achieved.
The results of recursive partitioning (Figures 3B and 4B) were further supplemented by survival curves obtained using the Kaplan-Meier method for subgroups from each decision node. The difference in survival for subgroups defined by the left and the right branches of the decision node 1 (the Gleason score) is shown in our previous publication [26]. The subgroups from decision node 2 are shown in Figure 5 (FMOD expression) and Figure 6 (PTPN2 expression). The log-rank test p-value was statistically significant (p < 0.001) for both genes (Figures 5 and 6).

Figure 6. Difference in patients' survival for the left and the right branches of the second decision node from Figure 4B, which uses the PTPN2 gene expression as a separation criterion.

Discussion
The driving processes in prostate cancer progression encompass intertwined actions of several signaling pathways, which are potentiated by genetic and epigenetic alterations, changes in gene expression, and post-transcriptional and post-translational modifications [1,2,48]. However, although a large amount of data exists regarding these processes, one of the greatest barriers in prostate cancer treatment is still the inability to precisely foresee the course of the disease and, therefore, to define the risk subgroups that would guide the treatment options. In our previous work, we contributed to the efforts to reveal biomarkers of prostate cancer PFS prognosis [26]. In that work, the Gleason score emerged as the most informative prognostic factor among all the clinical and the gene expression variables studied. Herein, we extended the analysis to the ECM- and IACs-related genes. Our results are based on the TCGA PRAD dataset, and they dissect the differential expression of ECM- and IACs-related genes and their value as prognostic factors in the progression-free survival of prostate cancer patients.
In this article, based on the TCGA PRAD dataset, ECM (matrisome) gene expression appeared to be highly aberrant in prostate cancer tissue. The enrichment analysis on the DEGs showed that the GO term 'extracellular matrix' (N = 35 genes) was among those that were enriched in the Gene Ontology Cellular Component (GO CC) category (Figure 2). Genes from all the ECM categories (Table 2) showed changed expression. As mentioned in the Results Section, with such a comprehensive change in the expression of individual components, it is hard to speculate which of the general characteristics of the ECM are changed in prostate cancer. However, it is known from the literature that the ECM of cancers generally increases in density and mechanical stiffness [47].
In a search for prognostic factors among the ECM-related genes, the expression of the FMOD gene appeared to refine the prognosis based on the Gleason score. Namely, the patients with a Gleason score lower than 9 were further subdivided into two prognostic subgroups based on the FMOD gene expression. The patients with high FMOD expression had better survival (Figure 5). The FMOD gene encodes the fibromodulin protein, which belongs to the family of small interstitial proteoglycans [61]. This protein interacts with type I and type II collagen fibrils and inhibits fibrillogenesis in vitro. Therefore, fibromodulin may play a role in the assembly of the extracellular matrix. It may also regulate TGF-beta activities by sequestering TGF-beta into the extracellular matrix (www.genecards.org, accessed on 1 December 2022). In the prostate cancer setting, FMOD was shown to be overexpressed in human prostate epithelial cancer cell lines in vitro. Additionally, the authors showed that the cancerous tissue expressed significantly higher levels of intracellular fibromodulin compared to matched, benign tissue from the same patients. Higher levels were also detected in cancerous tissue in comparison to tissue from patients with only a benign disease [62,63]. Furthermore, in a study based on Brazilian individuals, FMOD gene variants were suggested to be potential biomarkers for prostate cancer and benign prostatic hyperplasia [64]. However, a recent article showed that higher FMOD expression was associated with better disease-free survival of prostate cancer patients, a finding that agrees with our results [65]. This would mean that, although the cancerous tissue has higher FMOD expression than non-transformed prostate tissue, within prostate cancer, higher FMOD expression carries a better prognosis.
Here, it should be remembered that, besides FMOD, our analysis showed that the COL1A1, COL3A1, and COL5A2 genes also had high informative value in the prognosis of PFS when individually analyzed (Figure 3A). It would be interesting to investigate their functional role and to delineate whether FMOD and these three collagen genes interact in the architecture of certain prostate cancer phenotypes that affect the patients' survival.

Integrin Adhesion Complexes-Related Genes Expression and Prognostic Significance in Progression-Free Survival of Prostate Cancer Patients
Integrin receptors are involved in almost every process of cancer formation and progression [66]. Therefore, it is not surprising that numerous preclinical studies on targeting integrins in different cancer types have yielded encouraging results. However, there are still obstacles in translating these results into the clinic [67,68]. In addition to the known difficulties [69,70], in our recent paper [19], we suggested that integrin crosstalk could complicate and undermine the effects of targeting integrins. Integrin crosstalk is a phenomenon in which the modulation of the activity and/or expression of one integrin (subunit or heterodimer) affects the activity and/or expression of other integrin(s) (subunit(s) or heterodimer(s)). To circumvent integrin crosstalk while keeping the advantages of targeting the integrin pathway, we suggest that the analysis of proteins downstream of integrin ligation and activation could reveal effective therapeutic targets. Therefore, in this paper, we focused on integrin adhesion complexes (IACs) in a search for potential prognostic biomarkers and therapeutic targets in prostate cancer. IACs are essential adhesion structures composed of proteins, and their components were also detected outside of Metazoa, confirming their ancient evolutionary origin [71]. Several types of IACs are recognized [21], including nascent adhesions [72], focal complexes [73], focal adhesions [74], fibrillar adhesions [75], reticular adhesions [76], and hemidesmosomes [77]. Although IACs vary in their appearance, size, dynamics, and composition, the core components of the integrin adhesome have been identified by several groups [35-37]. The integrin adhesome consists of proteins that are affiliated with the structure and signaling activity of integrin-mediated adhesions [36]. By analyzing the core integrin adhesome components, we found that their expression is highly perturbed in prostate cancer.
Namely, the category 'integrin complex' appeared among the top functionally enriched Gene Ontology Cellular Component (GO CC) terms (Figure 2). Furthermore, we detected 44/264 (16.7%) adhesome genes whose expression was significantly changed by ≥2 times (either up- or down-regulated) in prostate cancer in comparison to non-transformed prostate tissue (Table 2). An important observation is that the majority of these genes are downregulated in the prostate cancer tissue. Their functional role and the potential compensatory mechanisms remain to be investigated.
In addition to changes in gene expression, we found that the expression of some of the adhesome genes was correlated with PFS in univariate and multivariate approaches. Examples of genes implicated in the univariate approach are PTPN2, RPL23A, MRTO4, PTPN1, and BRIX1 (Figure 4A). However, except for PTPN2, those genes did not emerge on the survival tree. This would mean that the expression of these genes is probably correlated with some of the variables that already hold a prognostic value, such as the Gleason score. It is interesting to note that three genes (PTPN1, PTPN2, and PTPN12) from the PTPN family of protein tyrosine phosphatases emerged in the univariate analysis. Tyrosine phosphorylation is an important post-translational modification in cell adhesion that is dynamically regulated by protein tyrosine phosphatases and kinases [78]. While PTPN1 [79-81] and PTPN12 [82,83] were implicated in prostate cancer biology, the involvement of PTPN2 in prostate cancer is not documented [84]. Regarding integrin signaling, complex roles for PTPN1 [85-88], PTPN2 [89], and PTPN12 [90] have been documented. Nevertheless, PTPN proteins have other, broader roles [84]. Therefore, it cannot be ruled out that some of these other roles are also important for the biology of prostate cancer.
The PTPN2 gene expression appeared on the survival tree as a variable that refines the PFS prognosis of patients with a lower (<9) Gleason score. Our results suggested that its higher expression carries a poorer prognosis. PTPN1 and PTPN2 are highly related PTPs [84], but, as mentioned previously, PTPN2 has not been implicated in prostate cancer. However, PTPN2 is a key predictor of prognosis for pancreatic adenocarcinoma, and its higher expression is associated with a poor prognosis [91]. Overexpression of PTPN2 also predicted poor survival in clear cell renal cell carcinoma [92], which agrees with our results. In contrast, low PTPN2 expression was associated with poor overall survival in ovarian serous cystadenocarcinoma [93], indicating its versatile roles in different cancer types. The connection of PTPN2 with integrin signaling was confirmed by several articles, which indicate activation of PTPN2 by integrins. Namely, it was recently shown that the catalytic activity of PTPN2 is auto-regulated by its intrinsically disordered tail and activated by ITGA1 [89]. An earlier article also documented that PTPN2 is activated by the integrin ITGA1/ITGB1 and that it subsequently dephosphorylates EGFR and negatively regulates EGF signaling [94]. In line with this, the same group showed that PTPN2 activity was induced upon integrin-mediated binding of endothelial cells to the collagen matrix [95]. However, the potential role of PTPN2 activation by integrins in prostate cancer remains to be investigated. To conclude, PTPN2 might be a potential target in prostate cancer treatment, and targeting it is achievable because PTPN2 inhibitors are available.
An interesting observation is that neither ECM- nor IACs-related genes defined risk subgroups for the Gleason score ≥ 9, according to the conservative complexity parameters that we selected. It could be that high Gleason score cancers show aberrant expression of ECM- and IACs-related genes that is of great importance for cancer progression and is therefore common to all such patients. This would mean that aberrant expression of ECM- and IACs-related genes underlies all high Gleason score (≥9) cancers.

Methodological Considerations
In this article, we used recursive partitioning to define the risk subgroups of prostate cancer patients in an analysis that included clinical information and gene expression data. Recursive partitioning is the method recommended by the AJCC for the analysis of prognostic studies [42,43]. Due to the heterogeneity of prostate cancer, it is to be expected that this method describes its diversity better than the Cox regression analysis, which is used by the majority of papers dealing with similar questions. Moreover, the survival tree obtained by recursive partitioning is easier to interpret than the Cox regression results. Therefore, we believe that our approach is more appropriate for analyzing prostate cancer survival data.

Conclusions
The ECM is the first frontier of the cell towards its surroundings, and it is among the main determinants of the cell's behavior. Accordingly, important roles of the ECM in cancer development, progression, and prognosis have been documented. By using the TCGA PRAD dataset, in this article, the expression of ECM genes in prostate cancer was analyzed and correlated with progression-free survival of prostate cancer patients. We revealed that the expression of ECM-related genes changed in prostate cancer. Moreover, the ECM-related genes showed prognostic significance for the prostate cancer patients, who were stratified according to the Gleason score. Our results confirmed the important roles of the ECM-related genes in prostate cancer and suggested potential biomarkers of prostate cancer progression from the list of the ECM-related genes.
Integrins are among the main receptors for the ECM ligands. Several unique characteristics, including integrin crosstalk and the formation of IACs, make integrins exceptional among the signaling receptors. Therefore, their roles in tumor formation, progression, and drug resistance were noted early on [96]. In this paper, we showed that the expression of integrin and IAC genes changed in prostate cancer. Moreover, some of these genes appear in univariate and multivariate approaches in the prognosis of PFS, suggesting their potential role in the discovery of biomarkers of prostate cancer progression. Consequently, our results support the early notion that considered integrins (and downstream proteins) attractive therapeutic targets, a strategy that is still hotly debated [68,70].

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines11072006/s1, Table S1: A list of ECM- and IACs-related genes used in this study.

Conflicts of Interest:
The authors declare no conflict of interest.