Electrostatic Complementarities of Glioblastoma-Resident T-Cell Receptors and Cancer Testis Antigens Linked to Poor Outcomes and High Levels of Sphingosine Kinase-2 Expression

Simple Summary The chemical complementarity of glioblastoma, tumor-resident T-cell receptors and cancer testis antigens were associated with a worse outcome. Additionally, the high expression of immune marker and low expression of apoptosis genes were associated with a high T-cell receptor–cancer testis antigen chemical complementarity and a worse outcome. In sum, T-cell receptor recombination reads from exome files have the potential to aid in glioblastoma prognoses and may provide opportunities to detect unproductive immune responses. Abstract Introduction. Glioblastoma (GBM) is the most aggressive primary brain tumor in adults. Despite a growing understanding of glioblastoma pathology, the prognosis remains poor. Methods. In this study, we used a previously extensively benchmarked algorithm to retrieve immune receptor (IR) recombination reads from GBM exome files available from the cancer genome atlas. The T-cell receptor complementarity determining region-3 (CDR3) amino acid sequences that represent the IR recombination reads were assessed and used for the generation of chemical complementarity scores (CSs) that represent potential binding interactions with cancer testis antigens (CTAs), which is an approach particularly suited to a big data setting. Results. The electrostatic CSs representing the TRA and TRB CDR3s and the CTAs, SPAG9, GAGE12E, and GAGE12F, indicated that an increased electrostatic CS was associated with worse disease-free survival (DFS). We also assessed the RNA expression of immune marker genes, which indicated that a high-level expression of SPHK2 and CIITA genes also correlated with high CSs and worse DFS. Furthermore, apoptosis-related gene expression was revealed to be lower when the TCR CDR3-CTA electrostatic CSs were high. Conclusion. Adaptive IR recombination reads from exome files have the potential to aid in GBM prognoses and may provide opportunities to detect unproductive immune responses.


Introduction
Glioblastoma (GBM) remains incurable and has benefited little from new medical treatments since the current treatment approach was established in 2002 [1], although there have been some surgical innovations since that time. Because of the very limited success of current approaches, more recently emphasis has been placed on establishing GBM subtypes with the expectation of moving more in the direction of personalized medical treatments. In particular, the four subtypes of proneuronal, neural, classical, and mesenchymal [2] 2 of 10 GBM have gained widespread acceptance with regard to consistent mutation profiles. Meanwhile, immune checkpoint blockade (ICB) has been a disappointment since the first attempts in 2013, with several exceptions related to subdividing GBM categories and with an eye towards one exemplary case of a mutator phenotype that results in possibly complete but, at least, very long-term remissions [3].
Several reasons have been proposed for the lack of improvement using ICB. Glioblastomas contain relatively few T-cells and an abundance of tumor-associated macrophages (TAMs), which enhance tumor development and promote treatment resistance. In contrast, other aggressive tumors have high T-cell infiltration and tend to be more responsive to ICB. Another explanation for GBM's lack of responsiveness to ICB could be due to chemotherapy and corticosteroids, which are commonly used to manage GBM symptoms. For example, the immunosuppression caused by chemotherapy could negatively impact the effectiveness of ICB. Additionally, corticosteroids are commonly prescribed to manage cerebral edema and are anti-inflammatory, which would likely decrease lymphocyte tumor infiltration. GBM tumors can also adapt to ICB by upregulating different checkpoint protein ligands [4].
The above considerations indicate a need for additional immunological parameters for personalized treatments, including ICB treatments and possibly other immunotherapies or even anti-inflammatory therapies [5]. Specifically, for GBM, cancer testis antigens (CTAs) have come to light as possible targets for immunotherapy [6], with one very relevant indication that such CTAs can be specifically detected in GBM patients and not in healthy controls [7]. CTAs are generally thought to be commonly expressed in many cancers due to increased levels of gene demethylation in cancer cells, either indirectly, due to the inability to reassemble heterochromatic regions representing the CTA genes because of rapid cell division, or due to other, as yet not fully appreciated stem cell-like features of cancer cells that specifically reduce heterochromatin in regions of CTA genes. Thus, in this report, we have applied a recently developed algorithm for assessing the chemical complementarity between immune receptors and CTAs [8] to assess the relationship between such complementarity and DFS and GBM-related gene expression, with results revealing a novel connection between sphingosine kinase-2 (SPHK2) expression and an apparently failed immune response.

Recovery of the TCGA-GBM (phs000178) IR Recombination Reads
The recovery of the TRA and TRB recombination reads from the GBM exome (WXS) files was per NIH dbGaP approved protocol number 6300, and the WXS file mining for these recombination reads has been extensively described [9][10][11]. The software for recombination read recovery, including readme files, is available at https://github.com/ bchobrut-USF/blanck_group and a container version is available at https://hub.docker. com/r/bchobrut/vdj. An updated version of the software, with some added technical conveniences, is available at https://github.com/kcios/2021. The data representing the entire collection of TRA and TRB recombination reads extracted from the TCGA-GBM WXS files are available in supporting online material (SOM) Table S1. The nucleotide sequences themselves are not available due to controlled-access restrictions; however, these reads will be made available by the corresponding author to investigators with proper dbGaP controlled-access approvals.

Construction and Use of the Adaptive Match Web Tool
The chemical complementarity scoring is based on [8] and was facilitated by the construction of an original web tool, adaptivematch.com [5,12,13], which is publicly accessible and has been extensively benchmarked in [13]. The electrostatic complementarity scores (CSs) are generated by aligning the adaptive IR CDR3 AAs with a candidate antigen peptide sequence. The scoring process is described extensively in [8], including in the SOM of ref. [8], which includes algorithmic details and an mp4 video representing a user-friendly description. Basically, a positive charge in one entity directly across from a negative charge in the other entity would increase the score significantly, whereas proximity without direct alignment would represent a reduced contribution to a complementarity score increase. The adaptivematch.com web tool includes instructions for preparation of input files. Additionally, three example input files and two example output files are provided in the SOM as Tables S2-S6, representing CS calculations with the CDR3s traceable to the TCGA-GBM WXS files.

Gene Expression Analysis
The TCGA-GBM gene expression assessments were performed using RNAseq values available via cbioportal.org. The RNAseq values were plotted against the electrostatic CSs to determine Pearson's correlation coefficients and p-values. Again, this process is facilitated by adaptivematch.com, with the uploading of the csv files as detailed in the web tool instructions for preparation of input files containing RNAseq values and case IDs. For this study, the RNAseq values represented the Firehose legacy version of the TCGA-GBM dataset.

Analysis of the Clinical Proteomic Tumor Analysis Consortium (CPTAC, phs001287) Dataset
To assess the correlation of CIITA RNAseq values with the CSs for TCR CDR3-SPAG9 Fragment 6, first, the adaptive IR recombination reads were obtained from the GBM subset of the CPTAC-3 study, by downloading the RNAseq files representing that dataset to USF research computing, via dbGaP approved protocol number 31752, and processing the RNAseq files with the software described and provided above, via the github links. Then, the CIITA RNAseq values were obtained by merging the files for each case ID for open access, "STAR counts", available via the genomic data commons web tool.

Results
To determine whether the electrostatic CSs (Methods) for GBM tumor-resident TCR CDR3s and CTAs could represent survival distinctions, we tested a set of CTAs, with results indicating that the upper 50th percentile of the TCR CDR3-CTA CSs for SPAG9, GAGE12F, and GAGE12G represented a worse DFS probability ( Figure 1; Table S7). To further investigate these results, we assessed the RNA expression of a panel of immune marker genes for the correlation of their expression levels with the electrostatic CSs, with results indicating that CIITA and SPHK2 expression correlated with the TCR CDR3-CTA CSs (Figure 2), while expression of the B-cell marker gene CD19 inversely correlated with the CSs (Table 1).    Table 1; Table S8).   Table 1; Table S8).  Table 1; Table S8).
We next considered the possibility that GBM tumor samples representing higher TC CDR3-CTA CSs and worse DFS would also have a lower level of expression of apoptosis related genes. We thus tested a panel of 22 apoptosis-effector genes and, most strikingly the results indicated that AIFM3, which has a highly brain-specific expression patter (http://genome.ucsc.edu/cgibin/hgGene?hgg_gene=ENST00000399167.6&hgg_chrom=chr22&hgg_start=20965107&h gg_end=20981360&hgg_type=knownGene&db=hg38), inversely correlated with the hig TCR CDR3-CTA CSs (Figure 3, Table 2).  Table 2; Table S9).   Table 2; Table S9). We next divided the SPAG9 AA sequences into approximately 18 equal segments and tested each segment for a correlation of the TCR CDR3-SPAG9 fragment CSs with survival, with the results indicating that the SPAG9 Fragment 6 (KHIEVQVAQETRNVST-GSAENEEKSEVQAIIESTPELDMDKDLSGYKGSSTPTKGIENKA) in particular reproduced the strong inverse association of the TCR CDR3-CTA CSs with DFS and reproduced the same immune marker and apoptosis marker correlations, or inverse correlations (Figure 4; Tables 3 and 4).
We next divided the SPAG9 AA sequences into approximately 18 equal segments and tested each segment for a correlation of the TCR CDR3-SPAG9 fragment CSs with survival, with the results indicating that the SPAG9 Fragment 6 (KHIEVQVAQETRNVST-GSAENEEKSEVQAIIESTPELDMDKDLSGYKGSSTPTKGIENKA) in particular reproduced the strong inverse association of the TCR CDR3-CTA CSs with DFS and reproduced the same immune marker and apoptosis marker correlations, or inverse correlations (Figure 4; Tables 3 and 4).  Table 3; Table S7. We next considered the possibility that the immune gene expression markers could represent survival distinctions for the entire GBM dataset rather than only for the samples with TRA and TRB recombination read recoveries. We focused on CIITA, in the immune marker case, because CIITA represented the most statistically significant correlations with  Table 3; Table S7. We next considered the possibility that the immune gene expression markers could represent survival distinctions for the entire GBM dataset rather than only for the samples with TRA and TRB recombination read recoveries. We focused on CIITA, in the immune marker case, because CIITA represented the most statistically significant correlations with the TCR CDR3-CTA CSs. The results indicated that upper and lower 50th percentiles for CIITA RNAseq values indicated a DFS distinction, with worse DFS probability associated with high CIITA RNAseq values ( Figure 5).   Table S10).
Finally, we sought to replicate the correlation of the TCR CDR3-SPAG9 Fragment 6 CS with the CIITA gene expression using the CPTAC GBM dataset. In this case, the CSs were based on TCR recombination reads obtained from GBM RNAseq files, rather than WXS files, which were the source of the TCR recombination reads for the TCGA-GBM dataset analyses above. The results indicated that the CPTAC GBM TCR CDR3-SPAG9 Fragment 6 CSs did indeed correlate with the expression of CIITA (R value = 0.388, p-value = 0.0002). As indicated in Methods, this assessment was facilitated by the use of the adaptivematch.com web tool. The input files for the web tool for this CPTAC-related analysis are in the SOM as Tables S11-S13. The output data are available in Table S14.

Discussion
The data above indicated that a higher electrostatic CS for TCR-CDR3-GBM CTAs and higher CIITA expression negatively correlated with GBM DFS probabilities, which emphasizes the need to more fully understand the potential negative impacts of the immune system on outcomes. While it is clear that an anti-tumor immune response in certain settings will facilitate tumor eradication, it is equally clear that certain inflammatory settings predispose to cancer growth. AIFM3, which is expressed almost exclusively in the brain, was expressed at a higher level in tumors where the TCR CDR3-CTA CSs were low, which is consistent with the idea that a high TCR CDR3-CTA CS is not representative of tumor cell killing by T-cells, i.e., lower DFS probabilities were consistent with the highly likely lack of apoptosis in the tumor samples. Thus, the important question becomes, is the inflammatory setting indicated by high TCR CDR3-CTA CSs supporting rather than reducing tumor growth?
The above results have two important limitations. First, there is no other reported approach, other than the in silico approach reported here, that supports the in vivo interaction of a TCR CDR3 with the CTAs that were identified with this in silico immunoinformatics approach. Second, there was no identification of the source of CIITA, which was identified as a pan-GBM dataset biomarker for a worse DFS probability. However, while additional work is needed to support the in silico indications of TCR and CTA binding, the algorithm used here will likely have use for patient risk stratification. As for CIITA, generally, when CIITA is expressed by tumor cells, which is very common in melanoma, it has a strongly anti-apoptotic function [17], consistent with the worse outcome reported  Table S10).
Finally, we sought to replicate the correlation of the TCR CDR3-SPAG9 Fragment 6 CS with the CIITA gene expression using the CPTAC GBM dataset. In this case, the CSs were based on TCR recombination reads obtained from GBM RNAseq files, rather than WXS files, which were the source of the TCR recombination reads for the TCGA-GBM dataset analyses above. The results indicated that the CPTAC GBM TCR CDR3-SPAG9 Fragment 6 CSs did indeed correlate with the expression of CIITA (R value = 0.388, p-value = 0.0002). As indicated in Methods, this assessment was facilitated by the use of the adaptivematch.com web tool. The input files for the web tool for this CPTAC-related analysis are in the SOM as Tables S11-S13. The output data are available in Table S14.

Discussion
The data above indicated that a higher electrostatic CS for TCR-CDR3-GBM CTAs and higher CIITA expression negatively correlated with GBM DFS probabilities, which emphasizes the need to more fully understand the potential negative impacts of the immune system on outcomes. While it is clear that an anti-tumor immune response in certain settings will facilitate tumor eradication, it is equally clear that certain inflammatory settings predispose to cancer growth. AIFM3, which is expressed almost exclusively in the brain, was expressed at a higher level in tumors where the TCR CDR3-CTA CSs were low, which is consistent with the idea that a high TCR CDR3-CTA CS is not representative of tumor cell killing by T-cells, i.e., lower DFS probabilities were consistent with the highly likely lack of apoptosis in the tumor samples. Thus, the important question becomes, is the inflammatory setting indicated by high TCR CDR3-CTA CSs supporting rather than reducing tumor growth?
The above results have two important limitations. First, there is no other reported approach, other than the in silico approach reported here, that supports the in vivo interaction of a TCR CDR3 with the CTAs that were identified with this in silico immunoinformatics approach. Second, there was no identification of the source of CIITA, which was identified as a pan-GBM dataset biomarker for a worse DFS probability. However, while additional work is needed to support the in silico indications of TCR and CTA binding, the algorithm used here will likely have use for patient risk stratification. As for CIITA, generally, when CIITA is expressed by tumor cells, which is very common in melanoma, it has a strongly anti-apoptotic function [17], consistent with the worse outcome reported here. However, CIITA, of course, could also be expressed by microenvironment antigen-presenting cells, consistent with a strong TCR CDR3-CTA interaction.
The data above also indicated that a higher level of SPHK2 was associated with the higher TCR CDR3-CTA CSs. Sphingosine 1-phosphate (S1P) plays a role in various cellular Biology 2023, 12, 575 8 of 10 processes, including cell survival, proliferation, and apoptosis. Both mammalian sphingosine kinases, SPHK2 and sphingosine kinase 1 (SPHK1), generate S1P from a sphingosine precursor [18]. Both SPHK2 and SPHK1 have the capacity to facilitate tumorigenesis. However, SPHK2 has a more complex role, which is dependent on its subcellular localization. When SPHK2 is translocated to the plasma membrane, pro-proliferative signaling occurs, while translocation to internal organelles enhances anti-proliferative functions [18]. The cellular localization of SPHK2 to internal organelles is mediated by cytoplasmic dynein 1. It has been shown that GBM demonstrates lower cellular cytoplasmic dynein 1, and in turn, higher SPHK2 in the plasma membrane. In addition, it should be kept in mind that sphingosine-1-phosphate is a chemoattractant for T-cells [19,20], raising the question of whether SPHK2-generated sphingosine-1-phosphate in turn leads to a high infiltration of T-cells and ultimately to a pro-tumor, inflammatory environment?

Conclusions
A significant portion of GBM patients is likely to have only non-functional adaptive immune responses that may be actively supporting tumor growth.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/biology12040575/s1, Table S1: Data representing the entire collection of TRA and TRB recombination reads extracted from the TCGA-GBM WXS files; Table S2: adaptivematch input file, CTAs; Table S3: adaptivematch input file, GBM CDR3s; Table S4: adaptivematch input file, GBM DFS data. Table S5: adaptive match output file, summary (Note, electrostatic indicated as "ncpr" in this output file); Table S6: adaptive match output file, raw electrostatic complementarity scores (Note, electrostatic indicated as "ncpr" in this output file); Table S7: Figures 1 and 4A case IDs used for KM analyses; Table S8: Figure 2 input and output data (SPHK2 and CIITA correlated with CDR3-SPAG9 CSs); Table S9: Figure 3 input and output data; Table S10: Figure 5 case IDs used for KM analyses; Table S11: Adaptivematch (AM) input file, CPTAC TRA, TRB CDR3s; Table S12: Adaptivematch (AM) input file, spag9 fragment 6; Table S13: Adaptivematch (AM) input file, CPTAC CIITA gene expression; Table S14: adaptivematch output file, CIITA RNAseq_CS_correlation, for the CPTAC dataset. Institutional Review Board Statement: IRB Ethical review and approval were waived for this study due to the de-identified features of the data. However, as noted above, the data access and project approvals were required by the dbGaP. The dbGaP project approval number is 6300.
Informed Consent Statement: Not applicable, as the analysis of these data were non-human subject results, as deemed by dbGaP project approval 6300.

Data Availability Statement:
The data presented in this study are available in the supporting online materials or are publicly available at cbioportal.org.