Intronic TP53 Polymorphisms Are Associated with Increased Δ133TP53 Transcript, Immune Infiltration and Cancer Risk

Simple Summary
We investigated the influence of genetic variants called single nucleotide polymorphisms (SNPs) in the TP53 tumour suppressor gene on cancer risk, clinical features and TP53 isoform levels. These SNPs were significantly over-represented in cohorts of mixed cancers versus controls, suggesting they confer increased cancer risk. Heterozygosity at rs1042522(GC) together with heterozygosity at either rs9895829(TC) or rs2909430(AG) confers up to a 5-fold greater risk of developing cancer. The SNP combinations were associated with high Δ133TP53 and TP53β messenger RNA levels, elevated infiltrating immune cells and shorter patient survival for glioblastoma and prostate cancer. The data suggest that the SNPs increase ∆133p53β protein levels, resulting in increased inflammation that contributes to more aggressive cancers.

Abstract
We investigated the influence of selected TP53 SNPs in exon 4 and intron 4 on cancer risk, clinicopathological features and expression of TP53 isoforms. The intron 4 SNPs were significantly over-represented in cohorts of mixed cancers compared to three ethnically matched controls, suggesting they confer increased cancer risk. Further analysis showed that heterozygosity at rs1042522(GC) together with heterozygosity at either of the two intronic SNPs, rs9895829(TC) or rs2909430(AG), confers a 2.34–5.35-fold greater risk of developing cancer. These SNP combinations were found to be associated with shorter patient survival for glioblastoma and prostate cancer. Additionally, these SNPs were associated with tumour-promoting inflammation, as evidenced by high levels of infiltrating immune cells and expression of the Δ133TP53 and TP53β transcripts. We propose that these SNP combinations allow increased expression of the Δ133p53 isoforms to promote the recruitment of immune cells that create an immunosuppressive environment leading to cancer progression.


Introduction
p53 is a powerful tumour suppressor [1], and defects in the p53 pathway are an almost universal hallmark of human cancers [2,3]. Mice deleted for the TP53 gene are highly tumour prone [1], and in Li-Fraumeni syndrome (LFS), where mutations in the TP53 gene are inherited, multiple tumour types occur [4]. Despite this, identifying TP53 mutations alone has limited power in predicting patient outcome [5][6][7][8].
Apart from point mutations and protein-coding transcripts, the TP53 gene contains over 200 single nucleotide polymorphisms (SNPs). One of the most well studied TP53 polymorphisms is rs1042522 (G > C), which changes the amino acid at codon 72 from arginine to proline (R72P). This change in amino acid leads to a structural alteration in the protein, which alters the function of p53 [9][10][11]. For example, p53R72 (R72) induces apoptosis more effectively than p53P72 (P72) [12], whereas P72 appears to induce a greater G1 arrest [9,13]. Furthermore, P72 is more efficient than R72 in activating several p53-dependent DNA repair and other genes, and cells with P72 have a significantly higher DNA repair capacity and reduced genome instability [13,14].
In addition to the rs1042522 SNP, common haplotypes found in the P2 promoter in intron 4 [32] are associated with increased lung cancer risk and poor prognosis [33]. Another study demonstrated that there are 8 known TP53 polymorphisms in 11 common haplotypes that are present within the P2 promoter [32]. Using promoter constructs, this study identified that 2 of the 11 haplotypes markedly increased the baseline expression of the ∆133TP53 family of isoforms [34].
∆133TP53 is one of 4 families of isoforms encoded in the TP53 locus. These isoforms differ at the N-terminus through alternative promoter usage and alternative initiation of translation, and at the C-terminus through alternative splicing (reviewed in detail in [5,6,35]). Several studies have now shown that aberrant expression of some p53 isoforms contributes to diseases, including cancer [5,6,35]. Elevated transcript levels of the ∆133TP53 isoforms are associated with poor patient outcomes in breast [36,37], prostate [38] and colorectal cancers [39], and the ∆133TP53β isoform level is associated with increased immune cell infiltration in glioblastoma [35] and prostate cancer [38]. Other evidence suggests that ∆133p53 isoforms can antagonise canonical full-length p53 tumour suppressor activity and have multiple intrinsic tumour-promoting capacities [40].
Given this background, we investigated the influence of zygosity, in addition to haplotype variation, between the R72P SNP (rs1042522) and two P2 promoter SNPs in intron 4 on cancer risk. We hypothesized that the combination of these SNPs may affect transcription of the ∆133TP53 isoforms. We report that the combination of the R72P SNP and intron 4 SNPs was associated with increased cancer risk, affected transcript levels of ∆133TP53 isoforms, and was associated with increased immune infiltrate and poor survival in glioblastoma and prostate cancer patients.

Intronic SNPs rs9895829 and rs2909430 Are Associated with Increased Cancer Risk
Five SNPs in TP53 (rs1042522, rs35850753, rs1794287, rs9895829 and rs2909430) were genotyped using blood or normal associated tissue from individuals that had cancer [33,34]. Consistent with the literature, our sequencing data showed that rs9895829/rs35850753 and rs2909430/rs1794287 are in high linkage disequilibrium and form haplotype blocks [33,34]. The SNP data for rs9895829 and rs2909430 are reflective of the SNPs at rs35850753 and rs1794287, respectively. Thus, to avoid redundant comparison, we selected rs9895829 and rs2909430 as tagging SNPs for rs35850753 and rs1794287, respectively, for further analyses (Figure 1), along with rs1042522. The study cohort comprised individuals with glioblastoma (GBM, n = 89), prostate (PCa, n = 122) and breast (BrCa, n = 389) cancers. Ethnically matched control cohorts were derived from the 1000 genomes [41] (Control Cohort 1), the Caucasian case-control cohort from Mechanic et al. [33] (Control Cohort 2) and the Dunedin Multi-Disciplinary Health and Development Study [42] (Control Cohort 3). The frequency of rs1042522 was similar in the study population and control populations (Table 1). However, the frequencies of the other two SNPs (rs9895829 and rs2909430) were significantly different between the study and control populations (Table 1). These results suggest that these two intronic SNPs are associated with an increased risk of cancer.
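The choice of tagging SNPs above rests on pairwise linkage disequilibrium. As an illustration only, the following Python sketch computes the disequilibrium coefficient D and r² for two biallelic loci from phased haplotype counts. The function name and argument layout are hypothetical, and no counts from the study are used.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) for two biallelic loci from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are the numbers of haplotypes carrying each
    allele pair (A/a at locus 1, B/b at locus 2). Illustrative helper only.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / n                 # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n         # allele frequency of A
    p_B = (n_AB + n_aB) / n         # allele frequency of B
    d = p_AB - p_A * p_B            # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r2 is undefined when a locus is monomorphic")
    return d, d * d / denom
```

An r² close to 1 means that one SNP of the pair carries essentially the same information as the other, which is the property a tagging SNP relies on.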
Consistent with previous studies [9,24,25,26], cloning and sequencing the region spanning exons 4-5 in several tumour types showed that the minor allele (MA) of the intronic SNPs was primarily found in combination with the "C" allele for rs1042522 (Figure S1). We therefore tested each intron 4 SNP in combination with rs1042522 (rs22) for association with cancer risk. To do this, we compared the study cohort to control cohort 1, since it was the only one with the intronic SNP data. Individuals heterozygous for rs1042522(CG/GC) are hereafter referred to as rs22(GC), those heterozygous for rs9895829(TC/CT) as rs29(TC) and those heterozygous for rs2909430(AG/GA) as rs30(AG). Results from the study cohort compared to control cohort 1 indicated that individuals with the rs22(GC)+rs30(AG) SNP combination had a 2.34-fold increased risk of cancer, while those with the rs22(GC)+rs29(TC) SNP combination had a 5.35-fold increased risk of cancer (Table 2). None of the individuals in the control cohorts that were homozygous for rs22(GG) were heterozygous for rs29(TC), but this combination was present in nearly half the study cohort (Table 2). Similarly, only one individual that was homozygous for rs22(GG) was heterozygous for rs30(AG) in control cohort 1, compared to 182 individuals with the SNP combination in the study cohort (Table 2). These results suggest that individuals who are heterozygous at rs22(GC) and heterozygous for either of the two intronic SNPs, rs29(TC) or rs30(AG), have a significantly higher likelihood of having cancer.
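The fold-risk figures above are odds ratios from 2 × 2 tables of genotype combination against case/control status. The sketch below is a minimal, standard-library Python illustration of how a sample odds ratio and a two-sided Fisher's exact p-value can be obtained from such a table. The function names are hypothetical, the Baptista-Pike confidence interval used in the study is not reproduced, and the counts in the usage note are a textbook example rather than study data.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]].

    Rows: carriers / non-carriers of a genotype combination.
    Columns: cases / controls. Returns inf when b*c is zero.
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def pmf(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = pmf(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

For the textbook table [[3, 1], [1, 3]], `odds_ratio(3, 1, 1, 3)` gives 9.0 and `fisher_exact_two_sided(3, 1, 1, 3)` gives 34/70, about 0.486.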

Association of TP53 Polymorphisms with Patient Survival
To determine if the combinations of TP53 SNPs were associated with cancer patient outcomes we performed survival analyses on the GBM and PCa cohorts after stratification for SNP combinations. Results for GBM patients show that individuals with the rs22(GC)+rs29(TC) SNP combination had significantly shorter overall survival than individuals with rs22(GC), rs22(CC) and rs22(GG) combinations (Hazard ratio: 2.6-7.16, Figure 2a,b). Similarly, individuals with the rs22(CC)+rs29(TC) or rs22(GC)+rs30(AG) SNP combinations had significantly shorter overall survival than individuals with the rs22(GG) SNP alone (Hazard Ratio: 3.3-3.6, Figure 2a,b).
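Survival differences of this kind are compared with a log-rank test. As a hedged illustration, the following standard-library Python sketch implements the two-group log-rank statistic and its p-value from a chi-square distribution with one degree of freedom. The function name and input layout are hypothetical, and hazard ratios, which the study reports separately, are not computed here.

```python
from math import erfc, sqrt


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p_value).

    times_*  : follow-up times for each patient in a group
    events_* : 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t in each group.
        n_a = sum(1 for u, _, g in data if g == 0 and u >= t)
        n_b = sum(1 for u, _, g in data if g == 1 and u >= t)
        # Events occurring exactly at time t.
        d_a = sum(1 for u, e, g in data if g == 0 and u == t and e)
        d_b = sum(1 for u, e, g in data if g == 1 and u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value
```

With more than two genotype groups, as in the stratification above, each pairwise comparison would call the function once; correcting for multiple comparisons is a separate decision not addressed by this sketch.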

Association of TP53 Polymorphisms with Clinical Parameters of Cancer Patients
GBM can be stratified by telomere maintenance mechanisms. Tumours can be telomerase (TEL) positive or positive for the Alternative Lengthening of Telomeres (ALT) mechanism [43,44]. In addition, GBMs can be further stratified based on the content of CD163+ macrophages (Mφ) [45,46]. Both parameters affect patient survival, with tumours that are TEL positive and have a high Mφ content (TELM) having the poorest outcome [43,45,46]. We therefore assessed the influence of the three TP53 SNP combinations on these parameters. Our analysis showed that individuals with rs22(GC)+rs29(TC) were more likely to have TELM GBMs compared to rs22(GC) and rs22(CC) (Table S1). Similarly, individuals with rs22(GC)+rs30(AG) were more likely to have TELM GBMs compared to rs22(CC) (Table S2).
We then assessed the association of these SNPs with common clinical parameters of PCa. Our results showed that none of these genotypes was significantly associated with prostate-specific antigen (PSA) levels, whilst individuals with rs22(CC) genotype were significantly less likely to be associated with a Gleason score of ≥7 and a CAPRA score of >3 compared to individuals with rs22(GC) and rs22(GC)+rs30(AG) genotypes (Table S2).
The presence of immune cells that promote an immunosuppressive environment within PCa has been associated with advanced PCa [47,48]. Consistent with this, we previously demonstrated that CD3+ T cells can predict poor PCa outcomes with 79% accuracy and CD163+ Mφ cells with 68% accuracy [38]. Thus, we assessed the influence of the three TP53 SNP combinations on immune cell content. Results show that tumours from patients with the SNP combination rs22(GC)+rs30(AG) had significantly more CD163+ Mφ (median 2.28-2.56 fold) compared to rs22(GC)+rs29(TC) and rs22(CC) (Figure 3b and Table S3). Individuals with rs22(GC) and rs22(GG) also had tumours with 2-fold elevated median CD163+ Mφ counts compared to rs22(GC)+rs29(TC), which was also significant (Figure 3b and Table S3). However, there was no significant difference between the median CD163+ Mφ counts in the tumours of patients with rs22(GC)+rs30(AG) compared to those with rs22(GC), or of patients with rs22(GC)+rs29(TC) compared to those with the rs22(CC) genotype (Figure 3b and Table S3).
Next, we examined the effect of the different TP53 SNP combinations on the amount of CD3+ T cells in PCa. Results showed that patients with SNP combination rs22(GC)+rs30(AG) had significantly elevated CD3+ T cells compared to patients with other SNP combinations (Figure 3c and Table S3). Similarly, individuals with SNP combination rs22(GC)+rs29(TC) had significantly elevated CD3+ T cells (Figure 3c and Table S3). Finally, individuals with rs22(GC) and rs22(GG) had significantly elevated median levels of CD3+ T cells compared to rs22(CC) (Figure 3c and Table S3). Thus, similar to the GBM cohort, SNP combinations associated with poor PCa patient outcomes were also associated with a higher content of inflammatory cells.

Association of TP53 Polymorphisms with ∆133TP53 and TP53β Isoform Expression in Cancers
High levels of the ∆133TP53β transcript are associated with an inflammatory phenotype [38,49] and poor outcome [38,39]. To test for an association of the SNP combinations present in the P2 promoter with expression data of the TP53 isoforms, we stratified the GBM and PCa by SNP combinations and published isoform expression data [38,39,49]. Results showed that individuals with rs22(GC)+rs30(AG) SNP combination had tumours with significantly elevated median levels of ∆133TP53 and TP53β mRNA compared to rs22(GC), rs22(CC), and rs22(CC)+rs30(AG) (Figure 4a,b). Interestingly, these individuals also had significantly elevated levels of ∆133TP53 mRNA alone compared to those with rs22(GG) (Figure 4a). Additionally, individuals with rs22(GC)+rs29(TC) SNP combinations had tumours with significantly elevated median levels of ∆133TP53 and TP53β mRNA compared to rs22(CC) alone (Figure 4a,b). Finally, none of the three TP53 SNPs was associated with altered mRNA levels of other TP53 transcripts (Figure 4c-e). Taken together, these results suggest that the presence of the minor allele for either of the two intronic SNPs in combination with heterozygous rs22 SNP is associated with elevated levels of the ∆133TP53 family and the TP53β splice variant in these tumour types.

Association of TP53 Polymorphisms with ∆133TP53 and TP53β Isoform Expression in Normal Cells
To test if the combinations of SNPs are associated with basal expression of ∆133TP53 and TP53β mRNAs as a potential mechanism of cancer predisposition, as suggested by Bellini et al. [34], we measured the expression of all TP53 transcripts in 33 breast fibroblast cell lines, stratified by SNP combinations. We combined rs22(GC)+rs30(AG) with rs22(GC)+rs29(TC), and rs22(CC)+rs29(TC) with rs22(CC)+rs29(TC)+rs30(AG), as there was only 1 cell line with the former combination and 2 cell lines with the latter combination. Thus, we stratified the isoform expression into 4 groups as follows: rs22(GC), rs22(GC)+rs29(TC)/rs30(AG), rs22(CC)+rs29(TC)/rs30(AG) and rs22(GG). Consistent with the results from the tumours, cell lines with the rs22(GC)+rs29(TC)/rs30(AG) combination had the highest median expression of ∆133TP53 and TP53β mRNAs (Figure 5a,b). Interestingly, both rs22(GC) and rs22(GG) also had a significantly higher median expression of ∆133TP53 and TP53β mRNAs compared to cell lines with rs22(CC)+rs29(TC)/rs30(AG) (Figure 5a,b). The basal expression of other TP53 transcripts was not significantly different (Figure 5c-e). Results from these cell lines provide further evidence of an association between these SNP combinations and increased mRNA expression of ∆133TP53 and TP53β isoforms.
Figure legend (fragment): red line, median; whiskers, interquartile range. Significance was determined using Welch's t-test. * p < 0.05, *** p < 0.001 and **** p < 0.0001.

Discussion
We investigated the influence of TP53 SNPs in exon 4/intron 4 on cancer risk, clinicopathological parameters, patient survival and the mRNA expression of the TP53 isoforms. The minor alleles of the intron 4 SNPs were significantly over-represented in the cohorts of mixed cancers we studied compared to three control cohorts, suggesting that they confer increased cancer risk. Further analysis showed that heterozygosity at rs22(GC) and at either of the two intronic SNPs, rs29(TC) or rs30(AG), conferred a greater risk of developing cancer. Our analysis found that individuals with rs22(GC)+rs30(AG) were more likely to have aggressive PCa characterised by increased immune infiltration (CD163+ Mφ and CD3+ T cells) and elevated levels of ∆133TP53 and TP53β mRNA compared to rs22(CC), rs22(GC) and rs22(GG) individuals. Our previous studies in GBM [49], PCa [38] and CRC [39] have identified a correlation coefficient of >0.95 between ∆133TP53 and TP53β, leading to the inference that ∆133TP53β is the most likely ∆133TP53 isoform being expressed. We have recently shown that PCa with elevated levels of the ∆133TP53β isoform and increased immune cell infiltration are associated with poor outcome [38]. Consistent with these observations, we found that individuals with rs22(GC)+rs30(AG) had significantly shorter progression-free PCa survival and were more likely to have aggressive TELM GBMs. In GBMs, we also found that individuals with the SNP combination rs22(GC)+rs29(TC) were more likely to have aggressive GBMs characterised by increased immune infiltration (CD163+ Mφ) and elevated levels of the ∆133TP53β isoform. Consistent with our observations that TELM tumours are associated with poor outcomes [49], we observed that individuals with GBM and the rs22(GC)+rs29(TC) SNP combination had the shortest overall survival.
A previous study showed that the rs22C+rs29T+rs30A haplotype was associated with a poor outcome in lung cancer patients in African Americans but not in Caucasians [33], although that study did not test for an association with TP53 isoform expression. In our studies, compound heterozygosity at rs22(GC) and an intron 4 SNP is predictive of cancer predisposition and prognosis in Caucasians. These SNP combinations are also associated with elevated ∆133TP53β transcription.
Given this context, we propose that the TP53 SNPs investigated here allow increased transcription of the ∆133TP53 mRNAs, as suggested by Bellini et al. [34] using in vitro P2 promoter assays. Our data would also suggest that the SNPs may influence 3′ splicing to favour the TP53β variant. This then leads to increased ∆133p53β, which in turn increases the levels of chemokines and cytokines to provoke inflammation involving immune cells that create an immunosuppressive environment and cancer progression.
Our model predicts that the SNP combinations above provide a DNA template structure conducive to increased transcription from the TP53 P2 promoter. The data from Bellini et al. suggested that the presence of rs1794287 in the P2 promoter can alter binding affinity for several transcription factors [34]. Although there are no experimental data confirming this hypothesis, in silico modelling of the effect of the three SNPs (rs22, rs29 and rs30) on the structure of the P2 promoter [50] also suggests that each of these SNPs alters DNA conformation and interaction with a number of transcription factors (determined using TFBind [51], Figure S2). For example, the prediction is that heterozygosity at rs30 allows the binding of 13 unique transcription factors in addition to 6 common transcription factors. Similarly, heterozygosity at rs29 allows the binding of 8 unique transcription factors plus 9 common transcription factors. It is also possible that heterozygosity observed at rs22, rs29 and rs30 might act cooperatively, thus playing an important role in determining the final activity of the P2 promoter. Pre-mRNA splicing is frequently coupled to transcription by RNA polymerase II (RNAPII) [52]. Accumulation of transcription factors at the P2 promoter can alter the rate of elongation by RNAPII, facilitating the recruitment of splicing factors that favour retention of intron 9β. This would provide a possible explanation for increased ∆133TP53β mRNA expression in individuals heterozygous for rs22 in combination with rs29 or rs30 compared to other SNP combinations. Increased ∆133p53β would in turn promote a tumour-favouring microenvironment, resulting in aggressive disease. Characterization of the transcription factors differentially bound to these sequences remains to be explored experimentally.
As the presence of the "C" or "G" allele at rs1042522 influences P2 promoter activity and ∆133TP53β expression [34], our results suggest that the C allele dampens expression of ∆133TP53β. However, this can be rescued by two mechanisms: (i) selecting for heterozygosity at rs1042522 in combination with haplotypes of the C allele that contain the minor allele at either rs29 or rs30, or (ii) selecting for G allele homozygosity at rs22. As rs22(CC) is at a very low frequency in Caucasians, selecting for genotype combinations that increase ∆133p53 expression has presumably been advantageous during Caucasian evolution. This may be due to the pro-inflammatory properties of the ∆133p53 isoforms enabling increased adaptability to rapidly changing environmental conditions.
Our studies highlight the potential importance of SNP combinations in contributing to disease risk. However, our conclusions need to be tempered due to the low numbers of samples we have in our cohorts, which is exacerbated by the various stratifications we have done. Nonetheless, identification of individuals not only at risk of developing the disease but also having knowledge of disease severity or sub-type is clearly important. Thus, confirmation of our findings in multiple and larger patient cohorts is required.

Control Cohorts
Three control cohorts were used in this study. All three cohorts were sourced from different geographical locations. Control cohort 1 consisted of 599 subjects from the 1000 genomes [41]

Cell Culture
The human breast fibroblast cell lines (Fre lines) were obtained from Professor Roger Reddel and derived from post breast reduction surgery. There were 35

Genotyping
Genotyping for each cancer cohort is as described here. DNA was extracted from the blood of individuals with GBM, BrCa cohort 1 and from normal associated tissue of individuals with PCa. DNA was extracted from the Fre lines for genotyping. For all tumour cohorts studied in Dunedin and the Fre lines, PCR was used to amplify the DNA region between exons 4-9 of TP53. The primer sequences used were those published [55]. For amplification of intron 4, the following primers were also used: 5′-AGTTTACCCACTTAATGTGTG-3′; 5′-CAGGAGATGGAGGCTGCAGTG-3′; 5′-CAAGGCAGGCAGATCACCTG-3′ (sense primers) and 5′-CTATAGGTGTGCACCACCATG-3′; 5′-CCTCCTGCAACCCACTAG-3′; 5′-CCACAGCTGCACAGGGCAGG-3′ (antisense primers). Purified PCR products were subjected to Sanger sequencing to identify SNPs.
For BrCa cohort 2, genomic DNA (2.5 ng) from the blood of breast cancer patients at diagnosis was genotyped using a 96.96 genotyping integrated fluidics circuit (IFC) with custom SNP-type assays on the Juno™ system (Fluidigm, South San Francisco, CA, USA) with quantitation on the Biomark™ (Fluidigm, South San Francisco, CA, USA), according to the manufacturer's instructions. The data were analysed using the Fluidigm SNP Genotyping Analysis software version 4.5.1 (South San Francisco, CA, USA) to obtain genotype calls.

Preparation of RNA, cDNA and ddPCR for Analysis of p53 Isoforms
RNA (1 µg) extracted from the 33 cell lines was DNase I treated (Thermo Fisher Scientific, Waltham, MA, USA) and then reverse-transcribed using qScript cDNA SuperMix (Quanta Biosciences, Beverly, MA, USA), according to the manufacturer's instructions. Primers designed for specific TP53 transcript subclasses (FL/∆40TP53_T1 referred to as FLp53, FL/∆40TP53_T2 referred to as ∆40p53 and ∆133TP53, TP53α and TP53β) were used from previous studies [56,57]. Absolute transcript abundance was measured with EvaGreen SuperMix using the Bio-Rad QX200 ddPCR System (Bio-Rad, Hercules, CA, USA) and converted to copies/µg RNA, as described previously [56,57].
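Droplet digital PCR derives absolute abundance from the fraction of positive droplets using a Poisson correction. The Python sketch below illustrates that general calculation only; it is not the conversion described in [56,57]. The function names, the default droplet volume of 0.85 nL, and the reaction-volume and RNA-input parameters are all assumptions made for illustration.

```python
from math import log


def ddpcr_copies_per_ul(positive, total, droplet_volume_nl=0.85):
    """Poisson-corrected target concentration in copies per µL of reaction.

    positive / total  : counts of positive and accepted droplets
    droplet_volume_nl : assumed droplet volume in nanolitres (assumption)
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    # Mean copies per droplet, allowing for droplets holding several copies.
    copies_per_droplet = -log(1 - positive / total)
    return copies_per_droplet / (droplet_volume_nl * 1e-3)  # nL -> µL


def copies_per_ug_rna(copies_per_ul, reaction_volume_ul, rna_input_ug):
    """Scale a reaction concentration to copies per µg of input RNA.

    rna_input_ug is the RNA-equivalent amount of cDNA loaded into the
    reaction (hypothetical parameter; depends on the dilution scheme).
    """
    if rna_input_ug <= 0:
        raise ValueError("RNA input must be positive")
    return copies_per_ul * reaction_volume_ul / rna_input_ug
```

The correction matters most at high target abundance, where a simple count of positive droplets would underestimate the number of template molecules.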

Statistical Analyses
A Chi-square test was used to assess significant differences in the distribution of SNPs between control and cancer cohorts. A two-sided Fisher's exact test was used to assess significant differences, and odds ratios were calculated using the Baptista-Pike method. Differences between survival curves were tested using Kaplan-Meier analysis followed by a two-sided log-rank test. Comparisons between groups were done using an unpaired one-tailed t-test with Welch's correction. All statistical analyses were performed using GraphPad Prism software version 7.03.
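For readers unfamiliar with Welch's correction, the following standard-library Python sketch shows how the test statistic and the Welch-Satterthwaite degrees of freedom are formed for two groups with unequal variances. The function name is hypothetical, and the p-value step (which needs the t distribution and is handled by the statistics package in practice) is deliberately omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test.

    t  : difference in means divided by its standard error
    df : Welch-Satterthwaite approximation to the degrees of freedom
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

Because the degrees of freedom are estimated from the data, they are generally not an integer, which is expected for this test.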

Conclusions
Our study found that zygosity of rs1042522 and two intron 4 SNPs, rs9895829 and rs2909430, is associated with increased cancer risk. Heterozygous combinations of rs1042522 with either rs9895829 or rs2909430 also affected the transcript levels of ∆133TP53 and TP53β isoforms. Heterozygous combinations of these SNPs were associated with poor patient outcome and increased immune cell infiltration in glioblastoma and prostate cancers. In conclusion, our studies highlight how genomic variations within the TP53 locus profoundly affect the risk of developing cancer as well as cancer progression.