Thyroid Cancer: The Quest for Genetic Susceptibility Involving DNA Repair Genes

The incidence of thyroid cancer (TC), particularly well-differentiated forms (DTC), has been rising and remains the highest among endocrine malignancies. Although ionizing radiation (IR) is well established in DTC aetiology, other environmental and genetic factors may also be involved. DNA repair single nucleotide polymorphisms (SNPs) could be among the latter, helping to explain the high incidence. To further clarify the role of DNA repair SNPs in DTC susceptibility, we analyzed 36 SNPs in 27 DNA repair genes in a population of 106 DTCs and corresponding controls with the aim of interpreting joint data from previously studied isolated SNPs in DNA repair genes. Significant associations with DTC susceptibility were observed for XRCC3 rs861539, XPC rs2228001, CCNH rs2230641, MSH6 rs1042821 and ERCC5 rs2227869 and for a haplotype block on chromosome 5q. Of the 595 SNP-SNP combinations tested on paired SNP analysis, 114 were significant at p < 0.05 and 15 remained significant at the more stringent p < 0.01 level, most of which involved CCNH rs2230641 and mismatch repair variants. Overall, a gene-dosage effect between the number of risk genotypes and DTC predisposition was observed. In spite of the volume of data presented, new studies are needed to clarify the role of SNPs in DNA repair genes and their combinations in DTC susceptibility.


Introduction
Thyroid cancer (TC) is the most common endocrine malignancy and its increasing incidence raises concern. It is two to four times more frequent in women than in men and is one of the most common malignancies in adolescents and young adults (ages 15-39 years), the median age at diagnosis being lower than that for most other types of cancer [1,2]. Papillary (PTC) and follicular (FTC) thyroid cancer, may allow the identification of candidate SNPs for future use as susceptibility biomarkers, hence, the development of tailored DTC prevention policies and perhaps implementation of guidelines.

Study Subjects
Overall, 335 Caucasian Portuguese subjects were enrolled in this hospital-based case-control study. A total of 106 histologically confirmed DTC patients were recruited in the Service of Nuclear Medicine of the Portuguese Oncology Institute, Lisbon, Portugal, where they were treated according to the hospital's current practice. A further 229 unrelated age- (±2 years) and gender-matched controls (two for each DTC case, in each of the previously published studies) were recruited at the Department of Clinical Pathology of the São Francisco Xavier Hospital, West Lisbon Hospital Centre, Portugal, where they were seeking healthcare for non-neoplastic pathology. None of the study participants had a personal history of prior malignancy or a family history of thyroid disease.
In order to verify eligibility criteria and to account for potential confounding factors, information on demographic characteristics (e.g., gender, age, occupation), family history of cancer, lifestyle habits (e.g., smoking, alcohol drinking) and IR exposure was collected from each study participant, on recruitment, through a pre-designed questionnaire performed by trained interviewers. Prior exposure to relevant levels of ionizing radiation (i.e., other than that from natural and standard diagnostic sources) was denied by all subjects included in the study. Former smokers were considered as non-smokers if they gave up smoking 2 years before DTC diagnosis or 2 years before their inclusion as controls. The response rate was >95% for both cases and controls.
All studies were previously approved by the local ethics boards of the involved institutions and conducted in compliance with the Helsinki Declaration. On recruitment, prior to blood withdrawal, all eligible subjects were informed about the objectives of the study. Those agreeing to participate gave their written informed consent and were enrolled in the study. The anonymity of all participants was guaranteed.

SNP Selection
The selection of SNPs for genotyping was performed according to criteria that were predefined individually for each original study [14][15][16][17][18]. Briefly, eligible SNPs were required to exhibit a minor allele frequency (MAF) greater than 0.05 in Caucasian populations, the remaining criteria (e.g., being located in a coding or splice region, altering the amino acid sequence, being a tagging SNP, having been previously referred to in MEDLINE) varying according to the individual study, as indicated in the original studies of individual alleles.
Overall, a total of 36 DNA repair SNPs across all DNA repair pathways were selected for genotyping and analysed. Details on the genomic location, base and amino acid exchange and MAF of selected SNPs are presented in Table 1.
Table 1. Selected SNPs and detailed information on the corresponding base and amino acid exchanges, minor allele frequency (MAF) and AB assay used for genotyping.

Practical Methodologies-Brief Description
All DNA samples were obtained after collection of peripheral venous blood samples from each participant. The DNA extraction was performed as described previously [14][15][16][17][18] using a commercially available kit (QIAamp® DNA mini kit; Qiagen GmbH, Hilden, Germany), according to the manufacturer's recommendations. All samples were stored at −20 °C until further analysis.
Genotyping was carried out through either real-time polymerase chain reaction (PCR) or conventional PCR-restriction fragment length polymorphism (RFLP) techniques, as described in previous studies [14][15][16][17][18]. For real-time PCR (the option for the vast majority of SNPs considered in this study), genotyping was performed on an ABI 7300 Real-Time PCR system thermal cycler (Applied Biosystems; Thermo Fisher Scientific, Inc., Waltham, MA, USA), using the commercially available TaqMan® SNP Genotyping Assays (Applied Biosystems) identified in Table 1. Conventional PCR-RFLP techniques were employed to genotype XRCC1 rs1799782, XRCC1 rs25487 and OGG1 rs1052133 (BER pathway); XPC rs2228000 and XPC rs2228001 (NER pathway); and XRCC3 rs861539 and XRCC2 rs3218536 (HR pathway). Primer design methods and sequences, PCR conditions, PCR product sizes, restriction analysis conditions and expected digestion pattern for each genotype have been described in full detail elsewhere [14,16,17] and will therefore not be reproduced here. Irrespective of the genotyping method, all inconclusive samples were reanalysed. Also, for quality control, at least 10-15% of genotype determinations were run in duplicate through independent experiments, with 100% concordance between experiments.

Statistical Analysis
Prior to analysis, genotype distributions for each studied SNP were checked for deviation from Hardy-Weinberg equilibrium (HWE) using the SNPstats platform [31], in both case and control populations. The only continuous variable (age at diagnosis) was categorized, and the Chi-square test was then used to evaluate differences in genotype frequency, smoking status, age class and gender distributions between DTC patients and controls. Whenever the construction of 2 × 2 contingency tables was possible, the two-sided Fisher's exact test was employed instead of the Chi-square test.
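The HWE check described above can be illustrated with a minimal sketch. It assumes a one-degree-of-freedom Pearson chi-square test on the three genotype counts of a biallelic SNP; this is not necessarily the test implemented by SNPstats, and the function name and the genotype counts used below are hypothetical, not study data.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    Assumes both alleles are present (otherwise an expected count is zero).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts with a marked heterozygote deficit (not study data)
chi2, p_value = hwe_chi_square(40, 20, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

A p-value below 0.05 would flag the SNP as deviating from HWE in the group tested, mirroring the threshold used in this study.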
Logistic regression was used to estimate the risk of DTC associated with each genotype: risk estimates were calculated under the codominant, dominant and recessive models and expressed as crude and adjusted odds ratios (OR) and corresponding 95% confidence intervals (CI). Whenever adjustment was performed, terms for gender (male/female), age class (<30, 30-49, 50-69 and ≥70 years) and smoking habits (smokers/non-smokers) were included in the model, the most common homozygous genotype, female gender, lower age group and non-smoking status being considered the reference classes for such calculations. As data on prior IR exposure was not suitable for rigorous quantitative transformation, it was not possible to include such term in the adjustment model. Risk estimates were calculated in the whole population and after stratification according to histological type of tumour (papillary or follicular TC), gender (male and female) and age (<50 and ≥50 years).
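To make the genetic models concrete, the sketch below collapses three genotype counts into exposed and reference groups under the dominant and recessive models and computes a crude OR with a Woolf (log-based) 95% CI. This is only an illustration of the crude estimate: the adjusted ORs reported here require multivariable logistic regression, which is not reproduced. The function names and all counts are hypothetical.

```python
import math


def collapse(counts, model):
    """Collapse (common homozygote, heterozygote, variant homozygote) counts.

    Returns (exposed, reference) counts under the chosen genetic model.
    """
    hom_common, het, hom_variant = counts
    if model == "dominant":    # variant allele carriers vs common homozygotes
        return het + hom_variant, hom_common
    if model == "recessive":   # variant homozygotes vs all other genotypes
        return hom_variant, hom_common + het
    raise ValueError(f"unknown model: {model}")


def crude_odds_ratio(case_exp, case_ref, ctrl_exp, ctrl_ref, z=1.96):
    """Crude odds ratio with a Woolf 95% confidence interval."""
    odds_ratio = (case_exp * ctrl_ref) / (case_ref * ctrl_exp)
    se = math.sqrt(1 / case_exp + 1 / case_ref + 1 / ctrl_exp + 1 / ctrl_ref)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


# Hypothetical genotype counts (not study data): 106 cases, 229 controls
cases = (50, 40, 16)
controls = (120, 90, 19)
case_exp, case_ref = collapse(cases, "recessive")
ctrl_exp, ctrl_ref = collapse(controls, "recessive")
odds_ratio, low, high = crude_odds_ratio(case_exp, case_ref, ctrl_exp, ctrl_ref)
print(f"recessive model: OR = {odds_ratio:.2f}, 95% CI: {low:.2f}-{high:.2f}")
```

The codominant model would instead compare heterozygotes and variant homozygotes separately against the common homozygous genotype, giving two ORs per SNP.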
Finally, the joint effect of multiple SNPs on DTC risk was estimated by applying logistic regression analysis (1) to relevant haplotypes, (2) to individual genetic risk scores calculated from genotype variables significant on single SNP analysis and (3) to all possible pairwise (2 × 2) combinations of the DNA repair SNPs included in this study. For the purpose of risk score calculations, genotypes presenting significant results on single SNP analysis were attributed a +1 score, the risk score for each participant corresponding to the sum of such scores. Samples with one or more missing genotypes were excluded from these calculations to avoid bias due to missing data. For paired SNP analysis, the combination of the most common homozygous genotypes of each individual SNP in the control group was taken as the reference category in OR calculations. Also, paired genotypes with frequency <5% in the study population were pooled together. This is not a conclusive final study but an exploratory one that should be regarded as a 'proof of concept'. As such, the Bonferroni adjustment was deemed unnecessary, as it is too conservative. Also, statistical power (1−β, the complement of the false negative rate β) was not computed at this stage, since further studies with more patients and controls are needed to turn this preliminary study into a confirmatory one. All statistical analyses were performed with SPSS 22.0 (IBM SPSS Statistics for Windows, version 22.0, IBM Corp, Armonk, NY, USA) except for assessment of HWE deviation, MAF calculations, haplotype estimation and linkage disequilibrium (LD) analysis, which were carried out using SNPstats [31]. Results were considered significant when the corresponding two-tailed p-values were <0.05 except for paired SNP analysis where, because of the high number of SNP-SNP combinations being tested, a more stringent significance level (p < 0.01) was employed.
The study was approved by the Ethical Committee of Nova Medical School, Faculty of Medical Sciences, under the number 05/2008, dated January 9th, 2008. Approval was also obtained from the ethical committee of the Portuguese Oncology Institute (IPO), the hospital responsible for blood sample collection, under the reference GIC/357, dated July 14th, 2004.

General Analysis
The general characteristics of the 106 DTC patients and their 229 age- and gender-matched controls included in this study are depicted in Table 2. The overall mean age of the study population was 51 years (52.1 in the patient group and 51.0 in the control group). As expected from the worldwide gender distribution for DTC [1,2], female patients greatly outnumbered male patients in the case group. Twelve (11.3%) DTC patients were categorized as smokers. Age distribution, gender and smoking habits were not significantly different between case and control populations. Concerning histological classification of tumours, 78 (73.6%) patients were diagnosed with papillary TC while 28 (26.4%) presented follicular tumours, in line with DTC histotype distributions commonly reported in the literature [4]. Three additional cases of poorly differentiated TC were also present in some of our original studies but, since this study is concerned only with DTC, such cases (and the corresponding controls) were excluded from this analysis. Prior IR exposure (except for diagnostic X-rays) was denied by all cases.

All DTC Cases
Allelic and genotypic frequencies as well as crude/adjusted ORs were calculated for all 36 DNA repair SNPs analysed in our study. Significant findings are reported in Table 3. The allelic and genotypic frequencies observed in the control group were in agreement with those expected for Caucasian populations. Also, for the majority of SNPs, genotype distributions were in Hardy-Weinberg equilibrium (HWE, p ≥ 0.05), in both case and control populations. Significant deviations from HWE were observed for OGG1 rs1052133, MUTYH rs3219489 and CDK7 rs2972388 in the control group and for XRCC1 rs1799782, XPC rs2228000 and MSH3 rs184967 in the DTC group. Further, strong linkage disequilibrium was observed between XRCC5 rs1051677 and rs6941, but not between any other pair of SNPs. XRCC5 rs6941 was thus excluded from further analysis, the conclusions drawn for XRCC5 rs1051677 being valid for XRCC5 rs6941, since they behave as tag SNPs. As expected, both the comparison of genotype frequency distributions between case and control populations and the logistic regression analysis (Table 3) yielded results similar to those previously reported [14][15][16][17][18]: significant differences in the distribution of genotypic frequencies between cases and controls were observed for CCNH rs2230641 (p = 0.037 on the codominant model and p = 0.024 on the dominant model), for MSH6 rs1042821 (p = 0.042 on the codominant model and p = 0.037 on the recessive model) and for XRCC3 rs861539 (p = 0.021 on the codominant model and p = 0.011 on the recessive model).
On logistic regression analysis, after adjustment for age, gender and smoking status, DTC risk was significantly increased in CCNH rs2230641 heterozygotes (adjusted OR = 1.89, 95% CI: 1.14-3.14, p = 0.014) and also in variant allele carriers, according to the dominant model (adjusted OR = 1.79, 95% CI: 1.09-2.93, p = 0.021), in MSH6 rs1042821 variant allele homozygotes (adjusted OR = 3.42, 95% CI: 1.04-11.24, p = 0.042 on the codominant model; adjusted OR = 3.84, 95% CI: 1.18-12.44, p = 0.025 on the recessive model), in XRCC3 rs861539 variant allele homozygotes (adjusted OR = 2.20, 95% CI: 1.20-4.03, p = 0.011 on the recessive model) and in XPC rs2228001 variant allele homozygotes (adjusted OR = 1.97, 95% CI: 1.01-3.84, p = 0.046 on the recessive model). A borderline significant DTC risk reduction was observed in ERCC5 rs2227869 heterozygotes (adjusted OR = 0.39, 95% CI: 0.16-1.00, p = 0.049). The association between XPC rs2228001 and DTC risk is a new finding emerging from this reanalysis, since the recessive model of inheritance had not been applied in the original study [17].
No additional significant differences in genotype frequency distributions nor associations with DTC risk were found, irrespective of the model assumed.
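As a reading aid for the estimates above, the following sketch shows how a two-sided Wald p-value can be approximately recovered from a reported OR and its 95% CI. It assumes the interval is symmetric on the log scale (as for Wald intervals from logistic regression); the function name is hypothetical, and rounding of the published figures limits the precision of the result.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate the standard error, z statistic and two-sided p-value
    from an odds ratio and its 95% confidence interval (Wald assumption)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# CCNH rs2230641 heterozygotes: adjusted OR = 1.89, 95% CI: 1.14-3.14
se, z, p_value = wald_p_from_ci(1.89, 1.14, 3.14)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

For this example the recovered value is close to the reported p = 0.014, which is the kind of internal consistency check a reader can apply to the other estimates.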

Stratified Analysis
Stratified analysis according to histological tumour type, gender and age may be important to identify any subgroup-specific risk association but was only partially performed in prior studies in this population. On stratification according to histological criteria (Table 4), this study confirmed prior observations [14,17,18] 32, p = 0.035, on the recessive model). Interestingly, three other significant associations were observed in this reanalysis that were not present or had not been detected in the original studies, while two previously observed associations were lost in this reanalysis: a previously undetected decreased papillary TC risk was observed in MUTYH rs3219489 heterozygotes (crude OR = 0.56, 95% CI: 0.32-1.00, p = 0.048) and variant allele carriers (crude OR = 0.57, 95% CI: 0.33-0.99, p = 0.048) as well as in NBN rs1805794 variant allele homozygotes (adjusted OR = 0.28, 95% CI: 0.08-0.97, p = 0.045, on the recessive model), while the presence of the variant allele of XRCC2 rs3218536 exhibited a protective effect for follicular TC (crude OR = 0.21, 95% CI: 0.04-1.00, p = 0.049, both for heterozygotes in the codominant model and for variant allele carriers in the dominant model). In contrast, the associations of XRCC5 rs2440 and CCNH rs2230641 genotypes with papillary and follicular TC risk, respectively, reported in our original studies [15,17], were no longer observed.
On gender stratification ( Opposing, ERCC5 rs2227869 heterozygotes (adjusted OR = 0.25, 95% CI: 0.07-0.88, p = 0.030) and variant allele carriers (adjusted OR = 0.32, 95% CI: 0.11-0.97, p = 0.044) as well as ERCC5 rs17655 variant allele homozygotes (adjusted OR = 0.27, 95% CI: 0.08-0.95, p = 0.041, on the recessive model) presented a significant risk reduction among female patients. Among these gender-specific genetic effects, only the association with MSH6 rs1042821 had been reported in the original studies [18]. No significant association was observed in the male subset of patients, possibly because of the low number of cases in this gender group. An association between XRCC5 rs1051677 and TC risk had previously been identified in this subset of patients [15] but significance was lost upon restricting analysis to well-differentiated forms of TC (this study).
No further correlations between individual DNA repair SNPs and DTC risk were observed on histology-, gender- and age-based stratification analysis.

Combined Genotypes
In order to investigate the joint effect of multiple SNPs on DTC risk, genetic risk scores (RS) were calculated for each study participant, considering only significant findings on single SNP analysis. As depicted in Table 5, after adjusting for covariates, DTC risk was more than two and five times higher in individuals bearing, respectively, 2 (adjusted OR = 2.68, 95% CI: 1.56-4.59, p < 0.001) and 3 or more (adjusted OR = 5.02, 95% CI: 2.24-11.24, p = 0.001) risk genotypes (CCNH rs2230641 Val/Ala or Ala/Ala; ERCC5 rs2227869 Cys/Cys or Ser/Ser; XPC rs2228001 Gln/Gln; MSH6 rs1042821 Glu/Glu; XRCC3 rs861539 Met/Met), when compared to individuals bearing none or only one of such risk genotypes. Similar associations between RS and TC risk were also observed on stratification according to histological, gender or age criteria, after adapting RS calculations to the SNPs significant for each stratum (Table 5). A high significance level was observed in most cases (p < 0.001 in approximately 50% of RS categories) and was even greater if higher RS categories were merged together (results not shown).
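The risk score rule described above (one point per risk genotype, participants with any missing genotype excluded) can be sketched as follows. The risk genotypes are those listed in the text; the function names, the string labels used for genotypes and the example participant are illustrative assumptions, not the study's actual data encoding.

```python
# Risk genotypes as listed in the text for the overall DTC analysis
RISK_GENOTYPES = {
    "CCNH rs2230641": {"Val/Ala", "Ala/Ala"},
    "ERCC5 rs2227869": {"Cys/Cys", "Ser/Ser"},
    "XPC rs2228001": {"Gln/Gln"},
    "MSH6 rs1042821": {"Glu/Glu"},
    "XRCC3 rs861539": {"Met/Met"},
}


def risk_score(genotypes):
    """Count risk genotypes (+1 each); return None if any genotype is missing."""
    if any(genotypes.get(snp) is None for snp in RISK_GENOTYPES):
        return None  # excluded from risk score analysis
    return sum(genotypes[snp] in risk for snp, risk in RISK_GENOTYPES.items())


def risk_category(score):
    """Group scores into the categories compared in the text: 0-1, 2, 3 or more."""
    if score is None:
        return None
    if score <= 1:
        return "0-1"
    return "2" if score == 2 else "3+"


# Hypothetical participant (genotype labels are assumptions, not study data)
participant = {
    "CCNH rs2230641": "Val/Ala",
    "ERCC5 rs2227869": "Cys/Ser",
    "XPC rs2228001": "Gln/Gln",
    "MSH6 rs1042821": "Gly/Glu",
    "XRCC3 rs861539": "Thr/Met",
}
score = risk_score(participant)
print(score, risk_category(score))
```

The resulting category would then enter the logistic regression as the exposure variable, with the 0-1 category as reference.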
Also, in order to investigate the combined effect of different pairs of SNPs on DTC risk, we performed a paired SNP analysis considering all possible pairwise (2 × 2) combinations of the DNA repair SNPs included in this study. Overall, 595 SNP-SNP combinations were tested, 114 (approximately 20%) of which yielded significant results at a 0.05 significance level (results not shown). Considering that such a high number of hypotheses being tested may result in a considerable number of false positive findings, a more stringent significance level (p < 0.01) was employed in this analysis, limiting the number of SNP pairs with significant findings to 15 (approximately 2.5% of all possible combinations). Such significant findings are depicted in Table 6 and also in Figure 1. CCNH rs2230641 emerges from Figure 1 as the DNA repair SNP most frequently represented in significant SNP-SNP combinations, both at 0.01 and 0.05 significance levels, followed by RAD51 rs1801321, MLH3 rs175080 and MSH4 rs5745549 (0.01 significance level) or RAD51 rs1801321 and XRCC3 rs861539 (0.05 significance level). MMR variants were the most frequently involved, as they were present in 9 of the 15 SNP-SNP combinations that were significant. Also, among significant findings, 3 intra-pathway SNP combinations were detected: RAD51 rs1801321-XRCC3 rs861539 (HR pathway), MLH3 rs175080-MSH6 rs1042821 and MSH4 rs5745549-MSH6 rs1042821 (MMR pathway).
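The scale of the multiple testing problem can be made explicit with a short calculation. The figure of 595 pairs is consistent with pairing 35 SNPs, i.e., the 36 genotyped SNPs minus XRCC5 rs6941, which was excluded because of linkage disequilibrium; this derivation is our reading of the numbers rather than something stated explicitly. The expected counts below assume that every null hypothesis is true, and the Bonferroni threshold is shown only for comparison, since it was not applied in this study.

```python
import math

n_snps = 36 - 1                      # assumption: XRCC5 rs6941 excluded
n_pairs = math.comb(n_snps, 2)       # number of unordered SNP-SNP pairs

# Expected number of false positives if all null hypotheses were true
expected_fp_at_005 = n_pairs * 0.05
expected_fp_at_001 = n_pairs * 0.01

# Bonferroni-corrected threshold for a 0.05 family-wise error rate
bonferroni_threshold = 0.05 / n_pairs

print(f"pairs tested: {n_pairs}")
print(f"expected false positives at 0.05: {expected_fp_at_005:.2f} (114 observed)")
print(f"expected false positives at 0.01: {expected_fp_at_001:.2f} (15 observed)")
print(f"Bonferroni threshold: {bonferroni_threshold:.1e}")
```

Under these assumptions, the observed counts exceed what chance alone would be expected to produce at either threshold, but individual pairs among the 15 may still be false positives.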
Finally, haplotype analysis was applied to SNPs located in the same chromosome arm, since these are likely to segregate together. According to such criteria, it was possible to establish 8 blocks of DNA repair SNPs, of which only one, located on chromosome 5q and comprising 6 SNPs (CCNH rs2230641, CDK7 rs2972388, MSH3 rs26279, MSH3 rs184967, XRCC4 rs1805377 and XRCC4 rs28360135), revealed significant associations with DTC (Table 7): two different allele combinations were associated with a significantly decreased DTC risk, when compared to the most frequent combination of chromosome 5q SNPs (adjusted OR1 = 0.26, 95% CI: 0.08-0.87, p = 0.030; adjusted OR2 = 0.15, 95% CI: 0.03-0.72, p = 0.019). Haplogroup analysis comprising all SNPs under study could also prove useful to understand the joint effect of the variants, since it would better reflect the real biological context (where different DNA repair proteins interact with each other), but could not be performed because, considering the high number of SNPs under study, the frequency of each specific allele combination would be too low for meaningful results to be obtained.

Discussion
In order to further characterize the potential contribution of DNA repair SNPs to DTC susceptibility, we aggregated and reanalysed the data from our previously published case-control studies [14][15][16][17][18] performed on a Caucasian Portuguese population.
A significant risk increase was observed, after adjustment for age, gender and smoking status, in CCNH rs2230641 heterozygotes and variant allele carriers, in MSH6 rs1042821 variant allele homozygotes (codominant and recessive model), in XRCC3 rs861539 variant allele homozygotes (recessive model) and in XPC rs2228001 variant allele homozygotes (recessive model), while the heterozygous ERCC5 rs2227869 genotype was associated with a borderline risk reduction. Except for XPC rs2228001, which is a new finding emerging from this reanalysis because the recessive model of inheritance had not been applied in the original study, such results are fundamentally similar to those reported in the original studies even though, on reanalysis, data were restricted to DTC cases and corresponding controls. A role for these variants specifically in well-differentiated forms of TC is thus apparent from this reanalysis. As these findings have been discussed in detail in the original studies, they will be discussed here only briefly, with emphasis on new data published since then.
XRCC3 participates in HR to maintain chromosome stability and repair DNA damage and is therefore a highly suspected candidate gene for cancer susceptibility. XRCC3 rs861539 has been the most studied genetic variant of the XRCC3 gene, especially because it is located in a functionally relevant domain of the protein, in a region of interaction with other proteins such as RAD51 [22,32]. The presence of this variant may affect the structure of this DNA repair protein and lead to a deficiency in the HR pathway. As a result, the HR pathway may be compromised, shifting the repair mechanism to NHEJ, promoting chromosome instability and disturbing the cellular repair capacity [33]. The potential contribution of XRCC3 rs861539 to cancer susceptibility has been widely addressed: while conflicting evidence exists, several large meta-analyses strongly support a positive association with cancer susceptibility, namely breast [34][35][36] and bladder cancer [36][37][38], among others. In the particular context of thyroid cancer, interestingly, multiple studies [22,[39][40][41][42][43], including a meta-analysis [44], have suggested the XRCC3 rs861539 variant T allele and/or, in particular, the TT homozygous genotype to be associated with increased risk of TC or, more specifically, PTC. In another meta-analysis [45] such an association was also detected but only in Caucasian populations. Therefore, although studies reporting no significant association also exist [46,47], the vast majority of available evidence supports our results and suggests a role for XRCC3 rs861539 in DTC susceptibility.
To the best of our knowledge, none of the remaining SNPs presenting significant results on overall analysis has been evaluated in the context of DTC (or TC) susceptibility.
XPC codes for a DNA binding protein that forms the distortion-sensing component of NER by binding tightly to another important NER protein, HR23B, to form a stable XPC-HR23B complex, thus playing a central role in the process of early damage recognition [48,49]. The XPC-HR23B complex can recognize a variety of DNA adducts formed by exogenous carcinogens and binds to the DNA damage sites. Therefore, it may play a role in decreasing the toxic effects of such carcinogens and its deficiency may interact with carcinogen exposure [50]. XPC is also involved in DNA damage-induced cell cycle checkpoint regulation and apoptosis, removal of oxidative DNA damage and redox homeostasis [49,51]. XPC rs2228001 (an A-to-C transversion in exon 15) leads to a substitution of glutamine for lysine in codon 939 (Lys939Gln) and is located in the domain interacting with the transcription factor IIH (TFIIH) complex [50,[52][53][54][55], initiating the global genome NER pathway. XPC rs2228001 is one of the most extensively studied NER pathway SNPs, as numerous case-control association studies and meta-analyses have been performed to investigate its potential role in cancer predisposition. In line with our data for DTC, a modest but consistent association of the Gln/Gln homozygous genotype with overall cancer risk is apparent from two of the three meta-analyses that pool data from different cancer types [56][57][58]. Evidence from these and other cancer site-specific meta-analyses is stronger for lung [53,[56][57][58][59][60], bladder [54,56,61,62] and colorectal cancer (CRC) [56,58,63,64], but also exists for other cancer types such as upper digestive system cancer [65] and hepatocellular carcinoma [50,66].
The MSH6 gene (mutS homolog 6) is a member of a set of genes known as the mismatch repair (MMR) genes. MSH6 integrates the MutSα complex, a sensor of genetic damage that, besides its role in the repair of replication errors, cooperates with other DNA repair and damage-response signalling pathways to allow for cell cycle arrest, DNA repair and/or apoptosis of genetically damaged cells. Several MSH6 mutations have been identified and suggested as causative in Lynch syndrome (LS) patients [76][77][78][79][80]. Although TC is not part of the usual LS spectrum, the effect of MSH6 on TC susceptibility has previously been explored [81,82]. MSH6 rs1042821 has also been frequently investigated in the context of cancer susceptibility, mostly with inconclusive findings [83][84][85][86][87][88][89][90]. Consistent with our results, MSH6 rs1042821 has previously been associated with increased CRC risk [91][92][93], highly malignant bladder cancer [94], pancreatic cancer [95] and triple negative breast cancer (TNBC) [96]. On the contrary, the T allele [97] and the CT heterozygous genotype [98] have been associated with decreased risk of colorectal and hepatocellular carcinoma, respectively. The only meta-analysis concerning the role of MSH6 rs1042821 in cancer predisposition that we are aware of is also inconclusive [99]. Although plausible, a potential role for MSH6 rs1042821 in cancer predisposition (DTC, in particular) remains elusive. Further well-powered studies are needed to clarify this issue.
The role of CCNH rs2230641 in cancer predisposition has only seldom been evaluated: in agreement with our results, a significantly increased bladder cancer risk in ever smokers has been reported for C allele carriers [100] but, on the contrary, such genotype has also been associated with a significantly decreased risk of chronic leukaemia [101]. Most other studies, namely in oesophageal [102], bladder [103], biliary tract [104] and renal cell carcinoma [105], as well as in oral premalignant lesions [106], have been inconclusive. Interestingly, the pharmacogenomic implications of CCNH rs2230641 for the outcome of platinum-based chemotherapy have also been evaluated, with results supporting a role for CCNH rs2230641 in the response to DNA damaging agents: the presence of the CCNH rs2230641 variant C allele has been associated with longer survival in NSCLC patients receiving platinum-based chemotherapy [107] and with increased incidence and severity of oxaliplatin-induced acute peripheral neuropathy in digestive tract cancer patients undergoing oxaliplatin-based chemotherapy [108]. Similarly, increased risk of severe oxaliplatin-induced acute peripheral neuropathy was observed by Custodio et al. [109] in high-risk stage II and stage III colon cancer patients homozygous for the C allele who were submitted to oxaliplatin-based adjuvant chemotherapy. CCNH codes for a highly conserved cyclin protein that participates in several cellular processes such as the NER pathway, cell cycle regulation and receptor phosphorylation, among others [48,110]. Although data on the functional relevance of rs2230641 are lacking, the pleiotropic effects of CCNH confer biological plausibility to our hypothesis that CCNH variants may be involved in cancer susceptibility.
Finally, ERCC5, also known as XPG, is located on chromosome 13q22-q33 [111] and comprises 15 exons [112,113]. It encodes a structure-specific endonuclease that has multiple functions during NER [114], which is why defects in this gene can impair DNA repair, resulting in genomic instability and carcinogenesis [115]. To date, only a few studies have considered the putative contribution of ERCC5 rs2227869 to cancer susceptibility, most being inconclusive. Interestingly, the only significant findings reported thus far are in line with those reported here, suggesting a protective role for the heterozygous genotype: Hussain et al. [116] reported a significant reduction in stomach cancer risk in individuals with the heterozygous genotype, and a similar, albeit nonsignificant, trend has also been independently observed for melanoma [117] and for squamous cell carcinoma of the head and neck (SCCHN) [118]. More importantly, in the only meta-analysis performed to date [119], a decrease in cancer risk in ERCC5 rs2227869 heterozygotes (and for the C allele) has also been reported.
Many of these (and other) SNPs also presented significant findings on stratifying data according to histotype, gender and age: on histological stratification, significant associations were observed between XRCC3 rs861539, XPC rs2228001, ERCC5 rs2227869, MUTYH rs3219489 and NBN rs1805794 and papillary TC, while MSH6 rs1042821, MLH3 rs175080 and XRCC2 rs3218536 were associated with follicular TC. XRCC3 rs861539, XPC rs2228001, MSH6 rs1042821, CCNH rs2230641, ERCC5 rs2227869 and ERCC5 rs17655 were associated with DTC in the female subset while no association was observed in males. Finally, XPC rs2228001 and XRCC5 rs2440 were associated with DTC in participants younger than 50 years, while, in participants aged 50 or more years, the DTC-associated SNPs included XRCC3 rs861539, CCNH rs2230641, ERCC6 rs2228529 and RAD51 rs1801321.
It is unclear whether these findings (and which among these) truly represent group-specific effects or whether they simply reflect the overall effect on the largest groups (i.e., when group sizes are unbalanced, e.g., papillary TC vs follicular TC, female vs male) and the corresponding lack of power to detect an effect on the smallest groups. Also, due to the low sample size in each stratum, some of these results may simply represent incidental findings (type I errors). XRCC3 rs861539, for example, has been previously associated with papillary TC [22,39,40], in line with our results, but not with follicular TC. An effect of XRCC3 rs861539 genotype in follicular TC cannot, however, be excluded since follicular TC is much less frequent than papillary TC and these studies may have been underpowered to detect such an effect. Also, Su et al. [120] have demonstrated the homozygous genotype of this SNP to be associated with breast cancer, the association being stronger in women younger than 55 years, with earlier menarche or with later menopause. This suggests an oestrogen-potentiated genetic effect, compatible with our own observation of increased DTC risk in XRCC3 rs861539 TT homozygotes among females but not among males. Further, the involvement of CCNH, through a cyclin-activated kinase complex, in oestrogen receptor phosphorylation [48] provides a possible rationale for our own observation of an association of the CCNH rs2230641 genotype with DTC among females but not among males. Finally, the association of MSH6 rs1042821 with DTC, observed in this study for female but not male individuals, is compatible with the growing evidence placing DTC as an oestrogen-associated cancer [121][122][123][124] and implicating MSH6 in such cancers [78,[125][126][127][128][129].
These selected examples highlight the plausibility of the existence of group-specific genetic effects. Overall, such histotype, gender and age specificities in DTC susceptibility are likely since (1) papillary and follicular TC represent distinct entities, with histotype-specific molecular profiles (e.g., BRAF mutations and RET/PTC rearrangements in PTC, RAS mutations and PAX8/PPARγ translocations in FTC) [130]; (2) important gender differences exist in the incidence of DTC (i.e., DTC is, as previously stated, two to four times more frequent in women than in men) [1,2]; and (3) DTC presents some age specificities, uncommon in other types of cancer (DTC is one of the most common malignancies in adolescents and young adults, the median age at diagnosis being lower than that for most other types of cancer) [1,2]. Further well-powered studies are urgently needed to clarify these results and thus establish which of these SNPs, if any, represent true group-specific susceptibility biomarkers.
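The power concern raised above can be made concrete with a rough calculation. The following is a minimal sketch, not the method used in this study: it assumes a two-sided two-proportion z-test under the normal approximation, and the stratum sizes, risk-genotype frequency and odds ratio are hypothetical values chosen only for illustration. The function names are likewise illustrative.

```python
from statistics import NormalDist


def exposure_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Risk-genotype frequency in cases implied by a control frequency and an odds ratio."""
    odds = odds_ratio * p_controls / (1.0 - p_controls)
    return odds / (1.0 + odds)


def approx_power(n_cases: int, n_controls: int, p_controls: float,
                 odds_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test (normal approximation).

    The negligible contribution of the opposite rejection tail is ignored.
    """
    p_cases = exposure_in_cases(p_controls, odds_ratio)
    se = (p_cases * (1.0 - p_cases) / n_cases
          + p_controls * (1.0 - p_controls) / n_controls) ** 0.5
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(p_cases - p_controls) / se - z_crit)


# Hypothetical strata: same true effect, a large and a small subgroup.
power_large = approx_power(n_cases=80, n_controls=160, p_controls=0.15, odds_ratio=2.0)
power_small = approx_power(n_cases=20, n_controls=40, p_controls=0.15, odds_ratio=2.0)
print(f"large stratum: {power_large:.2f}, small stratum: {power_small:.2f}")
```

With an identical underlying effect, the smaller stratum has markedly lower power, so the absence of a significant association in the smaller groups (e.g., males, follicular TC) cannot by itself be read as the absence of an effect.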
Considering the multifactorial nature of DTC aetiology and the probable involvement of multiple genetic factors, alone or in combination, in DTC susceptibility, we undertook a combined genotype analysis to investigate the joint effect of multiple SNPs on DTC risk. When combining all risk genotypes significant at single SNP analysis into a unique unbalanced risk score, a clear-cut gene-dosage effect between the number of risk genotypes (unbalanced risk score) and DTC risk was observed, both on global analysis (considering all DTC cases and corresponding controls) and after stratification according to histological, gender and age criteria. This is biologically plausible since the different DNA repair proteins physically and functionally interact with each other, within the same or different DNA repair pathways, establishing ground for additive or even multiplicative effects of different SNPs on DNA repair activity and, hence, cancer risk. Such a polygenic approach to assess the cumulative effects of multiple genetic variants on cancer risk has previously been employed [27,107,131,132], supporting its usefulness and clinical potential.
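A score of this kind can be illustrated with a short sketch. It assumes that each risk genotype contributes one point regardless of its individual effect size, which is one plausible reading of the score described above and not a confirmed description of the analysis. The SNP labels, risk genotypes and toy counts are hypothetical placeholders, not study data.

```python
from collections import Counter

# Hypothetical risk genotypes: SNP label -> genotypes counted as "risk".
RISK_GENOTYPES = {
    "snp_a": {"TT"},
    "snp_b": {"AC", "CC"},
    "snp_c": {"GG"},
}


def risk_score(genotypes: dict) -> int:
    """Number of SNPs at which the individual carries a risk genotype (1 point each)."""
    return sum(genotypes.get(snp) in risk for snp, risk in RISK_GENOTYPES.items())


def odds_ratio_by_score(cases: list, controls: list, reference: int = 0) -> dict:
    """Odds ratio of each score category relative to the reference category.

    Assumes the reference category contains at least one case and one control;
    other categories with an empty cell are skipped.
    """
    case_counts = Counter(risk_score(g) for g in cases)
    control_counts = Counter(risk_score(g) for g in controls)
    ref_odds = case_counts[reference] / control_counts[reference]
    ratios = {}
    for score in sorted(set(case_counts) | set(control_counts)):
        if case_counts[score] and control_counts[score]:
            ratios[score] = (case_counts[score] / control_counts[score]) / ref_odds
    return ratios


def person(a: str, b: str, c: str) -> dict:
    """Helper building one toy individual."""
    return {"snp_a": a, "snp_b": b, "snp_c": c}


# Toy data (hypothetical): scores 0, 1 and 2 are all represented.
cases = ([person("CC", "AA", "AG")] * 2 + [person("TT", "AA", "AG")] * 3
         + [person("TT", "AC", "AG")] * 4)
controls = ([person("CC", "AA", "AG")] * 6 + [person("TT", "AA", "AG")] * 4
            + [person("TT", "AC", "AG")] * 2)

ratios = odds_ratio_by_score(cases, controls)
print(ratios)
```

A gene-dosage effect corresponds to odds ratios that rise steadily with the score, as they do in this toy example. In practice such estimates would normally be adjusted for covariates, for instance by logistic regression.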
To investigate the effect of specific DNA repair SNP combinations on DTC risk, all possible 2 × 2 combinations were tested on paired SNP analysis, yielding fifteen SNP pairs with p < 0.01. Multiple interactions between SNPs from different DNA repair pathways and even other DNA damage response proteins have previously been reported [39,42,66,87], providing a rationale for such an approach. Of note, CCNH rs2230641 was the most frequently represented DNA repair SNP in such significant combinations, at both the 0.01 and 0.05 significance levels, a finding that is compatible with the pleiotropic role of CCNH in DNA damage repair, cell cycle regulation and receptor phosphorylation [48,110]. More importantly, the contribution of MMR variants to the joint effect of DNA repair SNPs on DTC risk is evident from our results, as they were present in 9 of the 15 SNP pairs presenting significant findings. Besides its critical role in post-replication repair (through recognition and repair of base-base mispairs and insertion/deletion loops that arise during replication), the MMR pathway cooperates with other repair pathways in the recognition and subsequent repair of DNA damage induced by IR, UV light, oxidative stress or genotoxic chemicals (e.g., oxidative lesions, double strand breaks, pyrimidine dimers and inter-strand crosslinks) and contributes to damage-induced cytotoxicity through downstream signalling for cell cycle arrest and apoptosis [133][134][135]. Therefore, considering the broad spectrum of action of the MMR pathway, an elevated number of interactions between MMR and other DNA repair SNPs is expected. Such a hypothesis, in line with our findings, has been recently strengthened by a report [136] associating SNPs from different DNA repair pathways with colorectal cancer (CRC) in Lynch syndrome patients, a cancer predisposition condition caused by germline MMR mutations.
Moreover, among SNP pairs presenting significant findings in this study, three are intra-pathway combinations involving either HR or MMR pathway SNPs. The joint effects of the MLH3 rs175080-MSH6 rs1042821 and MSH4 rs5745549-MSH6 rs1042821 (MMR pathway) SNP combinations were reported and discussed in our original study [18]. The joint effect of RAD51 rs1801321 and XRCC3 rs861539 (HR pathway) on cancer risk has been previously reported for breast cancer [137], in line with our results, and may be of particular relevance for DTC since the formation of radiation damage-induced RAD51 foci requires functional XRCC3 [138].
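The paired analysis discussed above can be sketched as follows. This is an illustration under stated assumptions and not the procedure used in this study: the SNP labels are placeholders, the 2 × 2 cell counts are hypothetical, and the Woolf (log-based) confidence interval is one common choice among several. The abstract reports 595 SNP-SNP combinations, which equals the number of unordered pairs among 35 items; that the pairs were built from 35 SNPs is an inference, not something the text states.

```python
from itertools import combinations
from math import exp, log, sqrt

# 35 placeholder SNP labels; every unordered pair is one candidate combination.
snps = [f"snp_{i:02d}" for i in range(1, 36)]
pairs = list(combinations(snps, 2))
n_pairs = len(pairs)


def joint_odds_ratio(a: int, b: int, c: int, d: int) -> tuple:
    """Odds ratio and Woolf 95% confidence interval for a 2 x 2 table.

    a = cases carrying both risk genotypes, b = remaining cases,
    c = controls carrying both risk genotypes, d = remaining controls.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for one SNP pair.
odds_ratio, ci_low, ci_high = joint_odds_ratio(a=12, b=38, c=10, d=90)
print(n_pairs, round(odds_ratio, 2), round(ci_low, 2), round(ci_high, 2))
```

Because every pair is tested separately, the number of tests grows quadratically with the number of SNPs, which is one reason replication in independent populations matters for the pairs reported here.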
Finally, on applying haplotype analysis to SNPs that are located on the same chromosome arm (thus likely to segregate together), one block of DNA repair SNPs located on chromosome 5q (comprising CCNH rs2230641, CDK7 rs2972388, MSH3 rs26279, MSH3 rs184967, XRCC4 rs1805377 and XRCC4 rs28360135) was associated with DTC risk in our study. Such results further suggest an independent or interactive effect of these SNPs on DTC predisposition.
Overall, our results suggest that DNA repair SNPs across different pathways may contribute to DTC predisposition, possibly exerting cumulative effects. This is of relevance since the estimated high heritability of DTC is only partially explained, even when considering the contribution of several GWAS recently performed. Gene-gene and gene-environment interactions have been hypothesised to play an important role, so their identification and in-depth study are highly desirable to explain the "missing" heritability of DTC. However, the results presented here should be regarded only as proof of concept and must therefore be validated through replication in larger independent populations. Future studies should also be designed to account for environmental factors such as IR exposure and iodine deficiency (and their potential interaction with genetic factors). In addition, they should be sufficiently powered to allow other, less frequent but potentially relevant SNPs to be studied and to allow more sophisticated and conclusive gene-gene interaction analyses to be performed. Finally, in order to strengthen our preliminary findings, the functional significance of these SNPs should be further investigated, as well as their potential association with mutational events involved in DTC carcinogenesis (e.g., BRAF mutations and RET/PTC rearrangements).