Effect of Single Nucleotide Polymorphisms in the Vitamin D Metabolic Pathway on Susceptibility to Non-Small-Cell Lung Cancer

The pathogenesis of non-small-cell lung cancer (NSCLC) is complex, since many risk factors have been identified. Recent research indicates that polymorphisms in the metabolic pathway of vitamin D may be involved in both risk and survival of the disease. The objective of this study is to assess the effect of 13 genetic polymorphisms involved in the vitamin D metabolic pathway on the risk of suffering from NSCLC. We conducted an observational case-control study, which included 204 patients with NSCLC and 408 controls, of Caucasian origin, from southern Spain. The CYP27B1 (rs4646536, rs3782130, rs703842, rs10877012), CYP2R1 (rs10741657), GC (rs7041), CYP24A1, and VDR (BsmI, Cdx-2, FokI, ApaI, TaqI) gene polymorphisms were analyzed by real-time polymerase chain reaction. The logistic regression model, adjusted for smoking and family history of cancer, revealed that in the genotypic model, carriers of the VDR BsmI rs1544410-AA genotype were associated with a lower risk of developing NSCLC compared to the GG genotype (p = 0.0377; OR = 0.51; CI95% = 0.27–0.95; AA vs. GG). This association was maintained in the recessive model (p = 0.0140). Haplotype analysis revealed that the AACATGG and GACATGG haplotypes for the rs1544410, rs7975232, rs731236, rs4646536, rs703842, rs3782130, and rs10877012 polymorphisms were associated with a lower risk of NSCLC (p = 0.015 and p = 0.044 respectively). The remaining polymorphisms showed no effect on susceptibility to NSCLC. The BsmI rs1544410 polymorphism was significantly associated with lower risk of NSCLC and could be of considerable value as a predictive biomarker of the disease.


Introduction
Lung cancer is one of the most serious malignancies, with the highest mortality (18% of cancer deaths worldwide) [1]. It also has the highest incidence in men (14.3%) and is the third most common cancer in women (8.4%), after breast and colorectal cancer. Taking both sexes together, it has the second highest incidence after breast cancer [1]. It is estimated that by 2040 the incidence of lung cancer will rise by 22.4% in Europe and that the increase in mortality will be 25.5% [2]. According to the latest cancer statistics, 236,740 new cases and 130,180 deaths are projected to occur in the United States in 2022 [3].
Lung cancer is a heterogeneous disease comprising various types. It is classified primarily as small-cell lung cancer (SCLC, 13% of cases) and non-small-cell lung cancer (NSCLC, 83% of cases). The latter, in turn, can be divided into three subtypes: adenocarcinoma, squamous cell carcinoma, and large-cell carcinoma.
The controls were individuals aged over 18 years with no personal history of malignancies who had lived in the same geographical area and were recruited from the same hospital.
This case-control study was carried out in accordance with the Declaration of Helsinki and was approved by the Ethics and Research Committee of the Andalusian Public Health Service's Biobank (Code: 1322-N-20). The subjects signed a written informed consent form for collection of blood or saliva samples and their donation to the Biobank. The samples were coded and treated confidentially.

Sociodemographic and Clinical Variables
The sociodemographic data include gender, age at diagnosis, smoking status, drinking status, family history of cancer, and previous lung disease. Individuals were classified as non-smokers if they had never smoked or had smoked fewer than 100 cigarettes in their lives, as ex-smokers if they had smoked 100 or more cigarettes in their lives but did not currently smoke, and as active smokers if they had smoked 100 or more cigarettes in their lives and currently smoke. Individuals were classified by standard drink units (SDUs) as non-drinkers if they were teetotalers or did not consume alcohol regularly, as active drinkers if their alcohol consumption was greater than 4 SDUs per day in men and greater than 2.5 SDUs per day in women, and as ex-drinkers if their alcohol consumption had been greater than 4 SDUs per day in men and greater than 2.5 SDUs per day in women but they did not currently drink [32]. Histopathologic data (tumor histology and stage) were also collected. The staging system used to classify the tumors was based on the guidelines of the American Joint Committee on Cancer [33].
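The classification rules above can be written as a minimal Python sketch. The function names, argument names, and the treatment of regular drinkers who stay at or below the thresholds (here grouped with non-drinkers) are assumptions made for illustration; they are not taken from the study protocol.

```python
def smoking_status(lifetime_cigarettes: int, currently_smokes: bool) -> str:
    """Classify smoking status using the 100-cigarette lifetime threshold."""
    if lifetime_cigarettes < 100:
        return "non-smoker"
    return "active smoker" if currently_smokes else "ex-smoker"


def drinking_status(sex: str, current_sdu_per_day: float,
                    past_sdu_per_day: float) -> str:
    """Classify drinking status by standard drink units (SDUs) per day.

    Thresholds from the text: > 4 SDUs/day in men, > 2.5 SDUs/day in women.
    Assumption: anyone who does not exceed the threshold, now or in the
    past, is grouped with non-drinkers.
    """
    threshold = 4.0 if sex == "male" else 2.5
    if current_sdu_per_day > threshold:
        return "active drinker"
    if past_sdu_per_day > threshold and current_sdu_per_day == 0:
        return "ex-drinker"
    return "non-drinker"
```

For example, `smoking_status(100, False)` yields `"ex-smoker"`, because the subject reached the 100-cigarette threshold but no longer smokes.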

DNA Isolation
Blood samples (3 mL) were collected in BD Vacutainer® K3E Plus blood collection tubes and saliva samples in BD Falcon™ 50 mL conical tubes (BD, Plymouth, UK). DNA was extracted using the QIAamp DNA Mini extraction kit (Qiagen GmbH, Hilden, Germany), according to the manufacturer's instructions for purification of DNA from blood or saliva, and stored at −40 °C. The concentration and purity of the DNA were measured using the NanoDrop 2000™ UV spectrophotometer with the 260/280 and 260/230 absorbance ratios. The DNA samples, isolated from blood or saliva, were preserved in the Biobank of the Hospital Universitario Virgen de las Nieves, part of the Andalusian Public Health Service's Biobank.

Detection of Gene Polymorphisms and Quality Control
We determined the gene polymorphisms by real-time PCR allelic discrimination assay using TaqMan® probes (ABI Applied Biosystems, QuantStudio 3 Real-Time PCR System, 96 wells), following the manufacturer's instructions (Table 1). Ten per cent of the results were confirmed by Sanger sequencing. Real-time PCR and Sanger sequencing were performed in the Pharmacogenetics Unit of the Hospital Universitario Virgen de las Nieves. The criteria for SNP quality control were: (1) missing genotype rate per SNP < 0.05; (2) minor allele frequency > 0.01; (3) p value > 0.05 in the Hardy-Weinberg equilibrium test; (4) missing genotype rate between cases and controls < 0.05.
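As an illustration of quality-control criteria (1) to (3), the following Python sketch checks a single SNP from its genotype counts. It is not the PLINK procedure used in the study: the function names are hypothetical, the Hardy-Weinberg test is assumed to be a one-degree-of-freedom chi-square goodness-of-fit test, and criterion (4) is left out because its exact definition is not given in the text.

```python
import math


def hwe_p_value(n_dd: int, n_Dd: int, n_DD: int) -> float:
    """Chi-square (1 df) p-value for Hardy-Weinberg equilibrium."""
    n = n_dd + n_Dd + n_DD
    q = (2 * n_DD + n_Dd) / (2 * n)   # frequency of allele D
    p = 1.0 - q                       # frequency of allele d
    observed = (n_dd, n_Dd, n_DD)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_dd: int, n_Dd: int, n_DD: int, n_missing: int) -> bool:
    """Apply criteria (1)-(3): missing rate, minor allele frequency, HWE."""
    n_called = n_dd + n_Dd + n_DD
    missing_rate = n_missing / (n_called + n_missing)
    q = (2 * n_DD + n_Dd) / (2 * n_called)
    maf = min(q, 1.0 - q)
    return (missing_rate < 0.05
            and maf > 0.01
            and hwe_p_value(n_dd, n_Dd, n_DD) > 0.05)
```

Working from genotype counts rather than individual genotypes keeps the sketch short; a real pipeline would derive the counts from the genotyping output first.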

Statistical Analysis
We matched cases and controls by age and gender using the 1:2 propensity score matching method [34]. Quantitative data were expressed as the mean (± standard deviation) for variables with normal distribution and as medians and percentiles (25th and 75th) for variables with non-normal distribution. We used the Shapiro-Wilk test to verify normality.
We determined the Hardy-Weinberg equilibrium and haplotype frequency through the D' and r² coefficients. The bivariate association between risk of NSCLC and polymorphisms was evaluated for multiple models (genotypic, additive, allelic, dominant and recessive), using the Pearson chi-square and Fisher exact tests, and assessed with the odds ratio and corresponding 95% confidence interval (CI). We defined the models as follows: genotypic (DD vs. Dd vs. dd), allelic (D vs. d), dominant ((DD, Dd) vs. dd), recessive (DD vs. (Dd, dd)), and additive (dd = 0, Dd = 1, DD = 2), where D is the minor allele and d the major allele. We used the Bonferroni adjustment for multiple comparisons. Unconditional multiple logistic regression models (genotypic, dominant, and recessive) were considered to determine the influence of possible confounding variables on the risk of suffering from lung cancer.
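The genetic models defined above can be illustrated with a short Python sketch. It is not the PLINK or R analysis used in the study: the function names are hypothetical, the confidence interval uses the standard Wald (Woolf) approximation on the log odds ratio, and the number of comparisons passed to the Bonferroni adjustment is a free parameter because it is not stated in the text.

```python
import math


def encode(genotype: str, model: str) -> int:
    """Encode a genotype ('DD', 'Dd' or 'dd'; D = minor allele) under a model."""
    n_minor = genotype.count("D")
    if model == "additive":      # dd = 0, Dd = 1, DD = 2
        return n_minor
    if model == "dominant":      # (DD, Dd) vs. dd
        return int(n_minor >= 1)
    if model == "recessive":     # DD vs. (Dd, dd)
        return int(n_minor == 2)
    raise ValueError(f"unknown model: {model}")


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


def bonferroni(p_value: float, n_comparisons: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)
```

As a usage example with hypothetical counts, `odds_ratio_ci(20, 80, 40, 60)` compares DD carriers (20 cases, 80 controls) with carriers of the d allele (40 cases, 60 controls) under the recessive model and gives an odds ratio of 0.375, that is, lower odds of disease among DD carriers in that invented table.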
* The SNPs were analyzed using custom assays by ThermoFisher Scientific (Waltham, MA, USA).
All the tests were two-sided, with a significance level of p < 0.05, and were performed using PLINK and R 4.0.2 software [35,36]. We performed the linkage disequilibrium analysis with Haploview 4.2 and the haplotype analysis with SNPStats [37,38].
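For readers unfamiliar with the D' and r² coefficients, the following Python sketch shows how they are conventionally computed for two biallelic loci from allele and haplotype frequencies. It does not reproduce the Haploview implementation; the function name and arguments are hypothetical, and the frequencies are assumed to be known rather than estimated from unphased genotypes.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus (0 < p_a < 1)
    p_b:  frequency of allele B at the second locus (0 < p_b < 1)
    """
    d = p_ab - p_a * p_b
    # D is scaled by its largest attainable magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

With `p_ab = 0.5`, `p_a = 0.5`, and `p_b = 0.5` the two alleles always occur together, so both coefficients equal 1; with `p_ab = 0.25` the loci are independent and both equal 0.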

Patient Characteristics
The study included a total of 204 cases of NSCLC and 408 controls, whose clinicopathological characteristics were compared between groups. There were statistically significant differences between the cases and the controls with respect to smoking status (p < 0.001; OR = 8.88; CI95% = 5.42-14.9; current smokers vs. non-smokers and p < 0.001; OR = 3.43; CI95% = 2.14-5.63; ex-smokers vs. non-smokers) and family history of cancer (p < 0.001; OR = 15.2; CI95% = 9.55-25.2; yes vs. no). No statistically significant differences were observed between the two groups in gender (p = 0.1898), age (p = 0.1030), drinking status (p = 0.1392), or previous lung disease (p = 0.9044).
Nutrients 2022, 14
than 1% and none of them was excluded from the analysis (Table S3). The haplotype frequency estimates are presented in Table S4.

Influence of Genetic Polymorphisms on the Risk of NSCLC
We used the genotypic, additive, allelic, dominant, and recessive models to perform the bivariate analysis between the gene polymorphisms and the risk of suffering from NSCLC (Table S5). A statistically significant association was observed in the following SNPs: VDR BsmI rs1544410, in the genotypic (p = 0.0020), additive (p = 0.0122), allelic (p = 0.0121), and recessive (p = 0.0006) models; VDR TaqI rs731236, in the genotypic (p = 0.0299) and recessive (p = 0.0124) models; and CYP24A1 rs6068816, in the genotypic (p = 0.0292), additive (p = 0.0316), allelic (p = 0.0353), and recessive (p = 0.0194) models (Table S5). However, after the Bonferroni adjustment, the only SNP that maintained a statistically significant association with the risk of developing NSCLC was VDR BsmI rs1544410 in the genotypic and recessive models (Table S5). In the genotypic model, patients carrying the VDR BsmI rs1544410-AA genotype had a lower risk of NSCLC relative to the GG genotype (p Bonferroni-adjusted = 0.0361; OR = 0.457; CI95% = 0.26-0.76; AA vs. GG). Moreover, in the recessive model, patients carrying the VDR BsmI rs1544410-AA genotype exhibited a lower risk of NSCLC than those carrying the G allele (p Bonferroni-adjusted = 0.0082; OR = 0.442; CI95% = 0.26-0.70; AA vs. G) (Table 3). The logistic regression model adjusted for smoking and family history of cancer revealed that, in the genotypic model, carriers of the VDR BsmI rs1544410-AA genotype were associated with a lower risk of developing NSCLC compared to the GG genotype (p = 0.0377; OR = 0.51; CI95% = 0.27-0.95; AA vs. GG). This association was maintained in the recessive model, where patients carrying the AA genotype showed a lower risk of suffering from NSCLC than those carrying the G allele (p = 0.0140; OR = 0.49; CI95% = 0.27-0.85; AA vs. G) (Table 4). The other SNPs analyzed showed no statistically significant association with developing NSCLC in any of the models studied (Table S5).
In performing the haplotype analysis, we took account of the polymorphisms that were in strong linkage disequilibrium (Table S2). We found that the AACATGG and GACATGG haplotypes, for the rs1544410, rs7975232, rs731236, rs4646536, rs703842, rs3782130, and rs10877012 SNPs, were associated with a lower risk of NSCLC (p = 0.015; OR = 0.63; CI95% = 0.44-0.91 and p = 0.044; OR = 0.11; CI95% = 0.01-0.94 respectively) (Table 5).

Discussion
The pathogenesis of NSCLC is complex, since numerous risk factors have been identified, such as smoking, exposure to radon, air pollution, history of previous lung disease, somatic mutations, and low serum levels of vitamin D [5,7]. Recent research indicates that polymorphisms involved in the metabolic pathway of vitamin D are related both to survival of the disease [12,17,40–43] and to its pathogenesis [18–31]. It is therefore important to explore the impact of these associations in different populations. In this case-control study we investigate the influence of 13 genetic polymorphisms in five genes involved in the metabolic pathway of vitamin D on susceptibility to NSCLC in a Caucasian population (Spain).
Recently, the effect of the rs6068816 and rs4809957 SNPs of the CYP24A1 gene on the risk of NSCLC has been studied [24,27,29–31]. This gene is responsible for synthesizing the enzyme involved in degrading vitamin D and thereby preventing it from accumulating. The results obtained in our study relate the rs6068816-C allele to lower risk of developing NSCLC in the bivariate analysis. However, this significance was not maintained after the Bonferroni adjustment. Nevertheless, our results are in line with those presented in the literature. A meta-analysis consisting of two studies conducted in Asian populations (China) (1056 cases/1302 controls) showed that the rs6068816 SNP was significantly associated with the risk of suffering from lung cancer. In particular, the rs6068816-C allele showed a protective effect against the development of NSCLC (p = 0.031; OR = 0.88; CI95% = 0.78-0.99; I² = 0%; p heterogeneity = 0.667; C vs. T and p = 0.049; OR = 0.85; CI95% = 0.72-1.00; I² = 0%; p heterogeneity = 0.955; CC vs. T). However, the significance was not maintained after the Bonferroni adjustment [29]. As for the rs4809957 SNP, no significant association was found in our study in any of the models analyzed. However, a previous study comprising 603 cases and 661 controls of Asian descent (China) found that rs4809957 was associated with the risk of NSCLC [27].
The rs4646536, rs3782130, rs10877012, and rs703842 SNPs in the CYP27B1 gene, the only one capable of synthesizing the α-1-hydroxylase enzyme, which is responsible for the second hydroxylation in the vitamin D activation process, showed no statistically significant associations with the risk of NSCLC in our study. The effect of the rs4646536 and rs703842 SNPs on susceptibility to developing lung cancer has not so far been evaluated. However, their effect on other cancers (breast, colorectal, and prostate, among others) has been assessed, without any statistically significant association being found [48]. On the other hand, two studies evaluating the effect of the rs10877012 and rs3782130 SNPs on the risk of developing NSCLC have been conducted [27].
The rs7041 SNP is located in exon 11 in domain III of the GC gene. This gene synthesizes the vitamin D binding protein (VDBP), which has immunomodulatory functions, specifically in the lung, related to activation of macrophages and chemotaxis of neutrophils [50]. Consequently, SNPs in the GC gene may alter the function of the protein and increase the risk of disease [51]. Our results did not show a statistically significant association between the rs7041 SNP and susceptibility to NSCLC in any of the genetic models analyzed. These results are in line with a meta-analysis comprising 3 articles with Asian populations (China) (1142 cases/1219 controls), which showed that the rs7041 SNP was not associated with risk of lung cancer in any of the models analyzed [28]. However, when the subgroup analysis by cancer subtypes was performed, it showed that the G allele and the GG genotype might be protective factors against the development of NSCLC in an Asian population (China) (p = 0.004; OR = 0.76; CI95% = 0.62-0.92; I² = 53%; p heterogeneity = 0.14; G vs. T and p = 0.03; OR = 0.58; CI95% = 0.35-0.95; I² = 51%; p heterogeneity = 0.15; GG vs. T respectively) [28].
The main limitation of this study is the limited size of the sample compared to other studies, particularly with regard to cases. This may have prevented detection of associations of some polymorphisms. However, despite this limited sample size, after the Bonferroni adjustment was applied to avoid false-positive associations, the effect of VDR BsmI rs1544410 on susceptibility to NSCLC remained. The strengths of our study include a very homogeneous cohort of cases, consisting solely of patients with NSCLC diagnosed by the same team of pathologists and recruited in the same geographical area, increasing their uniformity.
To summarize, our results suggested that the BsmI rs1544410 polymorphism in the VDR gene may act substantially as a protective factor against developing NSCLC. Further studies in different populations could help to find additional associations between other genes and polymorphisms in the vitamin D metabolic pathway and the risk of NSCLC.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/nu14214668/s1. Table S1: Hardy-Weinberg equilibrium.
Informed Consent Statement: All subjects involved in the study signed the written informed consent form.