Single Nucleotide Polymorphisms in HMGB1 Correlate with Lung Cancer Risk in the Northeast Chinese Han Population

Lung cancer is the principal cause of cancer-associated deaths. HMGB1 has been reported to be associated with tumorigenesis. This study aimed to investigate the relationship between the rs1412125 and rs1360485 polymorphisms in HMGB1 and the risk and survival of lung cancer. A total of 850 cases and 733 controls were included. Logistic regression analysis and survival analysis were performed to investigate the association between the SNPs and the risk and survival of lung cancer. Crossover analysis was used to analyze the interaction between the SNPs and tobacco exposure. The rs1412125 polymorphism was associated with lung cancer risk, especially with the risk of lung adenocarcinoma and small cell lung cancer. Carriers of the CT and CC genotypes had a decreased risk of lung cancer (CT + CC vs. TT: adjusted OR = 0.736, p = 0.004). Similar results were obtained in the stratified analyses of non-smokers and of the female population. For the rs1360485 polymorphism, the AG and GG genotypes were associated with a decreased risk of lung adenocarcinoma and of female lung cancer (0.771-fold and 0.789-fold, respectively). However, no significant interaction between the polymorphisms and tobacco exposure, and no association between the SNPs and lung cancer survival, was observed. These findings indicate that polymorphisms in HMGB1 may be a novel biomarker for female lung adenocarcinoma risk.


Introduction
Lung cancer, with its low 5-year survival rate, is one of the most frequently diagnosed tumors [1]. The mechanism of its progression and development is still unclear. Smoking has been recognized as a pivotal environmental risk factor for lung cancer. Carcinogens and their metabolites (i.e., N-nitrosamines and polycyclic aromatic hydrocarbons) from cigarette smoke can activate multiple pathways such as cell proliferation, migration and apoptosis involved in tumor promotion and progression [2]. In addition to environmental factors, genetic susceptibility may participate in the development and progression of lung cancer too. To predict the susceptibility and survival of lung cancer, understanding of the genetic mechanisms, including DNA genotyping and heterogeneity as well as the association between the genetic susceptibility and cigarette smoking is increasingly required [3].
Analysis aiming at comparing distribution frequencies of genotypes among different subgroups is increasingly conducted to assess the susceptibility and survival of cancers [28][29][30]. Previous studies have reported the HMGB1 polymorphisms might efficiently predict the risk of susceptibility to different cancers such as hepatocellular carcinoma, uterine cervical neoplasia as well as colorectal cancer [31][32][33][34]. Thus we conducted this study to evaluate the association between the two SNPs (rs1412125 and rs1360485) in HMGB1 and the susceptibility as well as the survival of lung cancer to provide a new biomarker for the diagnosis, susceptibility and prognosis of lung cancer (as shown in Figure 1).

Study Characteristics
The baseline data of the 850 cases and 733 controls in this study are summarized in Table 1. There was no significant difference between the two groups in the distribution of age (58.63 ± 11.16 years for cases and 57.46 ± 13.18 years for controls, p = 0.056) or family history of cancer (p = 0.688). However, the distributions of both smoking status and gender differed significantly between cases and controls (66.5% and 71.2% non-smokers, 35.38 ± 10.81 pack-years for cases and 29.41 ± 14.76 pack-years for controls, 74.6% and 87.3% female individuals for the case and control groups respectively, p < 0.001). Therefore, all further statistical analyses were adjusted for age, gender and smoking status, and stratified analyses by gender and smoking status were also conducted to limit the influence of these imbalances on the results. Among the 850 cases, there were 507 cases of lung adenocarcinoma (LAD), 212 cases of lung squamous cell carcinoma (LSCC), 106 cases of small cell lung cancer (SCLC) and 25 cases of other histological types. As for clinical stage, 55 cases were in stage I, 126 cases were in stage II and 488 cases were in stage III-IV; clinical stage information was not obtained for the remaining 181 cases. The median follow-up time was 18 months among the 365 cases for whom follow-up information was obtained.
Table 2 summarizes the distributions of the rs1412125 and rs1360485 alleles and genotypes in cases and controls, as well as the associations between genotypes and the susceptibility to lung cancer. The genotype frequencies of the rs1412125 (p = 0.529) and rs1360485 (p = 0.945) polymorphisms were in Hardy-Weinberg equilibrium (HWE) in controls. For the rs1412125 polymorphism, the distribution of the TT and CT genotypes differed significantly: carriers of the CT genotype had a 0.744-fold decreased risk of lung cancer compared with carriers of the TT genotype.
The distribution of the rs1412125 polymorphism also differed significantly between cases and controls under the dominant model: subjects with the CT and CC genotypes had a 0.736-fold decreased risk of lung cancer compared with those carrying the TT genotype. In the allele comparison for the rs1360485 polymorphism, the G allele decreased the risk of lung cancer by 0.829-fold, whereas the individual rs1360485 genotypes were not associated with lung cancer risk. Tables 3 and 4 indicate that the rs1412125 polymorphism might affect the risk of LAD and SCLC. Relative to TT genotype carriers, individuals with the CT genotype had a 0.764-fold and 0.576-fold decreased risk of LAD and SCLC, respectively, while under the dominant model individuals who carried the CT and CC genotypes had a 0.752-fold and 0.595-fold decreased risk of LAD and SCLC, respectively. Among non-smokers, the CT genotype decreased the risk of lung cancer by 0.756-fold, and carriers of the CT and CC genotypes had a 0.750-fold decreased risk. In the female population, carriers of the CT genotype had a 0.727-fold decreased risk of lung cancer and carriers of the CT and CC genotypes had a 0.725-fold decreased risk. Stratified analyses for SNP rs1360485 suggested that the AG and GG genotypes could decrease the risk of LAD by 0.771-fold and the risk of female lung cancer by 0.789-fold, with the AA genotype as the reference (Tables S1 and S2).

Interaction between SNPs and Tobacco Exposure
Results of the crossover analysis are summarized in Table 5. For the rs1412125 polymorphism, compared with non-smokers who carried the CT and CC genotypes, TT genotype carriers with tobacco exposure, CT and CC carriers with tobacco exposure, and TT genotype carriers without tobacco exposure had a 3.456-fold, 2.467-fold and 1.399-fold increased risk of lung cancer, respectively. Nevertheless, there was no additive or multiplicative interaction between the rs1412125 or rs1360485 polymorphisms and tobacco exposure (Tables 6 and 7).
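As a rough illustration of how the interaction measures relate to the three odds ratios quoted above for rs1412125, the following Python sketch computes point estimates only, using the standard definitions of RERI, AP and S. The function name and argument names are hypothetical, and the significance judgement in the text rests on the confidence intervals reported in Tables 6 and 7, which this sketch does not reproduce.

```python
def additive_interaction(or_both, or_exposure_only, or_genotype_only):
    """Point estimates of interaction measures from three odds ratios.

    or_both          : OR for risk genotype with tobacco exposure
    or_exposure_only : OR for tobacco exposure without the risk genotype
    or_genotype_only : OR for the risk genotype without tobacco exposure
    All ORs share the same reference group (neither factor present).
    """
    # Relative Excess Risk due to Interaction
    reri = or_both - or_exposure_only - or_genotype_only + 1.0
    # Attributable Proportion due to interaction
    ap = reri / or_both
    # Synergy index
    s = (or_both - 1.0) / ((or_exposure_only - 1.0) + (or_genotype_only - 1.0))
    # Ratio of the joint OR to the product of the single-factor ORs
    # (a value near 1 suggests no multiplicative interaction)
    mult = or_both / (or_exposure_only * or_genotype_only)
    return reri, ap, s, mult


# Odds ratios quoted in the text for rs1412125 (reference: non-smokers with CT + CC)
reri, ap, s, mult = additive_interaction(3.456, 2.467, 1.399)
print(f"RERI = {reri:.3f}, AP = {ap:.3f}, S = {s:.3f}, ratio = {mult:.3f}")
# prints: RERI = 0.590, AP = 0.171, S = 1.316, ratio = 1.001
```

These point estimates are only arithmetic on the quoted odds ratios; whether they differ from the null values (0 for RERI and AP, 1 for S and the ratio) depends on the confidence intervals.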

Survival Analysis
Results of the Cox regression analysis demonstrated no significant association between these two SNPs and the survival of lung cancer (Table 8). In addition, no significant difference was observed in the distribution of the rs1412125 or rs1360485 polymorphisms across clinical stages (Table S3).

Discussion
Lung cancer, with its high morbidity and mortality, is one of the most malignant cancers worldwide. Reducing this morbidity and mortality is therefore an important challenge for public healthcare. The development and progression of lung cancer comprise multiple processes that can be affected by environmental factors and by genetic or epigenetic regulation. Genetic mutations are involved in tumorigenesis, cancer progression and prognosis, while SNPs can regulate gene expression or affect gene function and thereby alter phenotypes.
SNP rs1412125 (−1615T/C), located 1615 base pairs upstream of HMGB1, acts as a transcriptional repressor that inhibits transcription [35,36]. The mutant C allele would lose this inhibitory function and result in the overexpression of HMGB1. SNP rs1360485 is located in an intron region of HMGB1. Although the rs1360485 polymorphism does not change the sequence of the HMGB1 protein, it might regulate the transcription of HMGB1 or other genes. A series of studies have reported associations between the rs1412125 or rs1360485 polymorphisms and the risk of cancers such as oral squamous cell carcinoma [27], hepatocellular carcinoma [31], uterine cervical neoplasia [33] and colorectal cancer [34], as well as the lung cancer chemotherapy response [37]. Lin et al. conducted a study to verify the association between four SNPs (rs1412125, rs2249825, rs1045411, and rs1360485) in HMGB1 and the risk of oral squamous cell carcinoma (OSCC). They found that only the rs1045411 polymorphism could affect the risk of OSCC, while the other three SNPs might not be related to the susceptibility to OSCC [27]. Wu's study showed an association between SNPs of HMGB1 and the susceptibility to uterine cervical neoplasia in Taiwanese women. For the HMGB1 rs1412125 polymorphism, the risk of cervical invasive cancer was 1.85-fold for women with TC and 1.99-fold for women with TC/CC when compared with TT carriers [33]. The association of HMGB1 polymorphisms with the risk of colorectal carcinoma was also investigated in a Chinese population. However, there was no significant association between the rs1412125 polymorphism and the susceptibility to colorectal cancer [34]. Wang et al. reported the effects of four HMGB1 SNPs (rs1412125, rs1045411, rs2249825, and rs1360485) on the susceptibility and development of hepatocellular cancer; for the rs1412125 polymorphism, carriers of the TT genotype had a higher risk of distant metastasis than individuals carrying at least one C allele [31]. In lung cancer, significant associations were found between the rs1412125 polymorphism and the platinum-based chemotherapy response in both the genotypic and recessive models. The same result was also observed in the subgroup of cases aged over 55 years in the additive, recessive and genotypic models [37].
In the current study, we estimated the association between two SNPs (rs1412125 and rs1360485) in HMGB1 and the susceptibility to lung cancer among 850 cases and 733 controls. There are two main reasons why we did not study the other two SNPs (rs2249825 and rs1045411): (1) the results of our previous GWAS study indicated that an association between the two SNPs (rs1412125 and rs1360485) and the risk of lung cancer might exist, whereas the other two SNPs (rs2249825 and rs1045411) were not among the GWAS loci of lung cancer; (2) according to the Ensembl database (http://www.ensembl.org/index.html) and previous articles, strong linkage disequilibrium exists not only between rs1360485 and rs1045411 but also between rs1360485 and rs2249825 [27,38,39]. This means that studying only one of these three SNPs is sufficient to explore their association with lung cancer risk, because the result can represent all three. Among these three SNPs, rs1360485 is the most common one, so we selected rs1360485 as a tagSNP rather than studying all three. Therefore, only rs1412125 and rs1360485 were included in this study. Results indicated that the rs1412125 polymorphism in HMGB1 was associated with the susceptibility to lung cancer, especially to LAD and SCLC. Subjects who carried the TT genotype had a higher risk of LAD and SCLC than those who carried the CT and CC genotypes. The same results were obtained in the stratified analyses of non-smokers and of the female population. For the rs1360485 polymorphism, the G allele could reduce the risk of lung cancer, and stratified analyses demonstrated that the AG and GG genotypes could reduce the risk of LAD as well as female lung cancer compared with the AA genotype. The results of our study were inconsistent with those of a previous study conducted by Hu et al. [39].
They examined four HMGB1 SNPs (rs2249825, rs1360485, rs1045411 and rs1412125) in 190 lung cancer cases and 187 healthy controls. Their results indicated that the CT or TT + CT genotypes of the rs1045411 polymorphism could reduce the risk of lung cancer, and that the T/C/G haplotype of rs1045411, rs2249825 and rs1360485 also decreased the risk of lung cancer by 0.486-fold, while no significant association was found between the rs1412125 polymorphism and the susceptibility to lung cancer. The inconsistency with our results might arise from the following aspects: (a) different sample sizes. The current study enrolled 850 cases and 733 controls, while only 190 patients and 187 controls participated in the genotypic frequency analysis in Hu's study; (b) different areas. Our study was conducted in the northeast of China, while Hu et al. selected the east of China. Although all participants in both studies were from the Chinese Han population, the results could still differ because of unknown environmental factors that vary between areas; (c) different inclusion criteria. Hu's findings might have been influenced by the inclusion of cases that had been diagnosed with lung cancer before their study began, which might result in prevalence-incidence bias. The present study enrolled only newly diagnosed cases and so could avoid this problem.
This study has some limitations. Firstly, the sample size was relatively small, especially in the stratified analyses. Secondly, all the participants were from the northeast of China, which might bias the results. Therefore, studies with larger sample sizes and more diversified populations are needed to confirm these results.
In spite of these limitations, the present study has some advantages. One strength is that this was a multi-center, large sample-size study, which could enhance the reliability of the results. In addition, all the cases were newly diagnosed with lung cancer, which prevents prevalence-incidence bias and makes the results more credible. Finally, not only the pooled analysis but also stratified analyses by smoking status, gender and pathological type were conducted, providing a detailed analysis of the association between single nucleotide polymorphisms in HMGB1 and lung cancer risk.
In summary, this study provided evidence that polymorphisms in HMGB1 (rs1412125 and rs1360485) might alter individual susceptibility to lung cancer. However, larger studies in populations of different ethnicities and areas are still required to confirm these findings.

Study Subjects
This is a molecular epidemiologic study of lung cancer in Shenyang, located in northeast China. A total of 850 cases and 733 controls were included in our hospital-based case-control study. All of the cases were recruited from the First Affiliated Hospital of China Medical University, the Fourth Affiliated Hospital of China Medical University and Liaoning Cancer Hospital between January 2010 and January 2014. The inclusion criteria for the case group were as follows: (a) newly diagnosed, histologically confirmed lung cancer; (b) no prior chemotherapy or radiotherapy. The exclusion criteria were a previous cancer or a cancer metastasized from a different primary cancer. Meanwhile, 733 healthy controls without lung cancer were recruited from the medical examination center of the same hospital. Controls were frequency matched to cases on age (±5 years). All of the participants were unrelated ethnic Han Chinese. The data collected included age, gender, smoking status, clinical stage and pathologic type. Individuals who had smoked fewer than 100 cigarettes in their entire lifetime were considered non-smokers. The cases were followed up for at least two years. The study was conducted in accordance with the Declaration of Helsinki, the protocol was approved by the Institutional Review Board of China Medical University, and each subject signed an informed consent form.

DNA Isolation and Genotyping
The genomic DNA of every participant was extracted from a venous blood sample with the phenol-chloroform method. SNP genotyping was conducted on the 7500 Fast Real-Time PCR system (Applied Biosystems, Foster City, CA, USA), and the PCR TaqMan probes and primers were designed by the same company (assay ID C___8690889_10 for rs1412125 and C___8690872_10 for rs1360485). The reaction conditions for quantitative real-time PCR (qRT-PCR) were 95 °C for 10 min, followed by 47 cycles of 92 °C for 30 s and 60 °C for 1 min. For quality control, 5% of the samples from both the case and control groups were randomly chosen for repeat analysis, and the results were consistent with the original ones. The ancestral and derived alleles of the rs1412125 polymorphism are T and C respectively, while the ancestral and derived alleles of the rs1360485 polymorphism are A and G respectively. Therefore, the wild-type homozygotes of the rs1412125 and rs1360485 SNPs are TT and AA respectively, the mutant homozygotes are CC and GG respectively, and the heterozygotes are CT and AG respectively.

Statistical Analysis
Student's t-test was used for continuous variables and Pearson's χ² test for categorical variables. A goodness-of-fit χ² test was conducted to assess Hardy-Weinberg equilibrium (HWE) in the control group. Odds Ratios (ORs) and their 95% Confidence Intervals (95% CIs) were computed by unconditional logistic regression analysis, adjusted for age, gender and smoking status, to assess the association between the two SNPs and the susceptibility to lung cancer. Interaction between the SNPs and tobacco exposure was evaluated by crossover analysis. The multiplicative interaction was estimated by the OR and its 95% CI with the unconditional logistic regression model. Relative Excess Risk due to Interaction (RERI), Synergy Index (S) and Attributable Proportion due to Interaction (AP) were used to estimate the additive interaction. If the 95% CIs of RERI and AP did not contain 0 and the 95% CI of S did not contain 1, the difference was considered statistically significant and an interaction between the SNPs and tobacco exposure might exist [40]. Hazard Ratios (HRs) and their 95% CIs were computed by Cox regression analysis to evaluate the potential association between these two SNPs and the survival of lung cancer. All statistical tests were two-sided and carried out with SPSS software (version 22.0, IBM SPSS, Inc., Chicago, IL, USA). The criterion for statistical significance was p < 0.05.
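The goodness-of-fit χ² test for HWE can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the function name is hypothetical, the genotype counts in the usage lines are invented for illustration and are not the study's data, and the test uses one degree of freedom (three genotype classes, minus one, minus one estimated allele frequency).

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for a biallelic SNP and returns
    (chi-square statistic, p-value) with 1 degree of freedom.
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_hom_ref + n_het + n_hom_alt
    # Frequency of the reference allele estimated from the genotype counts
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    observed = (n_hom_ref, n_het, n_hom_alt)
    # Expected genotype counts under HWE: p^2, 2pq, q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (not from the study): 360 / 480 / 160 fits HWE exactly
chi2_fit, p_fit = hwe_chi_square(360, 480, 160)
# Hypothetical counts with no heterozygotes: a strong departure from HWE
chi2_dev, p_dev = hwe_chi_square(500, 0, 500)
print(chi2_fit, p_fit, chi2_dev, p_dev)
```

A p-value above 0.05 in controls, as reported for both SNPs in this study, indicates no detectable departure from HWE.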

Conclusions
The results of this study suggest that polymorphisms in HMGB1 are associated with the risk of lung cancer in non-smokers and in the female population, and that they could be used as biomarkers for the risk of lung cancer, especially LAD and SCLC. However, studies with larger sample sizes are needed to validate these findings, and the biological functions of these two polymorphisms in lung cancer remain to be explored.

Conflicts of Interest:
The authors declare no conflict of interest.