Polymorphisms in GEMIN4 and AGO1 Genes Are Associated with the Risk of Lung Cancer: A Case-Control Study in Chinese Female Non-Smokers

MicroRNA biosynthesis genes can affect the regulatory effect of global microRNAs on target mRNA and hence influence the genesis and development of human cancer. Here, we selected five single nucleotide polymorphisms (SNPs) (rs7813, rs2740349, rs2291778, rs910924, rs595961) in two key microRNA biosynthesis genes (GEMIN4 and AGO1) and systematically evaluated the association between these SNPs, the gene-environment interaction and lung cancer risk. To control the impact of cigarette smoking on lung cancer, we recruited Chinese female non-smokers for the study. The case-control study included 473 lung cancer cases and 395 cancer-free controls. Four SNPs showed statistically significant associations with lung cancer risk. After Bonferroni correction, rs7813 and rs595961 remained associated with lung cancer risk. In the stratified analysis, all five SNPs were associated with the risk of lung adenocarcinoma; after Bonferroni correction, the association remained significant for rs7813, rs910924 and rs595961. Haplotype analysis showed that the GEMIN4 haplotype C-A-G-T was protective against lung cancer. In the combined unfavorable genotype analysis, a progressively increasing gene-dose effect was observed in lung adenocarcinoma as the number of unfavorable genotypes increased. We also found that individuals exposed to cooking oil fumes showed a relatively high risk of lung cancer, but no interactions were found between cooking oil fume exposure or passive smoking exposure and these SNPs, on either an additive or a multiplicative scale. Overall, this is the first study showing that rs7813 and rs595961 could be meaningful as genetic markers for lung cancer risk.


Introduction
Lung cancer is one of the most common malignant tumors, affecting millions of people around the world. In 2012, about 1.8 million lung cancer patients were newly diagnosed, accounting for about 13% of all new cancer cases in the world [1]. Smoking is recognized as a primary environmental risk factor for lung cancer, but only a fraction of smokers will develop lung cancer. Several studies have observed that the incidence rate of lung cancer in non-smokers is increasing, especially for females in China. High incidence rates of lung cancer in Chinese female non-smokers appear to be related to other factors [2]. Therefore, exploring other lung cancer risk factors in Chinese female non-smokers is worthwhile.
The genesis and development of lung cancer are influenced by many risk factors, including genetic mutations, environmental factors and their interactions. Previous studies have confirmed that many different genetic factors are involved in the development of lung cancer, including microRNA [3][4][5]. MicroRNAs are single-stranded noncoding RNAs, approximately 20 nucleotides in length, which are considered to regulate the expression of a large number of genes, mainly by binding to the 3' untranslated region of their target mRNA [6]. The mature microRNA molecule is loaded together with several microRNA biosynthesis gene proteins, including at least one member of the AGO family, Dicer, GEMIN3, and GEMIN4, into the miRNA-induced silencing complex (miRISC), which plays a critical role in silencing target mRNA. MicroRNAs have been confirmed to be involved in most human biological processes at the posttranscriptional level; deregulation of microRNA is considered to be involved in human cancer [7][8][9][10]. Abnormalities in microRNA biosynthesis genes can affect the regulatory effect of global microRNAs on target mRNA, thereby influencing disease; such abnormalities may therefore play an important part in human cancer [11][12][13][14][15].
Recently, genetic association studies have identified genetic variants in microRNA biosynthesis genes that may affect susceptibility to cancers such as gastric cancer, renal cell carcinoma, ovarian cancer, breast cancer and prostate cancer [16][17][18][19][20]. However, the relationship between the SNPs in microRNA biosynthesis genes and the risk of lung cancer is still unclear. Herein, we evaluated the association between five SNPs in GEMIN4 and AGO1, the gene-environment interaction and lung cancer risk.

Study Subject
This study was approved by the Institutional Review Board of China Medical University. A total of 868 participants, consisting of 473 lung cancer cases and 395 cancer-free controls, were included in the hospital-based case-control study. All participants were female non-smokers and genetically unrelated members of the Chinese Han population. All participants signed informed consent. Patients were selected from the First Affiliated Hospital of China Medical University and the Liaoning Cancer Hospital. There was no restriction on age, clinical stage or histological type for recruitment. All patients were newly diagnosed with histopathology-confirmed primary lung cancer that was previously untreated. During the same period, age-matched (±5 years) cancer-free controls were recruited from medical examination centers in the same hospital.

Data Collection
A 10 mL venous blood sample was drawn from each subject and then stored at −20 °C for subsequent DNA isolation.
Clinical pathological information was obtained from clinical records. A face-to-face questionnaire interview was conducted among participants to collect demographics and environmental exposure information, including age, sex, smoking status, cooking oil fume exposure status and so on. Subjects who had smoked fewer than 100 cigarettes in their lifetime were defined as non-smokers; all others were defined as smokers. Individuals who had been exposed to the secondhand smoke of one cigarette every day for at least one year were defined as passive smokers. For cooking oil fume exposure, participants were asked, "How often did the air in your kitchen become filled with oily 'smoke' during cooking?" There were four possible responses: "never", "seldom", "sometimes" and "frequently". Exposure to cooking oil fumes was defined as an indicator variable equal to 0 if participants reported seldom or never and equal to 1 if participants reported frequently or sometimes [21,22].
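The exposure definitions above can be illustrated with a minimal sketch. The response labels, the 0/1 coding and the 100-cigarette threshold are taken from the text; the function and variable names are hypothetical and are not part of the original questionnaire processing.

```python
# Hypothetical helper names; only the labels and thresholds come from the text.
EXPOSED_RESPONSES = {"sometimes", "frequently"}
UNEXPOSED_RESPONSES = {"never", "seldom"}


def oil_fume_indicator(response: str) -> int:
    """Return 1 for 'sometimes'/'frequently' and 0 for 'never'/'seldom'."""
    answer = response.strip().lower()
    if answer in EXPOSED_RESPONSES:
        return 1
    if answer in UNEXPOSED_RESPONSES:
        return 0
    raise ValueError(f"unrecognized questionnaire response: {response!r}")


def is_non_smoker(lifetime_cigarettes: int) -> bool:
    """Non-smokers are subjects who smoked fewer than 100 cigarettes in their lifetime."""
    return lifetime_cigarettes < 100
```

Raising an error on an unrecognized response, rather than silently coding it as unexposed, is a design choice of this sketch; how missing answers were handled in the study is not stated.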

Genotyping Analysis
Genomic DNA was extracted from blood samples by the standard phenol-chloroform method. Genotyping followed the method described in our previous study [23].

Statistical Analysis
The Pearson chi-squared test was used to evaluate the Hardy-Weinberg equilibrium (HWE) in controls. The t-test and chi-squared test were used to compare the distributions of continuous variables and categorical variables, respectively, between the two groups. The odds ratios (ORs) and their 95% confidence intervals (CIs) for the relationship between risk factors and lung cancer risk were estimated by logistic regression. The linkage disequilibrium (LD) and haplotype analyses were performed with the SHEsis online web server [24]. The analysis of cumulative effects of unfavorable genotypes included those genotypes showing significant association with increased lung cancer risk in the main analysis. Crossover analysis was performed to assess gene-environment interaction. The evaluation of the additive interactions was based on the method described in Tomas Andersson's study [25]. Multiplicative interactions were assessed by a logistic regression model.
All statistical tests were two-sided, and nominal p < 0.05 was defined as statistically significant. The Bonferroni correction was used to adjust p values for multiple statistical tests. The SPSS 22.0 software (IBM, New York, NY, USA) was used for statistical analyses in the present study.
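As an illustration of the HWE check described above, the following is a minimal sketch of a Pearson chi-squared test with one degree of freedom for a biallelic SNP. It is not the authors' SPSS procedure; the function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Pearson chi-squared test for Hardy-Weinberg equilibrium (1 degree of freedom).

    n_aa, n_ab, n_bb are observed genotype counts (e.g. among controls).
    Returns (chi-squared statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts that are exactly in equilibrium (allele frequency 0.5).
chi2, p_value = hwe_chi_square(100, 200, 100)
```

A p-value above the chosen threshold gives no evidence of departure from HWE in the group tested.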
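The Bonferroni adjustment mentioned above can be sketched as follows. The number of tests used for the correction is not stated in the text, so this sketch assumes it equals the number of p values supplied; the p values in the example are hypothetical, not results from this study.

```python
def bonferroni(p_values: list[float], alpha: float = 0.05) -> list[tuple[float, bool]]:
    """Multiply each p value by the number of tests (capped at 1.0).

    Returns (adjusted p value, significant at alpha) for each input.
    Assumption: the number of tests is len(p_values).
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, adjusted < alpha))
    return results


# Hypothetical nominal p values for three tests.
adjusted = bonferroni([0.004, 0.03, 0.2])
```

Equivalently, each nominal p value can be compared against alpha divided by the number of tests.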

Population Characteristics
The basic characteristics of participants are summarized in Table 1. The participants comprised 473 lung cancer cases and 395 controls. All included individuals were Chinese female non-smokers, and no significant difference in age was found between the two groups (p = 0.87).
First, we assessed the association between the five SNPs and lung cancer risk. The data are listed in Table 3. The results indicate that the distributions of rs7813, rs2291778, rs910924 and rs595961 genotypes differed significantly between the two groups (p < 0.05); after Bonferroni correction, rs7813 and rs595961 were still associated with lung cancer risk.
In squamous cell carcinoma, no significant differences were found between the distributions of genotypes in the two groups. In small cell lung cancer (SCLC), the distribution of rs2291778 genotypes showed a notable difference; however, due to the relatively small sample size, the results need to be further verified in a larger sample population (Table S1).

The Linkage Disequilibrium (LD) and Haplotype Analyses of the SNPs in GEMIN4 and Lung Cancer Risk
We analyzed the association between different haplotypes and lung cancer risk. The LD plots are shown in Figure 1. Table 5 lists the frequencies of the haplotypes constructed with four SNPs: rs7813, rs2740349, rs2291778 and rs910924 in the GEMIN4 gene. Five common haplotypes were observed. Compared with the combination of all other haplotypes, C-A-G-T showed a protective effect in lung cancer and lung adenocarcinoma (OR = 0.688, 95% CI = 0.523-0.905, p = 0.007; OR = 0.583, 95% CI = 0.424-0.801, p = 0.001, respectively).
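As a reading aid for the ORs and confidence intervals reported above, the following sketch recovers an approximate two-sided p value from an OR and its 95% CI. It assumes a Wald-type interval that is symmetric on the log-odds scale, which is an assumption of this sketch rather than something stated in the text; the function name is hypothetical.

```python
import math


def wald_p_from_ci(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided p value from an OR and its 95% CI.

    Assumption: the CI is exp(log(OR) +/- 1.96 * SE), i.e. a Wald interval.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Reported values for haplotype C-A-G-T in lung cancer and lung adenocarcinoma.
p_lung = wald_p_from_ci(0.688, 0.523, 0.905)
p_adeno = wald_p_from_ci(0.583, 0.424, 0.801)
```

Under this assumption, both values come out close to the reported p = 0.007 and p = 0.001, so the reported intervals and p values are mutually consistent.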

Cumulative Effects of the Unfavorable Genotypes in Lung Adenocarcinoma
Because of the strong association between the SNPs and the risk of lung adenocarcinoma, we further assessed the combined effects of the high-risk genotypes on lung adenocarcinoma risk (Table 6). The unfavorable genotypes were defined as follows: rs7813 (TT), rs2740349 (AA), rs2291778 (GT + TT), rs910924 (CC), rs595961 (AG + AA). With the increasing number of unfavorable genotypes, a progressively increased gene-dose effect was found. Subjects carrying zero or one unfavorable genotype (the low-risk group) were used as the reference; subjects carrying two or three and four or five unfavorable genotypes showed an increased risk of lung adenocarcinoma (adjusted OR = 1.798, 95% CI = 1.175-2.751, p = 0.007; adjusted OR = 3.206, 95% CI = 2.063-4.983, p < 0.001, respectively).
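The grouping by number of unfavorable genotypes can be sketched as follows. The genotype definitions and the three groups (0-1, 2-3, 4-5) are taken from the text; the data layout, function names and example subject are hypothetical.

```python
# Unfavorable genotypes as defined in the text.
UNFAVORABLE = {
    "rs7813": {"TT"},
    "rs2740349": {"AA"},
    "rs2291778": {"GT", "TT"},
    "rs910924": {"CC"},
    "rs595961": {"AG", "AA"},
}


def normalize(genotype: str) -> str:
    """Sort the two alleles so that 'TG' and 'GT' are treated identically."""
    return "".join(sorted(genotype.strip().upper()))


def count_unfavorable(genotypes: dict[str, str]) -> int:
    """Count how many of the five SNPs carry an unfavorable genotype."""
    return sum(
        1 for snp, risk in UNFAVORABLE.items() if normalize(genotypes[snp]) in risk
    )


def risk_group(count: int) -> str:
    """Map the count to the three groups used in the cumulative analysis."""
    if count <= 1:
        return "0-1"
    if count <= 3:
        return "2-3"
    return "4-5"


# Hypothetical subject: unfavorable at rs7813 (TT) and rs2291778 (TG).
subject = {
    "rs7813": "TT",
    "rs2740349": "AG",
    "rs2291778": "TG",
    "rs910924": "CT",
    "rs595961": "GG",
}
group = risk_group(count_unfavorable(subject))
```

The resulting group label would then enter the logistic regression as a categorical variable with "0-1" as the reference level.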

SNPs in GEMIN4 and AGO1 and Environmental Risk Factors (Cooking Oil Fume Exposure and Passive Smoking Exposure) as Well as Their Interaction on the Risk of Lung Cancer
Of the participants in this study, 224 cases and 244 controls had environmental exposure information. Individuals exposed to cooking oil fumes had a higher risk of lung cancer (OR = 2.132, 95% CI = 1.416-3.212, p < 0.001). Table 7 shows the interaction between environmental risk factors and the five SNPs on lung cancer risk. Compared with the reference group (rs595961-GG genotype carriers without exposure to environmental risk factors), AG + AA genotype carriers exposed to cooking oil fumes or passive smoking had a significantly increased risk of lung cancer after Bonferroni correction (adjusted OR = 6.314, 95% CI = 2.752-14.485, p < 0.001; adjusted OR = 3.139, 95% CI = 1.678-5.871, p < 0.001, respectively).
The crossover analysis suggested a possible gene-environment interaction, so further analyses on the additive scale (Table S2) and the multiplicative scale were performed. The results suggest that there is no significant interaction on the additive scale. Logistic models were used to evaluate the interaction on a multiplicative scale; the results did not show any statistical significance.
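For orientation, the standard point estimates used to describe interaction in a crossover (2 x 2) analysis can be sketched as follows: the relative excess risk due to interaction (RERI), the attributable proportion (AP) and the synergy index (S) on the additive scale, and the ratio of ORs on the multiplicative scale. This is a minimal sketch of the usual formulas, not the authors' exact computation; confidence intervals are omitted, and the function names and example ORs are hypothetical.

```python
def additive_interaction(or_gene: float, or_env: float, or_both: float) -> dict[str, float]:
    """Additive-scale interaction measures from three ORs.

    All ORs are relative to the doubly unexposed reference group:
    or_gene = genotype only, or_env = exposure only, or_both = both.
    """
    reri = or_both - or_gene - or_env + 1.0
    ap = reri / or_both
    denominator = (or_gene - 1.0) + (or_env - 1.0)
    synergy = (or_both - 1.0) / denominator if denominator != 0 else float("nan")
    return {"RERI": reri, "AP": ap, "S": synergy}


def multiplicative_ratio(or_gene: float, or_env: float, or_both: float) -> float:
    """Ratio of the joint OR to the product of the single-factor ORs."""
    return or_both / (or_gene * or_env)


# Hypothetical ORs chosen only to illustrate the arithmetic.
measures = additive_interaction(or_gene=2.0, or_env=1.5, or_both=2.5)
ratio = multiplicative_ratio(or_gene=2.0, or_env=1.5, or_both=3.0)
```

No additive interaction corresponds to RERI = 0, AP = 0 and S = 1; no multiplicative interaction corresponds to a ratio of 1.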

Discussion
The relationship between SNPs of microRNA biosynthesis genes and lung cancer risk has not been widely studied. To our knowledge, this is the first study to examine these five SNPs of microRNA biosynthesis genes together with cooking oil fume and passive smoking exposure in relation to the risk of lung cancer. To control the influence of cigarette smoking on lung cancer, we selected female non-smokers as our study participants. It is noteworthy that the SNP distributions, the haplotype analysis and the cumulative effects of the unfavorable genotypes all showed notable results in lung cancer. These results suggest that further functional studies are needed to explore the underlying mechanisms by which the five SNPs affect lung cancer.
MicroRNA and some essential proteins, including GEMIN4 and AGO1, form miRISC, through which the translation and stability of target mRNA are negatively regulated. miRISC plays a role similar to that of oncogenes or tumor-suppressor genes in multiple tumor types by inhibiting the expression of target genes [26][27][28]. AGO family proteins contain three evolutionarily conserved domains: PAZ, MID and PIWI. The seed sequence of microRNA is directly or indirectly anchored to the MID and PIWI domains in a deep pocket. Subsequently, GW182 family proteins act directly downstream of AGO proteins to affect miRNA-mediated repression. In miRISC, the AGO proteins serve as scaffolds to recruit GW182 to mRNA [29,30]. The GEMIN4 gene has been mapped to chromosome 17p13 and encodes a protein of 1058 amino acids. The role of GEMIN4 protein in miRISC is not very clear. Aberrant microRNA biosynthesis genes have been implicated in the genesis, development and survival of several types of cancer, indicating that microRNA biosynthesis genes may play a more general role in modifying the development of cancer [15,16,18-20,31,32]. While the underlying mechanisms by which microRNA biosynthesis genes influence the risk of lung cancer remain unclear, our findings provide strong evidence of an association between SNPs in microRNA biosynthesis genes and lung cancer risk.
The nonsynonymous SNP rs7813 of the GEMIN4 gene could induce an Arg to Cys substitution at amino acid position 1033 through a C to T transition. Interestingly, Liang et al. found that in the non-Hispanic Caucasian population, rs7813 and rs2740349 were at the top of 226 microRNA biosynthesis gene SNPs associated with ovarian cancer risk [18]. Our study found that the T allele of rs7813 is unfavorable with respect to lung cancer risk. Our finding is consistent with the other two studies on rs7813 and cancer risk [16,18]. However, the earliest study of rs7813 and cancer risk, by Yang et al., evaluated the relationship between rs7813 and bladder cancer risk in the Caucasian population and found no significant association [33]. rs910924 is located in the GEMIN4 gene promoter region, and we found that its CC genotype is an unfavorable genotype. Two previous studies of the relationship between rs910924 and cancer risk did not reach statistical significance [16,33]. In 2010, a study examined the association between 24 SNPs in 11 microRNA biosynthesis genes and lung cancer risk; the distribution of nine SNPs in the GEMIN4 and AGO1 genes did not show any statistical difference between 100 cases and 100 controls [34].
According to previous studies, haplotypes are more informative than a single SNP for changes in gene function [35,36]. In our study, five common haplotypes were detected; after Bonferroni correction, one of them was still found to be associated with lung cancer risk. The analysis of the cumulative effect of unfavorable genotypes on lung adenocarcinoma risk also showed a notable result. It is remarkable that the SNPs and haplotypes were more strongly correlated with lung adenocarcinoma risk than with other types of lung cancer, suggesting that the function of SNPs of the GEMIN4 gene may have cell specificity. This finding may indicate that these SNPs could serve as genetic markers for distinguishing different types of lung cancer. However, the sample sizes for lung adenocarcinoma and SCLC were small, and the results need to be further verified in a larger sample population.
Lung cancer is a malignant tumor that is affected by many factors, including genetic and environmental factors and their interactions. In this study, our results indicated a higher risk of lung cancer in the cooking oil fume exposure group, but no gene-environment interaction was found. The results are consistent with our previous studies [37,38]. Relevant studies have found that DNA damage can be induced by cooking oil fume exposure and can influence the carcinogenesis and development of lung cancer [39,40]. Chinese cooking involves more high-temperature cooking and frying processes, so more cooking oil fumes are produced. Cooking oil fumes contain large amounts of carcinogens, which are likely to play a part in the carcinogenesis and development of lung cancer. Further studies on the relationship between cooking oil fumes and lung cancer, and on the mechanisms behind it, should be carried out. This is the first study to show a significant association between polymorphisms in microRNA biosynthesis genes and lung cancer risk. There are some limitations to our study, however. First, the relatively small sample size may not have provided enough statistical power. Second, since this study was a hospital-based study, selection bias may exist. Third, other SNPs in microRNA biosynthesis genes may be involved in lung cancer risk. In addition, there are some other environmental risk factors involved in lung cancer that may not have been considered in the present study.

Conclusions
On the whole, the present study is the first to report a significant association between rs7813 and rs595961 and lung cancer risk. We also found that individuals exposed to cooking oil fumes showed a relatively high risk of lung cancer, although no interactions were found between environmental risk factor exposure and these SNPs.

Table 1. Characteristics of lung cancer cases and cancer-free controls.

Table 2. Single nucleotide polymorphisms in microRNA biogenesis genes.

Table 3. Distribution of genotypes and ORs for lung cancer cases and cancer-free controls.

Table 4. Distribution of genotypes and ORs for adenocarcinoma cases and cancer-free controls.

Table 6. Cumulative effect of unfavorable genotypes and lung adenocarcinoma risk.

Table 7. Interaction of five SNPs and environmental risk factors on lung cancer risk.