Genetic Polymorphisms in Estrogen-Related Genes and the Risk of Breast Cancer among Han Chinese Women

Exposure to high levels of estrogen is considered an important risk factor for susceptibility to breast cancer. Common polymorphisms in genes that affect estrogen levels may be associated with breast cancer risk, but no comprehensive study has been performed among Han Chinese women. In the present study, 32 single-nucleotide polymorphisms (SNPs) in estrogen-related genes were genotyped using the MassARRAY iPLEX platform in 1076 Han Chinese women. Genotypic and allelic frequencies were compared between case and control groups. Unconditional logistic regression was used to assess the effects of SNPs on breast cancer risk. Associations were also evaluated for breast cancer subtypes stratified by estrogen receptor (ER) and progesterone receptor (PR) status. Case-control analysis showed a significant association between the heterozygous genotypes of rs700519 and rs2069522 and breast cancer risk (OR = 0.723, 95% CI = 0.541–0.965, p = 0.028 and OR = 1.500, 95% CI = 1.078–2.087, p = 0.016, respectively). Subgroup comparisons revealed that rs2446405 and rs17268974 were related to ER status, and rs130021 was associated with PR status. Our findings suggest that rs700519 and rs2069522 are associated with susceptibility to breast cancer among the Han Chinese population and have a cumulative effect with three other previously identified SNPs. Further genetic and functional studies are needed to identify additional SNPs and to elucidate the underlying molecular mechanisms.

affect the regulation, transcription, and activity of the enzymes, as well as subsequent exposure to endogenous estrogen and the development of breast cancer [14]. Although the effect of an individual SNP is usually small, multi-SNP association analysis can reveal cumulative risk effects that are significantly related to breast cancer [22]. Apart from some reviews that summarized and jointly analyzed data from different reports and ethnic groups, most studies to date have focused on Western women [14,15]. Chinese populations have only been analyzed in studies with small sample sizes that included few genes in the estrogen pathways. Such studies are too limited to comprehensively identify genes or SNPs associated with breast cancer in Han Chinese women.
In a previous study, we genotyped six SNPs in the estrogen biosynthesis gene CYP11A1, and three variants (rs2279357, rs2959003, and rs2959008) were identified as associated with susceptibility to breast cancer [23]. In the present study, we explored the role of genetic polymorphisms in additional estrogen-related genes in breast cancer risk to systematically identify additional risk polymorphisms. We selected 32 SNPs, including 14 located in estrogen biosynthesis genes (CYP17A1, HSD17B1, CYP19A1, and STS), seven in metabolism genes (CYP1A1, CYP1A2, CYP1B1, COMT, and GSTP1), six in estrogen receptors (ERα and ERβ), and five in estrogen-signaling regulatory enzymes (CARM1, CREBBP, NQO1, SRD5A2, and SULT1E1). The study was performed on Guangdong Han Chinese women.
All SNPs were subjected to quality control, which required a Hardy-Weinberg equilibrium (HWE) p > 0.05 in healthy controls, a genotyping rate ≥ 85%, and a minor allele frequency (MAF) > 0.01. Five SNPs (rs2830, rs4680, rs707762, rs17268974, and rs1695) that deviated from HWE were omitted from further case-control analysis.
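The quality-control criteria above can be illustrated with a minimal sketch. It assumes a biallelic SNP and uses only the Pearson χ² version of the HWE test with one degree of freedom; the function names and the example counts are hypothetical and are not taken from the study data.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p) of a 1-df Pearson test for HWE from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def passes_qc(control_counts, call_rate, maf,
              hwe_alpha=0.05, min_call_rate=0.85, min_maf=0.01):
    """Apply the three thresholds stated in the text to one SNP.

    control_counts: (n_aa, n_ab, n_bb) among healthy controls (hypothetical input).
    call_rate, maf: precomputed genotyping rate and minor allele frequency.
    """
    _, p_hwe = hwe_chi_square(*control_counts)
    return p_hwe > hwe_alpha and call_rate >= min_call_rate and maf > min_maf


# Illustrative counts only: a SNP in perfect equilibrium passes,
# one with no heterozygotes at all fails the HWE criterion.
print(passes_qc((25, 50, 25), call_rate=0.95, maf=0.5))
print(passes_qc((50, 0, 50), call_rate=0.95, maf=0.5))
```

For SNPs with small expected genotype counts, an exact test would be preferable to this χ² approximation.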

Polymorphisms in Estrogen Biosynthesis Genes and Breast Cancer
We compared the genotypic and allelic frequencies of 11 SNPs in estrogen biosynthesis genes between cases and controls and observed that rs700519 in CYP19A1 was significantly associated with breast cancer risk. The frequency of the heterozygous genotype C/T was lower in the breast cancer group than in the control group (20.9% vs. 26.2%). Women who carried the C/T genotype or the T allele appeared to be protected against breast cancer (OR = 0.723, 95% CI = 0.541-0.965, p = 0.028 and OR = 0.772, 95% CI = 0.600-0.993, p = 0.044, respectively). The results are shown in Table 2. No significant association was identified for the other SNPs.
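As an illustration of how an odds ratio and its 95% CI relate to genotype counts, the following minimal sketch computes an unadjusted OR with a Wald (Woolf) interval from a 2 × 2 table. This is a simplification and not the study's method: the reported estimates come from unconditional logistic regression. The function name and the counts in the usage line are hypothetical.

```python
import math


def odds_ratio_wald(case_exposed, case_ref, control_exposed, control_ref,
                    z=1.959964):
    """Unadjusted odds ratio with a Wald 95% CI and two-sided p-value.

    'exposed' is the genotype or allele of interest (e.g., heterozygotes),
    'ref' is the reference genotype or allele. All counts must be > 0.
    """
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    # Standard error of log(OR) by Woolf's formula.
    se = math.sqrt(1 / case_exposed + 1 / case_ref
                   + 1 / control_exposed + 1 / control_ref)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2.0))
    return odds_ratio, lower, upper, p_value


# Hypothetical counts, not the study data.
or_, lo, hi, p = odds_ratio_wald(10, 20, 5, 20)
print(f"OR = {or_:.3f}, 95% CI = {lo:.3f}-{hi:.3f}, p = {p:.3f}")
```

An OR below 1 with an upper confidence limit below 1, as reported for the C/T genotype of rs700519, indicates a lower odds of disease among carriers than in the reference group.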

Polymorphisms in Estrogen Metabolism Genes and Breast Cancer
Among the five SNPs in estrogen metabolism genes, the genotype distribution of CYP1A2 rs2069522 was significantly different between cases and controls. In contrast to rs700519, the heterozygote T/C of rs2069522 increased the risk of breast cancer (OR = 1.500, 95% CI = 1.078-2.087, p = 0.016, Table 3). The other four SNPs showed no evidence of an association with breast cancer.

Polymorphisms in Estrogen Receptors (ERs) and Regulatory Genes
No significant differences were found in genotypic or allelic distributions between the case and control groups, indicating that the 11 SNPs in the ERs and regulatory genes were not independently associated with breast cancer in our study population. Detailed information is shown in Tables 4 and 5.

Cumulative Risk Analysis
CYP11A1 encodes the key enzyme that catalyzes the initial and rate-limiting steps of steroid hormone synthesis. In our previous study, three SNPs in CYP11A1, rs2279357, rs2959003, and rs2959008, were found to be associated with breast cancer risk (OR = 1.558, 95% CI = 1.092-2.222, p = 0.014; OR = 0.668, 95% CI = 0.460-0.971, p = 0.034; and OR = 0.669, 95% CI = 0.452-0.991, p = 0.045, respectively) [23]. To investigate whether these risk polymorphisms had a cumulative effect, we performed an integrated analysis of these three SNPs together with rs700519 and rs2069522 (Table 6). We defined the genotypes C/C of rs700519, T/C of rs2069522, and T/T of rs2279357, rs2959003, and rs2959008 as risk genotypes for the statistical analysis. Consistent with our hypothesis, women who carried more risk genotypes had a higher susceptibility to breast cancer (p for trend = 0.030). Women carrying all five risk genotypes had a statistically significant 2.286-fold risk of developing breast cancer (OR = 2.286, 95% CI = 1.187-4.399, p = 0.013).
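The risk-genotype definition used in the cumulative analysis can be expressed as a simple per-subject count. The sketch below is illustrative only: the dictionary mirrors the risk genotypes named in the text, whereas the function names, the "A/B" string format for genotypes, and the example subject are assumptions.

```python
# Risk genotypes as defined in the text for the five SNPs.
RISK_GENOTYPES = {
    "rs700519": "C/C",
    "rs2069522": "T/C",
    "rs2279357": "T/T",
    "rs2959003": "T/T",
    "rs2959008": "T/T",
}


def _normalize(genotype):
    """Make allele order irrelevant, so that 'C/T' and 'T/C' compare equal."""
    return tuple(sorted(genotype.upper().split("/")))


def count_risk_genotypes(genotypes):
    """Count how many of the five risk genotypes one subject carries.

    genotypes: dict mapping SNP id to a genotype string such as 'C/T'
    (assumed format). SNPs missing from the dict are not counted.
    """
    return sum(
        1
        for snp, risk in RISK_GENOTYPES.items()
        if snp in genotypes and _normalize(genotypes[snp]) == _normalize(risk)
    )


# Hypothetical subject carrying three of the five risk genotypes.
subject = {
    "rs700519": "C/C",
    "rs2069522": "C/T",
    "rs2279357": "T/T",
    "rs2959003": "C/T",
    "rs2959008": "C/C",
}
print(count_risk_genotypes(subject))
```

Subjects can then be grouped by this count, and the groups compared against a reference category or tested for trend.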
The admixture maximum likelihood (AML) method was employed to evaluate the cumulative effect from multiple variants. As expected, four SNPs (rs2279357, rs2959008, rs2959003, and rs2069522) were significant at the 5% level (p = 0.02, 0.02, 0.03, and 0.04) after adjusting for age based on the trend test for association. The other SNP (rs700519) failed to reach this threshold (p = 0.08). The global test yielded a marginally significant association for the whole estrogen-related pathway (pglobal = 0.047). Dividing into four functional sub-pathways for the global test revealed strong associations with the estrogen biosynthesis (pglobal = 0.026) and metabolism (pglobal = 0.014) sub-pathways. However, the other two sub-pathways showed no association with breast cancer.

Discussion
In the present retrospective study, two SNPs, rs700519 and rs2069522, were identified as being associated with breast cancer risk. The non-synonymous coding SNP rs700519 (Arg264Cys) is located in exon 7 of CYP19A1, which encodes aromatase, the key enzyme in estrogen biosynthesis. Aromatase catalyzes the rate-limiting step in the conversion of testosterone and androstenedione to estradiol and estrone in a wide variety of tissues, including the ovary, placenta, brain, and adipose tissue [24]. The up-regulation of CYP19A1 expression has been observed to contribute to breast cancer [25]. Individual genetic polymorphisms in this gene have also previously been reported to be associated with enzymatic activity (C1123T and rs28757184), with certain pathologic phenotypes including breast cancer (Trp39Arg), and with poor efficacy of cancer treatment (rs4646) [26][27][28][29][30]. Studies in Chinese women have identified that the common missense polymorphism rs700519 alone or combined with other polymorphisms could significantly modulate the risk of endometriosis [31], polycystic ovary syndrome [32], and survival of breast cancer [33]. In the present population, this SNP was significantly associated with susceptibility to breast cancer, consistent with a result based on North Indian women [18]. However, some studies showed no significant association [34][35][36]. Another key enzyme of the pathway is CYP1A2, which also plays a crucial role in estrogen metabolism. Genetic variants of CYP1A2 have been shown to contribute to the risk of lung and breast cancer by interacting with environmental factors and drug metabolism through regulation of enzyme activity [37][38][39]. SNP rs2069522 is located at −2847 bp of CYP1A2, within a putative region that bidirectionally regulates the transcriptional activation of both CYP1A1 and CYP1A2 genes [40]. Despite the important function of rs2069522, its association with breast cancer risk has not been clearly demonstrated. 
Genetic studies have only shown that it had the potential to affect colorectal cancer risk and the treatment response to clozapine in schizophrenic patients among British Caucasians and Koreans, respectively [41,42]. However, those results were not significant after correction for multiple testing. Our study revealed for the first time that heterozygosity of rs2069522 significantly increased breast cancer risk. It is possible that this variant may influence the transcription and thus activity of CYP1A1 protein. It is particularly worth noting that the association of rs700519 and rs2069522 was only observed for heterozygotes, but not for homozygotes. One possibility is that the number of rare homozygotes for these two SNPs is too small to show significant evidence, which is consistent with the HapMap database and other reports [41,43]. Furthermore, the biological function of these two SNPs has not been clarified fully. Thus, further analyses based on a larger set of SNPs should be conducted to perform fine mapping of the locus, and to explore their biological functions and mechanisms in more detail.
Genetic polymorphisms in estrogen-related genes have been intensively investigated in many studies. However, these studies produced inconsistent results [15][16][17][18][21,31,44,45]. Several polymorphisms that were previously reported to contribute to breast cancer risk showed no significant association in our case-control analysis, including rs743572, rs1056836, rs1695, and rs2234693 [18,21,44,45]. The inconsistency might mainly be due to racial differences in susceptibility to breast cancer [46], because these studies were performed on different populations and with different sample sizes. For instance, variants in CYP1B1 (rs1056836), COMT (rs4680), and ERα (rs9340799) were significantly related to breast cancer susceptibility among Caucasian women [20,21], but we failed to find the same associations in our study population. Thus, it is necessary to replicate these risk loci to validate specific variants for Han Chinese women.
Polymorphisms in genes could affect the levels of transcription, translation, and activity of enzymes and result in wide inter-individual variability of responses to carcinogenic substances. Although the effect of a single polymorphism on tumorigenesis is weak, multiple loci could combine to generate a cumulative risk that cooperates with environmental exposure [47]. We have previously investigated the polymorphisms in CYP11A1 and found that rs2279357, rs2959003, and rs2959008 were related to breast cancer risk [23]. Combined with two associated variants in the present study, the cumulative analysis of these five verified variants revealed a much higher risk than a single SNP, a finding that agrees remarkably well with our expectation. The pathway-based AML global test also showed a significant association for the whole estrogen-related pathway that was much stronger for estrogen biosynthesis and metabolism sub-pathways. These findings support the multigenic nature of the etiology of breast cancer and need for gene-gene or locus-locus interactive studies on low-penetrance genes to comprehensively explore the underlying mechanisms of breast cancer.
It is necessary for all hormones to combine with their corresponding receptors to exert their biological functions. The expression of receptors, such as ER, PR, and human epidermal growth factor receptor-2 (HER2), is included in routine clinical classification and evaluation of breast cancer [48]. Their expression may also predict the response to targeted therapies such as tamoxifen and trastuzumab. Therefore, we also carried out stratified analyses based on ER and PR subtypes in the case group. Three SNPs (rs2446405, rs17268974, and rs130021) were confirmed to be associated with ER or PR status. Genetic changes in the receptors may influence their activity and therefore the risk and subtype of breast cancer.
Our study had some limitations. First, menopausal status can modify a woman's risk of developing breast cancer, and the incidence rate differs between premenopausal and postmenopausal women. In the current study, we did not evaluate the associations for postmenopausal and premenopausal women separately because information on menopausal status was not obtained from most of the participants. Second, the statistical power for all of the 1076 subjects was larger than 50% to detect a log-additive genotype relative risk of 1.30 (Table S1). The study was therefore considered adequately powered to detect strong or moderate effect sizes of disease-predisposing variants, but smaller effects might have been missed. In fact, the exact powers for the two significant SNPs, rs700519 and rs2069522, were 68.12% and 83.69%, respectively. Furthermore, some variants in estrogen-related genes, such as CYP1A1 T3801C, CYP1A1 A2455G, and COMT Val158Met, were identified as susceptibility loci in breast cancer development according to recent meta-analysis reports [49][50][51]. However, these SNPs were not included in the present study because they failed in the assay design or deviated from HWE. It is worth noting that the significance of the results could be due to certain biases, which should be considered. Our study is still ongoing, and an additional large-scale assay with a more comprehensive design and more candidate SNPs may help to resolve these issues. Such work is expected to improve the risk models for breast cancer among Han Chinese women.

Ethics Statement
The present study was reviewed and approved by the regional ethical committee of Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong Province, China (IRB No: 2009-SCBCGS-GZ-01) on 12 January 2009. Written informed consent was obtained from all of the participants in the study.

Study Subjects
A total of 1076 Han Chinese women were included in the current study; 530 were cases with breast cancer and 546 were controls. All of the patients were recruited from Nanfang Hospital after a pathological diagnosis was ascertained. Clinical information was obtained from the patients' medical records, including demographic characteristics, age at diagnosis, menopausal status, ER status, PR status, HER2 status, clinical stage, degree of differentiation, size of tumor, and lymph node involvement. The healthy controls were derived randomly and simultaneously from the same hospital, and were matched by ethnicity and geographic location. All of the controls were checked by a physician for the absence of cancer, and family history of their first-degree relatives was self-ascertained.

DNA Extraction
Peripheral blood samples were collected from the participants and stored at −70 °C until DNA extraction. Genomic DNA was extracted from EDTA-containing blood using the EZNA blood DNA kit (Omega Biotek, Norcross, GA, USA) according to the manufacturer's instructions. The DNA was stored at −70 °C.

SNP Selection and Genotyping
The rationale for SNP selection has been described in detail elsewhere [23,52]. Briefly, we selected candidate SNPs in estrogen-related genes from genome-wide association studies (GWAS), meta-analyses in multiple populations, or other large-sample association studies on breast cancer [41,44,53]. Furthermore, SNPs that were significantly associated with other hormone-induced cancers such as ovarian cancer and endometrial cancer were also selected. To achieve a power of at least 50% at an odds ratio of 1.3, only SNPs with MAF ≥ 10% in the Chinese population according to the HapMap database were included. SNPs that were monomorphic in the Chinese population were discarded. Finally, the genotyping assay was designed using Assay Design 4.0 software (Sequenom, San Diego, CA, USA), and after comprehensive evaluation, a total of 32 SNPs in 16 genes were included in this study. Details on the locations and characteristics of these SNPs are listed in Table S2.
For each SNP, a pair of amplification primers and an extension primer were designed using Assay Design 4.0 software. Genotyping was performed using the high-throughput MassARRAY iPLEX platform (Sequenom) through multiplex reactions. The genotyping rate of DNA samples was set to ≥85%.

Statistical Analysis
The statistical power of the case-control study was calculated using QUANTO software version 1.2.4 (University of Southern California, Los Angeles, CA, USA; http://hydra.usc.edu/gxe). We conducted a case-control study for all of the subjects, and then the patients were stratified by ER and PR status. Association analysis based on unconditional logistic regression was carried out by estimating OR and 95% CI with multiple inheritance models (codominant, dominant, recessive, overdominant, and additive) for each SNP. HWE was assessed using Fisher's exact test and the χ² test. The statistical tests were implemented using SPSS 13.0 software combined with the Web-based tool SNPstats (http://bioinfo.iconcologia.net/SNPstats). We also performed the AML-based global test of association for the whole estrogen-related pathway as well as for four sub-pathways after adjusting for age using software provided by the authors of Tyrer et al. [54]. All of the statistical analyses were two-tailed, and the significance level was set at a p value of 0.05.
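The five inheritance models differ only in how a genotype is coded as a covariate for logistic regression. The following minimal sketch shows one conventional coding, assuming a biallelic SNP whose genotype is expressed as the number of minor alleles (0, 1, or 2). The function name is hypothetical, and the sketch is not a reproduction of the SNPstats or SPSS implementation.

```python
def encode_genotype(minor_allele_count, model):
    """Return the regression covariate(s) for one genotype under a given model.

    minor_allele_count: 0 (major homozygote), 1 (heterozygote),
                        or 2 (minor homozygote).
    model: 'codominant', 'dominant', 'recessive', 'overdominant', or 'additive'.
    """
    g = minor_allele_count
    if g not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "codominant":
        # Two indicators: heterozygote and minor homozygote vs. major homozygote.
        return (int(g == 1), int(g == 2))
    if model == "dominant":
        # Carriers of at least one minor allele vs. major homozygotes.
        return (int(g >= 1),)
    if model == "recessive":
        # Minor homozygotes vs. all others.
        return (int(g == 2),)
    if model == "overdominant":
        # Heterozygotes vs. both homozygote groups combined.
        return (int(g == 1),)
    if model == "additive":
        # Log-additive: each minor allele shifts the log-odds by the same amount.
        return (g,)
    raise ValueError(f"unknown model: {model}")


# Coding of a heterozygote under each model.
for m in ("codominant", "dominant", "recessive", "overdominant", "additive"):
    print(m, encode_genotype(1, m))
```

Under this coding, an association that appears only in heterozygotes would be captured by the heterozygote indicator of the codominant model and by the overdominant model, but not by the recessive model.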

Conclusions
In conclusion, we have presented an integrated analysis of the relationship between common variants in estrogen-related genes and breast cancer risk. Our study provides molecular genetic evidence suggesting that the independent and combined effects of these polymorphisms potentially reflect exposure to estrogen and as a consequence, an increase in breast cancer susceptibility. Our data suggest that this association could be specific to ER status. These findings may contribute to obtaining a more conclusive understanding about the underlying etiology of breast cancer and provide new strategies for breast cancer prevention, early detection, and specific treatment.