A Common Variant in the SETD7 Gene Predicts Serum Lycopene Concentrations

Dietary intake and higher serum concentrations of lycopene have been associated with lower incidence of prostate cancer and other chronic diseases. Identifying determinants of serum lycopene concentrations may thus have important public health implications. Prior studies have suggested that serum lycopene concentrations are under partial genetic control. The goal of this research was to identify genetic predictors of serum lycopene concentrations using the genome-wide association study (GWAS) approach in a sample of 441 Old Order Amish adults who consumed a controlled diet. Linear regression models were used to evaluate associations between genetic variants and serum concentrations of lycopene. Variant rs7680948 on chromosome 4, located in an intronic region of the SETD7 gene, was significantly associated with serum lycopene concentrations (p = 3.41 × 10⁻⁹). Our findings also provided nominal support for the association previously noted between SCARB1 and serum lycopene concentrations, although with a different SNP (rs11057841) in the region. This study identified a novel locus associated with serum lycopene concentrations, and our results raise a number of intriguing possibilities regarding the nature of the relationship between SETD7 and lycopene, both of which have been independently associated with prostate cancer. Further investigation into this relationship might help provide greater mechanistic understanding of these associations.


Introduction
The carotenoids are a group of over 600 pigments that are synthesized by plants and microorganisms. Lycopene is a red-pigmented carotenoid present in tomatoes, watermelon, papaya, and other fruits and vegetables. Animals do not produce lycopene or other carotenoids endogenously and obtain them only from the diet. Higher intake and circulating concentrations of carotenoids have been associated with lower risk of cancer [1], cardiovascular disease [2,3], metabolic syndrome [4], and diseases of the eye [5]. The protective effects of dietary carotenoids appear to be due in part to their antioxidant activity. Lycopene has among the most potent antioxidant effects of ...
... the samples were available for serum lycopene measurement. The HAPI Heart Study is a family study and by design included individuals from the same nuclear family. Moreover, many of the enrolled families were related to each other given the social structure of the Old Order Amish community. The design of the HAPI Heart Study, including study inclusion/exclusion criteria and demographics of the full study sample, has been described previously [33], and the study was registered on Clinicaltrials.gov (NCT00664040). Of the 868 Old Order Amish adults recruited into the HAPI Heart Study, 469 were administered the controlled dietary intervention and subsequently provided a fasting blood sample at the conclusion of the 6-day diet. There were 27 samples of insufficient quality to measure serum lycopene concentrations, and one sample was excluded from analysis because its extreme serum lycopene concentration (22,995 µg/dL) was considered an outlier, leaving 441 participants for analysis. There were 308 nuclear families in our study sample, ranging from one to nine participants per family. Relationships among participants included 113 parent-child, 198 sibling, 105 avuncular, 33 first-cousin, and six grandparent-grandchild pairs.
The study was approved by the Institutional Review Board of the University of Maryland School of Medicine and all participants provided written informed consent.

Controlled Diet
A controlled diet was prepared for participants by study staff. A registered dietitian visited several Old Order Amish households to obtain diet histories and observe the meals and foods in their homes. All meals in the controlled diet were designed to be representative of the typical diets of Old Order Amish adults and were provided to the study participants by home delivery over a period of six days. Study participants also abstained from both prescribed and over-the-counter medications and from dietary supplements during this 6-day period. The full menus for the 6-day controlled diet are provided in the Supplementary Materials. The controlled diet contained an average of 3277 kilocalories per day: 49% from carbohydrate, 15% from protein, and 36% from fat. The diet contained an average of 525 mg of cholesterol per day. While designed to be representative of the typical diet of the Old Order Amish, the controlled diet was higher in carbohydrate, total fat, and cholesterol than has generally been suggested in the Dietary Guidelines for Americans of the United States Department of Agriculture. The diet contained approximately 10.4 mg of lycopene per day, coming primarily from tomatoes and tomato sauce.
Compliance with the controlled diet was assessed by comparing sodium, potassium, and creatinine levels in first morning urine samples obtained: (1) before the 6-day controlled diet consumed in this study; (2) on the final day of that 6-day controlled diet; and (3) on the final day of a second isocaloric, low-salt, 6-day controlled diet that was consumed after the blood draw used for the GWAS in this study. The compliance data have been reported in detail previously [34]. In brief, the excreted sodium, potassium, and creatinine levels, which reflect the varying salt content of the diets, indicated high compliance with the controlled diet in this study.

Serum Lycopene Measurement
Frozen blood samples that had been obtained after a 12-h fast were assayed for serum lycopene concentrations at Johns Hopkins University. A 200 µL aliquot of each frozen blood sample was used for the reverse-phase high-pressure liquid chromatography (HPLC) assessment of serum lycopene concentrations [35]. Thirteen batches were run, and the intra-assay and inter-assay coefficients of variation (CVs) for lycopene were 7.9% and 17.4%, respectively.
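For readers less familiar with assay precision metrics, the coefficient of variation is the standard deviation of replicate measurements expressed as a percentage of their mean. The following is a minimal sketch of that calculation; the replicate values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample SD divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical replicate measurements (µg/dL) of one control sample;
# these numbers are illustrative only.
replicates = [38.0, 41.0, 44.0]
cv_percent = coefficient_of_variation(replicates)
print(f"CV = {cv_percent:.1f}%")
```

Replicates run within one batch would give an intra-assay CV, and replicates of the same control run across batches would give an inter-assay CV.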

Genotyping
A genome-wide association study assesses associations of phenotype with single nucleotide polymorphisms (SNPs) throughout the genome. Study participants were genotyped using either the Affymetrix 500k or Affymetrix 1M SNP chip v6.0 by the Genomics Core Laboratory at the University of Maryland. Genotyping calls were made using Birdseed, which is part of the Birdsuite tools [36]. A total of 397,704 SNPs were in common on both arrays and used for analysis. SNPs with a minor allele frequency (MAF) ≥ 1%, a call rate exceeding 95%, and conforming to the expectations of Hardy-Weinberg equilibrium (p > 10⁻⁶) were used for imputation with MACH using the HapMap CEU reference sample [37]. Results were filtered using a MAF ≥ 2% and an imputation quality score ≥ 30%, for a final analyzed SNP count of 2,302,013.
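The pre-imputation quality-control filters described above can be sketched as follows. This is an illustrative sketch only, not the pipeline used in the study: the function names are hypothetical, the Hardy-Weinberg test shown is a simple 1-df chi-square test (the study may have used a different test), and the genotype counts in the usage lines are made up.

```python
import math


def hwe_p_value(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_samples,
              min_maf=0.01, min_call_rate=0.95, min_hwe_p=1e-6):
    """Apply the three filters: MAF >= 1%, call rate > 95%, HWE p > 1e-6."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / n_samples
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if maf == 0.0:  # monomorphic SNP carries no information
        return False
    return (maf >= min_maf
            and call_rate > min_call_rate
            and hwe_p_value(n_aa, n_ab, n_bb) > min_hwe_p)


# Hypothetical genotype counts for SNPs typed in 441 participants
print(passes_qc(100, 200, 141, 441))  # common SNP, fully called
print(passes_qc(200, 0, 241, 441))    # no heterozygotes: strong HWE departure
print(passes_qc(100, 200, 100, 441))  # only 400 of 441 genotypes called
```

In practice these filters are usually applied with dedicated genetics software rather than hand-written code; the sketch only makes the three thresholds explicit.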

Statistical Methods
Descriptive statistics were computed to characterize the study sample and determine the mean concentrations of serum lycopene. For GWAS analysis, we estimated the effect of genotype on lycopene levels, adjusting for the effects of age, sex, and body-mass index (BMI), using a general linear model. Genotype was coded as the number of copies of the reference allele (0, 1, or 2), thus corresponding to an additive genetic model. The GWAS analyses were performed using the MMAP software [38], which accounts for family structure as a random effect. This variance component approach accounts for relatedness among study participants and has previously been shown to provide valid estimates of regression parameters [39]. To account for the multiple SNPs tested, we considered associations at p < 5 × 10⁻⁸ to be statistically significant. At this genome-wide significance threshold, we estimated that our sample provided 80% power to detect SNPs accounting for 9% to 10% of trait variation.
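To make the additive coding concrete, the sketch below fits an ordinary least-squares model of lycopene on genotype (0, 1, or 2 allele copies), age, sex, and BMI. It is a simplified illustration under stated assumptions: the data are hypothetical and noise-free, the helper `ols` is not part of any package, and the fit ignores relatedness, which the actual analysis handled with a variance component (random effect) in MMAP.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical participants: (genotype, age in years, sex, BMI in kg/m^2)
people = [
    (0, 30, 0, 22.0), (1, 45, 1, 27.5), (2, 52, 0, 31.0), (0, 61, 1, 24.5),
    (1, 38, 0, 29.0), (2, 47, 1, 23.0), (1, 55, 1, 26.0), (0, 41, 0, 28.5),
]
# Design matrix: intercept, additive genotype, then the three covariates
X = [[1.0, g, age, sex, bmi] for g, age, sex, bmi in people]
# Simulated lycopene (µg/dL) with a built-in effect of -8 per allele copy
y = [60.0 - 8.0 * g + 0.1 * age + 2.0 * sex - 0.5 * bmi
     for g, age, sex, bmi in people]

beta = ols(X, y)
# beta[1] is the covariate-adjusted per-allele effect; with noise-free toy
# data it recovers the simulated value of about -8.
print(f"per-allele effect: {beta[1]:.2f} µg/dL")
```

In a real GWAS this regression is repeated for every SNP, and each genotype coefficient is tested against the genome-wide threshold of 5 × 10⁻⁸.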

Results
Baseline characteristics of the study sample are provided in Table 1. There were more men in the study than women (254 men, 187 women). Participants were in their early 40s on average (mean = 43.1 years), with the men being slightly younger than the women. Participants had a mean BMI of 26.4 kg/m², and over half of both the men and the women could be classified as overweight (BMI ≥ 25 kg/m²). Mean lycopene values were 39.2 µg/dL (standard deviation = 19.9 µg/dL, standard error of the mean = 0.95 µg/dL, range = 7.5-136.9 µg/dL). The heritability was estimated to be 0.38 ± 0.12. A Manhattan plot summarizing results of the GWAS is provided in Figure 1. The top hits from the association analyses are presented in Table 2. We detected genome-wide significant evidence for association of lycopene levels with a locus on chromosome 4q31 (lead SNP = rs7680948; age-, sex-, and BMI-adjusted p = 3.41 × 10⁻⁹). Each copy of the A allele was associated with an 8.6 µg/dL decrease in serum lycopene, and this locus accounted for 9.3% of the variation in lycopene levels.
[Figure: regional association plot of the SETD7 region on chromosome 4, with genes annotated.] The left y-axis shows the p-value for the association test at each SNP on the log scale. The right y-axis shows recombination rates in centimorgans per megabase (grey line), identifying recombination hotspots in the region. The diamond marks the top hit (i.e., the strongest association); other SNPs in the region are represented by circles, colored by linkage disequilibrium (r²) according to the key at top left. Linkage disequilibrium associated with the top signal appears to span the entire SETD7 gene.
We additionally identified three other loci for which associations were observed at p < 1 × 10⁻⁶; these are provided in Table 2.
We also performed look-ups for SNPs previously reported to be associated with serum lycopene levels in a multiethnic GWAS [28], in which associations were reported at three loci: one achieving genome-wide significance for three SNPs in high linkage disequilibrium in SCARB1 (lead SNP: rs1672879) in the meta-analysis across three ethnic groups, and the other two achieving genome-wide significance in the African-American sample only (for SNPs in SLIT3 (lead SNP: rs78219687) and DHRS2 (lead SNP: rs74036811)). Notably, the associated SNPs in SCARB1 have a minor allele frequency (MAF) of only 0.03 in European-Americans, and the associated SNPs in SLIT3 and DHRS2 were monomorphic in European-Americans. In the Amish, the MAF of the three SCARB1 SNPs was even lower (MAF = 0.015), and there was no evidence for association with lycopene levels (p = 0.26).
Although this does not constitute a replication, we did detect nominal evidence for association of lycopene levels with a different SNP in SCARB1 (rs11057841; MAF = 0.14; p = 3.79 × 10⁻⁴).
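Two of the summary quantities in this section follow from simple formulas, sketched below. The standard error of the mean is the standard deviation divided by the square root of the sample size. Under an additive model, the fraction of trait variance explained by a SNP can be approximated as 2p(1 − p)β²/σ². The allele frequency used here (p = 0.5) is an illustrative assumption and is not reported in the text above; using the unadjusted trait standard deviation is likewise a simplification.

```python
import math

n = 441        # analyzed participants
sd = 19.9      # SD of serum lycopene (µg/dL)
sem = sd / math.sqrt(n)
print(f"SEM = {sem:.2f} µg/dL")


def variance_explained(beta, allele_freq, trait_sd):
    """Approximate fraction of trait variance explained by an additive SNP."""
    return 2 * allele_freq * (1 - allele_freq) * beta ** 2 / trait_sd ** 2


# beta = 8.6 µg/dL per allele copy (reported effect size);
# allele_freq = 0.5 is a hypothetical value chosen for illustration only.
r2 = variance_explained(beta=8.6, allele_freq=0.5, trait_sd=sd)
print(f"variance explained = {r2:.1%}")
```

With a less common allele the same per-allele effect would explain a smaller share of the variance, because 2p(1 − p) is largest at p = 0.5.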

Discussion
The key result from this study is the novel association observed between a common variant at the SETD7 locus, rs7680948, and serum lycopene concentrations in this genome-wide association study. This represents the first genome-wide significant genetic association of lycopene in a mixed-gender, Caucasian population and the first study evaluating genetic determinants of lycopene concentrations among a sample that had consumed a controlled diet.
We were unable to replicate an association previously noted between serum lycopene concentrations and a SNP in SCARB1 that has a low MAF in Caucasians, although we did observe a nominal association of lycopene levels with a different SNP within this gene (rs11057841). Interestingly, rs11057841 has previously been associated with lipoprotein-associated phospholipase A2 (Lp-PLA2) [40]. Both Lp-PLA2 and lycopene are primarily carried throughout the circulation on low-density lipoprotein (LDL) [41,42]. Our data do not provide replicative support for either SLIT3 or DHRS2 [28], although this is not surprising, as these prior associations were detected for SNPs found only in African-Americans and not in Americans of European descent.
Our results raise a number of intriguing possibilities regarding the nature of the relationships previously noted between SETD7, lycopene, and prostate cancer. SETD7 is proliferative and anti-apoptotic in prostate cancer cells and nuclear expression is upregulated in prostate cancer tissue [43]. The activity of SETD7 as a histone methyltransferase (HMT) may also play a role in prostate cancer. HMTs have been shown to be upregulated in prostate cancer [44,45] and deregulation of HMTs has also been associated with prostate cancer development and progression [46]. It is also plausible that SETD7 may be related to prostate cancer risk through its relationship to serum lycopene concentrations, as was identified in this study. The potential protective mechanisms of lycopene against prostate cancer include regulation of the antioxidant response element, exertion of effects on VEGF signaling pathways, induction of cell cycle arrest, and mediation of apoptosis [18,47]. Future studies containing prostate cancer endpoints would be necessary to confirm this relationship.
This study has several key strengths. It was the first GWAS aimed at identifying genetic predictors of serum lycopene concentrations to be conducted in a sample that had consumed a controlled diet. The controlled diet and consistent lycopene, fat, and cholesterol intake among the study participants enabled us to more closely isolate the genetic contributions to the variance in serum lycopene than would have been possible on a variable diet. A related strength was that adherence to the diet was high, as verified by urinary excretion. The diet was informed by home visits to the study population performed by a registered dietitian, was designed to be culturally appropriate based upon the foods and beverages present in the homes during these visits, and was delivered to the homes of the study participants to encourage adherence. Conducting this study in the Old Order Amish population was also advantageous for several reasons. To our knowledge, this is the first study to estimate heritability of serum lycopene concentrations in humans, an analysis made possible by the relationship structure of the Old Order Amish. The Old Order Amish study sample also provided a population that was relatively homogeneous with respect to genetics, environmental exposures, and lifestyle habits. This homogeneity, particularly with respect to genetics, provided increased power to detect genetic variants associated with lycopene concentrations.
There were also several notable limitations to this study. The relatively small sample size (n = 441) may have limited our ability to detect genome-wide significant associations between genetic variants and serum lycopene concentrations. Furthermore, the relatively high inter-assay CV of 17.4% for our serum lycopene measurements could have resulted in lower precision of our estimates. However, despite the relatively small sample size and relatively high inter-assay CV, our study was able to identify a novel locus associated with serum lycopene concentrations. We attribute this success in part to the aforementioned advantages of studying an Old Order Amish population as well as the controlled diet that the participants consumed prior to the fasted-state blood draw which enabled us to more closely isolate the genetic contributions to serum lycopene concentrations. A limitation of the controlled diet was its relatively short duration of six days. While the time to maximum concentration of lycopene after consumption is just six hours, lycopene has an elimination half-life of between five and nine days [48,49]. It is likely that the serum lycopene concentrations measured at the conclusion of the controlled diet were also influenced to some degree by variable dietary intake that occurred prior to the initiation of the controlled diet. However, the controlled diet was designed to be representative of the typical Old Order Amish diet and to the authors' knowledge, all previously published GWAS of lycopene and carotenoid concentrations have been conducted among populations on uncontrolled diets. Thus, we do not believe that this limitation of the controlled diet has a major influence on the findings of this study. 
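The carry-over concern raised above can be illustrated with a short calculation. Assuming simple first-order (exponential) elimination, which is a simplification, the fraction of pre-diet serum lycopene still present after the 6-day diet depends only on the half-life. The function name below is hypothetical.

```python
def fraction_remaining(days, half_life_days):
    """Fraction of an initial amount left after first-order elimination."""
    return 0.5 ** (days / half_life_days)


# Bounds of the elimination half-life range cited in the text (5 to 9 days)
for half_life in (5, 9):
    frac = fraction_remaining(6, half_life)
    print(f"half-life {half_life} d: {frac:.0%} of pre-diet lycopene remains")
```

Under this assumption, roughly 44% to 63% of the lycopene present before the diet began would still be circulating at the blood draw, which is why earlier intake may have influenced the measured concentrations.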
Finally, the novel association between a variant in SETD7 and serum lycopene concentrations, both of which have been associated with prostate cancer, may provide a rationale for further study of the mechanisms underlying this relationship. However, this study did not collect data on family history of prostate cancer, prostate-specific antigen, or other markers of the disease, so no direct inference can be made.
In conclusion, this study identified a novel genetic association between rs7680948, an intronic variant in SETD7, and serum lycopene concentrations. These findings provide further support that genetics may affect serum concentrations of lycopene. Further studies are needed to clarify any potential relationships between SETD7, lycopene, and clinical endpoints such as prostate cancer.