Array-Based Epigenetic Aging Indices May Be Racially Biased

Department of Psychiatry, University of Iowa, Iowa City, IA 52242, USA
Behavioral Diagnostics LLC, Coralville, IA 52241, USA
Center for Family Research, University of Georgia, Athens, GA 30602, USA
Department of Psychological Sciences, University of Connecticut, Storrs, CT 06268, USA
Author to whom correspondence should be addressed.
Genes 2020, 11(6), 685
Submission received: 4 May 2020 / Revised: 1 June 2020 / Accepted: 18 June 2020 / Published: 22 June 2020
(This article belongs to the Section Human Genomics and Genetic Diseases)


Epigenetic aging (EA) indices are frequently used as predictors of mortality and other important health outcomes. However, each of the commonly used array-based indices has significant heritable components which could tag ethnicity and potentially confound comparisons across racial and ethnic groups. To determine if this was possible, we examined the relationship of DNA methylation in cord blood from 203 newborns (112 African American (AA) and 91 White) at the 513 probes from the Levine PhenoAge Epigenetic Aging index to ethnicity. Then, we examined all sites significantly associated with race in the newborn sample to determine if they were also associated with an index of ethnic genetic heritage in a cohort of 505 AA adults. After Bonferroni correction, methylation at 50 CpG sites was significantly associated with ethnicity in the newborn cohort. The five most significant sites predicted ancestry with a receiver operator characteristic area under the curve of 0.97. Examination of the top 50 sites in the AA adult cohort showed that methylation status at 11 of those sites was also associated with percentage European ancestry. We conclude that the Levine PhenoAge Index is influenced by cryptic ethnic-specific genetic influences. This influence may extend to similarly constructed EA indices and bias cross-race comparisons.

1. Introduction

Over the past 10 years, there have been significant advances in our ability to use epigenetic status to predict healthcare outcomes. Progress in this area has been facilitated by the development of methylation profiling platforms, in particular the Illumina Human-Methylation450 BeadChip (450K) and Infinium MethylationEpic Beadchip (Epic) arrays that are capable of assessing methylation status at hundreds of thousands of CpG residues simultaneously [1,2].
In the world of research, these arrays have been used for two distinct purposes. First, they have been used for discovery. Using these arrays, literally thousands of studies describing the relationship of methylation to illnesses, such as diabetes and heart disease, or to environmental exposures/consumption of substances such as cigarettes, alcohol, and pesticides have been conducted [3,4,5,6,7,8,9]. Importantly, the findings from many of these studies have been subsequently replicated in cohorts of other ancestries thus ensuring the generalizability of the findings across ethnic groups. Additionally, array-based methylation assessments have often been verified by more exact approaches for assessing methylation such as pyrosequencing [10].
These arrays have also been used as tools for imputing global health status. Perhaps the most popular of these metrics is that of “epigenetic aging” (EA). Although the concept owes its origins to a variety of individuals including Fraga and Esteller, the implementation of the information from these arrays as EA indices was described separately and independently by both Hannum and associates and Horvath and associates in 2013 [11,12,13]. At the core of their approaches, each team first used regression to identify a set of several hundred CpG loci whose methylation status changes, either increasing or decreasing, in association with chronological age in a first set of subjects. Then, they used the rate of change at each of these loci in array data from a second group of subjects to predict the apparent chronological age of the subjects with the difference between the apparent and actual chronological age being referred to as “accelerated epigenetic age”.
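The two-step logic described above (fit age-associated weights in one sample, then score a second sample and take the difference from actual age) can be illustrated with a minimal sketch. It assumes a plain linear predictor; the probe names, weights, and intercept below are hypothetical placeholders and are not the coefficients of any published index, and published clocks may apply additional transformations.

```python
def predicted_age(betas, weights, intercept):
    """Apparent age as a weighted sum of methylation values (assumed linear form)."""
    return intercept + sum(w * betas[cpg] for cpg, w in weights.items())


def age_acceleration(betas, weights, intercept, chronological_age):
    """Difference between apparent (predicted) age and actual chronological age."""
    return predicted_age(betas, weights, intercept) - chronological_age


# Hypothetical two-probe "clock": one probe gains and one loses methylation with age.
example_weights = {"cg_example_up": 10.0, "cg_example_down": -5.0}
example_betas = {"cg_example_up": 0.5, "cg_example_down": 0.2}
acceleration = age_acceleration(example_betas, example_weights, 40.0, 40.0)
```

A positive value indicates that the subject appears epigenetically older than their chronological age under the assumed weights.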
Given the social and economic challenges of the burdens associated with aging, the use of these global indices has been remarkably popular. Hundreds, if not thousands, of papers have used these indices, or some derivative of them, to predict key healthcare outcomes associated with aging such as mortality and cardiovascular disease. More recently, epigenetic indices have been developed to predict disease phenotypes. In particular, the “PhenoAge” DNA methylation index developed by Levine and colleagues was designed to overcome limitations of prior age-focused measures [14]. Because EA indices keyed to chronological age were not found to be consistently related to cardiovascular disease or early onset of chronic illness [15], DNAm PhenoAge was developed using both chronological age and clinical measures so that it would better predict individual differences in lifespan and health span [16]. The index reflects several known aging pathways and provides a useful objective marker of elevated risk for early onset morbidity and chronic illness. A potential limitation of the DNAm PhenoAge index, and perhaps all similarly formed indices, is the extent to which it was trained on data containing ancestry specific methylation information. If the ethnic groups whose data are represented in the training sets are mismatched for socioeconomic factors that also affect health outcomes, the resulting algorithm may inadvertently be biased, resulting in an index that is not well suited for comparisons across race, or for application to broad, heterogeneous samples.
The concern regarding possible contamination by cryptic ethnic variation in EA indices is not hypothetical. It is well-established that methylation arrays contain significant implicit and explicit genotype information [17,18,19]. To understand how implicit genotype information becomes detectable on methylation arrays, it is necessary to recall the key difference between arrays that measure genetic variation and those that measure epigenetic variation: while both quantitatively capture allele specific hybridization signals, the former use regular genomic DNA while the latter use bisulfite converted DNA. Because only a portion of the sequence variation is destroyed by bisulfite conversion, it is possible to assess explicit and implicit genetic information from bisulfite converted DNA. The 450K array provides explicit genotype information at 65 genetic polymorphisms whose genotypic variation (e.g., an A to G polymorphism) is unaffected by the bisulfite conversion [20]. These 65 genotypes on the array can be used to help sort out any laboratory mix-ups with respect to subject identification. The implicit genotyping information is considerably larger and takes advantage of the fact that many of the 50 bp long probes hybridize to segments of the genome containing not only a CpG residue, but genetic polymorphisms as well. Because the presence of the polymorphism can alter the signal either directly, by changing the annealing temperature of a segment of DNA for a given C or T specific probe, or indirectly, by changing the amount of methylation at a given site, the presence of the genetic polymorphism can be inferred. The presence of many of these potential genetic influences is noted in the 450K and Epic annotation files [20]. In 2014, we showed how “genetic” information contained in the hybridization signal from the 450K analyses of 111 African American (AA) subjects could be used to infer over 10,000 genotypes [17]. However, each ancestry will have its own unique set of signals, with the number of potentially detectable gene-methylation interaction effects likely to be in the millions [21,22].
Relevant to our current investigation, many cryptic genotypic influences are ethnically specific. In fact, Rahmani and colleagues have developed a program called EPISTRUCTURE for inferring ethnicity from the cryptic genotyping information [18]. This cryptic variation may affect important health related loci. For example, with respect to smoking, we have shown that almost 90% of the top ranked CpG sites for predicting smoking status are affected by ethnic specific genetic variation [23]. In the Framingham Heart Study (FHS) Offspring Cohort, which consists of individuals of European ancestry, 195 of the 513 CpG sites in the DNAm PhenoAge EA Index are significantly associated with methylation at the well-established indicator of smoking, cg05575921 [24]. However, in AA samples, the Levine PhenoAge Index is only modestly associated with cotinine seropositivity and cg05575921 inferred smoking intensity [25].
Might discrepancies in the pattern of correlation between DNAm PhenoAge EA and health behaviours for those of differing ethnicity be the result of differences introduced by unacknowledged cryptic genetic influences? Despite being based on methylation platforms that are designed to capture acquired changes, both the DNAm PhenoAge index and the newly described GrimAge index have strong heritable components. The DNAm PhenoAge index has an h2 of 0.51 while the GrimAge index has an h2 of 0.37, with the individual subscales having heritabilities between 0.34 and 0.51 [14,26]. Conceivably, part of the heritability could come from germline transmission of methylation signals. However, although epigenetic inheritance has been posited in humans, none has been conclusively demonstrated, with most experts agreeing that if it occurs, the amount of germline transmission is fairly low [27,28]. Accordingly, it is important to carefully examine the source of any observed heritability of EA indices and determine whether ethnic specific components in that heritability affect their correlates.
In this communication, we hypothesize that some of the inconsistency across samples of differing races/ethnicities when examining associations of the DNAm PhenoAge EA Index with health behaviors, and some of its heritability, may be the consequence of methylation being affected by ethnic specific genetic variation. To test this hypothesis, we examine the association of methylation values at individual probes from the DNAm PhenoAge EA index with ethnicity in methylation data from two separate cohorts. The first set of data is from a set of 203 newborn infants (112 AA and 91 White). We then test and extend the most significant findings in a cohort of AAs for whom we can infer varying degrees of European ancestry.

2. Materials and Methods

The genetic and epigenetic data used in this study were obtained from two sources. The list of the probes used in the Levine EA Index was taken from their 2018 work [14].
The first set of data was from a set of methylation assessments of newborn cord blood DNA conducted by Mozhui and associates as part of their 2015 study of maternal nutrition [29]. In brief, after obtaining approval from the University of Tennessee Institutional Review Board (IRB 200802719), Mozhui and colleagues obtained cord blood from 212 newborns from the University of Tennessee Health Center, including samples from 112 AA and 91 White subjects. Methylation array profiling of the DNA prepared from these samples was conducted using the Illumina HumanMethylation27 BeadChip and processed using the Illumina Genome Studio (version 2009.1) [29]. The data were then corrected for batch effects using the COMBAT R package. The resulting M-values for 27,577 probes (including all 513 probes from the DNAm PhenoAge EA index) for all 212 samples were then generously posted to the NCBI NIH Gene Expression Omnibus (accession ID GSE64940).
The source and processing of the genome wide epigenetic data from the Family and Community Health Studies (FACHS) have also been previously described [30,31]. In brief, the FACHS study was a longitudinal study of the effects of socioeconomic factors on health-related outcomes of AA parent-child dyads from Iowa and Georgia. During Wave 5 of this longitudinal study (2008–2010), the adult subjects from these dyads were interviewed and phlebotomized. The DNA from these samples was processed via our standard procedures, then interrogated for genome wide methylation using the Infinium MethylationEpic Beadchip by the University of Minnesota Genome Center. Standard sample and probe level quality control were conducted as previously described [23,25]. After quantile normalization, the resulting methylation values were exported as beta values for use in this study.
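Note that the newborn data are reported as M-values while the FACHS data are reported as beta values. The two scales are related by the standard logit transform, M = log2(beta / (1 − beta)). The sketch below shows that relationship only; it is not a description of any conversion performed in this study, and the function names are illustrative.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a beta value (proportion methylated, strictly between 0 and 1) to an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Convert an M-value back to a beta value."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

An M-value of 0 corresponds to a beta value of 0.5, with positive M-values indicating more than half of the molecules methylated.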
The genetic information from the FACHS cohort used in the ancestry index described below was obtained using the Infinium Multi-Ethnic Global-8 Beadchip by the University of Minnesota Genome Center. In short, after processing with Genome Studio, the data were subjected to quality control measures at both the sample and SNP probe levels using PLINK [32]. Data from subjects whose self-reported gender and biological sex were discordant, whose heterozygosity rate fell outside the mean ± 2 SD, or whose proportion of missing SNPs was > 0.03 were excluded.
The European Ancestry index (EAI) was constructed using a list of ancestry informative polymorphisms identified by Seldin and associates [33]. In brief, in 2009 this group identified a set of 128 genetic markers whose genotype status could be used to infer the ancestral origins of anyone in the world. From their list of 128 single nucleotide polymorphisms (SNPs), we selected 13 (rs9809104, rs385194, rs6556352, rs2504853, rs1871428, rs7803075, rs2416791, rs772262, rs9522149, rs4984913, rs2125345, rs7238445, rs4891825) whose major and minor allele frequencies were markedly discordant between those with African vs European ancestry (e.g., 10% A in one population and 90% A in the other population) and whose data were available from the Multi-Ethnic Global-8 Beadchip. We then converted the genotypes at each of those SNPs to a 1, 2 or 3 scale, with 1 being the genotype most common in those with African ancestry and 3 being the genotype most common in those with European ancestry (e.g., AA = 1, AG = 2 and GG = 3 for rs385194). The scores at each of these loci were averaged to provide an EAI score for each subject.
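The scoring scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' JMP workflow: only the rs385194 coding (G as the allele scored toward European ancestry) comes from the text, and the second SNP identifier and its allele are hypothetical placeholders.

```python
from statistics import mean


def genotype_score(genotype: str, european_allele: str) -> int:
    """Score a two-letter genotype as 1, 2 or 3: one plus the number of European-typical alleles."""
    if len(genotype) != 2:
        raise ValueError("genotype must be a two-letter string such as 'AG'")
    return 1 + genotype.count(european_allele)


def european_ancestry_index(genotypes: dict, european_alleles: dict) -> float:
    """Average the per-SNP scores over every SNP that has a genotype call."""
    scores = [
        genotype_score(genotypes[snp], allele)
        for snp, allele in european_alleles.items()
        if snp in genotypes
    ]
    if not scores:
        raise ValueError("no scorable SNPs for this subject")
    return mean(scores)


# rs385194 coding is from the text; "rs_hypothetical" and its allele are placeholders.
alleles = {"rs385194": "G", "rs_hypothetical": "T"}
subject = {"rs385194": "AG", "rs_hypothetical": "CC"}
eai = european_ancestry_index(subject, alleles)
```

With one heterozygous call (score 2) and one African-typical homozygous call (score 1), the example subject's index is the mean of the two scores.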
All data were analyzed using the JMP suite of programs (Cary, SC) with the statistical analyses (t-tests, logistic regression, and receiver–operator characteristic (ROC) area under the curve (AUC) analyses) described in the text [34,35].

3. Results

As a first step in understanding the potential for ethnically contextual genetic effects affecting methylation status at the 513 loci used in the DNAm PhenoAge EA index, we analyzed the relationship of methylation score to ancestry using the cord blood DNA methylation data from 112 African American (AA, 59 male and 53 female) and 91 White (41 male and 50 female) subjects profiled by Mozhui and associates [29].
In total, methylation status at 223 of the 513 probes in the Levine EA index was nominally associated with ethnicity, with methylation status at 50 of these probes being significantly associated (p < 0.05; t-test) after Bonferroni correction. Table 1 lists the 30 probes whose methylation status is most significantly associated with ancestry along with information regarding the presence of nearby polymorphisms from the 2016 edition of the Illumina Human-Methylation450 BeadChip annotation file. A listing of the complete association analyses is given in Table S1.
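To make the correction concrete, the sketch below applies a Bonferroni threshold to a set of per-probe p-values. It assumes the correction was made across all 513 probes, which the text does not state explicitly; under that assumption the per-probe threshold is 0.05/513, or roughly 9.7 × 10⁻⁵. The probe names and p-values in the example are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that holds the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_probes(p_values: dict, alpha: float = 0.05) -> list:
    """Return the probes whose p-value survives correction for the number of probes tested."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(probe for probe, p in p_values.items() if p < threshold)


# Threshold under the assumption that all 513 index probes were tested.
threshold_513 = bonferroni_threshold(0.05, 513)

# Hypothetical p-values for three probes (threshold here is 0.05 / 3).
hits = significant_probes({"probe_a": 0.001, "probe_b": 0.03, "probe_c": 0.2})
```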
Although the genetic variation potentially affecting DNA methylation status at any given residue can be anywhere in the genome, genetic variation immediately adjacent to the CpG site is thought to have particularly strong effects [36]. Review of the annotation information from Illumina (see Table 1) shows that 7 of the 30 most significant probes have known polymorphisms within 50 bp of the CpG site, with an eighth probe (cg12864235) having a SNP (rs73925316) within 10 bp of the CpG site specifically interrogated by the probe. Examination of that locus in the NCBI dbSNP database [37] shows a marked discrepancy in allele frequencies between Africans and Europeans, with the frequency of the G allele being 0.0484 (n = 2072) in Europeans and 0.3 in AAs (n = 76). Review of the dbSNP data at the other, less closely located SNPs from Table 1 shows similar ethnic specific differences in several of the 7 SNPs.
To understand the power of methylation values at these CpG sites to predict ethnicity, we conducted a series of nominal logistic regression analyses. A simple model using only the information from the most highly associated CpG probe (cg08654655) was highly significant (p < 0.0001, R2 = 0.225, ROC AUC 0.80, n = 203). Stepwise addition of methylation information from the next 4 most highly ranked probes steadily increased the predictive power for ethnicity status (p < 0.0001, R2 = 0.66, ROC AUC 0.97, n = 203). In contrast, although ethnicity had a strong effect on methylation, there was no effect of gender in any of the models.
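For readers less familiar with the ROC AUC, it can be read as the probability that a randomly chosen subject from one group receives a higher model score than a randomly chosen subject from the other group. The sketch below computes it directly from that pairwise definition; it illustrates the statistic and is not the JMP procedure used in this study. The scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for four subjects (label 1 = one group, 0 = the other).
example_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so the reported value of 0.97 for the five-probe model indicates that the two groups are almost completely separable.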
As a next step of our analyses, we tested whether methylation at the most significantly associated sites was associated with the percentage of European ancestry in our adult AA cohort. As a first step, we calculated an index of the relative amount of European ancestry using the method outlined by Seldin and associates. As Figure 1 shows, the relative amount of European ancestry varied widely. The average value of the European Ancestry Index (EAI) was 1.36 with 15 self-reported AA subjects averaging at least one or more of the alleles normally associated with Europeans across each of the 13 polymorphic sites surveyed.
We then analyzed the relationship of the EAI to methylation status at each of the 50 CpG sites that were significantly associated with ethnicity in the newborn cohort. Overall, 11 of the 50 CpG sites associated with AA ancestry among infants were also associated with the degree of AA ancestry in the adult FACHS subjects (see Table 2). Interestingly, five of the eleven CpG probes were among those associated with cg05575921 status in the prior study of the DNAm PhenoAge Index using the Framingham Heart Study population.

4. Discussion

Using methylation and ethnicity data from two cohorts, we found that methylation status at some of the sites used in the DNAm PhenoAge index is associated with ethnicity. Caveats include the relatively limited power of the confirmation cohort, possible confounding biases in the newborn population, and the possibility that some observed differences may be attributable to differential prenatal exposures. Finally, because the UCLA website (www. that calculates the DNAm PhenoAge EA does not accept 27K data for imputation of the DNAm PhenoAge Index, we were unable to directly compare the averages and distribution of the DNAm PhenoAge EA predictions for these subjects.
Although perhaps ungainly at first consideration, the study design purposefully addressed the possibility of racial/ethnic confounding by using cohorts with different types of contrasts. In the first cohort, the contrast was between subjects. All of the blood samples came from the same source, cord blood, and at the same age, birth. Hence, there were no confounds with respect to age or DNA source making this an intuitive, easy to understand approach to identifying potential ethnic specific methylation variation that minimizes other potentially confounding influences. At the same time, this case and control approach is subject to other biases, such as the potential for White mothers to have greater access to pre-natal care or different prenatal experiences. Conceivably, these differences could be reflected in methylation differences among infants. Arguing for a genetic source for these differences, however, we note that local genetic variation is documented in 8 of the 30 most significantly affected probes. So, there is good reason to believe that genetic factors may already be in play at these loci at birth, and that some of the findings represent differences due to cryptic ethnic genetic variation rather than differential prenatal experience. Such differences have the potential to yield scales that reflect different influences across racial/ethnic groups. For example, in the FHS (White) sample, the DNAm PhenoAge EA index largely loaded on cigarette and alcohol consumption and this was not the case in the FACHS (AA) sample [24,25]. Since the rate of smoking and alcohol consumption are roughly equal in Whites and AAs [38,39], it is unlikely that differences in habits affected the methylation outcomes, suggesting that the DNAm PhenoAge index is operating somewhat differently across ethnic groups.
In the second set of analyses, the contrast was within-individual, using individuals who varied by age. Still, since all of these individuals are from the same cohort, the likelihood that some systematic bias in nutrition, substance use, community, or socio-economic adversity is differentially affecting individuals as a function of their EAI score is limited, although not impossible. More likely is that the EAI predicts the proportion of ethnic specific genetic variation that can raise or lower the methylation at particular loci. Despite the limited number of subjects with high EAI scores, methylation status at 11 of the loci was significantly associated with EAI, which suggests that a larger, more informative cohort could have confirmed the associations at additional loci that did not achieve statistical significance in the current sample. Unexamined was the extent to which EAI was associated with sources of stress, including economic hardship and discrimination, that might also contribute to the association.
A reasonable question is how accurate is the EAI approach that we used in this study. Certainly, the use of additional ethnically informative SNP information would provide more precise estimates. Unfortunately, of the 24 SNPs most highly predictive of White versus AA ancestry in the 2009 manuscript by Seldin and colleagues, only 13 were included in the Global-8 Multi-Ethnic chip. Still, we note that all things being equal, the average value of 1.36 suggests that, on average, 18% of the ancestry of FACHS participants is of European origin. This figure is well in line with prior estimates of 24% by Bryc and associates in their analyses of 5269 self-reported AA subjects [40].
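The 18% figure appears to follow from linearly rescaling the 1-to-3 score so that 1 maps to 0% and 3 maps to 100% European-typical alleles. The sketch below makes that arithmetic explicit; the rescaling is our reading of the text rather than a formula stated in it, and the function name is illustrative.

```python
def eai_to_european_fraction(eai: float) -> float:
    """Rescale an EAI score on the 1-to-3 scale to a 0-to-1 fraction of European-typical alleles."""
    if not 1.0 <= eai <= 3.0:
        raise ValueError("EAI must lie between 1 and 3")
    return (eai - 1.0) / 2.0


# Cohort mean reported in the text: (1.36 - 1) / 2, i.e. about 0.18.
cohort_fraction = eai_to_european_fraction(1.36)
```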
The fact that 5 of the CpG sites whose confounding by ethnic specific genetic variation was confirmed in the second set of analyses were also significantly associated with epigenetic smoking in prior studies of the FHS is not unexpected. Recently, we reported a set of analyses that showed that cg05575921 and a GrimAge sub-index (packyears), but not the DNAm PhenoAge EA index, strongly predicted smoking status in the FACHS cohort [25]. The finding that the methylation signal at these loci is affected by ethnic specific variation supports the assertion that confounding may have diminished the association between the DNAm PhenoAge EA index and health behaviour in the FACHS cohort relative to White samples. We also note that none of these sites were significantly associated with smoking in either of our two prior genome wide studies of smoking in AAs [23,41].
The findings in this manuscript further emphasize the need to understand the genomic basis for genetic x methylomic interactions (i.e., GxMeth effects). The existence of these GxMeth effects has been known for many years. In 2010, Mill and colleagues noted widespread allele specific methylation skewing [42]. In 2013, Illumina posted a product note to their website that indicated that over half (273,660 of the 485,577 total) of the probes in the 450K array had one or more significant GxMeth interaction effects [43]. Since then, a number of studies have shown that the genetic variation detected by these platforms is widespread, has significant impact, and can be used to predict ethnicity [18,23,43,44]. Still, it is important to note that these arrays only assess a fraction of the 28 million CpG sites in the human genome [45]. Therefore, whole genome bisulfite sequencing approaches will be needed to establish a comprehensive understanding of large and fine scale methylomic regulation. If successful, these studies may lead to the development of precision medicine tools for the treatment of cancer or of certain developmental disorders which result from defective genomic imprinting [46].
What is not clear from this study is what proportion of the signal in the DNAm PhenoAge EA index loads on ethnicity. In part, that is because the process involved in its derivation is complex and relies on data from a number of cohorts, each of which possesses unique ascertainment and demographic characteristics. Still, the results from the current study, combined with the fact that 51% of the DNAm PhenoAge EA index is reported to be heritable, provide a conceptual challenge to the hope that it would provide a relatively pure index of the impact of acquired effects of the environment on the epigenome.
Are other recently developed EA indices that use methylation arrays also subject to ethnic specific genetic effects? The Brenner Frailty index bases its prediction on the information from a relative handful (34) of CpG sites [47]. If the amount of confounding is proportional to the number of probes, then the effects on this index are likely less severe. In addition, many of the CpG sites in their index, such as cg04987734 and cg05575921, load specifically on smoking and drinking consumption and have no known ethnic bias in their set points [48,49]. In contrast, the GrimAge index, which also has considerable heritability, uses information from over 1000 CpG probes [26]. However, the identity of those probes has not been publicly disclosed, making examination of contamination by cryptic ethnic variation difficult.
The policy implications of this study are potentially significant. In essence, we show that a number of loci in the DNAm PhenoAge index can be used to predict ethnicity and, thereby, that some of its prediction of health outcomes in general population samples may be secondary to health disparities between Blacks and Whites or other ethnic groups. Concerns about potentially different patterns of correlates are also relevant to health advice based on these measures. For example, companies using EA indices commercially to provide estimates of biological age and indicate the need for products such as vitamin supplements [50] may need to reconsider the quality of advice being provided to non-Whites. Because supplements are relatively non-toxic, this may be of minor concern at present. However, use of these or similar tools in more extensive medical decision making could be more consequential. Likewise, if these tools are used to guide changes in Federal or State policy, or allowed to influence insurance costs, the financial consequences for groups showing higher scores due to contamination by cryptic ethnic variation could be substantial.

5. Conclusions

In summary, we report that the DNAm PhenoAge EA index contained some ancestry specific information. Although measures of EA are useful in a variety of research contexts, particularly when focused on homogeneous samples, we suggest the need for caution in the use of this and similar tools in situations that explicitly or implicitly involve comparisons across racial or ethnic groups. It is possible that effects of contamination by cryptic ethnic variation are limited to main effects that could be corrected statistically. Conversely, there is reason to worry that effects of contamination by cryptic ethnic variation may also extend to patterns of association. In that case, observations and recommendations regarding predictors and consequences of EA measures may need to be carefully replicated with multiple ethnic groups to directly test the extent of generalizability. Alternatively, efforts to ensure that EA measures are free of contamination by cryptic ethnic variation, equally applicable to multiple ethnic groups, and responsive to similar predictors across ethnic groups may require some revision of the EA measures that are currently in widespread use.

Supplementary Materials

The following are available online at, Table S1: Supplemental Table S1.

Author Contributions

Conceptualization, R.P., M.-K.L., R.L.S., M.V.D. and S.R.H.B.; methodology, M.-K.L. and R.P.; analysis, R.P., M.-K.L.; resources, R.P., S.R.H.B., F.X.G., M.G., R.L.S.; data curation, M.-K.L. and M.V.D.; writing—original draft preparation, R.P.; writing—review and editing, R.P., M.-K.L., M.-K.L., M.G., R.L.S., M.V.D. and S.R.H.B.; funding acquisition, R.P., S.R.H.B., F.X.G., M.G., R.L.S.; All authors have read and agreed to the published version of the manuscript.


Funding

This work was supported by the National Institutes of Health; grant number AG055393 (Simons, PI), grant number R01DA021898 (Gibbons, PI) and grant number R01CA0220254 (Gibbons, Beach and Philibert, PIs).

Conflicts of Interest

Philibert is the Chief Executive Officer of Behavioral Diagnostics and the inventor of both granted and pending patent applications covering the use of DNA methylation to assess smoking, drinking and cardiovascular disease. Dogan is the Chief Executive Officer of Cardio Diagnostics and is a co-inventor on a pending patent application covering the use of DNA methylation to assess cardiovascular disease. The remaining authors do not report any significant conflicts.


References

  1. Dedeurwaerder, S.; Defrance, M.; Bizet, M.; Calonne, E.; Bontempi, G.; Fuks, F. A comprehensive overview of Infinium HumanMethylation450 data processing. Brief. Bioinform. 2014, 15, 929–941.
  2. Pidsley, R.; Zotenko, E.; Peters, T.J.; Lawrence, M.G.; Risbridger, G.P.; Molloy, P.; Van Djik, S.; Muhlhausler, B.; Stirzaker, C.; Clark, S.J. Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol. 2016, 17, 208.
  3. Soriano-Tárraga, C.; Jiménez-Conde, J.; Giralt-Steinhauer, E.; Mola-Caminal, M.; Vivanco-Hidalgo, R.M.; Ois, A.; Rodríguez-Campello, A.; Cuadrado-Godia, E.; Sayols-Baixeras, S.; Elosua, R. Epigenome-wide association study identifies TXNIP gene associated with type 2 diabetes mellitus and sustained hyperglycemia. Hum. Mol. Genet. 2016, 25, 609–619.
  4. Monick, M.M.; Beach, S.R.; Plume, J.; Sears, R.; Gerrard, M.; Brody, G.H.; Philibert, R.A. Coordinated changes in AHRR methylation in lymphoblasts and pulmonary macrophages from smokers. Am. J. Med. Genet. 2012, 159, 141–151.
  5. Fernández-Sanlés, A.; Sayols-Baixeras, S.; Subirana, I.; Degano, I.R.; Elosua, R. Association between DNA methylation and coronary heart disease or other atherosclerotic events: A systematic review. Atherosclerosis 2017, 263, 325–333.
  6. Paul, K.C.; Chuang, Y.-H.; Cockburn, M.; Bronstein, J.M.; Horvath, S.; Ritz, B. Organophosphate pesticide exposure and differential genome-wide DNA methylation. Sci. Total Environ. 2018, 645, 1135–1143.
  7. Liu, C.; Marioni, R.E.; Hedman, Å.K.; Pfeiffer, L.; Tsai, P.-C.; Reynolds, L.M.; Just, A.C.; Duan, Q.; Boer, C.G.; Tanaka, T. A DNA methylation biomarker of alcohol consumption. Mol. Psychiatry 2018, 23, 422–433.
  8. Zeilinger, S.; Kühnel, B.; Klopp, N.; Baurecht, H.; Kleinschmidt, A.; Gieger, C.; Weidinger, S.; Lattka, E.; Adamski, J.; Peters, A.; et al. Tobacco smoking leads to extensive genome-wide changes in DNA methylation. PLoS ONE 2013, 8, e63812.
  9. Philibert, R.; Penaluna, B.; White, T.; Shires, S.; Gunter, T.D.; Liesveld, J.; Erwin, C.; Hollenbeck, N.; Osborn, T. A pilot examination of the genome-wide DNA methylation signatures of subjects entering and exiting short-term alcohol dependence treatment programs. Epigenetics 2014, 9, 1212–1219.
  10. Bock, C.; Halbritter, F.; Carmona, F.J.; Tierling, S.; Datlinger, P.; Assenov, Y.; Berdasco, M.; Bergmann, A.K.; Booher, K.; Busato, F. Quantitative comparison of DNA methylation assays for biomarker development and clinical applications. Nat. Biotechnol. 2016, 34, 726.
  11. Hannum, G.; Guinney, J.; Zhao, L.; Zhang, L.; Hughes, G.; Sadda, S.; Klotzle, B.; Bibikova, M.; Fan, J.-B.; Gao, Y.; et al. Genome-wide Methylation Profiles Reveal Quantitative Views of Human Aging Rates. Mol. Cell 2013, 49, 359–367.
  12. Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. 2013, 14, 3156.
  13. Fraga, M.F.; Esteller, M. Epigenetics and aging: The targets and the marks. Trends Genet. 2007, 23, 413–418.
  14. Levine, M.E.; Lu, A.T.; Quach, A.; Chen, B.H.; Assimes, T.L.; Bandinelli, S.; Hou, L.; Baccarelli, A.A.; Stewart, J.D.; Li, Y. An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY) 2018, 10, 573.
  15. Jylhävä, J.; Pedersen, N.L.; Hägg, S. Biological age predictors. EBioMedicine 2017, 21, 29–36.
  16. Horvath, S.; Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat. Rev. Genet. 2018, 19, 371.
  17. Philibert, R.A.; Terry, N.; Erwin, C.; Philibert, W.J.; Beach, S.R.; Brody, G.H. Methylation array data can simultaneously identify individuals and convey protected health information: An unrecognized ethical concern. Clin. Epigenetics 2014, 6, 28.
  18. Rahmani, E.; Shenhav, L.; Schweiger, R.; Yousefi, P.; Huen, K.; Eskenazi, B.; Eng, C.; Huntsman, S.; Hu, D.; Galanter, J.; et al. Genome-wide methylation data mirror ancestry information. Epigenetics Chromatin 2017, 10, 1. [Google Scholar] [CrossRef] [Green Version]
  19. Chen, Y.A.; Lemire, M.; Choufani, S.; Butcher, D.T.; Grafodatskaya, D.; Zanke, B.W.; Gallinger, S.; Hudson, T.J.; Weksberg, R. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics 2013, 8, 203–209. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  20. Infinium HumanMethylation450K v1.2 Product Files. Available online: (accessed on 26 April 2020).
  21. Czamara, D.; Eraslan, G.; Page, C.M.; Lahti, J.; Lahti-Pulkkinen, M.; Hämäläinen, E.; Kajantie, E.; Laivuori, H.; Villa, P.M.; Reynolds, R.M. Integrated analysis of environmental and genetic influences on cord blood DNA methylation in new-borns. Nat. Commun. 2019, 10, 1–18.
  22. Dogan, M.V.; Beach, S.R.; Philibert, R.A. Genetically contextual effects of smoking on genome wide DNA methylation. Am. J. Med. Genet. Part B Neuropsychiatr. Genet. 2017, 174, 595–607.
  23. Dogan, M.V.; Xiang, J.; Beach, S.R.; Cutrona, C.; Gibbons, F.X.; Simons, R.L.; Brody, G.H.; Stapleton, J.T.; Philibert, R.A. Ethnicity and smoking-associated DNA methylation changes at HIV co-receptor GPR15. Front. Psychiatry 2015, 6, 132.
  24. Mills, J.A.; Beach, S.R.; Dogan, M.; Simons, R.L.; Gibbons, F.X.; Long, J.D.; Philibert, R. A direct comparison of the relationship of epigenetic aging and epigenetic substance consumption markers to mortality in the Framingham heart study. Genes 2019, 10, 51.
  25. Lei, M.-K.; Gibbons, F.X.; Simons, R.L.; Philibert, R.A.; Beach, S.R. The Effect of Tobacco Smoking Differs across Indices of DNA Methylation-Based Aging in an African American Sample: DNA Methylation-Based Indices of Smoking Capture These Effects. Genes 2020, 11, 311.
  26. Lu, A.T.; Quach, A.; Wilson, J.G.; Reiner, A.P.; Aviv, A.; Raj, K.; Hou, L.; Baccarelli, A.A.; Li, Y.; Stewart, J.D. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging (Albany NY) 2019, 11, 303.
  27. Horsthemke, B. A critical view on transgenerational epigenetic inheritance in humans. Nat. Commun. 2018, 9, 1–4.
  28. Heard, E.; Martienssen, R.A. Transgenerational epigenetic inheritance: Myths and mechanisms. Cell 2014, 157, 95–109.
  29. Mozhui, K.; Smith, A.K.; Tylavsky, F.A. Ancestry dependent DNA methylation and influence of maternal nutrition. PLoS ONE 2015, 10, e0118466.
  30. Simons, R.L.; Lei, M.K.; Beach, S.R.; Brody, G.H.; Philibert, R.A.; Gibbons, F.X. Social environmental variation, plasticity genes, and aggression: Evidence for the differential susceptibility hypothesis. Am. Sociol. Rev. 2011, 76, 833.
  31. Gibbons, F.X.; Gerrard, M.; Cleveland, M.J.; Wills, T.A.; Brody, G. Perceived discrimination and substance use in African American parents and their children: A panel study. J. Pers. Soc. Psychol. 2004, 86, 517.
  32. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.; Bender, D.; Maller, J.; Sklar, P.; De Bakker, P.I.; Daly, M.J. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575.
  33. Nassir, R.; Kosoy, R.; Tian, C.; White, P.A.; Butler, L.M.; Silva, G.; Kittles, R.; Alarcon-Riquelme, M.E.; Gregersen, P.K.; Belmont, J.W. An ancestry informative marker set for determining continental origin: Validation and extension using human genome diversity panels. BMC Genet. 2009, 10, 39.
  34. Hanley, J.A.; McNeil, B.J. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983, 148, 839–843.
  35. Fleiss, J.L. Statistical Methods for Rates and Proportions, 2nd ed.; John Wiley & Sons Inc: New York, NY, USA, 1981.
  36. Huan, T.; Joehanes, R.; Song, C.; Peng, F.; Guo, Y.; Mendelson, M.; Yao, C.; Liu, C.; Ma, J.; Richard, M. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat. Commun. 2019, 10, 1–14.
  37. Sherry, S.T.; Ward, M.-H.; Kholodov, M.; Baker, J.; Phan, L.; Smigielski, E.M.; Sirotkin, K. dbSNP: The NCBI database of genetic variation. Nucleic Acids Res. 2001, 29, 308–311.
  38. Centers for Disease Control and Prevention (CDC). Vital signs: Current cigarette smoking among adults aged ≥ 18 years--United States, 2005–2010. MMWR Morb. Mortal. Wkly. Rep. 2011, 60, 1207.
  39. Grucza, R.A.; Sher, K.J.; Kerr, W.C.; Krauss, M.J.; Lui, C.K.; McDowell, Y.E.; Hartz, S.; Virdi, G.; Bierut, L.J. Trends in adult alcohol use and binge drinking in the early 21st-century United States: A meta-analysis of 6 National Survey Series. Alcohol. Clin. Exp. Res. 2018, 42, 1939–1950.
  40. Bryc, K.; Durand, E.Y.; Macpherson, J.M.; Reich, D.; Mountain, J.L. The genetic ancestry of African Americans, Latinos, and European Americans across the United States. Am. J. Hum. Genet. 2015, 96, 37–53.
  41. Dogan, M.V.; Shields, B.; Cutrona, C.; Gao, L.; Gibbons, F.X.; Simons, R.; Monick, M.; Brody, G.; Tan, K.; Philibert, R. The effect of smoking on DNA methylation of peripheral blood mononuclear cells from African American women. BMC Genom. 2014, 15, 151.
  42. Schalkwyk, L.C.; Meaburn, E.L.; Smith, R.; Dempster, E.L.; Jeffries, A.R.; Davies, M.N.; Plomin, R.; Mill, J. Allelic skewing of DNA methylation is widespread across the genome. Am. J. Hum. Genet. 2010, 86, 196–212.
  43. Infinium HumanMethylation450K BeadChip Product Files. Available online: (accessed on 26 April 2020).
  44. Yuan, V.; Price, E.M.; Del Gobbo, G.; Mostafavi, S.; Cox, B.; Binder, A.M.; Michels, K.B.; Marsit, C.; Robinson, W.P. Accurate ethnicity prediction from placental DNA methylation data. Epigenetics Chromatin 2019, 12, 1–14.
  45. Lövkvist, C.; Dodd, I.B.; Sneppen, K.; Haerter, J.O. DNA methylation in human epigenomes depends on local topology of CpG sites. Nucleic Acids Res. 2016, 44, 5123–5132.
  46. Wu, J.; Tang, B.; Tang, Y. Allele-specific genome targeting in the development of precision medicine. Theranostics 2020, 10, 3118.
  47. Zhang, Y.; Saum, K.-U.; Schöttker, B.; Holleczek, B.; Brenner, H. Methylomic survival predictors, frailty, and mortality. Aging (Albany NY) 2018, 10, 339.
  48. Philibert, R.; Miller, S.; Noel, A.; Dawes, K.; Papworth, E.; Black, D.W.; Beach, S.R.; Long, J.D.; Mills, J.A.; Dogan, M. A Four Marker Digital PCR Toolkit for Detecting Heavy Alcohol Consumption and the Effectiveness of Its Treatment. J. Insur. Med. 2019, 48, 90–102.
  49. Philibert, R.; Dogan, M.; Beach, S.R.; Mills, J.A.; Long, J.D. AHRR methylation predicts smoking status and smoking intensity in both saliva and blood DNA. Am. J. Med. Genet. Part B Neuropsychiatr. Genet. 2020, 183, 51–60.
  50. Dupras, C.; Beck, S.; Rothstein, M.A.; Berner, A.; Saulnier, K.M.; Pinkesz, M.; Prince, A.E.; Liosi, S.; Song, L.; Joly, Y. Potential (mis) use of epigenetic age estimators by private companies and public agencies: Human rights law should provide ethical guidance. Environ. Epigenetics 2019, 5, dvz018.
Figure 1. The Distribution of the European Ancestry Index (EAI) in the FACHS Adult Subjects (n = 505). A score of 1 indicates the presence of only alleles enriched in African subjects, while a score of 3 indicates the presence of alleles enriched in European subjects.
Table 1. The thirty probes most significantly associated with ancestry.

| Illumina Probe ID | t-Test p-Value | BF Corrected | CHR | SNPs within 50 bp * | SNPs within 10 bp ** |
| --- | --- | --- | --- | --- | --- |
| cg08654655 | 2.28 × 10^−15 | 1.17 × 10^−12 | 1 | | |
| cg18771300 | 3.13 × 10^−15 | 1.61 × 10^−12 | 14 | | |
| cg15344028 | 9.86 × 10^−15 | 5.06 × 10^−12 | 2 | | |
| cg02016419 | 2.05 × 10^−12 | 1.05 × 10^−9 | 17 | | |
| cg12402251 | 4.56 × 10^−12 | 2.35 × 10^−9 | 8 | | |
| cg04718414 | 5.78 × 10^−12 | 2.96 × 10^−9 | 13 | rs17337675 | |
| cg08251399 | 1.81 × 10^−11 | 9.29 × 10^−9 | 2 | | |
| cg12864235 | 2.6 × 10^−11 | 1.34 × 10^−8 | 5 | | |
| cg09799873 | 2.26 × 10^−10 | 1.16 × 10^−7 | 19 | | rs73925316 |
| cg00862290 | 4.37 × 10^−10 | 2.24 × 10^−7 | 3 | | |
| cg06638451 | 5.77 × 10^−10 | 2.96 × 10^−7 | 3 | rs17059410 | |
| cg19566405 | 7.70 × 10^−10 | 3.95 × 10^−7 | 17 | | |
| cg13509147 | 7.77 × 10^−10 | 3.99 × 10^−7 | 19 | | |
| cg20066677 | 1.10 × 10^−9 | 5.62 × 10^−7 | 12 | | |
| cg16713727 | 3.18 × 10^−8 | 1.63 × 10^−5 | 1 | | |
| cg11618577 | 3.49 × 10^−8 | 1.79 × 10^−5 | 2 | | |
| cg12813792 | 4.94 × 10^−8 | 2.53 × 10^−5 | 20 | | |
| cg04836038 | 7.23 × 10^−8 | 3.71 × 10^−5 | 13 | | |
| cg27187881 | 8.21 × 10^−8 | 4.21 × 10^−5 | 22 | | |
| cg10795646 | 9.29 × 10^−8 | 4.77 × 10^−5 | 1 | | |
| cg15201877 | 2.32 × 10^−7 | 0.0001191 | 1 | | |
| cg17133388 | 4.10 × 10^−7 | 0.0002104 | 3 | | |
| cg13119609 | 4.64 × 10^−7 | 0.0002381 | 9 | | |
| cg22736354 | 4.96 × 10^−7 | 0.0002545 | 6 | rs28940575 | |
| cg23159337 | 8.61 × 10^−7 | 0.0004416 | 3 | rs34959916 | |
| cg24125648 | 1.21 × 10^−6 | 0.0006181 | 5 | rs75056397 | |
| cg15963417 | 1.38 × 10^−6 | 0.0007092 | 12 | rs62652660 | |
| cg09304040 | 1.58 × 10^−6 | 0.0008097 | 12 | | |
| cg09404633 | 1.82 × 10^−6 | 0.0009359 | 1 | | |
| cg10570177 | 2.56 × 10^−6 | 0.001311 | 19 | rs36223203 | |

BF: Bonferroni; CHR: chromosome. * and ** refer to the presence of polymorphisms within 50 and 10 base pairs, respectively, of the CpG targeted by the probe.
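
The BF-corrected column in Table 1 is consistent with multiplying each uncorrected p-value by the 513 PhenoAge probes tested. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the function name `bonferroni` and the cap at 1.0 are illustrative assumptions, and only the first three rows of Table 1 are used.

```python
# Number of Levine PhenoAge probes tested (513, as stated in the abstract).
N_PROBES = 513


def bonferroni(p_values, n_tests=N_PROBES):
    """Return Bonferroni-adjusted p-values (p * n_tests), capped at 1.0.

    Hypothetical helper for illustration; not taken from the article.
    """
    return [min(1.0, p * n_tests) for p in p_values]


# Uncorrected p-values from the first three rows of Table 1.
raw = {
    "cg08654655": 2.28e-15,
    "cg18771300": 3.13e-15,
    "cg15344028": 9.86e-15,
}

adjusted = dict(zip(raw, bonferroni(raw.values())))

for probe, p_adj in adjusted.items():
    # Three significant figures, matching the precision reported in the table.
    print(f"{probe}: {p_adj:.2e}")
```

Formatted to three significant figures, the adjusted values for these rows agree with the BF Corrected column (1.17 × 10^−12, 1.61 × 10^−12, and 5.06 × 10^−12).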
Table 2. The association of methylation at the 11 most significantly associated probes with EAI Score.

| ID | p-Value | Associated with Smoking in FHS a |
| --- | --- | --- |

a As shown in Mills et al. (2019) [24]. FHS: Framingham Heart Study.

Share and Cite

MDPI and ACS Style

Philibert, R.; Beach, S.R.H.; Lei, M.-K.; Gibbons, F.X.; Gerrard, M.; Simons, R.L.; Dogan, M.V. Array-Based Epigenetic Aging Indices May Be Racially Biased. Genes 2020, 11, 685.

AMA Style

Philibert R, Beach SRH, Lei M-K, Gibbons FX, Gerrard M, Simons RL, Dogan MV. Array-Based Epigenetic Aging Indices May Be Racially Biased. Genes. 2020; 11(6):685.

Chicago/Turabian Style

Philibert, Robert, Steven R.H. Beach, Man-Kit Lei, Frederick X. Gibbons, Meg Gerrard, Ronald L. Simons, and Meeshanthini V. Dogan. 2020. "Array-Based Epigenetic Aging Indices May Be Racially Biased" Genes 11, no. 6: 685.
