Epigenome-Wide Association Study of Infant Feeding and DNA Methylation in Infancy and Childhood in a Population at Increased Risk for Type 1 Diabetes
Nutrients 2021, 13, 4057

We assessed associations between infant diet (e.g., breastfeeding and introduction to solid foods) and DNA methylation in infancy and childhood. We measured DNA methylation in peripheral blood collected in infancy (9–15 months of age) in 243 children; and in a subset of 50 children, we also measured methylation in childhood (6–9 years of age) to examine persistence, and at birth (in cord blood) to examine temporality. We performed multivariable linear regression of infant diet on the outcome of methylation using epigenome-wide and candidate site approaches. We identified six novel CpG sites associated with breastfeeding duration using an EWAS approach. One differentially methylated site presented directionally consistent associations with breastfeeding (cg00574958, CPT1A) in infancy and childhood but not at birth. Two differentially methylated sites in infancy (cg19693031, TXNIP; cg23307264, KHSRP) were associated with breastfeeding and were not present at birth; however, these associations did not persist into childhood. Associations between infant diet and methylation in infancy at three sites (cg22369607, AP001525.1; cg24092000, TBCD; cg27173510, PGBD5) were also present at birth, suggesting the influence of exposures other than infant diet. Infant diet exposures are associated with persistent methylation differences in CPT1A, which may be one mechanism behind infant diet’s long-term health effects.


Introduction
Breastfeeding and complementary feeding are early exposures that may have important long-term effects. Longer breastfeeding duration has been associated with decreased risk of obesity [1], type 2 diabetes (T2D) [2] and type 1 diabetes (T1D) [3]. It is also recommended to introduce solid foods by 6 months to avoid deficiencies in nutrients such as iron, protein, and vitamins [4]. Studies have shown that the timing of exposure to gluten in infancy is associated with the risk of celiac disease [5,6], although not consistently [7,8]. Earlier introduction to fruits and berries, root vegetables, and cereals is associated with T1D autoimmunity and T1D [9–12].
The mechanisms through which early dietary factors impact long-term health outcomes are still unknown, but could be through epigenetics, such as changes in DNA methylation [13]. DNA methylation utilizes DNA methyltransferase to methylate, or add a methyl group to, CpG (cytosine-phosphate-guanine) sites in DNA [14] to regulate gene expression. Several studies have examined the association between breastfeeding and DNA methylation in the child [15]. Total breastfeeding duration was associated with DNA methylation of the Leptin (LEP) and retinoid X receptor alpha (RXRA) genes in 1-year-old infants [16]. Moreover, the total and exclusive breastfeeding durations were associated with DNA methylation of the LEP gene in 10-year-old children [17]. In addition, breastfeeding duration >6 months was associated with methylation in the SNX25 and LINC00840 genes, and exclusive breastfeeding for >3 months was associated with methylation in the FDFT1 gene at 10 years of age, but not at 18 years [18]. Odintsova et al. found that methylation at CpG sites in the ZNF232, MUCL1, DSCR3 and ATG10 genes was associated with ever breastfeeding, but not with breastfeeding duration [19]. Finally, Hartwig et al. found that having ever been breastfed was associated with DNA methylation at the cg11414913 CpG at ages 7 and 15-17 years, but not at birth [20]. These studies examined only breastfeeding as a measure of infant diet and did not look at other aspects of the infant diet. This could be due to the high cost and challenge of gathering and compiling quality data about infant diet and the long-term follow-up required to measure methylation later in childhood. Since breastfeeding has been shown to influence DNA methylation, it is plausible that other aspects of infant diet have an effect as well.
Because DNA methylation plays a large part in the regulation of many genes, it is important to understand how infant diet changes DNA methylation across the genome. This study aims to address the gaps in knowledge on infant diet exposures and their effects on DNA methylation. We conducted an epigenome-wide association study (EWAS) with several infant diet exposures, such as breastfeeding duration and age at exposure to gluten-containing cereals, non-gluten-containing cereals, fruit, vegetables, and meat. In addition, we aimed to assess whether the associations found in infancy persisted into childhood and whether they were present at birth, which would indicate that factors other than infant diet exposures were at play.

Study Population
The study population was selected from the Diabetes Autoimmunity Study in the Young (DAISY), a prospective cohort of 2547 children at moderate to high risk for type 1 diabetes (T1D), which has been described previously [21,22]. In brief, this study recruited (1) children born in St. Joseph's hospital whose umbilical cord blood screened positive for T1D susceptibility alleles in the HLA-DR region or (2) children who had a first-degree relative with T1D. DAISY enrolled participants from 1993 through 2006 and is following them for the onset of autoimmunity and T1D. The Colorado Multiple Institutional Review Board approved all DAISY study protocols (COMIRB 92-080). Informed consent and assent, if appropriate, were obtained from the parents/legal guardians of all children prior to participation in any research-related activities. Infant diet exposures were measured with a questionnaire, administered when the infant was 3, 6, 9, 12, and 15 months of age, in which the parents were asked to report their infant's dietary exposures over the previous 3 months. The questionnaire included questions regarding breastfeeding duration and the type and timing of complementary food and beverage introduction. Study visits (for the collection of blood samples) were conducted at 9, 15, and 24 months and annually thereafter to determine the presence of islet autoantibodies in serum.
DAISY conducted a nested case-control study of 413 autoantibody-positive and autoantibody-negative children in whom DNA methylation in blood was measured at multiple timepoints throughout childhood. For our primary analysis, we selected the 243 children that had methylation measured during infancy (i.e., between 9 and 15 months of age) (Figure 1). We then investigated (1) whether the methylation differences persisted into childhood, and (2) whether the methylation differences detected in infancy pre-dated the infant diet exposures by examining cord blood methylation as a 'negative control' [23], with the reasoning that if the associations were present at birth, prenatal exposures or inherited differences related to infant diet behaviors were responsible rather than the dietary exposures themselves. To do this, we selected a subsample of 50 children that had methylation data at birth, in infancy and in childhood.

Figure 1. DAISY cohort (n = 2547); nested case-control study with methylation and diet data (n = 413); methylation measures in infancy (n = 243); methylation measures at birth, infancy, and childhood (n = 50).

Measurement of DNA Methylation
DNA methylation was profiled in peripheral whole blood using the Infinium Human Methylation 450K Beadchip (Illumina, San Diego, CA, USA, "450 K") or the Infinium Human Methylation EPIC Beadchip ("EPIC"). Children were randomly assigned to either the 450 K group (which included duplicate samples for quality control) or the EPIC group (which included replicates from the 450 K set for quality control). All visits of each individual were included on the same chip and in the same run. Both sets of data underwent identical pre-processing using the SeSAMe pipeline [24], and measurement platform (450 K or EPIC) was included as a covariate in all statistical models to account for technological batch effects. Johnson and colleagues performed quality control and removed poor quality samples and probes (see Johnson et al. for details) [25,26]. This resulted in 199,243 quality DNA methylation probes that were measured on both the 450 K and EPIC platforms. Normalized M values were used in all statistical analyses.
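The text states that M values were analyzed but does not spell out the transformation. As a minimal sketch, assuming the conventional definition of an M value as the log2 ratio of methylated to unmethylated signal, expressed through the beta value, the conversion could look as follows. The function names and the clamping constant `eps` are illustrative assumptions and are not taken from the SeSAMe pipeline.

```python
import math


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a methylation beta value (proportion, 0..1) to an M value.

    Assumes the conventional definition M = log2(beta / (1 - beta)).
    """
    # Clamp away from 0 and 1 so the logarithm stays finite
    # (eps is a hypothetical guard, not a value from the paper).
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m: float) -> float:
    """Inverse transform: map an M value back to a beta value."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

On this scale, a beta value of 0.5 corresponds to an M value of 0, and M values are unbounded in both directions, which is one common reason for preferring them in linear models.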

Infant Diet Variables
This study was designed as a prospective cohort to assess the association between infant diet variables and DNA methylation. We assessed six infant diet variables: breastfeeding duration (in months), exclusive breastfeeding duration (in months), and age at introduction to gluten-containing cereals (wheat, barley, rye), non-gluten-containing cereals (rice, oat), fruit (not including fruit juice), vegetables, and meat. The age at introduction variables were categorized, using American Academy of Pediatrics (AAP) guidelines regarding complementary feeding and associations previously found in DAISY [9], into three categories: introduction prior to 4 months of age, introduction from 4 to 5 months of age, and introduction at 6 or more (≥6) months of age. The 4-5 months of age category was used as the referent group. The age at introduction to meat and gluten-containing cereals variables were dichotomized into <6 months and ≥6 months (as the referent group) due to small sample sizes in the <4 months age category.
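The categorization described above can be illustrated with a short sketch. It assumes that "4 to 5 months" means an age of at least 4 and less than 6 months; the function name, the category labels, and the `dichotomize` flag are hypothetical and are used here only for illustration.

```python
def introduction_category(age_months: float, dichotomize: bool = False) -> str:
    """Assign an age-at-introduction category.

    Three-level scheme: "<4", "4-5" (referent), ">=6".
    Two-level scheme (used in the text for meat and gluten-containing
    cereals): "<6", ">=6" (referent).
    Assumption: "4 to 5 months" covers 4 <= age < 6.
    """
    if age_months < 0:
        raise ValueError("age_months must be non-negative")
    if dichotomize:
        return "<6" if age_months < 6 else ">=6"
    if age_months < 4:
        return "<4"
    if age_months < 6:
        return "4-5"
    return ">=6"
```

For example, an infant first given fruit at 5 months would fall in the "4-5" referent category, whereas an infant first given meat at 5 months would fall in the "<6" category of the dichotomized variable.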

Statistical Analyses
We performed an EWAS to detect novel methylation sites associated with infant diet exposures at 9-15 months of age (infancy) in 243 children. In addition, we selected 17 CpG sites that had been significantly associated with breastfeeding in the previous literature and tested whether these CpG sites were associated with the infant diet variables in infancy in our cohort. We were unable to determine the actual CpG sites that were significant in the Pauwels et al. 2019 paper, as these were listed as 'CpG2' and 'CpG3'. Upon request, the authors provided the locations of these two regions and from these we selected 2 CpGs (for CpG2) and 5 CpGs (for CpG3) that were in our dataset and that were within 1000 kb at either end of the region as candidates for this analysis. The previous publications and the selected CpGs from each include Pauwels et al. 2019 [16] (cg00666422, cg12782180, cg13381984, cg14204281, cg19594666, cg24341498, cg26814075); Odintsova et al. 2019 [19] (cg03995300, cg11287055, cg16387046, cg16704958, cg27284194); Sherwood et al. 2019 [17] (cg03084214, cg23753947); Sherwood et al. 2020 [18] (cg04957663, cg14723566); and Hartwig et al. 2020 [20] (cg11414913).
We estimated the proportion of CD4+ T cells, CD8+ T cells, B cells, natural killer cells, granulocytes, and monocytes in each sample using the whole blood reference set [27,28]. We used multiple linear regression with infant diet variables as the exposure and DNA methylation as the outcome. We adjusted for age at methylation measure, sex, race/ethnicity (non-Hispanic white (NHW) vs. other), estimated cell proportions and methylation platform (450 K or EPIC) in all models. We assessed gestational age category (pre-term, term, post-term, based on maternal report) and birth weight as potential confounders, but they did not meet the classical definition of confounding and were therefore not included in the final models. For the EWAS, we used a false discovery rate (FDR) of <0.1 to determine significant associations. For the candidate CpGs from the previous literature, we used a nominal p-value of <0.05 to determine significance.
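The text does not name the FDR procedure that was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, adjusted p-values for a set of per-CpG tests could be computed as follows; the function name is illustrative and the example p-values in the usage note are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here for illustration only; the paper states
    an FDR threshold but not the procedure.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, scaling each by
    # m / rank and enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this sketch, a site would be called significant in the EWAS when its adjusted value is below 0.1, for example `[q < 0.1 for q in benjamini_hochberg(pvals)]` for a hypothetical list `pvals` of per-site p-values.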

Analysis of Methylation Associations in Childhood and at Birth
The CpG sites that were associated with infant diet in infancy (9-15 months of age) were then analyzed in a subset of 50 children with methylation data at birth, in infancy and in childhood to examine if the associations seen in infancy were also present in childhood at 6-9 years of age, and if the associations present in infancy were present prior to the infant diet exposures. We focused on complete data (i.e., having all timepoints) so that the sample size was equivalent to avoid a power imbalance that would complicate interpretation of the results. We calculated a Bonferroni p-value cut-off of 0.0083, based on the 6 EWAS-significant CpGs that were tested in these analyses.
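The cut-off of 0.0083 is consistent with dividing a family-wise significance level of 0.05 by the 6 CpGs tested. The 0.05 level is inferred from the arithmetic and is not stated explicitly in the text; the function name below is illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed family-wise alpha of 0.05 across the 6 EWAS-significant CpGs.
threshold = bonferroni_threshold(0.05, 6)
print(f"{threshold:.4f}")  # 0.0083
```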

Results
Our primary study population consisted of 243 children with DNA methylation measures in infancy (at 9-15 months of age), of whom 47% were female, and 79.4% were non-Hispanic white (NHW) (Table 1). The population of 50 children with birth, infancy, and childhood methylation data was 50% female and 78% NHW. The distributions of infant diet variables were similar across the two analysis populations (Table 2).

Methylation in Infancy
We first conducted an EWAS of the infant diet variables and methylation during infancy (9-15 months). We identified six sites (cg00574958, cg19693031, cg22369607, cg23307264, cg24092000, and cg27173510) that were associated with breastfeeding duration in months (FDR < 0.10) (Figure 2 and Table 3). The quantile-quantile (Q-Q) plot for breastfeeding duration shows no strong indication of genome-wide inflation (genomic inflation factor of 1.1) [29]. None of the other infant diet variables were associated with methylation in infancy (FDR ≥ 0.10).

Figure 2. Manhattan plot (a) and Q-Q plot (b) for the epigenome-wide association study (EWAS) of breastfeeding duration in infancy (9-15 months). Breastfeeding duration was a continuous variable, in months. Models were adjusted for age at methylation measure, sex, race/ethnicity, cell composition, and platform. In the Manhattan plot (a), the six named CpG sites are those that were associated with breastfeeding duration at FDR < 0.10.
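The text reports a genomic inflation factor but not how it was computed. One common definition, assumed here, is the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null. The sketch below derives the chi-square statistics from two-sided p-values using only the Python standard library; the function name is illustrative.

```python
from statistics import NormalDist, median


def genomic_inflation_factor(pvalues):
    """Estimate lambda as median(observed chi2) / median(null chi2), 1 df.

    Assumption: p-values are two-sided, so each maps to a 1-df
    chi-square statistic equal to the squared normal quantile at p / 2.
    """
    nd = NormalDist()
    chi2 = []
    for p in pvalues:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        z = nd.inv_cdf(p / 2.0)  # lower-tail quantile; sign is irrelevant
        chi2.append(z * z)
    # Median of a 1-df chi-square is the squared 75th normal percentile.
    null_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / null_median
```

Values close to 1 indicate that the bulk of the test statistics follows the null distribution, whereas values well above 1 point to systematic inflation.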

Confirmation of Associations from Previous Literature
Of the 17 selected CpG sites previously found to be associated with breastfeeding or breastfeeding duration, DNA methylation at 4 CpG sites near the Leptin gene (LEP) was associated with an infant diet variable in DAISY infants at 9-15 months. Higher methylation at cg13381984 and cg26814075 in infancy was significantly associated with shorter breastfeeding duration (p < 0.05), and higher methylation at cg00666422 and cg23753947 was associated with introduction to meat before 6 months of age (p < 0.05) (Table 3). No other associations with the previous candidate sites were detected.

Methylation in Childhood
Next, we examined whether the 10 associations between infant diet and methylation in infancy were present in childhood (6-9 years) in 50 children with methylation measures available at both timepoints (Table 4). We present results from both the infancy and childhood timepoints in this subset of 50 children for consistency and comparison. DNA methylation at 4 CpGs (cg00574958, cg22369607, cg24092000 and cg27173510) was significantly associated with breastfeeding duration at the childhood timepoint with a similar direction of effect as that seen in the original 243 children as well as in the 50 children in the overlapping subset at the infancy timepoint. None of the associations between breastfeeding duration or meat introduction and DNA methylation in LEP in infancy were present in childhood, nor were they present in infancy in this subsample of 50 children.

Table 4. Examination of whether the significant infant diet and DNA methylation associations in infancy are present in childhood and/or at birth in a subsample of 50 children with methylation data at birth, infancy and childhood.

Methylation at Birth
Finally, we tested the 10 CpG sites in cord blood samples to determine whether the associations between infant diet and DNA methylation in infancy were present at birth (i.e., prior to the infant diet exposures) (Table 4). DNA methylation at three CpGs (cg22369607, cg24092000, and cg27173510) was significantly associated with breastfeeding duration at the birth timepoint with a similar direction of effect as that seen in the original 243 children as well as in the 50 children in the overlapping subset at the infancy timepoint. DNA methylation at cg26814075 at birth was marginally associated with breastfeeding duration with a similar direction of effect as that seen in the original infancy analysis. Neither of the two CpGs associated with age at introduction to meat in the original infancy analysis was associated at the birth timepoint.

Discussion
Compelling evidence of an effect of infant diet exposures on the epigenome would be an association that is present in infancy and childhood but absent at birth. In our EWAS, we found that longer breastfeeding duration was associated with lower peripheral blood methylation at cg00574958 in CPT1A in infancy (9-15 months) and childhood (6-9 years), but not at birth.
CPT1A encodes the key enzyme (carnitine palmitoyltransferase 1A) in the carnitine-dependent multistep process that breaks down (metabolizes) fats and converts them into energy. Methylation at cg00574958 in CPT1A is associated with type 2 diabetes (T2D) [30,31], triglyceride levels [32], and blood pressure [33]. The expression of CPT1A in obese and overweight children is higher than that in normal weight children [34], and breastfeeding is associated with a decrease in CPT1A expression [35]. This, coupled with our findings, suggests that the long-term metabolic effects of breastfeeding may be through changes in methylation and expression of CPT1A.
Several studies have reported an inverse association between the duration of breastfeeding and methylation of CpG sites within LEP [16,17,36]. LEP encodes the hormone leptin, which is important in the regulation of energy intake by controlling satiety. We selected seven sites in LEP as significant candidates from the previous literature, and found that breastfeeding duration was inversely associated with methylation at cg13381984 and cg26814075 within LEP in infancy, and also that introduction to meat before 6 months was associated with increased methylation at cg00666422 and cg23753947 in LEP during infancy. While no other study has examined meat introduction with regard to LEP, the hypothesis would be that increased methylation would lead to decreased expression of LEP, thus increasing hunger leading to increased obesity and other metabolic disease [37]. These associations did not persist into childhood in our subsample of 50 children, but this may have been due to low power for cg13381984, cg26814075 and cg00666422, since the beta estimates were similar or only slightly attenuated in childhood, and even the infancy methylation associations became nonsignificant when tested in this small subsample.
Our study provides suggestive evidence of an association between breastfeeding duration and methylation at cg19693031 (TXNIP) and cg23307264 (KHSRP) in infancy but not at birth. These associations were not present in childhood, suggesting that the effect of infant diet did not persist. It is not clear whether these associations are merely false positives or truly transient effects. For TXNIP and KHSRP, it is unlikely that this lack of persistence is due to low power, since the estimates in childhood were opposite in direction to those in infancy. While methylation in TXNIP has been associated with type 2 diabetes [30,31], it has not been associated with breastfeeding or other infant diet exposures. Similarly, KHSRP has not been associated with infant diet exposures.
We tested methylation in cord blood to determine whether associations found in infancy were present at birth before any infant diet exposure. Methylation at cg22369607 (AP001525.1), cg24092000 (TBCD), and cg27173510 (PGBD5) was significantly associated with breastfeeding duration in both infancy and childhood, but also at birth. This could indicate that prenatal exposures correlated with the mother's decision to breastfeed longer lead to methylation changes. An alternative explanation may be that the child inherited methylation marks associated with the breastmilk exposure of the mother, and that this exposure is associated with the decision of the mother to breastfeed her child. Further studies are needed to elucidate these potential explanations.
Limitations of this study were the small sample sizes available for the childhood and birth analyses. Moreover, the population was largely non-Hispanic white and at increased risk for type 1 diabetes, which affects the generalizability of the study results. Additionally, this study analyzed peripheral blood, which may not represent tissue-specific methylation, although we have adjusted for estimated cell proportions in all analyses.

Conclusions
This study has provided further evidence to support a role for infant diet in shaping a child's epigenome. Our findings point to CPT1A as a candidate gene influenced by breastfeeding for further investigation. Future research should include large cohorts or consortia [38] to more accurately capture persistence of methylation associated with infant diet, and studies that include a more diverse population. Other areas of study include testing additional epigenetic mechanisms, such as histone modification, to determine if these are associated with infant diet.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in this study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because they contain protected health information.