Metabolome Alterations Linking Sugar-Sweetened Beverage Intake with Dyslipidemia in Youth: The Exploring Perinatal Outcomes among CHildren (EPOCH) Study

The objective of this study was to assess intermediary metabolic alterations that link sugar-sweetened beverage (SSB) intake to cardiometabolic (CM) risk factors in youth. A total of 597 participants from the multi-ethnic, longitudinal Exploring Perinatal Outcomes among CHildren (EPOCH) Study were followed in childhood (median 10 yrs) and adolescence (median 16 yrs). We used a multi-step approach: first, mixed models were used to examine the associations of SSB intake in childhood with CM measures across childhood and adolescence, which revealed a positive association between SSB intake and fasting triglycerides (β (95% CI) for the highest vs. lowest SSB quartile: 8.1 (−0.9,17.0); p-trend = 0.057). Second, least absolute shrinkage and selection operator (LASSO) regression was used to select 180 metabolite features (out of 767 features assessed by untargeted metabolomics) that were associated with SSB intake in childhood. Finally, 13 of these SSB-associated metabolites (from step two) were also prospectively associated with triglycerides across follow-up (from step one) in the same direction as with SSB intake (Bonferroni-adj. p < 0.0003). All annotated compounds were lipids, particularly dicarboxylated fatty acids, mono- and diacylglycerols, and phospholipids. In this diverse cohort, we identified a panel of lipid metabolites that may serve as intermediary biomarkers, linking SSB intake to dyslipidemia risk in youth.


Introduction
Excess consumption of sugar-sweetened beverages (SSBs), which typically include sodas, fruit-flavored drinks, energy drinks, sports drinks, and sweetened teas/coffees, has been consistently associated with rates of overweight and obesity among youth over recent decades [1][2][3]. This associated weight gain may result from the energy-dense, liquid nature of SSBs, resulting in weaker satiation and compensatory food responses, or activation of the hedonic food reward system [4]. SSB intake has also been associated with cardiometabolic risk factors in childhood and adolescence, including markers of insulin resistance, inflammation, and dyslipidemia, independent of energy intake [5-1…]. Proposed mechanisms include the higher glycemic index of SSBs, which may contribute to an increased insulin secretory response [11], or associations of added sugars in SSBs, especially fructose, with hepatic de novo lipogenesis and ectopic liver fat [12][13][14][15][16], an effect that has been observed even when weight is held stable [13]. A critical future direction is, therefore, better understanding the pathophysiological disturbances associated with SSBs, as this may also shed light on objective biomarkers of intake of SSBs, a food group for which social norms may bias self-reported intake data.
Metabolomics is an evolving science that involves the comprehensive measurement of low molecular weight molecules, or metabolites, in biological samples. This includes endogenous compounds, which serve as products, intermediates, and substrates of chemical reactions in human metabolism, as well as exogenous compounds, which reflect environmental exposures [17]. In adults, a study by Gibbons et al. leveraged untargeted metabolomics to identify a set of metabolites (formate, citrulline, taurine, isocitrate) that showed promise as biomarkers of SSB intake [18], though this study did not link these metabolites to any health outcomes. An alternative strategy is to evaluate the metabolome alterations linking SSB intake and cardiometabolic risk. This may be achieved by using a "meet-in-the-middle" approach [19], an analytical framework that aims to identify the functional biomarkers that mark the relationship between an exposure and a health outcome, thereby providing insight into pathophysiological alterations potentially attributable to the exposure. For example, in a cross-sectional cohort of Mexican youth, Perng et al. identified two metabolites (urate and nonanoate) that marked the relationship between SSB intake and higher blood pressure in girls [20]. While these findings serve as a foundation for understanding the link between SSB intake and one cardiometabolic risk factor in youth, additional prospective studies are needed to further explore this pathway and establish the temporality of associations.
In this study, we investigated the intermediary metabolic alterations that mark the relationship between SSB intake in childhood and cardiometabolic risk factors measured prospectively across childhood and adolescence. This was done using data from the Exploring Perinatal Outcomes among CHildren (EPOCH) study, a longitudinal cohort of diverse youth, and by employing a multi-step conceptual framework, summarized in Figure 1, which integrated data collected during childhood and/or adolescence on diet, metabolomics, and conventional cardiometabolic measures [fasting glucose and insulin, insulin resistance (assessed using the homeostatic model of assessment), HDL cholesterol, fasting triglycerides, and systolic blood pressure].

Figure 1. Conceptual diagram of the "meet in the middle" approach used to identify metabolites that mark the relationship between sugar-sweetened beverage (SSB) intake and cardiometabolic risk. (a) In step 1, we tested prospective associations of SSB intake in childhood with cardiometabolic risk factors across childhood and adolescence. (b) In step 2, we identified plasma metabolites in childhood associated with SSB intake in childhood. (c) In step 3, we tested whether SSB-associated metabolites (from step 2) were also prospectively associated with SSB-associated CM risk factors across childhood and adolescence (from step 1). Abbreviations: FFQ, food frequency questionnaire; SSB, sugar-sweetened beverage.

Characteristics
Background characteristics of the sample by quartile of energy-adjusted SSB intake in childhood (visit one) are shown in Table 1. The mean age of participants at the childhood visit increased across SSB quartiles (mean age ± SD for highest vs. lowest SSB quartile: 10.7 ± 1.5 vs. 10.2 ± 1.5, respectively; p = 0.023). There were also differences in the racial/ethnic distribution across SSB quartiles (p < 0.001), whereby the highest quartile versus the lowest quartile had a higher percentage of Hispanic participants (52 vs. 29%, respectively), but fewer non-Hispanic white participants (33 vs. 60%, respectively).
Notes to Table 1: 1 SSB intake quartiles were determined based on energy-adjusted SSB intake using the residual method. 2 Differences in each child's characteristics by SSB intake quartile were assessed using analysis of variance (ANOVA) for continuous characteristics and Chi-squared tests for categorical characteristics. 3 SSB intake was energy-adjusted using the residual method. Abbreviations: SSB, sugar-sweetened beverages; BMI, body mass index; GDM, gestational diabetes mellitus.

Associations of SSB Intake in Childhood with Cardiometabolic Measures
The means and standard deviations for each cardiometabolic measure of interest at study visits in childhood and adolescence are shown in Table S1; generally, mean values for each cardiometabolic measure increased from childhood to adolescence, except for HDL cholesterol, which decreased. In linear mixed models adjusted for age, sex, and race/ethnicity, SSB intake in childhood was associated with higher fasting triglycerides across childhood/adolescence (β (95% CI) for highest versus lowest SSB intake quartile: 8.1 (−0.9, 17.0) mg/dL; p-trend = 0.057; Table 2). There were no notable associations between SSB intake in childhood and the other cardiometabolic risk factors (fasting glucose, fasting insulin, HOMA-IR, HDL cholesterol, or systolic blood pressure; Table 2). Thus, downstream analyses focused on triglycerides as the primary SSB-associated cardiometabolic outcome of interest.
Notes to Table 2: 1 SSB intake quartiles were determined based on energy-adjusted SSB intake using the residual method. 2 Estimates based on mixed-effects models adjusted for participant age across visits, sex, and race/ethnicity (Hispanic, non-Hispanic white, non-Hispanic black, or non-Hispanic other). All models included a participant-specific random intercept. 3 Linear trends across quartiles were assessed using the median value for each quartile. Abbreviations: HOMA-IR, homeostatic model of assessment for insulin resistance; HDL, high-density lipoprotein.

Associations of SSB Intake in Childhood with Plasma Metabolites
A total of 180 metabolites (out of 767 metabolites measured by the untargeted metabolomics assay) were selected in LASSO regression with log-transformed SSB intake as the dependent variable, whereby selection was defined as having a non-zero coefficient in ≥40% of the bootstrap samples, which was a data-driven threshold described in more detail in the Methods section under "Statistical Analyses". Among these selected metabolites, 136 (76%) had confirmed identities. The metabolites selected in LASSO regression with SSB intake are summarized in Table 3, which also reports the number of times that the metabolite was selected across bootstrap samples (whereby "selection" was defined as having a non-zero regression coefficient) and its average β-coefficient across bootstrap samples. The selected metabolites were primarily lipids (45 metabolites, 25%), amino acid-related (44 metabolites, 24%), or xenobiotics (21 metabolites, 12%).
Notes to Table 3: 3 Indicates the number of times the metabolite was selected (based on having a non-zero coefficient) across 100 bootstrap samples. Metabolites shown here were selected in ≥40% of bootstrap samples. * Indicates tier 2 identification in which no commercially available authentic standard can be found; however, it was annotated based on accurate mass, spectral, and chromatographic similarity to tier 1-identified compounds.

Associations of SSB-Related Metabolites in Childhood with Triglycerides
In Table 4, we show the subset of SSB-associated metabolites (from step two, reported in Table 3) that were also associated with fasting triglycerides (the primary cardiometabolic outcome of interest from step one; Table 2) across childhood and adolescence in linear mixed models adjusted for age, sex, and race/ethnicity. Of the 180 metabolites associated with SSB intake in childhood, 13 metabolites were also associated with triglycerides across childhood and adolescence in the same direction as their association with SSB intake in childhood, based on a Bonferroni-adjusted p < 0.00028 (α = 0.05/180 metabolites; Table 4). Among these 13 metabolites, 11 were lipid metabolites, while the remaining 2 were unknown metabolites (Table 4). The identified metabolites included four dicarboxylated fatty acids (chain lengths ranging from 10 to 16), a lactosylceramide, and a plasmalogen, which were inversely associated with triglycerides (Table 4). There were also several mono- and diacylglycerols and phospholipids, which were positively associated with triglycerides (Table 4). Scatter plots visualizing the relationship between these lipid metabolites in childhood and triglycerides across childhood and adolescence are shown in Figure S1. Table S2 summarizes the full results from this step, i.e., associations between all 180 SSB-associated metabolites in childhood and triglycerides across childhood and adolescence, based on linear mixed models.
Metabolites 2022, 12, 559
Table 4. Prospective associations between the selected metabolites associated with sugar-sweetened beverage intake in childhood and fasting triglycerides across childhood and adolescence.

Sensitivity Analyses
In sensitivity analyses, we observed similar findings when we additionally adjusted each step for pubertal stage, which may be a mediator of the relationship between SSB intake and the cardiometabolic outcomes. Specifically, in step one, the association between SSB intake in childhood and triglycerides across childhood and adolescence was similar in magnitude and direction (β (95% CI): 7.4 (−1.5, 16.4) mg/dL for the highest versus lowest quartile of SSB intake; p-trend = 0.075). In step two, the LASSO regression selected a similar set of metabolites that were associated with SSB intake in childhood (170 metabolites were selected in ≥40% of bootstrap samples, among which 153 (90%) were also selected in LASSO regression before adjusting for pubertal stage). In step three, the subset of SSB-associated metabolites from step two that were also associated with triglycerides across childhood and adolescence consisted of 11 of the 13 metabolites selected in the primary analysis above without adjusting for pubertal stage (two of the dicarboxylated fatty acids were not selected); these are summarized in Table S3.

Discussion
In this longitudinal cohort of diverse youth, we integrated data on diet, untargeted metabolomics, and conventional cardiometabolic risk factors using a "meet in the middle" approach to identify plasma metabolite alterations that link SSB intake in childhood with cardiometabolic risk across childhood and adolescence. First, we found that energy-adjusted SSB intake was associated with higher fasting triglycerides across childhood and adolescence in this sample, but not with other cardiometabolic risk factors (glucose, insulin, HOMA-IR, HDL cholesterol, or systolic blood pressure). Subsequently, we used robust selection criteria to identify 13 plasma metabolites that were associated both with SSB intake in childhood and with triglycerides across childhood and adolescence with the same directionality. All the metabolites that could be identified (11 of 13) were lipid-related metabolites, particularly dicarboxylated fatty acids, mono- and diacylglycerols, and phospholipids. This pattern of metabolite alterations may, therefore, reflect disruptions in lipid metabolism that causally link higher SSB intake in childhood with dyslipidemia risk, which will need to be investigated in future studies.
In the first step of the analysis, we found that childhood SSB intake was associated with higher triglycerides across childhood and adolescence, consistent with the literature [21]. This effect may be due to unregulated hepatic fructose metabolism [22], which can lead to hepatic substrate overload and increased hepatic de novo lipogenesis [23,24]. Fructose intake may also alter hepatic lipid metabolism by stimulating lipogenic gene expression via the activation of several transcriptional activator families, including carbohydrate-responsive element-binding protein (ChREBP), sterol regulatory element-binding protein (SREBP), and peroxisome proliferator-activated receptor (PPAR) [23,24], as well as indirectly by inhibiting fat oxidation [25]. Over time, these metabolic alterations contribute to intrahepatic lipid accumulation, which can promote a compensatory increase in the production and secretion of very-low-density lipoproteins (VLDLs), leading to higher plasma triglycerides, as we observed in this study. It was unexpected that we did not find significant associations of SSB intake in childhood with any of the other cardiometabolic risk factors. However, this aligns with the findings from a few other studies of relatively healthy populations, which also found a preferential association of SSB intake with lipids, but not with other markers of glucose-insulin homeostasis [26]. It is possible that disruptions in fasting triglycerides are a 'first step' in the metabolic milieu associated with SSB intake and that more prolonged exposure is needed to observe other associations, particularly among otherwise healthy youth.
After evaluating prospective associations between SSB intake and cardiometabolic outcomes, we identified the 180 plasma metabolites that were most strongly associated with SSB intake at the childhood visit using LASSO regression. LASSO is a data-driven, multivariate analytical technique that is well suited for dimension reduction and feature selection. These 180 metabolites were primarily related to lipid, amino acid, and xenobiotic metabolism pathways. In comparing the findings of the present study with published literature, we noted that one metabolite selected by LASSO, taurine, was one of the four urinary metabolites identified as a biomarker of SSB intake in a study of adults in Ireland by Gibbons et al. [18]. This consistent association may be because taurine is often an added ingredient (along with caffeine) in sugar-sweetened energy drinks, although it is worth noting that circulating taurine may also be influenced by other food sources, such as meat and seafood, as well as endogenous taurine biosynthesis via methionine and cysteine metabolism [27]. The other three metabolites on the panel selected by Gibbons et al. (formate, isocitrate, and citrulline) were not selected by LASSO regression in this study, which may be due to differences in the biospecimen analyzed (i.e., urine versus blood) or the age range of the participants (i.e., adults versus children). We also note that none of the metabolites associated with SSB intake in this study overlapped with the metabolites associated with SSB intake in a similar analysis by Perng et al. based on the ELEMENT Project in Mexico City [20]. This may be due to differences in the geography of each cohort (including the composition and overall intake patterns of SSBs in the U.S. versus Mexico), as well as the untargeted metabolomics platforms employed (Metabolon in EPOCH vs. University of Michigan's Metabolomics Research Core in ELEMENT) and the nuances of the analytical strategies used in each study. Collectively, this highlights the challenges in identifying dietary biomarkers that are consistent across different populations.
In the final step of our analysis, we filtered the 180 metabolites associated with SSB intake in childhood to 13 metabolites that were also associated with triglycerides across childhood and adolescence, which was the primary cardiometabolic outcome of interest. Except for two unknowns, all of these metabolites were identified as lipids. Specifically, four metabolites were dicarboxylated fatty acids, which are fatty acids with two carboxyl groups that can be generated from ω-oxidation or plant/vegetable intake [28] and have been shown to be markedly lower in children with obesity versus controls [29]. In this study, we also found inverse associations of dicarboxylated fatty acids with SSB intake and triglycerides, potentially reflecting disruptions in fatty acid catabolism/oxidation, a plausible metabolic disturbance that could be mediated by fructose-induced alterations in PPAR-alpha activity [30] and, in turn, contribute to elevated triglycerides. Other identified metabolites were two monoacylglycerols and one diacylglycerol, which were positively associated with SSB intake and triglycerides. This finding corroborates other studies showing that the accumulation of these lipid intermediates is associated with poorer cardiometabolic health [31,32], possibly via disruptions in insulin signaling and mitochondrial dysfunction [33,34], and, further, that their plasma levels were responsive to a healthy dietary pattern (Mediterranean diet) intervention [31].
Other lipid metabolites identified in this final step of our analysis included three glycerophospholipids and one lactosylceramide, a type of sphingolipid. Phospholipid alterations assessed by metabolomics or lipidomics have been consistently associated with cardiometabolic diseases, including diabetes/prediabetes and cardiovascular disease [35,36]. One of the phospholipids identified was 1-stearoyl-2-oleoyl-GPC (18:0/18:1), a common phosphatidylcholine (PC) found in animal membranes that was positively associated with SSBs and triglycerides in this study. This aligns with findings from a dietary intervention in adults showing that the same PC was directly associated with changes in triglycerides following an 8-week low-calorie diet [37]. The potential mechanisms underlying these associations warrant further investigation but may reflect the close link between hepatic phospholipid metabolism and triglyceride packaging and secretion from the liver [38].
A limitation was our reliance on a self-reported dietary assessment to quantify SSB intake, which can be prone to several biases (recall bias, social desirability bias, etc.), especially in children/adolescents with obesity [39]. However, because we assessed the relationship between dietary intake and the cardiometabolic outcomes prospectively, it is less likely that any reporting bias was differential with respect to the outcome. Moreover, we adjusted SSB intake for total energy intake for analyses, which may mitigate measurement error and improve the precision of estimates [40]. The metabolomics assay used was semi-quantitative in nature; thus, additional, targeted assays would be needed to assess absolute concentrations for metabolites. In addition, the sample was from one geographic region (Colorado, USA), which may limit the generalizability of our findings. Strengths of this study include the longitudinal study design, which allowed us to assess prospective associations between SSB intake and metabolome alterations in childhood with repeated measures of cardiometabolic risk, and the use of a comprehensive untargeted metabolomics profiling technique combined with a multivariate method (LASSO regression) with proven utility to protect against false positive findings for variable selection/dimension reduction. Additionally, the extensive laboratory, metabolic, anthropometric, and behavioral assessments performed on the EPOCH cohort allowed us to adjust for various potential confounding factors, and the relatively large sample size (~600 participants), especially compared to other metabolomics datasets in youth cohorts, provided statistical power to assess potential effect modification by sex.

Conclusions
In a longitudinal, multiethnic cohort of children based in Colorado, we identified 13 metabolites, 11 of which were involved in lipid metabolic pathways, that mark the prospective relationship between SSB intake in childhood and fasting triglyceride levels across childhood and adolescence. These intermediary lipid metabolites may not only represent potential biomarkers of higher SSB intake in youth but also may reflect underlying metabolic disruptions that are causally involved in the adverse effects of SSB intake on plasma triglycerides, supporting their prioritization in future investigations. Specifically, in addition to other prospective studies validating the utility of these metabolites as SSB biomarkers, experimental studies are needed to further understand the potential mechanisms underlying this interplay between SSB intake, lipid metabolite disruptions, and dyslipidemia in youth. This study also adds to the growing body of literature supporting a link between SSB intake and cardiometabolic abnormalities in youth, which supports the importance of more research aiming to identify and target factors that may influence childhood SSB intake, such as parental/family-, school/community-, or policy-related factors [41][42][43].

Study Population
This study included participants from the EPOCH study, a longitudinal, multiethnic cohort of youth in Colorado. The original aim was to examine the health effects on offspring of in utero exposure to maternal gestational diabetes mellitus (GDM) [44]. Participants were the offspring of mothers who were members of the Kaiser Permanente of Colorado (KPCO) health plan. We enrolled children who were exposed to maternal diabetes in utero (n = 99) and a random sample of children who were not exposed to maternal diabetes (n = 505). The first research visit occurred from 2006 to 2009, when offspring were 6-14 years old (median 10.6 years; "visit 1"). Among these participants, 413 returned for a second visit from 2012 to 2015, when offspring were 12-19 years old (median 16.8 years; "visit 2"). Mothers provided written informed consent and child participants provided written assent. This study was approved by the Colorado Multiple Institutional Review Board (Protocol no. 05-0623).
For this analysis, we excluded two participants who were missing dietary data at visit 1 and five participants with insufficient blood volume for metabolomics analysis at visit 1, resulting in an analytical sample of 597 participants. In analyses examining the prospective associations with cardiometabolic risk factors across childhood and adolescence, we also excluded participants who were missing data at both visit 1 and visit 2 for the outcome of interest (i.e., no data at either visit): n = 1 missing glucose at both visits; n = 3 missing insulin and HOMA-IR at both visits; n = 5 missing HDL cholesterol at both visits; n = 1 missing triglycerides at both visits. Participant selection is summarized in Figure 2.

Dietary Assessment
Dietary intake was assessed at both visits using a modified version of the Block Kids Food Frequency Questionnaire [45], a semi-quantitative food frequency questionnaire that has been validated in children as young as 8 years old [46,47]. The questionnaire queried whether the participant consumed each food/beverage item in the past week and, if so, how many days per week (ranging from 'one day' to 'every day') and the usual amount eaten per day. The full questionnaire may be available upon reasonable request. Completed questionnaires were then analyzed to estimate individual intakes of total energy, nutrients, and food groups per day. For each participant, SSB intake in childhood was assessed by summing the servings per day from the following: sodas, fruit drinks (e.g., Sunny Delight, Hawaiian Punch), sports drinks (e.g., Gatorade), and sweetened tea or coffee. We used the residual method to adjust SSB intake for total energy intake per day [48]. Participants were then grouped based on their quartile of energy-adjusted SSB intake for later analyses.
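As an illustration of the energy adjustment described above, the following is a minimal Python sketch of the residual method followed by quartile grouping. It is not the code used in this study. The function names and the rank-based quartile rule are hypothetical, and adding back the predicted intake at the mean energy level is a common convention that is assumed here rather than stated in the text.

```python
from statistics import mean


def energy_adjust(intake, energy):
    """Residual method (sketch): regress intake on total energy, then return
    each residual plus the predicted intake at the mean energy level.
    Adding that constant back is an assumed convention."""
    x_bar, y_bar = mean(energy), mean(intake)
    sxx = sum((x - x_bar) ** 2 for x in energy)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(energy, intake))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    expected_at_mean = intercept + slope * x_bar
    return [y - (intercept + slope * x) + expected_at_mean
            for x, y in zip(energy, intake)]


def quartile_labels(values):
    """Assign labels 1..4 by rank so that groups are as equal in size as
    possible (hypothetical rule; ties are broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 4 // len(values) + 1
    return labels
```

When intake is exactly proportional to energy, every adjusted value collapses to the mean intake, which is the intended behavior: variation explained by total energy is removed before participants are grouped.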

Untargeted Metabolomics Profiling of Plasma
Fasting blood samples were collected from all participants at both visits by trained phlebotomists. All samples were refrigerated immediately, processed within 24 h, and stored at −80 °C until the time of analysis. As previously described [49,50], untargeted metabolomics profiling was performed on stored fasting plasma samples by Metabolon using a multiplatform mass spectrometry-based technique, which identified 1193 unique features at both time points. Prior to formal statistical analysis, we removed metabolites with ≥20% missing values per batch. The remaining missing values were imputed using the K-nearest neighbor algorithm (with K = 10) [51]. The samples were analyzed in two batches: the first batch of participants had 913 metabolites after removing those with high missingness, and the second batch had 898 metabolites. We then merged the two batches for subsequent data processing and retained 767 metabolites that were present in both batches. The retained metabolites then underwent log10-transformation, followed by metabolite normalization and correction for batch effects (as well as other biological and technical variability) using the remove unwanted variation method (with K = 2 factors of unwanted variation estimated from the data) [52]. All the above preprocessing steps were performed using R Statistical Software (version 3.5.3) [53].
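The filtering and merging logic above can be sketched in a few lines. The study performed these steps in R; the Python below is only an illustrative stand-in, with hypothetical function names and a hypothetical data layout (one dictionary per batch, mapping feature name to a list of values, with None marking a missing value). K-nearest neighbor imputation and the remove unwanted variation correction are not shown.

```python
import math


def keep_features(batch, max_missing=0.20):
    """Return the feature names whose share of missing values (None) in this
    batch is below max_missing, i.e., drop features with >= 20% missing."""
    kept = set()
    for name, values in batch.items():
        missing = sum(v is None for v in values) / len(values)
        if missing < max_missing:
            kept.add(name)
    return kept


def shared_features(batch_1, batch_2, max_missing=0.20):
    """Features that pass the missingness filter in both batches."""
    return keep_features(batch_1, max_missing) & keep_features(batch_2, max_missing)


def log10_transform(values):
    """log10-transform positive intensities (to be applied after imputation)."""
    return [math.log10(v) for v in values]
```

Under this layout, the 767 retained metabolites would correspond to the intersection returned by `shared_features` for the two batches.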

Cardiometabolic Risk Assessments
We used fasting blood samples from both visits to assess the following markers of cardiometabolic risk across childhood and adolescence. Fasting glucose was assessed enzymatically, and fasting insulin was assessed by using radioimmunoassay (Millipore, Darmstadt, Germany). Fasting glucose and insulin were then used to estimate insulin resistance using the homeostatic model of assessment (HOMA-IR) [54]. Fasting blood lipids, including HDL cholesterol and triglycerides, were assayed on the Olympus AU400 advanced chemistry analyzer system. At both visits in childhood and adolescence, participants' blood pressure was also measured twice in the sitting position using an oscillometric monitor (Dinamap ProCare V100). For this analysis, we used the average of the two values and focused on systolic blood pressure only.
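For reference, the two derived measures above can be computed as in the short Python sketch below. The HOMA-IR expression is the widely used approximation, fasting glucose (mg/dL) × fasting insulin (µU/mL) / 405; the units are an assumption, since they are not stated in this section, and the function names are hypothetical.

```python
def homa_ir(glucose_mg_dl, insulin_uU_ml):
    """HOMA-IR approximation, assuming glucose in mg/dL and insulin in uU/mL.
    With glucose in mmol/L the divisor would be 22.5 instead of 405."""
    return glucose_mg_dl * insulin_uU_ml / 405.0


def mean_systolic(reading_1, reading_2):
    """Average of the two seated systolic blood pressure readings (mmHg)."""
    return (reading_1 + reading_2) / 2.0
```

For example, a fasting glucose of 90 mg/dL with a fasting insulin of 10 µU/mL gives a HOMA-IR of about 2.2 under these unit assumptions.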

Other Covariate Assessments
Exposure to maternal GDM during pregnancy was ascertained from the KPCO perinatal database and defined as a physician's diagnosis of gestational diabetes based on routine screening at 24-28 weeks of gestation using the standard two-step protocol [55]. Child race/ethnicity was self-reported at visit 1 as being non-Hispanic white, non-Hispanic black, Hispanic, or non-Hispanic other. The child's height (cm) and weight (kg) were assessed at both visits by trained research staff while the child was wearing light clothing and no shoes. These measurements were used to calculate age- and sex-specific body mass index (BMI) z-scores using the WHO growth reference for children aged 5-19 years [56]. At both visits, participants reported their pubertal development based on pictorial diagrams of the Tanner stages, which had been validated against physician-assessed Tanner staging and puberty-related hormones [57], and which base pubertal stage on pubic hair development in boys and breast development in girls. Participants were then categorized as pre-pubertal (Tanner = 1), pubertal (Tanner = 2 or 3), and late/post-pubertal (Tanner = 4 or 5).

Statistical Analyses
We first examined bivariate associations between energy-adjusted SSB intake quartiles in childhood and the background characteristics of the sample in childhood. For categorical variables, we reported counts and frequencies for each sub-category according to SSB intake quartile and tested differences across quartiles using Chi-squared tests. For continuous variables, we reported means and standard deviations according to SSB quartile and tested for differences across SSB quartiles using a one-way analysis of variance (ANOVA).
Next, a "meet in the middle" approach was employed to identify the plasma metabolites that may mark the relationship between SSB intake and cardiometabolic risk. In the first step, we examined the associations of SSB quartiles in childhood (visit 1) with repeated measures of cardiometabolic risk (fasting glucose, fasting insulin, HOMA-IR, HDL cholesterol, fasting triglycerides, and systolic blood pressure) across childhood and adolescence (visit 1 and visit 2) using mixed-effects models adjusted for child age, sex, and race/ethnicity. Models included a random intercept for each participant ID to account for intra-individual correlations in cardiometabolic measures across visits and an unstructured covariance matrix. We assessed effect modification by sex using product interaction terms, but none reached significance (all p > 0.10); therefore, all analyses were conducted on the entire sample. Results are reported as regression coefficients and 95% confidence intervals (CIs) for the 2nd, 3rd, and 4th quartiles of SSB intake when compared to the 1st quartile (the reference). We also tested for a linear trend across quartiles using a continuous variable based on the median value for each SSB quartile. For this step, we considered cardiometabolic risk factors for further exploration in downstream analyses if p < 0.10 for a linear trend across SSB quartiles.
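The step-one model can be written out as follows. This is a sketch under stated assumptions: the notation is introduced here for illustration, the normality assumptions are the conventional ones for linear mixed models, and the covariance structure is not spelled out beyond what the text states. Here y_ij is the cardiometabolic measure for participant i at visit j, Q_i is the participant's childhood SSB quartile, x_ij holds age, sex, and race/ethnicity, b_i is the participant-specific random intercept, and m_q is the median energy-adjusted SSB intake in quartile q.

```latex
% Quartile model: quartile 1 is the reference category
y_{ij} = \beta_0 + \sum_{q=2}^{4} \beta_q \,\mathbb{1}[Q_i = q]
         + \boldsymbol{\gamma}^{\top} \mathbf{x}_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Trend test: quartile indicators replaced by the quartile median
y_{ij} = \beta_0 + \beta_{\mathrm{trend}}\, m_{Q_i}
         + \boldsymbol{\gamma}^{\top} \mathbf{x}_{ij} + b_i + \varepsilon_{ij}
```

In this notation, the reported estimate for the highest versus lowest quartile corresponds to β_4, and the p-trend corresponds to the test of β_trend = 0.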
In the second step, we identified plasma metabolites that were cross-sectionally associated with SSB intake in childhood (visit 1) by employing LASSO regression [58], using the glmnet package in R [53]. Briefly, LASSO is a regularized regression technique designed to select the variables most strongly associated with an outcome of interest from a high-dimensional and correlated set of predictors. It does so by imposing a penalty, controlled by a tuning parameter, that shrinks the regression coefficients of weaker variables to zero during feature selection, thereby removing weak but statistically significant associations that may represent false positive findings. Ten-fold cross-validation was used to determine the tuning parameter that achieved the minimum mean cross-validated error. To stabilize variable selection, rather than fitting the model once, we ran LASSO regression on 100 bootstrap samples, with log-transformed SSB intake in childhood as the dependent variable and with child age, sex, and race/ethnicity included as unpenalized covariates. Metabolites were considered for downstream analyses if they were selected by LASSO (i.e., had a non-zero coefficient) in ≥40% of bootstrap samples. This threshold was determined by first calculating the average number of metabolites selected across the 100 bootstrap iterations (202 metabolites on average). We then calculated the number of metabolites retained at each 5-unit increment of the selection threshold and chose the threshold that retained a number of metabolites closest to the bootstrap average (180 metabolites were retained with a threshold of ≥40%).
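The threshold rule can be illustrated with a short Python sketch. It assumes the LASSO fits have already been run (in the study, with glmnet in R) and that each bootstrap iteration has produced a set of selected metabolite names. The function name, the tie-breaking rule (the lowest qualifying threshold wins), and the toy data are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter

def choose_selection_threshold(selected_sets, step=5):
    """Pick the selection-frequency threshold (in %) whose retained count is
    closest to the average number of features selected per bootstrap.

    selected_sets: one set of selected feature names per bootstrap iteration.
    Returns (threshold, retained feature names, average selected per bootstrap).
    """
    n_boot = len(selected_sets)
    avg_selected = sum(len(s) for s in selected_sets) / n_boot

    # Percentage of bootstrap samples in which each feature was selected.
    counts = Counter(name for s in selected_sets for name in s)
    pct = {name: 100.0 * c / n_boot for name, c in counts.items()}

    best = None
    for threshold in range(step, 101, step):
        kept = sorted(name for name, p in pct.items() if p >= threshold)
        gap = abs(len(kept) - avg_selected)
        # Strict "<" keeps the lowest threshold on ties (an assumption).
        if best is None or gap < best[0]:
            best = (gap, threshold, kept)

    _, threshold, kept = best
    return threshold, kept, avg_selected

# Hypothetical output of four bootstrap fits, for illustration only.
toy_sets = [{"m1", "m2", "m3"}, {"m1", "m2"}, {"m1", "m2"}, {"m1"}]
print(choose_selection_threshold(toy_sets))
```

In the study, the same logic applied to 100 bootstrap samples gave an average of 202 selected metabolites and a ≥40% threshold that retained 180.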
In the third step, we investigated whether SSB-associated metabolites in childhood (selected in step 2) were also associated with SSB-associated cardiometabolic risk factors across childhood and adolescence (selected in step 1) using a second set of linear mixed-effects models, with each metabolite as the independent variable and the repeated measures of each cardiometabolic measure across childhood and adolescence as the dependent variable. Models were again adjusted for age, sex, and race/ethnicity, and included a random intercept for each participant ID and an unstructured covariance matrix. Results are reported as regression coefficients and 95% CIs for the effect of a 1-unit increase in each metabolite on cardiometabolic measures. Metabolites were selected as intermediary biomarkers if the p-value of the regression coefficient was below a Bonferroni-adjusted threshold (α = 0.05/number of metabolites tested) and if the direction of the association with the cardiometabolic measure was the same as with SSB intake. We also performed a sensitivity analysis in which we additionally adjusted each step of the analysis for pubertal stage (pre-pubertal, pubertal, or late/post-pubertal), which can also affect cardiometabolic status during childhood [59]. Unless otherwise stated, statistical analyses were performed using SAS Statistical Software (version 9.4; Cary, NC, USA).
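The two selection criteria in step 3 amount to a simple filter, sketched below in Python. The function names, dictionary keys, and toy coefficients are hypothetical; in the study, the coefficients and p-values came from the mixed-effects models described above.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Bonferroni-adjusted significance threshold: alpha / number of tests."""
    return alpha / n_tests

def intermediary_biomarkers(results, alpha=0.05):
    """Keep metabolites that pass the Bonferroni threshold and whose
    association with the outcome has the same sign as with SSB intake.

    results: list of dicts with keys 'name', 'beta_ssb', 'beta_outcome',
    and 'p_outcome' (hypothetical field names).
    """
    threshold = bonferroni_threshold(len(results), alpha)
    return [
        r["name"]
        for r in results
        if r["p_outcome"] < threshold
        and r["beta_ssb"] * r["beta_outcome"] > 0  # same direction
    ]

# Hypothetical model output, for illustration only.
toy_results = [
    {"name": "a", "beta_ssb": 0.2, "beta_outcome": 1.5, "p_outcome": 0.001},
    {"name": "b", "beta_ssb": 0.2, "beta_outcome": -1.5, "p_outcome": 0.001},
    {"name": "c", "beta_ssb": -0.1, "beta_outcome": -2.0, "p_outcome": 0.2},
    {"name": "d", "beta_ssb": -0.1, "beta_outcome": -2.0, "p_outcome": 0.01},
]
print(intermediary_biomarkers(toy_results))
print(bonferroni_threshold(180))
```

With the 180 metabolites carried forward from step 2, the threshold is 0.05/180 ≈ 0.00028, consistent with the p < 0.0003 cutoff reported for the triglyceride analysis.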

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/metabo12060559/s1, Figure S1: Associations between plasma metabolites in childhood (selected from Table 4) with triglycerides across childhood and adolescence, Table S1: Means and standard deviations for the cardiometabolic measures of interest at each visit in childhood and adolescence, Table S2: Prospective associations between all 180 metabolites associated with sugar-sweetened beverage intake in childhood (from step 2) and fasting triglycerides across childhood and adolescence, Table S3: Sensitivity Analysis-Prospective associations between select metabolites associated with sugar-sweetened beverage intake in childhood and triglycerides across childhood and adolescence with additional adjustment for pubertal stage.