Deficits in Prenatal Serine Biosynthesis Underlie the Mitochondrial Dysfunction Associated with the Autism-Linked FMR1 Gene

Fifty-five to two hundred CGG repeats (called a premutation, or PM) in the 5′-UTR of the FMR1 gene are generally unstable, often expanding to a full mutation (>200) in one generation through maternal inheritance, leading to fragile X syndrome, a condition associated with autism and other intellectual disabilities. To uncover the early mechanisms of pathogenesis, we performed metabolomics and proteomics on amniotic fluids from PM carriers, pregnant with male fetuses, who had undergone amniocentesis for fragile X prenatal diagnosis. The prenatal metabolic footprint identified mitochondrial deficits, which were further validated by using internal and external cohorts. Deficits in the anaplerosis of the Krebs cycle were noted at the level of serine biosynthesis, which was confirmed by rescuing the mitochondrial dysfunction in the carriers’ umbilical cord fibroblasts using alpha-ketoglutarate precursors. Maternal administration of serine and its precursors has the potential to decrease the risk of developing energy shortages associated with mitochondrial dysfunction and linked comorbidities.


Introduction
The fragile X mental retardation 1 (FMR1) gene contains CGG repeats that can expand to >200 copies (full mutation), causing fragile X syndrome, a condition frequently codiagnosed with autism and other intellectual disabilities [1,2]. Normal alleles (10-44 repeats) are highly stable; however, transmissions of the premutation (PM; 55-200 repeats) may expand to full mutation in one generation through maternal inheritance [3].
Adult PM carriers are at higher risk of developing the late-onset fragile-X-associated tremor/ataxia syndrome (FXTAS), characterized by progressive intention tremors, gait ataxia, deficits in executive function and memory, peripheral neuropathy, and parkinsonism [4,5]. Children with the PM show a higher incidence of attention deficit, aggressiveness, social anxiety, autism, and seizures [6,7]. In terms of developmental disabilities, a number of studies suggest a higher-than-expected rate of developmental delays in children and adolescents with the PM (see [6,7] and references therein), whereas others failed to identify such a link [8]. While an extensive analysis of the factors contributing to this discrepancy is outside the scope of this study, it is noteworthy that many of these reports may suffer from selection bias, as they were performed with clinically referred children. However, we cannot exclude the possibility that other variables (i.e., age at testing, CGG repeat size,

Selection and Characterization of Amniotic Fluid Samples
In order to identify the prenatal metabolic footprint of the PM, we examined the proteomic and metabolomic profiles of 54 amniotic fluid (AF) supernatants obtained from 52 pregnant carrier women who had undergone amniocentesis for prenatal fragile X diagnosis. All mothers carried a male fetus who had inherited either a PM or a nonmutated fragile X allele (Table 1). Due to the direct correlation between CGG repeat length and increased oxidative stress and mitochondrial dysfunction [22,23], only mothers with similar CGG expansions were selected. The maternal CGG repeats ranged, for the mutant allele, from 53 to 200 (mean ± SD: 72 ± 30) for mothers carrying a non-carrier fetus, and from 55 to 82 (mean ± SD: 63 ± 8) for those carrying a fetus with the PM, with no significant differences between the two groups (t (33) = −1.493; p = 0.143; d = 0.389; unequal variances F = 16.13; p < 0.0001). The age of the mothers carrying a non-carrier fetus ranged from 25.5 to 41.8 y (34 ± 4), slightly older than those carrying a PM fetus, who ranged from 17.8 to 42.1 y (30 ± 7; t (33) = −2.597; p = 0.0138; d = 0.751; unequal variances F = 2.334; p = 0.035). The fetal repeats, as expected, were significantly different between groups, ranging from 20 to 44 for the non-carrier group, and from 55 to 157 for the carrier group (mean ± SD: 29 ± 5 and 69 ± 21; t (23) = 9.908; p < 0.0001; d = 2.938; unequal variances F = 13.75; p < 0.0001). Most (88.9%; p < 0.0001) of the transmitted PM alleles were unstable (i.e., ± ≥1 repeat difference from the mother's, as defined in [24]). As expected, most non-carrier fetuses presented two AGG interruptions in the CGG repeat stretch (65%), whereas most carriers exhibited fewer interruptions (68.2%) and of longer repeat size (Table 1). Moreover, while most of the non-carriers had a limited number of AGG interruption configurations (8 combinations), carriers exhibited a more varied pattern (15 combinations; Table 1).
Samples were taken from female carriers at gestational periods ranging from 13.3 to 21.2 weeks, with no statistical differences between the two groups (mean ± SD: 17 ± 2 gestational weeks and n = 30; 17 ± 1 and n = 22; t (50) = -0.838; p = 0.406; d = 0.228; Table 1). This gestational age was selected because the first 15-20 weeks are the most vulnerable for neurodevelopment [25], and the relatively narrow range minimizes putative metabolic profile differences that occur across the gestational period [26,27]. In support of this last premise, no correlation was observed between the fetal biomarker alpha-fetoprotein (AFP) and gestational age [28], as expected for the short timeframe (13-21 weeks) of sample collection, thereby allowing for a direct case-control comparison independent of the exact gestational week. Another argument that supports the selection of this interval is based on the fact that AF supernatants during these gestational weeks serve as suitable proxies for intra-fetal composition, mainly derived from fetal fluids. From 10 to 20 weeks of gestation, free diffusion occurs bi-directionally between AF and the fetus across the fetal skin, placenta, and umbilical cord, resulting in an AF composition similar to that of fetal plasma [29]. This is a direct consequence of the immature development of both keratinization and blood-brain barrier permeability [29], thereby allowing the passage of proteins between organs (including the brain) and the bloodstream [30].
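The group comparisons reported in this section (a t statistic with its degrees of freedom, Cohen's d, and a variance-ratio F) can be illustrated with a minimal sketch. The formulas below are the standard Welch and pooled-SD definitions; whether the authors used exactly these variants is an assumption, and the example values are hypothetical, not study data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Cohen's d (absolute value) using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = math.sqrt(
        ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    )
    return abs(mean(a) - mean(b)) / pooled


def variance_ratio(a, b):
    """F ratio of the larger to the smaller sample variance."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


# Hypothetical maternal ages in years (illustration only, NOT study data).
noncarrier_ages = [34.0, 31.5, 36.2, 38.1, 29.9, 35.4]
carrier_ages = [30.1, 22.4, 41.0, 26.3, 33.8, 19.7]

t_stat, dof = welch_t(noncarrier_ages, carrier_ages)
effect = cohens_d(noncarrier_ages, carrier_ages)
f_ratio = variance_ratio(noncarrier_ages, carrier_ages)
print(f"t({dof:.0f}) = {t_stat:.3f}; d = {effect:.3f}; F = {f_ratio:.2f}")
```

With unequal variances, the Welch degrees of freedom fall below the pooled value of n1 + n2 - 2, which is why a variance-ratio check is usually reported alongside the t statistic.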

Combined Omics as a Tool to Validate the Biological Matrix as Fetal Amniotic Fluid
As reported by other studies utilizing AF supernatants to unveil biomarkers of various prenatal conditions [31], most of the AF proteome was represented by structural proteins, followed by those necessary for fetal development, transport, blood, hormone signaling, and immune response, among others (Supplementary Figure S1A,B). Of the identified proteins, 117 were unique, whereas 306 were already reported in other studies utilizing AF (Supplementary Figure S2), including the 12 most abundant proteins matching the top 15 in another study [32] (Supplementary Table S1). A heat map showing the tissue of origin for the top 10% most abundant proteins indicated that the main cluster was represented by the respiratory, gastrointestinal (GI), and urinary tracts, followed by male organs and fetal/placental proteins (Figure 1A, Supplementary Figure S3). Amniocyte- and mesenchymal-stem-cell-specific proteins [31], along with proteins from the brain and cerebral cortex, were also identified, consistent with the presence of an immature blood-brain barrier (Figure 1A, Supplementary Figure S3).
Among the identified metabolites, the most abundant were amino acids and their derivatives, glucose, TCA intermediaries, and lipids (Supplementary Figure S1C), some of which were previously identified in AF from second trimester normal pregnancies [33]. The main tissues of origin for these components were the placenta, liver, fetal brain cortex, nervous tissue, pancreas, and intestine (Supplementary Figure S3), supporting the notion that AF is derived mainly from fetal tissues during this period [34].

Figure 1. [...] obtained with the top 10% most abundant proteins from the AF proteomes. Analysis was performed as described in [35]. (A) Respiratory (lungs, oral epithelium), GI (esophagus, salivary gland, colon, rectum, gut, pancreas, stomach, liver, gall bladder), and urinary tract (kidney, urinary bladder) tissues were identified, along with male organs (e.g., spermatozoon, testis, seminal vesicles, prostate gland), as well as fetal and placental proteins. Amniocyte- and mesenchymal-stem-cell-specific proteins, and proteins from the brain and cerebral cortex, were also identified. (B) Those proteins predominantly or exclusively fetal were AFP, SI, LCT, CST3, SERPINA1, CP, TF, and ORM1; those X-linked were ARMCX4, BGN, CD99L2, CHRDL1, EFNB1, FLNA, IGSF1, MSN, MXRA5, PCSK1N, PLS3, SERPINA7, SRPX, TIMP1, TMSB4X, and VSIG41; and those exclusively of maternal origin were HP and HPX [31]. The ratios of fetal markers to X-linked markers (mother and male fetus), and of alpha-fetoprotein (AFP) normalized to X-linked proteins, were 7.8- and 147.9-fold, respectively, confirming the overwhelmingly higher fetal contribution in AF samples. The ratio of fetal hemoglobin (γ-Hb) to that of adult hemoglobin (β-Hb) was 28.4-fold in AF samples.
To validate the hypothesis that any metabolic changes detected through the proteomics and metabolomics of AF originated mainly from the fetal rather than the maternal metabolism, two approaches were followed. First, by identifying proteins of fetal origin (fetus only) and X-linked proteins (fetus and mother) in the AF proteome [31], we obtained an approximately 8-fold enrichment of fetal proteins over maternal ones. The average of fetal-only proteins normalized to X-linked proteins was 116.3, while that for the fetal biomarker AFP [31,36,37] was 147.9, both significantly higher than the 14.9 for the normalized maternal proteins (Figure 1B). In parallel, the average ratio of hemoglobin gamma (HBG1 and HBG2, both of fetal origin) to hemoglobin beta (adult origin) was 28.4, indicating that the AF was enriched in fetal Hb, with the detected adult Hb in AF being an unavoidable consequence of the amniocentesis procedure. Taken together, the AF was enriched by 8-28 times in fetal vs. maternal proteins.
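As a quick arithmetic check, the enrichment range quoted above can be reproduced from the normalized values given in the text. That the figure of about 8 comes from dividing the fetal-only value by the maternal one is an assumption of this sketch, although it matches the 7.8-fold ratio reported for Figure 1B.

```python
# Values quoted in the text, each normalized to X-linked proteins.
fetal_only = 116.3
afp = 147.9
maternal = 14.9
hb_gamma_to_beta = 28.4  # fetal-to-adult hemoglobin ratio

# Assumed derivation of the fold enrichments (not stated explicitly in the text).
fetal_over_maternal = fetal_only / maternal
afp_over_maternal = afp / maternal

print(f"fetal-only vs. maternal: {fetal_over_maternal:.1f}-fold")
print(f"AFP vs. maternal: {afp_over_maternal:.1f}-fold")
print(f"enrichment range: {round(fetal_over_maternal)}-{round(hb_gamma_to_beta)} times")
```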

Impact of a Deficient Glycolysis-Derived Serine Biosynthetic Pathway in the Premutation
To test the separation of the diagnostic groups and to predict outcomes from the omics data, we used two machine learning approaches, one supervised and one unsupervised. In supervised classification, an algorithm is trained on labeled data to assign samples to predefined categories (here, premutation and non-carrier); we used partial least squares-discriminant analysis (PLS-DA; Figure 2A,C). In unsupervised learning, algorithms analyze and cluster unlabeled data, discovering patterns without reference to the group labels. For this, we used agglomerative hierarchical clustering (visualized here as heat maps; Figure 2B,D; Supplementary Figure S4). In agglomerative clustering, each data point starts as its own cluster, and clusters are then merged iteratively based on similarity until a minimum number of clusters is reached. Similarity was measured with Ward's linkage, which defines the distance between two clusters as the increase in the combined error sum of squares after the clusters are merged. Distances between samples were computed with the Euclidean metric, the most commonly used metric for this purpose. Inspection of the plots obtained with both the supervised and the unsupervised models showed a marked distinction between carriers and non-carriers. This was surprising, given that not all pediatric carriers exhibit early and/or overt metabolic and clinical phenotypes [6,7].
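The agglomerative procedure with Ward's linkage described above can be sketched in a few lines. This is a naive illustration under stated assumptions, not the software used in the study: the function names and the two-feature example profiles are hypothetical, and the merge cost is the standard Ward increase in the error sum of squares computed from squared Euclidean distances between cluster centroids.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in the error sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clustering(samples, n_clusters):
    """Start with one cluster per sample; merge the cheapest pair until
    only n_clusters remain. Returns lists of sample indices."""
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [samples[k] for k in clusters[ij[0]]],
                [samples[k] for k in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i is stable
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical scaled profiles for six samples (illustration only).
profiles = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1],
            [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
groups = ward_clustering(profiles, n_clusters=2)
print(groups)
```

The pairwise search makes this version slow for large datasets; it is meant only to make the merge criterion concrete.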
Pathway analysis of a joint AF metabolome-proteome revealed upregulation of aminoacyl-tRNA biosynthesis, as well as the metabolism of the following amino acids: sulfur-containing (Cys and Met), branched-chain (Val, Ile, Leu), aromatic (Phe, Tyr, and Trp), and Arg and Pro. The downregulated pathways included, among others, energy supply (glycolysis and TCA cycle); metabolism of Ala, Gly, Ser, Thr, Asp, and Glu; the pentose phosphate pathway; glutathione metabolism; and pantothenate and coenzyme A biosynthesis (Figure 3).
Impaired glycolysis was inferred from the lower expression of GAPDH and the downstream enzymes of the glycolytic pathway (PGK1, PGAM1, and ENO1; Figure 4A). These deficits resulted in lower production of both pyruvate and lactate, the latter further compounded by the lower expression of LDHA/B (Figure 4A).
Figure 3. (A) Joint pathway analysis performed as described in [38], which allows for an integrated approach to results obtained from combined metabolomics and protein expression studies conducted under the same experimental conditions. Enrichment analysis was based on the hypergeometric test, and topology analysis on degree centrality (which evaluates the number of links that connect to a node) within a pathway. The database utilized was KEGG, and the integration method combined the queries. Results were plotted as a function of impact within the pathway and significance. Labeled pathways have an FDR < 0.05. (B) Differentially expressed pathways in AF supernatants from carriers and non-carrier fetuses. These results were obtained by using the metabolomics and proteomics data (filtered by log2 FC >1 or <−1 and p < 0.05) as input data, and analyzing them using the pathway modeling software PathVisio version 3.0 [39], which computes Z-scores as well as permuted p-values. This analysis was repeated utilizing proteomics data alone and employing the STRING database version 11, which specifically asks for proteome-scale input, with each protein having an associated numerical value (log2 FC) [40]. Of the available methods for searching functional enrichments in such a set, we chose a permutation-based, non-parametric test that computes, for each protein set to be tested, the average of all values provided by our dataset for the constituent proteins. This average was then compared against averages of randomized gene sets of the same size. Multiple testing correction was applied separately within each functional classification framework (KEGG), in accordance with the method of Benjamini and Hochberg, but not across these frameworks, as there is significant overlap between them.
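The two statistical ingredients named in the Figure 3 caption, hypergeometric enrichment and Benjamini-Hochberg correction, can be sketched as follows. This is a generic illustration rather than the code of the cited tools; the function names and the pathway counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(hits, query_size, pathway_size, universe_size):
    """Probability of observing at least `hits` pathway members when drawing
    `query_size` items from a universe holding `pathway_size` members."""
    upper = min(query_size, pathway_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, query_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: (hits, query size, pathway size, universe size).
pathways = {
    "pathway_a": (6, 40, 30, 1000),
    "pathway_b": (2, 40, 50, 1000),
    "pathway_c": (1, 40, 20, 1000),
}
raw = [hypergeom_pvalue(*counts) for counts in pathways.values()]
fdr = benjamini_hochberg(raw)
for name, p, q in zip(pathways, raw, fdr):
    print(f"{name}: p = {p:.4g}, FDR = {q:.4g}")
```

Correcting within one pathway database at a time, as the caption describes, simply means calling `benjamini_hochberg` once per database rather than on the pooled list of p-values.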
Conditions of increased oxidative stress were incurred by the downregulation of the pentose phosphate pathway (Figure 3B).
Figure 4. [...] followed by addition of oligomycin (State 4o) and FCCP (State 3u). The same outcomes were recorded in 2 PM umbilical cords under the same conditions. Statistical analysis was performed using Student's t test for the comparison between PM and WQ-2101-treated control fibroblasts. Multiple groups were analyzed using ANOVA followed by Tukey's post-hoc test. (C) The same mitochondrial outcomes were tested in PM umbilical cord fibroblasts in the presence of glucose (4 mM), Gln (2 mM), DMAKG (3 mM), and the combination of the three substrates. Bar graphs are shown as Z-scores. RCR: respiratory control ratio; IRC: index of respiratory capacity.

To avoid algorithm and pathway database bias, we repeated the joint analysis with different algorithms, using as input only those metabolites and proteins with VIP scores of 0.8 or above, together with the WikiPathways and REACTOME pathway databases. This analysis resulted in 34 modules (Supplementary Figure S5), several overlapping with those already identified in Figure 3, and most overlapping (albeit to a lesser extent) with those identified in bio-samples from adult PM carriers [10,23,41-43].
It could be argued that the association between maternal age and dizygosity incidence (leading to age-dependent epigenetics, e.g., X chromosome inactivation or imprinting) or the association between the genetics of dizygotic twinning and cerebral asymmetry may have influenced the results presented here. However, the non-concordance of the biological outcomes between dizygotic twins was supported by principal components analysis (Supplementary Dataset 1). For this reason, data obtained on AF from the two sets of non-identical twins were included (note: analysis of the data excluding the two sets of non-identical twins, or including one of each set, did not modify the conclusions of the study).
Taken together, these results point to a general decline in cellular energy metabolism in carriers, encompassing both glycolysis and oxidative phosphorylation linked to deficiencies in the Ser and Gly biosynthetic pathways. This is supported by the major decline in the levels of Asp, Glu, and Ser (Supplementary Figure S6A), amino acids that are biochemically linked to AKG by transamination (Glu/AKG or AKG-Asp/oxaloacetate) or via Ser biosynthesis (AKG/Glu-Ser). In agreement with this concept, Asp has been shown to play a role in the regulation of Ser uptake and metabolism, as well as downstream pathways, in rapidly proliferating cells [44].
Notably, PGK1 coordinates glycolysis with the TCA cycle (via mitochondrial translocation) [45], autophagy [46], and endoplasmic reticulum (ER) stress response [47], including the production of 3-phosphoglycerate, the substrate for the ultimate biosynthesis of Ser and Gly. Based on the anaplerotic role that the Ser biosynthetic pathway has on the Krebs cycle [48], the downregulation of Ser synthesis suggests a limited AKG supply, which would limit mitochondrial energy production (Figure 4A). In this regard, the direct correlation of AKG with TCA intermediates downstream from AKG dehydrogenase (i.e., malate, fumarate, succinate), as well as Glu, Ser, and Gly levels, and the inverse correlation with Gln and pantothenate (Supplementary Figure S8) reinforces the biochemical link between AKG and both processes: the TCA cycle and the Ser biosynthetic pathway. Aside from providing AKG, the Ser-dependent pathways also provide intermediates for the synthesis of creatine (requiring Gly) and carnitine (Supplementary Figure S7), further impacting the cellular energy management.
It could be claimed that the impact of a non-essential amino acid such as Ser on cellular metabolism is negligible; however, its role is critical during CNS neurodevelopment, based on its low blood-brain barrier permeability [49], and the high demand of prenatal energy-dependent synthesis of nucleotide precursors [50] and myelin, among which are key brain sphingolipids and gangliosides essential for dendritic outgrowth [51]. More importantly, Ser is involved in neurotransmission through its involvement in the GABA shunt (AKG, Glu, succinate; Supplementary Figure S7) and synthesis of glycine and D-Ser (direct co-agonists of the NMDA receptor along with Glu), thereby fine-tuning synapses, neuronal plasticity, and excitotoxicity [52]. It is critical to highlight that sustaining biosynthetic processes while minimizing ROS-mediated damage (especially when derived from mitochondria) is key to neurodevelopment [53]. As such, the combination of decreased antioxidant defenses with increased mitochondrial ROS resulting from mitochondrial dysfunction in carriers may impact cellular proteostasis. In this regard, evidence for a proteotoxic insult was supported by the higher levels of aconitate, likely the result of ROS-mediated aconitase inactivation [54,55] (Figure 4A).
Further independent support for these findings was provided by the significant overlap of dysregulated pathways between the metabolic footprints observed in AF from fetal carriers and those produced by stressors relevant to energy management (hypoxia, and oxygen and glucose deprivation or OGD; mitochondrial stressors) and proteotoxicity (endoplasmic reticulum stress; Supplementary Figure S9A, and Supplementary Dataset 2: modules 0, 2, 8, 10, 13, 14, 16, 22, 23).
The prenatal PM metabolic footprint was aligned mainly with conditions related to energy stress driven by HIF-1α-independent signaling (early OGD response and PHD3 silencing), mitochondrial uncouplers and inhibitors, and ER stress (Supplementary Figure S9A). A major contribution of HIF-1α-dependent activation to the metabolic changes in carriers was ruled out based on the lower levels of 2-hydroxyglutarate (associated with both lower PHGDH [56] and LDH [57]). Moreover, the prenatal PM metabolic footprint was analogous not only to complex diseases associated with underlying mitochondrial dysfunction (MD; schizophrenia and Alzheimer's disease; Supplementary Figure S9B), but also to others directly and indirectly related to the Ser biosynthetic pathway (e.g., deficiencies in lipoyltransferase, AKG dehydrogenase, PSAT1, PSP, and PHGDH; Supplementary Table S3, in bold).

Independent Validation of the Prenatal Metabolic Footprint of the PM
To prevent biased estimates of model performance [58], the prenatal metabolic footprint was independently validated by utilizing one internal (AF) and two external cohorts. The two external cohorts from PM carriers and non-carriers, all obtained at a center located in California (Supplementary Methods), included proteomics from 25 primary dermal fibroblasts and metabolomics from 39 plasma samples (characteristics of donors under Supplementary Table S2).
A receiver operating characteristic curve for the predictive performance of the prenatal PM metabolic footprint was built utilizing two-thirds of the AF samples (randomly selected) and the top variable importance in projection (VIP) predictors from both proteomic and metabolomic data (Supplementary Figure S10A-D), and then further tested by incorporating the holdout samples. The performance of the model was significant (100 cross-validations and p < 0.01), with an accuracy of 77.78%.
When the model was built with the proteomics data of all AF samples, with proteomics from fibroblasts used as holdout samples (Supplementary Figure S10E-H), or with the metabolomics from all AF samples, with the metabolomics of plasma samples used as holdout samples (Supplementary Figure S10I-L), the performance of both models was significant after 100 cross-validations (p < 0.01 and < 0.05 for proteomics and metabolomics; Supplementary Figure S10G and Supplementary Figure S10K, respectively). The accuracy of the models was 77.8% and 65.45% for the proteomics- and metabolomics-based data, respectively, despite the differences in sex (all AF being from males), developmental stage, age (prenatal vs. adults), and tissues (AF vs. fibroblasts and plasma). Taken together, these results indicate that the metabolic footprint of AF from carriers, albeit with different degrees of severity, matches those of other biological matrices obtained from adult carriers.
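The validation scheme described in this section (train on a random two-thirds, test on the holdout third, then report accuracy and a receiver operating characteristic summary) can be sketched as below. The function names, labels, and scores are hypothetical, and the classifier itself is omitted. The 14-of-18 example shows only that such a count would yield 77.78%; it is not a claim about how the reported value arose.

```python
import random


def split_two_thirds(sample_ids, seed=0):
    """Randomly assign two-thirds of the samples to training, the rest to holdout."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = (2 * len(ids)) // 3
    return ids[:cut], ids[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def roc_auc(true_labels, scores):
    """Area under the ROC curve, computed as the probability that a positive
    sample scores above a negative one (ties count as one half)."""
    pos = [s for t, s in zip(true_labels, scores) if t == 1]
    neg = [s for t, s in zip(true_labels, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train_ids, holdout_ids = split_two_thirds(range(54))

# Hypothetical holdout outcome: 18 samples, 4 of them misclassified.
true = [1] * 9 + [0] * 9
predicted = [1] * 7 + [0] * 2 + [0] * 7 + [1] * 2
print(len(train_ids), len(holdout_ids))
print(f"accuracy = {100 * accuracy(true, predicted):.2f}%")
```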

Improving Ser Status Rescues Mitochondrial Function in the PM
Based on our findings, we hypothesized that RNA- and protein-mediated toxicity, as a result of the FMR1 CGG expansion, impacted the Ser pathway. We reasoned that if glycolysis-derived Ser was critical in providing AKG for the TCA cycle in the PM, then halting Ser biosynthesis in non-carriers should produce mitochondrial outcomes like those of the PM. Conversely, if Ser biosynthesis were the rate-limiting step for providing AKG, then mitochondrial outcomes in the PM should respond favorably to the supplementation of AKG precursors. As a proof of concept, we tested these options by utilizing umbilical cord fibroblasts (UCFs; tissues reflecting prenatal periods as close as possible to those of the AF) from non-carriers and carriers (two for each). Compared to non-carriers, UCFs from carriers presented mitochondrial hypofunction. This was characterized by a lower fraction of the maximum mitochondrial capacity to synthesize ATP (IRC or index of respiratory oxygen uptake coupled with ATP production [59]), lower maximum electron transport capacity (or State 3u evaluated under uncoupling conditions), and lower capacity to synthesize ATP (State 3 evaluated under physiological conditions; Figure 4B black bars). These results confirmed the presence of MD in the PM even in samples obtained at these early stages of life.
Inhibition of the endogenous synthesis of Ser in non-carrier UCFs with WQ-2101 (inhibitor of PHGDH [60]) resulted in mitochondrial deficits similar to those obtained from carriers (Figure 4B open bars), regardless of the presence of Ser and Gly in the cell growth medium, highlighting the relevance of this anaplerotic pathway during early development periods. Supplementation of carrier UCFs grown in media with glucose (and Ser and Gly) and AKG precursors (Gln, AKG, or their combination), compared to non-carrier UCFs grown in glucose only, resulted in a significant improvement of coupling between electron transport and ATP production (RCR; AKG and combination), IRC (Gln, AKG, and combination), State 3u (AKG and combination), proton leak and ROS production (State 4o; AKG and combination), and State 3 (AKG) (Figure 4C). Addition of glucose, Gln, and AKG improved the RCR (respiratory control ratio), IRC, State 3, and State 4o compared to glucose alone (Figure 4C).
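For readers unfamiliar with these respiratory parameters, the sketch below shows how a respiratory control ratio and Z-scores might be derived from oxygen consumption rates. It assumes the conventional definition RCR = State 3 / State 4o and Z-scores computed against a reference (control) group; the text does not spell out either formula, and all numbers are hypothetical.

```python
from statistics import mean, stdev


def respiratory_control_ratio(state3, state4o):
    """RCR, assumed here as phosphorylating (State 3) over
    oligomycin-insensitive (State 4o) oxygen consumption."""
    return state3 / state4o


def z_scores(values, reference):
    """Express values as standard deviations from the reference-group mean."""
    mu, sd = mean(reference), stdev(reference)
    return [(v - mu) / sd for v in values]


# Hypothetical oxygen consumption rates (arbitrary units), not study data.
control_state3 = [10.0, 12.0, 11.0]
control_state4o = [2.0, 2.4, 2.2]
carrier_state3 = [7.0, 8.0]
carrier_state4o = [2.1, 2.3]

control_rcr = [respiratory_control_ratio(s3, s4)
               for s3, s4 in zip(control_state3, control_state4o)]
carrier_rcr = [respiratory_control_ratio(s3, s4)
               for s3, s4 in zip(carrier_state3, carrier_state4o)]

# Negative Z-scores indicate State 3 respiration below the control mean.
carrier_state3_z = z_scores(carrier_state3, control_state3)
print(control_rcr, carrier_rcr, carrier_state3_z)
```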
Further confirmation of the contribution of this metabolic pathway in UCFs was obtained through metabolomics analysis (Table 2). UCFs from non-carriers and carriers grown in glucose and supplemented with Gln showed increases in the [ATP]/[AMP] ratio, with decreased glycolytic and TCA cycle fluxes (Table 2). When the media were supplemented with the cell-permeant AKG, cells from carriers and from non-carriers with or without the PHGDH inhibitor responded similarly, showing increased [ATP]/[AMP] ratios and a moderate impact on the glycolytic flux, but with significant improvements in the TCA cycle. Cells from non-carriers treated with the PHGDH inhibitor showed a significant buildup of glucose 6-phosphate and 3-phosphoglycerate, thereby confirming a block in the branch that leads to the formation of AKG, Ser, and Gly from 3-phosphoglycerate. Similar metabolomics results were obtained when both Gln and AKG were used to supplement the media with glucose, indicating that the greatest effect was achieved by the cell-permeant AKG, as shown by the mitochondrial functional outcomes (Figure 4C).
These results are consistent with those obtained with AF, in which the disease gene pathway analysis run with the joint input of metabolites and proteins indicated deficiencies in the AKG dehydrogenase and Ser biosynthetic pathways (bolded in Supplementary Table S3), and AKG as the most discriminating metabolite, along with other TCA intermediates (malate, fumarate, succinate; Supplementary Figure S4 panel D). It is important to note that TCA intermediates' concentrations are not only driven by the activity of the TCA cycle (e.g., AKG-derived succinate), but also by other anaplerotic reactions (e.g., succinate levels in carriers without Gln vs. those of other TCA intermediates; Table 2). In this regard, succinate concentrations may also be sustained by (for example) the catabolism of fatty acids with uneven numbers of carbons through the formation of propionate, as well as by GABA metabolism. In support of this premise, the correlation coefficient for AKG with succinate was lower (and less significant) than that for AKG with malate or fumarate (Supplementary Figure S8). Therefore, it is important to be cognizant of the ratios among metabolites, as well as their absolute concentrations, by utilizing analyses that are comprehensive and integrated, such as those used in systems biology.
We are cognizant that metabolic changes obtained with different cell types or tissues (in our case AF, UCFs, PBMCs, or skin fibroblasts) may present different degrees of severity, as is usually seen with mitochondrial disorders, further complicated by the phenotypic threshold effect [20,61,62]; however, results obtained with these diverse cells and tissues all converged in showing deficits in the anaplerosis of AKG by the glycolysis-driven Ser biosynthetic pathway. Taken together, these results show that cells from carriers can respond to AKG supplementation, thereby improving the performance of the Krebs cycle and mitochondrial energy output. Furthermore, they support the involvement of the Ser pathway as a critical anaplerotic source for the Krebs cycle in PM carriers, especially during perinatal periods and, for the first time, in non-tumoral tissues.

Subjects' Demographics; FMR1 Repeat Sizing and Structure
Fifty-two pregnant women, all carriers of the PM, gave consent for any AF samples remaining after prenatal fragile X diagnosis to be used for research. All available demographic and genetic data are shown in Table 1. The FMR1 repeat sizing was performed by PCR analysis using AmplideX PCR CE/FMR1 (Asuragen, Inc., Austin, TX, USA), and the products were separated by capillary electrophoresis with Applied Biosystems 3130. The AGG interruption patterns were determined using either AmplideX PCR CE/FMR1 at the New York State Institute for Basic Research in Developmental Disabilities (New York, NY, USA), or Xpansion Interpreter (Asuragen, Inc., Austin, TX, USA). Changes in allele repeat length on transmission were determined by comparison of parental and transmitted alleles in parallel PCR capillary electrophoretic analyses. An unstable transmission was defined as a change of at least one repeat from parent to child [24].

Metabolomic and Proteomic Evaluation
Metabolomics and proteomics analyses of 54 AF samples, external validation cohorts constituted by dermal fibroblasts and plasma samples obtained from non-carriers and carriers (Supplementary Table S2), and UCFs were carried out essentially as previously described [6][7][8]. More details are reported in the Supplementary Methods.
Oxygen consumption was immediately evaluated in intact cells using a Clark-type oxygen electrode (Hansatech, King's Lynn, UK), as previously described [10]. ATP-linked oxygen uptake (or State-3-dependent oxygen uptake) was calculated as the difference between basal and oligomycin-induced State 4 oxygen uptake rates; State 4o is the residual respiration after inhibition of ATP synthesis with the ATPase-specific inhibitor oligomycin (2 µM); maximal respiratory capacity, or State 3u, is described as the oxygen uptake rate in the presence of the uncoupler carbonyl cyanide-4-(trifluoromethoxy)phenylhydrazone (FCCP) (40 nM); respiratory control ratio (RCR) was calculated as the ratio between States 3 and 4o; index of respiratory capacity (IRC) was calculated as the difference between State 3 and State 4o normalized by that of State 3u.
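The derived respiratory parameters defined above reduce to simple arithmetic on three measured rates. The following is a minimal sketch of those definitions, not the analysis code used in the study. The class and function names are hypothetical, the numbers are illustrative only, and State 3 is assumed here to correspond to the basal rate measured in intact cells.

```python
from dataclasses import dataclass


@dataclass
class OxygenUptake:
    """Oxygen uptake rates for one sample, all in the same (arbitrary) units."""
    state3: float   # respiration with ATP synthesis (assumed: basal rate in intact cells)
    state4o: float  # residual rate after oligomycin
    state3u: float  # maximal rate after FCCP uncoupling


def atp_linked(r: OxygenUptake) -> float:
    """ATP-linked oxygen uptake: State 3 minus oligomycin-insensitive State 4o."""
    return r.state3 - r.state4o


def rcr(r: OxygenUptake) -> float:
    """Respiratory control ratio: State 3 divided by State 4o."""
    return r.state3 / r.state4o


def irc(r: OxygenUptake) -> float:
    """Index of respiratory capacity: (State 3 - State 4o) normalized by State 3u."""
    return (r.state3 - r.state4o) / r.state3u


# Illustrative values only; they are not data from the study.
example = OxygenUptake(state3=10.0, state4o=2.0, state3u=16.0)
print(atp_linked(example), rcr(example), irc(example))  # prints 8.0 5.0 0.5
```

With these definitions, a lower IRC means that a smaller fraction of the maximal electron transport capacity is being used for ATP synthesis.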

Statistical Analysis
Spectral counting was utilized to quantify the proteins from mass spectrometry analysis. Both protein and metabolite levels were normalized to pooled samples from pregnancies of non-carrier fetuses; the features were then quantile normalized and autoscaled (mean-centered and divided by the standard deviation of each variable). Combined pathway analysis was performed using MetaboAnalyst [38]. The threshold for statistical significance was set at the 5% level, unless otherwise indicated. For mitochondrial outcomes, statistical comparisons were performed using Student's t test between two groups, and ANOVA followed by Tukey's post-hoc test when comparing multiple groups. Other details are indicated in the figure legends or the supplementary information.
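The two scaling steps named above can be illustrated with a short, dependency-free sketch. It is not the pipeline used in the study: the function names are hypothetical, the input numbers are illustrative, and the sketch assumes that quantile normalization is applied across samples and autoscaling per variable, which the text does not specify in detail.

```python
from statistics import mean, stdev


def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    `samples` is a list of equal-length lists, one list per sample.
    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by position (a simplification).
    """
    n_features = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [mean(s[r] for s in sorted_samples) for r in range(n_features)]
    normalized = []
    for s in samples:
        # Indices of this sample's features, ordered from lowest to highest value.
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def autoscale(values):
    """Mean-center one variable and divide by its sample standard deviation."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


# Illustrative values only; they are not data from the study.
two_samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(two_samples))
print(autoscale([1.0, 2.0, 3.0]))
```

After autoscaling, every variable has zero mean and unit standard deviation, so abundant and scarce metabolites contribute on a comparable scale to the downstream pathway analysis.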

Conclusions
A critical deficiency in the Ser biosynthetic pathway was noted by a thorough joint analysis of the proteome and metabolome of AF from PM-carrier fetuses. This deficiency not only impacted energy management (supply of AKG to mitochondria, creatine synthesis), but also has the potential to affect other biosynthetic pathways relevant to neurodevelopment (lipid and protein syntheses, detoxification). It could be argued that mild prenatal changes in mitochondrial function would not result in fetal metabolic changes in utero, as this environment is considered "hypoxic". However, a wealth of studies reported embryonic lethality at equivalent gestational ages for homozygous knockouts of respiratory chain subunits (NDUFA5 [63], SDHD [64], and RISP, the Rieske iron-sulfur protein [65]), ancillary factors (TMEM70 [66], NDUFS4 [67], COX15 [68], COX17, and SCO2 [69,70]), and mitochondrial antioxidant defenses (TRX-2 [71,72]), highlighting the critical role of mitochondria during pregnancy. Furthermore, the fact that aerobic production of ATP increases significantly from the second through the third trimesters in preparation for the higher atmospheric oxygen does not imply that during this period mitochondria do not play any role in fetal development. The amniotic pO2 between weeks 15 and 17 is <10 mmHg (10-15 µM oxygen [73]), which is able to sustain oxidative phosphorylation throughout development, as this oxygen tension is well above the Km for oxygen of cytochrome c oxidase (Km = 0.9 µM [74]). However, it could be inferred that any (even mild) mitochondrial changes in the offspring would be more evident soon after birth. This possibility is consistent with the current view that mitochondrial dysfunction begins long before pathological signs are evident. As such, it is possible that some of these prenatal mitochondrial changes may be magnified after birth and persist into adulthood.
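The oxygen-tension argument above can be checked with a short calculation. This is a sketch under stated assumptions rather than a result from the study: it assumes simple Michaelis-Menten kinetics for cytochrome c oxidase with respect to oxygen, takes the Km as 0.9 µM, and uses a hypothetical function name.

```python
def fractional_rate(oxygen_um: float, km_um: float = 0.9) -> float:
    """Fraction of the maximal rate, v/Vmax = [O2] / (Km + [O2]).

    Assumes Michaelis-Menten kinetics; concentrations are in micromolar.
    """
    return oxygen_um / (km_um + oxygen_um)


# Oxygen concentrations quoted in the text for amniotic fluid (10-15 micromolar).
for oxygen in (10.0, 15.0):
    print(f"{oxygen:.0f} uM O2 -> {fractional_rate(oxygen):.1%} of the maximal rate")
```

Under these assumptions, the oxidase would operate at roughly 92% to 94% of its maximal rate across the quoted range, consistent with the statement that this oxygen tension can sustain oxidative phosphorylation.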
While decreasing energy provision by downregulating the Ser biosynthetic pathway limits macromolecule synthesis to reduce the load of ROS-mediated damage to misfolding-prone proteins, this mechanism also puts a brake on CNS cell growth and differentiation at a critical stage of brain development. The encouraging results of Ser supplementation to treat individuals with PHGDH deficiency [75], ALS [76], and Alzheimer's disease [77], along with our findings on downregulation of the Ser biosynthetic pathway and the recovery of mitochondrial function by using AKG precursors, raise the possibility of introducing precursors of AKG, including Ser supplementation, as a ready-to-use intervention during pregnancy to minimize the risk of children developing atypical neurobehavioral and emotional phenotypes or, later in life, the occurrence of the neurodegenerative disease FXTAS.