Predictive Modeling of Alzheimer’s and Parkinson’s Disease Using Metabolomic and Lipidomic Profiles from Cerebrospinal Fluid

In recent years, metabolomics has been used as a powerful tool to better understand the physiology of neurodegenerative diseases and identify potential biomarkers for progression. We used targeted and untargeted aqueous, and lipidomic profiles of the metabolome from human cerebrospinal fluid to build multivariate predictive models distinguishing patients with Alzheimer’s disease (AD), Parkinson’s disease (PD), and healthy age-matched controls. We emphasize several statistical challenges associated with metabolomic studies where the number of measured metabolites far exceeds sample size. We found strong separation in the metabolome between PD and controls, as well as between PD and AD, with weaker separation between AD and controls. Consistent with existing literature, we found alanine, kynurenine, tryptophan, and serine to be associated with PD classification against controls, while alanine, creatine, and long chain ceramides were associated with AD classification against controls. We conducted a univariate pathway analysis of untargeted and targeted metabolite profiles and find that vitamin E and urea cycle metabolism pathways are associated with PD, while the aspartate/asparagine and c21-steroid hormone biosynthesis pathways are associated with AD. We also found that the amount of metabolite missingness varied by phenotype, highlighting the importance of examining missing data in future metabolomic studies.


Introduction
Alzheimer's (AD) and Parkinson's (PD) disease are the two most common neurodegenerative disorders, and the fifth and tenth leading causes of death in the United States among individuals aged 65 or older [1]. Projections using the 2010 US census estimate that there will be approximately 1.2 million cases of PD (among individuals aged 45 or older) and 8.4 million cases of AD by 2030 [2,3]. Currently, there are no therapies that can slow or halt progression of these debilitating diseases. Biomarkers for these diseases could improve our ability to diagnose these conditions at early stages, and track progression of neurodegeneration and the effect of therapies on slowing or halting disease progression.
Recent research has focused on the discovery of biomarkers to identify the presence of these diseases [4,5]. Cerebrospinal fluid (CSF) surrounds the brain and is peripherally accessible, making it a promising medium to use for identifying biomarkers of neurodegeneration in AD and PD [6,7]. The metabolome, consisting of the small molecules (<2000 Da) circulating in an organism, has been shown to vary across many phenotypic traits, including (but not limited to) human longevity [8], cancer [9,10], and type 2 diabetes [11]. Alterations in the metabolome have also been identified in AD versus control subjects [12,13], and PD versus control subjects [14][15][16].
In the present study, we examined whether the CSF metabolome can discriminate between healthy controls, AD, and PD subjects by fitting predictive models for each disease. We compared untargeted aqueous, targeted aqueous, and lipidomic approaches to profiling the metabolome in their ability to discriminate between AD, PD, and control CSF. Using a logistic elastic net regression to classify phenotypes and select relevant features [17], we were able to best discriminate PD versus control in the untargeted metabolite profiles (0.998 AUC). Our models were also able to discriminate AD versus control in untargeted and lipid profiles well (0.81 and 0.75 AUC, respectively). Our model's ability to differentiate between AD and PD in the untargeted metabolomic profile was also excellent (0.976 AUC). We performed univariate analyses to identify potential metabolomic pathways associated with each disease, revealing metabolites and lipids that have been previously identified in association with AD and PD versus controls, as well as some new putative biomarkers. We address statistical methods for handling potential data quality issues related to metabolomics, including corrections for assay drift and an examination of statistically informative missingness. Our work should be validated in follow-up studies to determine if our findings are robust in larger cohorts, and in longitudinal studies to identify if any metabolites are biomarkers for disease progression. To our knowledge, this is the first paper to compare multivariate classification performance of AD and PD between metabolomic profiles.

Results
CSF samples from AD, PD, and healthy control subjects obtained through two different longitudinal studies were analyzed in this study: 85 healthy subjects, 57 subjects with AD, and 56 subjects with PD. Age at time of CSF collection ranged from 20 to 88, with a median age of 65 ( Table 1).
The average age of control subjects was 54 at the time of LP, ranging from 20 to 86 years old, with 48% female and 93% white. The average age of AD subjects at the time of LP was 71, ranging from 52 to 87 years old, with 51% female and 95% white. The average duration of disease of AD subjects was 4.3 ± 2.8 years. Consistent with the diagnosis of AD, the average MMSE score for AD subjects was 21.4 ± 5.6. compared to 29.4 ± 0.9 for controls. Additional cognitive measures for AD and control subjects are included in Table 1.
The average age of the PD subjects was 65, with 9 years average duration of disease at the time of LP. The average levodopa equivalent daily dose (LEDD) was 714 mg/day, where only three PD subjects were not taking dopaminergic medication. The average MDS-UPDRS part III score for motor symptoms was 25, with average Hoehn and Yahr stage of 2, consistent with mid-stage PD patients on a moderate amount of dopaminergic medication. The average MOCA score for the PD cohort was 25.2 ± 3.8. The majority of the cohort had the consensus diagnosis of mild cognitive impairment (64%), while 29% had no cognitive impairment and 7% had dementia at the time of LP. Six PD subjects were carriers of pathogenic GBA gene mutations and two subjects were carriers of the GBA E326K polymorphism. None of the most common pathogenic mutations of leucine rich repeat kinase 2 (LRRK2) G2019S or R1441C/G/H/S mutation carriers were identified in this cohort.
Metabolomic profiling of CSF was performed using untargeted and targeted metabolite methods, as well as lipid profiling. The untargeted, targeted, and lipid profiling yielded 6735, 108, and 1070 features, respectively. These samples came with widely varying degrees of missingness, with 16%, 3%, and 81% of the data missing from each of the profiles, respectively. A summary of subject age and missing data are displayed in Appendix A Figure A1, a flowchart outlining the analysis performed in this study is available in Appendix B Figure A2, and code used to run the analysis can be found in the Supplementary Materials. Table 1. Summary of subject data split by phenotype (top), with cognitive test results (middle), and with additional PD subject information (bottom). The cognitive status for PD was classified at three levels: no cognitive impairment, mild cognitive impairment (MCI), and dementia. Age of onset refers to the age of onset of motor symptoms. GBA refers to the carrier frequency for pathogenic GBA mutations and the E326K polymorphism. A chi-squared test of independence for sex and phenotype reports a p-value of 0.051. A one-way ANOVA of age at time of LP and phenotype reports a p-value of 2.099 × 10 −9 [18].

PCA
We first used the unsupervised method of principal component analysis (PCA) (see, e.g., [19]) to identify the combinations of metabolites responsible for the majority of the variation present in the untargeted metabolomic data. Twenty percent of the variation in the data was explained by the first two principal components ( Figure 1). This variation did not associate strongly with disease (AD, PD, controls), as the projection of the subjects' untargeted profile onto the first two principal components does not show significant separation between subject groups. In a previous analysis on the dataset of only control subjects, we found the first principal component to be associated with age [20]. Untargeted data projected onto the first two Principal Components (PC). Each point represents a subject, colored by their phenotype. Percentages in the axis titles refer to the percentage of variation of the data explained by the respective PC. In addition, 95% confidence ellipses assuming the t-distribution are also plotted. The first two principal components do not clearly separate the disease phenotypes.

Prediction Results
We next used supervised methods to seek explicit combinations of metabolites and lipids that can differentiate AD, PD, and healthy controls. For both its predictive power and interpretable results, we fit binary logistic elastic net regression models on each of these three pairwise phenotype combinations (AD, PD, control), and on three profiles (untargeted and targeted aqueous metabolites and untargeted lipids), using leave one out prediction to classify the disease status for each subject. The overall predictive performance of models fit on the untargeted profiles is displayed via ROC curves in Figure 2. Alternative characterizations of model performance are presented in Appendix E, including precisionrecall curves and examples of threshold-dependent performance metrics. As stated in Section 4.2, age and sex are detrended from each metabolite prior to modeling fitting to better distinguish the signal from the metabolome from the known association of these covariates between AD, PD, and controls. The predictive performance of models fit including age and sex can be found in Appendix F.

Classifying AD against Controls
The predictive accuracy of models classifying AD against controls varied across the profiles, reporting Area Under the ROC Curves (AUC) between 0.5 and 0.7. The untargeted profile yielded the highest predictive accuracy. Recall that age and sex are detrended from each metabolite, so that these models mostly disregard the discriminatory power of age and sex. These models contain less discriminatory power than age and sex alone, as the leave one out performance of classical logistic regression models using only age and sex yields an AUC of 0.734. In addition to the leave one out modeling done to generate these ROC curves, a logistic elastic net model was fit on the full lipid and targeted profiles to get a sense of which metabolites separate AD from controls in our data. Exponentiated coefficients larger than 1.1 or smaller than 0.9 for these models are displayed in Table 2, and represent the odds ratio (OR) of classifying the subject as having AD resulting from a standard deviation increase in the metabolite/lipid. Table 2. Names and Odds Ratios (OR) for targeted metabolites and lipids retained in all five imputations of elastic net models fit on the full data to classify AD patients against controls. The tables are sorted by magnitude and split into positive and negative coefficient tables. Because the features were standardized prior to fitting the models, these ORs represent the expected odds ratio resulting from a standard deviation increase in concentration. Only ORs with magnitude greater than 1.1 or less than 0.9 are shown here. A two standard deviation interval is shown for the OR to quantify variability across the five missing data imputations.

Classifying PD against Controls
The models correctly classify PD patients with near perfect accuracy in the untargeted metabolomic profile, and classify better than chance in the targeted metabolomic and lipidomic profiles. For reference, the leave one out predictions of logistic regression models fit using only age and sex yield AUC of 0.671. Table 3 displays the odds ratios associated with an elastic net model fit on the full targeted metabolite and lipid profiles, akin to Table 2. To test whether this near perfect prediction in the untargeted profile can be attributed to a small subset of metabolites, we proceeded to run a second elastic net regression, following the same procedure described in Section 4.3, but this time removing all metabolites selected by the first elastic net regression model from consideration (fit using all controls and PD subjects). This second elastic net model was still predictive of PD, reporting an average AUC of 0.83 over the five imputations and the leave one out analysis, indicating that the discriminatory power of our model is not limited to a small set of metabolites.
Mutations in GBA are the strongest genetic risk factor for PD, increasing the risk of PD in heterozygous carriers by approximately 5-fold compared to non-carriers [21], and are present in 5% of all PD patients. GBA E326K is also associated with PD risk, albeit with a smaller effect size [22], and both GBA mutations and E326K are associated with more rapid progression of cognitive and motor symptoms in PD [23][24][25]. Our cohort of PD patients included six GBA mutation carriers and two GBA E326K carriers. Due to the small size of both genotype groups, we pooled the six mutation carriers and two E326K carriers to improve power for our analysis. Models classifying the eight GBA variant carriers against non-carriers within the PD cohort do not perform better than chance. We also examined whether PD-related medications could be significantly influencing metabolic profiles, as 53 of the 56 PD subjects were taking dopaminergic medication at the time of LP. However, models predicting LEDD from metabolites had very little predictive power, not performing much better than a naive intercept-only model. Additionally, we performed a reanalysis of the untargeted profile after removing all metabolites within 1 m/z of levodopa, entacapone, and their drug metabolites (dopamine, dihydroxyphenylacetic acid, homovanillic acid, and 3-O-methyldopa, S-adenosyl-L-methionine, norepinephrine, epinephrine). In this reanalysis, we found no reduction in the discriminative performance of our models. Table 3. Names and OR for targeted metabolites and lipids retained in all five elastic net models fit on the full data to classify PD patients against controls. Only ORs with magnitude greater than 1.1 or less than 0.9 are shown here. A two standard deviation interval is shown for the OR to quantify variability across the five missing data imputations.

Classifying AD from PD
In addition to classifying disease subjects versus controls, models were also fit to classify AD versus PD using the metabolome. Similar to PD versus control classifications, the untargeted profile was able to distinguish between the two classes with near perfect accuracy, while the lipid models do not perform significantly better than chance. For reference, the leave one out predictions of logistic regression models using only age and sex yields an AUC of 0.716. The most significant metabolites distinguishing between AD and PD, as determined by analyzing untargeted metabolite profiles, are summarized in Table 4. The corresponding tables for lipids are reported in Appendix I Table A3. Table 4. Names and OR (associated with a standard deviation increase in concentration) for targeted metabolites retained in all five elastic net models fit on the full data to classify PD patients against AD patients. A metabolite with OR > 1 indicates that higher concentration is associated with AD in our models. Only ORs with magnitude greater than 1.1 or less than 0.9 are shown here. A two standard deviation interval is shown for the OR to quantify variability across the five missing data imputations.

Missing Data
We noted that there was a significant percentage of missing features for targeted and untargeted aqueous metabolite and lipid profiles per group. To determine whether there could be any correlation between the pattern of missing features and AD, PD, or controls, we fit a series of models following the procedure described in Section 4.3 with the modifications that all metabolite abundances were replaced with 1's if they were present, and 0 if they were missing. Age and sex were still included in this model to control for their possible associations with missingness [20]. These matrices of missingness indicators (along with sex at birth and age) were still moderately to extremely successful at classifying AD and PD against controls, reporting leave one out AUCs as high as 0.97 (classifying PD against controls in the untargeted profile). The high predictive accuracy at classifying PD against controls in the untargeted missingness indicator profile is moderately robust, as removing all significant metabolites and rerunning the analysis yields an AUC of 0.72. However, the near perfect classification result reported in Section 2.2.2 is not solely a result of differential missingness, as similar performance is achieved after removing all metabolites that were missing in more than 1% of samples. Targeted and lipidomic tables for the full models (following the same format as Tables 2 and 3) are shown in Table 5.
We also found that, when the proportion of missing data between phenotype subjects is large, the controls tended to have the most missingness, and PD subjects have the least missingness (after controlling for age). A series of univariate binary logistic regressions were fit on the untargeted profile to classify each metabolite's pattern of abundance missingness based on phenotype (AD, PD, or age-matched controls). All metabolites with at least one coefficient FDR < 0.05 are displayed in Appendix D Figure A4. Table 5. Names and OR for targeted metabolite abundance missingness indicators in AD classification against controls, lipid abundance missingness indicators in PD classification against controls, and lipid abundance missingness indicators in AD classification against PD, retained in elastic net models fit on the full data. The OR corresponds to increased odds of classifying a patient as having AD/PD/AD associated with a standard deviation increase in metabolite concentration (for left, middle, and right, respectively). Age and sex are not penalized, and are therefore guaranteed to be included in these models. Other combinations of phenotype and profile (i.e., targeted-PD or lipids-AD tables) are not shown because the only retained covariates were age and sex.

Pathway and Set Enrichment Analysis
We used pathway analysis to map significant individual metabolites identified by our analysis to known metabolite networks. To annotate the metabolites obtained in an untargeted method, we used mummichog, which combines the tasks of pathway analysis and metabolite identification using mass:charge, retention time, MS ion mode, along with univariate classification results. As shown in Figure 3, mummichog identified vitamin B3 (nicotinate and nicotinamide) metabolism and alanine and aspartate metabolite pathways with the negative ion mode metabolites classifying PD against controls. No associations were found classifying AD against controls with negative ion mode metabolites at a false discovery rate <0.05. For the positive ion mode detected metabolites, urea cycle/amino group metabolism, aspartate and asparagine metabolism, C21-steroid hormone biosynthesis and metabolism, biopterin metabolism, glutathione metabolism, vitamin B12 (cyanocobalamin) metabolism, vitamin E metabolism, bile acid biosynthesis, drug metabolism-cytochrome P450, and arginine and proline metabolisms were found to be associated with PD at the false discovery rate <0.05. Classifying AD against controls, mummichog identified the C21-steroid hormone biosynthesis and metabolism, sialic acid metabolism, androgen and estrogen biosynthesis metabolism, starch and sucrose metabolism, and hexose phosphorylation pathways at a false discovery rate <0.05.
Using metabolite abundance missingness indicators to classify PD against controls in univariate analysis, we found vitamin E metabolism to be the only pathway associated with the significant metabolites at a false discovery rate <0.05 in the positive ion mode detected metabolites. No associations were found using the negative mode metabolite missingness to classify PD against controls. No associations were found using positive or negative mode metabolite missingness to classify AD against controls. Using significant metabolites from untargeted univariate analysis classifying AD against PD, mummichog identified the ascorbate (vitamin C) and aldarate metabolism pathways in the negative ion mode, and drug metabolism-cytochrome P450 and vitamin E metabolism pathways in the positive ion mode. For the targeted profile, where metabolite identities have already been established, we used Metabolite Set Enrichment Analysis (MSEA), which performed overrepresentation analysis to compare significant metabolites from univariate classification against known metabolites sets compiled from literature. MSEA found no significant relationships using p-values computed from the hypergeometric distribution. Using metabolite sets from the Small Molecule Pathways Database (SMPDB) [26], we find that the targeted metabolites identified in our univariate classification of PD against controls match two out of three metabolites in the carnitine synthesis set, two out of four of the metabolites in the betaine metabolism set, and three out of eight of the metabolites in the methionine metabolism set. A table of the sets with the largest overlap can be found in Appendix G.

Discussion
In this study, we analyzed targeted and untargeted metabolomic and lipidomic CSF profiles to develop possible predictive models for AD and PD. Using elastic net regression models, we were able to classify PD against both controls and AD with a high degree of accuracy using untargeted metabolomic profiles, as well as moderate accuracy for all other models classifying phenotype using the targeted and untargeted metabolome. We found that some metabolites driving these predictive multivariate models, such as alanine and ornithine, have already been reported in the existing literature, providing further support for our results. We also performed univariate analysis coupled with pathway and set enrichment analysis and identified putative metabolomic pathways that have been previously associated with AD or PD. Associations between the presence of missing values and phenotype were also reported, suggesting the need for careful treatment of missing data in future metabolomic studies.
While several similar studies analyzing metabolomic and/or lipidomic profiles have been published, our study has several unique strengths. First, we analyzed both targeted and untargeted metabolomic profiles and were able to distinguish a specific metabolomic signature associated with neurodegeneration in both an unbiased interrogation of the metabolome as well as in metabolites of interest. In addition, most prior studies have analyzed metabolomic or lipidomic alterations in a single disease versus controls. We analyzed metabolomic and lipidomic profiles from both AD and PD in relation to each other as well as compared to controls to identify metabolomic and lipidomic changes that might be specific to each disease versus common to neurodegeneration. We were able to distinguish between the metabolome of AD and PD CSF, supporting growing evidence that neurodegenerative diseases have unique metabolomic "signatures". Finally, we analyzed CSF from living patients rather than from post mortem samples, which may be advantageous in revealing significant alterations in the metabolome. This could be relevant to developing biomarkers of disease progression, rather than alterations that are only detectable in end-stage disease.
The leave-one-out predictive accuracy of our models ranges greatly depending on the classification task and profile used. Models fit using the untargeted profile tended to be the most accurate, classifying PD against controls and PD against AD nearly perfectly, and classifying AD against controls better than chance. We also found that the models achieving near perfect accuracy are not limited to a few metabolites, and that the pattern of missing values alone is enough to distinguish these phenotypes. This suggests that features with missingness may encode important information about phenotype and should be treated with careful consideration in future metabolomics studies. There were a small number of targeted metabolites and lipids that appeared in all leave one out models, which are reported in Appendix J.
Interestingly, the lipidomic profiles of AD and PD subjects were nearly indistinguishable, despite our models being able to classify the lipidomic profiles of PD against controls better than chance. This suggests that there may be common alterations in lipid metabolism in both neurodegenerative diseases. Increasing evidence indicates that alterations in lipid metabolism, particularly ceramide and sphingolipid metabolism, are present in multiple neurodegenerative diseases including AD and PD. This study is one of only a few studies able to compare AD and PD CSF lipidomic profiles obtained at the same time by the same laboratory, reducing alterations due to batch effect. Alterations in lipid metabolism could signal perturbations in multiple biological pathways, including inflammation, mitochondrial function, cell signaling, cell death, endolysosomal trafficking, and exosome biogenesis [27][28][29]. The two lipid species whose abundances have the strongest positive association with AD in the multivariate model are ceramide (24:1) and cholesterol ester (20:1), which are consistent with studies finding an increased abundance of long chain ceramides and cholesterol in AD [30][31][32][33]. A decrease in concentration of four species of phosphatidylethanolamine (PE) were found to be associated with AD in our multivariate model ( Table 2). These bioactive glycerophospholipids have been previously associated with AD and PD [34,35], though there is conflicting evidence of decreased PE levels in AD [36], or unchanged between AD and age-matched controls [37].
Several of the targeted metabolites driving our predictive multivariate models have been previously reported in association with AD or PD. Alanine [16,[38][39][40], kynurenine [14], glycine [41], tryptophan [40,42], xanthine [43], and serine [40,43] have all been found to be associated with PD, and appear in either our multivariate analysis of the targeted profile, or the pathways and set enrichment analysis derived from univariate analysis. Ornithine, an amino acid involved in the urea cycle, is the metabolite most positively associated with PD. It has been reported to be increased in urine and serum from PD patients [16,44,45] but decreased in CSF of PD patients compared to controls [46]. In our sample, ornithine is strongly associated with PD in the untargeted profile as well, as mummichog identified the urea cycle/amino acid group metabolism as the most enriched pathway for classification between PD patients and controls. Alanine is one of the metabolites most positively associated with AD in our multivariate model (Table 1). Increased levels of this amino acid, as well as several others including glycine, methionine, threonine, phenylalanine, and citrulline, have been found to be increased in CSF of AD patients [47][48][49]. Targeted aqueous metabolomic analysis of premortem blood and postmortem brain from individuals with AD and healthy controls found acylcarnitine to be associated with AD and cognitive changes [50]. We found decreased levels of creatine to be associated with AD in our multivariate model, and proteomic studies have found creatine kinase to be associated with AD patients [51,52]. The positive association between fructose and AD in our multivariate model might provide support for the hypothesis in Johnson et al. [53], suggesting that AD can be driven by an overactivation of fructose metabolism.
Our pathway and set enrichment analysis identified several putative pathways previously associated with AD and/or PD, including biopterin metabolism, which is important for dopamine synthesis, and glutathione metabolism, which is critical for antioxidant activity and mitochondrial integrity. Urea cycle and amino group metabolism were also identified, which are important for fatty acid metabolism, and have been implicated in neuronal function and neurodegeneration. Finally, alanine and asparagine metabolism may be important for metabolic control of neurons [54]. The emergence of these pathways from our analysis suggests that our model correctly predicted PD compared to controls based on changes in metabolites and lipids that arise from significant alterations in cell function, and further supports this method in identifying possible biomarkers as well as biological mechanisms underlying disease pathogenesis.

Modeling Limitations
There are several limitations to the current study, which we address here. We focus in particular on the near perfect ability to discriminate PD from both controls and AD, which might be met with some skepticism. First, it is possible that our results could be distorted by the effects of dopaminergic medication on the metabolome. While the LEDD was recorded for most PD subjects, it was not used in this study because non-zero LEDD itself is a nearly perfect predictor of PD status. Levodopa treatment has been shown to impact kynurenine, tryptophan, and tyrosine metabolisms [4], which could explain several of the associations found in MSEA (Table 5) and multivariate analysis ( Table 2). That said, there is evidence to suggest that the classification is not driven solely by the presence of drugs; we find that a Gaussian elastic net model is unable to predict LEDD better than chance in the PD cohort, and we also find that removing all untargeted metabolites within 1 m/z of levodopa, entacapone, and their drug metabolites does not change the near-perfect discrimination of PD against controls.
Another potential confounder driving the near perfect PD classification results could come from differences in sample processing. The PD samples were collected by a different study and researchers (Pacific Udall Center) than the AD and control samples (VA MIRECC; see Materials and Methods), and our models were able to classify PD against both AD and controls with high degrees of accuracy. It is possible that, despite strict adherence to the same protocol, there were still differences in sample collection and processing between the two studies that our models were able to detect. One important limitation of our PD analysis is that CSF was extracted at an earlier date for the AD and control subjects than for the PD subjects ( Figure A3), and there is some evidence to suggest that prolonged storage time can lead to significant alterations in metabolomic profiles [55]. Determining the relationship between metabolomic profiles and storage time is an active area of research, and we refer to Section 2.5 of [56] for a review of the current literature. However, the exploratory analysis presented in Figure A8 provides evidence to suggest that batch effects are not the sole source of predictive power in our models. In short, the metabolites distinguishing AD and PD are distinct from those distinguishing AD and controls, suggesting that the sample extraction date is not the sole confounder driving these predictions. In addition, studies such as Luan et al. [42] and Willkommen et al. [15] achieve similar performance in distinguishing PD from controls, which indicates that it is possible that the nearly perfect classification in our models is due to a true signal in the metabolome. Furthermore, we note that this analysis is meant to explore the predictive power of the metabolome to distinguish phenotype without relying on known associations with age and sex. As such, the choice to detrend and exclude sex and age in the elastic net models was a conservative choice that shrinks the possibility that the predictive performance and metabolites retained were dependent on age and sex. An approach that explicitly includes age and sex would likely generate more accurate predictions.
Relatedly, the pattern of missing values in metabolomic and lipidomic profiles differed between PD and controls but not between AD and controls. A re-analysis of PD and control metabolite profiles suggests that differential missingness is not solely responsible for the predictive performance of our models; after removing nearly all metabolites that were missing in either or both groups, we still found significant differentiation between PD and controls.
More generally, a larger sample size and further investigation using experimental model systems are required to test whether the discriminatory features that we identified between AD, PD, and controls may have causal biological relationships. In particular, our exploratory analysis of the models distinguishing PD from controls shows that there is a multiplicity of good models, suggesting that the results shown in this study are neglecting potentially important predictors. While the elastic net classification models used in this study are powerful tools for interpretable high dimensional analysis, performing statistical inference (e.g., obtaining p-values for individual coefficients) in this setting is an active area of research and is not yet well understood (see [57] for a recent and thorough treatment of this topic). As such, the results of this analysis should be seen as exploratory and as potential directions for future work, rather than making strong inferential claims on its own. Additionally, the null result of being unable to classify GBA variant carriers against non-carriers within the PD cohort is likely due to the small sample size. Specific lipidome alterations in GBA variant carriers with PD have been identified in CSF [58], but these observations were not replicated in our study.
A third limitation of the study is the cross-sectional nature of our data. The metabolome is known to be dynamic and volatile, so patterns identified in a cross-sectional sample might not generalize well. Longitudinal studies, similar to those done by Huo et al. [50] and LeWitt et al. [43], will be needed to determine whether the discriminatory features that we identified between PD, AD, and control CSF are not only diagnostic, but change over time to reflect disease progression.
In conclusion, we identified a number of metabolites and developed an elastic net regression model that can distinguish between AD, PD, and healthy control CSF metabolite profiles, as well as PD versus control lipidomic profiles. Our study is one of the few studies to examine targeted and untargeted aqueous metabolite profiles as well as lipidomic profiles in the two most common neurodegenerative diseases, AD and PD, as well as healthy controls. While our study has some limitations, including small cohort size and lack of longitudinal data, our analyses suggest several directions for future investigation of lipidomic and metabolomic profiles as markers of disease state and progression.

Materials and Methods
Our analysis was based on cerebrospinal fluid (CSF) samples of 85 healthy subjects, 57 AD subjects, and 56 PD subjects, previously utilized and described in [20]. Sex at birth and APOE variant is known for all subjects, while Glucosylceramidase Beta variant (GBA) and Levodopa Equivalent Daily Dosage (LEDD) (calculated following [59]) are also known for the PD cohort. All samples were collected under University of Washington or VA Puget Sound Health Care System IRB-approved protocols. Written informed consent was given by all participants, or their legally authorized representative, before any study procedures took place. The VA Northwest Mental Illness Research, Education, and Clinical Center (MIRECC) Sample and Data Repository provided the samples for AD and control subjects, while the Pacific Udall Center provided the PD samples used for this analysis.
Samples were collected while subjects were in the fasting state between 0900-1100 h, after the subjects were in supine bedrest for at least 40 min. CSF was extracted via negative pressure using a Sprotte 24 g atraumatic spinal needle, while subjects were placed in the lateral decubitus position. At the bedside, CSF was collected into sterile polypropylene syringes, placed into 0.5 mL aliquots in polypropylene cryotubes, and frozen immediately on dry ice. All samples were stored at −80 • C prior to assay. Figure A3 displays the distribution of the date of LP draws. All participants with AD met the NINDS-ADRDA criteria for probable AD. A summary of the neuropsychological testing results is available in Table 1.
All PD patients were enrolled in the Pacific Udall Center Clinical Core [60]. All PD subjects underwent physical, neurological, and neuropsychological assessments. which are summarized in Table 1, and described in detail in [61]. Cognitive and motor diagnoses were made at consensus conferences, which included at least two movement disorders specialists, a neuropsychologist, and study support personnel. For more detailed information about sample collection for the PD cohort, see [62].

Metabolomics
Details of metabolite processing can be found in previous analyses [20,63]. In brief, we used a LC-MS/MS platform to carry out both global and targeted metabolomics analysis. An Agilent 1200 LC system and a 6520 Q-TOF mass spectrometer (Agilent Technologies, Santa Clara, CA, USA) were used to perform global metabolomic analysis. We prepared 200 µL CSF samples by dissolving in 1000 µL methanol before vortexing and incubating at 20 • C. After centrifuging at 18,000 g for 20 min, we collected 750 µL of the supernatant, which was dried and reconstituted in a 100 µL solution of 40% H 2 O: 60% acetonitrile. We then analyzed the samples in positive and negative ESI modes using 5 µL and 10 µL injection volumes, respectively. We performed the chromatographic separation using a Waters XBridge BEH Amide column (15 cm × 2.1 mm, 2.5 µm) (Waters Corporation, Milford, MA, USA) heated to 40 • C, using a flow rate of 0.3 mL/min, 10 mM NH 4 HCO 3 in 100% H 2 O (mobile phase A) and 100% acetonitrile (mobile phase B) with a gradient of 95-10% B from 0 to 5 min, 10% B from 5-40 min, 10-100% B from 40-45 min, and 100% from 40-70 min. We calibrated the Q-TOF/MS before each batch run, and achieved mass accuracy of <1 ppm using a G1969-85000 Agilent Technologies tuning mixture. We then used Agilent Mass Hunter and Mass Profiler Professional to process the MS data. We used an m/z scan range of 100-2000 and acquisition rate of 1.0 spectra/s. For targeted analysis, a Sciex 6500+ LC-MS/MS system (Framingham, MA, USA) was used to target 203 known metabolites across more than 25 metabolomic pathways, along with 33 stable-isotope interal standards from Cambridge Isotope Laboratory, Tewksbury, MA, USA. We injected 2 µL of each sample for positive ESI ioniziation mode analysis and 10 µL for negative mode. In both cases, we conducted chromatographic separations using hydrophilic interaction chromatography, employing a Waters XBridge BEH Amide column as before, with flow rate of 0.3 mL per minute, a column compartment temperature of 40 • C, and autosampler temperature of 4 • C. We used 5 mM ammonium acetate in H 2 O with 0.5% acetic acid and 0.5% acetonitrile (Solvent A) and acetonitrile with 0.5% acetic acid and 0.5% water (Solvent B) to compose the mobile phase. For both negative and positive ionization modes, we used the following gradient: isocratic elution of 10% A for 0-1.5 min, a linear increase of A to 65% from 1.5-9 min, constant 65% A from 9-14 min, and 10% at 15 min. We used authentic commercially obtained compounds (Fisher Scientific, Pittsburg, PA, USA or Sigma-Aldrich, Saint Louis, MO, USA) to develop the assays, and MultiQuant 3.0.2 was used to obtain metabolite concentrations, as described previously [64].
Details of lipidomic analysis have been described previously [20,65,66]. For lipidomic analysis, we added 54 isotope labeled internal standards (spanning 13 lipid classes) purchased from Sciex. We thawed the frozen CSF samples at 25 • C for 30 min, then vortexed and transferred 100 µL of CSF to a borosilicate glass culture tube. We extracted lipids using dichloromethane/methanol, then concentrated the extracts under nitrogen, and reconstituted the samples in 100 µL of 10 mM ammonium acetate in dichloromethane:methanol (50:50). We analyzed the resulting lipids using the Sciex Lipidyzer platform: specifically, we used a Shimadzu LC and AB Sciex QTRAP 5500 MS/MS system, where differential mobility spectrometry (DMS) is conducted using SelexION. We then used multiple reaction monitoring to quantify lipids in both positive and negative ionization modes, with and without DMS, before processing the data using Sciex Analyst 1.6.3 and Lipidomics Workflow Manager 1.0.5.0. The Lipidizer platform outputs an abbreviated lipid annotation that does not exactly match the LIPID MAPS shorthand annotation [67]. Table S1A of [66] provides a table of translation between the two shorthand notations.
As described in [64], we monitored instrument performance over long periods and assisted with normalization by running both a pooled study Quality Control (QC) sample and a commercially pooled lab QC sample once every 10 study samples, and at the beginning and end of all sample batches. The pooled study QC sample is made by combining 10-20 µL of each sample, which are prepared as usual alongside the study samples.

Preprocessing
Samples were prepared for assay in seven batches of 28 or 29 samples each. As described in [20], batch effects are reduced using the finite selection model from [68] to balance age, sex, APOE status, and disease status across the batches. An XGboost model (a nonparametric regression method implemented in the 'xgboost' R package) was fit separately on each metabolite using run index, age, and sex as covariates. Cross validation was used to select tuning parameters for these models [69]. The residuals of this procedure were used for the remainder of the analysis, which reduces the likelihood that the predictive performance of our models solely reflects metabolite drift over time, or the known associations of age and sex with both AD and PD [70][71][72]. The inclusion of run index in this process supplements the QC sample normalization done in the metabolite processing, and can be helpful, especially when considering metabolite drift between QC runs. The benefit of using a flexible tree-based method like XGboost is that it is capable of handling and correcting nonlinear and discontinuous metabolite drift. Examples of this drift correction procedure in our dataset are shown in Supplementary Figure 2 of [20].After this correction, each metabolite was centered and scaled to have unit variance. Within each feature, abundances further than three Median Absolute Deviations (MAD) were removed and treated as missing values to limit the influence of unreasonably small or large abundances, which are most likely the result of processing noise. To limit the impact of missing data imputation on the analysis, we excluded features with more than 50% missingness from this analysis. The remaining missing abundances were imputed using multiple imputation, whereby multiple (in our case, 5) versions of imputed data are created. Identical analysis is done on each of the five imputed datasets, which provides a measure of the variability induced by the missing data procedure. The Amelia package in R [73] was used to perform this procedure, with details of the implementation to accommodate large metabolomic datasets available in [20]. Missing data imputation was done within each cross validated loop (described in Section 4.3) to avoid the imputation depending on information from both the training and testing sets.

Regression Modeling
To model the extent to which our data discriminate AD and PD against controls, we fit regularized logistic regression models, which extend the classic logistic regression model to allow for the number of features in our dataset to exceed the number of rows. In particular, we used elastic net regression, which fits Here, α is a value between 0 and 1 which determines the type of regularization, while λ is a non-negative value that determines the total amount of regularization. We used the R package glmnet to fit these models, choosing α = 0.5 and selecting the value of λ that resulted in the smallest deviance in leave one out cross validation [17]. This choice of α allowed for variable selection without imposing a strict upper limit on the number of selected variables. To mitigate the effects of class imbalance, observations were weighted by inverse proportion of their class label. For interpretability, in Tables 2-4, we show exponentiated coefficients for models fit on the full data (for the targeted and lipidomic profiles) to classify AD against controls, PD against controls, and AD against PD, respectively. Because the covariates are standardized, these exponentiated coefficients correspond to the expected odds ratio (OR) resulting from a standard deviation increase in metabolite/lipid concentration.
To estimate out of sample predictive power, leave one out prediction was performed, i.e., missing data imputation and regression modeling were iteratively done on the training data after removing a single observation, and a prediction was formed on the left out observation. Rather than imputing missing values on a left out observation, missing features were removed from that iteration's test set, using a process known as reduced feature modeling [74]. These left out predictions were used to create Receiver Operating Characteristic (ROC) curves and compute Area Under the ROC Curve (AUC). To summarize the performance of our models, we compute the average AUC value across all five missing data imputations for each leave-one-out iteration.

Pathway Analysis
To perform pathway analysis and metabolite set enrichment analysis, we construct univariate classification models by separately regressing each metabolite against disease status, which allows for correlated and collinear metabolites to be included in their respective pathways. For univariate untargeted metabolomic analysis, we used logistic regression models to perform the classification, and then inputted mass:charge ratios, retention times, t-values, and Benjamini-Hochberg corrected p-values into Mummichog via a Python script [75] to perform pathway analysis. Positive and negative ionization modes were analyzed separately.
For univariate targeted analysis, we used the same logistic regression model technique, and inputted the names of all metabolites with false discovery rate <0.05 into Metabolite Set Enrichment Analysis (MSEA), which performed hypergeometric tests of associations with known metabolite sets [76]. To perform MSEA on the targeted dataset, we used the open source MetaboAnalystR package [77].

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:    Figure A6. Cont. Figure A6. ROC curves displaying leave one out predictive accuracy of models including age and sex rather than detrending each metabolite. Age and sex are not penalized in the elastic net procedure to reflect the known associations between age and sex with AD and PD. As expected, we generally find that the predictive performance of these models is slightly better than those presented in the main analysis. Tables   Table A2. Metabolite Set Enrichment Analysis: Output of MSEA on the targeted profile based on metabolites significant for PD classification in univariate logistic regression. Set are the pre-defined metabolite sets in the SMPDB (left) and CSF disease library (right). Total is the number of metabolites in these sets, hits is the number of metabolites shared between the pre-defined sets and our input list, and expected is the expected number of metabolites shared by random chance, using the cumulative hypergeometric distribution.

Appendix H. Removing Batch Effects for PD vs. Control Analysis
Batch effects and other sources of unwanted variation are often a problem in metabolomic analyses [78]. As mentioned in the Discussion, the result of near perfect classification of the PD against controls in this study is largely limited by potential artifacts introduced by the processing differences between the PD samples, compared to the AD and control samples. For example, CSF was collected from the AD and control subjects much earlier than the PD samples, which makes it difficult to disentangle the impacts of disease phenotype on metabolite profiles from the impacts of long-term storage time. To explore the effect of storage time in our dataset, we isolated just the control subjects, whose samples were collected and stored over a period of 7 years (see Figure A3). Then, we used univariate linear models to regress each metabolite against storage time, and extracted p-values for the test that the slope is zero. We find that of the 41 targeted metabolites reported in Table 3; nine of them had a relationship with storage time (i.e., had a BH-corrected p-value less than 0.05). However, five out of nine of these metabolites exhibited the opposite relationship with time compared to what we would expect if storage time was the primary driver of the predictive power in the main analysis. For example, Table 3 reports that PD subjects in our dataset had decreased concentrations of HIAA compared to controls. If storage time was the true source of this association, then we would expect HIAA to have a negative relationship with storage time (because PD samples were collected after control samples). However, HIAA is positively associated with storage time, suggesting that storage time alone is not the sole driver of our predictive models. The four metabolites whose relationship with PD is consistent with their relationship with storage time are cystine, xanthosine, glycocyamine, and 4-aminobutyric acid. We find that removing these four metabolites from our predictive models does not change the classification performance reported in Section 2.2.2, still retaining an AUC around 0.8. Future work is needed to further study the impact of long term storage time on metabolomic profiles, but this exploratory analysis of the targeted profiles of control subjects over a 9-year span provides weak evidence that our findings are not driven solely by storage time. However, there are other potential statistical artifacts that might have been introduced by processing differences between batches, and most of them cannot be measured or corrected for in our dataset. In the remainder of this section, we explore methods of mitigating these concerns by leveraging our unique situation of having three different disease phenotypes but only two batches.
The first approach involves retroactively comparing predictions from the models presented in Section 4.3, taking advantage of the following line of reasoning: if the primary signal driving our model predictions is batch effects, then a model fit to classify AD against PD should tend to classify control subjects as AD, and a model fit to classify PD against controls should tend to classify AD subjects as controls (because AD and control subjects were processed together). We find that this is not the case in our data-the class probabilities in both cases are centered near 0.5, which suggests that our models are picking up more signal than just the batch effect. The second approach instead attempts to remove batch variation before fitting a model. Here, the goal is to fit a model to classify PD from controls using data which has batch effects removed. To remove batch effects from the metabolite concentrations, we add a preprocessing step to our methods to remove any signal in the dataset that classifies AD from PD. In particular, we do the following in each cross-validated loop: for each observation (control or PD): 1.
Fit an elastic net regression model to classify on AD/PD to get coefficient vector β. If the left-out observation is a PD subject, exclude it from this model.

2.
Orthogonalize all control and PD observations ({x i } n i=1 ) by the coefficient vector, computing Use the orthogonalized data to fit a model to classify PD from controls (using the n − 1 observations) 4.
Use the model to predict the class probability on the left out observation Intuitively, the orthogonalized dataset resulting from this procedure is unable to distinguish differences between AD and PD. This includes true metabolomic signal separating the two neurodegenerative diseases, but, more importantly, it includes processing differences between the Control/AD batch and the PD batch. Fitting models on this new orthogonalized dataset, we find that we mostly lose the ability to classify AD (AUC between 0.5-0.7 across the different profiles and imputations), but still retain some predictive power to classify PD from controls (AUC between 0.7-0.9). Figure A7 displays the predictive accuracy of these models fit on the orthogonalized targeted profiles. To evaluate the effectiveness of the orthogonalization, we adapt ideas from [78], comparing the results of univariate logistic regressions fit to classify PD from controls, before and after orthogonalization. We find that, of eight differentially abundant metabolites in the targeted profile before orthogonalization, four of them remain differentially abundant after orthogonalization (here, differentially abundant refers to BH-corrected p-values <0.05). These metabolites are 2-aminoadipic acid, 6-methyl-DL-tryptophan, adenosyl-L-homocysteine, and lysine. For the other profiles, the orthogonalization process makes minimal changes to individual metabolites, hence univariate tests produce identical results before and after orthogonalization. Figure A7. ROC curves displaying leave one out predictive accuracy of models fit on the targeted profile after the batch orthogonalization procedure. We find that these models are largely unable to distinguish AD from controls after this procedure (a), and are still able to distinguish PD from controls (but with less predictive accuracy compared to the results in the main analysis) (b). Figure A8. Comparison of univariate p-values before and after orthogonalization, using univariate logistic regressions fit on the targeted profile to classify PD against controls. p-values are Benjamini-Hochberg corrected within each dataset, with a − log 10 transformation applied. The y = x line is also drawn, with values above the line representing metabolites which are more differentially abundant after orthogonalization. Points are colored according to false discovery rate cutoffs of 0.05, indicating whether a metabolite is considered differentially abundant in both analyses, only one analysis, or in neither analysis. Tables for Discriminating AD from PD   Table A3. Names and OR (associated with a standard deviation increase in concentration) for lipids retained in all five elastic net models fit on the full data to classify PD patients against AD patients. A metabolite with OR > 1 indicates that a higher concentration is associated with AD in our models. Only ORs with magnitude greater than 1.1 or less than 0.9 are shown here.

Appendix J. Metabolites Appearing in all Leave One Out Models
To make the most of the limited size of our dataset, discriminative performance was estimated by fitting a separate model to each sample for each of the five missing data imputations, as described in Sections 4-4.3. This results in 5n models for each sample in each profile, where n is the total number of samples used for that model. For example, there are 56 PD patients and 85 controls, so there are 5 * (56 + 85) = 705 models fit to discriminate PD against controls using the targeted profile. Here, we list all targeted metabolite and lipids that appear in all 5n models. We note that there were no untargeted metabolites that appeared in all models, which explains their exclusion from these models. The average odds ratios across these models are reported to describe the sign and relative weight of these features in their respective profiles. A two standard deviation interval is also included to quantify uncertainty across leave one out iterations. The first phenotype listed in each table represents the "1" class in the odds ratios, e.g., threonine has a mean odds ratio greater than 1 in the PD vs. control table, meaning that higher concentrations of threonine were consistently found in PD patients in our data.