Cross-Platform Evaluation of Commercially Targeted and Untargeted Metabolomics Approaches to Optimize the Investigation of Psychiatric Disease

Metabolomics methods often encounter trade-offs between quantification accuracy and coverage, with truly comprehensive coverage only attainable through a multitude of complementary assays. Due to the lack of standardization and the variety of metabolomics assays, it is difficult to integrate datasets across studies or assays. To inform metabolomics platform selection, with a focus on posttraumatic stress disorder (PTSD), we review platform use and sample sizes in psychiatric metabolomics studies and then evaluate five prominent metabolomics platforms for coverage and performance, including intra-/inter-assay precision, accuracy, and linearity. We found performance was variable between metabolite classes, but comparable across targeted and untargeted approaches. Within all platforms, precision and accuracy were highly variable across classes, ranging from 0.9–63.2% (coefficient of variation) and 0.6–99.1% for accuracy to reference plasma. Several classes had high inter-assay variance, potentially impeding dissociation of a biological signal, including glycerophospholipids, organooxygen compounds, and fatty acids. Coverage was platform-specific and ranged from 16–70% of PTSD-associated metabolites. Non-overlapping coverage is challenging; however, benefits of applying multiple metabolomics technologies must be weighed against cost, biospecimen availability, platform-specific normative levels, and challenges in merging datasets. Our findings and open-access cross-platform dataset can inform platform selection and dataset integration based on platform-specific coverage breadth/overlap and metabolite-specific performance.


Introduction
Metabolomics can characterize the global biochemical activity of a biological entity as it is shaped by external factors including lifestyle and drug exposure [1,2]. In parallel, the etiology of psychopathologies is shaped by complex interactions between intrinsic biological features and external factors, which is exemplified in posttraumatic stress disorder (PTSD) [3]. The temporal sensitivity of metabolomics is being leveraged to define environmental influences, pathological mechanisms, and drug-modified clinical states in psychiatric research [4,5]. Thus, metabolomics can provide valuable insights into the discovery of biological markers of disease states and advance precision medicine efforts through integration with other 'omics [6,7]. This can be especially impactful for psychopathologies such as PTSD, which relies on subjective symptom reporting for diagnosis and has no biological markers approved by the U.S. Food and Drug Administration (FDA). Yet, metabolomic characterization of the disease is complicated by extensive variations in metabolomic techniques, including platform-specific metabolite coverage, detection of unknowns, precision and accuracy of measurements, and measurement variability over time, which can complicate longitudinal study designs [8]. Here, we took an empirical approach to identify the best fit-for-purpose metabolomics platform for characterizing pathological mechanisms or biological disease markers, with a focus on PTSD.
Metabolomics techniques are rapidly advancing but remain highly variable. Nevertheless, techniques can be organized into two main categories: (1) absolute quantitative (targeted) approaches and (2) relative quantitative (untargeted/discovery-based) approaches [9]. Targeted approaches enable absolute quantification of a defined set of metabolites, thereby facilitating longitudinal data collection; however, one inherent caveat is the narrowed scope of metabolite detection [10]. In contrast, discovery-based approaches can provide the most comprehensive metabolome coverage; however, present coverage from any single analytical platform is only a fraction of the 19,174 metabolites detected thus far in blood [11]. One strategy to overcome the limited coverage of any individual metabolomics assay is to combine datasets from different analytical platforms, but obstacles to merging datasets include the use of relative measurements rather than absolute quantitative measurements. For both discovery-based and targeted approaches, the size, type, and number of metabolites captured as well as measurement precision are shaped by sample preparation and analytical methods (i.e., metabolite extraction, separation, and ionization).
Studies comparing coverage and analytical performance across discovery-based and targeted metabolomics approaches have focused on metabolite coverage and intra-assay performance [8,12]. Virtually all platform comparisons report highly platform-specific metabolite coverage, with several citing the more expansive metabolite coverage of discovery-based approaches as an advantage [12,13]. Previous cross-platform comparisons have supported superior precision in targeted approaches [10], although metabolite measurements from targeted and untargeted platforms can be highly correlated [8,10,12,14,15]. To the best of our knowledge, the current study is the first evaluation of inter-assay performance across both discovery-based and targeted metabolomics approaches.
Here, we evaluated intra- and inter-assay performance in a longitudinal assessment of discovery and targeted commercial vendors that are capable of high-throughput analysis, which is necessary for characterizing metabolomic signatures of heterogeneous psychiatric disorders [16]. Given that discovery and targeted approaches differentially resolve methodological trade-offs to optimize distinct metrics, we leveraged several metrics relevant to biomarker discovery, including metabolite coverage and performance metrics in line with prior cross-platform comparisons [8,12]. To understand how to optimize metabolomics investigations in psychiatric research, we focused on PTSD as an exemplar of complex interactions between intrinsic biological features and external factors that precipitate psychopathologies [3]. We first mapped metabolites previously implicated in PTSD with a review of prior case vs. control metabolomics studies. Understanding the breadth of metabolomics efforts in PTSD enabled us to evaluate platform-specific coverage for PTSD-associated metabolites. In parallel, we evaluated the comprehensive platform-specific performance for all metabolites reported by each vendor. Finally, we considered the performance for PTSD-associated metabolites and limitations of the scope of metabolomics approaches applied thus far in PTSD. Moving forward, the cross-platform metabolomics database generated by this study can enable optimization of platform selection for other clinical states through mapping of performance for putative metabolites and weighing these details against coverage depth.

Metabolites Affected in PTSD
To contextualize metabolomics efforts for PTSD in the broader landscape of psychiatric research, we mapped the use of metabolomics technologies for several psychopathologies (Figure 1; Supplementary Materials and Methods). Understanding the breadth of metabolomics techniques that have been applied to a disease state is informative because analytical methods dictate metabolite coverage, and previous efforts have found overlap in coverage across techniques as low as 27% [13], 15% [10], or 7% [12]. We found that evidence of metabolomic differences in published PTSD studies is accruing, but PTSD remains underexplored compared with other psychiatric conditions (Figure 1A,B,D). The techniques applied to PTSD have yielded coverage for only a fraction of the currently measurable portion of the human metabolome, and sample sizes have been low compared with efforts in other psychiatric disease states. For studies of PTSD, exploratory group sizes have included 20 [17], 34 [18], 50 [19], 52 [20], and 77 [7] cases. In parallel, cross-cohort designs have been applied for other psychopathologies, leveraging data from 5283 cases [21] or 22,623 participants [22]. A long-standing limitation for cross-cohort designs has been the complexity of integrating metabolomics datasets generated through different approaches. Although studies in PTSD have focused on discovery-based approaches, a number of MS techniques have been utilized, which suggests that the integration of currently available metabolomics data in PTSD would be challenging [23] (but see the integration of cross-cohort epigenetics with metabolomics in one cohort in [24]).
To date, published case-control comparisons have provided valuable insights into the diverse metabolite classes that are likely to be relevant in PTSD (Supplementary Table S3). Classes with the highest count of implicated metabolites include amino acids, carnitines, essential fatty acids, glycerophospholipids, and glycerophosphocholines, with some discrepancy in the direction of effects for specific metabolites, e.g., citrate [7,17,19,20,25]. Currently, overlap in the findings across PTSD populations has been largely restricted to metabolite classes rather than individual metabolites, to the degree that even discovery and test groups recruited from the same hospital overlapped for only 2 metabolites out of a total of 33 implicated metabolites [26]. Additionally, most metabolomics studies in PTSD thus far have focused on case vs. control comparisons in male veterans, such that the roles of sex and trauma type require further exploration [7,18–20,25] (but see the efforts across sexes and trauma types in [17,24]). Another key consideration is that the emergence of PTSD requires trauma exposure; therefore, case vs. control studies that do not include longitudinal measurements, including assessments preceding trauma, cannot dissociate predisposing metabolomic features from consequences of trauma and PTSD etiology [24,27].
Low overlap across prior studies in metabolites associated with PTSD may reflect variation in population characteristics and methodologies used, including differences in fasting state, biofluid type and processing methods, time at collection, analytical technology, and data processing. Key methodological differences can be highlighted by comparing the landmark case vs. control study by Karabatsiakis et al. [17] with a more recent keystone metabolomics dataset used in multiple publications [7,25]. Karabatsiakis et al. [17] found 6 metabolite classes affected in individuals with PTSD (55% female; n = 20) compared with controls with varying degrees of trauma exposure (44% female; n = 18). In this study, participants were instructed to have a regular breakfast, and peripheral blood was collected at 10 am ± 15 min. Extracted serum was then analyzed by LC coupled to quadrupole/time-of-flight MS (LC-QToF-MS). By comparison, the second dataset comprises male combat-exposed veterans with PTSD (0% female; n = 83) or without PTSD (0% female; n = 82) [25]. Veterans reported to the laboratory under fasting conditions for peripheral blood collection at 8 am. Extracted plasma was then analyzed by Metabolon, Inc., using UHPLC-MS-MS optimized for basic metabolite species and GC-MS. Additional differences that are likely to shape the two datasets include the prevalence and type of comorbidities, the severity of PTSD in the study population, inclusion of ethnic minorities, use of internal standards, peak annotation methods, and data processing and normalization techniques [25,28,29].
Numerous metabolomics consortia and large-scale collaborative efforts have been convened to establish best practices in metabolomics, including quality assurance and quality control practices [30], the use of standards [31,32], metabolite annotation confidence [33], data processing and statistical approaches [34], and data reporting [35]. These important challenges are beyond the current scope, but it should be noted that fasting state and collection time appear to have differential effects across metabolite classes in both discovery-based and targeted approaches, and reproducibility is suggested to be lower in nonfasting samples than in fasting samples, such that methodological differences add complexity to the landscape of available PTSD metabolomics datasets [36–38]. Identifying the metabolomic signature of PTSD is likely to require future large-scale cohort studies or thoughtful integration of distinct cohorts on the same metabolomics platform, as has been done for depression [21] and Alzheimer's disease [22]. Direct comparison of distinct cohorts could help define population differences derived from factors previously demonstrated to shape PTSD symptomatology, including gender, trauma type (e.g., civilian vs. military trauma), early/cumulative trauma load, race/ethnicity, and education; longitudinal studies could additionally dissociate metabolomic features associated with risk vs. disease etiology [39,40].

Metabolite Coverage across Platforms
Metabolite coverage was highly variable across platforms (Tables 1 and 2); reported metabolite classes were covered by an average of 1.8 vendors, with further vendor-specificity in the metabolites represented within each class. Several classes were represented by only one or few metabolites, emphasizing the expansiveness of the human metabolome, which is actively being charted and exceeds the scope of any metabolomics technology [11]. In addition to variation across metabolomics platforms, there was also variation within platforms across identical shipments (percent change in the number of metabolites between shipment 1 and shipment 2: Biocrates: −15.8%, HMT: 11.3%, Lipotype: 34.5%, Nightingale: 0%, Metabolon: 61.0% (discovery coverage only)). Metabolite classes consistently reported across both shipments are reported in Table 2.
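The two coverage summaries quoted above (percent change in metabolite count between shipments, and the average number of vendors covering each class) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, vendor labels, and toy data are hypothetical, and the percent-change formula (relative to shipment 1) is an assumption about how the reported values were derived.

```python
from statistics import mean


def percent_change(first: int, second: int) -> float:
    """Percent change in reported metabolite count, relative to shipment 1 (assumed convention)."""
    if first <= 0:
        raise ValueError("shipment 1 count must be positive")
    return (second - first) * 100.0 / first


def mean_vendors_per_class(coverage: dict[str, set[str]]) -> float:
    """Average number of vendors reporting each metabolite class.

    `coverage` maps a vendor name to the set of metabolite classes it reports.
    """
    all_classes = set().union(*coverage.values())
    return mean(
        sum(cls in classes for classes in coverage.values())
        for cls in all_classes
    )


# Hypothetical toy data, not the study's actual coverage tables.
toy_coverage = {
    "vendor_a": {"amino acids", "fatty acids"},
    "vendor_b": {"amino acids", "sphingomyelins"},
    "vendor_c": {"amino acids"},
}

print(percent_change(200, 230))              # 200 -> 230 metabolites
print(mean_vendors_per_class(toy_coverage))  # classes covered by 3, 1, and 1 vendors
```

A class-level average of this kind hides metabolite-level differences, which is why the text notes further vendor-specificity within each class.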
Vendor-specific coverage was determined for metabolites previously implicated in PTSD as a case study (Figure 2), but review of metabolomics research in PTSD suggests that current findings are limited by the scope of the metabolome captured and the subpopulations represented (i.e., primarily male, combat trauma, European ancestry). Beyond case vs. control comparisons of the metabolome, efforts are ongoing to characterize the metabolomic signature of disease-linked genetic variants, which could represent a transdiagnostic approach. For example, metabolites associated with a PARK2 variant have recently been mapped, which may inform the biological mechanisms underpinning associations between PARK2 and Parkinson's disease, PTSD, diabetes mellitus, certain cancers, and inflammation ([41]; reviewed in [42]; see Figure 3). Of note, tools to identify metabolites associated with biological pathways and genes of interest are constrained by currently available data and a field-wide need to identify unknowns [23].

Measurement Precision: Intra-Assay and Inter-Assay Coefficients of Variation
Metabolite measurement precision, assessed with technical replicates, was highly variable across and within metabolomics platforms (Tables 1 and 2). Certain classes of metabolites were measured with high intra-assay precision (<10% CV) by all platforms that measured metabolites in that class (e.g., amino acids, hormones/steroids, and sphingomyelins). Conversely, some metabolite classes were measured with low intra-assay precision (>15% CV) in nearly all sample groups for all reporting vendors (e.g., diazines and fatty acyls). Similarly, certain metabolite classes were challenging to detect consistently and were reported in only one of the two sample shipments (e.g., diazines, glycerophospholipids, and keto acids and derivatives). Across all platforms, inter-assay precision was generally lower than intra-assay precision, and several metabolite classes had inter-assay CVs greater than 30%. For example, the piperidine class had inter-assay CVs of 53.4% and 63.2% but intra-assay CVs of 14.8% and 16.3% for the PTSD and control group, respectively. The variability in precision across classes can be exemplified by comparing two classes of interest in PTSD, amino acids and fatty acids [19,20,25]. The amino acids class, which contains 10 metabolites implicated in PTSD, was measured with high to moderate intra- and inter-assay precision for all reporting vendors (CVs of 4.8% to 14.2%). Conversely, for fatty acids, a class encompassing 11 metabolites implicated in PTSD across 4 studies [17,19,20,25], inter-assay CVs ranged from 6.6% to 53.3% across all vendors and sample groups (with coverage differences across vendors). The substantial variability in precision across metabolite classes could derive from a myriad of challenges known to affect measurement quality in metabolomics, such as the frequency of isomers, fragmentation patterns, or ionization efficiency [14,43]. Given the substantial class-specificity in measurement precision, class-specific limitations in current technologies should be considered in the interpretation of available findings and prospective experimental designs to enable the dissociation of a biological signal.
Notes: "High" precision is shown in green (≤10%), "moderate" in yellow (10% < x < 20%), and "low" in red (≥20%).

Measurement Accuracy: Comparison to Known Values in NIST Reference Plasma
The accuracy of metabolite measurements, relative to the reference values for NIST SRM 1950 pooled reference plasma in the NIST certificate of analysis (COA), is provided in Table 3. Assessments of accuracy were constrained by (i) the fraction of classes represented in the NIST COA, (ii) vendor-specific coverage of metabolites, and (iii) the use of relative units, which excluded Metabolon. Accuracy was evaluated for a set of amino acids listed in the NIST COA, which showed roughly similar high or moderate accuracy across platforms. Normalization methods informed by platform-specific normative levels could inform efforts to compare or merge datasets across metabolomics approaches. The majority of metabolites across all platforms were detected with excellent linearity across the dilution curve (i.e., coefficient of determination values near 1), suggesting that abundance is not a core obstacle in current metabolomics technologies (depicted in Supplementary Figure S1).
Notes: "High" accuracy is shown in green (≤10%), "moderate" in yellow (10% < x < 20%), and "low" in red (≥20%). Accuracy was assessed only for vendors that reported quantitative units and not relative units; accuracy was estimated using Shipment 1 data. All percent accuracy values are versus NIST COA values, such that a negative value is below the NIST-provided reference value.
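The accuracy and linearity metrics described above can be sketched as follows. This is an illustration only: it assumes percent accuracy is the signed deviation of a measured concentration from the reference value (consistent with the note that negative values fall below the reference), and it computes the coefficient of determination from a simple least-squares line through the dilution series. Function names and all numeric values are hypothetical, not taken from the NIST COA.

```python
def percent_accuracy(measured: float, reference: float) -> float:
    """Signed percent deviation from the reference value (negative = below reference)."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (measured - reference) * 100.0 / reference


def r_squared(x: list[float], y: list[float]) -> float:
    """Coefficient of determination for a simple linear fit of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # For a one-predictor linear fit, R^2 equals the squared Pearson correlation.
    return sxy ** 2 / (sxx * syy)


# Hypothetical measurement: 90 units against a reference value of 100 units.
print(percent_accuracy(90.0, 100.0))

# Dilution curve: undiluted reference plasma plus 80%, 60%, and 40% dilutions.
dilution_fraction = [1.0, 0.8, 0.6, 0.4]
measured = [50.0, 40.0, 30.0, 20.0]  # hypothetical, perfectly proportional signal
print(r_squared(dilution_fraction, measured))
```

The absolute value of the percent deviation is what maps onto the high/moderate/low categories in the table notes; the sign only records the direction of the bias.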
values and percent accuracy for Biocrates and Lipotype were removed. 4. Figure 4. Reflecting the updates from Table 3, Figure 4b has been edited and a log scale applied for all graphs in the figure to better show the included data points.
The corrected Tables 1-3 and Figure 4 appear below. The authors state that the scientific conclusions are unaffected. This correction was approved by the Academic Editor. The original publication has also been updated.

Discussion
Large-scale cohort studies are increasingly looking to advance multi-omics efforts through the integration of metabolomics, which can provide a summary of cellular activity as well as key insights into drug/toxin exposures [21,22,44]. Here, we evaluated five commercial metabolomics platforms that are frequently applied in large-scale metabolomics studies to characterize the state of current technologies. To do this, we evaluated metabolite coverage, measurement precision, and accuracy in a range of targeted and discovery metabolomics platforms including NMR and MS paired with direct infusion, LC, GC, and CE. This study advances ongoing efforts to evaluate assay performance across biomarker modalities, which has previously successfully identified highly sensitive platforms for measuring inflammatory cytokines [45].
Our findings demonstrated that metabolite coverage overlap across platforms was low, congruent with prior cross-platform comparisons [10,12,13]. Percentage overlap could not be determined due to irreconcilable differences in metabolite nomenclature and vendor-specific database harmonization. Platforms pioneering annotation can provide valuable living datasets; yet merging cohorts across approaches is likely to be impeded in the near future by challenges in marrying nomenclature and platform-specificity in the subspecies represented in a metabolite measurement. At the level of metabolite classes, an average of 1.8 vendors covered each reported class, with further vendor-specificity at the level of the metabolite. The maximum number of reported metabolites, reaching the 80% rule, was ~950, while the current estimated metabolite count in the human body is 114,100, with >19 k of these detected in blood [11]. While comprehensive coverage of the genome and transcriptome is currently possible, incomplete and platform-specific coverage in metabolomics is reasonable given the limited availability of metabolite standards and the extraordinary size and complexity of the human metabolome, which derives from endogenous sources as well as food, environment, microbes, drugs, etc. [10].
Platform-specific coverage is likely to be a challenge for diseases that are in early stages of characterizing mechanistic pathways and mapping putative metabolites, as the set of metabolites implicated in a disease state is dependent upon the breadth of techniques that have been applied. For example, the metabolites currently implicated in PTSD can nearly be comprehensively covered by Metabolon (Supplementary Table S3). However, this is not surprising given that Metabolon's technologies have been applied to PTSD populations [7,20,25], while platforms with distinct coverage, such as HMT, Lipotype, and Biocrates, have yet to be applied in large-scale case vs. control cohorts in PTSD, and could implicate novel metabolites based on their non-overlapping coverage. Given that researchers must weigh the benefits of applying multiple metabolomics technologies against cost, biospecimen availability, and challenges in merging datasets across approaches, there is great utility in vendors linking their internal libraries to prominent metabolite databases actively updated by the academic community. For example, HMDB links metabolites to published literature and key available information, including normative data, biological properties, and known associated diseases. In the future, harmonizing metabolite nomenclature is likely to increase in importance as those at the frontier expand annotation of the human metabolome.
Comparisons of precision and accuracy across absolute and relative quantitative metabolomics approaches must consider trade-offs in coverage and performance inherent in current technologies [8]. As expected, the lowest coefficients of variation were yielded by NMR, with a trade-off in coverage breadth [15,46]. Lower precision was found in lipid classes across platforms, reflecting prior reports that quantification of lipids is difficult due to lipid solubility, incomplete separation during chromatography, and a variety of other challenges [47]. Overall, precision in the final normalized dataset was similar between targeted and untargeted MS approaches. This is consistent with a prior cross-platform comparison which reported that for overlapping metabolites the "quantitative results from the nontargeted assay are largely comparable to data derived from classical targeted assays" [13].
Conversely, within platforms, precision ranged from "low" to "high" across metabolite classes for all vendors, both within an assay and across assays. The variation in precision across metabolite classes likely reflects distinct features that can render certain classes consistently challenging to measure, including isomers, ionization efficiency, fragmentation, and the availability of standards [28,48]. For example, the NMR data varied from 0.88% CV for hydroxy acids to 27.57% CV for fatty acids, in the same plasma samples within the same assay. This degree of variation between metabolite classes was similar for discovery approaches. For example, Metabolon measured several metabolite classes with excellent precision (CVs of 2–3%) but measured amine oxides in duplicate PTSD samples with CVs of up to 45%. Additionally, there were certain classes for which inter-assay drift was high across all vendors; for example, lactosylceramides and glycerophospholipids were in the lowest precision category for both vendors that assessed these classes. The widespread variation in performance across metabolite classes indicates that researchers should weigh the strength of evidence for putative metabolites against the robustness of measurements for that class, and prospective studies should consider measurement robustness when determining sample size.
All metabolomics approaches shared greater inter-assay technical variation compared with intra-assay, including several classes which transition from "high" intra-assay precision to "low" inter-assay precision for some groups/vendors (acylcarnitines, diglycerides, bile acids, organooxygen compounds, etc.). Thus, longitudinal designs would benefit from the inclusion of blinded reference plasma within each assay run to anchor study-specific normalization, as well as maximizing the number of samples within each run (and minimizing batches) to the degree possible. Although prior findings have emphasized greater measurement drift for discovery approaches compared with targeted approaches, this gap appears to be attenuated in the current dataset, likely as a result of the rapidly advancing normalization and data processing procedures conducted by the vendors [10].
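One simple way to use a reference plasma measurement from each run as a normalization anchor is sketched below. This is a hypothetical single-point scaling for one metabolite, offered only to make the idea concrete; it is not the vendors' proprietary normalization, and the function name, sample labels, and values are invented for illustration.

```python
def anchor_to_reference(
    runs: dict[str, dict[str, float]],
    reference_id: str = "reference_plasma",
) -> dict[str, dict[str, float]]:
    """Rescale each run so its reference measurement matches the cross-run mean.

    `runs` maps a run name to {sample_id: measured value} for a single metabolite.
    Every run must include a measurement for `reference_id`.
    """
    reference_values = [samples[reference_id] for samples in runs.values()]
    target = sum(reference_values) / len(reference_values)
    anchored = {}
    for run_name, samples in runs.items():
        factor = target / samples[reference_id]  # run-specific correction factor
        anchored[run_name] = {sid: value * factor for sid, value in samples.items()}
    return anchored


# Hypothetical data: run 2 reads about 20% higher than run 1 for every sample.
toy_runs = {
    "run_1": {"reference_plasma": 100.0, "participant_01": 50.0},
    "run_2": {"reference_plasma": 120.0, "participant_01": 60.0},
}

anchored = anchor_to_reference(toy_runs)
print(anchored["run_1"]["participant_01"], anchored["run_2"]["participant_01"])
```

A single-point correction of this kind only removes proportional run-to-run shifts; drift that differs by metabolite class or by concentration would need a richer model and more reference measurements per run.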
For certain metabolite classes, vendor-reported measurements were consistently lower compared with concentrations in the NIST COA or values reported by other vendors (for example, Figure 4C). There are many possible explanations for systematically lower values in a metabolite or metabolite class: subspecies or isomers may not be captured by a specific approach; additionally, fragmentation, matrix effects, and ion suppression can result in random or systematic errors [15]. These latter challenges may be exacerbated for direct infusion techniques, which do not have elution order to aid in metabolite annotation [14,49]. Systematic differences in the reported concentration of a metabolite or metabolite class may be especially important in light of increasing efforts to combine cohorts to achieve the statistical power necessary for biomarker discovery in complex, heterogeneous disease states [21,22]. Our findings and publicly available dataset provide a novel ability to address platform-specific "normative" levels. To enable the community to leverage the current cross-platform datasets for clinical states beyond PTSD, a platform exploration tool is being developed that will allow users to select specific metabolites of interest and explore coverage and technical variation between platforms. This platform exploration tool is currently undergoing beta-testing and will be launched in mid-2021. A representation of this visualization tool is provided in the Supplementary Materials.

Cross-Platform Comparison Design: Clinical, Control, and Pooled Reference Plasma Samples
Each metabolomics vendor received two identical, blinded plasma shipments containing clinical samples, control samples, and pooled reference plasma (National Institute of Standards & Technology (NIST) Standard Reference Material (SRM) 1950). Each shipment contained 23 samples, identical across shipments, for a total of 46 samples. Metabolomics data from the control and PTSD samples were evaluated separately to account for potential population-specific performance. Given that identifying metabolites affected by PTSD was not the goal of the current study, and the current sample size is far from the scale necessary for the discovery of a disease signature, control and PTSD samples were not directly compared. A complete list of plasma samples included is provided in Supplementary Table S1.
To assess inter-assay variation, identical technical replicates were sent in two shipments; the first shipment was analyzed and data were received prior to sending the second shipment. Shipments were separated by 7–20 weeks depending on vendor analysis timelines, with an average of 14 weeks between the two shipments. The sample run order was block randomized within 5 × 5 sample shipment boxes and was consistent across all platforms and shipments. All plasma technical replicates were aliquoted and prepared in parallel and shipped by the Indiana University Genetics Biobank (IUGB, Indianapolis, IN, USA). IUGB also generated a dilution curve of the NIST SRM 1950 diluted with physiological saline to 80%, 60%, and 40% of the starting concentration ([14]; additional detail in Supplementary Methods). To standardize preanalytical factors, all vendors received plasma aliquots shipped on dry ice, previously stored under the same conditions in blinded 500 µL aliquots and organized in an identical manner, along with a sample manifest that provided the sample bar code, box name, sample position, and specimen type (human plasma). Vendors received plasma samples with an identical number of freeze/thaw cycles within each of the three sample types: PTSD, control, and NIST plasma samples. Vendors within the US received samples with overnight shipping. International vendors received samples through a carrier that monitors and maintains the levels of dry ice for sample integrity throughout the course of shipment. Shipment duration varied by international destination and was generally 2–5 days in the current context.
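A reproducible, block-randomized sample order of the kind described above could be generated along these lines. This is a sketch under stated assumptions: the exact blocking scheme used for the shipment boxes is not described in this section, so the round-robin assignment to blocks, the fixed seed, the function name, and the toy sample labels are all hypothetical.

```python
import random


def block_randomize(
    samples_by_type: dict[str, list[str]],
    block_count: int,
    seed: int = 0,
) -> list[str]:
    """Spread each sample type across blocks, then shuffle the order within each block.

    A fixed seed makes the order reproducible, so the same layout can be reused
    for every platform and shipment.
    """
    if block_count < 1:
        raise ValueError("block_count must be at least 1")
    rng = random.Random(seed)
    blocks: list[list[str]] = [[] for _ in range(block_count)]
    position = 0
    for sample_type in sorted(samples_by_type):
        for sample_id in samples_by_type[sample_type]:
            # Round-robin assignment keeps sample types balanced across blocks.
            blocks[position % block_count].append(sample_id)
            position += 1
    for block in blocks:
        rng.shuffle(block)
    return [sample_id for block in blocks for sample_id in block]


# Hypothetical sample labels; the real shipment composition is in Supplementary Table S1.
toy_samples = {
    "ptsd": ["P1", "P2", "P3", "P4"],
    "control": ["C1", "C2", "C3", "C4"],
    "nist": ["N1", "N2"],
}

run_order = block_randomize(toy_samples, block_count=2, seed=1)
print(run_order)
```

Reusing one seeded order across vendors mirrors the requirement that the layout be consistent across all platforms and shipments, so positional effects are not confounded with platform.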
Use of human subjects was approved by the Stanford Institutional Review Board (IRB) under Protocol #25948. Clinical samples were obtained from 6 veterans with PTSD (3 male and 3 female), recruited through local VAs and assessed at Stanford University. PTSD status was determined by the Clinician-Administered PTSD Scale (CAPS). Blood was collected into vacutainer-K2 EDTA Purple-top tubes (10 mL) in the morning between 8–10 am, with participants instructed to fast overnight.
Control plasma samples were obtained by BioIVT and matched to clinical samples in terms of sex, age, time of day at collection, collection method (vacutainer), and overnight fasting condition (3 male and 3 female; details in Supplementary Table S2). Control blood was obtained with vacutainer-K2 EDTA Purple-top tubes with anticoagulants (6 mL; BD tube #: 367863) in the morning between 8–10 am, with participants instructed to fast. Within 60 min, the whole blood was centrifuged at 1000–13,000× g for 10 min to obtain plasma.

Metabolomics Analytical Platforms
Analytical technologies were mapped for the metabolomics platforms evaluated (Figure 5); approaches included nuclear magnetic resonance (NMR) and mass spectrometry (MS) (discovery and targeted). For MS, a direct infusion technique was included as well as liquid chromatography (LC), gas chromatography (GC), and capillary electrophoresis (CE) separation techniques, and flow injection analysis (FIA). Metabolite coverage and the degree of quantitation vary across platforms based on inherent trade-offs that shape data output. For example, LC-MS is a versatile technique with a broad linear range, but it may be less precise compared with NMR, and LC-MS is blind to metabolites that do not ionize, which are detectable by techniques that do not require ionization [15]. Ionization efficiency varies across lipid classes, as it has been shown to depend largely on the lipid head group, especially in direct infusion methods [14]. Direct infusion can enhance throughput but may cause suppression of low-abundance species [50]. Ion suppression, particularly as a matrix effect, can also be an issue for LC-MS techniques if high-abundance ions suppress the ionization of coeluting ions because of competition between ions [43]. NMR benefits from greater structural information that can aid in metabolite identification, but generally has lower sensitivity, and therefore lower coverage, compared with MS-based approaches [46]. Additionally, overlapping peaks from −CH, −CH2, −CH3 groups can increase quantitative error, despite the high accuracy of NMR [15]. Comprehensive descriptions of each analytical platform have been provided by the vendors, excluding proprietary data processing methods, and are located in the Supplementary Methods.

Metabolites Affected in Posttraumatic Stress Disorder (PTSD)
To understand the scope of metabolomics in psychiatric research and contextualize efforts in PTSD, we conducted a review of current literature to map (i) metabolomics techniques that have been applied and (ii) the number of participants (sample size) in discovery groups. We evaluated PTSD and three psychopathologies with overlapping symptomatology or frequent comorbidity with PTSD: major depressive disorder, traumatic brain injury, and Alzheimer's disease. Metabolomics analytical techniques were categorized as commercial or "in house" academic assay approaches. For the "in house" metabolomics studies, authors did not specify a vendor or core facility, so it was presumed that procedures were run by the academic authors (authors were not contacted). A comprehensive search of the Google Scholar database was conducted for each disease state of interest using the following criteria for inclusion: (i) conducted in humans in the disease state of interest; (ii) published in a peer-reviewed journal; (iii) assessed in a blood-based biofluid; (iv) an original report of empirical metabolomics data; and (v) published in or after 2015. Details extracted included: (1) number of clinical participants; (2) number of healthy control individuals (separated by discovery vs. test sets if applicable); (3) metabolomics analytical technology; and (4) vendor (if applicable).
For PTSD, a search was conducted on 21 July 2020 using the following keywords: ("PTSD" OR "posttraumatic stress disorder") AND "metabolomics". A total of 100 studies were reviewed for inclusion, with 7 studies selected. As PTSD was the focal disease state, this list was updated to include [26], which was published after the initial search date. For major depressive disorder, a search was conducted on 17 July 2020 using the following keywords: ("depression" OR "major depressive disorder") AND "metabolomics". A total of 100 studies were reviewed for inclusion, with 15 studies selected. For traumatic brain injury, a search was conducted on 17 July 2020 using the following keywords: ("Alzheimer" OR "Alzheimer's") AND "metabolomics". A total of 100 studies were reviewed for inclusion, with 10 studies selected. For Alzheimer's disease, a search was conducted on 20 July 2020 using the following keywords: ("traumatic brain injury") AND "metabolomics". A total of 100 studies were reviewed for inclusion, with 24 studies selected. The full list of studies included is provided in the Supplementary Materials.

Metabolite Coverage and the Nomenclature across Platforms
To enable the assessment of coverage overlap across platforms, vendor-reported metabolite nomenclature was collaboratively harmonized to Human Metabolome Database (HMDB) IDs for the identified metabolites, where possible [11]. In some cases, nomenclature could not be matched across vendors, particularly for some lipid classes and reported "unknowns". Levels for unknown metabolites were provided by HMT and Metabolon; unknowns represent analytes in a vendor's internal library that can be consistently measured and may be identified in the future, but currently do not have known identities. Unknowns are beyond the scope of the current effort; however, reported levels for unknowns are provided in the Supplementary Materials. Some vendors (Biocrates, Metabolon, HMT) provided a list of subspecies and structural isomers represented in their metabolite measurements where possible/applicable. This is highly beneficial, as platform-specific capture of metabolite subspecies can result in platform-specific normative levels and can pose challenges for integrating datasets across approaches. Metabolite coverage for a platform can change over time as annotation algorithms are updated. Nightingale was the only vendor to report a change in their algorithms between shipments, reflecting an expansion of their internal dataset (~July 2020). The Nightingale data reported here represent their most up-to-date algorithm (as of the date of publication); for reference, shipment 1 Nightingale data, analyzed with the current and prior algorithm, are provided in the Supplementary Materials. Notably, only metabolite/lipid coverage is addressed here, but Nightingale measures clinical analytes, which may be of great value beyond the current context (e.g., albumin, apolipoproteins, total omega-3 and omega-6 fatty acids, total cholesterol, and VLDL, LDL, HDL, HDL2, HDL3, esterified, as well as free cholesterol). To compare across platforms, coverage overlap was visualized with Venn diagrams [51].
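As an illustration of how coverage overlap can be tabulated once nomenclature has been harmonized to shared identifiers, the sketch below computes the percentage of a target metabolite list covered by each platform and the targets covered by none. The function name `coverage_summary`, the platform labels, and the identifiers are placeholders introduced for this example; they are not real HMDB IDs or vendor panels, and this is not the procedure used to produce the Venn diagrams.

```python
def coverage_summary(platforms, targets):
    """Return (percent of targets covered per platform, targets covered by no platform).

    platforms: dict mapping a platform label to an iterable of harmonized IDs.
    targets:   iterable of harmonized IDs of interest (e.g., disease-associated).
    """
    targets = set(targets)
    if not targets:
        raise ValueError("targets must contain at least one identifier")
    # Percent of the target list that each platform reports
    percent = {
        name: 100.0 * len(set(ids) & targets) / len(targets)
        for name, ids in platforms.items()
    }
    # Targets reported by at least one platform
    covered = set().union(*(set(ids) for ids in platforms.values())) & targets
    return percent, targets - covered


# Placeholder identifiers (hypothetical, for illustration only)
platforms = {
    "platform_A": {"ID1", "ID2", "ID3"},
    "platform_B": {"ID2", "ID4"},
}
targets = {"ID1", "ID2", "ID4", "ID5"}
percent, uncovered = coverage_summary(platforms, targets)
```

Set intersections of this kind only work after harmonization; metabolites whose nomenclature could not be matched across vendors would need to be handled separately.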

Measurement Precision: Intra-Assay and Inter-Assay Coefficients of Variation
Duplicate aliquots from the same parent sample were included in two identical shipments. For each annotated metabolite, the variation between duplicate technical replicates was expressed as a percent coefficient of variation (CV%), both within each shipment, $\mathrm{CV\%} = \sigma / ((a_1 + b_1)/2) \times 100$, and across shipments, representing separate assay runs, $\mathrm{CV\%} = \sigma / ((a_1 + a_2)/2) \times 100$, where $\sigma$ is the standard deviation of the two measurements being compared, the letter denotes the aliquot, and the subscript denotes the shipment.
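The CV% calculation for a pair of replicate measurements can be sketched as follows. This is a minimal example, not the authors' code: the function name `cv_percent` and the numeric values are made up, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def cv_percent(x, y):
    """Percent coefficient of variation for two replicate measurements.

    Assumption: sigma is the sample standard deviation (n - 1 denominator).
    """
    m = mean([x, y])
    if m == 0:
        raise ValueError("CV% is undefined when the mean is zero")
    return stdev([x, y]) / m * 100


# Hypothetical values for one metabolite
a1, b1, a2 = 10.0, 12.0, 11.0
intra_cv = cv_percent(a1, b1)  # duplicates within shipment 1
inter_cv = cv_percent(a1, a2)  # same aliquot across shipments 1 and 2
```

With only two values, the sample standard deviation reduces to |x − y|/√2, so the CV% is a rescaled relative difference between the replicates.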

Measurement Accuracy: Comparison to Known Values in the National Institute of Standards and Technology (NIST) Reference Plasma
For the subset of metabolites with concentrations reported in the NIST Certificate of Analysis (COA, revised June 2020), percent accuracy was determined for metabolites reported by each vendor in the NIST SRM 1950 samples as ((reported value − NIST COA value)/NIST COA value) × 100, using shipment 1 data. For the dilution curve of the NIST reference plasma, containing samples at 100%, 80%, 60%, and 40% of the initial concentration, linearity was assessed by fitting a zero-intercept linear regression to the shipment 1 data in R version 4.0.4. The coefficient of determination ($R^2$) was calculated as a goodness-of-fit parameter for each metabolite reported by each vendor, using all points in the dilution curve. If value(s) were missing for any NIST sample, linearity was not assessed for that metabolite reported by that vendor.
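The two calculations described above can be sketched in Python as follows (the original analysis was run in R; this is an illustrative re-expression, not the authors' script). The function names and the example values are hypothetical. For the regression through the origin, the sketch computes $R^2$ against the uncentered total sum of squares, which is the convention R applies to models without an intercept; whether the reported values follow that convention is an assumption.

```python
def percent_accuracy(reported, reference):
    """Signed percent deviation of a reported value from the reference value."""
    return (reported - reference) / reference * 100


def zero_intercept_fit(x, y):
    """Least-squares fit of y = slope * x (no intercept).

    Returns (slope, r_squared), with r_squared computed against the
    uncentered total sum of squares, sum(y_i ** 2).
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    ss_res = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum(yi * yi for yi in y)
    return slope, 1 - ss_res / ss_tot


# Hypothetical dilution series: percent of initial concentration vs. reported level
dilution = [100, 80, 60, 40]
levels = [50.0, 40.0, 30.0, 20.0]
slope, r2 = zero_intercept_fit(dilution, levels)
```

Because the uncentered total sum of squares is larger than the centered one, $R^2$ values from a zero-intercept fit are not directly comparable with those from a model that includes an intercept.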

Data Analysis and Visualization
Missing data in metabolomics can result from analytical, computational, and biological factors [52,53]. Here, vendor-applied methods for missing data were not modified (i.e., if a vendor imputed missing data, the imputed data were used; if a vendor reported a datum as below threshold or missing, it was omitted, not imputed or substituted). For vendors that imputed data, a pre-imputation version of the dataset is included in the Supplementary Materials where possible. Substitution was not used here because the source of missingness likely differed between analytical platforms, and substitution could therefore have introduced biases, e.g., distortion in data distribution or underestimation of the standard deviation of a variable or group [54]. However, it is important to note that missing values are often imputed in an experimental context with a strategy reflecting the probable source(s) of missingness in the metabolomics platform used [54,55]. Therefore, to avoid unreliable variables with a large proportion of missing values in the absence of substitution methods, the "modified 80% rule" was used, which states that a metabolite should be excluded if non-missing values account for less than 80% of the data for that metabolite in each biological group (control and PTSD in the current context) [54,56].
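The "modified 80% rule" as stated above can be expressed as a simple filter: a metabolite is retained if at least one biological group has non-missing values for at least 80% of its samples, and excluded otherwise. The sketch below is illustrative only; the function name, the use of `None` to mark missing values, and the example data are assumptions made for this example.

```python
def passes_modified_80_rule(values_by_group, threshold=0.8):
    """Return True if any group has >= threshold proportion of non-missing values.

    values_by_group: dict mapping a group label to a list of measurements,
    with None marking a missing value (an encoding assumed for this sketch).
    """
    for values in values_by_group.values():
        if not values:
            continue  # an empty group cannot satisfy the rule
        present = sum(v is not None for v in values)
        if present / len(values) >= threshold:
            return True
    return False


# Hypothetical metabolite with 4 of 5 values present in controls: retained
kept = passes_modified_80_rule({
    "control": [1.2, 0.9, 1.1, 1.0, None],
    "PTSD": [None, None, 1.3, 0.8, 1.1],
})
# Hypothetical metabolite below 80% in both groups: excluded
dropped = passes_modified_80_rule({
    "control": [1.2, None, None, 1.0, 0.7],
    "PTSD": [None, None, 1.3, None, 1.1],
})
```

Evaluating the threshold per group rather than across all samples keeps metabolites that are consistently detected in one group but largely absent in the other, which may itself be biologically informative.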

Conclusions
Our findings emphasize that comprehensive coverage is not yet possible for the metabolome and that large-scale metabolomics platforms yield distinct coverage, with class-specific performance and measurement variability within and across all platforms. Therefore, researchers selecting a platform must weigh (i) the breadth and depth of coverage previously applied to their clinical state of interest, (ii) the strength of evidence for putative metabolites, and (iii) class-specific measurement robustness, which can be interrogated for metabolites of interest using the publicly available dataset generated by the current study. For diseases in nascent stages of metabolomics characterization, researchers may be able to leverage information on metabolites associated with genes or proteins of interest (e.g., [57]) or metabolites associated with biological pathways (e.g., [58]). Numerous publicly available databases have been developed to support these efforts (e.g., Reactome, MetaCyc). Further efforts are underway to define species-specific reference values across biospecimen types, as well as the impact of clinical variables, including sex and age, on population-level normative values [11].
Beyond species-specific considerations, effects of technical variation (including variation from metabolite stability over time or preanalytical variables such as freeze-thaw cycles) could be minimized through the inclusion of blinded technical plasma in every assay run to enable study-specific normalization. Similarly, distributing groups across runs in block randomization could minimize the impacts of batch variation where possible. For example, in an innovative recent metabolomics study in PTSD conducted by Konjevod et al. [26], discovery and test cohorts "were analyzed under the same conditions, one year apart". Technical variation across assays has the capacity to obfuscate biological differences and may have limited the metabolites validated in the test cohort to 2 of the 33 initially implicated metabolites [26].
Studies combining cohorts to increase statistical power are promising but can necessitate integrating metabolomics results across approaches, which is hindered by (i) vendor-specific nomenclature, (ii) platform-specificity in the metabolite subspecies represented in a metabolite measurement, and (iii) platform-specific "normative" levels. The former two challenges may be ameliorated by harmonizing to a publicly available database where possible (e.g., HMDB). For the latter, strategies for integrating metabolomics datasets may be informed by our findings defining metabolite classes which are subject to systematically lower measurements in specific platforms (fatty acids, LPCs, etc.) or greater inter-assay variation for some or all platforms (glycerophospholipids, ceramides, quinolines, etc.).
Data Availability Statement: Data and the Metabolomics Platform Exploration Tool will be made available in the BRAIN Commons, a cloud-based platform for computational discovery designed for the brain health community, at https://www.braincommons.org/publications/doi-10-3390-metabo-11090609/ (accessed on 28 September 2022).

Figure 1. Metabolomics techniques and sample sizes in psychiatric research between 2015 and 2020 for four psychiatric conditions: (A) posttraumatic stress disorder, (B) major depressive disorder, (C) traumatic brain injury, and (D) Alzheimer's disease.

Figure 2. Venn diagram of coverage for PTSD-associated metabolites, for the five commercial metabolomics vendors assessed in the systematic platform comparison. Metabolites were identified through systematic review of case vs. control studies in PTSD.

Figure 3. Venn diagram of coverage for metabolites associated with PARK2, for the five commercial metabolomics vendors assessed in the systematic platform comparison.

Figure 4. Platform-specific, log-transformed, average metabolite levels in control samples for vendors reporting absolute units; each point represents mean ± SEM for 11 control samples in total: 9 control samples from 6 individuals (with 3 technical replicates), and 2 NIST pooled reference plasma samples. Each panel depicts the range of covered metabolites, across all assays, for an exemplar metabolite class: (A) amino acids, (B) fatty acids, (C) lysophosphatidylcholines (LPC), (D) hydroxy acids, (E) ceramides, and (F) triglycerides. Depicted data are from the second sample shipment. NIST = concentrations reported in the National Institute of Standards and Technology (NIST) SRM 1950 Certificate of Analysis (COA, revised June 2020).

A.J., H.C.L., K.C., R.M.S., C.W.T., T.V., L.E.C. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by Cohen Veterans Bioscience and generous grants COH-0013 and COH-0003 from Steven A. Cohen for the RAPID-Dx program.
Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the BioIVT Western Institutional Review Board® (#2010-017) for control samples, and the Stanford Institutional Review Board (#25948) for clinical samples.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Table 1. Intra-Assay Percent Coefficient of Variation (CV%) within Metabolite Classes and CV% Standard Deviation (SD) for Technical Replicates of PTSD and Control Samples in Shipment 1.

Table 2. Inter-Assay Percent Coefficient of Variation (CV%) within Metabolite Classes for Technical Replicates of PTSD and Control Samples across Shipment 1 and Shipment 2.