Comparison of Kit-Based Metabolomics with Other Methodologies in a Large Cohort, towards Establishing Reference Values

Metabolic profiling is an omics approach that can be used to observe phenotypic changes, making it particularly attractive for biomarker discovery. Although several candidate metabolite biomarkers for disease expression have been identified in recent clinical studies, the reference values of healthy subjects have not been established. In particular, the accuracy of concentrations measured by mass spectrometry (MS) is unclear. Therefore, comprehensive metabolic profiling in large-scale cohorts by MS to create a database with reference ranges is essential for evaluating the quality of the discovered biomarkers. In this study, we tested 8700 plasma samples by commercial kit-based metabolomics and separated them into two groups of 6159 and 2541 analyses based on the different ultra-high-performance liquid chromatography tandem mass spectrometry (UHPLC-MS/MS) systems. We evaluated the quality of the quantified values of the detected metabolites from the reference materials in the group of 2541 compared with the quantified values from other platforms, such as nuclear magnetic resonance (NMR), supercritical fluid chromatography tandem mass spectrometry (SFC-MS/MS) and UHPLC-Fourier transform mass spectrometry (FTMS). The values of the amino acids were highly correlated with the NMR results, and lipid species such as phosphatidylcholines and ceramides showed good correlation, while the values of triglycerides and cholesterol esters correlated less well with the lipidomics analyses performed using SFC-MS/MS and UHPLC-FTMS. The evaluation of the quantified values by MS-based techniques is essential for metabolic profiling in a large-scale cohort.


Introduction
A metabolome is a group of small molecules that are endogenously produced as part of the end of the central dogma and have biological functions; metabolomics can be used together with other omics techniques [1]. Metabolic changes are directly associated with phenotypic changes and are affected by genomic factors, environmental factors (such as lifestyle, food intake, and/or the gut microbiome), and disease expression and progression [2,3]. Therefore, metabolic phenotyping has been widely used in biomarker discovery studies to identify disease-specific predictive molecules in biological specimens using several analytical platforms [4][5][6].
Nuclear magnetic resonance (NMR) has conventionally been used for metabolic profiling because of its high-quality quantification [7,8]. In contrast, mass spectrometry (MS)-based metabolic profiling allows for the simultaneous and sensitive detection of metabolites, and gas chromatography MS has been traditionally utilized for this purpose [9,10]. However, it is impossible to extract both hydrophilic and hydrophobic molecules together, such as amino acids and lipid species, within a single sample preparation procedure using single solvent systems. Global metabolomics has been established to detect thousands of features as a comprehensive metabolic profiling technique using liquid chromatography MS (LC/MS) for biomarker discovery, although the disadvantages of lower reproducibility, lower accuracy of quantification and increased effort to annotate the structures of the molecules while considering biological functions remain [11,12].
Recently, kit-based metabolomics (Kit-Met) using LC/MS was established in strict accordance with a standard operating procedure involving detailed documentation for sample preparation, instrument setup, system suitability testing, and data analysis; this approach enabled us to obtain quantified values of several hundred metabolites and to compare the quantified values with interlaboratory studies [13,14]. Metabolic profiling can be performed using a kit consisting of a consumable 96-well preparation plate that includes several internal standards (ISs) and the optimal ultrahigh-performance liquid chromatography triple quadrupole tandem mass spectrometry (UHPLC-MS/MS) methods with two separate methodologies. Based on technological developments, more than 600 metabolites can be detected by means of the current Kit-Met version. In fact, representative metabolites, including amino acids, amino acid-related metabolites, bile acids, biogenic amines, cresol, fatty acids, hormones, indole derivatives, nucleobases, and vitamins, can be detected in UHPLC-MS/MS mode using an analytical column, and most lipid species (such as acylcarnitines, ceramides (Cers), cholesterol esters (CEs), diacylglycerols (DGs), dihydroceramides, glycerophospholipids (including lysophosphatidylcholines (LPCs) and phosphatidylcholines (PCs)), glycosylceramides, sphingolipids, sugars, and triacylglycerols (TGs)) can be detected in flow injection analysis (FIA)-MS/MS mode from biological samples.
These kits have recently been used in clinical studies for the discovery of biomarkers for diseases in a large number of patients worldwide, such as patients with mild cognitive impairment (MCI) [15], Alzheimer's disease [16], Parkinson's disease [17], depression [18], autism [19], chronic obstructive pulmonary disease [20], cardiovascular disease [21], diabetes mellitus [22], chronic kidney disease [23], glaucoma [24], lung cancer [25], hepatocellular carcinoma [26], gastric cancer [27], head and neck cancer [28], and breast cancer [29]. However, only disease-based samples have been used in most metabolic profiling clinical studies, which did not include optimal subjects adjusted for age, sex, or body mass index (BMI) because of the limitations of obtaining samples from healthy subjects in the hospital. Therefore, it is necessary to compare the metabolic profiles of healthy controls selected with similar phenotypes to improve the sensitivity and specificity of the biomarkers for clinical examination applications.
National biobank projects have incorporated metabolic profiling in large-scale analyses in not only clinical cohorts but also prospective cohorts [16,30,31], and this information has been stored for epidemiologic analysis with genetic variation. However, method standardization, which is important for cross-study and cross-cohort comparisons in metabolic profiling, remains a challenge for LC/MS-based widely targeted metabolomics techniques [32]. Notably, interlaboratory studies have demonstrated that most lipid species in plasma vary greatly due to the use of different platforms and laboratories during the lipid consortium project [33]. Hydrophilic metabolite outliers in cell lysates can be observed depending on the analytical method [34]. Therefore, establishment of a reference database for metabolic profiling, which will allow comparison with other methods and laboratories, has been required for a long time.
The Tohoku Medical Megabank (TMM) Project was established as one of the largest cohort projects in the Tohoku area of Japan, which was affected by earthquakes and a tsunami disaster on 11 March 2011, and several biospecimens obtained from more than 150,000 participants are stored at the biobank [35]. One of the main goals of this cohort was to identify predictive biomarkers of disease expression for future precision medicine by using metabolic profiling techniques combined with other datasets, such as genomics, phenotypes, and/or habitudes [36,37]. During this large-scale cohort metabolic profiling project, forty-five metabolites from more than 30,000 plasma samples were analyzed by NMR, 110 metabolites from approximately 2000 plasma samples and 421 metabolites from approximately 2300 plasma samples were analyzed by Kit-Met [6], and their quantified values were included in the commercial database "Japanese Multi Omics Reference Panel, jMorp" [38]. The database has expanded both the number of samples and the number of quantified metabolites, and the quantified values from cohort participants could potentially be utilized as references for biomarker discovery studies.
In this study, 8700 plasma samples with reference materials were analyzed by Kit-Met. We first subjected the 6159 plasma samples to Kit-Met 1 and then subjected the 2541 plasma samples to Kit-Met 2. Both Kit-Met 1 and Kit-Met 2 analyses were performed with the same kit (MxP® Quant 500 kit) using different UHPLC-MS/MS systems: Xevo® TQ-S and Xevo® TQ-XS MS/MS systems for Kit-Met 1 and Kit-Met 2, respectively. The instrument used for Kit-Met 2 is slightly more sensitive than that used for Kit-Met 1 and has the potential to expand the number of quantified metabolites. The values of the detected metabolites in the reference materials were first evaluated via Kit-Met 1 and Kit-Met 2. We then evaluated the quality of the quantified values of the metabolites in 2541 plasma samples detected by Kit-Met 2 compared with the quantified values from other platforms, such as NMR, supercritical fluid chromatography MS/MS (SFC-MS/MS) and UHPLC-Fourier transform MS (UHPLC-FTMS) systems. We demonstrated the utilization of Kit-Met profiling in a large-scale cohort with interplate normalization by principal component analysis (PCA).

Results
A summary of the present study is shown in Figure 1. A total of 8700 plasma samples from participants in the TMM Community-based Cohort Study were selected for metabolic profiling using Kit-Mets by UHPLC-MS/MS. The demographic characteristics of the participants, which were separated into two groups of 6159 and 2541 plasma samples based on the difference in UHPLC-MS/MS setups, are described in Table 1. In this study, metabolic profiling of the 6159 and 2541 plasma samples was performed by Kit-Met 1 and Kit-Met 2, respectively (Table 2). Kit-Met 2 was used as a system with higher MS sensitivity than Kit-Met 1. NIST® SRM® 1950 plasma samples and four global quality control (gQC) plasma samples (pooled normal human plasma, Na EDTA) along with 77 cohort plasma samples were analyzed on each 96-well plate using the MxP® Quant 500 kit.
We first examined 6159 cohort plasma samples with the NIST and gQC plasma samples and observed the coefficient of variation (CV, %) of each quantified metabolite to evaluate the variation in quantified values by Kit-Met analyses. During data analysis, we selected only quantification values of higher quality, which were defined by the Kit-Met criteria, as highlighted in green and blue in Supplemental Table S1, and the detected metabolites were analyzed separately in UHPLC-MS/MS mode and FIA-MS/MS mode.
In UHPLC-MS/MS mode, most metabolites, including amino acids, amino acid-related metabolites, bile acids, biogenic amines, cresol, fatty acids, hormones, indole derivatives, nucleobases, and vitamins, were distributed over a wide range of CVs in the NIST plasma samples by Kit-Met 1, and a total of 36 metabolites were quantified inside 30% CV (Figure 2a, upper left panel, black bars). In addition, the peak frequency of numerous metabolites, including lipid species (such as acylcarnitines, Cers, CEs, DGs, dihydroceramides, glycerophospholipids (including LPCs and PCs), glycosylceramides, sphingolipids, sugars, and TGs), was detected at approximately 30% of the CV in the NIST plasma samples in FIA-MS/MS mode by Kit-Met 1.
We therefore examined 2541 plasma samples by Kit-Met 2 to improve the data quality of the quantified values of the metabolites. Indeed, most metabolites could be observed within 10% of the CV in the NIST plasma samples in UHPLC-MS/MS mode by Kit-Met 2, and a total of 75 metabolites, including amino acids and their related metabolites, were quantified inside 30% CV (Figure 2a, left panel, white bars). The peak frequency of the number of lipid species shifted from 30% to 10% in FIA-MS/MS mode by Kit-Met 2, and a total of 376 metabolites, including LPCs, PCs, sphingolipids, and TGs, were quantified inside 30% CV (Figure 2b, right panel, white bars). The CV differences between Kit-Met 1 and Kit-Met 2 were similarly observed with the quantified values of the metabolites in the gQC plasma samples (Supplemental Figure S1a,b). The list of mean values of metabolites sorted by CV values in NIST and gQC plasma is shown in Supplemental Table S2.
These differences in variation demonstrated that the metabolite values quantified by Kit-Met 2 were more reliable than those quantified by Kit-Met 1 in our cohort study. However, more than twice as many plates were run on Kit-Met 1 (80 vs. 33), which tends to increase variance in a large-scale cohort study. Although boxplots of the detected metabolites in NIST plasma showed no significant differences in median intensity among the 80 plates evaluated by Kit-Met 1 or among the 33 plates evaluated by Kit-Met 2 (Supplemental Figure S2a-d), the single plots of select molecules (such as glutamine and PCaa 34:2) showed the largest variation among the 80 plates evaluated by Kit-Met 1, even after normalization by gQCs (Figure 2c,d).
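The CV-based screening described above can be illustrated with a minimal Python sketch. It assumes that the CV is computed as the sample standard deviation divided by the mean of the per-plate reference plasma values; the metabolite names and concentrations are hypothetical and are not taken from the study data.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * stdev(values) / m


# Hypothetical concentrations of two metabolites in the reference plasma,
# one value per plate (illustrative numbers only, not study data).
reference_runs = {
    "metabolite_A": [98.0, 102.0, 100.0, 101.0, 99.0],
    "metabolite_B": [10.0, 20.0, 15.0, 5.0, 25.0],
}

CV_THRESHOLD = 30.0  # the "inside 30% CV" cutoff used in the text

# CV of each metabolite across plates
cv_by_metabolite = {name: cv_percent(v) for name, v in reference_runs.items()}

# Metabolites that pass the cutoff and would be counted as reliably quantified
within_threshold = [n for n, cv in cv_by_metabolite.items() if cv < CV_THRESHOLD]
```

In this toy example only the first metabolite falls inside the cutoff; counting the entries of `within_threshold` per platform corresponds to the per-mode totals reported above.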

Correlation Analysis of the Quantified Values of Metabolites in the NIST Plasma Samples Detected by NMR and Kit-Met 1 and Kit-Met 2 in UHPLC-MS/MS Mode
We then examined the NIST plasma samples by NMR analysis following our previously described protocol [39]. The 52 metabolites detected in the NIST plasma samples by NMR are summarized in Supplemental Table S3. Thirty-one metabolites detected by Kit-Mets were also quantified in the NIST plasma samples by NMR. A plot of the ratio between the mean concentrations of metabolites detected by Kit-Met 1 or Kit-Met 2 and their mean concentrations detected by NMR against the CV (%) demonstrated that approximately half of the metabolites were highly correlated with the quantified values determined by NMR within the lower CV (red dotted circle, Figure 3a,b, upper panels). However, cysteine (Cys), arginine (Arg), lysine (Lys), lactic acid (Lac), ornithine (Orn), p-cresol sulfate (p-Cresol-SO4), and hypoxanthine were detected at twofold higher levels than by NMR, whereas hippuric acid (HipAcid), indoxyl sulfate (Ind-SO4), and trimethylamine oxide (TMAO) levels were lower than those detected by NMR (Figure 3a,b, upper panels). Interestingly, the average concentrations of 32 metabolites, which were selected by a CV < 30% in 80 and 33 NIST plasma samples using Kit-Met 1 and Kit-Met 2, respectively, were highly correlated (r > 0.997, Spearman correlation analysis) (Figure 3a, bottom left panel). Therefore, the absolute quantified values of those metabolites detected by Kit-Mets should be evaluated by other technologies to create a database as a reference.
We therefore evaluated the values of lipid species, separated by lipid class, that were quantified with higher quality by Kit-Met 2, in comparison with other platforms used for lipidomic profiling. In addition, the consensus values of the lipid species in the NIST plasma samples were validated by an interlaboratory study [33]. We thus examined the NIST plasma samples by using SFC-MS/MS and UHPLC-FTMS to evaluate the quality of the lipid species values quantified by FIA-MS/MS mode using Kit-Met 2, and the values from the three platforms were compared with the consensus values previously described [33].
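The platform comparison in this section rests on two simple quantities: the ratio of mean concentrations between two platforms and a Spearman rank correlation. The following Python sketch shows one way to compute both with the standard library only. The metabolite labels and values are hypothetical, and the symmetric lower cutoff of 0.5 is an assumption added for illustration (the text only mentions twofold higher levels).

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical mean concentrations of the same metabolites on two platforms
kit_met = {"m1": 50.0, "m2": 210.0, "m3": 400.0, "m4": 30.0}
nmr = {"m1": 48.0, "m2": 100.0, "m3": 410.0, "m4": 60.0}

# Kit-Met / NMR ratio per metabolite; values near 1 indicate agreement
ratios = {m: kit_met[m] / nmr[m] for m in kit_met}

# Flag metabolites that differ twofold or more in either direction
flagged = [m for m, r in ratios.items() if r >= 2.0 or r <= 0.5]

rho = spearman([kit_met[m] for m in kit_met], [nmr[m] for m in kit_met])
```

Note that a high rank correlation does not by itself imply agreement in absolute values, which is why the ratio is inspected separately.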

Correlation Analysis of the Quantified Values of the Lipid Species in the NIST Plasma Samples by UHPLC-FTMS and FIA-MS/MS
High-resolution MS has been widely utilized to establish determination methods for lipidomic profiling by UHPLC-FTMS [41]. Quantification is difficult because optimal stable isotope-labeled ISs, which would compensate for the effects of ion suppression without chromatographic separation, are limited. Recently, a mixture of 69 stable isotope-labeled lipid species was provided to improve lipid species quantification. We therefore used this IS mixture to quantify the lipid species in the NIST plasma samples by UHPLC-FTMS, which was performed with modifications to the previous lipidomic profiling method [42].
A total of 148 lipid species, including LPCs, PCs, Cers, SMs, HexCers, DGs, TGs, and CEs, could be annotated and quantified by UHPLC-FTMS. The annotated information and mass chromatograms of each group are shown in Supplemental Table S5 and Supplemental Figure S4. Finally, we compared the quantified values from the three platforms with the consensus values obtained from a previous publication [33] (Figure 6). Most LPCs and PCs could be quantified within a similar range of concentrations and corresponded with the consensus values of the previously determined results [33] (Figure 6a,b). Although values of the lipid species in each group (such as Cers, SMs, HexCers, and TGs) determined by FIA-MS/MS might be relative, the absolute quantified values had large variation among the three methodologies (Figure 6c-e,g). In contrast, the quantified values of DGs determined with the three methodologies overlapped because some lipid species overlapped and were identified without separation (Figure 6g).
Notably, the values of CEs quantified from FIA-MS/MS showed large differences compared with those of the other platforms (Figure 6h). The results indicate that it is necessary to normalize the quantified values of these lipid species with a reference material, such as the NIST and gQC plasma samples, for the analysis of a large-scale cohort. Additionally, the values of DGs analyzed by FIA-MS/MS did not show any correlation with the values from the other platforms or references and should be used carefully as reference values.

Effect of Normalization with the gQC Samples for Metabolic Profiling in a Large-Scale Cohort
We assessed the normalization procedure using the values of the metabolites from the prospective cohort containing 2541 plasma samples and 33 NIST and 132 gQC plasma samples for inter- and intraplate variation within the 33-plate analysis. The plasma metabolite values quantified by Kit-Met 2 were examined by PCA to visualize plate-to-plate variations. We colored each plate and selected PC5 and PC6 to observe the time-dependent batch effects on the score plot determined by PCA, and drift could be observed among the 33 plates (Figure 7a).

We therefore introduced a normalization process to correct for interplate variation using four intermittent gQC samples, which efficiently corrected plate-to-plate drift by following the standard Kit-Met process. After normalization, the interplate variations among the 33 plates were significantly reduced on the score plots (Figure 7b), and the visualized variation among the cohort plasma, 33 NIST (yellow dots), and 132 gQC (orange dots) samples clearly disappeared (Figure 7a,b). We also examined the normalization process using Kit-Met 1 for the analysis of 6159 cohort plasma samples and 80 NIST and 320 gQC plasma samples. However, interplate variation in the PCA score plot could not be improved after normalization (data not shown). The average quantified values of 2541 cohort plasma samples after normalization are listed in Supplemental Table S6.
We surmised that our gQC normalization process could be a way to create a more accurate metabolic profiling database of the obtained large-scale cohort for Kit-Met analysis.
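The principle of QC-based interplate correction can be sketched as follows. This is a minimal illustration and not the algorithm implemented in the kit software: it assumes that each plate is rescaled, per metabolite, so that the median of its gQC replicates matches the median of all plate-level gQC medians. All values and plate names are hypothetical.

```python
from statistics import mean, median, stdev

# Hypothetical data for ONE metabolite: four gQC replicates and two cohort
# samples per plate (arbitrary units, illustrative numbers only).
plates = {
    "plate_01": {"gqc": [100.0, 102.0, 98.0, 100.0], "samples": [80.0, 120.0]},
    "plate_02": {"gqc": [125.0, 123.0, 127.0, 125.0], "samples": [100.0, 150.0]},
    "plate_03": {"gqc": [80.0, 82.0, 78.0, 80.0], "samples": [64.0, 96.0]},
}


def normalize_by_gqc(plates):
    """Scale every plate so that its gQC median matches the overall target."""
    plate_qc = {p: median(d["gqc"]) for p, d in plates.items()}
    target = median(plate_qc.values())  # assumed target: median of plate medians
    normalized = {}
    for p, d in plates.items():
        factor = target / plate_qc[p]  # plate-specific correction factor
        normalized[p] = {
            "gqc": [v * factor for v in d["gqc"]],
            "samples": [v * factor for v in d["samples"]],
        }
    return normalized


def interplate_cv(plates):
    """CV (%) of the plate-level gQC means, used here as a simple drift measure."""
    plate_means = [mean(d["gqc"]) for d in plates.values()]
    return 100.0 * stdev(plate_means) / mean(plate_means)


normalized = normalize_by_gqc(plates)
cv_before = interplate_cv(plates)
cv_after = interplate_cv(normalized)
```

In this toy example the three plates carry the same underlying sample values distorted by a plate-specific scale factor, so the correction brings them back into agreement and the interplate CV of the gQC means drops to essentially zero. Real drift is rarely purely multiplicative, which is consistent with the observation above that normalization did not remove the interplate variation for Kit-Met 1.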

Discussion
Metabolic profiling in plasma samples has been widely performed via Kit-Met in several large-scale cohort studies, such as EPIC (European Prospective Investigation into Cancer and Nutrition) [30,43,44] and KORA (Cooperative Health Research in the Region of Augsburg) [31], and biomarker candidates have been reported to predict disease expression and progression when a previous version of this kit (AbsoluteIDQ p180 Kit, kit180) was used. In fact, kit180 stably quantified 110 metabolites in 10 µL plasma samples within a large-scale analysis in our previous study [38]. The current version, Kit-Met, increased the number of quantified metabolites to more than 600, and therefore, Kit-Met was applied with the UHPLC-MS/MS system. We also analyzed approximately 2300 plasma samples with Kit-Met 1 in our cohort, and the quantified values were published in 2020 [6].
In the present study, we examined 6159 plasma samples to expand the reference values from 2020 to 2021 by means of Kit-Met 1. However, large variation in the quantified values in the NIST plasma samples was observed for the metabolites detected using Kit-Met 1. Notably, the interplate variation of a single plot of typical molecules (such as glutamine and PCaa 34:2) in NIST plasma among 80 plates detected by Kit-Met 1 was larger than that detected in 33 plates by Kit-Met 2 (Figure 2). Although the lower sensitivity of the MS system contributed to the CV differences between Kit-Met 1 and Kit-Met 2, especially for lipid species present at lower concentrations (such as Cers, HexCers, DGs, and TGs), interplate variation should be considered for large-scale analyses. We next established Kit-Met 2 in our large-scale cohort study to improve and continue the analysis for metabolic profiling. The CV of the quantified values clearly improved during the analysis of the 2541 plasma samples (Figure 2). We surmise that the metabolic profiles from the 2541 cohort plasma samples have the potential to be utilized to create a database including values quantified by Kit-Met 2.
Although Kit-Met, which was used in both UHPLC-MS/MS mode and FIA-MS/MS mode, has been established with an SOP, the absolute quantified values of the metabolites need to be considered. We previously demonstrated that chromatographic separation was essential to obtain the precise and accurate determination of endogenous molecules [45,46], and the separation of MRM chromatograms in UHPLC-MS/MS mode would be sufficient for the necessary level of determination (Supplemental Figure S5a). In fact, our previous 2300 cohort plasma analyses performed using Kit-Met 1 demonstrated that the quantified values of the metabolites, such as amino acids and their related compounds, in UHPLC-MS/MS mode were highly correlated with their values determined by NMR [6], which allowed reliable and accurate absolute quantified values to be obtained from biological samples. Consistently, our present study demonstrated that approximately half of the 31 codetected metabolites were highly correlated with the quantified NMR values within lower CVs. However, because of the lower stability of derivatization, some metabolites (such as Cys, Arg, Lys, Lac, Orn, p-Cresol-SO4, HipAcid, Ind-SO4 and TMAO) showed large differences compared with the values determined using NMR. The stability and efficacy of derivatization with PITC should be considered to study the differences in the quantified values of these metabolites.
Shotgun lipidomic profiling approaches, such as direct infusion and FIA, are utilized for the rapid detection or determination of lipid species [47,48]. Kit-Mets have therefore been widely used for lipidomic profiling, and lipid species could be potential biomarkers of disease prediction. For instance, a lower abundance of SMs in plasma contributes to a high risk of MCI [49], and a lower abundance of PCs and a higher abundance of TGs are typically observed in subjects with a high BMI [30]. However, most annotated lipid species in the kit include several isobaric and isomeric compounds that cannot be separated by the applied FIA approach [50]. Indeed, the MRM chromatograms in FIA-MS/MS mode showed that these isomers were detected together, and low-abundance species were detected due to ion suppression (Supplemental Figure S5b).
In contrast, LC/MS techniques based on separation using columns, such as octadecyl silica (ODS) and hydrophilic interaction chromatography (HILIC), have been widely used for lipidomic profiling in recent studies [41,42,51]. A wide range of lipid species and isomers can be detected after chromatographic separation using an ODS column. In fact, we previously quantified lipid species by separating sn-1 and sn-2 positional isomers of lysophospholipids and sphingolipids by LC-MS/MS [52,53]. However, it is difficult to prepare optimal ISs for all detected lipid species, and matrix effects from background ions that coelute and overlap with the peaks of interest on the chromatogram cannot be eliminated [41]. We therefore used a mixture of 69 ISs to address the limitations of LC/MS-based lipidomic quantification. The level of quantified values reached the consensus values from previous reports [33]. However, many lipid species still overlapped on the chromatograms, and the selection of the optimal ISs to obtain accurately quantified values was unclear. Therefore, lipid class separation is still necessary for accurate quantification of lipid species globally. Previously, HILIC-based separation was utilized to overcome this limitation [54,55]. However, the sensitivity and reproducibility of the retention times are generally reduced compared with those of ODS-based separation techniques due to the lower efficacy of electrospray ionization, which uses a high proportion of water during ionization, and the lower robustness of the HILIC column [54,55].
Recently, lipid class-based separation by the SFC technique has been utilized for lipidomic profiling [56]. This methodological improvement in the quantification of individual lipid species that coelute with ISs in endogenous species conducted with MRM seems to correspond more accurately with the exact values of the lipid species [40,57]. Based on this background, the quantified values of the lipid species in FIA-MS/MS mode need to be considered and must be evaluated by other technologies, and it is essential to show the utilization of quantified values as a healthy control reference for biomarker discovery studies and precision medicine. We therefore analyzed NIST plasma samples by SFC-MS/MS to evaluate the quantified lipid species values detected by FIA-MS/MS.
The values of LPCs, PCs, Cers, DGs, and TGs quantified by SFC-MS/MS showed better correspondence with the reported consensus values than the values determined by FIA-MS/MS. However, some lipid species were detected without separation and presented in higher abundance, such as CE 18:2, CE 20:4 and CE 18:1, which might be strongly suppressed by ionization using FIA-MS/MS [33,41]. In addition, the one-point quantification method has the disadvantage that it cannot account for saturation of ionization when absolute values of quantification are sought. To avoid these phenomena, tenfold diluted samples were analyzed immediately after the first analysis by SFC-MS/MS to quantify CE 18:2. Interestingly, the value of CE 20:4 was three times higher than the consensus value from a previous publication [33]. We surmise that the consensus values of the lipid species in an interlaboratory study could show a large variation between several methodologies for analysis of the same samples because the values might not consider the difference in ionization capacity between different MS systems, and the quantified values could be estimated to be lower than the absolute values of these lipid species in the NIST plasma samples. In addition, we did not evaluate the rate of false positive identifications or the correctness of the annotations applied by FIA-MS/MS. Therefore, we believe that kit analysis did not produce accurate quantified values of these lipid species, such as CEs, although the results of SFC-MS/MS analysis were more accurate.
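The limitation of one-point quantification mentioned above can be made concrete with a small Python sketch. It assumes the common single-point internal standard estimate (analyte peak area divided by IS peak area, multiplied by the IS concentration) with a unit response factor, and it assumes that the IS is present at the same concentration in both injections. The function names, the 20% tolerance, and all peak areas are hypothetical.

```python
def one_point_concentration(analyte_area, is_area, is_conc, dilution_factor=1.0):
    """Single-point estimate: (analyte area / IS area) * IS concentration,
    multiplied by the dilution factor to refer back to the undiluted sample."""
    if is_area <= 0:
        raise ValueError("internal standard area must be positive")
    return analyte_area / is_area * is_conc * dilution_factor


def suggests_saturation(undiluted_est, diluted_est, tolerance=0.2):
    """True when the back-calculated estimate from the diluted injection exceeds
    the undiluted estimate by more than the tolerance, which hints that the
    undiluted signal was compressed (for example by ionization saturation)."""
    return diluted_est > undiluted_est * (1.0 + tolerance)


IS_CONC = 10.0  # hypothetical IS concentration, arbitrary units

# Hypothetical peak areas for an abundant lipid species
undiluted = one_point_concentration(5.0e6, 1.0e6, IS_CONC)
diluted = one_point_concentration(1.2e6, 1.0e6, IS_CONC, dilution_factor=10.0)

flag = suggests_saturation(undiluted, diluted)
```

If the response were linear, the two estimates would agree within the tolerance. A much larger back-calculated value from the diluted injection, as in this made-up example, is the pattern that would motivate reporting the diluted measurement for highly abundant species.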
On the other hand, the LPCs and TG values quantified by FIA-MS/MS correlated with the values from other methodologies but were 1.5 times higher than those from the SFC-MS/MS and UHPLC-FTMS analyses (Figures 4 and 5). For the lipid species that were potentially isomers, SFC or UHPLC methodologies using a column allow the separation of the positional isomers sn-1 and sn-2 and LPCs and TGs, as their values were estimated to be lower than the values determined by FIA-MS/MS. This result is due to interference from background ions that were derived from other lipid species and detected at the same MRM transitions without separation by FIA-MS/MS [52]. Therefore, system differences cannot be avoided by MS-based lipidomic profiling, and it is necessary to normalize the lipid species according to the values of reference materials, such as the NIST and gQC plasma samples.
Stocks et al. demonstrated the impact of the batch normalization method with the quantified values of metabolites in NIST plasma samples in both UHPLC-MS/MS mode and FIA-MS/MS mode in an interlaboratory study [13]. We therefore normalized the data from the 2541 plasma samples in the metabolic profiling study obtained by Kit-Met 2 with the quantified values of each metabolite in the gQC plasma samples by MetIDQ Oxygen software. Notably, the interplate variations were reduced after normalization in our present study (Figure 7). Although the normalization process works for several thousand assays, it is essential to evaluate the interplate differences in detail. In fact, we examined the normalization process via Kit-Met 1 for 6159 cohort plasma samples to combine with the 2541 cohort plasma samples detected by Kit-Met 2, and the CV values of metabolites, especially lipid species, were improved by Kit-Met 1. For instance, 314 lipid species in the 33 NIST plasma samples had CVs within 30% after normalization, approximately three times as many as before normalization. However, interplate variation was still observed, even after normalization by Kit-Met 1, which was shown as PCaa 34:2 and could not be improved by the visualized distribution of 6159 plasma sample analyses on the score plot of PCA. We therefore tried to create criteria to exclude samples and metabolites in data processing and would like to present the technique using phenotypic analyses of cohort information in future studies.
We therefore surmised that our gQC normalization process could be a way to adjust the values obtained by Kit-Met analysis to obtain true representations with the minimum requirement of the criteria to exclude samples/plates and obtain the highly reliable quantified values of metabolites in large-scale sample analysis.
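The idea behind QC-based interplate normalization can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm implemented in MetIDQ: each plate is scaled by a single factor so that the median of its QC replicates matches the median QC value across all plates. The function names and all concentrations are hypothetical.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def normalize_by_qc(plates):
    """Scale each plate so that its QC median matches the overall QC median.

    `plates` maps a plate id to {"qc": [...], "samples": [...]} for ONE metabolite.
    Assumption: one multiplicative correction factor per plate and metabolite.
    """
    target = median([v for p in plates.values() for v in p["qc"]])
    normalized = {}
    for plate_id, p in plates.items():
        factor = target / median(p["qc"])
        normalized[plate_id] = {
            "qc": [v * factor for v in p["qc"]],
            "samples": [v * factor for v in p["samples"]],
        }
    return normalized


# Made-up concentrations (µM) for one metabolite; four QC replicates per plate.
plates = {
    "plate1": {"qc": [10.0, 10.2, 9.8, 10.0], "samples": [8.0, 12.0]},
    "plate2": {"qc": [12.0, 12.2, 11.8, 12.0], "samples": [9.6, 14.4]},
    "plate3": {"qc": [8.0, 8.1, 7.9, 8.0], "samples": [6.4, 9.6]},
}
normalized = normalize_by_qc(plates)

qc_before = [v for p in plates.values() for v in p["qc"]]
qc_after = [v for p in normalized.values() for v in p["qc"]]
print(f"QC CV before: {cv_percent(qc_before):.1f}%, after: {cv_percent(qc_after):.1f}%")
```

In this toy example the plate offsets are purely multiplicative, so the correction removes them almost entirely; real data also contain variation that a single per-plate factor cannot remove, which is consistent with the residual interplate variation noted above.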
Bowden et al. suggested that method standardization for lipidomic profiling using several platforms might be the greatest challenge, and the values of the lipid species within the same reference material obtained by different laboratories might not correspond [33]. Based on this background, the Lipidomics Standards Initiative (LSI; https://lipidomicsstandards-initiative.org/, accessed on 1 August 2021) was launched in spring 2018 to propose the introduction of guidelines and standards for lipidomics; it aims to improve the overall understanding of analytical chemistry (mass spectrometric analysis) and lipid biology and should be particularly useful to researchers new to the lipidomics field [58].
The accurate values of lipid species quantified by standardized methods will be confirmed in future studies of the consortium.
However, we should carefully consider how accurate the current quantitative estimates are relative to absolute values by means of different methodologies, such as SFC-MS/MS and UHPLC-FTMS using a mixture of 69 ISs, and improve the accuracy of the values by the gQC normalization procedure in future studies. Moreover, the reference values should be evaluated by comparison with real clinical studies to determine whether they are more significant for biological variation than previous studies [59]. In addition, even if the methodologies are better for obtaining accurate values, it is essential to evaluate the utilization of metabolic profiles for biomarker research with large-scale data, such as genome-wide association studies and multiomics analyses [60].
There are several limitations of the present study. Since 2541 cohort plasma samples were not sufficient to create a reference value database for metabolic profiling, we continued using Kit-Met. Evidently, the number of analyses will be expanded in our future studies. However, even when the same system was used, the algorithm used to calculate the quantified values was essential for the reproducibility of the assays. Indeed, the CV values were completely different because the previous version of Kit-Met 1 used the peak height of each lipid species for quantification, whereas the current version of Kit-Met 2 used the peak area in FIA-MS/MS mode for quantification. Moreover, the present study has shown a comparison of quantified results using two systems. However, these experiments in our cohort study included only Kit-Met analysis, and we did not imply quality differences in the MS systems. Interlaboratory studies, such as ring trials, are needed in the future.

Study Population and Plasma Collection of Metabolic Profiling
For metabolome analyses, we selected a total of 8700 plasma samples, which comprised 6159 and 2541 samples for the Kit-Met 1 and Kit-Met 2 systems, respectively. All of the plasma samples were from adult participants joining the TMM project of population-based prospective cohort studies, which has more than 150,000 participants and includes the Community-based Cohort Study and Birth and Three-Generation Cohort Study in Japan. The inclusion criteria for the TMM Community-based Cohort Study were as follows.
(1) For the specific health checkup site-based survey, persons were aged 40 to 74 years.
(2) For the Community Support Center or the Satellite-based survey, persons were aged 20 years or more at the time of enrollment, and for the TMM Birth and Three-generation Cohort Study, pregnant women and their fetuses and children, fathers, grandparents, and other family members were included. More detailed information on the participants was described in a previous publication [35]. The cohort study and omics study were approved by the ethics committee of Tohoku University. All adult participants signed an agreement based on informed consent.
Blood samples were collected from participants with overnight fasting or morning fasting using tubes containing ethylenediaminetetraacetic acid (EDTA)-2Na. After collection, sample tubes were immediately inverted 10 times and stored at 4 °C. These sample tubes were transported to the laboratory using refrigerated containers with temperature data loggers. The transported tubes were centrifuged at 2330× g for 10 min at 4 °C. The plasma fraction was transferred to a liquid handling machine and dispensed into 1.0-mL 2D barcoded screw tubes. The number of dispensed tubes was four per blood sample, the plasma volume in each tube was approximately 700 µL, and the samples were stored at −80 °C in a TMM biobank. For metabolomics analyses, the plasma sample in each dispensed tube (700 µL) was further divided into six tubes (approximately 120 µL per tube) and stored at −80 °C [37].

Sample Preparation
Plasma (77 samples), gQC (4 samples), and NIST plasma (1 sample) were set out in a 96-well format according to a predefined plate layout and were prepared using the MxP® Quant 500 kit (Biocrates Life Sciences AG, Innsbruck, Austria). Then, the blank calibration standard, Biocrates QC, gQC, NIST QC or sample (10 µL), was applied to each plate well. All of the sample preparation procedures followed the kit protocols, and the detailed UHPLC-MS/MS methods and conditions were previously described [6].
All NMR experiments were performed at 298 K (25 °C) on a Bruker Avance III HD 600 MHz spectrometer equipped with a CryoProbe and Sample-Jet changer (Bruker BioSpin, Germany). Standard 1D-NOESY and CPMG (Carr-Purcell-Meiboom-Gill) spectra were obtained for each sample. All spectra were acquired with 32 scans and 32 k complex data points. All data were processed using the Chenomx NMR Suite 8.4 Processor module (Chenomx).

Manual Quantification of the Metabolites in Plasma
Metabolites in plasma were identified and quantified using the target profiling approach implemented in the Chenomx NMR Suite 8.4 Profiler module (Chenomx). Standard 1D-NOESY spectra were analyzed for the identification and quantification of metabolites. 1D-CPMG spectra were also used to eliminate the influence of residual proteins on quantification. A typical NMR spectrum with identification of metabolites is shown in Supplemental Figure S6.
Five NIST plasma samples were prepared for extracting lipids using the Bligh and Dyer method [62]. Lipids were extracted from NIST plasma (50 µL) with 930 µL of methanol, IS-A (10 µL, Mouse SPLASH Lipidomix Mass Spec Standard, Avanti Polar Lipids Inc., Alabaster, AL, USA), and IS-B (10 µL, Avanti Polar Lipids Inc.) containing 0.050 nmol Cer d18:1(d7)-15:0 and 0.050 nmol HexCer d18:1(d7)-18:1. The samples were vigorously mixed for 1 min followed by 5 min of sonication. The extracts were then centrifuged at 16,000× g for 5 min at 4 °C, and the resultant supernatant (400 µL) was collected. After mixing with chloroform (400 µL) and water (320 µL), the aqueous and organic layers were separated by mixing and centrifugation at 16,000× g and 4 °C for 5 min. The organic layer (bottom, 280 µL) obtained by phase separation was dried under a nitrogen stream and stored at −80 °C until analysis. Prior to analysis, the dried sample was reconstituted in methanol/chloroform (1/1, v/v, 100 µL).

Data Acquisition and Data Processing
SFC-MS/MS analysis was performed as previously described [40]. The SFC (Nexera UC system, Shimadzu) conditions were as follows: column, ACQUITY UPC2 Torus diethylamine column (3.0 mm i.d. × 100 mm, 1.7 µm particle size, Waters); injection volume, 2 µL; column temperature, 50 °C; mobile phase A, supercritical carbon dioxide; mobile phase B (modifier) and make-up pump solvent; methanol/water
The five NIST plasma samples were prepared by the Folch method [63]. Methanol (80 µL) containing IS solution was added to NIST plasma (20 µL), and the sample was homogenized with a mixer for 30 s. Then, chloroform (80 µL) was added to the sample followed by mixing for 5 min, and water (20 µL) was added to the sample and mixed for 30 s. Following centrifugation at 1500× g for 10 min at 4 °C, the chloroform phase (60 µL, bottom) was transferred to a sample tube. The extraction process with chloroform (80 µL) was repeated, and a second chloroform phase (60 µL) was added to the same sample tube. The final sample (120 µL) was dried by vacuum centrifugation for 30 min at room temperature. Then, the residue was reconstituted in methanol (100 µL), mixed for 3 min and transferred to the sample vial for UHPLC-FTMS analysis.

Data Acquisition and Data Processing
The UHPLC system consisted of binary pumps, an autosampler, and a column compartment (Vanquish UHPLC system, Thermo Fisher Scientific, San Jose, CA, USA) connected with Vipers (Thermo Fisher Scientific). The UHPLC conditions were modified from a previous method [42]. Separation was performed using a metal-free C18 column (L-column2 ODS, 2.0 mm i.d. × 100 mm, 2 µm particle size; CERI, Saitama, Japan). The mobile phases consisted of (A) acetonitrile/water/ammonium formate (1 mol/L), 60/40/1 (v/v/v%) containing 0.1% formic acid, and (B) acetonitrile/isopropanol, 10/90 (v/v%) containing 0.1% formic acid. The components were separated by gradient elution. The initial condition was 30% B at a flow rate of 0.2 mL/min, followed by a linear gradient to 100% B from 2.0 min to 20.0 min, and 100% B was maintained for 10.0 min. Then, the mobile phase was returned to the initial conditions and maintained for 5.0 min until the end of the run. The total run time was 35.0 min, and the temperature of the column compartment was 45 °C.
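The gradient program described above can be written as a piecewise function of time, which also makes it easy to check that the segments add up to the stated 35.0-min run. This is a sketch with two assumptions that the text does not state explicitly: the mobile phase is held at 30% B for the first 2.0 min, and the return to the initial conditions at 30.0 min is treated as an instantaneous step. The function name is hypothetical.

```python
FLOW_ML_MIN = 0.2   # flow rate from the text (mL/min)
RUN_TIME_MIN = 35.0  # total run time from the text (min)


def percent_b(t_min):
    """Return the percentage of mobile phase B at time t_min (minutes)."""
    if not 0.0 <= t_min <= RUN_TIME_MIN:
        raise ValueError("time outside the 35.0-min run")
    if t_min <= 2.0:    # assumed initial hold at 30% B
        return 30.0
    if t_min <= 20.0:   # linear gradient from 30% to 100% B
        return 30.0 + (100.0 - 30.0) * (t_min - 2.0) / (20.0 - 2.0)
    if t_min <= 30.0:   # 100% B maintained for 10.0 min
        return 100.0
    return 30.0         # re-equilibration at initial conditions for 5.0 min


# Solvent consumed per injection at a constant flow rate
volume_per_run_ml = FLOW_ML_MIN * RUN_TIME_MIN

for t in (0.0, 2.0, 11.0, 20.0, 25.0, 32.0):
    print(f"{t:5.1f} min: {percent_b(t):5.1f}% B")
```

The segment lengths (20.0 min to the end of the gradient, 10.0 min hold, 5.0 min re-equilibration) sum to the reported total run time.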
The FTMS system was a Q Exactive Orbitrap mass spectrometer (Thermo Fisher Scientific) equipped with a heated-ESI-II (HESI-II) source. The voltages in positive and negative ion modes were 3.5 and 2.5 kV, respectively, the heated capillary temperature was 275 °C, the sheath gas pressure was 45 psi, the auxiliary gas setting was 10 psi, and the heated vaporizer temperature was 300 °C. Both the sheath gas and auxiliary gas were nitrogen. The collision gas was argon at a pressure of 1.5 mTorr. The FTMS scan type was full MS/data dependent (dd)-MS². The parameters of the full mass scan were as follows: resolution of 70,000, autogain control target under 1 × 10⁶, maximum isolation time of 100 ms, and m/z range of 350-1050. The parameters of the dd-MS² scan were as follows: resolution of 17,500, autogain control target under 1 × 10⁵, maximum isolation time of 50 ms, loop count of 5, a number of top peaks of 10, an isolation window of m/z 1.5, a normalized collision energy of 30, an underfill ratio of 5.00%, and an intensity threshold under 1 × 10⁵. The UHPLC-FTMS system was controlled by Xcalibur 4.2.28.14 (Thermo Fisher Scientific), and the data were collected with this software.
The lipid species were annotated with mass accuracy and MS/MS fragmentation with the analysis of typical chemical standards according to the group of lipid species, such as CEs, LPCs, PCs, Cers, SMs, HexCers, DGs and TGs. These annotations were used to confirm the adduct ion for each group of lipid species, obtain fragment ion mass spectra information for identifying the fatty acids of lipid species, and elucidate the retention time to estimate the detection of other species within the same group of lipids that may have a different number of carbon atoms or double bonds. The quantified values were calculated from the ratio of the concentration of each optimal IS in the UltimateSPLASH™ ONE chemical standards using Xcalibur software (Thermo Fisher Scientific).
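Quantification against a single internal standard, as described above, amounts to scaling the known IS concentration by the analyte-to-IS signal ratio. The sketch below illustrates that arithmetic only; the exact calculation performed in Xcalibur is not given in the text, so the function name, the optional response factor, and the peak areas are assumptions for illustration.

```python
def one_point_concentration(analyte_area, is_area, is_conc, response_factor=1.0):
    """Estimate an analyte concentration from one internal standard (IS).

    analyte_area, is_area: integrated peak areas (arbitrary units)
    is_conc: known concentration of the IS in the sample (e.g. µM)
    response_factor: analyte response relative to the IS; 1.0 assumes
        that analyte and IS ionize equally well (an assumption).
    """
    if is_area <= 0 or response_factor <= 0:
        raise ValueError("IS area and response factor must be positive")
    return (analyte_area / is_area) * is_conc / response_factor


# Hypothetical peak areas; NOT study data.
conc = one_point_concentration(analyte_area=2.4e6, is_area=1.2e6, is_conc=5.0)
print(f"estimated concentration: {conc:.1f} µM")
```

The estimate is only as good as the assumption that the signal is proportional to concentration at both the analyte and IS levels; saturation or suppression of ionization breaks that proportionality, which is the limitation of one-point quantification raised in the discussion of the CE values.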

Conclusions
We evaluated the quality of the quantified values of the detected metabolites in the group of 2541 plasma samples compared with the quantified values from other platforms. We demonstrated the utilization of Kit-Met in a large-scale cohort with interplate normalization by PCA. Kit-Met has been widely utilized for not only plasma but also several kinds of biological specimens, and molecules detected from other specimens, such as tissue, urine, feces, bile, or cultured cells, using the kit were different from those detected in plasma. However, some lipid species values, such as those of DGs and CEs, could show large variation in absolute values. Therefore, normalization via reference materials should be considered when creating a database and used for biomarker searching in future studies of precision medicine.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/metabo11100652/s1, Figure S1: Frequency of detected metabolites in gQC plasma, Figure S2: Median intensities of boxplot for detected metabolites in NIST plasma, Figure S3: Multiple reaction monitoring chromatograms of representative lipid species detected in the NIST plasma samples by SFC-MS/MS, Figure S4: Total ion current (TIC) and mass chromatograms of 148 lipid species with ISs detected in the NIST plasma samples by UHPLC-FTMS in positive ion mode, Figure S5: Examples of typical MRM chromatograms of NIST plasma samples analyzed by Kit-Met 2, Figure S6: 1H-NMR spectrum of metabolites detected in NIST plasma samples, Table S1: List of all quantified metabolites in NIST and gQC plasma samples detected by Kit-Met 1 and Kit-Met 2, Table S2: Classification of metabolites by coefficient of variation, Table S3: List of all quantified metabolites in NIST and gQC plasma samples detected by NMR, Table S4: List of all quantified metabolites in NIST plasma samples detected by SFC-MS/MS, Table S5: List of all quantified metabolites in NIST plasma samples detected by UHPLC-FTMS with annotated information of the metabolites, Table S6: Summary of quantified metabolites in cohort, NIST, and gQC plasma samples detected by Kit-Met 2 after normalization.
Informed Consent Statement: Written informed consent was obtained from all subjects involved in the study.

Data Availability Statement: Because of the participant consent obtained as part of the recruitment process, it is not possible to make these data publicly available. Data are available upon request; please contact the contributing author.