Quality Control of Targeted Plasma Lipids in a Large-Scale Cohort Study Using Liquid Chromatography–Tandem Mass Spectrometry

High-throughput metabolomics has enabled the development of large-scale cohort studies. Long-term studies require multiple batch-based measurements, which require sophisticated quality control (QC) to eliminate unexpected bias to obtain biologically meaningful quantified metabolomic profiles. Liquid chromatography–mass spectrometry was used to analyze 10,833 samples in 279 batch measurements. The quantified profile included 147 lipids including acylcarnitine, fatty acids, glucosylceramide, lactosylceramide, lysophosphatidic acid, and progesterone. Each batch included 40 samples, and 5 QC samples were measured for 10 samples of each. The quantified data from the QC samples were used to normalize the quantified profiles of the sample data. The intra- and inter-batch median coefficients of variation (CV) among the 147 lipids were 44.3% and 20.8%, respectively. After normalization, the CV values decreased by 42.0% and 14.7%, respectively. The effect of this normalization on the subsequent analyses was also evaluated. The demonstrated analyses will contribute to obtaining unbiased, quantified data for large-scale metabolomics.


Introduction
Metabolomics is an omics science that analyzes hundreds of metabolites in biological samples. Recent improvements in this technology have enabled high-throughput and large-scale metabolomic studies. Metabolomics has been widely applied in epidemiological studies that include ≥1000 participants. These studies have led to the discovery of novel metabolic features and biomarkers for various chronic diseases, such as diabetes [1,2], cardiovascular disease [3,4], chronic kidney disease [5], Alzheimer's disease [6], obesity [7], and blood pressure [8]. In addition to efficient data sharing and standardization, the Consortium of Metabolomics Studies (COMETS) was established to encourage large-scale collaboration of prospective cohort studies with human metabolome research [9].
Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) have been used in large-scale metabolomic studies. NMR is suitable for large-scale cohort studies involving long-term measurements because it is highly reproducible, and metabolites can be measured following simple pretreatment. However, NMR has relatively low sensitivity; therefore, only abundant metabolites can be profiled. In contrast, MS can measure a wide range of metabolites with high selectivity and sensitivity. It is impossible to analyze all metabolites using either method because of their wide variety of physical and chemical properties. Therefore, a combination of separation systems with MS, such as gas chromatography-MS (GC-MS), liquid chromatography-MS (LC-MS), and capillary electrophoresis-MS (CE-MS) are used according to the properties of the metabolites to be measured.
The Tsuruoka Metabolomics Cohort Study (TMCS), a cohort study of the Japanese population, was conducted in April 2012. This study used CE-MS to quantify charged metabolites in plasma and urine, and LC-MS was used to analyze lipids in plasma. The samples were collected from more than 10,000 registered participants, which required long-term analyses including multiple batches. The quality control (QC) of the CE-MS plasma data for 52 months resulted in coefficient of variation (CV) values of quantified metabolites of <30% for 85.1% of metabolites [10]. Applications of this dataset have been published for various diseases [11][12][13], physical activity [14], and food intake [15].
A strategy for obtaining high-quality LC-MS data for large-scale metabolomics has been proposed. Luo et al. developed a pseudo-targeted LC-MS method to improve the stability of large-scale metabolomic data [16]. This method included a blank wash step that eliminated the build-up of contaminants from the system and a postcalibration process using QC samples to correct signal drift among multiple batches. Consequently, the CV of 54% of the metabolite features was <15% in three independent batches. Brunius et al. proposed a new approach, including interbatch metabolite feature alignment and intrabatch cluster-based drift correction, to normalize multiple batch data from large-scale nontargeted LC-MS metabolomic data [17].
An approach for overcoming problems related to the conjunction of multiple batches of LC-MS-based lipidomic data was examined in this study. We selected 147 lipid species that are considered to be clinically important, such as bioactive lipids and lipid mediators. This approach could be used for various metabolome analyses, including large-scale cohort studies, although the original use was targeted at LC-MS lipidomics.

Study Population and Sample Collection
TMCS is a Japanese cohort study that started in April 2012 (Tsuruoka City, Yamagata Prefecture, Japan), involving 11,002 participants aged 35 to 74 years old [10,[12][13][14]18]. Participants were recruited from among attendees of annual municipal or workplace health checkup programs held at four city sites at baseline (from April 2012 to March 2015). Written informed consent was obtained from all participants. This study was approved by the Medical Ethics Committee of the School of Medicine, Keio University (approval No. 20110264) and the corresponding regulatory agencies, and all experiments were performed in compliance with approved guidelines.
Blood samples were collected in the morning after 12 h of overnight fasting, and plasma samples were prepared using EDTA-2Na as an anticoagulant and immediately stored at 4 • C. The samples were centrifuged at 1500× g for 10 min at 4 • C within 3 h of sampling. The upper layer was stored at −80 • C until lipid extraction.
The multiple reaction monitoring (MRM) settings were determined using flow injection analyses of commercially available compounds. The lipids belonging to a respective group were measured based on the conditions of the standard internal compounds, including product ions, collision energy, and cell exit potential. The MRM conditions of IS and corresponding lipids are summarized in Supplemental Table S1.

Method Validation
The developed analytical method, including linearity, accuracy, precision, recovery, and sensitivity, was validated according to the bioanalytical method validation guidance for industry (2018) issued by the U.S. Food and Drug Administration.
The IS mixture was diluted with methanol to prepare a 19-point calibration standard at 0.2-100,000 nmol/L to evaluate linearity. Each calibration standard was analyzed five times, and the mean area was used to prepare the calibration curve. Linearity was assessed by calculating the least-squares regression and was expressed using the coefficient of determination. The linear dynamic range was determined to be within ±20% accuracy of the calibration curve.
The concentration of limit of detection (C LOD ) can be calculated using Equation (1): where t s is Student's t-distribution factor for four degrees of freedom at the 99% confidence level (t s = 3.747), C STD is the minimum concentration of the standard in the dynamic range, and S STD is the standard deviation of the peak area after five repeated injections of the standard at the minimum concentration.

Quantitative and Normalization Method of Metabolomic Profile
The concentration of each lipid was calculated using Equation (2) and the corresponding IS (Supplemental Table S1).
where C Lipid is the concentration of the lipid, C IS is the concentration of the corresponding IS, A Lipid is the peak area of the lipid, and A IS is the peak area of the IS. The median value of each lipid was calculated from the quantitative values of the five QC samples measured in the same batch, and the relative value of each lipid in the sample was obtained by dividing by this value.

Statistical Analysis
Metabolomic profiles with and without normalization were analyzed using partial least squares-discriminant analysis (PLS-DA). The concentrations of lipids below the detection limit were substituted with half of the minimum value across all detected samples. Additionally, the relationship between the metabolites and age (each being ten years old) was analyzed. Both data were log 10 scaled and transformed to Z-scores. PLS-DA using all data resulted in R 2 values. The generalization ability (Q 2 ) was assessed at the average value of 5 times 10-fold cross-validations with various random values. The PatternHunter function implemented in MetaboAnalyst with the Spearman correlation option was used to explore the monotonous increase or decrease in metabolites depending on age. Scaling and normalization, similar to those for PLS-DA, were used for this analysis.

Method Validation
In large-scale metabolomic studies, quantified data are generally collected over long periods of time in multiple batches. During this period, one of the most important factors for obtaining stable and reliable data is the use of well-validated analytical methods. The method used in this study was validated using IS. Figure 1 shows the chromatograms of the 13 deuterated or odd-chain IS used in this study. method used in this study was validated using IS. Figure 1 shows the chromatograms of the 13 deuterated or odd-chain IS used in this study.  Table 1 shows the calibration curve results, R 2 value of the coefficients of determination, lower limit of detection, and linear dynamic range for the IS. The calibration curves for all compounds were linear, with coefficients of determination of 0.979-0.994. The limits of detection determined from the standard deviation of the peak area at a minimum concentration in the dynamic range were between 0.41 and 29.17 nmol/L, sufficient to detect low concentrations of lipid components. It was also found that the linear dynamic range of the investigated compounds was between two and four orders of magnitude, which could correspond to a wide concentration range.   Table 1 shows the calibration curve results, R 2 value of the coefficients of determination, lower limit of detection, and linear dynamic range for the IS. The calibration curves for all compounds were linear, with coefficients of determination of 0.979-0.994. The limits of detection determined from the standard deviation of the peak area at a minimum concentration in the dynamic range were between 0.41 and 29.17 nmol/L, sufficient to detect low concentrations of lipid components. It was also found that the linear dynamic range of the investigated compounds was between two and four orders of magnitude, which could correspond to a wide concentration range.
Accuracy, precision, and extraction recovery were also investigated ( Table 2). The error between the quantitative and theoretical values at the three concentrations was within 20% for all examined compounds. The precision of the results for some compounds exceeded 30% at low concentrations. However, almost all of these were within 10% at medium and high concentrations. These results suggest that the analytical method developed in this study is sufficiently accurate and precise for lipid quantification. The accuracy and precision results at 19 standard calibration concentration points are summarized in Supplemental  Table S2. Before sample preparation, the extraction recoveries for lipids from human plasma were determined by adding a known amount of IS mixtures (the concentration of each standard is described in Section 2.4). Except for LPC, the recoveries for the tested compounds ranged from 100.2% to 111.5%, indicating that the extraction of lipids from plasma could be quantitatively performed. Although the extraction recovery of LPC was slightly worse (63.0%) than that of the others in this study, the effect of reduced recovery due to sample preparation can be compensated for by spiking with isotope-labeled standards.

Comparison of Analytical Results with and without Normalization
This study measured 10,833 samples, with each batch consisting of 40 samples. We prepared a pooled QC sample at the beginning of the study and aliquots of the same pooled QC sample were used across all batches for the entire study. QC samples were measured first and last, and for every ten samples; therefore, 5 QC samples were measured for each batch, and 1376 QC samples were analyzed in all batches. The mass spectrometer was autocalibrated every three months, with maintenance performed once a year. Finally, it took 1580 days to measure all samples. Figure 2 shows the normalization strategy adopted in this study. The median value of each lipid was calculated from the quantitative values of the five QC samples measured in the same batch, and the relative value of each lipid in the sample was obtained by dividing by this value. spectrometer was autocalibrated every three months, with maintenance performed once a year. Finally, it took 1580 days to measure all samples. Figure 2 shows the normalization strategy adopted in this study. The median value of each lipid was calculated from the quantitative values of the five QC samples measured in the same batch, and the relative value of each lipid in the sample was obtained by dividing by this value. Figure 2. The quantitative and normalization method adopted in this study. Figure 3 shows the relative areas of the QC samples. Each relative area was divided by the average of those in all QC samples, and, therefore, the horizontal bar at y = 1.0 was the ideal line without any variations. Figure 3A shows the variations in FA, including FA 18:0, FA 18:1, FA 18:2, and FA 22:6. Figure 3B shows the variations in lactosylceramide, including lactosylceramide d18:1-14:0, lactosylceramide d18:1-16:0, lactosylceramide d18:1-18:0, and lactosylceramide d18:1-24:1. FA 18:0 exhibited relatively small variations, whereas FA 22:6 exhibited relatively large variations. Figure 3C shows chromatograms of the FA. Compared to FA 18:0, the retention time (RT) of the FA 22:6 was significantly different from that of the IS and exhibited a larger variation. This trend was the same for lactosylceramide ( Figure 3D). Lactosylceramide d18:1-14:0, whose RT was closest to that of the IS, exhibited quantified data close to 1.0. Additionally, lactosylceramide d18:1-18:0 and lactosylceramide d18:1-24:1, whose RT was significantly different from that of the IS, exhibited a large difference. This large difference indicates that the fluctuations of these peaks were different from those of the IS. Among the 147 metabolites in the QC samples, the CV among batches (inter-CV) and the CV in a batch (intra-CV) are shown in Figure  3E,F. The median values of the inter-and intra-CV were 17.8% and 10.6%, respectively, indicating that the variance among batches was smaller.   Figure 2. The quantitative and normalization method adopted in this study. Figure 3 shows the relative areas of the QC samples. Each relative area was divided by the average of those in all QC samples, and, therefore, the horizontal bar at y = 1.0 was the ideal line without any variations. Figure 3A shows the variations in FA, including FA 18:0, FA 18:1, FA 18:2, and FA 22:6. Figure 3B shows the variations in lactosylceramide, including lactosylceramide d18:1-14:0, lactosylceramide d18:1-16:0, lactosylceramide d18:1-18:0, and lactosylceramide d18:1-24:1. FA 18:0 exhibited relatively small variations, whereas FA 22:6 exhibited relatively large variations. Figure 3C shows chromatograms of the FA. Compared to FA 18:0, the retention time (RT) of the FA 22:6 was significantly different from that of the IS and exhibited a larger variation. This trend was the same for lactosylceramide ( Figure 3D). Lactosylceramide d18:1-14:0, whose RT was closest to that of the IS, exhibited quantified data close to 1.0. Additionally, lactosylceramide d18:1-18:0 and lactosylceramide d18:1-24:1, whose RT was significantly different from that of the IS, exhibited a large difference. This large difference indicates that the fluctuations of these peaks were different from those of the IS. Among the 147 metabolites in the QC samples, the CV among batches (inter-CV) and the CV in a batch (intra-CV) are shown in Figure 3E,F. The median values of the inter-and intra-CV were 17.8% and 10.6%, respectively, indicating that the variance among batches was smaller. Metabolites 2023, 13, x FOR PEER REVIEW 8 of 13  Figure 4 shows the effect of normalization using QC samples on the peaks in the sample data. Figure 4A,B show the quantified data for FA 18:1 before and after normalization. The data for FA 18:1 before normalization exhibited a horizontally flat trend, and most of the data were less than 50 μM. Although several data at the sample numbers around No. 4800 exhibited conspicuously high values over 100 μM ( Figure 4A), all data were flat after normalization ( Figure 4B). Figure 4C,D show the quantified data for lactosylceramide d18:1-16:0 before and after normalization, respectively. The quantified data exhibited larger fluctuations than that of FA 18:1. These fluctuations did not follow a random trend, but the curves showed some patterns, for example, a gradual increase between samples No. 1 and 4000 ( Figure 4C). In addition, abrupt changes were observed at around 4000, 7000, and 10,000 chromatographic runs, which correspond to annual instrument maintenance. After normalization, these patterns disappeared, and all data showed a horizontally flat trend ( Figure 4D). The median value of the inter-CV before normalization was 20.8% ( Figure 4E). After normalization, these values decreased to 14.7% ( Figure 4F). The median value of the intra-CV before normalization was 44.3% and decreased to 42.0% after normalization.  Figure 4 shows the effect of normalization using QC samples on the peaks in the sample data. Figure 4A,B show the quantified data for FA 18:1 before and after normalization. The data for FA 18:1 before normalization exhibited a horizontally flat trend, and most of the data were less than 50 µM. Although several data at the sample numbers around No. 4800 exhibited conspicuously high values over 100 µM ( Figure 4A), all data were flat after normalization ( Figure 4B). Figure 4C,D show the quantified data for lactosylceramide d18:1-16:0 before and after normalization, respectively. The quantified data exhibited larger fluctuations than that of FA 18:1. These fluctuations did not follow a random trend, but the curves showed some patterns, for example, a gradual increase between samples No. 1 and 4000 ( Figure 4C). In addition, abrupt changes were observed at around 4000, 7000, and 10,000 chromatographic runs, which correspond to annual instrument maintenance. After normalization, these patterns disappeared, and all data showed a horizontally flat trend ( Figure 4D). The median value of the inter-CV before normalization was 20.8% ( Figure 4E). After normalization, these values decreased to 14.7% ( Figure 4F). The median value of the intra-CV before normalization was 44.3% and decreased to 42.0% after normalization.

Data Analysis
The effect of normalization on the subsequent statistical analyses was analyzed. Figure 5 shows the relationship between lipid profile and gender, and Figure 5A,B show the score plots of PLS-DA using the data without and with normalization, respectively. The analyses using the data without and with normalization using up to five components resulted in R 2 = 0.859, Q 2 = 0.856 ± 4.49×10 −5 and R 2 = 0.859, Q 2 = 0.856 ± 1.30 × 10 −4 , respectively. The metabolites showing high variable importance in projection (VIP) scores within the top 20 without and with normalized data are shown in Figure 5C and Figure  5D, respectively. Both results include testosterone, one of the androgens, which consistently showed the largest VIP value. Acylcarnitine 20:4, FA 12:0, FA 14:1, FA 18:3, FA 12:1, lactosylceramide d18:1-14:0, acylcarnitine 20:3, and acylcarnitine 20:5 were included within the top ten VIP scores, although the orders were slightly different among them. As an inconsistent result, progesterone and LPC 20:2 were included in the analysis without normalized data, and glucosylceramide d18:1-18:0 was included in the analysis with normalized data.

Data Analysis
The effect of normalization on the subsequent statistical analyses was analyzed. Figure 5 shows the relationship between lipid profile and gender, and Figure 5A,B show the score plots of PLS-DA using the data without and with normalization, respectively. The analyses using the data without and with normalization using up to five components resulted in R 2 = 0.859, Q 2 = 0.856 ± 4.49×10 −5 and R 2 = 0.859, Q 2 = 0.856 ± 1.30 × 10 −4 , respectively. The metabolites showing high variable importance in projection (VIP) scores within the top 20 without and with normalized data are shown in Figure 5C,D, respectively. Both results include testosterone, one of the androgens, which consistently showed the largest VIP value. Acylcarnitine 20:4, FA 12:0, FA 14:1, FA 18:3, FA 12:1, lactosylceramide d18:1-14:0, acylcarnitine 20:3, and acylcarnitine 20:5 were included within the top ten VIP scores, although the orders were slightly different among them. As an inconsistent result, progesterone and LPC 20:2 were included in the analysis without normalized data, and glucosylceramide d18:1-18:0 was included in the analysis with normalized data. The effect of the lipidomic profile on age was analyzed using PLS-DA, and the score plots are shown in Figure 6A,B. PLS-DA resulted in R 2 = 0.521, Q 2 = 0.515 ± 3.73 × 10 4 using the data without normalization, and R 2 = 0.527, Q 2 = 0.519 ± 1.89 × 10 −4 using the data with normalization. The VIP scores are shown in Figure 6C,D. Progesterone levels declined with age, and all other metabolites were included. FA 20:5 and acylcarnitine 20:5 were ranked first and second, respectively. FA 22:6 and acylcarnitine 22:5 were also consistently included, although the order differed. Lactosylceramide 18:0 was included in the data without normalization, whereas LPE 22:6, LPC 20:5, and LPC 24:0 were included in the data with normalization. The effect of the lipidomic profile on age was analyzed using PLS-DA, and the score plots are shown in Figure 6A,B. PLS-DA resulted in R 2 = 0.521, Q 2 = 0.515 ± 3.73 × 10 4 using the data without normalization, and R 2 = 0.527, Q 2 = 0.519 ± 1.89 × 10 −4 using the data with normalization. The VIP scores are shown in Figure 6C,D. Progesterone levels declined with age, and all other metabolites were included. FA 20:5 and acylcarnitine 20:5 were ranked first and second, respectively. FA 22:6 and acylcarnitine 22:5 were also consistently included, although the order differed. Lactosylceramide 18:0 was included in the data without normalization, whereas LPE 22:6, LPC 20:5, and LPC 24:0 were included in the data with normalization.
The PatternHunter function was used to identify metabolites that showed monotonous increases and decreases with age. The analyzed results using the data with and without normalization are shown in Figure 6E  The PatternHunter function was used to identify metabolites that showed monotonous increases and decreases with age. The analyzed results using the data with and without normalization are shown in Figure 6E,F, respectively. FA 20:5 consistently showed the highest positive absolute correlation values in both results. Acylcarnitine 22:5, FA 22:6, and acylcarnitine 20:5 were included in the data without normalization. The negative correlation included progesterone and lysoPAF 18:0. Four metabolites of lactosylceramide and four metabolites of glucosylceramide were included. Only glucosylceramide d18:1-22:2 was included in the normalized data.

Conclusions
A method for the long-term stable measurement of lipid components contained in the plasma of more than 10,000 samples was developed in this study. Long-term quantification of stability using IS representing each lipid group alone is inadequate. In particular, lipids with a larger distance from the IS in the time dimension show large fluctuations. To minimize this problem, a method for correcting the median value of QC samples measured multiple times in the same batch was examined. Using this method, the interbatch

Conclusions
A method for the long-term stable measurement of lipid components contained in the plasma of more than 10,000 samples was developed in this study. Long-term quantification of stability using IS representing each lipid group alone is inadequate. In particular, lipids with a larger distance from the IS in the time dimension show large fluctuations. To minimize this problem, a method for correcting the median value of QC samples measured multiple times in the same batch was examined. Using this method, the interbatch CV decreased from 20.8% (before correction) to 14.7% (after correction). Several lipids were also found to be correlated with age and gender.
In this study, all samples were analyzed using a single instrument. The usefulness of this method should be verified in the future by examining multiple apparatus, apparatus from various vendors, and multiple laboratories.
In conclusion, the correction method used in this study is versatile for various metabolome analyses, even in the long-term measurement of multiple samples, such as cohort studies.  Informed Consent Statement: Written informed consent was obtained from all patients involved in the study.

Data Availability Statement:
The data presented in this study are available upon request from the corresponding author. The data are not publicly available to prevent misuse.