Calibration-Curve-Locking Database for Semi-Quantitative Metabolomics by Gas Chromatography/Mass Spectrometry

Calibration-Curve-Locking Databases (CCLDs) have been constructed for automatic compound search and semi-quantitative screening by gas chromatography/mass spectrometry (GC/MS) in several fields. A CCLD facilitates the semi-quantification of target compounds without calibration curve preparation because it contains the retention times (RTs), calibration curves, and electron ionization (EI) mass spectra, which are obtained under stable apparatus conditions. Despite its usefulness, there is no CCLD for metabolomics. Herein, we developed a novel CCLD and semi-quantification framework for GC/MS-based metabolomics. All analytes were subjected to GC/MS after derivatization under stable apparatus conditions using (1) target tuning, (2) the RT locking technique, and (3) automatic derivatization and injection by a robotic platform. The RTs and EI mass spectra were obtained from an existing authorized database. A quantifier ion and one or two qualifier ions were selected for each target metabolite. The calibration curves were obtained as plots of the peak area ratio of the target compounds to an internal standard versus the target compound concentration. These data were registered in a database as a novel CCLD. We examined the applicability of the CCLD for analyzing human plasma, resulting in time-saving and labor-saving semi-quantitative screening without the need for standard substances.


Introduction
Recently, the demand for quantitative metabolomics to derive metabolite concentrations has increased with the expansion of research fields that require data comparison across measurement batches, methods, and facilities (e.g., cohort studies, international collaborative research, pharmacokinetic analysis, and trans-omics research) [1][2][3]. Chromatographic techniques coupled with mass spectrometry (MS), e.g., gas chromatography/mass spectrometry (GC/MS), liquid chromatography/mass spectrometry (LC/MS), and capillary electrophoresis/mass spectrometry (CE/MS), are the most commonly used techniques in metabolomics because they allow the identification of a wide range of molecular species [4]. However, it is not easy to guarantee quantitative performance with mass spectrometry-based metabolomics because the procedures for experiments and data processing are highly complex and error-prone [5]. In particular, it is necessary to obtain calibration curves for numerous target metabolites for each experiment because the detection sensitivity for each quantifier ion in mass spectrometry generally fluctuates from day to day. Hence, obtaining metabolite concentrations from mass spectrometry-based metabolomics is labor-intensive.
GC/MS is utilized as a primary method for metabolomics because of its high sensitivity, peak capacity, and repeatability, especially for low-molecular-weight metabolites [6]. In addition, its compound identification capabilities are superior to those of other techniques owing to the reproducible fragmentation patterns of electron ionization (EI) mass spectra, for which the electron acceleration energy is standardized at 70 eV. Furthermore, the retention time (RT) of peaks obtained by GC analysis can be kept constant among analysis batches by employing the retention time locking (RTL) method. Therefore, the repeatability of the GC separation can also be improved. Taking advantage of these features of GC/MS, public databases have been constructed for GC/MS-based metabolomics with accurate metabolite identification [7][8][9]. The most versatile and large-scale database for GC/MS-based metabolomics has been reported by Kind et al. as the Fiehn Library, in which more than 1000 compounds have been registered [7]. Furthermore, several research groups have constructed and updated GC/MS libraries for target/non-target metabolome analyses [8][9][10].
In addition to these advantages, a quantification methodology based on GC/MS was developed by constructing a calibration curve locking database (CCLD) in the fields of environmental analysis, pesticide analysis, and forensics [11][12][13]. A CCLD includes the RT, EI mass spectrum, and calibration curve for each target compound. Under target tuning methods of MS, such as the decafluorotriphenylphosphine (DFTPP) tuning in the method provided by the United States Environmental Protection Agency (US EPA method 625) (https://www.epa.gov/sites/production/files/2015-10/documents/method_625_1984.pdf, latest accessed on 1 February 2021), the calibration curves based on the relative peak area (RPA) are constant even if the absolute sensitivity of mass spectrometry fluctuates. Therefore, a CCLD enables the quantification of target compounds without the day-to-day preparation of calibration curves by measuring standard substances.
In this study, we attempted to apply these CCLD concepts to GC/MS-based metabolomic analysis. Because the quantitative performance and identification accuracy of a CCLD can be secured only as long as the apparatus conditions remain constant, the development of methods for apparatus conditioning was necessary. Most of the targets in previous CCLDs are compounds that do not require derivatization, such as organic compounds, agricultural chemicals, and other drugs [11][12][13]. On the other hand, many metabolites are non-volatile and require derivatization. To ensure the quantitative performance of GC/MS analysis with a CCLD, it is necessary to stably reproduce the derivatization reaction for each measurement [14][15][16]. To this end, automatic derivatization systems have been developed using auto-samplers and robotic platforms [17][18][19][20]. We employed a robotic platform system (i.e., a multifunction automatic sampler, PAL RTC system) for the automation of sequential sample manipulation, including two-step derivatization and injection to GC/MS (Figure 1). After optimization of the automated sequential two-step derivatization for oximation and trimethylsilylation, we collected calibration curves of 52 metabolites in central carbon metabolism under DFTPP tuning conditions. After verifying the stability of the calibration curves over several days, the calibration curves were registered in the EI spectrum database based on the Fiehn Library, resulting in a novel CCLD for metabolomics. The novel CCLD was validated in terms of quantitative performance and repeatability by quantifying a reference biological sample, Standard Reference Material (SRM) 1950, provided by the National Institute of Standards and Technology (NIST).
Figure 1. Conceptual diagram showing the workflow for construction of a novel calibration curve locking database (CCLD) for metabolome analysis. The novel CCLD was constructed for semi-quantitative screening by gas chromatography/mass spectrometry (GC/MS). Our in-house CCLD contains the retention time (RT), calibration curve, and electron ionization (EI) mass spectrum for automatic compound search and semi-quantification of target compounds. To achieve repeatable quantification using the CCLD, an automated batch and in-time sample derivatization and sample loading protocol using a PAL RTC system was employed for the construction of the CCLD.

Verification of Stability of Relative Sensitivity of Mass Spectrometry
For the construction of the CCLD, the relative sensitivity among the m/z values of mass spectrometry should be kept constant by DFTPP tuning [10]. The actual sensitivity fluctuation was monitored under auto tuning and DFTPP tuning for 48 days (Figure 2). The absolute sensitivity for m/z = 69, 219, and 502 under the auto tuning method fluctuated with relative standard deviations (RSDs) of 13.1%, 18.0%, and 20.4%, respectively. Under DFTPP tuning, the fluctuations in the abundance of m/z = 69, 219, and 502 were 6.5%, 8.2%, and 11.0%, respectively. The auto tuning algorithm sets the parameters to maximize the mid-range and high-end sensitivity (i.e., high abundances of ions 219 and 502). On the other hand, the DFTPP tuning algorithm sets the target relative abundances of m/z = 69, 219, and 502 to 100%, 55%, and 2%, respectively. Under auto tuning, the relative abundance ratios of m/z = 219 and 502 normalized by m/z = 69 fluctuated with RSDs of 12.5% and 17.3%, respectively, following the fluctuation of the absolute sensitivity for each m/z (Figure 2B). Under DFTPP tuning, the corresponding fluctuations of the relative abundance ratios were 3.8% and 9.7%, respectively (Figure 2C). These results showed that, with DFTPP tuning, the RPA for the calibration curve of each target metabolite remained constant among different analytical batches.
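The RSD comparison described above can be illustrated with a short calculation. The following is a minimal sketch in Python; the function names and the abundance values are hypothetical placeholders (not the measured tuning data), and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation in percent (sample SD / mean; assumed estimator)."""
    return 100.0 * stdev(values) / mean(values)

def relative_abundance(target, base):
    """Abundance of a target ion normalized by the base ion (e.g., m/z 69), per tuning run."""
    return [t / b for t, b in zip(target, base)]

# Hypothetical abundances from four tuning runs (illustrative numbers only)
mz69 = [400000, 420000, 380000, 410000]
mz219 = [220000, 233000, 208000, 226000]

ratio_219 = relative_abundance(mz219, mz69)
print(f"RSD of absolute m/z 219 abundance: {rsd_percent(mz219):.1f}%")
print(f"RSD of 219/69 relative abundance:  {rsd_percent(ratio_219):.1f}%")
```

When the ions drift together, the ratio varies less than either absolute abundance, which is the property the RPA-based calibration relies on.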

Optimization of Automatic Derivatization Condition Using PAL RTC
For the metabolome analysis based on GC/MS, a two-step sequential derivatization reaction was employed, which combines oximation and trimethylsilylation. The method provided by the Fiehn group enabled efficient derivatization by decreasing the amount of oxime reagent and increasing the amount of silylating reagent [6]. We employed Fiehn's method for the automation of the two-step sequential derivatization using a robotic platform PAL system, according to our previous report [21]. For the construction of a CCLD ensuring high repeatability and quantitative performance, it is necessary to improve the sensitivity of GC/MS because the absolute sensitivity under DFTPP tuning tended to be lower than that under auto tuning, as shown in Figure 2. The reaction temperature and reagent amount for the two-step sequential derivatization reaction were modified to improve the sensitivity of the GC/MS-based metabolomic analysis. To evaluate the improvement in sensitivity achieved by modifying the derivatization method, standard substance mixtures (SSMs) of 52 metabolites (Table 1) were analyzed, and the relative peak area (RPA), which is the peak area of the quantifier ion of a target metabolite normalized by that of the internal standard (IS), was compared among the methods.
For comparison of the methods, SSM and IS1 (d10-phenanthrene) were prepared so that their theoretical final concentrations after the derivatization process were constant, [SSM] = 200 µmol/L and [IS1] = 53.1 µmol/L (10 µg/L) (Table S1). To use a single agitator for automatic derivatization with the PAL system, the same temperature was applied to both the oximation and silylation reactions. We compared the RPA of each metabolite among methods A (30 °C), B (37 °C), and C (50 °C) using the PAL system. As a control method, we employed the manual derivatization method provided by the Fiehn group (oximation at 30 °C and silylation at 37 °C), as shown in Table S1. All detected TMS derivatives of the target metabolites are listed in Table S2. One derivative was chosen for each metabolite as a quantification target based on sensitivity and repeatability (Table 1). No significant differences in RPA values were observed between the control method and method A (30 °C) (Table S3). With methods B (37 °C) and C (50 °C), the RPA values of α-ketoglutarate and cysteine increased. At 50 °C, the RPA values of fumaric acid and glutamic acid decreased, while that of pyroglutamic acid increased. These results indicate that increasing the temperature increased the derivatization efficiency for α-ketoglutarate and cysteine, while a high temperature caused the conversion of glutamic acid to pyroglutamic acid. Therefore, we employed 37 °C for the derivatization reaction. Next, to achieve higher sensitivity by reducing the sample dilution with the derivatization reagents, the reagent volumes for oximation and silylation were reduced (methods D and E in Table S1). When the total volume of reagent for oximation and silylation was reduced by 50% (method D), almost all target metabolites were detected with higher RPA values than those of the control method (Table S3). A further 25% reduction in the volume of the silylation reagent (method E) also gave higher RPA values than those obtained using the control method for almost all target metabolites (Table S3).
Considering these results, method E was chosen as the optimized derivatization method for the novel CCLD construction. Finally, we compared the GC/MS analysis results of the same analyte (25 µL SSM containing 200 µmol/L of each metabolite) obtained using method E and the control method. Figure 3A shows the fold changes of RPA (the value obtained by dividing the RPA with method E by the RPA with the control method) for all target metabolites. The improvement in sensitivity using method E was over four-fold for all 52 metabolites (Figure 3A). This improvement in sensitivity was mainly due to the concentration effect of reducing the reagent volume, whereas the improvement in derivatization efficiency caused by changing the temperature also enhanced the sensitivity for some metabolites. The repeatability of the RPA value of each metabolite also improved with increasing sensitivity (Figure 3B).
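The RPA and fold-change definitions used in this comparison can be written out explicitly. The sketch below is in Python; the function names and peak areas are hypothetical and serve only to illustrate the arithmetic, not to reproduce the values in Figure 3A.

```python
def rpa(quantifier_area, is_area):
    """Relative peak area: quantifier-ion peak area divided by the internal standard peak area."""
    if is_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return quantifier_area / is_area

def fold_change(rpa_method, rpa_control):
    """Fold change of RPA: RPA with the tested method divided by RPA with the control method."""
    return rpa_method / rpa_control

# Hypothetical peak areas for one metabolite (illustrative numbers only)
rpa_control = rpa(quantifier_area=1.2e5, is_area=6.0e5)
rpa_method_e = rpa(quantifier_area=5.4e5, is_area=6.0e5)

print(f"Fold change (method E / control): {fold_change(rpa_method_e, rpa_control):.2f}")
```

Because the IS concentration is held constant across methods, a higher RPA reflects a higher on-column analyte signal rather than a change in the normalizer.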

Construction of CCLD
To prepare calibration curves for 52 metabolites for the CCLD, 0, 5, 50, 100, and 200 µM of SSMs were analyzed under DFTPP tuning. Three additional concentrations (500, 1000, and 1500 µM) were analyzed for glucose because the standard reference sample SRM 1950 contained glucose in this concentration range. Each calibration curve was prepared as a linear calibration by least-squares regression using the RPA normalized by IS1 (d10-phenanthrene) (Figure S1). All concentrations were analyzed in triplicate on three different days, resulting in nine calibration curve sets and one averaged calibration curve set (Table 1). Intra-day and day-to-day variations in the slope of each calibration curve were small for all metabolites (Table S4). These results demonstrated that DFTPP tuning facilitated the repeatable analysis of the same analytes among different analytical batches. The limits of quantification (LOQs) were determined by analysis of serially diluted SSMs. The LOQ was defined as the lowest analyte concentration at which a signal-to-noise (S/N) ratio of over 10 was obtained. The LOQs were 50 µM for the following metabolites: ergosterol, glycine, glycolic acid, pyruvic acid, and α-ketoglutaric acid. The LOQs were 5 µM for all other metabolites (Table S4). The average calibration curve set, mass spectra, and other parameters listed in Table 1 (RT, quantifier, and qualifiers) were registered to a quantification method using MassHunter software, resulting in the novel CCLD.
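As an illustration of how a locked calibration curve is built and then reused, the following minimal Python sketch fits a straight line to RPA versus concentration by ordinary least squares and inverts it to estimate a concentration. The concentration levels are those stated above; the RPA values, the function names, and the choice to fit a free intercept are assumptions made for illustration only.

```python
def fit_line(conc, rpa):
    """Ordinary least-squares fit of rpa = slope * conc + intercept."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(rpa) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, rpa))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantify(rpa_value, slope, intercept):
    """Estimate a concentration from a measured RPA using a stored calibration line."""
    return (rpa_value - intercept) / slope

# Calibration levels from the text (µM); RPA values are hypothetical
levels_um = [0, 5, 50, 100, 200]
rpa_values = [0.0, 0.05, 0.5, 1.0, 2.0]

slope, intercept = fit_line(levels_um, rpa_values)
# The stored (slope, intercept) pair plays the role of the locked curve:
# later samples are quantified without re-measuring the standards.
print(f"slope = {slope:.4f} per µM, intercept = {intercept:.4f}")
print(f"estimated concentration for RPA 0.75: {quantify(0.75, slope, intercept):.1f} µM")
```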

Method Validation by Quantification of Human Plasma Sample
Finally, we performed method validation of CCLD-based GC/MS quantification using a reference sample (50 µL) of human plasma, NIST SRM 1950. To determine the appropriate sample pretreatment, different volumes of plasma extract (upper phase 50, 100, and 150 µL; see Materials and Methods for details) were analyzed, and the range in which the RPA value was linear was determined (Table S5). Because linearity was lost for fructose and succinic acid at a sample volume of 150 µL (Figure S2), the amount of plasma extract was set to 100 µL thereafter. To validate the constructed CCLD, an addition-recovery test was performed using SRM 1950. A mixture of 25 µL of SSM (50 µM) and 100 µL of plasma sample extract was prepared in a vial and analyzed by GC/MS using the CCLD. As a result, a reasonable recovery rate was achieved for all metabolites (58-133%) (Table 2).
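The recovery rate in an addition-recovery test is commonly computed as the increase in measured concentration after spiking, divided by the amount added. The text does not give the exact formula used, so the Python sketch below assumes this conventional definition; the function name and the example concentrations are hypothetical.

```python
def recovery_percent(unspiked, spiked, added):
    """Recovery (%) = (measured spiked - measured unspiked) / added amount * 100.

    All three values must be expressed in the same concentration unit.
    """
    if added <= 0:
        raise ValueError("added amount must be positive")
    return 100.0 * (spiked - unspiked) / added

# Hypothetical measured concentrations for one metabolite (µM, illustrative only)
measured_plasma = 30.0       # plasma extract alone
measured_spiked = 80.0       # plasma extract plus standard
added_standard = 50.0        # nominal added concentration

print(f"recovery = {recovery_percent(measured_plasma, measured_spiked, added_standard):.0f}%")
```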
To evaluate the quantification by the novel CCLD for metabolomics, the reference human plasma sample SRM 1950 was analyzed, and the quantification results were compared with the metabolite concentrations provided by NIST. We analyzed SRM 1950 three times per day, and the analysis was performed on three days. As a result, 28 metabolites in the CCLD were detected and identified (Table 3). The concentration of each identified metabolite quantified by the novel CCLD was close to the literature value provided by NIST (https://www-s.nist.gov/srmors/certificates/1950.pdf (accessed on 1 February 2021)).
Although we confirmed that adipic acid was not detected in SRM 1950, several studies have mentioned that adipic acid can be present in human blood as an exogenous food additive [22][23][24]. To improve the quantitative performance of the CCLD, it would be helpful to use commercially available stable isotope-labeled adipic acid as an alternative internal standard (e.g., 13C2-adipic acid). To extend the CCLD to different types of biological samples (e.g., foods, cells, tissue, urine, and feces), there is room to select appropriate internal standards for each sample type and target metabolite.
From these results, it was proven that CCLD could be used for quantitative metabolome analysis over different batches with high repeatability (except for metabolites that have been recognized as difficult to quantify). In the analysis of SRM 1950, 28 metabolites were detected in addition to the 52 target metabolites registered in the CCLD. Since the CCLD is based on non-target GC/MS analysis, if there are interesting metabolites, it is possible to quantify them even in past data measured in different batches by updating the CCLD with the addition of new calibration curves. By analyzing various biological samples with CCLD and expanding the library, it is expected that the usefulness of metabolome data would be synergistically increased.

Preparation of Standard Mixtures
To prepare the calibration curves of target metabolites, we used the metabolite mixture kit as the SSM, which contains 52 metabolites (200 µmol/L each) related to central carbon metabolism (Table 1). The SSM was serially diluted with water to obtain an appropriate range of concentrations, and 100 µL of each dilution was transferred to a clean 300 µL fixed-insert vial. The vials were dried for 2 h using a centrifugal concentrator (VC-36S, TAITEC Co., Saitama, Japan). During this drying step, the inner wall of the vial insert was washed three times with methanol to wash down the compounds that stick to the insert wall into the reaction solution. To gradually reduce the liquid level in the insert, we decreased the washing methanol volume step-by-step: 50 µL at 20 min, 20 µL at 40 min, and 10 µL at 55 min. When the applied volume of SSM in each method was changed to optimize the derivatization conditions, we adjusted the SSM concentration to ensure that the on-column concentration of each metabolite was the same.

Extraction of Metabolites from the Plasma Sample
Each 50 µL of the plasma sample was collected in a clean 2 mL microtube. It was diluted with 950 µL of extraction solution containing 546 µL of methanol, 222 µL of chloroform, 172 µL of water, and 10 µL of adipic acid (1 mmol/L in stock) as IS2 (Figure S4). After vortex mixing, the mixture was centrifuged at 16,000× g at 4 °C for 5 min, and then the upper phase (700 µL) was transferred to a clean 2 mL microtube. After the addition of 155 µL of water and 235 µL of chloroform, the mixture was vortexed and centrifuged at 16,000× g at 4 °C for 3 min. The upper phase (100 µL) was transferred to a clean 300 µL fixed-insert vial and evaporated to dryness in the same manner as in the SSM drying step described above. The dried plasma extract was subjected to automated derivatization and GC/MS analysis, as described in the next section.

Automated Derivatization and GC/MS Analysis
The method for sequential automatic derivatization and injection to GC/MS analysis was constructed using the robotic platform PAL RTC and PAL Sample Control software (CTC Analytics AG, Zwingen, Switzerland). The configuration of the PAL RTC system is shown in Figure S3. A brief summary of the automatic sequential derivatization and injection procedure is presented in Figure S4. The vials containing the derivatization reagents and the sample were placed on a vial tray on the tray holder of the PAL RTC system (Figure S3B). The oximation reagent (pyridine containing 40 mg/mL of methoxyamine hydrochloride and d10-phenanthrene as IS1) and MSTFA were placed into clean vials on tray 2 on slot 2 of the tray holder. When the total volume of the silylation and oximation reagents was changed, the concentration of d10-phenanthrene in the oximation reagent was adjusted so that the theoretical final concentration after derivatization remained 53.1 µmol/L (10 µg/L).
The vial with the dried plasma extract was capped with a metal cap (GL Sciences Inc., Tokyo, Japan) for the PAL system and placed on tray 1 on slot 1 of the tray holder. To start the derivatization process, the vial with the sample was moved to the agitator, and 5 µL of the oximation reagent was added to dissolve the dried sample. The sample was mixed for 90 min at 750 rpm and 37 °C in the agitator. The second reagent (20 µL), MSTFA, was then added to the sample and mixed for 30 min at 750 rpm and 37 °C. After the 2 h derivatization was complete, the vial was moved back to tray 1.
An Agilent GC 7890A coupled to a 5975C inert MSD (Agilent Technologies, Santa Clara, CA, USA) was used for metabolomic analysis. For this purpose, 1 µL of the sample was injected using a 10 µL syringe of the PAL RTC. GC analysis was performed on a DB5-MS (30 m × 0.25 mm i.d., film thickness: 0.25 µm) capillary column (Agilent Technologies). Prior to sample analysis, the RT was locked at 16.727 min using d27-trimethylsilylated myristic acid [10]. The samples were injected in split mode (1:10) with the injection port held at 250 °C. The initial oven temperature was maintained at 60 °C for 1 min and then ramped at 10 °C/min to 325 °C and held for 10 min. The MSD transfer line was held at 290 °C, the ion source was held at 250 °C, and the quadrupole was held at 150 °C. EI mass spectra were generated at an ionization energy of 70 eV. The GC/MS data were acquired for 37.5 min with 5.9 min of solvent delay at a normal scan rate (781 u/s) in the mass range of m/z 50-650. DFTPP tuning was performed to obtain uniform mass spectra using a DFTPP tune file provided by Agilent Technologies, which allows adjusting MS parameters to meet the relative abundance criteria defined by EPA method 625 (https://www.epa.gov/sites/production/files/2015-10/documents/method_625_1984.pdf, accessed on 1 February 2021).
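The 37.5 min acquisition time follows directly from the oven program (initial hold, linear ramp, final hold). The short Python sketch below works through this arithmetic; the function name is hypothetical, and the parameters are the values stated above.

```python
def oven_run_time_min(initial_temp_c, initial_hold_min, ramp_c_per_min,
                      final_temp_c, final_hold_min):
    """Total run time of a single-ramp GC oven program, in minutes."""
    ramp_time = (final_temp_c - initial_temp_c) / ramp_c_per_min
    return initial_hold_min + ramp_time + final_hold_min

# Oven program from the text: 60 °C (1 min), 10 °C/min to 325 °C, hold 10 min
total_min = oven_run_time_min(
    initial_temp_c=60, initial_hold_min=1,
    ramp_c_per_min=10, final_temp_c=325, final_hold_min=10,
)
print(total_min)  # 1 + 26.5 + 10 = 37.5
```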

Data Analysis
Data analysis was performed using MassHunter B.07.01 (Agilent Technologies). All quantifier ions for the target metabolites and ISs are listed in Table S2. After peak detection with deconvolution, the target compound peaks were identified by a library search. The peak area of each quantifier ion was then determined. The peak area of d10-phenanthrene (m/z 188) as IS1 was used to normalize all the other peak areas. The extraction efficiency of the plasma sample in the sample preparation was corrected based on the peak area of adipic acid (m/z 111) as IS2.

Preparation of Calibration Curves
SSMs were prepared for the calibration curve construction so that the theoretical concentration after derivatization was 0, 5, 50, 100, or 200 µmol/L. For glucose, three additional concentrations of 500, 1000, and 1500 µmol/L were analyzed. As IS1, d10-phenanthrene was prepared at a concentration of 25 µg/mL in the oximation reagent (pyridine containing 40 mg/mL methoxyamine hydrochloride) to obtain a theoretical final concentration of 53.1 µmol/L (10 µg/L). Each calibration curve was prepared as a linear calibration by least-squares regression using the plot of RPA normalized by IS1.

Conclusions
We constructed a novel CCLD for quantitative metabolomics, including the EI mass spectrum, RT, quantifier ion, and calibration curve, for 52 metabolites in central carbon metabolism. The derivatization reaction was automated using a robotic platform to ensure high repeatability in the GC/MS analysis with the CCLD. We improved the sensitivity for each metabolite more than four-fold by optimizing the conditions for the derivatization reaction to ensure quantitative performance. Using SRM 1950 from NIST as a reference biological sample, we demonstrated that quantification of metabolites using the CCLD showed high repeatability across different batches. The concentrations in SRM 1950 quantified by the CCLD were similar to the values provided by NIST. Since the CCLD is extensible and MS scan data measured under target tuning can be reused, it is expected that the more biological samples are analyzed, the greater the usefulness of the accumulated data will be. For further expansion of the application of the CCLD to various types of biological samples (e.g., foods, cells, tissue, urine, and feces), there is room to investigate matrix effects and the selection of appropriate internal standards for each sample type and target metabolite.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/metabo11040207/s1. Figure S1. Results of calibration curve collection on three different days under DFTPP tuning. Figure S2. Correlation of RPA value and sample amount. Figure S3. Diagram of a robotic platform PAL system. Figure S4. Workflow of the automated sequential derivatization and the injection to the GC/MS system using the PAL RTC system. Table S1. Derivatization condition list. Table S2. All detected derivatives and IS. Table S3. Comparative result of derivatization conditions. Table S4. Repeatability and limit of quantification of CCLD. Table S5. Results of GC/MS analysis to optimize the sample amount for SRM 1950.