Differentiation of Mountain- and Garden-Cultivated Ginseng with Different Growth Years Using HS-SPME-GC-MS Coupled with Chemometrics

Although Mountain-Cultivated Ginseng (MCG) and Garden-Cultivated Ginseng (GCG) differ in appearance, it is very difficult to distinguish them once the samples are processed into slices or powder. Moreover, there is a significant price difference between them, which leads to widespread adulteration and falsification in the market. Thus, the authentication of MCG and GCG is crucial for the effectiveness, safety, and quality stability of ginseng. In the present study, a headspace solid-phase microextraction gas chromatography-mass spectrometry (HS-SPME-GC-MS) approach coupled with chemometrics was developed to characterize the volatile component profiles of MCG and GCG with 5, 10, and 15 growth years, and subsequently to discover differentiating chemical markers. As a result, we characterized, for the first time, 46 volatile components from all the samples by using the NIST database and the Wiley library. The base peak intensity chromatograms were subjected to multivariate statistical analysis to comprehensively compare the chemical differences among the above samples. MCG 5-,10-,15-years and GCG 5-,10-,15-years samples were mainly divided into two groups by unsupervised principal component analysis (PCA), and five potential cultivation-dependent markers were discovered based on orthogonal partial least squares-discriminant analysis (OPLS-DA). Moreover, MCG 5-,10-,15-years samples were divided into three blocks, and 12 potential growth-year-dependent markers enabled differentiation. Similarly, GCG 5-,10-,15-years samples were also separated into three groups, and six potential growth-year-dependent markers were determined. The proposed approach could be applied to directly distinguish MCG and GCG with different growth years and to identify the differentiating chemo-markers, which is an important criterion for evaluating the effectiveness, safety, and quality stability of ginseng.


Introduction
Ginseng, a perennial herb of the Araliaceae family, is known as the king of herbs and also the king of medicine, and has been used clinically for thousands of years in Asian countries. It is called "Ginseng" because its rhizome looks like a person [1]. According to the different growth environments and diverse cultivation modes, ginseng is mainly divided into three categories: Mountain-Cultivated Ginseng (MCG), Garden-Cultivated Ginseng (GCG), and Wild Ginseng (WG). MCG is planted artificially in mountain forests and grows

Results

Components Identification from MCG 5-,10-,15-years and GCG 5-,10-,15-years
Using the optimal HS-SPME-GC-MS conditions described in Section 4.3, the representative Base Peak Intensity (BPI) chromatogram of ginseng samples is presented in Figure 1. MCG and GCG samples with different growth years have similar chemical profiles, but differences in the content of some compounds can be noted visually. Based on the NIST database and the Wiley library, a total of 46 components were preliminarily identified, including 29 sesquiterpenes, 7 carbonyl compounds, 1 pyrazine, and 9 others. The relative contents of the 46 components in the different samples were calculated using 2-heptanone as the internal standard (I.S.), and the detailed information is summarized in Table 1.

Table 1. The contents of the volatile compounds in MCG 5-,10-,15-years and GCG 5-,10-,15-years (ng/g, n = 6).

Principal Component Analysis (PCA)
To clearly differentiate among ginseng samples, unsupervised pattern recognition PCA, which converts multi-index data into a small number of feature components and provides a visual overview of the major differences among samples, was applied to intuitively refine group differences. After Pareto scaling and mean-centering, the dataset of MCG 5-,10-,15-years and GCG 5-,10-,15-years was displayed as score plots in a coordinate system of principal components after dimensionality reduction. As shown in Figure 2A, PCA score plots mainly separated MCG 5-,10-,15-years and GCG 5-,10-,15-years into two groups, independent of growth years, indicating that the growth environment and cultivation mode played more important roles regarding volatile secondary metabolites.
Generally, R2X(cum) and Q2(cum) are used to evaluate the quality of mathematical models. R2X(cum) represents the fraction of the information in the data matrix that the model explains, Q2(cum) represents the predictive ability of the model, and both of them should be greater than 0.5 [16,17]. In the present study, R2X(cum) and Q2(cum) were 0.896 and 0.782, respectively, indicating good fit and predictive ability of the established PCA model.
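The preprocessing and projection described above can be sketched in a few lines. This is only an illustration in plain Python, not the SIMCA-P implementation used in the study: the peak-area matrix is invented, the helper names `pareto_scale` and `nipals_pc1` are hypothetical, and only the first principal component is extracted (with a NIPALS-style iteration), so the printed value is the R2X of a single component rather than a cumulative value.

```python
import math

def pareto_scale(rows):
    """Mean-center each column and divide it by the square root of its sample SD."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    # Constant columns (SD = 0) are only centered, to avoid dividing by zero.
    divisors = [math.sqrt(s) if s > 0 else 1.0 for s in sds]
    return [[(v - m) / d for v, m, d in zip(row, means, divisors)] for row in rows]

def nipals_pc1(X, iters=500, tol=1e-12):
    """Return scores, loadings, and explained fraction (R2X) of the first PC."""
    n, p = len(X), len(X[0])
    # Start from the column with the largest sum of squares.
    j0 = max(range(p), key=lambda j: sum(row[j] ** 2 for row in X))
    t = [row[j0] for row in X]
    for _ in range(iters):
        tt = sum(v * v for v in t)
        load = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        norm = math.sqrt(sum(v * v for v in load))
        load = [v / norm for v in load]
        t_new = [sum(X[i][j] * load[j] for j in range(p)) for i in range(n)]
        diff = sum((a - b) ** 2 for a, b in zip(t, t_new))
        t = t_new
        if diff < tol:
            break
    total_ss = sum(v * v for row in X for v in row)
    r2x = sum(v * v for v in t) / total_ss
    return t, load, r2x

# Hypothetical peak areas: rows 0-2 mimic one cultivation mode, rows 3-5 the other.
peak_areas = [
    [120.0, 15.0, 3.0],
    [130.0, 14.0, 2.5],
    [125.0, 16.0, 3.2],
    [60.0, 30.0, 3.1],
    [55.0, 33.0, 2.9],
    [65.0, 29.0, 3.0],
]

scaled = pareto_scale(peak_areas)
scores, loadings, r2x = nipals_pc1(scaled)
print("PC1 scores:", [round(s, 2) for s in scores])
print("R2X of PC1:", round(r2x, 3))
```

In this toy matrix the two groups of rows receive PC1 scores of opposite sign, which is the kind of separation a score plot makes visible. Pareto scaling sits between no scaling and unit-variance scaling, so abundant peaks still carry more weight than trace peaks without dominating completely.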

Chemo-Markers Discovery for Distinguishing MCG 5-15-years and GCG 5-15-years
To determine the variables responsible for the separation between MCG 5-15-years and GCG 5-15-years, the orthogonal partial least squares-discriminant analysis (OPLS-DA) approach was applied to the volatile component profiles. As shown in Figure 2B, OPLS-DA score plots mainly separated MCG 5-15-years and GCG 5-15-years into two blocks, especially revealing the growth-year variation in the component P1 direction (X-axis), and component P2 (Y-axis). R2X(cum) and Q2(cum) were 0.912 and 0.832, respectively, indicating good fit and predictive ability of the established OPLS-DA model. To verify the effectiveness of the OPLS-DA model, 200 rounds of a permutation test were conducted. As shown in Figure S1A (Supplementary Materials), all blue Q2 values on the left were lower than the original point on the right, and the blue regression line of Q2 intersected the vertical axis (left) at or below zero, suggesting that our OPLS-DA model was reliable. To exhibit the responsibility of each ion for the separation more intuitively, S-plots were obtained. As shown in Figure 2C, most of the ions were gathered around the origin with only a few ions scattered around the edge area, and only the compounds represented by these few ions contributed to the separation observed in the OPLS-DA score plots. In the present study, five variables (marked in red) with Variable Importance for the Projection (VIP) > 1 and p < 0.05 were selected as the potential chemo-markers, which directly led to the differentiation between MCG 5-15-years and GCG 5-15-years. Table 2 summarizes the detailed information of the five potential cultivation-dependent markers, including hexanal, β-gurjunene, 2-oxo-pentanoic acid, heptanal, and octanal. The contents of hexanal, 2-oxo-pentanoic acid, heptanal, and octanal in MCG 5-15-years were significantly higher, while β-gurjunene in GCG 5-15-years was significantly higher (Figure 2D). Therefore, our data suggest that the above five volatile components might be used as the unique chemo-markers for discrimination between MCG 5-15-years and GCG 5-15-years.
As before, PCA was used to intuitively refine group differences among MCG 5-,10-,15-years. As shown in Figure 3A, the MCG 5-,10-,15-years samples were clearly separated into three blocks, indicating that the volatile component profiles of MCG change with growth years. R2X(cum) and Q2(cum) were 0.846 and 0.779, respectively, indicating good fit and predictive ability of the established PCA model (Figure 3A).
To determine the variables responsible for the separation between MCG 5-years and MCG 10-years, MCG 5-years and MCG 15-years, and MCG 10-years and MCG 15-years, the OPLS-DA model was applied to the above dataset. As shown in Figure 3B-D, OPLS-DA score plots separated each pair of samples clearly into two blocks, with R2X(cum) values of 0.838, 0.9085, and 0.885, respectively, and Q2(cum) values of 0.704, 0.564, and 0.785, respectively. These models were subjected to 200 rounds of permutation tests to confirm their high predictability (Figure S1C,E,G). The ions intuitively responsible for the separation are marked in red in the S-plots (Figure 3E-G). As a result, with VIP > 1 and p < 0.05, six variables were determined as potential chemo-markers for discrimination between MCG 5-years and MCG 10-years, seven chemo-markers were determined for the differentiation between MCG 5-years and MCG 15-years, and one chemo-marker was determined to distinguish between MCG 10-years and MCG 15-years. Altogether, 12 potential growth-year-dependent markers, including (−)-α-neoclovene, β-panasinsene, γ-gurjunene, β-neoclovene, β-chamigrene, heptanal, 2-oxo-pentanoic acid, benzaldehyde, octanal, (E)-2-octenal, (E)-2-heptenal, and 3-octen-2-one, were determined and the information is summarized in Table 3. Of the 12 chemo-markers, except for benzaldehyde, the contents of the other 11 compounds increased with the growth years (Figure 3H), consistent with the market assessment that the higher the growth-years, the greater the value.
Similarly, PCA was used to intuitively refine group differences among GCG with 5-, 10-, and 15-growth years. As shown in Figure 4A, PCA score plots separated GCG 5-,10-,15-years into three groups, indicating that the chemical profile changes with the growth years. The model fit parameters were 0.969 for R2X(cum) and 0.79 for Q2(cum), indicating good fit and predictive ability of the established PCA model.

Chemo-Markers Discovery for Distinguishing GCG with 5-, 10-, 15-Growth Years
To select the ions responsible for discrimination between GCG 5-years and GCG 10-years, GCG 5-years and GCG 15-years, and GCG 10-years and GCG 15-years, the above dataset was subjected to OPLS-DA. As shown in Figure 4B-D, OPLS-DA score plots clearly separated each pair of samples into two groups, where the model fit parameters were 0.938, 0.913, and 0.842 for R2X(cum), and 0.915, 0.728, and 0.692 for Q2(cum), respectively. These models were subjected to 200 rounds of permutation tests to confirm their high predictability (Figure S1I,K,M). The volatile components intuitively responsible for the differentiation are marked in red in the S-plots (Figure 4E-G). As a result, with VIP > 1 and p < 0.05, six ions were determined as the potential chemo-markers for discrimination between GCG 5-years and GCG 10-years, six chemo-markers were determined for the differentiation between GCG 5-years and GCG 15-years, and two chemo-markers were determined for distinguishing between GCG 10-years and GCG 15-years. Altogether, six potential growth-year-dependent markers, including hexanal, 2-oxo-pentanoic acid, heptanal, benzaldehyde, octanal, and 2-isopropyl-3-methoxypyrazine, were determined, and the information is summarized in Table 4. The contents of all six chemo-markers were the highest in GCG 5-years (Figure 4H), which is also consistent with the market assessment that GCG 5-years is the main commodity in the market.


Discussion
Ginsenosides are the main components in the chemical profiles of ginseng, and some prevalent analytical methods, such as UPLC-MS and 1H NMR, have been developed to distinguish different types of ginseng based on the diversity of ginsenosides. As is well known, ginseng has its own odor, and its aroma characteristics have attracted wide attention for ginseng identification. Thus, GC-MS also enables ginseng discrimination based on the volatile components.
According to their different growth environments and cultivation modes, MCG 5-,10-,15-years and GCG 5-,10-,15-years have different volatile chemical profiles. To further systematically compare the similarities and differences of the volatile components contained in MCG 5-,10-,15-years and GCG 5-,10-,15-years, and particularly to compare the composition variations with growth years, HS-SPME-GC-MS coupled with chemometrics was developed to discriminate the volatile component profiles and subsequently to discover differentiating chemical markers.
First, a total of 46 volatile components were preliminarily characterized in the samples by using the NIST database and the Wiley library. According to reports in the literature, the main volatile components of ginseng are sesquiterpenoids and sesquiterpenols, which have anticancer, anti-inflammatory, and immunomodulatory pharmacological activities [18,19]. In our study, the contents of sesquiterpenes such as (−)-β-elemene, β-panaxalene, (−)-α-neobutene, and γ-gultrane were high in ginseng. In addition, derivatives of methoxypyrazine, an earthy aroma component in wine, may be the main source of ginseng's unique flavor. 2-Isopropyl-3-methoxypyrazine was detected in all groups, consistent with previous studies and, with increasing growth-years, its contents gradually increased.
Secondly, combined with multivariate analysis, it was found that different cultivation modes affected the volatile components in ginseng. Although both GCG and MCG are sources of ginseng, their cultivation modes are quite different. GCG is planted in the field, and the growth process is subject to human interference. Thus, the light exposure time, light intensity, and nutrition of GCG are much better than those of MCG. For comparison, the volatile components of Zea mays L. decrease significantly under light or nutrient deficiency, and the contents of (Z)-3-hexenyl acetate and (Z)-3-hexen-1-ol in Brassica napus decreased under nitrogen deficiency [20]. This precedent indicates that light, nutrition, and temperature may be important factors for the significant differences in volatile components of ginseng, and may be the reasons why some volatile component contents were higher in GCG than in MCG [21]. However, although the contents of hexanal, 2-oxo-pentanoic acid, heptanal, and octanal in GCG were higher than in MCG, it has been reported that alkane components and intermediate products such as 2-oxo-pentanoic acid contribute less to the fragrance of plants. β-Gurjunene, which has a balsam flavor, was significantly higher in MCG than in GCG, which may be why the taste of MCG is more popular with consumers. In short, the cultivation environment leads to different volatile components between MCG and GCG. Five components were selected as potential cultivation-dependent chemo-markers for the differentiation between MCG 5-,10-,15-years and GCG 5-,10-,15-years, including hexanal, 2-oxo-pentanoic acid, heptanal, octanal, and β-gurjunene. The contents of the first four compounds were significantly higher in GCG 5-,10-,15-years, while β-gurjunene was significantly higher in MCG 5-,10-,15-years.
Thirdly, the compositional variation with growth years was explored in MCG. A total of 12 volatile components were selected as potential growth-year-dependent chemo-markers for discrimination among MCG 5-,10-,15-years, including β-panasinsene, β-neoclovene, (−)-α-neoclovene, β-chamigrene, γ-gurjunene, 2-oxo-pentanoic acid, (E)-2-heptenal, 3-octen-2-one, benzaldehyde, (E)-2-octenal, heptanal, and octanal. The first to sixth compounds were responsible for discrimination between MCG 5-years and MCG 10-,15-years, and 2-oxo-pentanoic acid was responsible for the differentiation between MCG 10-years and MCG 5-,15-years. The contents of the sixth to twelfth chemo-markers in MCG 15-years were significantly higher. Of the 12 chemo-markers, except for benzaldehyde, the contents of the other 11 compounds increased with the growth years, consistent with the market assessment that the higher the growth-years, the greater the value. Therefore, our data suggest that the 12 volatile components might be used as unique chemo-markers to distinguish among MCG 5-,10-,15-years.
Finally, the compositional variation with growth years was explored in GCG. A total of six volatile components were selected as potential growth-year-dependent chemo-markers for discrimination among GCG 5-,10-,15-years, including hexanal, 2-oxo-pentanoic acid, heptanal, benzaldehyde, octanal, and 2-isopropyl-3-methoxypyrazine. Interestingly, the contents of all six chemo-markers were the highest in GCG 5-years. GCG 5-years is the main commodity in the market, suggesting that it has greater value than GCG of other growth years, which is consistent with our results. Therefore, our data suggest that the six volatile components might be used as unique chemo-markers for discrimination among GCG 5-,10-,15-years.

Chemicals and Reagents
Lazi Mountain is an extension branch of Changbai Mountain, with dense forest, distinct seasons, and abundant precipitation. Methanol (Mass grade) and 2-heptanone were purchased from Sigma-Aldrich (Steinheim, Germany) and NaCl from Solarbio (Beijing, China). The ultra-pure water was prepared with a Milli-Q water purification system (Millipore, Bedford, MA, USA).

Sample Preparation and HS-SPME-GC-MS Analysis
Ginseng samples (n = 6) were cut into 0.2-0.3 cm slices and dried at 37 °C for 16 h before the slices were crushed and screened through a 40-mesh sieve. A total of 100 mg of ginseng powder was transferred into a 4-mL headspace vial containing 400 µL of 20% NaCl solution, which was used to disrupt the enzymatic activity of the ginseng samples [22]. 2-Heptanone, as the internal standard, was also added to the vial at a final concentration of 0.125 ng/µL. For the volatile component analysis, a 100-µm fused silica fiber coated with DVB/PDMS/CAR was used and preheated at 40 °C for 5 min before being exposed to the headspace at 40 °C for 25 min. HS-SPME-GC-MS analysis was performed with a 7890A GC (Agilent, Palo Alto, CA, USA), equipped with a 5977B single quadrupole mass detector (Agilent, Palo Alto, CA, USA). The chromatographic separation was performed on a DB5-MS mass spectrometry column (30 m × 0.25 mm × 0.25 µm, Agilent, Palo Alto, CA, USA). The instrumental method was a modified version of Li and Gou's work [23,24]. The injector (splitless mode) temperature was set at 270 °C. The oven temperature was initially set at 40 °C, held for 5 min, and increased to 150 °C at a rate of 5 °C/min; the temperature was then ramped up to 260 °C at a rate of 15 °C/min and held for 7 min. The temperature of the quadrupole mass analyzer was set at 150 °C. The EI ion source was used for MS data acquisition at a temperature of 230 °C, and the full-scan acquisition range was from 50 to 600 amu. Helium was used as the carrier gas at a flow rate of 1 mL/min (constant flow).
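The oven program above fixes the length of each chromatographic run, which can be checked with a short calculation. The sketch below is only an illustration; the function name `oven_program_minutes` and the tuple layout are hypothetical, and the temperatures, rates, and hold times are the ones stated in the method.

```python
def oven_program_minutes(initial_temp_c, initial_hold_min, ramps):
    """Total run time of a GC oven program.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples,
    applied in order starting from initial_temp_c.
    """
    total = initial_hold_min
    temp = initial_temp_c
    for rate, target, hold in ramps:
        total += (target - temp) / rate + hold  # ramp time plus hold time
        temp = target
    return total

# 40 °C held 5 min; 5 °C/min to 150 °C; 15 °C/min to 260 °C, held 7 min.
run_time = oven_program_minutes(40, 5, [(5, 150, 0), (15, 260, 7)])
print(f"Run time: {run_time:.1f} min")
```

With the stated settings the four segments are 5 min, 22 min, about 7.3 min, and 7 min, so each injection occupies the oven for roughly 41 min before cool-down.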

Identification and Semi-Quantification of Volatile Compounds
The mass spectra were used for qualitative identification of compounds by matching with the NIST mass-spectral library (NIST 11.0, National Institute of Standards and Technology, Gaithersburg, MD, USA) and the Wiley library search data system. Data analysis was performed with MassHunter qualitative (B.07.00) workstation software. Volatile compounds were semi-quantitatively analyzed using 2-heptanone as the I.S. [25,26]:

$$\text{concentration}\ (\mathrm{ng/g}) = \frac{\text{extracted ion peak area}}{\text{extracted ion peak area of I.S.}} \times \text{I.S.}\ (10\ \mathrm{ng/g})$$
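The semi-quantitative calculation (analyte peak area relative to the I.S. peak area, multiplied by the I.S. amount) can be sketched as follows. This is a minimal illustration, not the authors' processing script: the function name and the peak areas are hypothetical, and the default I.S. value of 10 ng/g is taken from the equation as it survives in the text.

```python
def semi_quant_ng_per_g(analyte_area, is_area, is_conc_ng_per_g=10.0):
    """Relative content of an analyte from extracted-ion peak areas.

    Assumes the analyte and the internal standard respond equally,
    which is why the result is semi-quantitative only.
    """
    if is_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return analyte_area / is_area * is_conc_ng_per_g

# Hypothetical peak areas for one compound and for 2-heptanone in the same run.
content = semi_quant_ng_per_g(analyte_area=2.5e6, is_area=1.0e6)
print(f"{content:.1f} ng/g")
```

Because no response factor is applied, values obtained this way are best used to compare the same compound across samples rather than as absolute concentrations.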

Statistical Analysis
SIMCA-P analysis software (version 13.0, Umetrics, Malmo, Sweden) was used for multivariate statistical analysis, including PCA and OPLS-DA. During the analysis, PCA was first used to detect clustering formation and to get the overview and classification, and OPLS-DA was then performed, aiming to determine the maximum separation between the two groups. S-plots were available to provide visualization of the OPLS-DA predictive component loading to facilitate model interpretation. VIP was used to help screen the different components. The components were screened with VIP > 1, and then a Student's t-test was performed to confirm the significant difference with SPSS (SPSS 22.0; Chicago, IL, USA).
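The two-step screening described above (VIP > 1, then a Student's t-test) can be sketched in plain Python. This is a hedged illustration rather than the SIMCA-P/SPSS workflow: the VIP values and contents are invented, the compound names are placeholders, and instead of computing a p-value the sketch compares |t| with 2.228, the two-sided 5% critical value of Student's t for 10 degrees of freedom, which corresponds to p < 0.05 only for two groups of six samples.

```python
import math
from statistics import mean, variance

# Two-sided 5% critical value of Student's t for 10 degrees of freedom (n = 6 + 6).
T_CRIT_DF10 = 2.228

def student_t(a, b):
    """Pooled-variance (Student's) t statistic for two independent groups."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

def screen_markers(vip, group_a, group_b, t_crit=T_CRIT_DF10):
    """Keep variables with VIP > 1 whose group means differ significantly."""
    selected = []
    for name, score in vip.items():
        if score <= 1.0:
            continue  # fails the VIP filter
        if abs(student_t(group_a[name], group_b[name])) > t_crit:
            selected.append(name)
    return selected

# Hypothetical VIP scores, as they might be exported from an OPLS-DA model.
vip = {"compound_A": 1.8, "compound_B": 1.3, "compound_C": 0.6}
# Hypothetical contents (ng/g), six samples per group.
group_a = {
    "compound_A": [10.1, 9.8, 10.4, 10.0, 9.9, 10.2],
    "compound_B": [5.0, 6.0, 4.0, 5.5, 4.5, 5.2],
    "compound_C": [3.0, 3.1, 2.9, 3.2, 3.0, 3.1],
}
group_b = {
    "compound_A": [7.9, 8.2, 8.0, 8.1, 7.8, 8.3],
    "compound_B": [5.1, 4.8, 5.9, 4.2, 5.4, 5.0],
    "compound_C": [1.0, 1.1, 0.9, 1.2, 1.0, 1.1],
}

markers = screen_markers(vip, group_a, group_b)
print(markers)
```

Here `compound_B` passes the VIP filter but not the t-test, and `compound_C` differs strongly between groups but is dropped for its low VIP, which shows why the two criteria are applied together.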

Conclusions
In summary, we characterized, for the first time, 46 volatile components in MCG 5-15-years and GCG 5-15-years; five of them were screened out to distinguish MCG 5-15-years from GCG 5-15-years, 12 of them to discriminate among MCG 5-,10-,15-years, and six of them to differentiate among GCG 5-,10-,15-years. Thus, our data suggest that 15 volatile components might be used as unique chemo-markers for discrimination among MCG and GCG with different growth years. The proposed approach could be applied to directly distinguish MCG and GCG with different growth years and to identify the differentiating chemo-markers, which is an important criterion for evaluating the effectiveness, safety, and quality stability of ginseng. In addition, the research has digitized the traditional identification method of "nose smell" by HS-SPME-GC-MS, which is of great significance for the inheritance and innovation of traditional identification methods.
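The total of 15 follows from the overlap among the three marker lists (5, 12, and 6 compounds), which a short set calculation makes explicit. The compound names are those reported in the Results; the variable names are only illustrative.

```python
# Cultivation-dependent markers (MCG vs. GCG).
cultivation = {
    "hexanal", "β-gurjunene", "2-oxo-pentanoic acid", "heptanal", "octanal",
}
# Growth-year-dependent markers in MCG.
mcg_growth = {
    "(−)-α-neoclovene", "β-panasinsene", "γ-gurjunene", "β-neoclovene",
    "β-chamigrene", "heptanal", "2-oxo-pentanoic acid", "benzaldehyde",
    "octanal", "(E)-2-octenal", "(E)-2-heptenal", "3-octen-2-one",
}
# Growth-year-dependent markers in GCG.
gcg_growth = {
    "hexanal", "2-oxo-pentanoic acid", "heptanal", "benzaldehyde",
    "octanal", "2-isopropyl-3-methoxypyrazine",
}

all_markers = cultivation | mcg_growth | gcg_growth
shared_by_all = cultivation & mcg_growth & gcg_growth

print(len(all_markers))       # distinct markers across the three comparisons
print(sorted(shared_by_all))  # markers that appear in every list
```

Three compounds (heptanal, octanal, and 2-oxo-pentanoic acid) appear in all three lists, so on their own they cannot indicate which comparison a difference belongs to.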