Discrimination and Characterization of the Volatile Organic Compounds in Schizonepetae Spica from Six Regions of China Using HS-GC-IMS and HS-SPME-GC-MS

Volatile organic compounds (VOCs) are the main chemical components of Schizonepetae Spica (SS) and are important for its quality evaluation. In this study, HS-SPME-GC-MS (headspace solid-phase microextraction-gas chromatography-mass spectrometry) and HS-GC-IMS (headspace-gas chromatography-ion mobility spectrometry) were used to characterize the VOCs of SS from six different regions. A total of 82 VOCs were identified. In addition, this work compared the suitability of the two instruments for distinguishing SS from different habitats. Regional classification using orthogonal partial least squares discriminant analysis (OPLS-DA) showed that the HS-GC-IMS method classified the samples better than HS-SPME-GC-MS. This study provides a reference method for the identification of SS from different origins.


Introduction
Schizonepetae Spica (the dried spike of Schizonepeta tenuifolia Briq.) is a traditional Chinese medicine (TCM). It is distributed in Jiangsu, Zhejiang, Hebei, and Henan provinces in China. Clinically, it is widely used to treat colds, respiratory diseases, and skin diseases. Chemical studies have shown that SS contains volatile organic compounds (VOCs), flavonoids, and organic acids, among which VOCs are the main medicinal components, with anti-inflammatory, antineoplastic, and antiviral properties [1][2][3][4][5][6][7]. Studies on SS have mainly focused on volatile oil extraction and bioactive ingredients, but the overall characterization of the VOCs of SS from different sources is not comprehensive enough [8].
At present, the most commonly used technologies for the comprehensive characterization of VOCs include gas chromatography-mass spectrometry (GC-MS), two-dimensional gas chromatography (GC×GC), gas chromatography-olfactometry-mass spectrometry (GC-O-MS), and electronic nose (E-nose) [9]. GC-MS is widely used in the analysis of VOCs because it combines the high-efficiency separation of gas chromatography with the high resolution of mass spectrometry, but it has the drawbacks of cumbersome sample pretreatment and higher cost [10]. E-nose is a new aroma detection technology that has the advantage of rapid detection, but it also has some problems, such as low precision.
The VOCs of the SS from six different regions were analyzed by HS-GC-IMS. HS-GC-IMS relies on the fast ion-molecule reactions between analytes and the air cluster ions generated by beta ionization to identify VOCs [17]. We selected a batch of samples from the six regions to produce a 2D top-view plot, as shown in Figure 1A. The background of the entire spectrum is blue, the abscissa represents the drift time, the ordinate represents the retention time, and the Ko is 2.032-2.034 cm²/Vs [18]. Different hues indicate different concentrations, with white dots indicating a lower concentration and red dots indicating a higher concentration, so the deeper the hue, the higher the concentration [19]. The VOCs in SS from different regions were well separated at a retention time of 100-1000 s and a drift time of 1.0-2.0 (Figure 1A). Based on the NIST library, the VOCs of SS were identified by combining retention indexes (RI), retention times, drift times, and Ko [20]. The results are shown in Table 1, in which forty compounds were tentatively identified.
The forty VOCs include five terpenoids (Figure 1B). To compare the differences among samples from different regions more clearly, the chemical components in the samples were classified and presented as a fingerprint in Figure 1B. The main VOCs in SS are terpenoids, as shown in Figure 1B-a. In each region, the substances with the highest concentrations are 1-menthol and d-camphor. Other terpenoids have relatively high concentrations in A and B. In general, A contains a high concentration of terpenoids. The alcohols in the SS samples are summarized in Figure 1B-b. 2-Hexanol has the highest content in the samples, whereas 3-heptanol and 3-furanmethanol differ significantly among samples, with higher contents in J, A, and S. The 1-octen-3-ol content is generally low in Z. Interestingly, as can be seen from Figure 1B, there are also differences among batches within the same region, which may be attributable to different storage conditions, for example, different storage times and temperatures [21,22]. As shown in Figure 1B-c, the samples varied significantly in their proportions of aldehydes. Trans-2-pentenal was higher in J; (E)-hept-2-enal was higher in N; benzaldehyde was higher in J, A, and S. As shown in Figure 1B-d, there is no significant difference among samples in the proportion of esters, although there were relatively high levels of ethyl butanoate in J. The contents of ketones are summarized in Figure 1B-e. Acetophenone and cyclohexanone were present in high concentrations in all samples, but 6-methyl-5-hepten-2-one had a relatively high concentration in B and nonan-2-one was higher in Z. In addition, there was a relatively high amount of p-cresol in A.
However, the limitations of the currently available library for HS-GC-IMS hinder the qualitative analysis of the VOCs of SS. As shown in Figure 1C, twenty-nine compounds were not identified. Among these unidentified compounds, Figure 1C-a summarizes those with high contents in the samples. As shown in Figure 1C-b, the samples varied significantly in the unidentified compounds.

Identification of VOCs by HS-SPME-GC-MS
In this study, forty-two VOCs were identified by HS-SPME-GC-MS, including nineteen terpenes, four alcohols, four esters, eleven ketones, one phenol, and three others, as summarized in Table 2. The results showed that l-menthone, (+)-pulegone, piperitone, menthofuran, verbenone, (+)-isomenthone, trans-carveyl acetate, and caryophyllene oxide were the main volatile compounds in all of the analyzed SS samples. Five substances were verified with reference standards, as shown in Figure 2B, which was consistent with the database identification results. Surprisingly, cubebene was found only in A and Z. Moreover, menthofuran was found only in B and Z.
MetaboAnalyst 5.0 was used for heat map clustering analysis to better understand the variations among SS samples from different regions. Using the relative contents of the 42 identified compounds as variables, each variable was normalized by row scaling to generate the clustering heat map of SS from different regions. The distribution of each substance across the samples is shown by row comparison. As shown in Figure 2C, the main VOCs in SS are terpenoids. (+)-Pulegone is a key active component of SS essential oil and has been determined to have anti-inflammatory properties [23]. Compared with samples from other regions, those originating from A exhibit higher levels of pulegone. Moreover, the contents of other terpenoids (8-13) in A are significantly higher than those detected in other samples. As shown in Figure 2C, according to the Euclidean distance, samples from the six regions could be divided into three categories: B and N; Z, G, and S; and A. A has the highest content of terpenoids. Unlike the other four groups of samples, the samples from B and A are rich in ketones and esters. The VOC contents are low in J, Z, and S.
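The row scaling and Euclidean distance used for the heat map can be illustrated with a small sketch. The matrix below is made up (it is not the study data), the function names are hypothetical, and the code only shows the two operations named in the text; the exact MetaboAnalyst settings are not stated in the paper.

```python
import numpy as np

def row_autoscale(X):
    """Scale each row (compound) to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def euclidean_distances(S):
    """Pairwise Euclidean distances between the columns (samples) of S."""
    diff = S[:, :, None] - S[:, None, :]
    return np.sqrt((diff ** 2).sum(axis=0))

# Made-up relative contents: 3 compounds (rows) x 4 samples (columns).
toy = np.array([
    [10.0, 11.0, 30.0, 31.0],
    [5.0, 5.5, 1.0, 1.2],
    [2.0, 2.1, 8.0, 7.9],
])
scaled = row_autoscale(toy)
dist = euclidean_distances(scaled)  # small entries mean similar VOC profiles
```

Samples whose scaled profiles are close (small entries in `dist`) are the ones a hierarchical clustering would merge first.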

Comparison of the Recognition Abilities of HS-GC-IMS and HS-SPME-GC-MS for VOCs in SS in Different Regions
To further compare the ability of the two instruments to distinguish SS from different regions, the two sets of data were analyzed. In this study, OPLS-DA was used to eliminate the interference of differences among batches of the same Chinese medicinal material with the VOC characteristics. OPLS-DA is usually applied to discriminate two or more groups and to model multiple classes simultaneously [24]. Its characteristic feature is an integrated orthogonal signal correction filter, which separates the systematic variation into predictive (Y-related) and orthogonal (Y-uncorrelated) components. Therefore, OPLS-DA can reduce system noise and extract variable information, giving stronger classification capability. He et al. used OPLS-DA to compare the applicability and predictive ability of E-nose and HS-SPME-GC-MS for the regional classification of Baijiu samples [25].
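The separation into predictive and orthogonal parts can be sketched for the simplest case: a single response and one orthogonal component. This is a minimal illustration of the general OPLS idea on random data, assuming two hypothetical classes coded as +1/-1; it is not the multi-class model or the software implementation used in the study, and all names are illustrative.

```python
import numpy as np

def opls_one_orthogonal(X, y):
    """Remove one Y-orthogonal component from column-centered X (single response)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = X.T @ y
    w /= np.linalg.norm(w)        # predictive weight, proportional to X'y
    t = X @ w                     # predictive scores
    p = X.T @ t / (t @ t)         # predictive loadings
    w_o = p - (w @ p) * w         # part of the loading that is orthogonal to w
    w_o /= np.linalg.norm(w_o)
    t_o = X @ w_o                 # orthogonal scores, uncorrelated with y
    p_o = X.T @ t_o / (t_o @ t_o)
    X_filtered = X - np.outer(t_o, p_o)
    return X_filtered, t_o

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
X -= X.mean(axis=0)                      # column-center the toy data
y = np.array([1.0] * 6 + [-1.0] * 6)     # hypothetical two-class coding, already centered
X_filtered, t_o = opls_one_orthogonal(X, y)
```

Because the orthogonal weight is constructed to be orthogonal to `X.T @ y`, the removed scores `t_o` carry no covariance with `y`, which is the sense in which the filter discards Y-uncorrelated variation.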
In our study, preprocessing was applied to the experimental data to achieve more accurate classification results. First, the peak areas collected from HS-SPME-GC-MS and HS-GC-IMS were log10-transformed, thus narrowing the range of the data [26]. The transformed variables were then mean-centered and Pareto-scaled. Pareto scaling divides each column by the square root of its standard deviation, so that each variable acquires a variance equal to its initial standard deviation [27]. Variables with more than 50% missing values were excluded before the model was established, and the datasets of the different regional samples were then introduced into the OPLS-DA model to perform regional classification.
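The preprocessing steps described above can be sketched as follows. The peak areas are invented for illustration, the function name is hypothetical, and the sketch assumes strictly positive peak areas with no missing values.

```python
import numpy as np

def log_center_pareto(peak_areas):
    """log10-transform, mean-center and Pareto-scale each column (variable)."""
    X = np.log10(np.asarray(peak_areas, dtype=float))
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    return centered / np.sqrt(sd)   # Pareto scaling: divide by sqrt(standard deviation)

# Made-up peak areas: 4 samples (rows) x 2 compounds (columns).
areas = np.array([
    [1200.0, 35.0],
    [15000.0, 42.0],
    [900.0, 28.0],
    [40000.0, 55.0],
])
scaled = log_center_pareto(areas)
```

After this scaling, the variance of each column equals the standard deviation it had after the log transform, so large-variance variables are damped without being forced to unit variance.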
The OPLS-DA models were built on the results of HS-SPME-GC-MS and HS-GC-IMS; the total sample size in each model was 54 (six regions × three batches × triplicate). Seventy-four VOCs were used in the OPLS-DA model of HS-GC-IMS, while a total of 49 VOCs were used in the OPLS-DA model of HS-SPME-GC-MS. Sevenfold cross-validation was used to validate the OPLS-DA models. A component was considered significant if the fraction of the Y variation predicted by it exceeded 0.01 [25]. For both the HS-GC-IMS and HS-SPME-GC-MS data, five predictive components and eight orthogonal components were selected for the OPLS-DA model. As shown in Figure 3A,C, the goodness of fit of the model based on HS-GC-IMS was 96.9% (R2Y(cum) = 0.969), while, for HS-SPME-GC-MS, it was 93.5% (R2Y(cum) = 0.935). In terms of predictive ability, the model built from HS-GC-IMS reached 95.5% (Q2(cum) = 0.955), whereas that of HS-SPME-GC-MS reached only 85.9% (Q2(cum) = 0.859). Both models had Q2 > 0.5, indicating good predictive capability, but the higher values of R2 and Q2 indicated that the model based on HS-GC-IMS had better fitness and predictive ability than that based on HS-SPME-GC-MS. To verify the robustness of the two OPLS-DA models, 200 permutation tests were performed on each. As shown in Figure 3B,D, the intercepts of the randomly permuted models were significantly lower than the values of the original models, indicating the robustness of the developed models [28].
As can be observed in the score plot of the OPLS-DA model of HS-GC-IMS, the SS from different regions show strong clustering without any overlap. However, in the score plot of the OPLS-DA model of HS-SPME-GC-MS, SS samples of the six geographical origins were not clearly discriminated. Samples from regions A, B, and C were not separated, and samples from regions N and B also overlapped. This is similar to the result of the heat map clustering in Figure 2C. This result suggests that the medium-molecular-weight VOCs of SS differ little among geographic origins, whereas the small-molecule compounds identified by HS-GC-IMS discriminate the origins better. Therefore, we further analyzed the compounds identified by HS-GC-IMS.
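For reference, the fit and prediction statistics quoted above are conventionally defined as shown below. This is the textbook form for a single response y, given here as an aid to reading the values; the cumulative statistics reported by the software are accumulated over components, and the exact computation is not described in the text.

```latex
R^2Y = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_i\right)^2}{\sum_{i}\left(y_i - \bar{y}\right)^2},
\qquad
Q^2 = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_{(i)}\right)^2}{\sum_{i}\left(y_i - \bar{y}\right)^2}
```

Here \hat{y}_i is the fitted value from the model built on all samples, and \hat{y}_{(i)} is the prediction for sample i from a model fitted without the cross-validation fold that contains it. Since the cross-validated prediction errors are normally larger than the fitting residuals, Q2 is normally lower than R2Y, as is the case for both models here.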

Rapid Identification of SS in Different Regions by HS-GC-IMS
In general, owing to the influence of the growing environment, the composition and content of volatile components differ among SS of different origins. Therefore, it is particularly important to distinguish SS from different regions. HS-GC-IMS takes less time to obtain analysis results, and the data do not require complex processing. Differences between samples can be compared directly through the fingerprint generated by the instrument. In this study, HS-GC-IMS was used for rapid identification of the volatile components of SS. The OPLS-DA model established on the HS-GC-IMS results had a high predictive ability for SS from different habitats, and the samples were well separated.
However, not all VOCs differed significantly among the samples. To identify the differences, we analyzed the predictive variable importance for the projection (VIP) of the VOCs in SS (Figure 4). VIP is generally used to evaluate the contributions of X-variables to a model [29]. Based on the criterion of VIP > 1, 30 important variables (red) were selected from the VIP plot of the OPLS-DA model of HS-GC-IMS, but these variables include unknown compounds, as shown in Figure 4A. To further explore useful biomarkers, a random forest (RF) model was established using the known compounds. RF is an effective supervised method for high-dimensional data analysis and a popular ensemble learning algorithm for classification and prediction [30]. The number of classification trees was set to 1000 in this study. During the construction of each tree, about one-third of the samples are left out of the bootstrap sample (out-of-bag data, OOB data). For an unbiased estimation of the classification error (OOB error), the OOB data were used as test samples [31]. In principle, the lower the OOB error, the more accurate the classifier. In this experiment, after several trees, the cumulative OOB error rate dropped to 0.0185. As shown in Figure 4B, 14 variables with significant differences contributed to the classification of SS from different regions. In A, p-cresol was more abundant. Trans-2-hexenal and 1-menthol were higher in B. J had greater amounts of ethyl acetate, 2-ethyl-5-methylpyrazine, and trans-2-pentenal. 1-Phenylethanol was more abundant in N. Diethyl trisulfide, 1-octen-3-ol, and 2-hexanol were higher in S. The levels of 3-methylbutanoic acid and alpha-phellandrene were higher in Z. These findings are consistent with the analysis results in Figure 1B. The compounds with significant differences improve the accuracy of the random forest classification and may be key factors in distinguishing SS of different origins based on HS-GC-IMS.
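The "about one-third" out-of-bag fraction follows from bootstrap sampling itself and can be checked with a short simulation. The sketch below assumes the random forest used all 54 samples stated for the OPLS-DA models; that is an assumption, since the text does not give the RF sample count. It also shows that 1/54 rounds to 0.0185, so under that assumption the reported OOB error would correspond to a single misclassified sample.

```python
import numpy as np

def oob_fraction(n_samples, n_trees, seed=0):
    """Average fraction of samples left out of each bootstrap resample."""
    rng = np.random.default_rng(seed)
    fractions = []
    for _ in range(n_trees):
        drawn = rng.integers(0, n_samples, size=n_samples)  # draw with replacement
        fractions.append(1.0 - np.unique(drawn).size / n_samples)
    return float(np.mean(fractions))

n_samples = 54   # sample count stated for the OPLS-DA models; assumed for the RF too
n_trees = 1000   # number of trees stated in the text
simulated = oob_fraction(n_samples, n_trees)
expected = (1.0 - 1.0 / n_samples) ** n_samples  # chance a given sample is never drawn
one_error_rate = 1.0 / n_samples                 # error rate if exactly one sample is wrong
```

The closed-form value `expected` approaches 1/e (about 0.37) as the sample size grows, which is the origin of the rule of thumb quoted in the text.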
Molecules 2022, 27

Discussion
SS is one of the most important drugs for relieving exterior syndrome in TCM. The volatile oil distilled from SS showed potent anti-inflammatory and fumigant activity [32]. Steam distillation, solvent extraction, and headspace capture are commonly used to extract volatile components from plants [33]. However, the first two methods require a large sample size, long extraction time, and long heating time, and some components may be destroyed in the heating process. HS-SPME combined with GC-MS has the advantages of short extraction time, simple operation, and high sensitivity. At the same time, based on the NIST database search, this technology can detect trace volatile components in complex sample systems [34]. HS-GC-IMS is a powerful technique for the separation and sensitive detection of VOCs, with the advantages of high sensitivity and resolution [35]. In the present study, the VOCs of SS were investigated by HS-GC-IMS and HS-SPME-GC-MS. As can be seen from Tables 1 and 2, most volatile organics detected by HS-GC-IMS are small-molecule compounds, whereas the detection range of HS-SPME-GC-MS mainly covers medium-molecular-weight VOCs. This suggests that HS-GC-IMS has a higher sensitivity to highly volatile chemicals than HS-SPME-GC-MS. The ability of SPME to capture high-boiling-point compounds leads to the higher content of such compounds among the VOCs detected by HS-SPME-GC-MS, which is consistent with previous reports [36]. In addition, SPME is affected by temperature, and the distribution coefficient decreases as the temperature increases. As a result, volatile compounds with low distribution coefficients cannot be detected. The combination of HS-GC-IMS and HS-SPME-GC-MS can better realize the rapid identification and comprehensive characterization of VOCs in SS.
Using the established HS-GC-IMS technique, 40 VOCs were identified in SS, while 42 VOCs were identified by the established HS-SPME-GC-MS method. The volatile compounds of SS may be affected by the cultivation region, resulting in differences in the volatile composition of each SS sample. In this study, HS-SPME-GC-MS and HS-GC-IMS were used to analyze the VOCs in 18 samples collected from different provinces in China. Overall, terpenes were the main volatile components in the SS. The results of both instruments indicated that there were more terpenoids in the samples from Anhui province. In addition, the two instruments revealed differences in the content of volatile components of SS from different places, which may be related to factors such as climate conditions, soil conditions, sunshine intensity, cultivation conditions, and transportation conditions.
Modern studies show that there are obvious differences in the types and quantities of chemical components in TCM from different geographical sources [37]. Therefore, it is very important to establish a fast and reliable regional identification method for TCM. For this purpose, two different analysis strategies, HS-SPME-GC-MS and HS-GC-IMS, were selected, tested, and compared. In this study, the origin of SS samples from six regions was analyzed. A uniform statistical technique was used to analyze the data collected on the two analytical systems. Regional classification performed using OPLS-DA indicated that the method based on HS-GC-IMS could classify the SS from six regions better than HS-SPME-GC-MS. This may be because environmental differences among regions have a greater impact on the small-molecule metabolites of plants. The characterization and classification of VOCs from Chinese herbal medicines by HS-GC-IMS coupled with an appropriate multivariate analysis has the potential to be used as a non-destructive way to evaluate Chinese herbal medicines from different origins. This offers a new approach to distinguishing Chinese herbal medicines from different regions. A number of studies have indicated that HS-GC-IMS has the capacity to confirm geographical and botanical origin [35]. For example, HS-GC-IMS was successfully used for reliable classification of the geographical origins of both extra virgin olive oil (EVOO) [38] and wine [39]. Furthermore, the HS-GC-IMS method was applied to quickly identify Ophiopogonis Radix from different regions [40]. Moreover, HS-GC-IMS has also been used to determine the geographical origins of "Chenpi" [41]. The compounds identified by HS-GC-IMS were subjected to multivariate analysis. A total of 14 volatiles that could explain the separation of SS samples into six regions were identified based on VIP scores and RF.
The present observations showed that HS-GC-IMS and HS-SPME-GC-MS together can better characterize the volatile components in TCM. The application of HS-GC-IMS can better distinguish between different sources of SS and improve the effectiveness and efficiency of the process. HS-GC-IMS has good potential for identifying TCM from different regions. However, the HS-GC-IMS database is incomplete, so the technique is still limited in the comprehensive characterization and accurate quantitative analysis of samples. Therefore, a large number of experiments are still needed to enrich the database for the application of HS-GC-IMS in distinguishing TCM from different origins.

Sample Source and Preparation
The SS samples were collected from different geographical localities, including Hebei, Henan, Jiangsu, Zhejiang, Shandong, and Anhui. Three batches were purchased from each region, and there was no difference among batches. Detailed information on all the samples is listed in Supplementary Table S1. All samples in the experiments were authenticated by Professor Lijuan Zhang from Tianjin University of Traditional Chinese Medicine. Voucher specimens were deposited in the School of Pharmacy, Tianjin University of Traditional Chinese Medicine, China.
All SS samples were crushed with a grinder (Tai site, Tianjin, China) and sieved through a 40-mesh sieve. For subsequent examination, the powdered sample was immediately packed in a plastic bag and stored in a dark, dry environment at 20 °C.

Chemicals and Reagents
A C8-C20 n-alkane standard for HS-SPME-GC-MS was purchased from Sigma-Aldrich Trading Co., Ltd.

HS-GC-IMS Analysis Conditions
The HS-GC-IMS system (Flavourspec®, G.A.S, Dortmund, Germany) was equipped with an autosampler (CTC Analytics AG, Zwingen, Switzerland) and an FS-SE-54-CB-1 capillary column (15 m × 0.53 mm ID, 1 µm, CS-Chromatographie Service GmbH, Germany). A 0.2 g sample of SS was accurately weighed into a 20 mL headspace vial and incubated at 75 °C for 20 min at 500 r/min. The volume of the extracted headspace gas was 100 µL, and the syringe temperature was 45 °C. The column temperature was 60 °C. The carrier gas was nitrogen (99.99% purity); its flow rate was first set at 2 mL/min for 2 min, then increased to 10 mL/min within 15 min, to 100 mL/min over 25 min, and to 120 mL/min over 30 min. The pre-separated compounds were driven into an ionization chamber and ionized by a tritium (3H) ionization source with 300 MBq activity in positive ion mode. The resulting ions were driven into a drift tube (9.8 cm in length), which was operated at a constant temperature (45 °C) and voltage (5 kV). The flow rate of the drift gas (nitrogen) was set at 150 mL/min. The retention index (RI) of each compound was calculated using n-ketones C4-C9 as external references. VOCs were identified based on the IMS database of the HS-GC-IMS Library Search application software. The mobility Ko was also used in the identification of compounds. It is a normalized expression of the ion mobility (K) at standard temperature and pressure, and its calculation formula is given in reference [42].
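As a hedged illustration of how a normalized mobility is obtained, the sketch below uses the commonly quoted definition of the reduced mobility, in which K = L²/(V·t_d) is scaled to 1013.25 hPa and 273.15 K. The formula actually used by the authors is the one in reference [42] and may differ in detail. The drift tube length, voltage, and temperature are taken from the text; the drift time and ambient pressure are hypothetical values, and the function name is illustrative.

```python
P0_HPA = 1013.25   # standard pressure, hPa
T0_K = 273.15      # standard temperature, K

def reduced_mobility(drift_time_s, length_cm, voltage_v, temp_k, pressure_hpa):
    """Reduced ion mobility in cm^2/(V*s), normalized to P0_HPA and T0_K."""
    k = length_cm ** 2 / (voltage_v * drift_time_s)   # K = L^2 / (V * t_d)
    return k * (pressure_hpa / P0_HPA) * (T0_K / temp_k)

k0 = reduced_mobility(
    drift_time_s=8.0e-3,     # hypothetical drift time, not a measured value
    length_cm=9.8,           # drift tube length from the text
    voltage_v=5000.0,        # drift voltage from the text
    temp_k=45.0 + 273.15,    # drift tube temperature from the text
    pressure_hpa=1013.25,    # assumed ambient pressure
)
```

Normalizing in this way makes mobilities comparable across instruments and days, because the dependence on drift gas temperature and pressure is factored out.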

HS-SPME-GC-MS Analysis
Extraction of Volatile Compounds
An SPME fiber (Supelco, Inc., Bellefonte, PA, USA) was installed on a MultiPurpose sampler (Gerstel, GER) and combined with a 7890B-7000D triple quadrupole gas chromatography-mass spectrometry system (Agilent Technologies, Palo Alto, CA, USA) to detect VOCs in SS samples. The univariate method was used to select the SPME conditions for each factor individually. Crushed samples (0.1 g) were placed into a 20 mL headspace vial with a magnetic screw cap and a Teflon-lined rubber septum. Then, the sample vial was equilibrated for 5 min at a set temperature (50 °C, 60 °C, 70 °C, or 80 °C) on a heating platform. The extraction was conducted by inserting the SPME fiber (polyacrylate 85 µm, polydimethylsiloxane/divinylbenzene 65 µm phase thickness (PDMS/DVB), or polydimethylsiloxane/carbon wide range/divinylbenzene 50/30 µm phase thickness (PDMS/CAR/DVB)) into the headspace of the vial for a set time (15 min, 20 min, 25 min, or 30 min). At the end of the extraction, the fiber was desorbed in the injection port of the GC for 5 min. Under the same conditions of sample size, extraction temperature, and extraction time, the optimal SPME fiber was determined first. Then, the best SPME fiber was used to optimize the other factors in the same way, with peak capacity as the criterion. The resulting optimal extraction parameters were as follows: a PDMS/CAR/DVB fiber was used to extract 0.1 g of sample at 60 °C for 25 min.

GC-MS Analysis
An Agilent 7890B gas chromatography system with an HP-5MS elastic quartz capillary column (30 m × 0.25 mm × 0.25 µm, 19091S-433, J&W Scientific, Folsom, CA, USA) and an Agilent 7000D mass spectrometry detector was used for the GC-MS analysis. Helium was used as the carrier gas, with a flow rate of 1 mL/min. The column temperature program was as follows: hold at 50 °C for 2 min; increase to 70 °C at 10 °C/min; increase to 110 °C at 5 °C/min; increase to 115 °C at 1 °C/min; increase to 147.5 °C at 5 °C/min; increase to 160 °C at 3 °C/min; increase to 220 °C at 5 °C/min and hold for 5 min. The mass spectrometry conditions were as follows: quadrupole temperature of 150 °C, ion source temperature of 230 °C, injection temperature of 280 °C, scanning range of 50-600 m/z, and ionization voltage of 70 eV.
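The total run time implied by this oven program can be worked out step by step. The sketch below encodes the ramps exactly as listed and assumes no hold at the intermediate temperatures, since none is stated; the function name is illustrative.

```python
def oven_program_minutes(start_temp, initial_hold, ramps):
    """Total run time of a GC oven program.

    ramps: list of (rate in deg C/min, target temperature in deg C, hold in min).
    """
    total = initial_hold
    temp = start_temp
    for rate, target, hold in ramps:
        total += (target - temp) / rate + hold   # ramp duration plus any hold
        temp = target
    return total

ramps = [
    (10.0, 70.0, 0.0),
    (5.0, 110.0, 0.0),
    (1.0, 115.0, 0.0),
    (5.0, 147.5, 0.0),
    (3.0, 160.0, 0.0),
    (5.0, 220.0, 5.0),   # final ramp followed by the 5 min hold
]
run_time = oven_program_minutes(50.0, 2.0, ramps)
```

Under these assumptions the program lasts about 44.7 min per injection (2 + 2 + 8 + 5 + 6.5 + 4.17 + 12 + 5 min).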
In this study, unknown volatile compounds were qualitatively analyzed by three methods, including searching for spectral peaks in the NIST17 standard mass spectrometry library of the chemistry workstation, calculating retention index (RI) values, and comparing the retention times of substances in the sample with external standards. The peak area normalization method was used to calculate the relative percentage content of each compound in SS samples from different regions.
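The retention index and peak area normalization steps can be sketched as below. The sketch assumes the linear (temperature-programmed) retention index, which interpolates between the two n-alkanes that bracket the analyte; the text does not state which RI formula was used. The alkane retention times and peak areas are hypothetical, and the function names are illustrative.

```python
from bisect import bisect_right

def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at rt.

    alkane_rts: dict mapping carbon number -> n-alkane retention time (min).
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    i = bisect_right(times, rt) - 1          # index of the alkane eluting just before rt
    if i < 0 or i >= len(carbons) - 1:
        raise ValueError("retention time outside the n-alkane window")
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))

def area_percent(areas):
    """Relative percentage content of each compound by peak area normalization."""
    total = sum(areas.values())
    return {name: 100.0 * area / total for name, area in areas.items()}

alkanes = {8: 5.0, 9: 7.5, 10: 10.5, 11: 13.8}        # hypothetical retention times (min)
ri = linear_retention_index(9.0, alkanes)             # peak between C9 and C10
percent = area_percent({"compound_a": 30.0, "compound_b": 70.0})  # made-up areas
```

A peak eluting halfway between C9 and C10 receives an index of 950, which can then be compared with library values for the same stationary phase.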

Statistical Analysis
The SIMCA-P 14.1 software (Umetrics, Umeå, Sweden) was used to run the OPLS-DA. The HS-GC-IMS analysis software includes a Laboratory Analytical Viewer (LAV), three plugins, and an HS-GC-IMS library search, which allow samples to be analyzed from different angles. The specific volatile compounds were identified with the HS-GC-IMS library search software. Then, the gallery plot plugin in the LAV software was used to obtain the visual gallery plot of the samples. The HS-SPME-GC-MS results were expressed as the mean ± standard deviation. The heat map and random forest analyses were performed with the online platform MetaboAnalyst 5.0.

Conclusions
In summary, this study explored the 82 VOCs of SS from six regions by using HS-SPME-GC-MS and HS-GC-IMS. Among them, terpenoids accounted for a large proportion of the VOCs in SS. Both instrumental analyses indicated that the terpenoid content is higher in SS from Anhui province. In addition, some aldehydes and phenols were found by HS-GC-IMS. The combination of the two analytical methods realized the rapid identification and comprehensive characterization of VOCs in SS. The fingerprint of HS-GC-IMS has the advantage of data visualization, which allows VOCs in samples from different regions to be compared directly and conveniently. The discriminant analysis of SS from six regions was carried out by OPLS-DA. Only some of the SS from different regions could be separated by HS-SPME-GC-MS, whereas HS-GC-IMS could effectively distinguish all six kinds of SS. By random forest analysis, 14 compounds were identified that were beneficial to the classification of SS from different areas. Although HS-GC-IMS has been commonly used to identify food from various sources, it has been less commonly used in traditional Chinese medicine. HS-GC-IMS coupled with an appropriate multivariate analysis has potential in the characterization and classification of TCM containing VOCs. However, owing to the complex composition of TCM, the application of HS-GC-IMS in the identification of TCM from different origins still needs to be studied with a larger sample size.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules27144393/s1, Table S1: Detailed information of all the samples. Table S2: Peak area of standard product. Table S3: Peak area of sample determined by HS-SPME-GC-MS. Table S4: Peak area of sample determined by HS-GC-IMS.