Exploration of Habitat-Related Chemical Markers for Stephania tetrandra Applying Multiple Chromatographic and Chemometric Analysis

Geo-authentic herbs are medicinal materials produced in a specific region and recognized for superior quality. Stephania tetrandra S. Moore (S. tetrandra) is cultivated in many provinces of China, including Anhui, Zhejiang, Fujian, Jiangxi, Hunan, Guangxi, Guangdong, Hainan, and Taiwan, among which Jiangxi is the geo-authentic origin. To explore habitat-related chemical markers of this herbal medicine, an integrated chromatographic technique was established that combines gas chromatography-mass spectrometry (GC-MS), ultra-high-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS/MS), and ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS/MS) with chemometric analysis. The established methods showed that the samples were clearly divided into two groups, non-authentic origins and geo-authentic origins, suggesting that the metabolites were closely related to the producing areas. A total of 70 volatile compounds and 50 non-volatile compounds were identified in S. tetrandra. Meanwhile, tetrandrine, fangchinoline, isocorydine, magnocurarine, magnoflorine, boldine, and higenamine were accurately quantified as chemical markers and proved important for distinguishing samples of non-authentic origin from those of geo-authentic origin. The discriminant analysis also showed good predictive performance, with an accuracy of 80%. The results showed that multiple chromatographic techniques combined with chemometric analysis can serve as an effective approach for discovering the chemical markers of herbal medicine and for evaluating the overall chemical consistency among samples from different producing areas.


Introduction
The medicinal material S. tetrandra is the dried root of Stephania tetrandra S. Moore [1], a perennial liana of the genus Stephania in the Menispermaceae family [2]. It was first recorded as a medicine in Shen Nong's Herbal Classic and is used to treat arthralgia associated with rheumatoid arthritis, wet beriberi, eczema, and inflamed sores, acting as a diuretic, analgesic, and anti-inflammatory agent [2][3][4]. The compounds isolated and identified from S. tetrandra to date are mainly alkaloids, which are critical for evaluating its therapeutic effects and quality [5]. However, only two characteristic components, tetrandrine and fangchinoline, are defined as quantitative indexes in the Chinese Pharmacopoeia 2020 edition [1]. Other major components with high content and extensive pharmacological properties, such as magnoflorine, magnocurarine, isocorydine, higenamine, and boldine, have been less studied so far [2]. Therefore, there is an urgent need to develop more potential chemical markers to better evaluate the holistic chemical features of S. tetrandra from different origins.
It is well accepted that medicinal herbs exert their efficacy through the synergistic action of their complex chemical constituents, that is, through "multi-components hitting multi-targets" [6,7]. Integrating multiple methods that simultaneously characterize different kinds of components has therefore been employed as a comprehensive strategy to evaluate holistic quality and thus assure the efficacy of medicinal herbs [8]. Moreover, chemometrics provides a variety of algorithms for exploring and extracting valuable chemical information [9][10][11][12][13][14][15]. Among these, discriminant analysis is an effective tool for accurate prediction according to various characteristic values. The combination of multiple chromatographic techniques and chemometrics would provide a reliable method for screening the chemical markers of herbal medicine.
In this study, an integrated chromatographic technique based on ultra-high-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS/MS) and gas chromatography-mass spectrometry (GC-MS) was proposed to study the geo-herbalism of S. tetrandra. The habitat-related biochemical markers were further investigated using multiple pattern recognition models. Subsequently, seven main components were quantitatively compared using the validated ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS/MS). A discriminant function equation was established to verify the accuracy of the chemical markers and achieve the prediction of the origin of S. tetrandra. The proposed strategy, which is comprehensive and effective, can be used to assist the application of S. tetrandra as well as related herbal medicine.

Method Validation
The relative standard deviations (RSDs) of retention time (Rt) and peak area for precision, repeatability, and stability were less than 1.0% and 7.7%, respectively, using GC-MS and UHPLC-Q-TOF-MS/MS analysis (Supplementary Table S1), indicating that the established methods were sufficiently precise for differential component analysis of S. tetrandra from different origins.
The UHPLC-MS/MS method was verified in terms of linearity, lower limits of quantification (LLOQs), precision, repeatability, stability, and recovery. The results are shown in Supplementary Tables S2-S4, which revealed that the established method was precise enough for the simultaneous quantitative determination of seven compounds.

Identification of Volatile Components by GC-MS Analysis
GC-MS analysis was carried out to analyze the volatile components of S. tetrandra samples, and the relative contents of volatile compounds were calculated by peak area normalization. According to the NIST08 and NIST08s databases, 70 volatile components, including alkanes, fatty acids, esters, carbonyls, alcohols, and phenols, were identified. Among them, esters had the highest content, accounting for 28.44%, of which methyl (9E)-9-octadecenoate, methyl linoleate, and methyl palmitate accounted for 11.48%, 6.50%, and 4.50%, respectively. Alkanes were the most diverse class, with a relative content of 14.36%, of which 2,4-dimethyl-1-heptene was the most abundant, reaching 1.80%. The relative content of fatty acids was 10.26%, among which oleic acid and palmitic acid were the highest, at 4.54% and 2.94%, respectively. The identification results are shown in Table 1, and the corresponding peak of each compound is marked in the total ion chromatograms (TICs) (Supplementary Figure S1).
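Peak area normalization, as used above for the relative contents, expresses each peak as a percentage of the summed peak area. The following is a minimal sketch of that calculation; the function name and the peak areas are hypothetical and are not taken from Table 1.

```python
def relative_contents(peak_areas):
    """Return each peak's share of the total peak area, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated peak areas (arbitrary units), for illustration only.
areas = {
    "methyl palmitate": 450.0,
    "methyl linoleate": 650.0,
    "all other peaks": 8900.0,
}
shares = relative_contents(areas)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

Because every peak is divided by the same total, the shares always sum to 100%, so the values are relative and not absolute concentrations.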

Identification of Nonvolatile Components by UHPLC-Q-TOF-MS/MS Analysis
Chromatographic data collected from UHPLC-Q-TOF-MS/MS were imported into Agilent Masshunter Qualitative Workstation Analysis B.07.00 (Agilent Technologies Inc., Santa Clara, CA, USA) for the identification of nonvolatile components of S. tetrandra. Compounds were identified on the basis of accurate mass, Rt, ion pattern, and MS/MS information. The obtained mass spectra were verified by: (a) matching with the instrument-generated molecular formula; (b) analyzing the structural information of metabolites acquired from the Metlin database (http://metlin.scripps.edu, accessed on 29 June 2022); (c) comparing with the fragment information of the standard samples; (d) combining with the compound information of previous reports. Following these data acquisition and mining strategies, a total of 50 compounds, including 25 alkaloids, six amino acids, four amides, four fatty acids, two phenols, two purine derivatives, one phospholipid, one nucleoside, and five other compounds, were identified in S. tetrandra. The detailed information on the compounds is shown in Table 2, and the TICs of S. tetrandra under positive and negative ion modes are illustrated in Supplementary Figure S2.

Exploration of Habitat-Related Chemical Markers Based on Global Components
Principal component analysis (PCA) is an unsupervised pattern recognition method that is often used to sort unclassified samples into groups. As displayed in Figure 1, samples from different provinces were separated into two groups. The geo-authentic samples were more tightly clustered, indicating homogeneous quality, whereas the non-authentic samples were widely scattered, indicating large differences in quality. Supervised orthogonal partial least squares discriminant analysis (OPLS-DA) was subsequently used to filter out random noise, distinguish differences between groups, and improve the validity and analytical ability of the model. The results in Figure 2A-C showed that the medicinal materials fell into two groups according to non-authentic and geo-authentic origins. The model was validated using 7-fold cross-validation. R² describes the goodness of fit, and the cross-validation parameter Q² represents the predictive ability of the model. The constructed model had good quality, with a cumulative R²Y of 0.997 and Q²Y of 0.491. Permutation tests were conducted 200 times to assess whether the model was overfitted. As shown in Figure 2D-F, the blue regression lines of the Q² points intersected the vertical axis below zero, and the intercepts of all regression lines on the vertical axis were less than 0.5; therefore, the model was reliable and not overfitted.
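The two model statistics quoted above share one formula and differ only in which predictions enter it: the fit statistic uses predictions from the model fitted to all samples, whereas the cross-validated statistic uses predictions for samples held out during fitting. The sketch below illustrates this with the common definition 1 − (residual sum of squares)/(total sum of squares); the class labels and predicted values are hypothetical, and the software used in this study may apply additional scaling.

```python
def explained_fraction(observed, predicted):
    """Return 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    total_ss = sum((y - mean_obs) ** 2 for y in observed)
    resid_ss = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - resid_ss / total_ss


# Hypothetical dummy-coded class labels (0 = non-authentic, 1 = geo-authentic).
y = [0, 0, 0, 1, 1, 1]
# Hypothetical predictions from a model fitted to all six samples.
fitted = [0.02, -0.01, 0.03, 0.98, 1.01, 0.97]
# Hypothetical predictions made for each sample while it was held out.
held_out = [0.3, 0.4, 0.2, 0.6, 0.5, 0.9]

r2 = explained_fraction(y, fitted)    # goodness of fit
q2 = explained_fraction(y, held_out)  # predictive ability
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

A large gap between the two values is the usual reason for running permutation tests, since a model can fit its training data closely while predicting held-out samples less well.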
In addition, based on the above models, the variables with variable importance in the projection (VIP) > 1 were explicitly detected. To further visualize the components with VIP > 1, OPLS-DA was performed to generate S-plots (Figure 3). Finally, 14 differential volatile components (Supplementary Table S5) and 14 nonvolatile markers (Supplementary Table S6) were characterized, all of which played a significant role in the differentiation of different origins of S. tetrandra.
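The VIP > 1 screening can be illustrated with the textbook VIP formula for a PLS-type model, in which each variable's normalized squared weights are averaged over components in proportion to the Y variance each component explains. This is only a sketch under that assumption: the weights, explained sums of squares, and peak names below are hypothetical, and the OPLS-DA software used in this study may compute VIP in a slightly different way.

```python
import math


def vip_scores(weights, ssy):
    """Textbook VIP for a PLS-type model.

    weights[a][j] is the weight of variable j on component a;
    ssy[a] is the Y sum of squares explained by component a.
    """
    n_vars = len(weights[0])
    total_ssy = sum(ssy)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)
            acc += ss_a * w_a[j] ** 2 / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ssy))
    return scores


# Hypothetical two-component model over four hypothetical peaks.
names = ["peak_A", "peak_B", "peak_C", "peak_D"]
weights = [[0.8, 0.5, 0.3, 0.1],
           [0.1, 0.2, 0.6, 0.7]]
ssy = [3.0, 1.0]

vip = vip_scores(weights, ssy)
selected = [name for name, v in zip(names, vip) if v > 1.0]
print(selected)
```

Under this formula the mean of the squared VIP values is exactly 1, which is why 1 is the conventional cutoff: a variable above it contributes more than the average variable.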
The structural formulas and detailed contents of the seven analytes in the 16 samples from different origins are shown in Figure S3 and Supplementary Table S7. The contents of the seven compounds differed among sample extracts from different origins, which indicated that differing growth conditions, such as climate and sunlight, may influence the quality of S. tetrandra. As shown in Figure 4, the total contents of analytes in samples from geo-authentic origins (S11-S16), especially the characteristic components tetrandrine and fangchinoline, were higher. In addition, the contents of tetrandrine, fangchinoline, isocorydine, and higenamine in samples from geo-authentic origins were highly consistent (Figure 5). In contrast, the compound contents in non-authentic origin samples fluctuated widely, and the contents of the S1 and S5 samples were low.

Discriminatory Analysis
Discriminant analysis is a multivariate statistical method that assigns research objects to predefined classes according to their characteristic values. In this study, the domain U = {X1, X2, …, X11} was established, representing 11 randomly selected samples of S. tetrandra, and the contents of seven characteristic peaks (tetrandrine, fangchinoline, magnoflorine, higenamine, magnocurarine, isocorydine, and boldine) were selected as the discriminant factors, forming an 11 × 7 matrix. Stepwise discriminant analysis was then performed with Wilks' lambda method in SPSS 21.0 (IBM, San Diego, CA, USA), and a discriminant function for determining the origin of S. tetrandra was obtained. The regression estimation method was used to evaluate the performance of the discriminant function. Finally, the discriminant function equation of S. tetrandra was obtained as follows (S1: tetrandrine, S2: fangchinoline, S3: magnoflorine, S4: higenamine, S5: magnocurarine, S6: isocorydine, S7: boldine):
The content of each screened characteristic peak is substituted into the function equations for the different origins, and the resulting Y values are compared; a sample is assigned to the origin whose equation gives the largest Y value. To test the functions by the regression estimation method, another five batches of S. tetrandra of known origin were classified and the predicted origins were compared with the actual ones; the correct rate was 80% (Table 3). This showed that the established discriminant function equation was relatively stable, can predict and identify the origin of S. tetrandra, and is valuable for promotion and application.
Table 3. Effect evaluation of discriminant function equations of two groups of S. tetrandra samples.
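The assignment rule of this section (evaluate one linear function per origin and pick the largest Y) and the correct rate on a held-out test set can be sketched as follows. The coefficients, constants, sample contents, and function names are all hypothetical placeholders chosen for illustration; they are not the functions obtained in this study, and the example test set is constructed only to show how a rate of four out of five arises.

```python
# Hypothetical discriminant functions: every coefficient and constant below
# is a placeholder, not a value fitted in this study.
FUNCTIONS = {
    "geo-authentic": {"const": -10.0,
                      "coef": {"tetrandrine": 1.2, "fangchinoline": 0.8}},
    "non-authentic": {"const": -4.0,
                      "coef": {"tetrandrine": 0.6, "fangchinoline": 0.5}},
}


def discriminant_scores(contents, functions=FUNCTIONS):
    """Evaluate Y = const + sum(coef_i * content_i) for every origin."""
    return {
        group: f["const"] + sum(c * contents[name] for name, c in f["coef"].items())
        for group, f in functions.items()
    }


def predict_origin(contents, functions=FUNCTIONS):
    """Assign the sample to the origin whose function gives the largest Y."""
    scores = discriminant_scores(contents, functions)
    return max(scores, key=scores.get)


def correct_rate(samples, functions=FUNCTIONS):
    """Fraction of (contents, true_origin) pairs that are predicted correctly."""
    hits = sum(predict_origin(c, functions) == true for c, true in samples)
    return hits / len(samples)


# Hypothetical test set of five samples with made-up contents (arbitrary units).
TEST_SET = [
    ({"tetrandrine": 10.0, "fangchinoline": 5.0}, "geo-authentic"),
    ({"tetrandrine": 12.0, "fangchinoline": 6.0}, "geo-authentic"),
    ({"tetrandrine": 4.0, "fangchinoline": 3.0}, "non-authentic"),
    ({"tetrandrine": 5.0, "fangchinoline": 2.0}, "non-authentic"),
    ({"tetrandrine": 9.0, "fangchinoline": 5.0}, "non-authentic"),
]
rate = correct_rate(TEST_SET)
print(f"correct rate: {rate:.0%}")
```

With only five test samples, each misclassification changes the correct rate by 20 percentage points, so a rate estimated this way is coarse.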
Sixteen batches of S. tetrandra were collected from seven different provinces (Guangxi, Guangdong, Sichuan, Neimeng, Anhui, Zhejiang, and Jiangxi) in China. The sample information is shown in Supplementary Table S8.

Preparation of Standard and Sample Solutions
Stock solutions of tetrandrine, fangchinoline, magnoflorine, magnocurarine, isocorydine, higenamine, and boldine were prepared in methanol at a concentration of 10 mg/mL and serially diluted to plot the standard curves.
All dried samples were ground, and the powder was passed through a 50-mesh sieve. Powdered samples (1 g) were ultrasonically extracted in 10 mL of n-hexane for 30 min. After cooling, the resulting mixture was centrifuged, and the supernatant was filtered through a 0.22 µm nylon membrane and collected for GC-MS analysis.
Pulverized samples (0.5 g) were sonicated in 20 mL of 70% methanol for 30 min. After centrifugation, the supernatants were filtered through a 0.22 µm nylon membrane to obtain the sample solutions for LC-MS/MS analysis.
Meanwhile, equal volumes (100 µL) of all sample solutions for GC-MS analysis and for LC-MS/MS analysis were pooled separately to give the respective quality control (QC) samples. All sample solutions were stored at 4 °C until analysis.
Mass spectrometry was performed in electron impact (EI) mode and full scan mode over a mass-to-charge ratio (m/z) range of 50-1000. The temperatures of the ion source and interface were 230 °C and 250 °C, respectively.
The mass spectrometer was operated in both positive and negative modes with a scanning range of m/z 50-1500 and a scanning rate of 1 spectrum/s. High resolution (4 GHz, High Res Mode) was used. The optimized instrumental parameters were as follows: capillary temperature, 350 °C; drying gas (N2) flow rate, 8 L/min; nebulizer pressure, 25 psi; collision energy, 30 V; fragment voltage, 135 V.

UHPLC-MS/MS Analysis
Quantitative analysis was performed on an Agilent 1290 UHPLC system (Agilent Technologies Inc., Palo Alto, CA, USA) equipped with an Agilent 6470 triple quadrupole tandem mass spectrometer (Agilent Technologies, Singapore) with an electrospray ionization (ESI) source. A Waters ACQUITY UPLC® BEH C18 column (2.1 × 100 mm, 1.7 µm) was used for chromatographic separation, and the column temperature was maintained at 20 °C. The binary gradient elution system consisted of 0.05% formic acid in water (A) and acetonitrile (B). The gradient profile started from 10% B, increased linearly to 23% B within 2 min, and then increased to 50% B within 3 min.
The mass spectrometer was operated in positive mode. The optimized mass conditions were as follows: gas temperature, 320 °C; gas flow rate, 8 L/min; nebulizer, 35 psi; sheath gas temperature, 250 °C; sheath gas flow, 12 L/min; capillary voltage, 4000 V. Multiple reaction monitoring (MRM) mode was applied for the quantitative analysis of different compounds. An MRM diagram is shown in Supplementary Figure S4. The optimal mass spectral parameters and ion patterns are presented in Table 4.

Method Validation
The GC-MS and UHPLC-Q-TOF-MS/MS methods were validated in terms of precision, repeatability, and stability. Precision was evaluated from the intraday variation of six consecutive injections of the QC sample. Repeatability was assessed by preparing six replicate QC samples. Stability was determined by analyzing one QC sample at 0, 6, 12, 18, and 24 h. Fifteen chromatographic peaks were randomly selected, and the RSDs of their peak areas and Rt were calculated to assess precision, repeatability, and stability.
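The RSD used throughout the validation is the sample standard deviation divided by the mean, expressed as a percentage. A minimal sketch follows; the six replicate peak areas are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which form was used.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation / mean)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined for a mean of zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical peak areas of one chromatographic peak in six QC injections.
replicate_areas = [1520.0, 1498.0, 1510.0, 1533.0, 1489.0, 1505.0]
print(f"RSD = {rsd_percent(replicate_areas):.2f}%")
```

The same function applies to retention times: for each of the selected peaks, the Rt values from the replicate runs are passed in place of the peak areas.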
The UHPLC-MS/MS method was validated by determining linearity, LLOQs, precision, repeatability, stability, and recovery. A calibration curve for each alkaloid standard was constructed using a linear regression model, and linearity was verified using correlation coefficients (r). The LLOQs were estimated as the minimum concentration giving a signal-to-noise ratio (S/N) of 10. Instrument precision was determined by analyzing six replicates. Repeatability was evaluated by performing six replicate analyses on the same QC sample. In the stability test, the QC sample solutions were stored at room temperature and then analyzed by replicate injections at 0, 2, 4, 8, 12, and 24 h. The recoveries of spiked samples were used to examine the effects of the extraction method and the matrix. Blank S. tetrandra samples (0.25 g) and known amounts of mixed standard solution were dissolved in 20 mL of 70% methanol and then processed by the optimized method; these were regarded as the spiked samples. The recovery rate was calculated using the following formula: recovery rate (%) = (observed amount − original amount)/spiked amount × 100%.
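The calibration and recovery calculations described above can be sketched as follows, assuming an unweighted least-squares line (the text does not state whether weighting was applied). The concentrations, responses, and amounts are hypothetical, and the function names are illustrative.

```python
import math


def fit_calibration(conc, response):
    """Unweighted least-squares line; returns (slope, intercept, r)."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    syy = sum((y - mean_y) ** 2 for y in response)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, response))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def recovery_percent(observed, original, spiked):
    """Recovery rate (%) = (observed - original) / spiked * 100."""
    return (observed - original) / spiked * 100.0


# Hypothetical standard concentrations and detector responses.
conc = [1.0, 2.0, 5.0, 10.0, 20.0]
response = [3.0, 5.0, 11.0, 21.0, 41.0]
slope, intercept, r = fit_calibration(conc, response)
print(f"response = {slope:.3f} * conc + {intercept:.3f}, r = {r:.4f}")

# Hypothetical spiked sample: 10.0 units already present, 5.0 units added,
# 14.8 units measured after extraction.
print(f"recovery = {recovery_percent(14.8, 10.0, 5.0):.1f}%")
```

A sample concentration is then read off the fitted line as (response − intercept)/slope, which is only meaningful inside the calibrated concentration range.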

Data Preprocessing
The GC-MS and UHPLC-Q-TOF-MS/MS data of S. tetrandra from different origins were exported in MZ format using the GC-MS Postrun (Shimadzu, Kyoto, Japan) and Agilent Masshunter Qualitative Analysis software packages, respectively. Peak finding, alignment, and filtering of the raw data were performed using R 2.7.2 software (R Foundation for Statistical Computing, Vienna, Austria) to obtain the Rt, m/z, and peak intensity of each compound. Finally, the obtained data were imported into Simca-P 14.1 (Umetrics, Umeå, Sweden) for OPLS-DA. Potential chemical markers to differentiate S. tetrandra from different origins were screened according to the VIP value. R² and Q² values were used to validate the model: R² indicates how well the model explains the original data, and Q² indicates the predictive ability of the model. The discriminant function equation was established using SPSS 21.0 software.

Conclusions
In conclusion, by combining the volatile and nonvolatile components based on multiple chromatographic analyses, the habitat-related chemical markers of S. tetrandra were discovered. Through integrated chemometric analysis, 14 volatile components and 14 nonvolatile components were screened out as the important contributors to the chemical difference between samples of geo-authentic and non-authentic origin. Among these, tetrandrine, fangchinoline, isocorydine, magnocurarine, magnoflorine, boldine, and higenamine, chemical markers with abundant pharmacological activities, were quantitatively analyzed by UHPLC-MS/MS. The results showed that the total content of analytes in samples from geo-authentic origins was higher and more consistent. Finally, discriminant analysis was used to establish function equations representing the origin of S. tetrandra in order to trace the origin of medicinal materials, and it also verified the accuracy of the differential components obtained by OPLS-DA. The proposed method could aid the exploration of habitat-related chemical markers for other herbal medicines.