Geoherbalism Metabolomic Analysis of Atractylodes lancea (Thunb.) DC. by LC-Triple TOF-MS/MS and GC-MS

The rhizome of Atractylodes lancea (Thunb.) DC. (AL), called Maocangzhu in Chinese, is a geoherbalism medicinal herb of Jiangsu Province that is often used in traditional Chinese medicine (TCM) prescriptions, such as those for the treatment of COVID-19. The landform and climatic environment of each province vary greatly from south to north, which has an important influence on the chemical constituents of AL. However, there is a lack of research on the significance of its geoherbalism, especially for the water-soluble parts other than the volatile oil. In this study, eight known compounds were isolated from AL and used as reference substances. In addition, liquid chromatography coupled with triple-quadrupole time-of-flight tandem mass spectrometry (LC-triple TOF-MS/MS) and gas chromatography–mass spectrometry (GC-MS) were used to analyze and characterize the chemical constituents of AL from different habitats. Moreover, orthogonal partial least-squares discriminant analysis (OPLS-DA) was applied to reveal the differential metabolites in AL from different habitats based on the qualitative information of the chemical constituents. A total of 33 constituents from GC-MS and 106 constituents from LC-triple TOF-MS/MS were identified or inferred, including terpenoids, polyacetylenes, and others; meanwhile, the fragmentation pathways of different types of compounds were preliminarily deduced from the fragmentation behavior of the major constituents. According to the variable importance in projection (VIP) and p-values, only one volatile differential metabolite was identified by GC-MS screening: β-eudesmol. Five differential metabolites were identified by LC-triple TOF-MS/MS screening: sucrose; 4(15),11-eudesmadiene; atractylenolide I; 3,5,11-tridecatriene-7,9-diyne-1,2-diacetate; and (3Z,5E,11E)-tridecatriene-7,9-diynyl-1-O-(E)-ferulate. This study provides metabolomic information for the establishment of a comprehensive quality evaluation system for AL.


Introduction
Atractylodes lancea (Thunb.) DC. (AL), which is known as Maocangzhu in traditional Chinese medicine, has been reported to be able to remove dampness, invigorate the spleen, and to dispel pathogenic wind and cold effects [1]. The plant is mainly distributed in the provinces of Jiangsu, Hubei, Anhui, and Henan in China. The Maoshan area in Jiangsu is the geoherbalism medicinal herb area of AL. From the external appearance, the cut section of AL has the characteristics of vermilion oil spots and precipitates white frost after being placed for a period of time; however, less of this frosting feature can be observed in the cut section in AL from the Maoshan area. Hence, distinguishing the differences in the chemical constituents of AL is also extremely necessary and important.
Previously, the focus of research was mainly on the content or ratio of some of the main volatile components. Some reports showed that a specific proportion relationship of its main volatile components existed in AL from the Maoshan area [2], and the content of atractylone was relatively high. However, some others showed that the content of atractylone in AL was low [3]. Moreover, there are also inconsistent reports on the content of atractylodin in the geoherbalism area of AL [4,5], which was used as a quality control indicator in the Chinese Pharmacopoeia. In addition to these main volatile metabolites, AL also contains a large number of other sesquiterpenes, polyacetylenes, and other chemical metabolites. If only the contents of several main chemical metabolites were used as indicators, it would be difficult to reflect the overall effect of traditional Chinese medicine (TCM). The conclusions drawn from the quality evaluation and geoherbalism research of medicinal materials have also been relatively one-sided. However, there has been a lack of in-depth analysis of the differential chemical metabolites among the overall metabolites. Therefore, it is of great significance to clarify the characterization of chemical metabolites of AL for better quality control of medicinal materials.
In recent years, LC-MS and GC-MS techniques have become the most widely used analytical methods for the direct identification of multiple metabolites in TCM [15,16]. Thus, in this experiment, LC-Triple TOF-MS/MS and GC-MS were used to qualitatively analyze AL samples. A total of 106 overall metabolites and 33 volatile metabolites were identified. Based on the above qualitative results, five differential overall metabolites and one volatile metabolite were identified based on VIP and p-values. Therefore, our study could explain the geoherbalism of AL from both the overall and volatile aspects, which could be beneficial for the breeding and cultivation of geoherbalism resources of AL in the future.

Identification of Metabolites by GC-MS
A total of 33 variables from matrixes containing retention time, mass-to-charge ratio (m/z), and peak intensity were detected in the total ion current in a 42 min measurement period. The metabolites were identified as listed in Table 1 and include monoterpenes, sesquiterpenes, polyacetylenes, and other compounds. The other signals were very weak or not even included in the database. A characteristic GC-MS total ion current chromatogram of AL from the Maoshan area is shown in Figure 1.
OPLS-DA, a discriminant multidimensional statistical analysis method, was then utilized to locate the significantly differential metabolites among the AL samples in the GC-MS dataset. This model separated S1 from each of the other samples along the discriminating component t[1] (Figure 2). The values of R2Y and Q2Y indicated that this OPLS-DA model could explain the differences between sample groups. The results of permutation tests conducted 200 times gave the Y-intercepts of R2 and Q2 (Table 2). According to these results, the models are valid and reliable. The VIP scatter plot of OPLS-DA identified the underlying biomarkers of each part, where p < 0.05 indicated significant differences. Differential chemical constituents (VIP > 1) of AL samples from different habitats were selected by screening, and the numbers of characteristic peaks are shown in Table 2.
The results indicated that β-eudesmol was the most contributive principle distinguishing the S1 sample from the other samples of AL. The average value and standard deviation of the peak area of the differential metabolite were calculated to obtain the relative content changes among the different samples (Figure 3). As shown in the figure, the content of β-eudesmol in the S1 sample was low.
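The relative-content comparison described above reduces to a per-sample mean and standard deviation of replicate peak areas. The following is a minimal sketch of that summary step; the peak areas are made-up placeholder numbers (not the study's data), and the sample labels other than S1 and the function name `summarize` are assumptions for illustration only.

```python
from statistics import mean, stdev

# Hypothetical peak areas (arbitrary units), three replicates per sample.
# These are placeholder values, NOT measurements from the study.
peak_areas = {
    "S1": [1.2e6, 1.1e6, 1.3e6],
    "S2": [3.4e6, 3.1e6, 3.6e6],
    "S3": [2.9e6, 3.0e6, 2.7e6],
}


def summarize(areas):
    """Return {sample: (mean, sample standard deviation)} of the peak areas."""
    return {name: (mean(vals), stdev(vals)) for name, vals in areas.items()}


summary = summarize(peak_areas)
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.3e} ± {sd:.3e}")
```

`stdev` is the sample (n − 1) standard deviation, which is the usual choice for a small number of replicates; whether the original figure used the sample or population form is not stated.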

Identification of the Metabolites in AL
The base-peak chromatogram (BPC) of the QC sample in the positive-ion mode is shown in Figure 4. A total of 106 metabolites were identified, including 32 sesquiterpenes, 22 polyacetylenes, and 52 other metabolites. Due to the fact that more volatile aglycone metabolites existed in AL and that the content of water-soluble glycosides was low, the response values of glycosides were relatively low in the figure. Detailed information on the identified metabolites are shown in Tables 3 and 4. Molecules 2023, 28, x FOR PEER REVIEW 1

Identification of Sesquiterpenes
Sesquiterpenes are the main active ingredients of AL. A total of 32 sesquiterpenes were identified in this study, including sesquiterpene aglycones and sesquiterpene glycosides (Table 3). Fragmentation patterns of sesquiterpenes are shown in Figure 5a,b.
Sesquiterpene aglycone: According to the fragmentation behavior of the reference standards, sesquiterpene aglycones preferentially lost H2O (18 Da), CH2 (14 Da), CH4 (16 Da), and C3H8O side-chain groups (60 Da). In addition, the fragment ions of sesquiterpene lactones showed that lactones generally lost neutral fragments such as CO (28 Da), CO2 (44 Da), and HCOOH (46 Da). In the MS1 spectrum, the molecular formula of the screened target ions was predicted by using the Formula Finder function of the PeakView software (version 1.2). An ion predicted to have a C15 molecular formula was initially retained as a potential sesquiterpene aglycone. Furthermore, if there were methoxy group and acetic acid substitutions in the structure, the loss of CH3OH (32 Da) and CH3 … In addition, the left or right ring might be lost in some compounds such as 9 and 18. Finally, compounds 1-19 were identified as sesquiterpene aglycones.
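The nominal masses quoted for the neutral losses follow directly from their elemental formulas. The sketch below recomputes them and looks up which listed loss matches an observed precursor-to-fragment mass difference; the helper names `nominal_mass` and `annotate` are illustrative assumptions, and HCOOH is written as CH2O2 so that a simple formula parser can read it.

```python
import re

# Integer (nominal) masses of the most abundant isotope of each element.
NOMINAL = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula):
    """Sum nominal atomic masses for a simple formula such as 'C3H8O'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL[element] * (int(count) if count else 1)
    return total


# Neutral losses named in the text (CH2O2 is formic acid, HCOOH).
NEUTRAL_LOSSES = ["H2O", "CH2", "CH4", "C3H8O", "CO", "CO2", "CH2O2"]
losses = {formula: nominal_mass(formula) for formula in NEUTRAL_LOSSES}


def annotate(delta):
    """Return the listed neutral losses whose nominal mass matches delta (Da)."""
    return [formula for formula, mass in losses.items() if mass == round(delta)]


print(losses)
print(annotate(18.01))
```

Matching on nominal mass alone cannot separate isobaric losses (for example, other 60 Da fragments), so in practice the accurate-mass difference would be compared as well.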
Sesquiterpene glycoside: Compounds 20-32 were identified as sesquiterpene glycosides. These sesquiterpene glycosides shared some fragmentation rules in MS2, such as the loss of apiofuranosyl (150 Da), xylopyranoside (150 Da), and glucopyranoside (180 Da), while the lost glycosyl fragments might undergo protonation as well. After the loss of the glycosyl fragments, the remaining parts were mainly sesquiterpene aglycones, which followed fragmentation rules similar to those above. In the MS1 spectrum, the molecular formula of the screened target ions was predicted by using the Formula Finder function of the PeakView software (version 1.2). An ion predicted to have a C20 (one pentose), C21 (one hexose), C25 (two pentoses, less common), C26 (one pentose and one hexose), or C27 (two hexoses) molecular formula was initially retained as a potential sesquiterpene glycoside. According to the above fragmentation rules and literature reports, compounds 20-32 were identified as sesquiterpene glycosides.
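The carbon-count screen described above can be written as a small lookup: a C15 aglycone plus five carbons per pentose and six per hexose. This is only a sketch of that first-pass filter; the function names are assumptions, and the glycoside formula in the usage lines is a hypothetical example rather than a compound from Table 3.

```python
import re

# Carbon counts named in the text: C15 aglycone plus the carbons of attached sugars.
CARBON_RULE = {
    15: "aglycone (no sugar)",
    20: "one pentose",
    21: "one hexose",
    25: "two pentoses",
    26: "one pentose and one hexose",
    27: "two hexoses",
}


def carbon_count(formula):
    """Number of carbon atoms in a formula string; 0 if carbon is absent."""
    match = re.search(r"C(?![a-z])(\d*)", formula)  # skip Cl, Ca, Cu, ...
    if match is None:
        return 0
    return int(match.group(1)) if match.group(1) else 1


def classify(formula):
    """First-pass label for a predicted formula, based on carbon count only."""
    return CARBON_RULE.get(carbon_count(formula), "not a candidate by this rule")


print(classify("C15H18O2"))   # atractylenolide I formula
print(classify("C21H36O7"))   # hypothetical hexoside formula, for illustration
```

A carbon count is only a coarse pre-filter; as in the text, candidates would still need to be confirmed by their MS2 sugar losses and by literature comparison.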

Identification of Polyacetylenes
Polyacetylenes are also main active ingredients of AL. A total of 22 polyacetylenes were identified in this study, including polyacetylene aglycones and polyacetylene glycosides (Table 3). Fragmentation patterns of polyacetylenes are shown in Figure 5c.
Polyacetylene aglycone: According to the fragmentation behavior of the reference standards, polyacetylene aglycones preferentially lost H2O (18 Da), CH2 (14 Da), the furan ring (68 Da), and C2H2 (26 Da). Because of the basic C=C and C≡C structures of polyacetylenes, unsaturated carbon-chain groups were often lost during fragmentation, which helped to distinguish them from other types of compounds. According to existing literature reports, acetylated derivatives are common among polyacetylenes. The MS2 spectrum mainly showed the loss of CH3COOH (60 Da) and CH3COONa (82 Da), resulting in [M + Na-CH3 …
Polyacetylene glycoside: Compounds 44-54 were identified as polyacetylene glycosides. These polyacetylene glycosides shared some fragmentation rules in MS2, such as the loss of rhamnose (146 Da), apiofuranosyl (150 Da), and glucopyranoside (180 Da), while the lost glycosyl fragments might undergo protonation as well. After the loss of the glycosyl fragments, the remaining parts were mainly polyacetylene aglycones, which followed fragmentation rules similar to those above. Taking compound 52 as an example, the quasimolecular ion peak at m/z 535.1783 [M + Na]+ was first formed. The fragment ions at m/z 335.0952 and 203.0588 represented [glucopyranoside-apiofuranosyl + Na]+ (335 Da) and [glucopyranoside + Na]+ (203 Da), respectively.
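The two sugar fragment ions assigned for compound 52 can be checked against theoretical sodium-adduct masses. The sketch below assumes the free-sugar formulas C6H12O6 (hexose) and C5H10O5 (pentose), one molecule of water lost per glycosidic bond, and standard monoisotopic atomic masses; the function names are illustrative.

```python
# Monoisotopic atomic masses (Da) and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858
WATER = 2 * MONO["H"] + MONO["O"]


def mono_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo molecule."""
    return c * MONO["C"] + h * MONO["H"] + o * MONO["O"]


def sodium_adduct(neutral_mass):
    """m/z of a singly charged [M + Na]+ ion (one electron removed)."""
    return neutral_mass + MONO["Na"] - ELECTRON


glucose = mono_mass(6, 12, 6)            # assumed hexose formula C6H12O6
apiose = mono_mass(5, 10, 5)             # assumed pentose formula C5H10O5
disaccharide = glucose + apiose - WATER  # one glycosidic bond releases one H2O

print(f"[glucose-apiose + Na]+  {sodium_adduct(disaccharide):.4f}")
print(f"[glucose + Na]+         {sodium_adduct(glucose):.4f}")
```

Under these assumptions the disaccharide adduct falls within about 0.001 of the reported m/z 335.0952, and the hexose adduct within about 0.01 of the reported m/z 203.0588.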

Identification of Other Compounds
Compounds 55-106 were identified by comparison with reference standards, the precursor ions of a self-built compound library, the reported literature, and the MS2 fragment ions (matching ratio of nearly 100%), as well as by the MS-DIAL software (version 4.…); see Table 4.

OPLS-DA
In this experiment, the samples from other habitats were compared with the sample from the Maoshan area by OPLS-DA. As can be seen from Figure 6a, the samples from each group were clearly separated along the PC1 axis. The models were tested with 200 permutations, and the results are listed in Table 5. The results showed that the models did not overfit, indicating that they were effective and reliable. According to the VIP score chart (Figure 6b) and the t-tests corresponding to the model, differential chemical constituents (VIP > 1) were selected by screening, and the numbers of characteristic peaks are shown in Table 5.
The average values of the common differential constituents in the different samples were calculated to obtain the content changes (Figure 7). As shown in the figure, the contents of sucrose and 4(15),11-eudesmadiene were lower in S1 than in the other samples, while the contents of atractylenolide I, 3,5,11-tridecatriene-7,9-diyne-1,2-diacetate, and (3Z,5E,11E)-tridecatriene-7,9-diynyl-1-O-(E)-ferulate were higher in S1 than in the others.

Discussion
In the past, the focus of geoherbalism research on AL was mainly on the content or ratio of some of the main volatile components; however, the conclusions were inconsistent. The problem mainly focuses on whether the content of atractylone in the Maoshan area is truly higher than that in other habitats. Moreover, in the current Chinese Pharmacopoeia, the quality control indicator of AL is the content of atractylodin, which is unable to fully distinguish whether it is from a geoherbalism area. Not only that, evaluating the quality of Chinese medicinal herbs with only a few main components is not appropriate for determining the overall quality of the TCM. At present, there are few studies on the geoherbalism research on AL, especially for water-soluble metabolites. Meanwhile, the natural resources in the Maoshan area are seriously scarce, suggesting that the breeding and cultivation of excellent geoherbalism varieties are urgent. Therefore, it is of great importance to study the geoherbalism differential metabolites of AL.
In this study, LC-triple TOF-MS/MS and GC-MS methods were used to comprehensively analyze the different metabolites of AL. Using these methods, 33 volatile chemical metabolites (GC-MS) and 106 overall chemical metabolites (LC-triple TOF-MS/MS) in AL were identified, which included sesquiterpenes, polyacetylenes, and others. The results showed that some metabolites of AL in the Maoshan area were significantly different from those in other areas, suggesting that they might be related to the geoherbalism and highly varied quality of AL. In addition, we also found that several overall differential metabolites with higher content in the Maoshan area were also higher in the Yingshan and Tongbai areas (Figure 7), which are also famous AL production areas, indirectly proving the advantage of metabolomics in comprehensively evaluating the quality of medicinal materials from different habitats.
Although the volatile components have been studied and reported, we have provided a new method that can effectively separate multiple components in the volatile oil, especially hinesol and β-eudesmol. The method is also suitable for evaluating other varieties and counterfeits of Atractylodis Rhizome, which has certain innovative significance for comprehensively evaluating the quality of Cangzhu. Not only that, we found that the content of atractylone in the Maoshan area was indeed higher than that in other habitats. However, it was not a differential metabolite, indicating that other metabolites may be more meaningful for distinguishing the geoherbalism of AL, which indirectly indicated the necessity of using metabolomics in this experiment.
The Chinese Pharmacopoeia shows that AL has certain dampness-eliminating and spleen-strengthening effects. According to literature reports, its dampness-eliminating effect is mainly related to the content of the volatile oil [40]. AL from the Maoshan area has also been considered to have a stronger spleen-strengthening effect than that from other production areas [41]. In this study, β-eudesmol played a strong contributory role in volatile oil, indicating a strong correlation with the dampness-eliminating effects of AL. However, in terms of overall detection, metabolites such as atractylenolide I contributed significantly, indicating a significant correlation with the spleen-strengthening effects of AL. This also proves the advantages of different metabolomic methods for evaluating the geoherbalism efficacy of AL.
Generally speaking, the accumulation of active ingredients in AL varies greatly according to different ecological environments. Therefore, it is necessary to analyze the relationship between different ecological environments and metabolites. The results provide data that reveal the influence of the ecological environment on the synthesis and accumulation of metabolites in AL as well as the quality formation mechanism of geoherbalism characteristics.

Plant Materials
The air-dried rhizomes of AL used for separation and extraction were collected from Huoshan, Anhui Province, China, and the air-dried rhizomes of AL used for metabolomic analyses were collected from different habitats of AL. All were identified as Atractylodes lancea (Thunb.) DC. by Prof. Gu Wei. Voucher specimens were deposited in the Laboratory of Chinese Medicine Identification, Nanjing University of Chinese Medicine. The source information of the AL samples is shown in Table 6. All sample batches were randomly resampled into three mixed-batch groups in order to reduce intra-group errors. All of them were then crushed and immediately extracted.

Sample Preparation
The experimental procedure for the extract preparation for GC-MS is as follows: Samples were accurately weighed (0.5 g for each sample) and transferred to 50 mL triangular flasks; then, 10 mL cyclohexane was added into each of them, followed by ultrasonic extraction for 20 min. After extraction, the samples were cooled down, weighed, and the weight of each sample was replenished. Finally, the supernatant was taken and centrifuged at 13,000 rpm for 10 min prior to GC-MS analysis.

GC-MS Methods
The Agilent 7890A/5975C GC-MS instrument (Agilent, J&W Scientific, Folsom, CA, USA) was used for analysis. Gas chromatographic conditions were as follows: Agilent 19091N-133 column (30 m × 250 µm, 0.25 µm; Agilent J&W Scientific). Split sampling was performed with an injection volume of 2 µL and a split ratio of 40:1. The injection, ion source, and interface temperatures were set at 280 °C, 250 °C, and 150 °C, respectively. The oven temperature program was as follows: 100 °C for 3 min, then increased by 10 °C/min to 160 °C; increased by 3 °C/min to 175 °C and held for 2 min; increased by 5 °C/min to 200 °C and held for 1 min; and finally increased by 2 °C/min to 210 °C and held for 15 min. The total run time was 42 min. The carrier gas was helium (1.0 mL/min). Mass spectrometry conditions: electron ionization (EI) source, electron energy 70 eV; mass data collected in full-scan mode (m/z 35-780).
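The stated 42 min run time can be checked by adding up the ramp and hold times of the oven program. The following sketch encodes the program as (target temperature, ramp rate, hold time) steps; the tuple layout and the function name `total_run_time` are choices made for this illustration.

```python
# Each step: (target temperature in °C, ramp rate in °C/min or None, hold in min).
# The first step has no ramp because the oven starts at 100 °C.
OVEN_PROGRAM = [
    (100, None, 3),
    (160, 10, 0),
    (175, 3, 2),
    (200, 5, 1),
    (210, 2, 15),
]


def total_run_time(program):
    """Total run time in minutes: ramp durations plus hold times."""
    minutes = 0.0
    current = program[0][0]
    for target, rate, hold in program:
        if rate is not None:
            minutes += (target - current) / rate  # time spent ramping
        minutes += hold
        current = target
    return minutes


print(total_run_time(OVEN_PROGRAM))
```

The steps sum to 3 + 6 + 5 + 2 + 5 + 1 + 5 + 15 = 42 min, in agreement with the measurement period given in the text.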

Data Preprocessing and Analysis
The raw GC-MS data from the Agilent MSD ChemStation were automatically detected and aligned using the NIST database. On the basis of the above qualitative results, OPLS-DA was used to perform dimensionality reduction analysis on the data to obtain information about differences between groups. The characteristic peaks of the differential chemical components were screened according to the VIP (VIP > 1) and t-test (p < 0.05) results obtained from the OPLS-DA.
Four reference substances, atractylenolides I-III and atractylone, were purchased from Shanghai Yuanye Biotechnology Co., Ltd. (Shanghai, China). The other eight reference substances were self-made in this experiment. The above 12 standards were weighed into a 5 mL volumetric flask using a 1/1,000,000 electronic analytical balance (ME36S, Sartorius, Göttingen, Germany), dissolved in methanol, and prepared into a mixed standard solution. All solutions were stored at 4 °C for further analysis.
The samples were crushed to 40 mesh using a universal grinder, and the powder was then dried to constant weight. Next, 0.5 g of dried powder was accurately weighed into a 50 mL triangular flask and ultrasonically extracted with 10 mL of methanol for 60 min at room temperature. A few minutes after extraction, the weight was made up with methanol. The supernatant was taken and centrifuged at 13,000 rpm for 10 min before filtering through a 0.22 µm membrane (Jinteng Laboratory Equipment Co., Ltd., Tianjin, China) prior to LC-triple TOF-MS/MS analysis.

LC-Triple TOF-MS/MS Conditions
An LC system (Shimadzu, Kyoto, Japan) was used for sample analysis. MS-DIAL 4.92 software was used to perform peak alignment and noise filtering on the raw mass spectrometry data. The results were then imported into the SIMCA-P 14.1 software (Umetrics AB, Umeå, Sweden), where the variables were mean-centered and analyzed by OPLS-DA. Lastly, differential metabolites were selected using the t-test (p < 0.05) and variable importance in projection values (VIP > 1). Figures were drawn with the Origin 2019b software.
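The final screening step combines two thresholds: a feature is kept only if its VIP exceeds 1 and its t-test p-value is below 0.05. A minimal sketch of that filter is given below, assuming the VIP and p-values have already been exported from the modelling software; the `Feature` type, the function name, and all numbers are hypothetical placeholders rather than results from this study.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    vip: float       # variable importance in projection from the OPLS-DA model
    p_value: float   # p-value from the corresponding t-test


def select_differential(features, vip_cutoff=1.0, alpha=0.05):
    """Keep features that pass BOTH criteria: VIP > cutoff and p < alpha."""
    return [f.name for f in features if f.vip > vip_cutoff and f.p_value < alpha]


# Hypothetical placeholder values, NOT results from the study.
features = [
    Feature("metabolite_A", vip=2.3, p_value=0.003),
    Feature("metabolite_B", vip=1.4, p_value=0.210),  # high VIP, not significant
    Feature("metabolite_C", vip=0.6, p_value=0.010),  # significant, low VIP
]

selected = select_differential(features)
print(selected)
```

Requiring both criteria means that a feature that is abundant but not significantly different between groups is excluded, as is one that is significant but contributes little to the model.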

Extraction and Isolation
The liposoluble solution was extracted from the air-dried rhizomes (20 kg) with ethyl acetate by the percolation method at room temperature. The extract was concentrated under reduced pressure to yield a residue (1.5 kg). The extract (1.0 kg) was subjected to silica-gel column chromatography (Petroleum Ether-EtOAc-MeOH, 1:0-0:1-0:1, v/v/v). The fractions were pooled into seven subfractions according to the TLC analysis.
Among the Fr.1 flow group, the same flow was merged through TLC point plates to obtain three flow groups: Fr.1.1, Fr.1.2, and Fr.1.3. A large amount of nearly colorless crystals (brownish-red mother liquor) were produced in Fr.1.2 at room temperature, and the crystals were filtered to obtain about 1.03 g to yield compound 1.
Within … There was no spot under ordinary TLC and UV light, but a red spot was observed on the thin-layer plate (sulfuric acid-ethanol spray, baking). This was identified as compound 6 (about 3.9 mg).
Fr.4.3 was subjected to column chromatography (Si) again, and the same flow components were merged to obtain Fr.4.3.1, Fr.4.3.2, and Fr.4.3.3. A colorless film of flaky solid crystals was separated from Fr.4.3.2. There were also no spots under ordinary TLC and UV light, and no spots on the thin-layer plate (sulfuric acid-ethanol spray, baking). However, on the thin-layer plate (phosphomolybdic acid spray, baking), there were blackish-green spots (the Rf value was small with petroleum ether-ethyl acetate as the developing agent). This was identified as compound 7 (about 8.3 mg).
Within the Fr.6 flow group, the same flow was merged through TLC point plates to obtain three flow groups: Fr.6.1, Fr.6.2, and Fr.6.3. Methanol was added to remove insoluble matter in Fr.6.3, and petroleum ether was added after volatilization. The white suspension in the upper layer was obtained under ultrasound, and the suspension was taken out. After volatilization, needle-shaped crystals were separated from the bottom to yield compound 8 (about 133.8 mg).

Conclusions
In this study, eight compounds were separated and obtained as standard references; among these, the two diterpenes were first isolated and obtained from this genus, which updated the phytochemistry classification difference between the Atractylodes and Atractylis genera in Flora Reipublicae Popularis Sinicae.
LC-triple TOF-MS/MS and GC-MS were used to analyze the metabolites from ten different habitats of AL. In total, 106 overall chemical metabolites and 33 volatile metabolites were identified. The fragmentation pathways of sesquiterpenes and polyacetylenes were preliminarily deduced by the fragmentation behavior of the standard references.
In addition, the content of atractylone in the Maoshan area, which was the previous focus of geoherbalism research on AL, was indeed higher than that in other habitats. However, it was not a differential metabolite, indicating that other components may be more meaningful for distinguishing the geoherbalism of AL, which indirectly indicated the necessity of using metabolomics in this experiment.
In conclusion, these results can help us to better understand the componential differences of chemical metabolites in AL from different habitats and distinguish the geoherbalism chemical characteristics of AL as well as provide data for further exploring the geoherbalism molecular significance and environmental impacts of AL.