Nontargeted Metabolomic Analysis of Four Different Parts of Platycodon grandiflorum Grown in Northeast China

Platycodonis radix is extensively used in the clinic for treating cough, excessive phlegm, sore throat, bronchitis and asthma. The stems, leaves and seeds of Platycodon grandiflorum (PG) also have pharmaceutical activities, such as anti-inflammatory and anti-oxidant effects. These effects are likely caused by the different metabolites in the various parts of the herb. In order to profile the different parts of PG, ultra-high performance liquid chromatography combined with quadrupole time-of-flight mass spectrometry (UPLC-QTOF-MSE), coupled with the UNIFI platform and multivariate statistical analyses, was used in this study. For the constituent screening, 73, 42, 35 and 44 compounds were characterized from the root, stem, leaf and seed, respectively. The stem, leaf and seed contain more flavonoids but few saponins, and can therefore be easily discriminated from the root. For the metabolomic analysis, 15, 5, 7 and 11 robust biomarkers enabling the differentiation among root, stem, leaf and seed, respectively, were discovered. These biomarkers can be used for rapid identification of the four different parts of PG grown in northeast China.


Introduction
It is well known that different parts of an herb differ both chemically and pharmacologically. Taking Aristolochia mollissima Hance as an example, the fruits are used to treat cough and asthma, the roots have obvious antihypertensive effects, while the stems and leaves are used as antirheumatic medicines. This phenomenon also exists in other herbs, such as Lycium barbarum, Polygonum multiflorum Thunb., Trichosanthes kirilowii Maxim., Ephedra sinica Stapf, etc. [1].
As both food and medicine, Platycodon grandiflorum (Jacq.) A. DC. (PG) is known as "Jiegeng" in China, "Huridunzhaga" in Mongolia, "Kikyo" in Japan and "Doraji" in North Korea [2]. In clinical practice, the root of PG, which has various biological activities, such as apophlegmatic and antitussive [3], anti-inflammatory [4], immunoregulatory [5] and anti-oxidant [6] activities, has been widely used for the treatment of cough, excessive phlegm, and sore throat. In addition, the stem and leaf of PG also have anti-inflammatory [7] and anti-oxidant [8,9] activities, while research on the pharmacological effects of PG seed is currently non-existent.
PG is a rich source of natural products with various structural patterns. Around 100 compounds have been isolated from the roots of PG, including steroidal saponins, flavonoids, phenolic acids, polyacetylenes, sterols, etc. [2]. Triterpenoid saponins, mainly of the pentacyclic oleanane type, are the active components of the root of PG [10]. Several flavonoids and phenolic acids were isolated from the aerial parts of PG [11]. Two glycosides and four flavonoids were isolated from the seeds of PG [12]. Recently, instead of traditional separation and identification methods, a combination of ultra-high performance liquid chromatography (UHPLC) separation, quadrupole time-of-flight tandem mass spectrometry (QTOF-MS/MS) detection and the automated data processing software UNIFI with a scientific library has been used for screening and identifying chemical components in herbal medicines [13,14] and traditional Chinese medicine formulas [15]. In 2015, Lee et al. reported the global profiling of various metabolites in PG by UPLC-QTOF/MS [16]. In that paper, a total of 20 metabolites were characterized from the roots, and 56 compounds from the stems and leaves, of PG grown in Korea. Herbs collected from different regions show certain differences both in chemical constituents and in pharmacological activities [17]. For example, saponins in the root of PG from different sites in Gyeongnam Province, Korea showed different contents [18]. 1H-NMR-based metabolomics with OPLS-DA statistical models was used to cluster ginseng samples from Korea and China, and the results suggested that the chemical profiles from the two countries are quite different due to their different geographical origins [19].
Hence, in order to illustrate how chemical constituents differ between regions and between parts of the plant, and to better clarify the pharmacologically active substances of PG, the root, stem, leaf and seed of PG produced in Jilin Province, China were taken as samples in this paper.
Metabolomics, including targeted and untargeted complementary approaches, is primarily concerned with the identification and quantitation of small-molecule metabolites (<1500 Da) [20]. Recently, because of its ability to profile diverse classes of metabolites, untargeted metabolomics has been widely used to compare the overall metabolic composition of different samples [21]. An untargeted analysis approach is mainly applied in metabolite identification through mass-based search followed by manual verification [20]. Being a sensitive, efficient, reliable, accurate and nondestructive method, UPLC-QTOF-MS has recently been widely used in this kind of analysis, such as exploring the early detection of mycotoxins in wheat [22], estimating compliance to a dietary pattern [23], exploring the bioavailability of the secoiridoids from a seed/fruit extract in healthy human volunteers [24], and evaluating the enantioselective metabolic perturbations in MCF-7 cells after treatment with R-metalaxyl and S-metalaxyl [25].
In this study we focus on both the rapid screening of chemical components and the non-targeted metabolomic analysis of the root, stem, leaf and seed of PG. UPLC-QTOF-MSE, the UNIFI platform and multivariate statistical analyses, such as principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA), were used to profile the four different plant parts and to find the biomarkers among these four parts of PG grown in northeast China.

Identification of Components from Different Parts of PG
A total of 159 compounds were identified or tentatively characterized in positive and negative modes from the four parts of PG. The base peak intensity (BPI) chromatograms are shown in Figure 1, and the chemical structures are shown in Figure 2. More specifically, 73, 42, 35 and 44 compounds were characterized from the root, stem, leaf and seed, respectively (Table 1), including triterpenoid saponins, organic acids, steroids, phenols, flavonoids, alcohols, amino acids, coumarins, terpenoids, alkaloids, amides, and so on.
Isomeric compounds may be distinguished by their characteristic MS fragmentation patterns reported in the literature, or by comparing their retention times with those of reference standards. Taking compounds 98 and 106 as an example, both show the same protonated ion [M + H]+ at m/z 1413.6530. They were matched to 3″-O-acetylpolygalacin D2 and 2″-O-acetylpolygalacin D2, respectively, and their MS fragmentation patterns were similar. According to the literature, however, the C3-glucoside elutes earlier than the C2-glucoside [26][27][28] in the ESI-BPI chromatogram, so the compound with the earlier RT was identified as the C3-glucoside, 3″-O-acetylpolygalacin D2, and the one with the later RT was identified as the C2-glucoside, 2″-O-acetylpolygalacin D2.

Biomarker Discovery for Differentiating Four Parts of PG
The PCA 2D plots of the samples from the root, stem, leaf and seed groups separated into four clusters according to their common spectral characteristics (Figure 3), which means that the four parts of PG could be easily differentiated. In order to differentiate each part from the other three, OPLS-DA models were built in both positive and negative modes. Then, OPLS-DA score plots, S-plots, variable trends and VIP (variable importance in the projection) values were obtained to understand which variables are responsible for this sample separation [29]. Based on VIP values (VIP > 4) (Figure 4) and p values (p < 0.05) [30] from univariate statistical analysis, 38 robust known biomarkers enabling the differentiation among root, stem, leaf and seed were discovered and marked in the S-plots (Figure 5). In order to systematically evaluate the biomarkers, a heatmap was generated from them (Figure 6), which shows distinct segregation among the four parts.
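The two selection criteria stated above (VIP > 4 and p < 0.05) amount to a simple joint filter over the candidate variables. The following is a minimal sketch of that filter only; the `Feature` type, the function name and all numeric values are hypothetical illustrations, not data or software from this study.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """One candidate variable (hypothetical container, not from the study)."""
    name: str
    vip: float       # VIP value from the OPLS-DA model
    p_value: float   # p value from the univariate test


def select_biomarkers(features: List[Feature],
                      vip_cutoff: float = 4.0,
                      p_cutoff: float = 0.05) -> List[Feature]:
    """Keep features whose VIP exceeds the cutoff and whose p is below the cutoff."""
    return [f for f in features
            if f.vip > vip_cutoff and f.p_value < p_cutoff]


# Hypothetical values, for illustration only.
candidates = [
    Feature("feature_A", 6.2, 0.001),   # passes both criteria
    Feature("feature_B", 4.5, 0.20),    # fails the p value criterion
    Feature("feature_C", 2.1, 0.003),   # fails the VIP criterion
    Feature("feature_D", 4.1, 0.049),   # passes both criteria
]
selected = select_biomarkers(candidates)
print([f.name for f in selected])
```

Requiring both conditions means that a variable with a large model contribution but a non-significant univariate difference (or the reverse) is not retained.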

Discussion
A total of 73, 42, 35 and 44 compounds were characterized from the root, stem, leaf and seed, respectively. Of the identified compounds, 95 were identified in ESI(−) mode and 64 in ESI(+) mode. According to the BPI chromatograms of the four parts of PG, the ESI(−) ionization mode appears to be better than ESI(+) in terms of the number and the responses of the identified compounds, but it is still necessary to run the ESI(+) mode because some compounds showed a better response in it than in ESI(−) mode.
Compared with the results from previous studies [2,8,16,31,32], 56 chemical components were identified for the first time in Campanulaceae. The stem, leaf and seed contain more flavonoids but few saponins, and can therefore be easily discriminated from the root. In a previous study, various metabolites in Korean Platycodon grandiflorum were profiled by UPLC-QTOF/MS [16]. Compared with the root of PG in Korea, there were only nine constituents (compounds 5, 31, 76, 79, 83, 91, 94, 95, 97) in common. Meanwhile, the stems and leaves of PG in Korea and in China are both rich in natural components with various structural patterns, including triterpenoid saponins, flavonoids, organic acids, phenols, alcohols, amino acids, coumarins, etc., but they share only two chemical components (compounds 99, 104). It is also interesting that eleven components (compounds 5, 14, 17, 21, 23, 31, 52, 83, 94, 95, 97) reported in the stems and leaves of PG in Korea were found in the root of PG in China. The reason for this phenomenon may be the different analytical methods and the different growing locations.
Even so, there are still some unresolved issues. Firstly, the pharmaceutical effects associated with these robust biomarkers and identified compounds should be screened in the future. Additionally, as shown in the BPI chromatograms, although 159 compounds were identified, there are still many unidentified components. Further research should be carried out based on the formulae of these unknown compounds [13]. Most importantly, the stems and leaves of PG should be developed and utilized because they contain so many components that differ from those of the root. This comprehensive phytochemical profile study revealed the structural diversity of secondary metabolites and the different patterns in various parts of PG. The method developed in this study can be used as a standard protocol for discriminating and predicting parts of PG directly.

Materials and Reagents
All samples were harvested from Jilin Province, China, as listed in Table 2. Acetonitrile and methanol suitable for UHPLC-MS were purchased from Fisher Chemical Company (Geel, Belgium). Formic acid for UPLC was purchased from Sigma-Aldrich (St. Louis, MO, USA). Deionized water was purified using a Millipore water purification system (Millipore, Billerica, MA, USA). All other chemicals were of analytical grade.

Sample Preparation and Extraction
The roots, stems, leaves and seeds of PG from the different sites were separately air dried, ground and sieved (40 mesh) to give a homogeneous powder. Then 200 mg of each powder was extracted three times with 80% methanol at 80 °C, for 3 h each time. After filtering, the extracts were combined, concentrated and evaporated to dryness. Finally, the dried extracts were dissolved and diluted with 80% methanol to 10.0 mL. The solution was filtered through a syringe filter (0.22 µm) and injected directly into the UPLC system. The volume injected was 2 µL for each run.
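The quantities in this procedure imply a nominal sample concentration and an on-column load, which the short sketch below works out. It assumes complete transfer of the extract into the final 10.0 mL; that assumption and the function names are illustrative and are not statements from the study.

```python
def extract_concentration_mg_per_ml(powder_mg: float, final_volume_ml: float) -> float:
    """Nominal concentration, expressed as mg of dried powder per mL of solution."""
    return powder_mg / final_volume_ml


def on_column_equivalent_ug(conc_mg_per_ml: float, injection_ul: float) -> float:
    """Powder equivalent injected per run; 1 mg/mL is numerically 1 µg/µL."""
    return conc_mg_per_ml * injection_ul


conc = extract_concentration_mg_per_ml(200.0, 10.0)   # 200 mg made up to 10.0 mL
load = on_column_equivalent_ug(conc, 2.0)             # 2 µL injected per run
print(f"{conc:.1f} mg/mL, {load:.1f} µg powder equivalent per injection")
```

Under these assumptions each 2 µL injection corresponds to about 40 µg of dried plant powder.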

UPLC-QTOF-MSE
The UPLC analysis was performed on a Waters ACQUITY UPLC System. The column used was an ACQUITY UPLC BEH C18 (100 mm × 2.1 mm, 1.7 µm) from Waters Corporation (Milford, MA, USA). The mobile phases consisted of eluent A (0.1% formic acid in water, v/v) and eluent B (0.1% formic acid in acetonitrile, v/v) at a flow rate of 0.4 mL/min, with a linear gradient program: 10% B from 0 to 2 min, 10-90% B from 2 to 26 min, 90% B from 26 to 28 min, 90-10% B from 28 to 28.1 min, 10% B from 28.1 to 30 min. The temperatures of the UPLC column and autosampler were set at 30 °C and 15 °C, respectively. Mixtures of 10/90 and 90/10 water/acetonitrile were used as the strong wash and the weak wash solvent, respectively.
The MS experiments were performed on a Waters Xevo G2-S QTOF mass spectrometer (Waters Co., Milford, MA, USA) connected to the UPLC system through an electrospray ionization (ESI) interface. The optimized instrumental parameters were as follows: capillary voltage at 2.6 kV (ESI+) or 2.2 kV (ESI−); cone voltage at 40 V; source temperature at 120 °C; desolvation temperature at 300 °C; cone gas flow at 50 L/h; and desolvation gas flow at 800 L/h. In MSE mode, the collision energy of the low energy function was set at 6 V, while the ramp collision energy of the high energy function was set at 20-40 V. To ensure mass accuracy and reproducibility, the mass spectrometer was calibrated over a range of 100-1600 Da with sodium formate. Leucine-enkephalin (m/z 556.2771 in positive ion mode; m/z 554.2615 in negative ion mode) was used as the lockmass at a concentration of 200 ng/mL and a flow rate of 20 µL/min. Data were collected in continuum mode, and all data acquisition was controlled by the Waters MassLynx v.4.1 software (Waters, Milford, MA, USA).
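The gradient program can be read as a piecewise-linear function of time. The sketch below encodes the breakpoints given in the text and interpolates the percentage of eluent B at any time point; the names `GRADIENT` and `percent_b` are illustrative and not part of any instrument software.

```python
# (time in min, % eluent B) breakpoints taken from the gradient program in the text.
GRADIENT = [
    (0.0, 10.0),
    (2.0, 10.0),
    (26.0, 90.0),
    (28.0, 90.0),
    (28.1, 10.0),
    (30.0, 10.0),
]


def percent_b(t_min: float) -> float:
    """Linearly interpolate % B at time t_min within the 0-30 min program."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-30 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


for t in (1.0, 14.0, 27.0, 30.0):
    print(f"t = {t:4.1f} min -> {percent_b(t):.1f}% B")

# Total mobile phase consumed per run at the stated flow rate of 0.4 mL/min.
run_volume_ml = 0.4 * GRADIENT[-1][0]
```

For example, the midpoint of the 2-26 min ramp (14 min) corresponds to 50% B, and one 30 min run consumes about 12 mL of mobile phase.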

Data Analysis
For the screening analysis, the raw data were processed using the streamlined workflow of UNIFI 1.7.0 software (Waters, Manchester, UK) to quickly identify the chemical components [15]. Besides the Waters Traditional Medicine Library in the UNIFI software, a self-built database was created that included information on the chemical components of PG, based on the literature and on-line databases such as China Full-text Journals Database (CNKI), PubMed, Medline, Web of Science and ChemSpider. A minimum peak area of 200 was set for 2D peak detection. For 3D peak detection, peak intensity thresholds of 200 counts for high energy and 1000 counts for low energy were selected. A mass error of up to 5 ppm was allowed for identified compounds. Positive adducts containing +H and +Na, and negative adducts including +COOH and −H, were selected. The compounds were verified by comparison with the retention times of reference standards and with characteristic MS fragmentation patterns reported in the literature.
For the metabolomic analysis, the raw data were processed with MarkerLynx XS V4.1 software for alignment, deconvolution, data reduction, etc. [33]. As a result, a list of mass and retention time pairs with the corresponding intensities was generated for all the detected peaks from each data file. The main parameters were as follows: retention time range 0-28 min, mass range 100-1600 Da, mass tolerance 0.10, minimum intensity 5%, marker intensity threshold 2000 counts, mass window 0.10, retention time window 0.20, and noise elimination level 6. The resulting data were analyzed by principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA). S-plots and VIP-plots were obtained via OPLS-DA analysis to find potential biomarkers that significantly contributed to the difference among the groups.
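The 5 ppm criterion and the adduct settings can be illustrated with a short calculation. This is a sketch of the standard relative mass error formula, not a description of how UNIFI evaluates matches internally. The adduct mass shifts are approximate literature values, and reading the "+COOH" negative adduct as a formate adduct (written `+HCOO` here) is an assumption.

```python
# Approximate mass shifts (Da) for singly charged adduct ions.
# "+HCOO" assumes that the "+COOH" adduct in the text denotes formate.
ADDUCT_SHIFT = {
    "+H": 1.007276,
    "+Na": 22.989218,
    "-H": -1.007276,
    "+HCOO": 44.998201,
}


def adduct_mz(neutral_mass: float, adduct: str) -> float:
    """Expected m/z of a singly charged adduct of a neutral monoisotopic mass."""
    return neutral_mass + ADDUCT_SHIFT[adduct]


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True when the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Theoretical value: the positive-mode leucine-enkephalin lockmass from the text.
# The "observed" values are hypothetical.
reference = 556.2771
for observed in (556.2790, 556.2810):
    print(observed, round(ppm_error(observed, reference), 2),
          within_tolerance(observed, reference))
```

Because the tolerance is relative, the allowed absolute deviation grows with m/z: 5 ppm corresponds to roughly 0.003 Da at m/z 556 and roughly 0.007 Da at m/z 1413.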
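To make the role of the mass window (0.10) and the retention time window (0.20) concrete, the sketch below groups peaks from different samples into shared markers when both differences fall within the windows. It is a simple greedy illustration under stated assumptions and does not reproduce the actual MarkerLynx algorithm; the function name, the sample names and the peak values are hypothetical.

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, float, float, float]  # (sample, m/z, retention time, intensity)


def align_peaks(peaks: List[Peak],
                mass_window: float = 0.10,
                rt_window: float = 0.20) -> List[Dict]:
    """Greedily assign each peak to the first marker within both windows."""
    markers: List[Dict] = []
    for sample, mz, rt, intensity in sorted(peaks, key=lambda p: (p[1], p[2])):
        for marker in markers:
            if (abs(marker["mz"] - mz) <= mass_window
                    and abs(marker["rt"] - rt) <= rt_window):
                # Same marker: accumulate the intensity for this sample.
                marker["intensities"][sample] = (
                    marker["intensities"].get(sample, 0.0) + intensity)
                break
        else:
            # No existing marker matches: start a new one at this peak.
            markers.append({"mz": mz, "rt": rt,
                            "intensities": {sample: intensity}})
    return markers


# Hypothetical peaks, for illustration only.
peaks = [
    ("root_1", 1413.65, 10.52, 5200.0),
    ("root_2", 1413.68, 10.60, 4800.0),   # within both windows of the peak above
    ("leaf_1", 449.11, 6.30, 9100.0),
    ("leaf_2", 449.11, 6.90, 3000.0),     # same mass, but 0.60 apart in retention time
]
markers = align_peaks(peaks)
for m in markers:
    print(m["mz"], m["rt"], m["intensities"])
```

The resulting marker-by-sample intensity table is the kind of input on which PCA and OPLS-DA are then run.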

Conclusions
In the present study, UPLC-QTOF-MSE coupled with the UNIFI platform and multivariate statistical analyses was used to profile the four parts of PG. For the constituent screening under the optimized conditions, a total of 159 chemical compounds (73, 42, 35 and 44 compounds characterized from the root, stem, leaf and seed, respectively) were identified from PG. The results showed various structural patterns including triterpenoid saponins, organic acids, steroids, phenols, flavonoids, alcohols, amino acids, coumarins, terpenoids, alkaloids and amides. The stem, leaf and seed contain more flavonoids but few saponins, and can therefore be easily discriminated from the root.
For the metabolomic analysis, the four parts of PG were successfully discriminated into four different clusters. A total of 38 robust biomarkers were discovered: 15, 5, 7 and 11 robust biomarkers enabling the differentiation among root, stem, leaf and seed, respectively, were characterized. These biomarkers are suitable for the simultaneous differentiation of the four different parts of PG, which is reported for the first time. In short, these results provide reliable characterization profiles and the differentiating components among the root, leaf, stem and seed of PG grown in northeast China. The method developed in this study can be used as a standard protocol for discriminating and predicting the different parts of PG directly.