Compounds Identification in Semen Cuscutae by Ultra-High-Performance Liquid Chromatography (UPLCs) Coupled to Electrospray Ionization Mass Spectrometry

Semen Cuscutae is commonly used in traditional Chinese medicine and contains a range of compounds such as flavonoids, chlorogenic acids and lignans. In this study, we identified its constituents by ultra-high-performance liquid chromatography (UPLC) coupled to electrospray ionization mass spectrometry (MS). A total of 45 compounds were observed, including 20 chlorogenic acids, 23 flavonoids and 2 lignans. Of these, 23 are reported for the first time, including 6-O-caffeoyl-β-glucose and 3-O-(4′-O-caffeoylglucosyl)quinic acid. Their structures were established from retention behavior and extensive analysis of their MS spectra, and were further confirmed by comparison of their MS data with those reported in the literature. Because chlorogenic acids and flavonoids, both phenolic compounds, are predominant, we conclude that phenolic compounds are the major constituents of Semen Cuscutae.


Introduction
Semen Cuscutae is the dry mature seed of Cuscuta australis R.Br. or Cuscuta chinensis Lam., which belong to the Convolvulaceae family. It was first recorded in "Shen Nong's Herbal" as an upper-grade drug about 2000 years ago. Semen Cuscutae has been widely prescribed by practitioners of Chinese medicine to nourish the liver and kidney, improve eyesight, treat aching and weakness of the loins and knees, prevent abortion, and treat diarrhea due to hypofunction of the kidney and the spleen [1]. Previous phytochemical investigations on Semen Cuscutae have led to the isolation of a series of natural compounds, including flavonoids, lignans, polysaccharides, alkaloids and other chemicals [2][3][4].
Most studies on the identification and quantification of flavonoids and polysaccharides in Semen Cuscutae have been performed by HPLC-UV [5][6][7], and few have used ultra-high-performance liquid chromatography (UPLC) coupled with electrospray ionization tandem mass spectrometry. The latter method is more sensitive and selective than HPLC-UV, leading to a more exact identification of a larger number of compounds [8].
The purpose of this work was to identify different kinds of ingredients with significant biological functions in Semen Cuscutae for further phytochemical and pharmacological study.

Optimization of UPLC-MS Conditions
In this study, optimized chromatographic separation was achieved using an acetonitrile-water solvent system containing 0.05% formic acid as the mobile phase. A Waters ACQUITY UPLC BEH C18 column (2.1 × 100 mm i.d., 1.7 μm) was selected for qualitative analysis because of its better separation efficiency. A representative total ion chromatogram (TIC) is shown in Figure 1.
To obtain a satisfactory analytical method, the chromatographic conditions, including mobile phase (methanol, acetonitrile and acetonitrile-water), flow rate (0.1, 0.2 and 0.3 mL·min−1), formic acid addition (0.05% and 0.1%), and column type (Waters ACQUITY BEH C18, 2.1 × 100 mm i.d., 1.7 μm, and Agilent Eclipse Plus C18, 2.1 × 100 mm i.d., 1.8 μm), were optimized after several trials. Meanwhile, to obtain abundant fragment ions, all factors related to MS performance, including ionization mode, sheath gas flow rate, aux gas flow rate, spray voltage of the ion source, and collision energy, were optimized.

Optimization for Sample Extraction
The extraction method had been established previously by our team [5]. The optimal extraction conditions were as follows: 1.0 g of sample was extracted under reflux with 50 mL of 80% methanol for 2 h. To obtain satisfactory extraction efficiency, the extraction method (refluxing and ultrasonication), methanol concentration (40%, 60% and 80%), and extraction time (0.5, 1 and 2 h) were optimized.

Structural Characterization by UPLC-MS
Due to the lack of standards for some of the compounds, their identification was based on the correspondence of the ion from the deprotonated molecule with literature data, the fragmentation patterns of similar compounds, and databases. The chromatograms of the standards are shown in Figures 2 and 3. For the LC-MS measurements, negative ion mode was used to obtain better tandem mass spectra and high-resolution mass spectra. In total, 45 compounds were identified, including 23 flavonoids, 2 lignans and 20 chlorogenic acids (Table 1). For all the compounds, the high-resolution mass data were in good agreement with the theoretical molecular formulas, all displaying a mass error below 5 ppm (Table 2), thus confirming their elemental composition.
Chlorogenic acids (CGAs) are a family of esters of trans-cinnamic acids (most commonly p-coumaric, caffeic, ferulic and dimethoxycinnamic acids) with quinic acid [9,10]. The trans-cinnamic acids can be esterified at one or more of the hydroxyls at positions 1, 3, 4 and 5 of quinic acid, giving rise to series of positional isomers. More importantly, a 4-acyl chlorogenic acid is easily distinguished by its "dehydrated" MS2 base peak at m/z 173 ([quinic acid − H − H2O]−), supported by strong MS3 ions at m/z 93 [11,12].
p-Coumaroylquinic acid (pCoQA) has a molecular weight (Mr) of 338.1002, and three peaks (peaks 34, 41 and 42) were detected at m/z 337. All three peaks are pCoQA isomers. In addition, according to the literature, the retention time of a 4-position substituted cis-isomer on a reversed-phase chromatographic column is markedly longer than that of the trans-isomer [13]. Based on the above analysis, compounds 34 and 41 were identified as cis-4-p-CoQA and trans-4-p-CoQA, respectively, while compound 42 is a pCoQA isomer whose structure remains uncertain.
Chromatographic peaks 13 and 32 presented m/z 353 as the base peak in negative ionization mode mass spectra, which suggested positional isomers of a quinic acid (QA) esterified with a single caffeoyl (CAF) unit. The product ion spectra obtained by negative ion MS/MS for the precursor ion m/z 353 differed from each other. The product ion spectrum of peak 32 showed m/z 173 (dehydrated quinic moiety) as the base peak, together with m/z 191 (loss of the caffeic moiety) and m/z 179 (loss of the quinic moiety). As m/z 173 is diagnostic of acylation at position 4, peak 32 was attributed to 4-CQA (Figure 4). As reported before, the retention times of acylated CQAs follow a consistent elution pattern: the 3-acylquinic acid elutes first, followed by the 4-acylquinic acid. Peak 13 was therefore assigned as 3-CQA (Figure 5) [9,14].
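The accurate-mass check described above (theoretical formula versus measured m/z, with an error below 5 ppm) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the formulas C16H18O8 (pCoQA) and C17H20O9 (feruloylquinic acid) are assumed here because they reproduce the Mr values quoted in the text, and the measured value used in the ppm example is hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON = 1.00727646688  # mass of a proton (u)


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple C/H/O formula such as 'C16H18O8'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def deprotonated_mz(neutral_mass: float) -> float:
    """m/z of the singly charged [M - H]- ion."""
    return neutral_mass - PROTON


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


pcoqa = monoisotopic_mass("C16H18O8")    # assumed formula for pCoQA
print(round(pcoqa, 4))                   # 338.1002
print(round(deprotonated_mz(pcoqa), 4))  # 337.0929

# Hypothetical measured m/z, used only to show the 5 ppm acceptance test.
hypothetical_measured = 337.0935
print(abs(ppm_error(hypothetical_measured, deprotonated_mz(pcoqa))) < 5.0)
```

The nominal value m/z 337 quoted for the pCoQA isomers corresponds to the [M − H]− ion computed here.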
Feruloylquinic acid (FQA) has a Mr of 368.1107. Similar to pCoQA and CQA, molecules harboring ferulic acid moieties were also identified. Negative ionization mode fragmentation of the precursor ion m/z 367 of peak 19 produced m/z 191 as the base peak. This is a diagnostic ion of acylation at position 5 of quinic acid and allows the identification of compound 19 as 5-FQA (Figure 6), based on the chlorogenic acid identification by Leonardo et al. [9].
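The two base-peak rules used so far (m/z 173 for 4-acyl and m/z 191 for 5-acyl quinic acids) can be written as a small lookup. This is only a sketch of the decision rule as stated in the text; the function name and the tolerance are illustrative assumptions, and the 3-acyl assignment is deliberately left out because the text bases it on elution order rather than on a base peak.

```python
# Diagnostic MS2 base peaks (nominal m/z) as stated in the text.
DIAGNOSTIC_BASE_PEAKS = {
    173: "4-acyl",  # [quinic acid - H - H2O]-
    191: "5-acyl",  # [quinic acid - H]-
}


def acyl_position(base_peak_mz: float, tolerance: float = 0.5) -> str:
    """Return the acylation position suggested by the MS2 base peak."""
    for nominal, label in DIAGNOSTIC_BASE_PEAKS.items():
        if abs(base_peak_mz - nominal) <= tolerance:
            return label
    # Other cases (e.g. 3-acyl isomers) need retention order or standards.
    return "undetermined"


print(acyl_position(173))  # base peak of peak 32 (precursor m/z 353)
print(acyl_position(191))  # base peak of peak 19 (precursor m/z 367)
```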
For the dicaffeoylquinic acids (diCQAs; peaks 4, 6 and 7), the spectra showed the loss of a caffeoyl residue (C9H6O3). The base peak of peak 6 was at m/z 179, indicating a 3-substituted quinic acid. This ion in 3,4-diCQA (peak 4) was at m/z 178, and in 4,5-diCQA (peak 7) it was absent. 3,5-diCQA (peak 6) was relatively easy to distinguish owing to its MS3 base peak at m/z 191 and the intensity of the ion at m/z 179, which was similar to previously published data [15]. Generally, the order of elution of the diacyl CGAs on RP columns is 3,4 > 3,5 > 4,5 [16]. By comparison with reference substances, peaks 4, 6 and 7 were assigned as 3,4-diCQA, 3,5-diCQA and 4,5-diCQA, respectively, which is consistent with previously published data.
Caffeoylglycosides have a molecular weight of 342.0951, which indicates that the glucosyl group was linked to caffeic acid, not quinic acid [13]. Two molecules (2 and 5) with pseudomolecular peaks at m/z 341 were assigned as isomers of caffeoylglycoside. They produced distinctive ions at m/z 179 ([caffeic acid − H]−), by the loss of a glucosyl residue (C6H10O5), and at m/z 135 ([caffeic acid − H − CO2]−). As reported before, the retention time of a 6-position substituted β-glucose isomer on a reversed-phase chromatographic column is markedly longer than that of the α-glucose isomer [17]. On the basis of these arguments, the first eluting isomer was assigned as 6-O-caffeoyl-β-glucose (2) and the later eluting isomer as 6-O-caffeoyl-α-glucose (5).
Previous studies [14] led to the conclusion that CQA forms a glycoside through an ether bond at either C-3 or C-4 on the aromatic caffeoyl ring. However, an MS2 base peak at m/z 323 is characteristic of glucosyl attachment at C-3 [18]. For peak 10 (Table 1), the MS2 peaks at m/z 353, 341 and 323 suggested that glucose was connected to the caffeic acid moiety by an ether linkage and that caffeic acid was connected to quinic acid at C-5 by an ester bond [19]. From the above points, peak 10 was identified as 5-O-(3′-O-caffeoylglucosyl)quinic acid (Figure 8).
One peak was detected at m/z 315 (peak 45) in the extracted ion chromatogram and was tentatively assigned as isorhamnetin. It produced an MS2 base peak at m/z 300, indicating the presence of a methoxyl group (loss of a methyl radical). These data matched those previously reported for isorhamnetin [20].
As reported [20], peak 26 showed weaker retention on the RP-HPLC column than peak 27; they were therefore assigned as kaempferol-3-O-galactoside and astragalin, respectively. For peak 31, no [A − H − 30]− ion was observed, and the MS3 base peak was at m/z 151, which is consistent with luteolin; peak 31 was therefore identified as luteolin-7-O-glucoside, which was confirmed by comparison with a reference standard. Peak 30 showed fragments similar to those of peak 31 and was thus assigned as a luteolin-hexoside. Two peaks detected at m/z 477 in the extracted ion chromatogram were tentatively assigned as isorhamnetin-hexosides (28 and 29). These two compounds produced a base peak at m/z 314, originating from the loss of a hexose (162 Da), and their MS3 spectra were very similar to that of isorhamnetin. These compounds were thus tentatively identified as isorhamnetin-7-glucoside and isorhamnetin-3-O-glucoside, respectively.

Characterization of Kaempferol-Glucoside (Mr = 568.1217)
For peak 33, a significant loss of 120 Da was also observed, but no direct loss of 162 Da from the [M − H]− ion was observed. Therefore, it is reasonable to assign a p-hydroxybenzoyl group linked to the hexose moiety rather than to the aglycone in this structure. Interestingly, a second loss of 120 Da (m/z 447→327) was also observed, which presumably results from 1,2X fragmentation of the hexose. Peak 33 produced an MS2 base peak at m/z 285, whose fragmentation was consistent with kaempferol; peak 33 was therefore assigned as kaempferol-3-O-p-hydroxybenzoylglucoside, consistent with the previous report [20]. A previous study [20] reported that the retention times of flavonoid diglycosides on RP-HPLC columns are generally longer than those of monoglycosides. Based on the above points, compound 40 was assigned as kaempferol-3-O-coumaroylglucoside. Peak 21 gave MS2 and MS3 spectra very similar to those of astragalin and was plausibly identified as hyperoside, which has been previously reported [20]. Peak 22 showed fragmentation information similar to that of peak 21 and was therefore tentatively identified as isoquercitrin.
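The neutral losses used in the assignments above (162 Da for a hexose or a caffeoyl residue, 15 Da for a methyl radical) can be told apart by accurate mass. The sketch below is illustrative only: the exact masses are computed from the residue formulas, the precursor and fragment m/z values are theoretical values for the named ions rather than measurements from this study, and the function name and tolerance are assumptions.

```python
# Exact masses (u) of common neutral losses, computed from their formulas.
NEUTRAL_LOSSES = {
    "hexose residue (C6H10O5)": 162.0528,
    "caffeoyl residue (C9H6O3)": 162.0317,
    "methyl radical (CH3)": 15.0235,
}


def annotate_loss(precursor_mz: float, fragment_mz: float, tol: float = 0.005):
    """List the neutral losses that match precursor - fragment within tol (u)."""
    delta = precursor_mz - fragment_mz
    return [name for name, mass in NEUTRAL_LOSSES.items()
            if abs(delta - mass) <= tol]


# Theoretical [M - H]- of a kaempferol hexoside and of kaempferol itself.
print(annotate_loss(447.0933, 285.0405))
# Theoretical [M - H]- of isorhamnetin and its demethylated radical ion.
print(annotate_loss(315.0510, 300.0275))
# At unit-mass tolerance, hexose and caffeoyl losses are indistinguishable.
print(annotate_loss(447.0933, 285.0405, tol=0.5))
```

At nominal resolution both 162 Da residues match the same transition, which is why the high-resolution data matter for distinguishing a glycoside from a caffeoyl ester.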

Sample Collection
The crude products of Semen Cuscutae (Lot number: 160161211) were purchased from Beijing Kangmei Pharmaceutical Co., Ltd., and were identified and authenticated as Semen Cuscutae by Professor Yang Yaojun of the Pharmacognosy Department, Beijing University of Chinese Medicine. Voucher specimens were retained in the School of Chinese Materia Medica, Beijing University of Chinese Medicine.

Extraction Method
The extraction method followed our previous study [5]: powdered samples (60 mesh, 1 g) were suspended in 80% methanol (50 mL) and extracted under reflux for 2 h. After cooling, the weight loss was made up with 80% methanol. All solvents and samples were filtered through 0.22-μm organic membranes prior to injection.

For LC/MS analysis, an LTQ-Orbitrap mass spectrometer (Thermo Scientific, Bremen, Germany) was connected to the UPLC instrument via an electrospray ionization (ESI) interface. Samples were analyzed in negative ion mode with the tune method set as follows: sheath gas (nitrogen) flow rate, 40 arb; aux gas (nitrogen) flow rate, 20 arb; source voltage, 4 kV; capillary temperature, 350 °C; capillary voltage, 25 V; and tube lens voltage, −110 V. The instrument was calibrated for accurate mass analysis according to the manufacturer's guidelines. Centroided mass spectra were acquired over a mass range of m/z 50-1000 at a resolution of 30,000 using a normal scan rate, with detection by the Orbitrap analyzer.
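For record keeping, the tune parameters listed above can be captured in a small structure. This is a hypothetical sketch, not a vendor file format: the class and field names are invented for illustration, and only the numeric values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuneMethod:
    """Hypothetical container for the ESI tune settings quoted in the text."""
    polarity: str = "negative"
    sheath_gas_arb: float = 40.0
    aux_gas_arb: float = 20.0
    source_voltage_kv: float = 4.0
    capillary_temp_c: float = 350.0
    capillary_voltage_v: float = 25.0
    tube_lens_voltage_v: float = -110.0
    scan_range_mz: tuple = (50.0, 1000.0)
    resolution: int = 30000

    def in_scan_range(self, mz: float) -> bool:
        """True if an ion at this m/z falls inside the acquisition window."""
        low, high = self.scan_range_mz
        return low <= mz <= high


method = TuneMethod()
# The [M - H]- ions discussed in the text (m/z 315 to 567) lie in the window.
print(all(method.in_scan_range(mz) for mz in (315, 337, 341, 353, 367, 477, 567)))
```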

Data Processing
Thermo Xcalibur 2.1 (Thermo Fisher Scientific, San Jose, CA, USA) was used for data acquisition and processing. All relevant data, including peak number, retention time, accurate mass and predicted chemical formula, were recorded in an Excel file.

Conclusions
In this study, we identified 45 compounds in Semen Cuscutae using UPLC coupled with an electrospray ionization tandem mass spectrometry system. Of these, 23 are reported for the first time, including 6-O-caffeoyl-β-glucose and 3-O-(4′-O-caffeoylglucosyl)quinic acid. Because chlorogenic acids and flavonoids, both phenolic compounds, are the predominant compounds in Semen Cuscutae, we conclude that phenolic compounds are its major constituents.
Author Contributions: Y.Z., H.X., X.X. and X.L. conceived and designed the experiment. Y.Z. and H.X. performed the experiment and data analysis. Y.Z., Y.G., H.Z. and X.L. drafted the paper. S.X., H.L., X.X. and M.L. revised the manuscript. All authors have contributed to the final version and approved the publication of the final manuscript.