Lamiophlomis rotata Identification via ITS2 Barcode and Quality Evaluation by UPLC-QTOF-MS Coupled with Multivariate Analyses

Lamiophlomis rotata (L. rotata) is known as “Daba” in the Tibetan region. Ajuga ovalifolia and Oreosolen wartii have also been utilized as substitutes for “Daba”; however, only L. rotata is officially listed in the Chinese Pharmacopoeia for hemostasis preparations. To support the safe traditional use of the herb, internal transcribed spacer 2 (ITS2) DNA barcodes were employed to discriminate L. rotata from its adulterants. To further evaluate quality across habitats of origin, the chemical profiles of 25 samples were determined by ultra-high-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-QTOF-MS) and multivariate analyses. ITS2 DNA barcodes accurately differentiated L. rotata from O. wartii and A. ovalifolia. A neighbor-joining (NJ) tree showed that the three origins clustered into three clades. Forty-nine compounds were identified in the total ion current (TIC) profile of L. rotata. Additionally, two pairs of isomers were identified for the first time by mass spectrometry fragmentation. The differences among habitats were determined by multivariate statistical analysis of the UPLC-QTOF-MS data from the 25 specimens. Ten compounds were identified as characteristic markers distinguishing the samples from four geographical origins. The results also suggest that samples from Qinghai and Sichuan provinces would be the most suitable choice for traditional prescriptions and preparations.


Introduction
Lamiophlomis rotata (Benth.) Kudo (L. rotata), a plant growing at high altitude in China, has been used to treat rheumatic arthritis and grasserie for more than 2000 years in the Traditional Tibetan System (TTS). It is known as "Daba" and "Dabuba" in the TTS; additionally, Ajuga ovalifolia (A. ovalifolia) and Oreosolen wartii (O. wartii) have been utilized as substitutes for Daba in the TTS [1].
ITS2 segments were successfully obtained from all samples. The polymerase chain reaction (PCR) amplification success rate for ITS2 was 100%. All PCR products were successfully sequenced, and high-quality bidirectional sequences were obtained.
The sequence characteristics are summarized in Table 1. The average G-C contents of the ITS2 sequences in L. rotata and A. ovalifolia were 70% to 72% and 65%, respectively. ITS2 sequences of L. rotata ranged from 0 bp to 219 bp with five variable sites; five haplotypes were identified from 25 samples. ITS2 sequences of A. ovalifolia ranged from 0 bp to 229 bp with three variable sites, and three haplotypes were identified. In general, a gene segment is suitable as a barcode when the minimum inter-specific distance is larger than the maximum intra-specific distance under the Kimura 2-Parameter (K2P) model. In this experiment, the minimum inter-specific distance was 0.258, and the maximum intra-specific distance was 0.021 (Table 1). Therefore, the ITS2 region could be an ideal barcode for discriminating among L. rotata, A. ovalifolia, and O. wartii. In this study, a phylogenetic tree was constructed using the neighbor-joining (NJ) method, with 1000 bootstrap replicates for ITS2 regions (Figure 1). All species were clearly identified, including the medicinal and non-medicinal species, and each of the three species formed its own branch of the phylogenetic tree. The NJ tree also showed that all samples of L. rotata were clustered into three subgroups according to their geographical origins: the samples from Tibet and from Gansu each formed one branch, and the samples from Sichuan and Qinghai were gathered together into another branch.
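The K2P distance used above can be illustrated with a minimal sketch. It assumes two already-aligned sequences and skips gapped or ambiguous sites (pairwise deletion); the function name `k2p_distance` is hypothetical, and this is not the MEGA implementation used in the study. With P the proportion of transitions and Q the proportion of transversions, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguous bases (pairwise deletion, an assumption).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (hypothetical, not from Table 1): one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

The result is expressed in substitutions per site, so it carries no physical unit.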

Identification of the Constituents in L. rotata by UPLC-QTOF-MS Spectra
Crude extracts of L. rotata were analyzed by MS and MSn in negative and positive ion modes (Figure 2). The SciFinder Scholar and PubChem databases were searched for the spectral data of compounds reported previously in L. rotata and Lamium species [22][23][24][25][26][27]. A total of 51 compounds in four classes were detected in the total ion current (TIC) profile of L. rotata, and 49 of these were identified by comparing the retention times and mass spectra of the compounds to those of authentic standards, including 23 iridoids, 16 phenylethanoid glycosides, nine flavonoids, and one phenolic acid (Table 2). For the first time, two pairs of isomers (8-O-acetylshanzhiside methyl ester and 6-O-acetylshanzhiside methyl ester; phlorigidoside C and zaluzioside) were identified by mass spectrometry based on their different group substitution positions. Peaks 19 and 29 exhibited the same [M + Na]+ ions at m/z 471 in the positive mode, consistent with a molecular formula of C19H28O12, and both produced ions at m/z 227 and 209. However, a hydroxyl group was linked to C-6 in compound 29, so it easily lost a methanol molecule to form a lactone with the carboxymethyl (COOCH3) group at the C-4 position. This gave a neutral loss of 32 Da, followed by successive losses of two CO groups, which yielded further peaks at m/z 177 (m/z 209→177, ∆m = 32 Da, MeOH loss), m/z 149 (m/z 177→149, ∆m = 28 Da, CO loss), and the characteristic ion at m/z 121 (m/z 149→121, ∆m = 28 Da, CO loss). Since there was no hydroxyl group substituted at the C-6 position in compound 19, successive losses of H2O, CH2, and CH2CO yielded ions at m/z 191 (m/z 209→191, ∆m = 18 Da, H2O loss), m/z 177 (m/z 191→177, ∆m = 14 Da, CH2 loss), and the distinctive ion at m/z 135 (m/z 177→135, ∆m = 42 Da, CH2CO loss). The proposed fragmentation pathways of the isomers are shown in Figure 3a.
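The neutral-loss bookkeeping for peaks 19 and 29 can be checked with a short sketch. It works with nominal (integer) masses only, and the names `NEUTRAL_LOSSES`, `nominal_mass`, and `annotate_losses` are hypothetical helpers introduced for illustration. The lookup table simply encodes the assignments given in the text; a 28 Da loss, for instance, could in principle also arise from other fragments.

```python
# Nominal masses of the neutral losses discussed in the text.
NEUTRAL_LOSSES = {14: "CH2", 18: "H2O", 28: "CO", 32: "CH3OH", 42: "CH2CO"}

# Nominal atomic masses for the elements involved.
NOMINAL = {"C": 12, "H": 1, "O": 16, "Na": 23}


def nominal_mass(formula):
    """Nominal mass of a formula given as {element: count}."""
    return sum(NOMINAL[element] * count for element, count in formula.items())


def annotate_losses(fragment_mz):
    """Label each consecutive step in a fragment series with its mass difference."""
    steps = []
    for parent, daughter in zip(fragment_mz, fragment_mz[1:]):
        delta = parent - daughter
        steps.append((parent, daughter, delta, NEUTRAL_LOSSES.get(delta, "unassigned")))
    return steps


# [M + Na]+ of C19H28O12 at nominal mass.
sodiated = nominal_mass({"C": 19, "H": 28, "O": 12}) + NOMINAL["Na"]
print("[M + Na]+ =", sodiated)

# Fragment series reported for the two isomers.
for label, series in (("compound 29", [209, 177, 149, 121]),
                      ("compound 19", [209, 191, 177, 135])):
    for parent, daughter, delta, loss in annotate_losses(series):
        print(f"{label}: m/z {parent} -> {daughter}, loss {delta} Da ({loss})")
```

The two isomers share the m/z 209 ion and differ in the first step from it (32 Da versus 18 Da), which is what distinguishes them in the text.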

Multivariate Statistical Analysis
Principal component analysis (PCA) was used to explore and present the metabolite differences among groups (Figure 5a). The result demonstrated that the 25 samples were separated into two groups: all samples from Gansu Province and two samples from Qinghai Province were clustered into group I, and the other samples were clustered into group II. Moreover, the 13 samples of group II were branched into two subgroups. Partial least-squares discriminant analysis (PLS-DA) was also used for a better understanding of the geographical origins of the collected samples. As shown in Figure 5b, all of the L. rotata samples were clustered into three groups. The high R2Y (0.814) of this model indicated goodness of fit, and the Q2 of 0.316 described its predictive ability. The samples of group I were from Gansu Province (green dots), whereas group II was composed of samples from Tibet (red dots). Notably, the samples from Qinghai and Sichuan provinces (purple and blue dots, respectively), which formed group III, could not be distinguished from each other, likely because these populations come from a similar natural environment, a region referred to as the "Amdo Tibetan area" during the Qing Dynasty. As shown in the loading and score plots (Figure 5c), ten biomarkers were found to characterize samples from the four geographical origins.
According to their significance in discriminating geographical characteristics, these compounds were identified as a (forsythoside B, tR 24.04 min, m/z 755.1098), b (verbascoside, tR 24.04 min, m/z 623.0175), c (kaempferol-3-glycoside, tR 13.83 min, m/z 593.1594), d (phlorigidoside C, tR 9.36 min, m/z 427.1621), e (chlorogenic acid, tR 4.59 min, m/z 353.0718), f (loganin, tR 21.02 min, m/z 413.1342), g (luteolin-7-O-β-D-glucopyranoside, tR 24.65 min, m/z 447.1927), h (5-deoxypulchelloside I, tR 11.99 min, m/z 429.0361), i (7-epi-loganin, tR 18.23 min, m/z 412.9987), and j (decaffeoylcrenatoside, tR 14.26 min, m/z 459.2102). Samples from Tibet are characterized by a high content of the phenylethanoid glycosides a and b, but a low content of iridoid and flavonoid glycosides. Samples from Gansu had a higher relative concentration of compounds g and f, and a low content of phenylethanoid and flavonoid glycosides. Samples from Qinghai and Sichuan provinces were branched into one group and are characterized by a high content of compounds c, d, e, h, i, and j. The results also suggest that these samples are characterized by a high content of flavonoid glycosides, a moderate content of iridoid glycosides, and a low content of phenylethanoid glycosides. Since iridoid glycoside and total flavonoid contents are used to qualitatively and quantitatively analyze L. rotata and its preparations in the latest Chinese Pharmacopoeia [28], the above result suggests that samples from Qinghai and Sichuan provinces would be the most suitable choice for traditional prescriptions and preparations.

Discussion
Liu et al. reported the detection of L. rotata from three genetic groups corresponding to three geographic regions using inter simple sequence repeats (ISSR) and randomly amplified polymorphic DNA (RAPD) techniques [3]. Pan et al. investigated the systematic positions of Lamiophlomis and Paraphlomis (Lamiaceae) based on ITS DNA barcodes and chloroplast rpl16 and trnL-F sequences [4]. In this paper, the variation at the genus and species level among L. rotata, O. wartii, and A. ovalifolia was distinguished for the first time using ITS2 DNA barcodes. The result confirmed that ITS2 is one of the best-performing barcodes for identifying medicinal plants.
In our previous study, we reported a method using 1H nuclear magnetic resonance (NMR) spectroscopy with multivariate analysis to discriminate extracts, prepared in deuterated reagents, of L. rotata from Gansu, Tibet, and Qinghai. The result of that study revealed that the Gansu samples had a higher iridoid glycoside content, and the Tibet samples had a high content of phenylethanoid glycosides. However, as NMR spectroscopy lacks sensitivity, compounds present at low levels were not analyzed in previous studies, but they were found by the UPLC-QTOF-MS method. Trace compounds, such as 5-deoxypulchelloside I, kaempferol-3-glycoside, and 7-epi-loganin, were found to characterize samples of the four geographical origins, and a comprehensive chemical composition profile of the 25 samples was revealed. The result also confirmed that of our previous study: Tibetan samples had a higher content of phenylethanoid glycosides, while individuals from Gansu Province had a higher iridoid glycoside content. In this study, variation was seen in the overall pattern of metabolites between samples from different geographical locations. Duan et al. also found that growth locations have a greater impact on metabolite composition and quantity than genotypes (cultivated versus wild) in Menggu Huangqi (Astragalus mongholicus) [29]; Huang et al. reported that the chemical level and composition of Cistanche deserticola were affected by the key factors of temperature, moisture, and illumination [30]. Additionally, samples from Sichuan Province were included for a more comprehensive origin study than that undertaken in our previous studies. Altogether, this study contributes to closing knowledge gaps in the systematic characterization of L. rotata and its safe application in traditional uses.

Plant Materials, Reagents, and Chemicals
Twenty-five populations of L. rotata were collected from throughout the geographical distribution of official source plants, including the Tibet, Qinghai, Sichuan, and Gansu provinces (Table 3). The sampling strategy covered most of its presently known populations [29]. Eight individual samples of A. ovalifolia were also collected from Qinghai and Gansu provinces, and eight ITS2 sequences of O. wartii were downloaded from GenBank. Details of the GenBank accession numbers, haplotypes of ITS2 sequences, and the locations of the sampling areas are provided in Table 3. The voucher samples were deposited in the College of Ethnic Medicine (Chengdu University of Traditional Chinese Medicine, Chengdu, China) and the Chongqing Academy of Chinese Materia Medica (Chongqing, China). We included all sequences in the final analysis.
HPLC-grade methanol and formic acid were purchased from Merck (Darmstadt, Germany) and Tedia (Fairfield, OH, USA). Deionized water was prepared using a Millipore water treatment system (Bedford, MA, USA). All other reagents were of analytical grade.
DNA extraction was performed according to the method described in reference [31]. In brief, samples taken from dried stems of L. rotata and A. ovalifolia (30 mg) were ground for 2 min at a frequency of 30 r/s. DNA was extracted using a Plant Genomic DNA Kit (Tiangen Biotech Co., Beijing, China) in accordance with the manufacturer's instructions. PCR was carried out with DNA polymerase (Biocolor BioScience & Technology Co., Shanghai, China) according to the following program: 94 °C for 5 min, followed by 40 cycles of 94 °C for 30 s, 56 °C for 30 s, and 72 °C for 45 s. ITS2-specific primers were used as follows: GTTATGCATGAACGTAATGCTC (5′-3′) as the forward primer and CGCGCATGGTGGATTCACAATCC (5′-3′) as the reverse primer. PCR products were separated and detected using 1% agarose gel electrophoresis. PCR products were purified following the manufacturer's protocol and directly subjected to sequencing.
The PCR products were visualized on agarose gels (electrophoresis was run in 1× TBE for 20 min at a constant voltage of 120 V). After electrophoresis, purified PCR products were bidirectionally sequenced on a 3730XL sequencer (Applied Biosystems, Foster City, CA, USA) using the same primers as for PCR.

Sequence Alignment and Analysis
ITS2 sequences of O. wartii were collected from the GenBank database. Sequences obtained by sequencing the samples were submitted to the GenBank database (Table 3).
Proofreading and contig assembly of the sequencing peak diagrams were performed using CodonCode Aligner 3.7.1 (CodonCode Co., Centreville, MA, USA). The ITS2 region was obtained using an annotation method based on the hidden Markov model (HMMER) to remove the 5.8S and 28S sections at both ends of the sequences. All sequences were aligned (MUSCLE option) in MEGA 6.0 (Center for Evolutionary Medicine and Informatics, Tempe, AZ, USA), and the genetic distances were calculated according to the K2P model. The distribution of intra- versus inter-specific variability was assessed by DNA barcoding gaps. An NJ tree was constructed in MEGA 6.0, and bootstrap resampling (1000 replicates) was conducted to assess the confidence in the phylogenetic analysis [32].
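The barcoding-gap check described above can be sketched as follows, assuming pairwise distances have already been computed. The function name `barcoding_gap`, the sample IDs, and most of the distance values are hypothetical; only 0.021 and 0.258 echo the extremes reported in Table 1. This is an illustration of the criterion, not the procedure run in MEGA.

```python
from itertools import combinations


def barcoding_gap(species_of, distance):
    """Return (max intra-specific, min inter-specific, gap present?).

    species_of: {sample_id: species name}
    distance:   {(id_a, id_b): pairwise distance}, each pair stored once in either order
    """
    intra, inter = [], []
    for a, b in combinations(sorted(species_of), 2):
        d = distance[(a, b)] if (a, b) in distance else distance[(b, a)]
        if species_of[a] == species_of[b]:
            intra.append(d)
        else:
            inter.append(d)
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter)  # raises ValueError if only one species is present
    return max_intra, min_inter, min_inter > max_intra


# Toy data: two samples per species (hypothetical IDs and mostly hypothetical values).
species_of = {"LR1": "L. rotata", "LR2": "L. rotata",
              "AO1": "A. ovalifolia", "AO2": "A. ovalifolia"}
distance = {("LR1", "LR2"): 0.021, ("AO1", "AO2"): 0.010,
            ("LR1", "AO1"): 0.258, ("LR1", "AO2"): 0.260,
            ("LR2", "AO1"): 0.262, ("LR2", "AO2"): 0.265}

print(barcoding_gap(species_of, distance))
```

A gap is reported only when every between-species distance exceeds every within-species distance, which is the condition stated for a usable barcode.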

Sample Preparation
Dried stem samples of L. rotata (1.0 g of powder each) were extracted with 10 mL of 70% aqueous methanol in an ultrasonic bath for 30 min and cooled to room temperature. The extraction was repeated three times using fresh aliquots of the solvent. After the three extracts were combined, the solutions were centrifuged at 12,000 rpm for 10 min and filtered through 0.22-µm pore membranes prior to UPLC-QTOF-MS analysis.
Mass spectrometry data were obtained using a Xevo G2 Q/TOF (Waters MS Technologies, Manchester, UK) fitted with an electrospray ionization source. Each sample was analyzed twice, once in positive ionization mode and once in negative ionization mode. MS full scans were acquired over the m/z range 100 to 1000 Da in two channels with a scan time of 1 s. The capillary voltages were set to 2500 V and the cone voltage to 40 V [33]. Nitrogen was used as both the nebulizing and the desolvation gas. The desolvation and cone gas flow rates were 650 and 50 L·h−1, respectively. The desolvation temperature was 300 °C, and the source temperature was 100 °C. The capillary voltage and cone voltage were set to 2700 V and 35 V, respectively. The Leu-Enkephalin ions at m/z 556.2771 and 554.2615 were used to calibrate the mass accuracy.
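Mass accuracy against a reference ion is conventionally expressed in parts per million. The sketch below shows that arithmetic for the two Leu-Enkephalin reference values quoted above; assigning m/z 556.2771 to the positive mode and m/z 554.2615 to the negative mode is an assumption based on the usual [M + H]+ and [M − H]− ions, the measured value is hypothetical, and the names `LOCK_MASS` and `ppm_error` are illustrative only.

```python
# Leu-Enkephalin reference ions from the text; the polarity assignment is assumed.
LOCK_MASS = {"positive": 556.2771, "negative": 554.2615}


def ppm_error(measured_mz, reference_mz):
    """Signed mass error in parts per million relative to the reference m/z."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


# Hypothetical measurement of the lock-mass ion in positive mode.
measured = 556.2790
print(f"{ppm_error(measured, LOCK_MASS['positive']):.2f} ppm")
```

A positive value means the instrument reads high relative to the reference, and a negative value means it reads low.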

Data Processing and Statistical Analysis
The original data were processed for alignment, data reduction, and normalization by MarkerLynx software (Waters, Manchester, UK), and the processed data were exported to SIMCA-P software (ver. 13.0; Umetrics, Umeå, Sweden) for data analysis. A list of the intensities of detected peaks was generated using retention time (tR) and mass (m/z) pairs to identify each peak. An arbitrary ID was assigned to each tR-m/z pair in the order of UPLC elution to facilitate data alignment. This procedure was repeated for each run. Ions from different samples were considered to be identical when they had the same tR (tolerance within 0.01 min) and m/z (tolerance within 0.01 Da). If a peak was not detected in a particular sample, its ion intensity was recorded as zero. In the multivariate analysis, p < 0.05 and variable importance for projection (VIP) > 3 were set as the screening criteria for potential markers responsible for the discrimination of different groups.
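The peak-matching rule described above (same tR within 0.01 min, same m/z within 0.01 Da, zero for undetected peaks) can be sketched in a few lines. This is a simplified greedy matcher written for illustration, not the MarkerLynx algorithm; the function name `align_peaks` and the intensities are hypothetical, while the two tR-m/z pairs are taken from compounds e and d.

```python
def align_peaks(samples, rt_tol=0.01, mz_tol=0.01):
    """Build a feature table from per-sample peak lists.

    samples: {sample_name: [(rt_min, mz, intensity), ...]}
    Returns (sample_names, table); each table row holds an arbitrary ID assigned
    in elution order and one intensity per sample (0.0 if not detected).
    """
    features = []  # reference (rt, mz) pair for each feature
    rows = []      # parallel list of {sample_name: summed intensity}
    for name, peaks in samples.items():
        for rt, mz, intensity in peaks:
            for idx, (ref_rt, ref_mz) in enumerate(features):
                if abs(rt - ref_rt) <= rt_tol and abs(mz - ref_mz) <= mz_tol:
                    rows[idx][name] = rows[idx].get(name, 0.0) + intensity
                    break
            else:
                # No existing feature within tolerance: start a new one.
                features.append((rt, mz))
                rows.append({name: intensity})
    names = list(samples)
    order = sorted(range(len(features)), key=lambda i: features[i])
    table = []
    for new_id, i in enumerate(order, start=1):
        table.append({"id": new_id, "rt": features[i][0], "mz": features[i][1],
                      "intensities": [rows[i].get(n, 0.0) for n in names]})
    return names, table


# Toy input: intensities are hypothetical.
samples = {
    "sample_A": [(4.59, 353.0718, 100.0), (9.36, 427.1621, 50.0)],
    "sample_B": [(4.595, 353.0750, 80.0)],
}
names, table = align_peaks(samples)
for row in table:
    print(row)
```

Matching each new peak against the first feature found within tolerance keeps the sketch short; a production tool would typically resolve ambiguous matches more carefully.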

Conclusions
In summary, using ITS2 DNA barcodes, L. rotata was accurately differentiated from O. wartii and A. ovalifolia. Additionally, an NJ tree showed that all samples of L. rotata were clustered into three subgroups according to their geographical origins. For further evaluation of the quality of the herbs from different habitats, a method coupling UPLC-QTOF-MS with multivariate analysis was implemented. Ten compounds were identified as characteristic markers distinguishing samples from the four geographical origins. The results also suggest that samples from Qinghai and Sichuan provinces would be the most suitable choice for traditional prescriptions and preparations.