Molecular Characterization of Terpenoid Biosynthetic Genes and Terpenoid Accumulation in Phlomis umbrosa Turczaninow

: The root of Phlomis umbrosa has traditionally been used as a medicine in South Asian nations to treat colds and bone fractures, to staunch bleeding, and as an anti-inﬂammatory, and such use continues today. We identiﬁed 10 genes that are involved in terpenoid biosynthesis, while using the Illumina / Solexa HiSeq2000 platform. We investigated the transcript levels of the 10 genes using quantitative real-time PCR and quantiﬁed the level of terpenoid accumulation in di ﬀ erent organs of P. umbrosa while using high-performance liquid chromatography. The transcript levels of PuHDR and PuHMGR1 were the highest among the studied genes. Sesamoside, an iridoid glycoside, appeared in higher quantity than shanzhiside methylester, umbroside (8-O-acetyl shanzhiside methyl ester), and acteoside. We speculate that PuHDR and PuHMGR1 may contribute to terpenoid biosynthesis in P. umbrosa . This study highlights the molecular mechanisms that underlie iridoid glycoside biosynthesis in P. umbrosa .


Introduction
Phlomis umbrosa Turczaninow is a perennial herbaceous plant species in the Lamiaceae family; the genus Phlomis contains hundreds of species across Africa, Europe, and Asia [1]. The aerial parts of some species have been used in herbal teas, and the roots of P. umbrosa have been used as a traditional medicine for a long time. Preparations of P. umbrosa are considered as potential therapeutic drugs to treat bone fractures, rheumatic diseases, and cold; to reduce swelling; and, to staunch bleeding in South Asian countries, particularly in Korea, China, and Japan [2,3]. The plant contains triterpenoids, iridoid glycosides, and phenylethanoid glycosides [4], and it has been used in anti-allergic, Terpenoids belong to a large cluster of terpene-derived secondary metabolites that are formed through two metabolic pathways in plants. The mevalonic acid (MVA) pathway is the first to be formed, later becoming the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Terpenoids are primarily categorized by their number of isoprene units, and then categorized into hemiterpenoids (one isoprene unit with five carbons), monoterpenoids (two isoprene units with 10 carbons), sesquiterpenoids (three isoprene units with 15 carbons), diterpenoids (four isoprene units with 20 carbons), triterpenoids (six isoprene units with 30 carbons), tetraterpenoids (eight isoprene units with 40 carbons), and polyterpenoids ((C 5 )n with n > 8) [7]. Iridoids are derived from monoterpenes, which are typically found in plants as glycosides. Many iridoid glycosides have been isolated in P. umbrosa, including asacteoside, asperuloside, aucubin, feretoside, geniposide, sesamoside, shanzhiside methylester, ixoroside, and ixoside and others [8][9][10]. To date, iridoid glycosides have been shown to have anti-inflammatory, anti-fibrinolytic, and neuroprotective effects [11][12][13].
Several studies have investigated the accumulation of terpenoids and the transcription regulation of their related genes in plants [15][16][17][18]. However, the gene expression and characterization of terpenoid compounds in different tissues of P. umbrosa have not yet been published. Thus, we studied the accumulation of terpenoids and their related genes in the transcriptional regulation of different organs of P. umbrosa. Scheme 1. Schematic representation of terpenoid biosynthesis in plants [19]. Terpenoids and their biosynthetic enzymes were analyzed in this study, and the analyzed enzyme is indicated with black boxes.

Plant Materials
P. umbrosa specimen plants were established in the experimental field of the National Institute of Horticultural and Herbal Science, Rural Development Administration (Eumseong, Korea). Samples of flowers, stems, leaves, and roots were collected from 15 healthy plants, white and pink type plants, respectively, and immediately freeze-dried with liquid nitrogen, then stored at −80 • C until use. The samples were used for RNA extraction and HPLC ultraviolet analysis. All of the experiments were performed with at least three biological replicates.

Total RNA Isolation and Quantitative Real-Time PCR (qRT-PCR)
The total RNA from each individual sample was extracted using an RNeasy total RNA kit (Qiagen, Hilden, Germany), following the manufacturer's protocol. RNAse-Free DNase Set (cat. no. 79254) was used to digest contaminating DNA in RNA solutions prior to RNA cleanup. The quantity of RNA was computed on a NanoVuePlus spectrophotometer (GE Healthcare Life Sciences, Logan, UT, USA), and the quality was verified by adding 1 µg of the RNA to a 1.2% formaldehyde RNA agarose gel. Subsequently, 1 µg of total RNA was reverse-transcribed while using the SuperScript™ double-stranded synthesis kit (Invitrogen, Carlsbad, CA, USA) following the manufacturer's protocol. For qRT-PCR, the cDNA was diluted 20-fold. The reaction volume was 20 µL comprising 0.5 µM primers and 2× real-time PCR Smart mix (SolGent, Daejeon, Korea). qRT-PCR was performed while using three independent biological replicates with a BIO-RAD CFX96 real-time PCR system (Bio-Rad Laboratories, Hercules, CA, USA) under the following conditions: initial denaturation at 95 • C for 15 min. followed by 40 cycles of denaturation at 95 • C for 20 s, annealing at 55 • C for 40 s, and extension at 72 • C for 20 s. Actin gene (GenBank accession no. KU317507) was used as the reference gene.

Next Generation Sequencing of Transcriptome
We implemented Illumina-based NGS sequencing in order to obtain high-throughput transcriptome data of P. umbrosa. The total RNAs were quantified using Nanodrop spectrophotometer (Thermo Scientific, Waltham, MA, USA) and quality-assessed by RNA 6000 Nano assay kit (Agilent, Santa Clara, CA, USA) and Bioanalyser 2100 (Agilent). NGS sequencing libraries were generated from one microgram of total RNA using Truseq RNA Sample Prep Kit (Illumina, San Diego, CA, USA) according to the manufacturer's protocol. In brief, the poly-A containing RNA molecules were purified while using poly-T oligo attached magnetic beads. After purification, the total poly A+ RNA was fragmented into small pieces using divalent cations under elevated temperature. The cleaved mRNA fragments were reverse transcribed into first strand cDNA using random primers. QiaQuick PCR extraction kit was used for the purification of the shorts fragments and further resolved with EB buffer for end reparation and addition of poly (A). After that, the short fragments with poly (A) tail were interlinked with sequencing adapters. Each library was separated by adjoining distinct MID tag. The resulting cDNA libraries were then paired-end sequenced (2 × 101 bp) with the Illumina HiSeq™ 2000 system.

Sequence Analysis
P. umbrosa terpenoid biosynthetic pathway genes and the deduced amino acid sequences were aligned while using the Biological Sequence Alignment Editor (BioEdit). Protein molecular weight and isoelectric point values were measured using the Compute pI/Mw tool (ExPASy, http://ca.expasy. org/tools/pi_tool.html). Terpenoid biosynthesis pathway gene-specific primers were created using an online program (https://www.genscript.com/ssl-bin/app/primer) ( Table 1).

Isolation and HPLC Analysis of P. umbrosa Terpenoids
Terpenoids were analyzed with HPLC following a previously published protocol, with minor amendments [20]. The samples were freeze-dried and ground into a fine powder. Each sample (100 mg dry weight (DW)) was extracted while using 10 mL of 100% methanol (MeOH). This step was performed under vortexing for 5 min. at room temperature, and then, the sample was extracted for 1 h at 37 • C, with 1 min. of vortexing every 20 min. After centrifugation at 1000× g for 5 min, the supernatant was filtered and used for analysis. External standards were procured from Chemfaces (Wuhan, HB, China). MeOH, acetonitrile, and phosphoric acid were purchased from Wako Pure Chemical Industries (Osaka, Japan) and Junsei Chemical Co., Ltd. (Kyoto, Japan), respectively. The extract was filtered through a 0.45-µm poly-filter and then diluted two-fold with MeOH prior to HPLC analysis. Terpenoids were detected in a C18 column (250 × 4.6 mm, 5 µm; Shisido Company, Tokyo, Japan) by Agilent 1100 HPLC (Santa Clara, CA, USA), provided with a photodiode array (PDA) detector. The detection range for terpenoids was 235 nm in the chromatograms. Solvent A (100% acetonitrile) and solvent B (water/phosphoric acid at 99.5:0.5 v/v) were used as the mobile phase. The injection volume was 20 µL and the column was maintained at 30 • C. Gradient elution analysis (1 mL/min.) was accomplished by maintaining the following conditions: A 5%, B 95%, 0 min; A 17%, B 83%, 20 min; A 22%, B 78%, 30 min; A 30%, B 70%, 40 min; A 5%, B 95%, 41 min; and, A 5%, B 95%, 45 min.

Statistical Analysis
Statistical analyses were performed while using SAS Enterprise Guide 4.2 (Statistical analysis system, Cray, NC, USA, 2009) software. The data provided in this study are the mean and standard deviation of multiple replicates, with a minimum of three samples. Duncan's Multiple Range Test (DMRT) at p < 0.05 determined variations between the means.

Sequencing and Transcriptome Assembly
cDNAs prepared from root of P. umbrosa were sequenced using Illumina Hiseq platform. As a result of sequencing, 21,281,712 of total raw reads were obtained from sample. In order to facilitate sequence assembly, these reads were assembled using Velvet and Oases assembly program, resulting in 149,488 contigs with an average contig length of 1231 nt and an N50 of 1935 nt, ranging from 500 nt to >4000 nt. Furthermore, Velvet and Oases were used for assembling 75,392 unigenes with a mean size of 849 nt and an N50 of 1422 nt ( Table 2).   Figure S1). In addition, the thiamin diphosphate binding domain (GDGAMTAGQAYEAMNNAGYLDSDMIVILNDN) was also found in PuDXS [21]. The partial PuDXR had 1406 bps and 468 aa. The predicted MW was 50.98 kDa and the estimated pI value was 6.09. It showed high homology (85-90%) with DXR genes from other plants (Supplementary Figure S2). The preserved motif (CSA; PPAWPGRAV) and NADPH binding domain (GSTGSIG) were also observed in PuDXR [22]. The fractional PuCMK had 1055 bp and 351 aa, with a predicted MW of 38.02 kDa and a deduced pI of 9.14. It included an ATP binding domain (DKKVPTGAGLGGGSSNAATAL) (Supplementary Figure S3) [23]. Furthermore, fractional PuMCS had 711 bp and 236 aa, with a predicted MW of 25.05 kDa and deduced pI of 8.33. It also expressed a magnesium/manganese ion binding site (Supplementary Figure S4) [24]. The partial PuHDS had 2229 bp and 742 aa, with a predicted MW of 82.58 kDa and a deduced pI of 6.19. It produced high similarity (87-92%) with HDS genes from other plants (Supplementary Figure S5). The preserved motif (GCXVNXXGE; x being designated for any amino acid) was also observed in PuDXR [25]. The ORFs of PuHDS1 and partial PuHDS2 were 1443 and 1380 bp long, respectively, and encoded 480 and 459 aa, respectively. The predicted MWs of PuHDS1 and PuHDS2 were 54.89 and 51.85 kDa, respectively, and the deduced pI values were 6.23 and 5.62, respectively. These illustrated an N-terminal conserved domain and exhibited 78-82% homology with the HDS genes of other plants (Supplementary Figure S6) [26].
The ORFs of PuHMGR1 and partial PuHMGR2 were 1698 and 1438 bp and encoded 565 and 479 amino acids; the predicted MWs were 60.78 and 51.66 kDa, respectively. The predicted pI values were 7.50 and 5.04, respectively. These included preserved motifs (EMPVGF; IVSAVFIATGQ) and NAD(P) H-binding domains (TGDAMGMNMVS; VGTVGG) (Supplementary Figure S7) [27]. The PuMK had 1164 bp and 387 aa, with a predicted MW of 41.17 kDa and predicted pI of 5.45. It demonstrated high similarity (81-87%) with the MK genes of other plants. The protein sequence confirmed three conserved motifs (PGKIILAGEH, PLGSGLGSSAA, and KLTGAGGGGC) (Supplementary Figure S8) [28]. The PuPMK was 1515 bp long and it restrained 504 aa. The estimated MW and predicted pI value were 54.42 kDa and 5.25, respectively. It also contained three conserved motifs (GKVLMTGGY, GLGSSAA, and LLGEPGAGGS) (Supplementary Figure S9) [29]. The terminal gene (PuMVD) in the MVA pathway comprised 1269 bp and 422 aa, with a predicted MW of 46.78 kDa and predicted pI value of 6.33. It showed high homology (82-92%) with MVD genes of other plants. The preserved motifs (DRMWLNGK, RFQNCLRELR, NNFPTAAGL, and ASSAAGLAC) and an ATP binding domain (NNFPTAAGLASSAAGLAC) were also found in PuMVD (Supplementary Figure S10) [30].

Terpenoid Biosynthetic Genes Spatial Expression Profiles in of P. umbrosa
The expression profiles of PuDXS, PuDXR, PuCMK, PuMCS, PuHDS, and PuHDRs in the flower, bud, leaf, stem, and root tissues of P. umbrosa were analyzed while using qRT-PCR ( Figure 1). All the studied genes were expressed differently in different organs (i.e., flower, bud, leaf, and stem), and minimal expression profiles were noted in the root tissue. The initial enzyme PuDXS demonstrated less than 0.5-fold expression in most of the organs, but the expression level was high in the bud and stem tissues of pink flowers, as well as in the leaves of white-flowering plants. Similarly, the expression patterns of PuHMGRs, PuMK, PuPMK, and PuMVD were evaluated in different parts of both color variants while using qRT-PCR ( Figure 2). All five genes were highly expressed in the roots and stems. The expression levels were higher in pink flowers than in white flowers, and in the pink flowering plants, all genes except PuMVD showed higher gene expression profiles in the stem than the roots. PuHMGR1 showed the highest expression levels across all tissue types.

Terpenoid (Iridoid Glycosides) Analysis in Different Organs of P. umbrosa by HPLC
The terpenoids sesamoside, shanzhiside methylester, umbroside, and acteoside were quantitatively computed and analyzed in the different organs of P. umbrosa while using HPLC (Table 3). Sesamoside and shanzhiside methylester levels were the highest in the roots, umbroside was highest in the leaves, and acteoside was highest in the buds. The sesamoside contents in the roots of white flowering plants (133.78 ± 1.38 µg/mg dry weight [DW]) and the roots of pink flowering plants (116.10 ± 0.48 µg/mg DW) were 2-12-fold higher than those in other organs. The shanzhiside methylester content of the white flowering plant root (16.12 ± 0.17 µg/mgDW) was five-fold higher than that of pink flowering plant roots (3.00 ± 0.01 µg/mg DW). In particular, the umbroside content of the white flowering plant roots (11.07 ± 0.13 µg/mg DW) was similar to that of the old leaves (10.79 ± 0.04 µg/mgDW) and that of young leaves (11.37 ± 0.10 µg/mgDW). However, the umbroside content of the pink flower plant roots (12.91 ± 0.04 µg/mgDW) was 0.7-and 0.5-fold lower than those of the old leaves (18.47 ± 0.25 µg/mgDW) and young leaves (25.73 ± 0.10 µg/mgDW), respectively. The acteoside content of the white flowering plant buds (62.70 ± 0.64 µg/mgDW) and pink flowering plant buds (44.14 ± 0.92 µg/mgDW) was 2.5-32 fold greater than those of other organs. When the leaves of S. indicum and P. umbrosa were compared, the content of sesamoside was higher in P. umbrosa, but the content of shanzhiside methylester and acteoside was higher in S. indicum.

Discussion
Several reports have suggested that terpenoid biosynthetic genes are differentially expressed in different plant tissues. In a recent study, the genes OfDXS1, OfDXS2, and OfHDR1 in Osmanthus fragrans were found to have higher levels of expression in the inflorescence than in other organs [31]. In a study of Nicotiana sylvestris, the gene NsyCMS was expressed in all organs, but the level of expression was the highest in the leaves of seedlings [32]. Most of the genes in the MVA pathway of Valeriana fauriei were expressed at the highest levels in the stem [33]. In P. umbrosa, the genes PuHDR1 and PuHMGR2 were expressed in all of the tissues, but the highest levels were observed in the leaves, stems, and roots.
Gene expression levels that are related to the biosynthetic pathways of terpenoid compounds (MVA and MEP pathways) have been reported for V. fauriei [33]. However, the metabolism of terpenoids has not been reported for P. umbrosa. This study constitutes the first report on the cloning and characterization of the terpenoid biosynthetic pathway genes DXS, DXR, CMK, MCS, HDS, HDRs, HMGRs, MK, PMK, and MVD from P. umbrosa. In addition, we detected terpenoid accumulation in different organs of this species. The expression of transcripts of PuHMGR1, PuHMGR2, and PuMVD was the highest in the root tissues, where sesamoside and shanzhiside methylester were also more abundant than in other organs. However, in terms of the gene expression level, PuHMGR plays a vital role in the biosynthesis of terpenoids in the roots of P. umbrosa. The transcript levels of PuDXS, PuCMK, PuMCS, PuHDR1, and PuHDR2 were the highest in the bud and leaf tissues, and the contents of umbroside and acteoside were also the highest in those tissues. The gene expression profiles of PuHDR are believed to play an important role in terpenoid biosynthesis in the buds and leaves. Many studies have shown a relationship between the accumulation of terpenoids and the genes HDR and HMGR, in multiple species [34][35][36][37]. The contents of iridoid glycosides, such as sesamoside (0.38 ± 0.87%DW), shanzhiside methylester (0.04 ± 1.07%DW), and acteoside (0.13 ± 4.86%DW), have been reported for the leaves of Sesamum indicum [37]. When comparing the total amount of terpenoids, there were many white flower plant in the underground part and pink flower plant in the above-ground part. P. umbrosa white flower plants contained five times more Shanzhiside methyl ester than pink flower plant at the root. Therefore, it would be better to select white flower plant when breeding P. umbrosa for herbal medicine in the future. We suggest that PuHMGR and PuHDR may play key roles in terpenoid biosynthesis in P. umbrosa. Based on this result, the breeding research team intends to develop a variety with a fixed flower color.

Conclusions
The present study aimed to characterize the molecular genetics of the terpenoid biosynthetic pathway in P. umbrosa. In this study, PuDXS, PuDXR, PuCMK, PuMCS, PuHDS, PuHDR, PuHMGR, PuMK, PuPMK, and PuMVD were cloned and characterized from P. umbrosa while using NGS data. We analyzed the transcript levels of the genes using qRT-PCR and quantified the accumulated terpenoids in the various organs of P. umbrosa using HPLC. The transcript levels of PuHMGRs were the highest in the roots, where sesamoside and shanzhiside methylester content were also found at the highest level. The transcript levels of PuHDRs and contents of umbroside and acteoside were the highest in the bud and leaf tissues. Our analysis of PuHMGR and PuHDR expression profiles indicated that these genes are involved in the biosynthesis of terpenoids. These findings provide a technical framework that is based on which study of the underlying mechanism of terpenoid biosynthesis in the different organs of P. umbrosa can proceed.

Conflicts of Interest:
The authors declare no conflict of interest.