De Novo Transcriptome Analysis Reveals Putative Genes Involved in Anthraquinone Biosynthesis in Rubia yunnanensis

Rubia yunnanensis Diels (R. yunnanensis), a Chinese perennial plant, is well-known for its medicinal values such as rheumatism, contusion, and anemia. It is rich in bioactive anthraquinones, but the biosynthetic pathways of anthraquinones in R. yunnanensis remain unknown. To investigate genes involved in anthraquinone biosynthesis in R. yunnanensis, we generated a de novo transcriptome of R. yunnanensis using the Illumina HiSeq 2500 sequencing platform. A total of 636,198 transcripts were obtained, in which 140,078 transcripts were successfully annotated. A differential gene expression analysis identified 15 putative genes involved in anthraquinone biosynthesis. Additionally, the hairy roots of R. yunnanensis were treated with 200 µM Methyl Jasmonate (MeJA). The contents of six bioactive anthraquinones and gene expression levels of 15 putative genes were measured using ultra performance liquid chromatography coupled with mass spectrometry (UPLC-MS/MS) and real-time quantitative polymerase chain reaction (RT-qPCR), respectively. The results showed that the expressions levels for 11 of the 15 genes and the contents of two of six anthraquinones significantly increased by MeJA treatment. Pearson’s correlation analyses indicated that the expressions of 4 of the 15 putative genes were positively correlated with the contents of rubiquinone (Q3) and rubiquinone-3-O-β-d-xylopranosyl-(1→6)-β-d-glucopyranoside (Q20). This study reported the first de novo transcriptome of R. yunnanensis and shed light on the anthraquinone biosynthesis and genetic information for R. yunnanensis.


Functional Annotation of Transcripts
To obtain annotation, transcripts were compared with public databases using NCBI BLASTX [34].

Identification of Differentially Expressed Genes (DEGs)
Clean reads of each sample were mapped to the assembled transcripts with Bowtie2 [35]. Then, the FPKM (fragments per kilobase of per million mapped reads) for each transcript was calculated using RSEM software [36]. The FPKM quantified the expression abundance of transcripts in each sample. Last, edgeR software [37] was used to perform differential gene expression analyses based on the FPKM. When false discovery rate (FDR) ≤0.05 and an absolute value of the log2 ratio > 1.5, the transcripts were treated as DEGs.

Analysis of Putative Genes Involved in Anthraquinone Biosynthesis in the R. yunnanensis Transcriptome
The DEGs found in the R. yunnanensis transcriptome were proceeded for GO and KEGG annotation analyses. In combination with the biosynthetic pathways of higher plant anthraquinones reported in the literatures [16][17][18][19]38], we identified 15 putative genes involved in anthraquinone biosynthesis.

RT-qPCR Validation of the Gene Expression Level Quantified Using Illumina Sequencing
To validate the gene expression level quantified using RNA-seq, we selected six of the 15 putative genes, viz., the gene of acetyl-CoA acetyltransferase (AACT), the gene of osuccinylbenzoate-CoA ligase (OSBL), the gene of 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMGR), the gene of 1-deoxy-D-xylulose-5-phosphate synthase (DXS), the gene of isochorismate synthase (ICS), and the gene of isopentenyl-diphosphate delta-isomerase (IPPI), to analyze the expression levels at different tissues using RT-qPCR.
All RT-qPCR reactions were carried out on StepOnePlus ™ Real-Time PCR System using AceQ ® qPCR SYBR ® Green Master (Vazme, Nanjing, China). Each reaction contained 10 µL 2 × AceQ qPCR SYBR Green Master Mix, 0.4 µL of each primer (10 µM), 2 µL of cDNA, and 7.2 µL of ddH 2 O. The RT-qPCR was performed according to the following conditions: denaturation at 95 • C for 5 min, 40 cycles at 95 • C for 10 s, and 60 • C for 30 s. To confirm the specificity of the primers, the melting curve was generated by heating the amplicon from 60 to 95 • C. The products of all reactions were separated via electrophoresis in 1.5% agarose gels. Each PCR reaction was repeated in three biological and technical replicates. The relative quantification of gene expression was computed using the 2 −∆∆Ct method [39] and normalized to the expression of hnRNP and TBP, the two most stable genes in R. yunnanensis at the different treatments [40].

RT-qPCR Analysis for 15 Genes Involved in Anthraquinone Biosynthesis
Total RNA was isolated from hairy roots of R. yunnanensis using the RNAiso Plus reagent (Takara, Tokyo, Japan), following the manufacturer's instructions. RNA quality was tested by measuring the A260/280 ratios, and the integrity was accessed by electrophoresis on 1.5% (w/v) agarose gels. Only RNA samples with an A260/A280 ratio of 1.9-2.2 were considered as high quality for further analysis. Total RNA samples were reversely transcribed into cDNA with HiScript ® II Reverse Transcriptase (Vazyme, Nanjing, China). The cDNA was stored at −80 • C until PCR-based analyses. All RT-qPCR reactions were the same as 2.7. The relative quantification of gene expression was normalized to the expression of hnRNP. The sequences of the primers for 15 genes involved in anthraquinone biosynthesis were shown in Supplementary Table S2.

Quantitative Analysis of Six Anthraquinones and Two Naphthoquinones in the Hairy Roots of R. yunnanensis
The quantification of Q3, Q4, Q11, Q12, Q17, Q18, Q19, and Q20 was carried out using UPLC-MS/MS. The analyses were performed using a Waters ACQUITY ® UPLC H Class system (Waters, Milford, MA, USA) fitted with a Waters XEVO ® TQD system (Milford, MA, USA) equipped with electrospray ionization (ESI). The conditions for UPLC and MS analyses were performed according to our previous method [8]. The eight compounds in the samples were identified by comparing their retention time and mass spectra with the standards, which were isolated from Rubia species in our laboratory. The quantitative data were calculated based on the calibration curves of the individual standards through plotting the MS peak areas versus the corresponding concentrations of each representative standard.

Statistical Analysis
The data in the Tables and Figures were represented as the means ± standard deviations. Statistical analysis was performed through a one-way analysis of variance (ANOVA). Pearson's correlation coefficients were computed between quantitative variables using SPSS software (version 23.0). Heatmaps were produced in Heatmapper (http://www.heatmapper.ca/expression/, accessed on 25 May 2021).

Illumina Sequencing and De Novo Sequence Assembly
Total RNA was extracted from roots (R) and a mixture of stems and leaves (SL) (each three replicates), respectively. Sequencing was separately carried out for six samples using the Illumina HiSeq-2500 platform. A total of 34,612,412-41,363,450 clean reads for each sample were obtained from 36,888,628-43,747,160 raw reads by filtering and removing adapters, low quality reads, and unknown nucleotides (N > 10%) (Supplementary Table S3). All these data were characterized by Raw Q30 ≥86.46% and Raw GC contents ≥48.18%, respectively. The de novo transcriptome was assembled using Trinity [41] with all the six samples, which generated 636,198 transcripts belonging to 554,646 genes (Supplementary Table S4

Function Annotation Analysis
With the BLASTX, 17.09%, 15.45%, 16.27%, 13.71%, 6.11%, and 11.41% of the 636,198 transcripts were annotated using NR, SwissProt, KEGG, KOG, Pfam, and GO databases, respectively (Table 1). In total, 22.02 percent of transcripts were annotated. The high proportion of annotated transcripts could be attributed to lacking a reference genome of R. yunnanensis and strict parameters used in BLASTX. NR annotation revealed that Coffea canephora, Zea mays, Hordeum vulgare subsp. vulgare, and Sesamum indicum are the top species, which accounted for 28.0%, 6.34%, 3.64%, and 2.72% of the matched transcripts, respectively. We assigned the annotated transcripts to GO categories. GOs were divided into three classes, viz. biological process, cellular component, and molecular function. Within the biological process, metabolic processes were the most highly represented groups (Supplementary Figure S3). According to the KEGG annotation, the three most abundant pathways are signal transduction (7984), carbohydrate metabolism (6549), and global and overview maps (5025) (Supplementary Figure S4). Table 1. Statistical results of transcript annotation.

Ratio of Annotated Genes (%)
All

Identification of Genes Involved in Anthraquinone Biosynthesis
According to our previous work, R. yunnanensis roots (R) have higher contents of anthraquinones than that of a mixture of stems and leaves (SL) [8]. A heatmap was constructed using a clustering analysis based on the FPKM values, which showed the different gene expression patterns in R and SL samples ( Figure 1). We also conducted a differential gene expression analysis between roots and a mixture of stems and leaves using edgeR with FPKM values [37]. A total of 8811 DEGs were identified based on the criteria of FDR < 0.05 and |log 2 Fold change| > 1.5 (Supplementary Figure S5). Among the 8811 genes, 5372 were significantly up-regulated and 3439 were down-regulated in roots. The KEGG enrichment analysis showed that 113 DEGs were involved in the ubiquinone and other terpenoid-quinone pathways ( Figure 2). According to the NR annotation and literatures [16][17][18][19]38]

Validation of the Gene Expression Level
To validate the gene expression level quantified by RNA-seq, the expressions of ICS, AACT, DXS, IPPI, OSBL, and HMGR were quantified using RT-qPCR. The results indicated that the expression level quantified using RT-qPCR was consistent with RNA-seq (Supplementary Figure S6). Both of them showed that the expression levels of the six genes in the roots (R) were higher than that in a mixture of stems and leaves (SL).

Gene Expressions Levels and Anthraquinone Contents in the Hairy Roots of R. yunnanensis under MeJA Treatment
To gain further insights into anthraquinone biosynthesis, the hairy roots of R. yunnanensis were treated by MeJA for 1, 6, 12, and 24 h, respectively. Then, the expression levels of the 15 putative genes involved in anthraquinone biosynthesis and the contents of six anthraquinones and two naphthoquinones were measured using RT-qPCR and UPLC-MS/MS, respectively. All treatments were compared with the control (0 h).
The results showed that the expressions for nine of these 15 genes was up-regulated after 6

Validation of the Gene Expression Level
To validate the gene expression level quantified by RNA-seq, the expressions of ICS, AACT, DXS, IPPI, OSBL, and HMGR were quantified using RT-qPCR. The results indicated that the expression level quantified using RT-qPCR was consistent with RNA-seq (Supplementary Figure S6). Both of them showed that the expression levels of the six genes in the roots (R) were higher than that in a mixture of stems and leaves (SL). Eight quinones were found in R. yunnanensis, including two anthraquinones aglycones (Q3 and Q4), four anthraquinones glycosides (Q12, Q20, Q19, and Q11), and two naphthoquinones (Q17 and Q18). The contents of these anthraquinones were determined by UPLC-MS/MS analysis. The results showed the contents of Q4 increased by 28.01% after MeJA treatment for 24 h. The contents of Q3 increased by 23.05% at 6 h. However, the contents of Q11, Q20, Q19, and Q12 significantly decreased by 16.23%, 54.33%, 33.01%, and 33.30% for 6, 24, 12, and 24 h. The contents of Q17 and Q18 contents showed no significant difference at 0 or 24 h (Supplementary Figure S7).

Correlation Analysis between the Expression Levels of Genes Involved in Anthraquinone Biosynthesis and Anthraquinone Contents
We treated the hairy roots of R. yunnanensis with MeJA and compared the expression levels of 15 genes (ICS, AACT, SK, HMGS, OSBS, OSBL, IPPI, DXS, DXR, HMGR, MVK, SD, PMVK, ISPE, and ISPF), the contents of six anthraquinone (Q3, Q4, Q12, Q20, Q19, and Q11), and two naphthoquinones (Q17 and Q18). Our results revealed that ten genes were upregulated and five were downregulated (Figure 4). Two anthraquinones were significantly increased. Four anthraquinones and two naphthoquinones were decreased after MeJA treatment (Supplementary Figure S7). To study the expression pattern of these 15 putative genes, we performed a correlation analysis between the gene expression levels and anthraquinone contents. As shown in Figure 5, 15 genes were grouped into two clusters based on the correlation values. The the contents of Q20, Q19, and Q12 were positively correlated with the expression of genes involved in the MVA pathways, which were positioned in Cluster I, including MVK, AACT, PMVK, and HMGR.
We used a |correlation coefficient| > 0.5 and p < 0.05 as the cutoff for a significant correlation (Supplementary Table S6). The results showed a significant correlation between three anthraquinones and six genes. The expressions of HMGR, MVK, and PMVK were positively correlated with the Q20 contents, with the correlation coefficients of 0.648, 0.694, and 0.797, respectively. The expression of SD was positively correlated with the Q3 contents. There was a negative correlation between the expression levels of ISPE and ISPF genes and the Q19 contents, with the correlation coefficients of −0.671 and −0.721, respectively ( Figure 6). putative genes, we performed a correlation analysis between the gene expression levels and anthraquinone contents. As shown in Figure 5, 15 genes were grouped into two clusters based on the correlation values. The the contents of Q20, Q19, and Q12 were positively correlated with the expression of genes involved in the MVA pathways, which were positioned in Cluster I, including MVK, AACT, PMVK, and HMGR. We used a |correlation coefficient| > 0.5 and p < 0.05 as the cutoff for a significant correlation (Supplementary Table S6). The results showed a significant correlation between three anthraquinones and six genes. The expressions of HMGR, MVK, and PMVK were positively correlated with the Q20 contents, with the correlation coefficients of 0.648, 0.694, and 0.797, respectively. The expression of SD was positively correlated with the Q3 contents. There was a negative correlation between the expression levels of ISPE and ISPF genes and the Q19 contents, with the correlation coefficients of −0.671 and −0.721, respectively ( Figure 6).

Discussion
Due to its significant efficacy, R. yunnanensis is commonly used in promoting blood circulation, stretching tendons, and eliminating blood stasis. R. yunnanensis is rich in anthraquinones with some pharmacological effects [4][5][6][7]. Abiotic stress could promote the accumulation of anthraquinones in hairy roots [42][43][44]. For instance, elicitation with 200 µM MeJA increased the anthraquinones in hairy roots of Oplopanax elatus by 2.0-fold [45]. These findings indicated that MeJA can be used in hairy culture systems to increase metabolites production. In this study, when using a concentration of 200 µM MeJA, the an-

Discussion
Due to its significant efficacy, R. yunnanensis is commonly used in promoting blood circulation, stretching tendons, and eliminating blood stasis. R. yunnanensis is rich in anthraquinones with some pharmacological effects [4][5][6][7]. Abiotic stress could promote the accumulation of anthraquinones in hairy roots [42][43][44]. For instance, elicitation with 200 µM MeJA increased the anthraquinones in hairy roots of Oplopanax elatus by 2.0-fold [45]. These findings indicated that MeJA can be used in hairy culture systems to increase metabolites production. In this study, when using a concentration of 200 µM MeJA, the anthraquinone contents increased 35.11%. It supported the opinion that effective elicitors, such as MeJA, were able to improve the accumulation of anthraquinones in the hairy roots of R. yunnanensis [31].
Anthraquinones in Rubia could be biosynthesized through the combination of shikimate and MVA/MEP pathways. Several genes in biosynthetic pathways of anthraquinones have been identified from the genus Rubia, including DXS, DXR, ICS, OSBS, OSBL, and IPPI [46][47][48]. Here, we reported the first de novo transcriptome of R. yunnanensis using the Illumina HiSeq platform. Most of the genes encoding enzymes, which have been reported to be involved in the biosynthesis of the anthraquinones [17][18][19], were present in the transcriptome of R. yunnanensis. Fifteen transcripts were found in the MEP pathway (DXS, DXR, ISPF, ISPF, and IPPI), MVA pathway (HMGR, AACT, HMGS, MVK, and PMVK), and shikimate pathway (SD, SK, ICS, OSBS, and OSBL), respectively. In addition, the correlation analysis in Figure 6, showed that the expressions of HMGR, MVK, PMVK, and SD were positively correlated with the anthraquinone contents. These genes were involved in the MVA (HMGR, MVK, and PMVK) and shikimate (SD) pathways. HMGR and MVK are rate-limiting enzyme genes [49,50] and participate in a series of biochemical reactions, such as anthraquinones, triterpenes, and phytosterols synthesis. PMVK is found to catalyze a key step in isoprenoid biosynthesis, converting mevalonate 5-phosphate to mevalonate 5-diphosphate in the MVA pathway [51,52]. SD was found to catalyze shikimate synthesis, representing the fourth step of the shikimate pathway, a conserved biosynthetic route commonly in plants, fungi, and bacteria [53,54]. It was also reported that the expressions of shikimate pathway genes responded to various types of elicitors [55]. Our results also showed that the expression level of SD was up-regulated at 6 h, possibly responding to the MeJA elicitor.
In addition, we found the contents of anthraquinones were increased after MeJA treatment, while two naphthoquinones (Q17 and Q18) showed no significant difference. These results were consistent with the cell suspension of Rubia cordifolia [56]. The biosynthetic pathways of anthraquinones and naphthoquinones are both derived from 1, 4-dihydroxy-2-naphthoic acid (DHNA) synthesized in the shikimate pathway, but branching occurs after the prenylation with an isopentenyl moiety IPP [18] (Figure 3). The IPP origin was synthesized from the MVA and MEP pathways. In our study, the genes in the MVA (HMGR, MVK, and PMVK) positively correlated with the anthraquinone contents, which may imply that MeJA mainly regulated the MVA pathway through these three genes to synthesize IPP for the accumulation of anthraquinones. However, the synthesis of IPP of naphthoquinone (Q17) appeared to be mainly from the MEP pathway [56]. Moreover, the function of the genes of the MEP pathway is influenced by the light [57,58], while the MVA pathway is negatively regulated by light [59]. In our study, the hairy root culture of R. yunnanensis was in darkness, which may promote the regulation of the MVA pathway leading to the increase of anthraquinones. These findings implied MVA pathway may play an important role in the biosynthesis of anthraquinones in R. yunnanensis, especially the four genes HMGR, MVK, PMVK, and SD, which could contribute to the production of anthraquinones. However, the integration and regulation of four targeted genes may need further investigation by using CRISPR activation/interference [60,61].
In the late stage of anthraquinone biosynthesis, the backbone of anthraquinones experiences various modifications, such as SAM-dependent methyltransferases [62], Cytochrome P450 (CYPs) [63], and UDP-glucose glucosyltransferases (UDPGs). These modification enzymes play different roles in anthraquinones' metabolism. For example, CYPs could catalyze most of the oxidative reactions, including epoxidation, dealkylation, dehydration, and carbon-carbon bond cleavage of anthraquinones [64]. UDPGs catalyzed glycosylation at the site of the hydroxyl group to produce glycosylated anthraquinones [20]. In this study, 18 and 2 transcripts were identified for the first time as CYPs and UDPGs, respectively, from the R. yunnanensis transcriptome, which could contribute to a better understanding of the anthraquinone biosynthesis.

Conclusions
R. yunnanensis is rich in bioactive anthraquinones, while little is known regarding its genes in the anthraquinone biosynthesis. In this study, we reported the first de novo assembled transcriptome and identified 15 putative genes involved in the anthraquinone biosynthesis pathway in R. yunnanensis. Four of these genes (HMGR, MVK, PMVK, and SD) exhibited a high positive correlation in the anthraquinone contents. This study has a potential to discover putative genes in anthraquinone biosynthesis pathways in the Rubia plants.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/genes13030521/s1, Figure S1: Chemical structures of six anthraquinones and two naphthoquinones isolated from R. yunnanensis, Figure S2: Length distribution of the assembled R. yunnanensis transcripts, Figure S3: GO classification of R. yunnanensis assembled transcripts, Figure S4: KEGG classification of assembled transcripts from R. yunnanensis, Figure S5: Number of differentially expressed genes (DEGs) in the samples between roots (R) and a mixture of stems and leaves (SL) of R. yunnanensis, Figure S6: RT-qPCR validation of the expression levels of six putative key genes involved in anthraquinone biosynthesis pathways [40], Figure S7: Contents of anthraquinones and naphthoquinones in the hairy roots of R. yunnanensis after 1, 6, 12, and 24 h's MeJA treatment, Table S1: RNA in roots (R), a mixture of stems and leaves (SL) of R. yunnanensis, Table S2. The sequences of the primers for 15 genes involved in anthraquinone biosynthesis, Table S3: A summary of raw data, Table S4: Statistics of transcript assembly, Table S5: Transcripts associated with ubiquinone and terpenoid-quinone biosynthesis according to the KEGG pathway mapping, Table S6: Correlation values between 15 genes involved in the anthraquinone biosynthesis and the contents of six anthraquinones and two naphthoquinones.