Molecular Regulation of Catalpol and Acteoside Accumulation in Radial Striation and non-Radial Striation of Rehmannia glutinosa Tuberous Root

Rehmannia glutinosa L., a perennial plant of Scrophulariaceae, is one of the most commonly used herbs in traditional Chinese medicine (TCM) that have been widely cultivated in China. However, to date, the biosynthetic pathway of its two quality-control components, catalpol and acteoside, are only partially elucidated and the mechanism for their tissue-specific accumulation remains unknown. To facilitate the basic understanding of the key genes and transcriptional regulators involved in the biosynthesis of catalpol and acteoside, transcriptome sequencing of radial striation (RS) and non-radial striation (nRS) from four R. glutinosa cultivars was performed. A total of 715,158,202 (~107.27 Gb) high quality reads obtained using paired-end Illumina sequencing were de novo assembled into 150,405 transcripts. Functional annotation with multiple public databases identified 155 and 223 unigenes involved in catalpol and acteoside biosynthesis, together with 325 UGTs, and important transcription factor (TF) families. Comparative analysis of the transcriptomes identified 362 unigenes, found to be differentially expressed in all RS vs. nRS comparisons, with 143 upregulated unigenes, including those encoding enzymes of the catalpol and acteoside biosynthetic pathway, such as geranyl diphosphate synthase (RgGPPS), geraniol 8-hydroxylase (RgG10H), and phenylalanine ammonia-lyase (RgPAL). Other differentially expressed unigenes predicted to be related to catalpol and acteoside biosynthesis fall into UDP-dependent glycosyltransferases (UGTs), as well as transcription factors. In addition, 16 differentially expressed genes were selectively confirmed by real-time PCR. In conclusion, a large unigene dataset of R. glutinosa generated in the current study will serve as a resource for the identification of potential candidate genes for investigation of the tuberous root development and biosynthesis of active components.


Introduction
Rehmannia glutinosa L. is a perennial herb of Scrophulariaceae and is one of the most commonly used herbs in traditional Chinese medicine (TCM) that have been widely cultivated in China. It was recorded in the Chinese medical classic "Shennong's Herba" and was thought to be a "top-grade" Different cultivars of R. glutinosa have varying medicinal ingredients, and the content presents typical tissue characteristics [12]. In previous studies, we found that there are significantly different medicinal ingredients between radial striation and non-radial striation [13]. By observing the crosssectional size of the four R. glutinosa cultivars, it was found that the proportions of radial striation in 1706 and Wen 85-5) (W85) were larger, and they were smaller in BJ1 and QH1 ( Figure 1A). The fingerprint of R. glutinosa was then constructed, and the peak of sterol was highest. Furthermore, the content of some chemical components was different in the two parts of tuberous root ( Figure 1B). The determination result showed that catalpol and total iridoid glycoside levels were higher in the nonradial striation of BJ1 and QH1 in the middle period of the tuber enlargement ( Figure 1C). Simultaneously, acteoside and total phenylethanoid glycoside levels in 1706, BJ1, and QH1 were higher in non-radial striation than those in radial striation, especially in BJ1 and QH1 ( Figure 1D).
In the later period of tuber enlargement, the amount of catalpol in non-radial striation was higher in both BJ1 and W85, and the amount of acteoside in non-radial striation was higher than in radial striation in all four cultivars ( Figure S2).
This indicates that the catalpol and other iridoid glycosides are not tissue-specific in some R. glutinosa cultivars, while the phenylethanoid glycosides such as acteoside are mainly distributed in non-radial striation. These data can be used to identify candidate genes involved in catalpol and acteoside biosynthesis. HPLC chromatographic fierprints in radial striation and non-radial striation of tuberous roots, and monitored at 205 nm, the green arrow and red arrow point to peaks of catalpol and acteoside, respectively; (C) the contents of catalpol and total iridoid glycosides in radial striation and non-radial striation; (D) the contents of acteoside and total phenylethanoid glycosides in radial striation and non-radial striation. Data shows the average values ± SE of three independent experiments. * indicate significant difference at p < 0.05. Abbreviations: RS, radial striation; nRS, non-radial striation. The same as follows. HPLC chromatographic fierprints in radial striation and non-radial striation of tuberous roots, and monitored at 205 nm, the green arrow and red arrow point to peaks of catalpol and acteoside, respectively; (C) the contents of catalpol and total iridoid glycosides in radial striation and non-radial striation; (D) the contents of acteoside and total phenylethanoid glycosides in radial striation and non-radial striation. Data shows the average values ± SE of three independent experiments. * indicate significant difference at p < 0.05. Abbreviations: RS, radial striation; nRS, non-radial striation. The same as follows.

R. glutinosa Transcriptome Sequencing and Unigene Assembly
In the later period of tuber enlargement, the amount of catalpol in non-radial striation was higher in both BJ1 and W85, and the amount of acteoside in non-radial striation was higher than in radial striation in all four cultivars ( Figure S2). This indicates that the catalpol and other iridoid glycosides are not tissue-specific in some R. glutinosa cultivars, while the phenylethanoid glycosides such as acteoside are mainly distributed in non-radial striation. These data can be used to identify candidate genes involved in catalpol and acteoside biosynthesis.

R. glutinosa Transcriptome Sequencing and Unigene Assembly
For a comprehensive overview of the gene expressing profiles in R. glutinosa radial striation and non-radial striation tissues, radial striation and non-radial striation cDNA libraries were constructed, respectively, from the four R. glutinosa cultivars, 1706, BJ1, QH1, and W85, and sequenced by the Illumina TM HiSeq 4000 platform in our experiments. The paired-end (PE) sequencing of 24 different libraries resulted in 859,849,100 raw reads, ranging from 33.75 to 37.56 million for each library. Quality filtering after the removal of adaptor sequences, as well as ambiguous and low quality reads, resulted in 715,158,202 high quality reads being obtained (Table 1). After assembling all samples together and filtering the abundance, we obtained 150,405 unigenes. The total length, average length, N50, and GC content of unigenes are 187,170,349 bp, 1,244 bp, 1,974 bp, and 40.87%, respectively (Table S2).  (Figure 2A and Table S3). To study the sequence conservation of R. glutinosa in other plant species, we analyzed the species distribution of the All-Unigene datasets by aligning sequences against the NR database. The result shows that 75.26% of the distinct sequences have top matches (first hit) with the sequences from Sesamum indicum, followed by Coffea canephora (2.46%), Vitis vinifera (2.2%), Nicotiana sylvestris (1.55%), and N. tomentosiformis (1.25%) ( Figure 2B). TransDecoder software was used to screen the candidate coding region among the unigenes. The longest ORF (open reading frame) was selected and blast with the SwissProt and Hmmscan database to search for the Pfam protein homology sequence for the prediction of CDS (Coding DNA sequences). In total, more than 102,764 CDS were predicted from All-Unigene sequences (Table S4).
Taken together, All-Unigene represented by far the most comprehensive transcriptome database of R. glutinosa, and the dataset was used as the reference transcriptome database in the gene expression analysis.

Expression Analysis for Radial Striation and Non-Radial Striation of Four R. glutinosa Cultivars
In order to obtain the gene expression profiles of each individual tissue, reads from different tissues were mapped to the obtained non-redundant unigenes. According to the FPKM (fragments per kb per million fragments) values, the mappable reads were used to estimate the transcription levels. Among 24 samples of R. glutinosa, more than 60.0% of unigenes both in radial striation and non-radial striation from 1706 and W85 had FPKM values greater than 1, showing a higher number of transcriptionally active unigenes, while radial striation and non-radial striation from BJ1 and QH1 showed a lower number ( Figure 3A). However, expression levels of unigenes from BJ1 and QH1 samples showed a higher degree of dispersion than those from the 1706 and W85 samples ( Figure  3B). TransDecoder software was used to screen the candidate coding region among the unigenes. The longest ORF (open reading frame) was selected and blast with the SwissProt and Hmmscan database to search for the Pfam protein homology sequence for the prediction of CDS (Coding DNA sequences). In total, more than 102,764 CDS were predicted from All-Unigene sequences (Table S4).
Taken together, All-Unigene represented by far the most comprehensive transcriptome database of R. glutinosa, and the dataset was used as the reference transcriptome database in the gene expression analysis.

Expression Analysis for Radial Striation and Non-Radial Striation of Four R. glutinosa Cultivars
In order to obtain the gene expression profiles of each individual tissue, reads from different tissues were mapped to the obtained non-redundant unigenes. According to the FPKM (fragments per kb per million fragments) values, the mappable reads were used to estimate the transcription levels. Among 24 samples of R. glutinosa, more than 60.0% of unigenes both in radial striation and non-radial striation from 1706 and W85 had FPKM values greater than 1, showing a higher number of transcriptionally active unigenes, while radial striation and non-radial striation from BJ1 and QH1 showed a lower number ( Figure 3A). However, expression levels of unigenes from BJ1 and QH1 samples showed a higher degree of dispersion than those from the 1706 and W85 samples ( Figure 3B). In order to analyze the global similarities and differences among tissues and cultivars of R. glutinosa, PCA and box-whisker plots were conducted. The first four principal components (PCs) accounted for 82.37% of the variation ( Figure S3). The 24 samples were divided into different groups according to the PCA loading plot, and six replicates from the same cultivar clustered closely in the same region. However, the radial striation and non-radial striation of the same cultivar were not clustered together ( Figure S3). This may be because radial striations and non-radial striations both belong to R. glutinosa tuberous root. Most samples from the All-Unigenes database showed expression patterns similar to these tissues.
To identify R. glutinosa genes that were significantly up-or downregulated in non-radial striation compared to corresponding radial striation controls, DeSeq2 was used to generate log2 fold change values for each gene. The unigenes that had at least a two-fold change with an Adjusted Pvalue ≤ 0.05 were screened and taken as differentially expressed transcripts (DETs). It was interesting that 1706_RS vs. 1706_nRS comparisons had the most differentially expressed unigenes: 3247 in total, with 1789 upregulated genes and 1458 downregulated genes. Only 1166 DETs were determined between W85_RS vs. W85_nRS, with 533 upregulated genes and 633 downregulated unigenes ( Figure  4A). In total, 362 unigenes were found to be differentially expressed in all comparisons. The tissuespecific expression of these unigenes revealed that only 1660, 864, 321, and 241 genes from 1706, BJ1, QH1, and W85, respectively, were found to be differentially expressed in RS vs. nRS comparisons ( Figure 4B).
In detail, 143 common upregulated genes were found among the four comparisons, including the phenylalanine ammonialyase gene (PAL, CL2036.Contig6), the geraniol 8-hydroxylase gene (G10H, CL1629.Contig4), and the bHLH transcription factor gene (CL15084.Contig2) ( Figure S4A; Table S5). In contrast, up to 219 common downregulated genes were found in the same comparisons, including the auxin-responsive protein gene (ARF, CL15166.Contig1), the aluminum-activated malate transporter gene (ALMT, Unigene23223), and the ABC transporter B family member gene (ABCG, CL15718.Contig25) ( Figure S4B; Table S6). In order to analyze the global similarities and differences among tissues and cultivars of R. glutinosa, PCA and box-whisker plots were conducted. The first four principal components (PCs) accounted for 82.37% of the variation ( Figure S3). The 24 samples were divided into different groups according to the PCA loading plot, and six replicates from the same cultivar clustered closely in the same region. However, the radial striation and non-radial striation of the same cultivar were not clustered together ( Figure S3). This may be because radial striations and non-radial striations both belong to R. glutinosa tuberous root. Most samples from the All-Unigenes database showed expression patterns similar to these tissues.
To identify R. glutinosa genes that were significantly up-or downregulated in non-radial striation compared to corresponding radial striation controls, DeSeq2 was used to generate log 2 fold change values for each gene. The unigenes that had at least a two-fold change with an Adjusted p-value ≤ 0.05 were screened and taken as differentially expressed transcripts (DETs). It was interesting that 1706_RS vs. 1706_nRS comparisons had the most differentially expressed unigenes: 3247 in total, with 1789 upregulated genes and 1458 downregulated genes. Only 1166 DETs were determined between W85_RS vs. W85_nRS, with 533 upregulated genes and 633 downregulated unigenes ( Figure 4A). In total, 362 unigenes were found to be differentially expressed in all comparisons. The tissue-specific expression of these unigenes revealed that only 1660, 864, 321, and 241 genes from 1706, BJ1, QH1, and W85, respectively, were found to be differentially expressed in RS vs. nRS comparisons ( Figure 4B).
In detail, 143 common upregulated genes were found among the four comparisons, including the phenylalanine ammonialyase gene (PAL, CL2036.Contig6), the geraniol 8-hydroxylase gene (G10H, CL1629.Contig4), and the bHLH transcription factor gene (CL15084.Contig2) ( Figure S4A; Table S5). In contrast, up to 219 common downregulated genes were found in the same comparisons, including the auxin-responsive protein gene (ARF, CL15166.Contig1), the aluminum-activated malate transporter gene (ALMT, Unigene23223), and the ABC transporter B family member gene (ABCG, CL15718.Contig25) ( Figure S4B; Table S6). To further identify the functions of the DETs between radial striation and non-radial striation libraries, the KEGG pathway classification was carried out for the functional enrichment of the unigenes. Among these DETs with a KEGG pathway annotation, 2531, 1814, 1044, and 910 DETs were identified in the RS vs. nRS libraries from 1706, BJ1, QH1, and W85, which mapped onto 126, 120, 116, and 112 KEGG pathways (Table S7). To determine whether genes involved in secondary metabolites were enriched, the KEGG pathway database was searched using the DETs to reveal the top 20 significantly enriched pathways. The results showed that 413, 309, 195, and 156 DETs were related to the "biosynthesis of secondary metabolites." Furthermore, we also noticed that "phenylpropanoid biosynthesis" was the first enriched pathway in all comparisons, the numbers of DETs involved in phenylpropanoid biosynthesis were 148, 134, 95, and 71, respectively. The DETs related to "sesquiterpenoid and triterpenoid biosynthesis" and "diterpenoid biosynthesis" had also been enriched ( Figure 5 and Table S7). Beyond that, 22, 12, 10, and 9 DETs were enriched in "terpenoid backbone biosynthesis" (Table S7). The pathway analysis of the DETs between radial striation and non-radial striation libraries could provide valuable information for the following research on the molecular mechanisms of catalpol and acteoside biosynthesis in R. glutinosa. To further identify the functions of the DETs between radial striation and non-radial striation libraries, the KEGG pathway classification was carried out for the functional enrichment of the unigenes. Among these DETs with a KEGG pathway annotation, 2531, 1814, 1044, and 910 DETs were identified in the RS vs. nRS libraries from 1706, BJ1, QH1, and W85, which mapped onto 126, 120, 116, and 112 KEGG pathways (Table S7). To determine whether genes involved in secondary metabolites were enriched, the KEGG pathway database was searched using the DETs to reveal the top 20 significantly enriched pathways. The results showed that 413, 309, 195, and 156 DETs were related to the "biosynthesis of secondary metabolites." Furthermore, we also noticed that "phenylpropanoid biosynthesis" was the first enriched pathway in all comparisons, the numbers of DETs involved in phenylpropanoid biosynthesis were 148, 134, 95, and 71, respectively. The DETs related to "sesquiterpenoid and triterpenoid biosynthesis" and "diterpenoid biosynthesis" had also been enriched ( Figure 5 and Table S7). Beyond that, 22, 12, 10, and 9 DETs were enriched in "terpenoid backbone biosynthesis" (Table S7). The pathway analysis of the DETs between radial striation and non-radial striation libraries could provide valuable information for the following research on the molecular mechanisms of catalpol and acteoside biosynthesis in R. glutinosa.

Genes Related to Catalpol Biosynthesis in R. glutinosa
Based on precursor feeding experiments, researchers established the pathway of catalpol biosynthesis and inferred that catalpol and other iridoid glycosides were generated from geraniol, which was produced through a combined biosynthetic route involving non-mevalonate (MEP) and mevalonate (MVA) pathways [19,20]. Shitiz et al. reported that the complete biosynthetic pathway of catalpol has been deciphered for all possible intermediates with their corresponding enzymes in Picrorhiza kurroa ( Figure 6A) [21]. The putative genes encoding enzymes in the upstream of the iridoid pathway, including geraniol synthase (GES), geraniol 10-hydroxylase (G10H), cytochrome P450 reductase (CPR), and 10-hydroxygeraniol oxidoreductase (10HGO), have been identified in R. glutinosa [16]. However, the genes encoding enzymes in the downstream of R. glutinosa catalpol biosynthesis are still unknown.

Genes Related to Catalpol Biosynthesis in R. glutinosa
Based on precursor feeding experiments, researchers established the pathway of catalpol biosynthesis and inferred that catalpol and other iridoid glycosides were generated from geraniol, which was produced through a combined biosynthetic route involving non-mevalonate (MEP) and mevalonate (MVA) pathways [19,20]. Shitiz et al. reported that the complete biosynthetic pathway of catalpol has been deciphered for all possible intermediates with their corresponding enzymes in Picrorhiza kurroa ( Figure 6A) [21]. The putative genes encoding enzymes in the upstream of the iridoid pathway, including geraniol synthase (GES), geraniol 10-hydroxylase (G10H), cytochrome P450 reductase (CPR), and 10-hydroxygeraniol oxidoreductase (10HGO), have been identified in R. glutinosa [16]. However, the genes encoding enzymes in the downstream of R. glutinosa catalpol biosynthesis are still unknown.

UDP-Dependent Glycosyltransferases (UGT) Gene
UDP-dependent glycosyltransferases (UGTs) are pivotal in the process of glycosylation for decorating iridoid glycoside and phenylethanoid glycosides with sugars. Phenylpropanoids are usually glycosylated at the 3-O or 7-O position, which serves to increase stability and solubility, as well as decrease the potential toxicity of the aglycone precursor [23]. UGTs belong to the largest family (Family 1) of the glycosyltransferase superfamily, and in Arabidopsis thaliana, more than 100 UGTs encoding genes have been identified [24,25].
In this research, a total of 325 UGTs were found in the R. glutinosa transcriptome. Comparative transcriptome analysis revealed that 32, 25, 12, and 9 unigenes were differentially expressed in nonradial striations compared with radial striations in 1706, BJ1, QH1, and W85. Among them, 27, 23, 11, and 9 were upregulated genes, and only several genes were downregulated. Three members of UGT genes, including CL3773.Contig3, CL5268.Contig2, and Unigene12367, were upregulated in non-radial striations for every cultivar (Table S10, Figure 8A). Using the deduced amino acid sequences of the

UDP-Dependent Glycosyltransferases (UGT) Gene
UDP-dependent glycosyltransferases (UGTs) are pivotal in the process of glycosylation for decorating iridoid glycoside and phenylethanoid glycosides with sugars. Phenylpropanoids are usually glycosylated at the 3-O or 7-O position, which serves to increase stability and solubility, as well as decrease the potential toxicity of the aglycone precursor [23]. UGTs belong to the largest family (Family 1) of the glycosyltransferase superfamily, and in Arabidopsis thaliana, more than 100 UGTs encoding genes have been identified [24,25].
In this research, a total of 325 UGTs were found in the R. glutinosa transcriptome. Comparative transcriptome analysis revealed that 32, 25, 12, and 9 unigenes were differentially expressed in non-radial striations compared with radial striations in 1706, BJ1, QH1, and W85. Among them, 27, 23, 11, and 9 were upregulated genes, and only several genes were downregulated. Three members of UGT genes, including CL3773.Contig3, CL5268.Contig2, and Unigene12367, were upregulated in non-radial striations for every cultivar (Table S10, Figure 8A). Using the deduced amino acid sequences of the 18 R. glutinosa UGT genes, multiple amino acid sequence alignment was conducted. The sequence alignment indicated that there is a conserved domain with the sequence of WAP-H-GWNS-E-DQ ( Figure S5), which has been reported as a binding domain. Interestingly, homologous comparison analysis found that some UGT genes possessed higher sequence similarities and shared similar expression patterns ( Figure 8B). For example, CL10318.Contig4 (UGT85A2) and CL11397.Contig2 (UGT85A1) were clustered into the same group, their expression was significantly upregulated in the non-radial striation of BJ1, and their Log 2 Ratio(nRS/RS) values were greater than 1 in QH1 (Table S10). The qRT-PCR analysis indicated that CL10318.Contig4 and CL11397.Contig2 have extremely similar expression patterns in detected tissues ( Figure 8C). It can be seen from the transcriptome and qRT-PCR analysis that most of the differentially expressed UGT genes showed higher expression levels in non-radial striations of R. glutinosa tuberous roots, which provides strong evidence of the molecular mechanisms of active component accumulation.  Figure S5), which has been reported as a binding domain. Interestingly, homologous comparison analysis found that some UGT genes possessed higher sequence similarities and shared similar expression patterns ( Figure 8B). For example, CL10318.Contig4 (UGT85A2) and CL11397.Contig2 (UGT85A1) were clustered into the same group, their expression was significantly upregulated in the non-radial striation of BJ1, and their Log2 Ratio(nRS/RS) values were greater than 1 in QH1 (Table  S10). The qRT-PCR analysis indicated that CL10318.Contig4 and CL11397.Contig2 have extremely similar expression patterns in detected tissues ( Figure 8C). It can be seen from the transcriptome and qRT-PCR analysis that most of the differentially expressed UGT genes showed higher expression levels in non-radial striations of R. glutinosa tuberous roots, which provides strong evidence of the molecular mechanisms of active component accumulation.

Transcription Factor Genes
Transcription factors (TFs) were not only involved in a wide range of plant growth, development, and response to environmental stresses, but also involved in the regulation of the biosynthesis of active ingredients in medicinal plants [26][27][28]. To identify transcription factors involved in the regulation of catalpol and acteoside accumulation in tuberous root, we analyzed transcription factor genes, whose expression was positively or negatively related to catalpol and acteoside content. A total of 4461 unigenes encoding TFs were classified into 59 TF families ( Figure S6). The largest gene family was the MYB family, followed by the MYB-related family, the C3H family, the AP2-EREBP family, the bHLH family, the WRKY family, and the ARF family of transcription factors.

Discussion
The tuberous roots of R. glutinosa are derived from the swelling of fibrous roots, which are developed from adventitious roots [33]. It is traditionally believed that each tuberous root consists of two parts: radial striation and non-radial striation [13]. The radial striation is mainly composed of a xylem formed by the proliferation of a large number of parenchyma cells, with a small number of secretory cells. The non-radial striation mainly consists of phloem. The outside has wooden bolt cells and the inside arranges loosely to the parenchyma cells, dispersiving most secretory cells contained in the orange red oil drop, which occasionally has stone cells. Studies have shown that the survival of all vascular plants depends on the xylem and phloem, which comprise a hydraulically coupled tissue system that transports metabolites [34]. In the ancient literature of traditional Chinese medicine (TCM), the radial striation characteristic of R. glutinosa tuberous roots is a justification standard for genuine medicinal material. Our previous studies revealed that the active components in the radial striation and non-radial striation of R. glutinosa tuberous root were significantly different among different cultivars [35]. However, to date, the molecular regulation mechanisms of quality forming in radial striation and non-radial striation of R. glutinosa tuberous root have not been reported.

Discussion
The tuberous roots of R. glutinosa are derived from the swelling of fibrous roots, which are developed from adventitious roots [33]. It is traditionally believed that each tuberous root consists of two parts: radial striation and non-radial striation [13]. The radial striation is mainly composed of a xylem formed by the proliferation of a large number of parenchyma cells, with a small number of secretory cells. The non-radial striation mainly consists of phloem. The outside has wooden bolt cells and the inside arranges loosely to the parenchyma cells, dispersiving most secretory cells contained in the orange red oil drop, which occasionally has stone cells. Studies have shown that the survival of all vascular plants depends on the xylem and phloem, which comprise a hydraulically coupled tissue system that transports metabolites [34]. In the ancient literature of traditional Chinese medicine (TCM), the radial striation characteristic of R. glutinosa tuberous roots is a justification standard for genuine medicinal material. Our previous studies revealed that the active components in the radial striation and non-radial striation of R. glutinosa tuberous root were significantly different among different cultivars [35]. However, to date, the molecular regulation mechanisms of quality forming in radial striation and non-radial striation of R. glutinosa tuberous root have not been reported.
Chemical composition is an important factor affecting the quality of medicinal materials. However, it is not uniformly distributed in the medicinal materials, as well as the expression of related biosynthesis genes, which all have obvious tissue specificity, such as Radix polygalae [36], Salvia miltiorrhiza [8], Lonicera japonica [37], and Radix paeoniae [38]. Therefore, with the rapid development of high throughput sequencing technology, transcriptome sequencing has become an effective method to identify the mechanisms of active component accumulation in medicinal plants.
R. glutinosa was one of the earliest medicinal plants to be utilized in high throughput sequencing technology to study the molecular regulation mechanisms of growth and development, stress response, and so on. Using a Solexa DNA sequencing platform, 29 conserved and three novel R. glutinosa miRNAs were differentially expressed between the first year of crop (FP) and the second year of crop (SP) plants [39]. Transcriptome was used to identify R. glutinosa miRNAs and their targets related to replanting disease [40]. Expression profiling identified a total of 6794 differentially expressed unigenes among R. glutinosa adventitious roots (ARs), thickening adventitious roots (TARs), and developing tuberous roots (DTRs), which provided insights into the molecular mechanisms underlying tuberous root development [41]. However, the number of unigene of R. gutinosa varied in different experiments. The quality of reference transcriptome may significantly impact the accuracy of expression profile analysis. The power to detect a transcript depends on its length and abundance in the sequencing library, which is affected by library preparation [42]. Li et al. initially obtained 94,544 and 99,708 unigenes from leaf and root transcript libraries, respectively [40]. They then collected 10 root samples from SP (five stages) and FP (five stages) and mixed them together to construct a root transcriptome library, collected 10 leaf samples to construct a leaf transcriptome library, and obtained 71,922 and 54,722 unigenes from the mixed root and mixed leaf libraries via the Illumina HiSeq ™ 2500 system, respectively [43]. Sun et al. generated a total of 81,003 unique sequences from the cDNA library constructed from the ARs, TARs, and DTRs of R. glutinosa by using the Illumina sequencing platform [41]. In this study, we obtained 38,678~54,378 unigenes from the 24 cDNA libraries of radial striations and non-radial striations of R. glutinosa tuberous roots and eventually assembled a final set of 150,405 entries with an average length of 1244 bp higher than mixed roots unigenes (383 bp and 678 bp) [17,41] and mixed leaf and root unigenes (868 bp) [43]. This novel unigene database provides a high-quality reference transcriptome on gene expression detection and molecular mechanisms for R. glutinosa.
Gene expression analysis has been extensively utilized for the identification of enzyme genes involved in secondary metabolites biosynthesis by measuring transcriptional levels in different tissues and developmental stages. Phytochemical analysis of the peeled Salvia miltiorrhiza root tissues demonstrated the localization of tanshinones in the periderm, and transcriptome analysis revealed that the two key enzyme coding genes SmCPS1 and SmKSL1 were specifically expressed in the periderm [7]. Lithospermic acid B accumulates in the phloem and xylem of S. miltiorrhiza roots, in agreement with the expression patterns of the identified key genes, including SmRAS, SmC4H1, and SmCYP98A78, related to rosmarinic acid biosynthesis [8]. We identified 155 transcripts (encoding 14 enzymes) involved in the iridoid glycoside biosynthesis pathway from this R. glutinosa unigene database. The expression patterns of Unigene66333 (IPI) and Unigene34693 (CPM) could match the change in catalpol content and were considered to be the better candidate genes for catalpol biosynthesis. Among the 223 unigenes encoding nine enzymes involved in acteoside biosynthesis, the expression levels of CL7017.Contig3 (PPO) and CL4500.Contig3 (TyDC) in different tuberous root tissues were similar to content changes in acteoside and total phenylethanoid glycosides. Furthermore, CL4500.Contig3 (TyDC) was considered to be the best candidate for acteoside synthesis. Glycosyltransferases (GTs) comprise a ubiquitous family of enzymes that catalyze the transfer of a sugar moiety from an activated donor molecule onto saccharide or non-saccharide acceptors, resulting in the formation of various glycosides of non-carbohydrate moieties [44], which play important roles in acteoside and catalpol biosynthesis. In total, 325 UGTs were identified in this report. Transcriptome analysis showed that three members of UGTs, including CL3773.Contig3, CL5268.Contig2, and Unigene12367, were upregulated in non-radial striations for every cultivar. Furthermore, CL10318.Contig4 and CL11397.Contig2 indicated higher expression levels in non-radial striation of BJ1 and QH, which is consistent with the change in catalpol accumulation, possibly involved in catalpol synthesis.
Over the past decade, TFs have been key regulators in controlling gene expression by binding to the promoter of single or multiple genes. Many experiments have confirmed that TFs are involved in the biosynthesis of various secondary metabolites in plants [30,32,45,46]. Two bHLH family genes, TSAR1 (Triterpene Saponin biosynthesis Activating Regulator1) and TSAR2, regulate triterpene saponin biosynthesis in Medicago truncatula [47]. Overexpression and RNAi transient assays revealed that the two MYB transcription factors (bMYB10b and PbMYB9) regulate flavonoid biosynthesis in Pyrus bretschneideri fruit [30]. Both the qRT-PCR and protein interaction network results showed that four bHLHs (FabHLH25, FabHLH29, FabHLH80, and FabHLH98) are responsive to the fruit anthocyanin biosynthesis in strawberries [32]. A total of 4461 transcripts encoding TFs were identified in our R. glutinosa novel transcriptome database. Transcriptome analysis revealed that many differentially expressed TF genes showed a higher expression in non-radial striations of R. glutinosa tuberous roots. Among them, a member of the ERF family, Unigene21326, possibly the key regulator in acteoside biosynthesis, was found to have an expression that was significantly upregulated based on an RS vs. nRS comparison for R. glutinosa 1706, BJ1, and QH1. All materials were sowed in mid-April 2017 by planting a double line at the width ridge. The height, width, and spacing of ridges were 15, 40, and 30 cm, respectively. In addition, all plants of the four cultivars were managed uniformly. Then, samples were taken on September 30, and the radial striation was separated from the non-radial striation by scalpels. In addition, samples used for transcriptome sequencing were stored at −80 • C, and the materials for analysis of the quality characteristics were placed in a desiccator after drying and powdering.

Analysis of Quality Characteristics of Radial Striation and Non-Radial Striation of R. glutinosa
The HPLC fingerprint method established by the research group was used to analyze the samples [13]. The catalpol and acteoside content in the radial striation and non-radial striation of R. glutinosa tuberous root was determined by referring to the method of Chinese Pharmacopoeia (2015 edition). Total phenylethanoid glycoside and iridoid glycoside content was determined with a UV spectrophotometer, with reference to the method established by the previous studies [11,48].

RNA Isolation and Illumina Sequencing
Total RNA was extracted from the three replicates for the radial striation and non-radial striation of R. glutinosa tuberous roots with TRIzol reagent, with reference to the manufacturer's instructions. RNA quality and quantity were determined with a Nanodrop TM 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and a Bioanalyzer 2100 (Agilent, California, USA). A total of 24 RNA samples were of sufficient quality with RNA integrity number (RIN) values ranging from 8.0 to 9.5. Then, they were converted to cDNA by reverse transcription, and the cDNA library was sequenced using an Illumina Hiseq TM 4000 system from BGI-Tech (Shenzhen, China; Project ID: F17FTSCCKF3479).

Sequence Assembly and Annotation
The data obtained by sequencing is known as raw reads or raw data. Quality reads are performed on the raw reads to determine whether the sequencing data is suitable for subsequent analysis. Fragments were assembled using Trinity software after QC was qualified [49]. The assembled Unigene was compared with the seven databases, nr (http://www.ncbi.nlm.nih.gov/, NCBI non-redundant protein sequences) (access on 9 November 2017), nt (NCBI nucleotide sequences), Pfam (Protein family, http://pfam.sanger.ac.uk/) (access on 9 November 2017), KOG (euKaryotic Ortholog Groups, http://www.ncbi.nlm.nih.gov/cog/) (access on 9 November 2017), Swiss-Prot (a manually annotated and reviewed protein sequence database, http://www.expasy.ch/sprot/) (access on 9 November 2017), KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.genome.jp/kegg/) (access on 9 November 2017), and GO (Gene Ontology, http://www.geneontology.org/) (access on 9 November 2017), by the BLAST program for sequence similarity analysis and functional annotation of unigenes, and the DETs were screened out according to the expression level of genes.

Differential Gene Expression Analysis
Gene expression levels were calculated based on the numbers of reads mapped to the reference sequence using the FPKM method [50]. Differentially expressed genes were identified based on the method described by Audic and Claverie [51]. The differentially expressed genes were defined by default as genes with an FDR ≤ 0.05 and multiple differences of more than two-fold.

Quantitative RT-PCR Analysis
The expression levels of all genes of interest in the study were detected by quantitative real-time PCR (qRT-PCR). Total RNA was extracted from plant samples with the Plant RNA Extraction Kit (Takara, Dalian, China), according to the kit's instructions. cDNA was synthesized from about 1 µg of total RNA using Super Script TM reverse transcriptase (Takara, Dalian, China). The amplification reaction of qPCR was performed in a Bio-Rad iQ5 Real-Time PCR System (Bio-Rad, California, USA), using SYBR ®® Premix Ex Taq™ II (Tli RNaseH Plus) (Takara, Dalian, China), according to the manufacturer's instructions. Primers for qRT-PCR analysis are listed in Table S1, and the PCR amplification conditions were as follows: 94 • C for 5 min; 45 cycles of 94 • C for 10 s, 55 • C for 15 s, and 72 • C for 20 s. For each candidate gene, the PCR reactions were carried out in triplicate. The RgTIP41 gene (GenBank accession number KT306007) was used as internal control to calculate relative expression levels based on the 2 −∆∆C t method [52].

Data Submission
The transcriptome data of the twenty-four R. glutinosa samples were submitted to the GenBank-SRA repository under the project identification number PRJNA504376.

Conclusions
In this study, we detected the content distribution of catalpol, acteoside, total iridoid glycosides, and total phenylethanoid glycosides in radial striation and non-radial striation of four cultivars: 1706, BJ1, QH1, and W85. Catalpol and total iridoid glycosides amounts were higher in the non-radial striation of BJ1 and QH1, and acteoside and total phenylethanoid glycoside amounts in 1706, BJ1, and QH1 were higher in non-radial striation than that in radial striation. The radial striation and non-radial striation cDNA libraries were constructed, respectively, from the four R. glutinosa cultivars and sequenced by the Illumina TM HiSeq 4000 platform. A total of 715,158,202 high quality reads were obtained, and 150,405 unigenes were finally assembled. Among them, 3247, 2343, 1340, and 1166 unigenes were differentially expressed in RS vs. nRS comparisons of 1706, BJ1, QH1, and W85, respectively. The expression profiles of some catalytic enzyme genes and transcript factor genes were positively correlated with content changes of catalpol, acteoside, total iridoid glycosides, and total phenylethanoid glycosides. It will help not only to elucidate the molecular mechanisms of catalpol and acteoside biosynthesis, but also to illuminate the influence of radial striation on the quality characteristics of R. glutinosa tuberous roots.

Conflicts of Interest:
The authors declare that they have no competing interests.