A Comparative Transcriptomic with UPLC-Q-Exactive MS Reveals Differences in Gene Expression and Components of Iridoid Biosynthesis in Various Parts of Gentiana macrophylla

Gentiana macrophylla Pall. (G. macrophylla)—a member of the family Gentianaceae—is a well-known traditional Chinese medical herb. Iridoids are the main active components of G. macrophylla, which has a wide range of pharmacological activities such as dispelling wind, eliminating dampness, clearing heat and asthenic fever, hepatoprotective and choleretic actions, and other medicinal effects. In this study, a total of 67,048 unigenes were obtained by transcriptomic sequencing analysis of G. macrophylla. A BLAST analysis showed that 48.21%, 33.66%, 46.32%, and 32.62% of unigenes were identified in the NR, Swiss-Prot, eggNOG, and KEGG databases, respectively. Twenty-five key enzymes were identified in the iridoid biosynthesis pathway. Most of the upregulated unigenes were enriched in flowers and leaves. The trustworthiness of the transcriptomic data was validated by real-time quantitative PCR (qRT-PCR). A total of 22 chemical constituents were identified by ultra-high performance liquid chromatography-quadrupole-electrostatic field Orbitrap mass spectrometry (UPLC-Q-Exactive MS), including 10 iridoids. A correlation analysis showed that the expression of 7-DLH and SLS was closely related to iridoids. The expression of 7-DLH and SLS was higher in flowers, indicating that flowers are important for iridoid biosynthesis in G. macrophylla.


Introduction
Gentiana macrophylla Pall. is a perennial herb of the genus Gentiana, mainly distributed in Shaanxi, Gansu, and Tibet, in China [1]. In traditional Chinese medicine, the roots of G. macrophylla are used, while the flowers are also used in Tibetan medicine [2]. Research shows that iridoids are the main active ingredients of G. macrophylla [3]. The herb has many medicinal effects, such as dispelling wind, eliminating dampness, clearing heat and asthenic fever, and hepatoprotective and choleretic actions [4]. Owing to its high medicinal value, uncontrolled exploitation has made wild resources extremely scarce, and the species is listed by the Chinese government as a third-class protected wild herb [5]. Studies on G. macrophylla have mainly focused on its active constituents, pharmacological effects, germplasm resources, and pharmacognosy. The active components in medicinal plants are formed by their unique biosynthetic pathways. The biosynthesis of iridoids can be divided into three stages: the synthesis of intermediates (i.e., IPP and DMAPP), terpenoid synthesis (i.e., the formation of various intermediates or terpenoids from IPP and DMAPP), and final modification (i.e., the complex structural modification of iridoid end products) [6]. Based on previous research [7] on iridoid biosynthesis in G. macrophylla, we used transcriptomic and metabolomic analyses to further reveal iridoid biosynthesis in different parts of G. macrophylla.
G. macrophylla plants were collected in the town of Badu, Long County, Baoji, Shaanxi Province, in July 2021 (latitude: 34° 71 4444" N; longitude: 106° 82 6931" E). Samples were identified by Professor Shuonan Wei of Northwestern University as Gentiana macrophylla Pall. Three plants were taken as biological replicates for analysis. Each sample was divided into roots, stems, leaves, and flowers. Then, the samples were split into two parts. One part was used for UPLC-Q-TOF MS analysis and was quickly dried in an oven at 100 °C after being shredded [16]. The other part was collected in RNase-free tubes, quick-frozen in liquid nitrogen, and stored at −80 °C for later use. Specimens and reserved samples were kept in the Shaanxi Provincial Key Laboratory of Biomedicine.

RNA Extraction and Sequencing
Total RNA was extracted using TRIzol reagent (Invitrogen, CA, USA) and an RNA purification kit, following the manufacturers' procedures. The total RNA quantity and purity were analyzed using a Bioanalyzer 2100 and RNA 1000 Nano LabChip Kit (Agilent, CA, USA), with a RIN > 7.0. The cleaved RNA fragments were reverse-transcribed to create the final cDNA library in accordance with the protocol for the mRNA-Seq sample preparation kit (Illumina, San Diego, CA, USA); the average insert size for the paired-end libraries was 300 bp (±50 bp). Paired-end sequencing was then performed on an Illumina NovaSeq™ 6000 at LC Sciences, Houston, TX, USA, following the vendor's recommended protocol.

Transcript Assembly and Unigene Functional Annotation
First, Cutadapt [17] and in-house Perl scripts were used to remove low-quality and undetermined bases. Sequence quality was then verified using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/, accessed on 1 October 2021). De novo assembly of the transcriptome was performed using Trinity 2.4.0 [18]. Trinity grouped transcripts into clusters based on shared sequence content. The longest transcript in each cluster was selected as the 'gene' sequence (also known as the unigene).
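The unigene selection rule described above (keep the longest transcript of each cluster) can be illustrated with a minimal sketch. The record layout, identifiers, and the function name `pick_unigenes` are hypothetical and chosen only for illustration; they are not taken from Trinity or from the study's scripts.

```python
def pick_unigenes(transcripts):
    """Keep the longest transcript per cluster.

    transcripts: iterable of (transcript_id, cluster_id, sequence) tuples.
    Returns {cluster_id: (transcript_id, sequence)}.
    """
    best = {}
    for transcript_id, cluster_id, seq in transcripts:
        current = best.get(cluster_id)
        # Replace the stored entry only when a strictly longer sequence is seen.
        if current is None or len(seq) > len(current[1]):
            best[cluster_id] = (transcript_id, seq)
    return best


# Hypothetical toy records, not real assembly output.
toy = [
    ("c1_g1_i1", "c1_g1", "ATGC"),
    ("c1_g1_i2", "c1_g1", "ATGCATGC"),
    ("c2_g1_i1", "c2_g1", "GGC"),
]
unigenes = pick_unigenes(toy)
```

With ties, this sketch keeps the first transcript encountered; how ties were resolved in the actual pipeline is not stated in the text.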

Analysis of Differentially Expressed Genes
Salmon [20] was used to determine expression levels for unigenes by calculating TPM [21]. The differentially expressed unigenes were selected with log2 (fold change) > 1 or log2 (fold change) < −1 based on statistical significance (p-value < 0.05) using the R package edgeR [22]. The identified differentially expressed genes were subjected to the GO and KEGG pathway enrichment analyses to further analyze the main differential functions between sites.
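As a rough illustration of the two quantities used here, the sketch below computes TPM from counts and lengths and applies the stated thresholds (|log2 fold change| > 1 and p-value < 0.05). It is a simplified stand-in, not the Salmon or edgeR implementation: Salmon estimates abundances with its own model and effective lengths, and edgeR derives the p-values. All function names and numbers are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from read counts and transcript lengths (bp)."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]  # reads per base
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def is_differential(log2_fc, p_value, lfc_cutoff=1.0, alpha=0.05):
    """Apply the thresholds stated in the text."""
    return abs(log2_fc) > lfc_cutoff and p_value < alpha


# Hypothetical counts and lengths for three unigenes.
values = tpm([100, 200, 300], [1000, 2000, 1000])

# Hypothetical (unigene, log2 fold change, p-value) rows.
rows = [("u1", 2.3, 0.001), ("u2", -1.5, 0.2), ("u3", 0.4, 0.01), ("u4", -3.0, 0.004)]
selected = [name for name, lfc, p in rows if is_differential(lfc, p)]
```

In this toy table only `u1` and `u4` pass both thresholds; `u2` fails on the p-value and `u3` on the fold change.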

Identification of Transcription Factors
Following the priority order of NR, Swiss-Prot, KOG, and KEGG, unigenes were aligned against these protein databases (E value < 1 × 10⁻⁵) using BLASTx [23]. ESTScan [24] was used to predict the coding regions. The protein sequences encoded by the predicted unigenes were compared with the plant transcription factor database (Plant-TFDB) using hmmscan to search for transcription factor families and their members.

qRT-PCR Validation of Key Genes in the Biosynthesis of Iridoid Compounds
qRT-PCR was performed using the QuantGene 9600 System (Bioer Technology, ZJ, CHN) and the SYBR® Green Premix Pro Taq HS qPCR Kit (Accurate Biology). Specific primers were designed for the different genes using Primer Premier 5.0 (Table S1). RNA from the four parts (i.e., roots, stems, leaves, and flowers) was extracted and reverse-transcribed into cDNA using the Evo M-MLV RT Mix Kit with gDNA Clean for qPCR (Accurate Biology). The polymerase chain reaction conditions were as follows: 95 °C for 30 s, 40 cycles of 95 °C for 5 s, and 60 °C for 30 min. All qRT-PCR analyses were repeated in three biological and three technical replicates. The UBC 13 gene was used as a reference. The relative expression levels of the selected genes were determined using the 2^−ΔΔCt method [25].
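For readers unfamiliar with the 2^−ΔΔCt calculation, a minimal sketch follows. It assumes one target gene, one reference gene, and one calibrator sample; the Ct values and the function name are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method.

    ct_target / ct_ref: Ct of the target and reference gene in the sample.
    ct_target_ctrl / ct_ref_ctrl: the same Ct values in the calibrator sample.
    """
    delta_sample = ct_target - ct_ref             # dCt in the sample
    delta_control = ct_target_ctrl - ct_ref_ctrl  # dCt in the calibrator
    return 2 ** -(delta_sample - delta_control)   # 2^-(ddCt)


# Hypothetical Ct values: the sample reaches threshold two cycles earlier
# than the calibrator after normalisation, i.e. a four-fold higher level.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
```

By construction, the calibrator compared with itself gives a relative expression of 1.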

Sample Preparation
Samples of the four parts (roots, stems, leaves, and flowers) were dried and ground. The powder (passed through a No. 3 sieve) was accurately weighed (0.25 g) and placed in a 10 mL volumetric flask, and methanol was added to the mark. The mixture was sonicated (power 500 W, frequency 40 kHz) for 30 min and cooled to room temperature, after which it was made up to volume with methanol and filtered. The filtrate was then passed through a 0.22 µm microporous membrane.
The ion source was a HESI source with the following parameters: positive and negative ion detection modes, a sheath gas flow rate of 4.58 L/min, an auxiliary gas flow rate of 7.97 L/min, a spray voltage of 3.44 kV, a capillary temperature of 320 °C, an ion transport tube temperature of 350 °C, and an auxiliary gas temperature of 350 °C. The scan modes were as follows: full MS/dd-MS2, full MS resolution 70,000, dd-MS2 resolution 17,500, and a scan range of m/z 100-800. The collision gas was nitrogen (purity > 99.99%) and the collision energy was 30 eV.

UPLC-Q-Exactive MS Data Acquisition and Analysis
The mass spectrometry data were analyzed using Xcalibur software (Thermo Fisher Scientific, Waltham, MA, USA) to derive the possible molecular formulae from the high-resolution mass spectral information, with mass deviations in the range of δ < 4 × 10⁻⁶. The parent ion was determined from a relevant literature search combined with an ion abundance > 1 × 10⁶ in the full-scan spectrum in positive and negative ion modes. Identification of the products was based on retention times, parent ions, and secondary ion fragments, and it was confirmed by a literature search.
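The mass deviation criterion can be read as a relative error expressed in parts per million (4 × 10⁻⁶ corresponds to 4 ppm). The sketch below shows that calculation under this reading; the m/z values are hypothetical placeholders rather than measured data, and the function names are illustrative.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=4.0):
    """True when the absolute deviation is below the tolerance (4 ppm here)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


# Hypothetical m/z values used only to exercise the functions.
theoretical = 355.1035
accepted = within_tolerance(355.1042, theoretical)  # about 2 ppm
rejected = within_tolerance(355.1060, theoretical)  # about 7 ppm
```

A candidate formula would be retained only when the observed ion falls inside the tolerance window, before retention time and fragment ions are considered.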

Correlation Analysis between Expression of Key Enzyme Genes and Constituents
Six key enzyme genes (HDR, GPPS, G8O, SLS, 7-DLGT, and 7-DLH) were selected from the iridoid biosynthesis pathways. Abbreviations of the enzymes are listed in Table S2. The qRT-PCR data for the enzyme genes were processed via the 2^−ΔΔCt method, with the roots treated as the control group. The data were correlated with the peak areas of the 14 iridoids of G. macrophylla. A clustering correlation heatmap with signs was constructed using the OmicStudio tools (https://www.omicstudio.cn, accessed on 7 July 2022).
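The text does not state which correlation coefficient the OmicStudio heatmap used. As one plausible illustration, the sketch below computes a Pearson coefficient between a gene's relative expression and a compound's peak area across the four plant parts. The choice of Pearson and all numbers are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical values ordered as roots, stems, leaves, flowers.
expression = [1.0, 2.0, 3.0, 4.0]     # relative expression of one gene
peak_area = [10.0, 20.0, 30.0, 40.0]  # peak area of one iridoid
r_positive = pearson(expression, peak_area)
r_negative = pearson(expression, list(reversed(peak_area)))
```

With only four tissue means per variable, any such coefficient has wide uncertainty, which is worth keeping in mind when reading the heatmap.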

Statistical Analysis
Graphs were prepared with GraphPad Prism 8.0 (GraphPad Software Inc., San Diego, CA, USA), and results are presented as means ± standard error of the mean (S.E.M.). Statistical analyses were performed by one-way ANOVA followed by Dunnett's post-hoc test. A p-value < 0.05 was considered statistically significant.
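As a small illustration of the summary statistic reported here, the sketch below computes a mean and its standard error from replicate values using the sample standard deviation. The replicate values and the function name are hypothetical; the ANOVA and Dunnett's test themselves are not reproduced.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # sample SD / sqrt(n)
    return mean, sem


# Hypothetical triplicate measurements.
mean, sem = mean_sem([1.0, 2.0, 3.0])
```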

RNA-Seq and De Novo Transcriptome Assembly
A total of 79.89 GB of sequence data were generated, including 20.50 GB from the roots, 21.35 GB from the stems, 18.64 GB from the leaves, and 19.40 GB from the flowers (Table S3). A principal component analysis (PCA) showed that PC1 and PC2 accounted for 48.00% and 25.85%, respectively, and the root group was clearly separated from the other groups (Figure S1).
After assembling the valid reads, a total of 67,048 unigenes were obtained, with sizes ranging from 200 to 15,651 bp and an average size of 917 bp (Figure 1C). There were 46,487 unigenes (69.34%) in the size range of 200-1000 bp, 12,714 (18.96%) at 1000-2000 bp, and 7847 (11.70%) > 2000 bp. When the unigenes were sorted by length from longest to shortest, the length at which the cumulative length reached half of the total (N50) was 1571 bp.
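The N50 definition used above can be sketched as follows. The lengths in the example are hypothetical and are not the study's unigenes.

```python
def n50(lengths):
    """Length of the sequence at which the cumulative length, counted from
    the longest sequence downwards, first reaches half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 is undefined for an empty list")


# Hypothetical lengths: total is 30, and 10 + 6 = 16 first exceeds 15.
example_n50 = n50([2, 3, 4, 5, 6, 10])
```

An N50 well above the mean length, as reported here (1571 bp versus 917 bp), indicates that many short sequences contribute little to the total assembled length.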

Functional Annotation of Unigenes
The 67,048 assembled unigenes were aligned using BLASTx in the six databases of NR, GO, KEGG, Pfam, Swiss-Prot, and eggNOG. The annotation results are summarized in Figure 1A.
In the comparison with the NR database, 32,327 unigenes were annotated, accounting for 48.21% of the total. It can be seen that G. macrophylla has the highest homology with Coffea arabica, followed by Coffea eugenioides, Coffea canephora, Vitis vinifera, and Olea europaea ( Figure 1B).
In the GO database, gene product functions are described under three systematically defined categories: molecular function, biological process, and cellular component. As shown in Figure 2, a total of 27,111 unigenes were classified by GO annotation and divided into fifty functional groups in three categories. Among all groups, the nucleus group in the cellular component category was the most annotated, accounting for 29.6% of the total annotations, followed by the cytoplasm (16.3%) and biological process (13.1%) groups.
To further analyze the function of unigenes in the transcriptome of G. macrophylla, an eggNOG functional classification analysis was performed. As shown in Figure 3, a total of 23 eggNOG functional groups were obtained, covering most life activities. The general function prediction group was the largest, with 3389 unigenes.
In the KEGG database, 17,167 unigenes were annotated, involving six major branches of the KEGG metabolic pathways (Figure 4). The top three annotated subcategories were translation, carbohydrate metabolism, and folding, sorting and degradation, accounting for 7.82%, 7.68%, and 5.94% of the total annotations in the database, respectively.

Correlation Analysis of Secondary Metabolism
Most medicinal components are secondary metabolites; therefore, secondary metabolism is closely related to medicinal value. In the G. macrophylla transcriptome, a total of 2170 unigenes were involved in the 128 standard KEGG secondary metabolism pathways, of which 92 unigenes were involved in terpenoid backbone biosynthesis (Table 1). As special terpenoids, iridoids also have similar synthetic pathways and are synthesized from geranyl pyrophosphate via complex ring opening, rearrangement, cyclization, and glycosylation processes.
In total, 102 unigenes were annotated to 25 enzymes involved in iridoid synthesis pathways. The expression of enzyme genes in the biosynthetic pathways of iridoids is shown in Figure 5. Most of the genes showed higher expression in leaves and flowers.

Differential Gene Analysis
Differential gene analysis was performed on the transcriptomic data of the roots, stems, leaves, and flowers (Figure 6A). In the pairwise comparisons between different parts, significant differences in transcription were observed. Compared with the above-ground parts, most of the differential genes in the roots were downregulated. A total of 5244 differential genes were identified between the roots and flowers, including 1466 upregulated genes and 3778 downregulated genes. According to the cluster analysis of differential genes (Figure 6B), the expression of genes in flowers was much higher than in other parts. This suggests that flowers are the most physiologically active part of the plant.

Figure 5. Expression analysis of genes involved in iridoid biosynthesis. Different color blocks represent the normalized gene expression levels (log10 (TPM+1)) in different tissues of G. macrophylla. The blocks from left to right represent roots, stems, leaves, and flowers, respectively. Red: higher expression; blue: lower expression.

Analysis of Transcription Factors
The transcription factor (TF) analysis of all unigenes in the transcriptome of G. macrophylla predicted that there were 940 unigenes belonging to 55 families. The most frequent TF type was bHLH, accounting for 7.55%, followed by C2H2, accounting for 7.13%, and ERF, accounting for 6.81% (Figure 7).

Validation of Key Enzyme Genes Using qRT-PCR
To validate the transcriptomic data and provide a better understanding of the biosynthesis of iridoids in G. macrophylla, we selected six key enzymes in the iridoid pathway and examined their expression in the four parts of G. macrophylla by qRT-PCR (Figure 8). The relative expression of the HDR and GPPS genes was the highest in leaves and the lowest in roots. The 7-DLGT and G8O genes had the highest relative expression in roots and the lowest in stems. The 7-DLH and SLS genes had the highest relative expression in flowers and the lowest in leaves. The expression trends were consistent with the results of the transcriptomic analysis (Figure S2), supporting the reliability of the transcriptomic data for G. macrophylla.

Metabolite Analysis of G. macrophylla by UPLC-Q-Exactive MS
The base peak chromatograms obtained by (−) ESI-MS are shown in Figure 9. A total of twenty-two compounds were identified across the four parts, as shown in Table 2; ten of them were iridoids, four were flavonoids, two were triterpenes, two were phenylpropanoids, and four were other compounds.

Loganic acid 11-O-β-glucopyranosyl ester was not identified in stems or flowers. Morroniside and swertiapunimarin were not detected in the leaves. Most of the iridoids were abundant in roots (Figure 10).

Correlation Analysis between the Expression of Key Enzyme Genes and Contents of Iridoids
A correlation analysis was performed on the expression of six enzyme genes and the contents of ten iridoids in different parts of G. macrophylla (Figure 11). The results showed that GPPS and HDR were clustered together and highly correlated with the secologanoside content. The expression of 7-DLH and SLS was significantly correlated with most iridoids. The expression of 7-DLGT and G8O was more strongly correlated with the contents of loganic acid 11-O-β-glucopyranosyl ester and gentimacroside.

Discussion
Iridoids are the main bioactive products of G. macrophylla and have important medicinal value [37]; hence, their biosynthetic pathways merit further clarification, especially the relationship between the expression of enzyme genes and the contents of iridoids in different parts of G. macrophylla. Therefore, transcriptomic and metabolomic experiments were conducted to further explore the relationships and differences in iridoids between parts of G. macrophylla. A total of 67,048 unigenes were identified using the Illumina NovaSeq™ 6000 platform. Compared with a previous study [7], more genetic information on G. macrophylla was obtained. The GO categories and proportions of annotated unigenes were similar to those of related species such as G. lhassica [38], G. waltonii, and G. robusta [39]. The annotation of 136 standard KEGG metabolic pathways and 55 transcription factor families suggested that G. macrophylla has a very complex transcriptional regulatory mechanism.
Through screening of rate-limiting enzymes in iridoid biosynthetic pathways by their differential expression in G. macrophylla, six key enzyme genes were selected that were involved in the upstream, midstream, and downstream of the pathway to verify the transcriptome data via qRT-PCR. The results confirmed that the transcriptomic analysis was reliable.
The investigation of gene expression remains a major approach to unraveling the regulatory mechanisms of metabolic pathways [40]. From the annotation information, we identified 25 enzyme genes involved in iridoid biosynthesis, finding that most of them showed higher expression in leaves and flowers. In the formation of IPP intermediates, the MEP pathway dominates in G. macrophylla. These findings are consistent with those of a recently published work [41]. HDR is the last key enzyme in the MEP pathway, playing an important regulatory role in terpenoid synthesis [42]. The high expression of HDR in leaves suggests that IPP may be mainly synthesized in leaves. This is consistent with findings in Oncidium orchid [43] and Arabidopsis [44]. The clustering analysis showed that there were far more differentially expressed genes in flowers than in other parts. As most of the enzyme genes of iridoids were highly expressed in leaves and flowers, we speculate that the iridoid components of G. macrophylla are mainly synthesized in the aboveground parts and then transported to roots for storage.
UPLC-Q-Exactive mass spectrometry was used not only to analyze the iridoids among different parts of G. macrophylla, but also to identify their components. Based on the relative contents of iridoids in different parts of G. macrophylla, the iridoids were more abundant in the roots than in other parts. This is consistent with the results for Gentiana crassicaulis [36]. Loganic acid and gentiopicroside are the content determination items named in the 2020 edition of the Chinese Pharmacopoeia [45], representing the main active components of G. macrophylla; we found that their contents showed no significant differences between flowers and roots. This is consistent with the use of flowers in Tibetan and Mongolian medicine.
The results of the correlation analysis on the expression of key enzyme genes and iridoid contents showed gentiopicroside was the most important representative iridoid of G. macrophylla, and its content was inseparable from the expression of 7-DLH and SLS. This provides a feasible idea of increasing the expression levels of 7-DLH and SLS enzyme genes, which may lead to higher contents of gentiopicroside. In addition, the expression of 7-DLH and SLS was extremely high in flowers compared to other parts. This indicates that the flowers are important for iridoid biosynthesis in G. macrophylla.
Many studies have shown that the secondary metabolism of plants is closely related to the environment [46,47]. Shaanxi Province, as the genuine producing area of G. macrophylla, has unique geographical and climatic conditions that contribute positively to the accumulation of iridoids. However, the underlying mechanism remains to be explored. Additionally, gene expression control is critical to increase the production of enzymes, fine-tune metabolic pathways, and reliably express synthetic pathways [48]. In the future, we hope to verify the effects of differentially expressed genes on iridoid biosynthesis by controlling them, as well as to further investigate the mechanisms of iridoid transport in G. macrophylla.

Conclusions
In this study, a comparative transcriptomic analysis combined with UPLC-Q-Exactive MS revealed differences in the gene expression and components of iridoid biosynthesis in various parts of G. macrophylla. According to the GO and KEGG databases, 25 enzyme genes were identified in the iridoid biosynthesis pathway, and their differential expression was associated with the differential content of the 10 iridoids in various parts of G. macrophylla. The expression of 7-DLH and SLS showed a highly positive correlation with the contents of most iridoids. These findings provide a comprehensive genetic resource that can improve our understanding of the regulation of iridoid biosynthesis and accumulation at the molecular level.
Supplementary Materials: The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/genes13122372/s1, Table S1: Primer names and sequences used in this study; Table S2: Abbreviations of pathways and enzymes; Table S3: Transcriptome sequencing results of G. macrophylla; Figure S1: The principal components analysis (PCA) of four parts of G. macrophylla; Figure