Cloning and Functional Characterization of Two Germacrene A Oxidases Isolated from Xanthium sibiricum

Sesquiterpene lactones (STLs) from the cocklebur Xanthium sibiricum exhibit significant anti-tumor activity. Although germacrene A oxidase (GAO), which catalyzes the production of Germacrene A acid (GAA) from germacrene A, an important precursor of germacrene-type STLs, has been reported, the remaining GAOs corresponding to various STLs’ biosynthesis pathways remain unidentified. In this study, 68,199 unigenes were studied in a de novo transcriptome assembly of X. sibiricum fruits. By comparison with previously published GAO sequences, two candidate X. sibiricum GAO gene sequences, XsGAO1 (1467 bp) and XsGAO2 (1527 bp), were identified, cloned, and predicted to encode 488 and 508 amino acids, respectively. Their protein structure, motifs, sequence similarity, and phylogenetic position were similar to those of other GAO proteins. They were most strongly expressed in fruits, according to a quantitative real-time polymerase chain reaction (qRT-PCR), and both XsGAO proteins were localized in the mitochondria of tobacco leaf epidermal cells. The two XsGAO genes were cloned into the expression vector for eukaryotic expression in Saccharomyces cerevisiae, and the enzyme reaction products were detected by gas chromatography–mass spectrometry (GC-MS) and liquid chromatography–mass spectrometry (LC-MS) methods. The results indicated that both XsGAO1 and XsGAO2 catalyzed the two-step conversion of germacrene A (GA) to GAA, meaning they are unlike classical GAO enzymes, which catalyze a three-step conversion of GA to GAA. This cloning and functional study of two GAO genes from X. sibiricum provides a useful basis for further elucidation of the STL biosynthesis pathway in X. sibiricum.


Introduction
Sesquiterpene lactones (STLs) are widely distributed in nature and have a broad range of beneficial biological activities, including anti-bacterial, anti-inflammatory, and anti-cancer effects [1][2][3][4][5]. Two specific STLs, xanthatin and xanthinosin, are produced in the burs and leaves of Xanthium L. plants [6][7][8]. Many studies have been conducted on the quality and pharmacological activities of X. sibiricum. However, the details of the biological pathways associated with the anti-cancer effects of STLs in Xanthium species remain unclear.
Based on the carbon skeleton, STLs can be classified into multiple types, including germacrene, guaiane, xanthane, pseudoguaiane, eudesmane, and elemane lactones [9]. The molecular mechanisms of STL biosynthesis differ among types. For example, in eudesmane-type STL biosynthesis, the core carbon skeleton 10-epi-junenol is synthesized before lactone ring formation [10], while guaiane-type STLs are produced from germacrene-type STLs and are induced by protonation [11]. The STL synthesis pathway is usually divided into three main processes. The first process is the synthesis of intermediates, including isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP). The second process involves the formation of the sesquiterpene skeleton, which is preceded by the formation of farnesyl pyrophosphate (FPP) from IPP and DMAPP. Sesquiterpene synthase (STP) then converts FPP into the sesquiterpene skeleton. The final process is the formation of the STL end-product, which involves a variety of structural modifications to the carbon skeleton. In particular, STP is critical for the structural transformation of FPP to STL, as it catalyzes the formation of multiple types of sesquiterpenes through a series of chemical processes such as cyclization of carbenium ion intermediates, deprotonation, and hydrogen transfer [12][13][14][15][16][17]. Importantly, cytochrome P450 enzymes play a modulatory role in STL biosynthesis, participating in the addition of functional groups to the sesquiterpene backbone [18]. For example, the cytochrome P450 GAO, isolated from an Asteraceae plant, catalyzes the three-step sequential oxidation of germacrene A to GAA [19].
In addition, in common chicory (Cichorium intybus L.), the most critical modification enzyme in the biosynthetic pathway of the 6α-type STL-myrcene lactone C12 is the cytochrome P450 enzyme [20]. It was also shown that the cytochrome P450 enzymes parthenolide synthase (PTS) and kauniolide synthase (KLS), cloned from the aster Chrysanthemum paludosum, catalyzed the oxidation reaction of the C4-5 double bonds of costunolide, which, in turn, generated parthenolide [11,21]. Since germacrene-derived STLs are the simplest, a number of studies have investigated these STLs using synthetic biology and related techniques. Three STP genes were cloned from X. strumarium: XSTPSS1 catalyzed the production of germacrene D, XSTPSS2 catalyzed the formation of guaia-4,6-diene, and XSTPSS3 catalyzed the production of germacrene A (Figure 1) [22]. However, it is not clear how different biologically active STLs are produced in Xanthium species after sesquiterpene skeleton formation. Studies have shown that GAA is an important precursor substance in the biosynthesis pathway of germacrene-derived STLs [19]. Based on the findings outlined above, we hypothesized that, in X. sibiricum, GAO would catalyze the production of GAA from germacrene A in two consecutive steps (Figure 2). To test this hypothesis, we identified the genes homologous to GAO by searching known GAO gene sequences in the National Center for Biotechnology Information (NCBI) database against a transcriptome library of X. sibiricum established herein and cloned the GAO genes using complementary DNA (cDNA).
Subsequently, bioinformatic analyses of the predicted amino acid and protein structures, gene expression patterns, and subcellular localization were carried out. The GAO genes were cloned into an expression vector for eukaryotic expression in S. cerevisiae, and the enzyme reaction products were detected by GC-MS and LC-MS methods. The results should help clarify the downstream STL synthesis pathway in X. sibiricum in future work.

Establishment of a Transcriptome Library and Gene Annotation of X. sibiricum
The de novo transcriptome library of X. sibiricum included 5,989,562,311 nucleotides (nt), and the transcriptome Q20, N, and GC percentages were 97.57%, 0.01%, and 45.36%, respectively. After removing low-quality reads and filtering out those containing duplicates or junctions, 49,957,916 valid clean reads remained. The clean reads were assembled de novo using Trinity assembly software [23], and a total of 68,199 unigenes were obtained, with an average length of 639 nt and an N50 of 954 nt.
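The N50 reported above is a length-weighted summary of the assembly: it is the length of the shortest unigene in the smallest set of longest unigenes that together cover at least half of the assembled bases. As a minimal sketch of that definition (the function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or Trinity's own code):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in nt)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half the bases are now covered
            return length


# Hypothetical toy assembly, not the X. sibiricum unigene set.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

Because N50 weights long sequences more heavily, it is expected to exceed the mean length, as it does here (954 nt versus 639 nt).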
Protein function annotation information for all unigenes was obtained using BLAST [24]. Of the 68,199 unigenes, 19,129, 13,845, 24,721, 37,205, and 14,001 unigenes were successfully annotated with the Pfam, Kyoto Encyclopedia of Genes and Genomes (KEGG), SwissProt, non-redundant protein sequence (NR), and STRING databases, respectively. A total of 13,619 unigenes from the transcriptome were successfully annotated to the Cluster of Orthologous Groups of proteins (COG) database, corresponding to 25 functional categories. A total of 990 unigenes fell mainly under function prediction, with the highest percentage relating to STLs, and a further 250 unigenes were annotated to secondary metabolite biosynthesis, transport, and catabolism (Appendix A, Figure A1).
A total of 13,845 unigenes were annotated to 128 metabolic pathways in the KEGG database, and 342 were annotated to "metabolism of terpenoids and polyketides" (Appendix A, Figure A2). Among these 342 unigenes, 89 were related to "terpenoid backbone biosynthesis", and 22 were involved in "sesquiterpenoid and triterpenoid biosynthesis." In plants, the biosynthesis of STLs primarily occurs through the mevalonate (MVA) or methyl-D-erythritol phosphate (MEP) pathways, which synthesize DMAPP and IPP precursors [14]. KEGG pathway analysis showed that a total of 31 transcripts in X. sibiricum encoded six enzymes of the MVA pathway (acetyl-CoA C-acetyltransferase [

Cloning and Bioinformatics Analysis of XsGAO Genes
Primers for two candidate GAO genes were designed using these sequences as templates (Appendix A, Table A1). The full lengths of XsGAO1 and XsGAO2 were 1467 and 1527 bp, encoding 488 and 508 amino acids, respectively. SMART (Simple Modular Architecture Research Tool) analysis showed that XsGAO1 and XsGAO2 encoded proteins with molecular weights of approximately 54.98 and 58.01 kDa, respectively, protein isoelectric points of approximately 8.72 and 8.40, respectively, and a protein pH value of 5.1. Protein domain analysis showed that XsGAO1 harbored a P450 domain comprising 454 amino acids (amino acid 42-485, Appendix A, Figure A3A) and XsGAO2 harbored a P450 domain comprising 462 amino acids (amino acid 40-501, Appendix A, Figure A3B). The secondary structures of the XsGAO1 and XsGAO2 proteins were predicted using SOPMA online. The prediction for XsGAO1 indicated that 244 amino acid residues (50%) were involved in the formation of α-helix, 66 residues (13.52%) were involved in the extended chain, 27 residues (5.53%) were involved in β-turn, and 151 residues (30.94%) were involved in a random coil (Appendix A, Figure A4A). The prediction for XsGAO2 indicated that 253 amino acid residues (49.8%) were involved in the formation of α-helix, 62 residues (12.2%) were involved in the extended chain, 31 residues (6.1%) were involved in β-turn, and 162 residues (31.89%) were involved in a random coil (Appendix A, Figure A4B). To further characterize the XsGAO proteins, 3D-structure prediction of the XsGAO1- and XsGAO2-encoded proteins was performed, and spatial predictions showed high similarity with the P450 enzyme from Salvia miltiorrhiza (Appendix A, Figure A5).
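The secondary-structure percentages follow directly from the residue counts and the protein lengths given above. The short sketch below recomputes them as a consistency check; the dictionary keys and the helper name `percentages` are illustrative choices, and only the counts themselves come from the text.

```python
def percentages(counts):
    """Return (total residues, percentage per structural state)."""
    total = sum(counts.values())
    shares = {state: round(100 * n / total, 2) for state, n in counts.items()}
    return total, shares


# Residue counts as reported in the text for each predicted protein.
xsgao1 = {"alpha_helix": 244, "extended_chain": 66, "beta_turn": 27, "random_coil": 151}
xsgao2 = {"alpha_helix": 253, "extended_chain": 62, "beta_turn": 31, "random_coil": 162}

for name, counts in (("XsGAO1", xsgao1), ("XsGAO2", xsgao2)):
    total, shares = percentages(counts)
    print(name, total, shares)
```

The totals come to 488 and 508 residues, matching the predicted protein lengths.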
BLAST analysis showed that XsGAO1 and XsGAO2 were highly similar to 17 GAO protein homologs from 10 species, which were obtained from the NCBI database (Appendix A, Table A3). Phylogenetic tree analysis indicated that the two XsGAO genes were mainly conserved in Asteraceae; the amino acid sequence of XsGAO1 shared 94.06% sequence identity with HaGAO from Helianthus annuus (XP_022000663.1), while XsGAO2 was recovered on a distinct branch (Figure 3).

Figure 3. [...] from Oryza sativa Japonica Group. Bootstrap values are shown as percentages from 1000 replicates.

XsGAO1 and XsGAO2 Expression Patterns in X. sibiricum
Across the three X. sibiricum tissues tested (fruit, leaves, and stems), the expression patterns of XsGAO1 and XsGAO2 were similar: both genes were more strongly expressed in the fruits than in the leaves or stems (Figure 4A,C). To localize XsGAO1 and XsGAO2 in the cell, tobacco leaf transformation was performed. Confocal laser scanning microscopy (CLSM) examination of the transformed tobacco leaves identified XsGAO1 and XsGAO2 signals in the mitochondria (Figure 5). This was consistent with predictions based on their sequence features.

Functional Study of XsGAO1 and XsGAO2
The full-length XsGAO genes were cloned into the yeast expression vector pYeDP60 and transferred into S. cerevisiae WAT11. Germacrene A, obtained through prokaryotic expression of LsGAS and recombinant protein enzyme activity assays, was used as the substrate for XsGAO1 and XsGAO2 in the enzymatic activity reaction, with the expression of LsGAS alone and control yeast bearing the empty vector serving as comparisons. Additionally, the LsGAO gene was cloned from Lactuca sativa, and the microsomal protein expressed from this gene was used as a positive control. Inactivated microsomes were used as a negative control to verify whether the microsomal proteins expressed from XsGAO1 and XsGAO2 showed catalytic activity in a two-step enzyme activity assay (Appendix A, Figures A6-A9).
The gas chromatography-mass spectrometry (GC-MS) chromatogram of enzymatic activity experiments on the LsGAS recombinant protein showed two distinct peaks (Figure 6A). The second peak exhibited fragment ion peaks at m/z 53, 67, 79, 93, 107, 119, 133, 147, 161, 175, 189, and 204 (Figure 6B), which corresponded to the characteristic ions of germacrene A. However, the first peak exhibited fragment ion peaks at m/z 77, 81, 93, 107, 121, 133, 147, 161, and 189, which corresponded to the characteristic ions of β-elemene (Figure 6C) [10,25]. Therefore, we assumed that peak 1 corresponded to β-elemene and peak 2 corresponded to germacrene A. An analysis of the enzymatic activity of XsGAO1 and XsGAO2 in microsomes identified peaks in both liquid chromatography-mass spectrometry (LC-MS) chromatograms with the same retention time as the positive control LsGAO (tR = 12.54 min; Figure 7A). Further analysis showed that the peaks produced in all three assays (i.e., XsGAO1, XsGAO2, and LsGAO) had the same m/z in the positive ion mode ([M+H]+ = 235.17; Figure 7B), suggesting that this peak corresponded to GAA. The results indicated that the XsGAO1 and XsGAO2 proteins both catalyzed the production of GAA from germacrene A in yeast microsomes.
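The assignment of the [M+H]+ ion at m/z 235.17 to GAA and of m/z 204 to the molecular ion of germacrene A can be checked with a short monoisotopic mass calculation. The sketch below assumes the molecular formulas C15H22O2 for GAA and C15H24 for germacrene A, which are not stated in the text, and uses standard monoisotopic atomic masses; the function names are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u), added for [M+H]+


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_protonated(formula):
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


# Assumed formulas (not given in the text).
gaa = {"C": 15, "H": 22, "O": 2}      # germacrene A acid
germacrene_a = {"C": 15, "H": 24}     # germacrene A

print(round(mz_protonated(gaa), 2))            # 235.17
print(round(monoisotopic_mass(germacrene_a)))  # 204
```

Under these assumptions, the computed values agree with the LC-MS ion reported for the XsGAO products and with the highest-mass GC-MS ion listed for peak 2.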

Discussion
Transcriptome sequencing is an important molecular method for investigating plant gene expression and function, and it can provide genetic information in the absence of genomic data [26]. The de novo assembly platform greatly contributes to finding new genes, providing databases of sesquiterpene synthases and cytochrome P450s for cloning in X. strumarium [27]. X. sibiricum is a traditional medicinal plant containing unique secondary metabolites, of which STLs have various pharmaceutical properties. Although sesquiterpene synthase (STP) has only been cloned from X. strumarium glandular trichomes [28], transcriptome databases established from fruits of X. sibiricum provided cDNAs of two GAO genes that were cloned accurately in this study.
As an enzyme that produces an important precursor substance for the synthesis of STLs, GAO is conserved in Asteraceae [25,29]. To investigate the function of this gene, the full-length cDNA sequences of XsGAO1 and XsGAO2 were successfully cloned. Phylogenetic analysis indicated that XsGAO1 may have a similar function to the GAO in H. annuus. However, XsGAO2 was distinct from other GAOs, representing a separate branch that needs further investigation. Multiple comparisons showed that the predicted XsGAO1 protein had high homology with other redox-like proteins, and its protein sequence contained conserved amino acid residues that are expected in the cytochrome P450 enzyme family [30]. Analysis of the 3D structure predicted that XsGAO1 and XsGAO2 had functions similar to that of ferruginol synthase (CYP71 family).
The expression patterns of XsGAO1 and XsGAO2 in X. sibiricum leaves differed over time, with the highest expression level observed in young leaves, and expression levels decreasing with maturation. XsGAO1 and XsGAO2 were also differentially expressed among fruits and stems, presumably related to their functions. The expression of the two XsGAOs was the highest in fruits, which explains why fruits with higher contents of STLs are used in the traditional Chinese medicine Cang Er Zi.
XsGAO1 and XsGAO2 cDNAs with the correct sequence were successfully inserted into an expression vector and used in transient transformation assays of Nicotiana benthamiana. However, no fluorescence was observed in the protoplasts. This may be because the accumulated concentration of the product was below the detection limit of the instrument or because the GAA generated was only a transient intermediate in N. benthamiana [31]. Both XsGAO1 and XsGAO2 were localized in the mitochondria, which was consistent with terpene synthase in tomatoes (also localized in the mitochondria) [32,33].
GAO isolated from L. sativa was expressed in an engineered yeast to synthesize GAA de novo, and the classical GAO activity involved three-step oxidation of germacrene A (GA) to yield GAA and 12,6-guaianolides [25,34], similar to N. benthamiana [35]. Meanwhile, an XsGAO from X. strumarium catalyzed only a one-step conversion of germacrene A to germacrene alcohol [36], but this study clearly shows that XsGAO1 and XsGAO2 catalyzed a second step of oxidation of the non-natural substrate germacrene A to germacrene A acid, which was not observed in yeast with a different GAO. XsGAO2 therefore appears to be a unique enzyme, reflecting a functional adaptation in the diversification of the STL biosynthetic pathway. A structural analysis of the XsGAO2 biochemical function, for example to identify the active center and crystal structure of the oxidase, would help to examine whether it has unique GAO activity. In addition, with the advent of CRISPR-Cas genome editing, CRISPR-Cas-mediated gene knockout in tomatoes and the medicinal plant Salvia miltiorrhiza has been successfully performed [37,38]. This approach could be applied to verify the function of XsGAO2 in the future.
Beyond this, X. sibiricum contains a variety of biologically active STLs, mainly xanthane-STLs with anti-tumor activities [39,40]. As such, studying the genes in the STL synthesis pathway could provide new ideas for the investigation of xanthane-STL biosynthesis pathways.

Establishment of a Transcriptome Library and Gene Annotation
The fresh samples (fruit, leaf, and stem) of X. sibiricum used for the RNA extraction were collected from Chaoyang District (N: 40.0031, E: 116.5468, H: 113 m, Beijing, China) in August, wrapped in tinfoil, and frozen immediately in liquid nitrogen for storage at −80 °C. Species verification was performed by Professor Dongmei Xie at the School of Pharmacy (Anhui University of Chinese Medicine). The total ribonucleic acid (RNA) was isolated using a TransZol Up Plus RNA kit (TransGen Biotech, Beijing, China), according to the manufacturer's protocol. The RNA extract was reverse-transcribed and then sequenced on an Illumina HiSeq 3000 platform at Shanghai Majorbio Bio-pharm Technology Corporation. After high-throughput sequencing, unigenes were assembled de novo from the clean reads obtained from the raw sequencing reads. To predict the biological function, all unigenes were annotated via a similarity search against the public databases, which contained

Cloning and Bioinformatics Analysis of XsGAO1 and XsGAO2
GAO gene sequences were searched for in the transcriptome database of X. sibiricum using local BLAST, with seed sequences downloaded from the NCBI. The full-length cDNA of two candidate genes, XsGAO1 and XsGAO2, was then cloned using reverse transcription polymerase chain reaction (RT-PCR) (primers are listed in Appendix A, Table A1).
The nucleotide sequences and their encoded amino acid sequences were analyzed using bioinformatics software. The physicochemical properties of the encoded proteins were predicted using Vector NTI, and open reading frames (ORFs) were identified and translated into amino acid sequences with Expasy Translate (http://web.expasy.org/translate/) (accessed on 1 March 2020). Gene domain analysis was performed using SMART (http://smart.embl-heidelberg.de/) (accessed on 5 March 2020) [41].
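The relationship between ORF length and protein length used in this study (1467 bp and 1527 bp encoding 488 and 508 amino acids) follows from reading the ORF in codons and excluding the stop codon. The sketch below illustrates this with the standard genetic code; it is not the Expasy Translate implementation, the toy ORF is hypothetical, and the function names are illustrative.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(orf):
    """Translate an ORF from its first base, stopping at the first stop codon."""
    orf = orf.upper()
    protein = []
    for pos in range(0, len(orf) - 2, 3):
        residue = CODON_TABLE[orf[pos:pos + 3]]
        if residue == "*":          # stop codon: end of the protein
            break
        protein.append(residue)
    return "".join(protein)


def expected_residues(orf_length_bp):
    """Number of residues encoded by an ORF whose length includes the stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1


print(translate("ATGGCTTGGTAA"))   # hypothetical toy ORF, prints MAW
print(expected_residues(1467))     # 488
print(expected_residues(1527))     # 508
```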
Phylogenetic relationships were constructed using the amino acid sequences of XsGAO1 and XsGAO2 with different reported GAO sequences. Nineteen sequences were aligned using ClustalX2, and the alignment was used to construct a phylogenetic tree using MEGA5.0 software [42].
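Sequence identity values such as the 94.06% reported between XsGAO1 and HaGAO are derived from aligned sequences. The sketch below shows one common way to compute percent identity from a pairwise alignment, counting only columns in which both sequences carry a residue. This convention is an assumption (other tools divide by the full alignment length or by the shorter sequence), the function name is illustrative, and the toy sequences are hypothetical rather than GAO sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue                 # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no residue-residue columns to compare")
    return 100 * matches / compared


# Hypothetical aligned fragments: 6 comparable columns, 5 identical.
print(round(percent_identity("MKTAYLA", "MKT-FLA"), 2))  # 83.33
```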

Examination of the Expression Patterns of the XsGAO1 and XsGAO2 Genes
To further understand the distribution characteristics of GAO genes in X. sibiricum, qRT-PCR (primers are listed in Appendix A, Table A2) was employed to determine the expression patterns of the two GAO genes in different organs (leaf, stem, and fruit) at different stages of X. sibiricum. RNA was extracted according to the protocol provided with the TRIzol reagent (Invitrogen) and was then converted into cDNA via reverse transcription with the TransScriptor First-Strand cDNA Synthesis Supermix kit. Subcellular localization of the protein was observed using CLSM, and the consistency with prediction was determined using SLP-Local (https://sunflower.kuicr.kyoto-u.ac.jp/~smatsuda/slplocal.html) (accessed on 25 March 2018) online [27,43].
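The text does not state how the qRT-PCR data were quantified. One widely used option for relative expression is the 2^-ΔΔCt method, sketched below purely as an illustration: the choice of method, the reference gene, the calibrator tissue, and all Ct values are assumptions rather than details from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the 2^-ddCt method."""
    delta_sample = ct_target_sample - ct_ref_sample              # dCt in the tissue of interest
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCt in the calibrator tissue
    delta_delta = delta_sample - delta_calibrator                # ddCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values: target gene versus a reference gene,
# with fruit as the sample and leaf as the calibrator.
fold = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,          # fruit
    ct_target_calibrator=25.0, ct_ref_calibrator=18.0,  # leaf
)
print(fold)  # dCt 4 versus 7 gives ddCt = -3, so this prints 8.0
```

The method assumes amplification efficiencies close to 100% for both the target and the reference gene.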

Functional Study of XsGAO1 and XsGAO2 Genes in Yeast
The ORFs of XsGAO1 and XsGAO2 were PCR-amplified (primers are listed in Appendix A, Table A2), and the amplicons were digested with BamHI/EcoRI and cloned into the respective sites of pYeDP60 to give pYeDP60-XsGAO (pYeDP60 plasmid provided by the Department of Pharmacology, Second Military Medical University). To supply the substrate for XsGAOs, the germacrene A synthase gene (LsGAS; AF489965) from L. sativa (provided by the Department of Pharmacognosy, Second Military Medical University) was inserted into the E. coli expression vector pET28a at the EcoRI-SacI sites to give pET28a-LsGAS. Germacrene A was produced by LsGAS, which was expressed in Transetta (DE3), and was then oxidized by the XsGAO proteins expressed in S. cerevisiae WAT11 (WAT11 provided by the China Academy of Chinese Medical Sciences). For comparison, we used a classical GAO from L. sativa (LsGAO; GU198171) that is known to oxidize germacrene A in a three-step oxidation process. Transgenic yeast cells were cultivated in appropriate dropout media, and the expression of the transferred genes was induced by 2% galactose [25,34,41].

Data Availability Statement:
The data presented in this study are available in the article.

Conflicts of Interest:
The authors declare no conflict of interest.
Sample Availability: Samples of the compounds GA and GAA are available from the authors.

Appendix A
The appendix provides data supplemental to the main text.