Genomic and Transcriptomic Approaches Provide a Predictive Framework for Sesquiterpenes Biosynthesis in Desarmillaria tabescens CPCC 401429

Terpenoids constitute a structurally diverse class of secondary metabolites with wide applications in the pharmaceutical, fragrance and flavor industries. Desarmillaria tabescens CPCC 401429 is a basidiomycetous mushroom that could produce anti-tumor melleolides. To date, no studies have been conducted to thoroughly investigate the sesquiterpenes biosynthetic potential in Desarmillaria or related genus. This study aims to unravel the phylogeny, terpenome, and functional characterization of unique sesquiterpene biosynthetic genes of the strain CPCC 401429. Herein, we report the genome of the fungus containing 15,145 protein-encoding genes. MLST-based phylogeny and comparative genomic analyses shed light on the precise reclassification of D. tabescens suggesting that it belongs to the genus Desarmillaria. Gene ontology enrichment and pathway analyses uncover the hidden capacity for producing polyketides and terpenoids. Genome mining directed predictive framework reveals a diverse network of sesquiterpene synthases (STSs). Among twelve putative STSs encoded in the genome, six ones are belonging to the novel minor group: diverse Clade IV. In addition, RNA-sequencing based transcriptomic profiling revealed differentially expressed genes (DEGs) of the fungus CPCC 401429 in three different fermentation conditions, that of which enable us to identify noteworthy genes exemplified as STSs coding genes. Among the ten sesquiterpene biosynthetic DEGs, two genes including DtSTS9 and DtSTS10 were selected for functional characterization. Yeast cells expressing DtSTS9 and DtSTS10 could produce diverse sesquiterpene compounds, reinforced that STSs in the group Clade IV might be highly promiscuous producers. This highlights the potential of Desarmillaria in generating novel terpenoids. To summarize, our analyses will facilitate our understanding of phylogeny, STSs diversity and functional significance of Desarmillaria species. These results will encourage the scientific community for further research on uncharacterized STSs of Basidiomycota phylum, biological functions, and potential application of this vast source of secondary metabolites.


Introduction
Sesquiterpenes are synthesized from the prenyl precursors farnesyl diphophate (FPP) by a family of enzymes known as sesquiterpene synthases (STSs) [1]. Basidiomycete mushrooms are particularly adept at producing a wide range of structurally complexified sesquiterpenes [2]. Exploration of the medicinal and pharmacological potential of these underexplored mushroom species could be scientifically and pharmacologically attractive. Continuing secondary metabolites discovery targeting rare basidiomycetes, previous investigations have reported a number of active terpenoids or meroterpenoids, such as antroalbocin A [3], melleolides [4], illudin S [5], conosilane A [6], sterhirsutins [7], many of which exhibit some antibiotic and cytotoxic properties. The ever-increasing genomic information has revealed great potential for mining of putative STS homologues, indicating that the phylum Basidiomycota terpenoids and their synthases represent abundant but largely untapped natural resources [8,9]. Specially, each fungus of the phylum Basidiomycota has an average of 10-20 putative STSs encoding genes. However, only few sesquiterpenes are reportedly characterized [4,7]. Intriguingly, the basidiomycetous STSs have been shown to be readily expressed in well-established laboratory hosts, i.e., Escherichia coli [8,10] and Saccharomyces cerevisiae [11]. Moreover, as many basidiomycetes are relatively slow growing in laboratory conditions. Thus, all this makes heterologous expression an attractive approach for accessing the basidiomycete terpenoid chemistry.
Basidiomycetous Armillaria, belonging to the family Physalacriaceae family, are ubiquitous as destructive forest pathogens of forests, orchards, and plantations [12]. Species from this genus cause the Armillaria butt and root disease that weakens and often kills woody perennials [12,13]. Armillaria species were considered to be polymorphic, which impact on forest populations has both ecological and economic significance [13]. There are approximately more than 70 officially described species [14], and accurate species delimitation and pathogenicity levels assessment of Armillaria genus is pivotal for forestry conservation [15]. As saprotrophs, Armillaria species produce fleshy fruiting bodies that appear in large clumps around contaminated plants [16]. The fungus is beneficial in some ways, as it serves as a natural recycler of nutrients in wooded ecosystems, breaking down woody materials to increase soil health. Interestingly, Armillaria species are also considered as source of nutrition in Europe and Asia and some species are associated with traditional Chinese medicine for their pharmaceutical properties [17]. Armillaria spp. were demonstrated to yield a wealth of biologically active metabolites, exemplified as sterols, sesquiterpene aryl esters, and many other components exhibiting therapeutic or dietary values [17][18][19]. Thus, it is obligated to fully elucidate the molecular basis or mechanism of Armillaria at the genomic level.
The technological revolution in genome sequencing and bioinformatics have markedly increased the number of sequenced Basidiomycetous genomes. There are already nine Armillaria genomes that have been published to date, and those for A. mellea DSM3731, A. ostoyae C18/9, A. solidipes 28-4, A. borealis FPL87.14 v1.0, A. fuscipes CMW2740, A. cepistipes B5, A. altimontana 837-10, two A. gallica strains Ar21-2 and 012m were included. Analyses of these genomic and transcriptomic information have provided insights into evolution, genetics, developmental biology, plant-cell-wall-degrading enzymes (PCWDE) repertoire, and related pathogenicity factors [20][21][22][23][24]. Sipos and coworkers performed omics sequencing to provide genetic and regulatory insights into toolkits for wood-decaying, morphogenesis and complex multicellularity including rhizomorph and fruiting bodies development of the strain A. ostoyae C18 [23]. The taxonomy of Armillaria has raised controversial debate, recently, Armillaria was narrowed down to include only annulate species, and Desarmillaria was introduced to accommodate two exannulate mushroom-forming armillarioid species (D. ectypa and D. tabescens) [25,26]. The lineage to diverge is that composed of annulate mushroom-forming armillarioid species, in what A. mellea (Vahl: Fr.) P. Kumm is the type species [27]. Koch et al. (2017) conducted six-locus phylogenetic analysis among Armillaria and related affinities, demonstrated that Guyanagaster and Desarmillaria are close lineages [25]. On the delimitation and characterization of Armillaria species, the genome-based phylogenomics and comparative genomics were invaluable to understand the phylogeny of Armillaria and sister groups.
In this study, we report a 50.36 Mb draft genome sequence of D. tabescens CPCC 401429. MLST-based phylogeny and synteny analysis at the genomic level were investigated, which reinforced that the D. tabescens was supposed to belong to the genus of Desarmillaria instead of Armillaria as it used to. Genome profiling and transcript on three different fermentation conditions shed light on the secondary metabolism. Notably, guided by STSs phylogeny based predictive framework, a diverse network of twelve putative sesquiterpene biosynthetic genes were obtained and half of them were categorized into the new subfamily, i.e., diverse Clade IV. Intriguingly, two genes including DtSTS9 and DtSTS10 were codon optimized and cloned for heterologous expression in yeast host. Through gas chromatography-mass spectrometry (GC-MS) analysis, yeast cells expressing DtSTS9 and DtSTS10 produce diverse sesquiterpenes. This work provides the genomic basis to understand and investigate the new genus Desarmillaria. We exemplify the exploitation on uncharacterized STSs of basidiomycetes and help to broaden our knowledge of sesquiterpenes biosynthesis.
Standard DNA cloning experiments were conducted using E. coli Trans1-T1. E. coli cultures carrying respective plasmid were grown in Luria-Bertani (LB) medium supplemented with ampicillin (100 µg/mL). S. cerevisiae BJ5464 was used as the host for heterologous expression [28]. Individual transformant of the yeast strain was grown in shaking cultures in 10 mL synthetic dextrose minimal medium containing 6.7 g/L of yeast nitrogen base without amino acids, 20 g/L of dextrose, 1.3 g/L of amino acid dropout powder. For production of the sesquiterpenes, a total of 200 mL YPD broth (BD) was inoculated with 5 mL of the starter culture and incubated for 96 h at 30 • C under shaking conditions (200 rpm). The strains, plasmids, primers, and genetically engineered cells are listed in Table 1. Genomic DNA from Desarmillaria was isolated using HP Fungal DNA Kit (Omega, Guangzhou, China) according to the manufacturer's instructions. Further purification was conducted using RNase digestion, chloroform extraction, and isopropanol precipitation [29]. The high molecular weight gDNA was washed twice with 75% ethanol, and dissolved in 200 µL RNase-free water. Integrity and amount of the extracted gDNA were detected by 1% agarose gel electrophoresis and using the NanoDrop ND-2000 Spectrophotometer (Thermo Fisher, Waltham, MA, USA). Shotgun sequencing was performed in the laboratory of at Shanghai Majorbio Bio-pharm Technology Co. Ltd. (Shanghai, China) using Illumina Hiseq 2000 platform. The PE150 (pair-end) libraries with a mean insert size 400 bp were subjected to PacBio sequencing. Adapter sequences were trimmed and short reads were formatted using Trimmomatic v. 0.36 [30] with minimum quality of 19 and minimum length of 30 bp. Quality was assessed using SMRT Analysis software [31].
The obtained sequence reads were assembled into contigs and scaffolds using SOAPdenovo v. 2.04 [32] and Canu v. 1.7 [33]. K-mer values were automatically selected based on read length and data type. The final scaffold set was verified for miss-assemblies using the BUSCO v3.0 and corrected if necessary [34]. Gene annotations of the coding sequences were formatted to GenBank database for BLASTP alignment or obtained via AUGUSTUS program [35] and verified on the basis of homologous proteins in the NCBI database. The gene ontology (GO) functional annotation was performed using Blast2go [36]. In parallel, protein annotations were also performed using KEGG (Kyoto Encyclopedia of Genes and Genomes), Pfam [37], and Swiss-prot databases [38].

RNA Purification, Library Construction, and RNA-Sequencing
The total RNA was extracted from PDB broth, YMEG broth, and F1 broth using PureLink TM RNA Mini Kit (Invitrogen, USA) and gDNA was removed using DNaseI (Takara). The RNA quality and concentration measurements were performed using 2100 Bioanalyzer (Agilent) and quantified using the NanoDrop ND-2000, and only high-quality purified RNA was processed for construction sequencing library (OD 260/280 = 1.8-2.2, OD 260/230 ≥ 2.0, RIN ≥ 8.0, 28S:18S ≥ 1.0, >1 µg). RNA purification, reverse transcription, library construction and sequencing were performed at Shanghai Majorbio Bio-pharm Biotechnology Co., Ltd. (Shanghai, China) based on the manufacturer's instructions (Illumina). Double-stranded cDNA libraries construction were carried out using the TruSeq TM RNA sample preparation Kit using 1.0 µg of total RNA (Illumina). Libraries were size selected for cDNA target fragments of 300 bp on 2% Low Range Ultra Agarose followed by PCR amplified using Phusion DNA polymerase (NEB) for 15 PCR cycles. Sequencing was conducted on Illumina NovaSeq 6000 sequencer instrument (2 × 150 bp read length).

Quality Control, Read Mapping, and Transcriptome Analysis
The raw paired end reads were trimmed and quality controlled by FASTQ preprocessor with default parameters [39]. Then clean reads were separately aligned to reference genome with orientation mode using HISAT2 software [40]. The mapped reads of each sample were assembled by StringTie in a reference-based strategy [41]. To identify differential expression genes (DEGs) between two different samples, the expression level of each gene was calculated according to the transcripts per million reads (TPM) approach. RSEM was used to quantify gene abundances [42]. Functional-enrichment analysis including GO, KEGG were performed to identify which DEGs were significantly enriched in GO terms and metabolic pathways at P-adjust ≤ 0.05 compared with the whole-transcriptome background. GO functional enrichment and KEGG pathway analysis were performed by Goatools and KOBAS [43,44].
Sequences alignment were conducted using MAFFT v7.471 with FFT-NS-2 algorithm [46], followed by MEGA 7 to refine the alignment done manually [47]. The introns of the housekeeping genes were excluded. ModelFinder was used to predict the best-fitted substitution model according to Akaike Information Criterion (AIC) [48]. The model VT was selected for Bayesian inference (BI) analyses. Bayesian trees were constructed with MrBayes v3.2.6 under the GTR+G+I model [49]. Two independent runs were conducted for 10,000,000 generations with sampling every 100 generations using Markov Chain Monte Carlo (MCMC) algorithm. The first 25% of the samples were discarded as burn-in fraction. The Bayesian tree generated was visualized in FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/, accessed on 25 November 2018) and edited using Adobe Illustrator software.

Bioinformatic Annotation for STSs and Phylogenetic Tree Construction
The genome of D. tabescens CPCC 401429 was further bioinformatically analyzed using antiSMASH database. All the predicted amino acid sequences of protein-coding genes present in the genome of the strain CPCC 401429 have been searched for homologues by BLASTP tool in NCBI database. The predicted STSs genes used for the functional analysis were annotated and verified for a coherent exon/intron structure. In addition, several characterized fungal STSs from Clitopilus pseudopinsitus, Omphalotus olearius, Agrocybe aegerita, Coniophora puteana, Coprinopsis cinerea, Postia placenta, Lignosus rhinocerotis, Stereum hirsutum, Termitoymces sp., Armillaria gallica, Fomitopsis pinicola, and Heterobasidion annosum were incorporated for phylogenetic analyses. Bayesian trees of STSs were constructed with MrBayes v3.2.6 using the same parameters during MLST-based phylogenetic analysis [49].

Gene Synthesis, Cloning and Expression of STSs in Yeast
Candidate STSs coding genes from the strain CPCC 401429 were synthesized by Shanghai Generay Biotech Co., Ltd. and codon-optimized for heterologous expression in host S. cerevisiae. The oligonucleotide primers used for constructing the genes are listed in Table 1. PCR reactions were performed using Q5 High-Fidelity DNA Polymerase (New England Biolabs) according to the manufacturer's protocol at an annealing temperature of 56 • C. In vivo yeast recombination cloning strategy was performed using a Frozen-EZ Yeast Transformation II Kit™ (Zymo Research) where competent cell S. cerevisiae BJ5464 was transformed with the individual DNA fragment (PCR product) of the respective STS gene and linearized vector backbone derived from YET plasmid [28,29]. Yeast transformants grown on synthetic dropout agar plate lacking tryptophan were screened by diagnostic PCR. Plasmid DNA was isolated from positive transformants using a E.Z.N.A ® Yeast Plasmid Miniprep Kit (Omega, Guangzhou, China) and used to transform chemically competent E. coli Trans1-T1 (TransGen Biotech, Beijing, China). Plasmid DNA was isolated from the positive clones grown on LB broth with ampicillin using E.Z.N.A ® Plasmid DNA Mini Kit (Omega, Guangzhou, China) and further verified by DNA sequencing.

GC-MS Analysis of Terpenoids
After fermentation, the S. cerevisiae liquid cultures (5.0 g each) were harvested in the headspace (20 mL) and the samples were placed at 55 • C for 15 min and volatile compounds were sampled at 55 • C for 40 min by SPME with a DVB/CAR/PDMS fiber. Compounds were desorbed in the split/splitless inlet of an Agilent 7890B/5977A gas chromatography equipped with an Agilent 7200 accurate-mass quadrupole time-of-flight (GC-MS-TOF; Agilent Technologies, Santa Clara, CA, USA) and analyzed at 250 • C for 5 min. The GC/MS-TOF was equipped with a DB-WAX column (Agilent Technologies; 60 m × 0.25 mm ID, 0.25 µm film thickness) [8,11,12]. The system was operated under the following condition: compounds were detected in split mode at split ratio of 0:1; the GC oven temperature was programmed to ramp from 50 • C (held for 5 min) to 90 • C at 3 • C min −1 , to 150 • C at 2 • C min −1 , then to 230 • C at 8 • C min −1 , and finally to 230 • C (held for 5 min). Full scan mass spectra were acquired in the mass range of 50-350 m/z, and ionization was performed by electron impact at 70 eV with an electrospray ionization source temperature set at 230 • C.

General Genome Features of D. tabescens CPCC 401429
With the Illumina PE150-based HiSeq 2500 sequencer and PacBio Sequel, raw data containing 871,317 reads and 6147.6 million bases were obtained. The high-quality reads were assembled by SOAPdenovo 2.04 and Canu 1.7 to generate 713 polished scaffolds. According to the coverage and the GC content, the assembly were performed and optimized to afford the draft genome sequence of D. tabescens CPCC 401429. The genome was 50.36 Mb in size and included 705 scaffolds with an N 50 of 126 kb and a GC content of 47.14% (Table 2). There were approximately 0.06% (29,142 bp) repetitive sequences in the genome. A total of 15,415 protein-coding genes were predicted in the genome by AUGUSTUS program [35]. The genome completeness was 92.4% based on fungal lineage employed by the BUSCO 3.0 assessment [50]. In addition, 255 tRNAs and 4 rRNAs were defined with tRNA-scan-SE 2.0 and Barrnap V 0.8, respectively. The genome size and number of predicted genes are markedly different with eight other Armillaria mushrooms (Table S1, supporting information I). All the assembly data indicated that whole genome supports a detailed investigation of the gene content and sesquiterpene synthases-encoding pathway of sequenced Armillaria strains. Among the sequenced strains, the D. tabescens is the smallest, while others are >55 Mb, and even the strain of A. gallica 012m with 87.31 Mb. Significantly, a number of STSs encoding genes were detected among the genus, in which the strain CPCC 401429 represented the median level with twelve genes (Table S1, supporting information I).

Phylogenetic Analysis of the Strain CPCC 401429
The tef -1α DNA sequence retrieved from the genome of CPCC 401429 strain was used to preliminarily determine the identity in a BLAST nucleotide search of NCBI and exactly matched the tef -1α region of D. tabescens HKAS86603 (GenBank accession KT822441) with 99.6% similarity. A phylogenetic dendrogram of tef -1α recovered from CPCC 401429 and other Armillaria species or lineages ( Figure 1) was generated, as shown in this phylogenetic tree, the D. tabescens is more closely clustered to D. ectypa species and formed a distinct group (clade V). Interestingly, Koch et al. (2017) performed six loci (28S, EF1α, RPB2, TUB, actin-1 and gpd) phylogenetic analysis to investigate and resolve the evolutionary status of diverging lineages Guyanagaster and Armillaria subgenus Desarmillaria [25]. To further investigate the armillarioid phylogeny, 18S-ITS-28S regions and ten housekeeping genes (tef -1α, rpb2, act1, gpd, tub, atp, pgi, rpl, rpk and ubq) were selected for MLST-based phylogenetic tree construction. As shown in Figure 2, markedly, the species D. tabescens CPCC 401429 and the species Guyanagaster necrorhizus MCA 3950 clustered as a wellsupported branch. However, Armillaria sensu stricto includes annulate mushroom-forming armillarioid species, exemplified as A. mellea, A. ostoyae, A. borealis, A. cepistipes, A. solidipes, A. fuscipes, A. altimontana, and A. gallica species. Our result reinforced that D. tabescens was supposed to belong to the genus of Desarmillaria [25,51].

Genome Annotation and Bioinformatic Analysis
The data of genomic annotations were summarized in Table S1. During the analysis of GO classification, 9044 genes were annotated with the three main categories: cellular component, molecular function, and biological process, respectively ( Figure 3A). The identified coding proteins associated with molecular function are more than other two groups. In the KEGG pathway annotation, the 8300 predicated genes could be divided into 46 categories based on their functions ( Figure 3B). Among these categories, 94 genes are associated with the metabolism of polyketides and terpenoids, which indicates that D. tabescens was a potential natural resource for polyketides and terpenoids biosynthesis. In the carbohydrate-active enzymes (CAZymes) annotation, 509 candidate CAZymes coding genes were identified in the genome of the strain CPCC 401429. Significantly, 199 glycoside hydrolases (GH) dominated in the classes of CAZymes, followed by 145 auxiliary activities proteins (AA), carbohydrate esterases (CE), glycosyl transferases (GT), polysaccharide lyases (PL), and carbohydrate-binding modules (CBM). Compared with the genome of other close Armillaria species, the strain CPCC 401429 had less CAZymes in which the GHs, GTs, and CBMs represented the smallest sets among six Armillaria strains (Table  S2, supporting information I). In addition, as saprotrophic wood-rotting fungi, Armillaria species exhibit higher quantities of CEs and AAs than other Agaricales exemplified as Agaricus bisporus, Pleurotus ostreatus, Coprinopsis cinerea, Schizophyllum commune, Lentinula edodes, and Laccaria bicolor. Interestingly, most CEs and AAs could be functioning as lignin-, cellulose-, pectin-and hemicellulose-degrading enzymes, indicating the strong potential to degrade plant cell wall (PCW) components. Cytochrome P450s, as multifunctional oxygenases, play a key role in adaptation to nutritional niche or secondary metabolites biosynthesis. Based on domain and pfam prediction, a total of 404 cytochrome P450s coding genes were screened in the genome. The number exceeded those of A. gallica 012m (n = 271), A. cepistipes B5 (n = 324) and A. gallica Ar21-2 (n = 330). These P450s might be closely related to the formation of basic metabolism and secondary metabolites biosynthesis in D. tabescens [4]. There are three major bioactive compounds in D. tabescens, exemplified as

Genome Annotation and Bioinformatic Analysis
The data of genomic annotations were summarized in Table S1. During the analysis of GO classification, 9044 genes were annotated with the three main categories: cellular component, molecular function, and biological process, respectively ( Figure 3A). The identified coding proteins associated with molecular function are more than other two groups. In the KEGG pathway annotation, the 8300 predicated genes could be divided into 46 categories based on their functions ( Figure 3B). Among these categories, 94 genes are associated with the metabolism of polyketides and terpenoids, which indicates that D. tabescens was a potential natural resource for polyketides and terpenoids biosynthesis. In the carbohydrate-active enzymes (CAZymes) annotation, 509 candidate CAZymes coding genes were identified in the genome of the strain CPCC 401429. Significantly, 199 glycoside hydrolases (GH) dominated in the classes of CAZymes, followed by 145 auxiliary activities proteins (AA), carbohydrate esterases (CE), glycosyl transferases (GT), polysaccharide lyases (PL), and carbohydrate-binding modules (CBM). Compared with the genome of other close Armillaria species, the strain CPCC 401429 had less CAZymes in which the GHs, GTs, and CBMs represented the smallest sets among six Armillaria strains (Table  S2, supporting information I). In addition, as saprotrophic wood-rotting fungi, Armillaria species exhibit higher quantities of CEs and AAs than other Agaricales exemplified as Agaricus bisporus, Pleurotus ostreatus, Coprinopsis cinerea, Schizophyllum commune, Lentinula edodes, and Laccaria bicolor. Interestingly, most CEs and AAs could be functioning as lignin-, cellulose-, pectin-and hemicellulose-degrading enzymes, indicating the strong potential to degrade plant cell wall (PCW) components. Cytochrome P450s, as multifunctional oxygenases, play a key role in adaptation to nutritional niche or secondary metabolites biosynthesis. Based on domain and pfam prediction, a total of 404 cytochrome P450s coding genes were screened in the genome. The number exceeded those of A. gallica 012m (n = 271), A. cepistipes B5 (n = 324) and A. gallica Ar21-2 (n = 330). These P450s might be closely related to the formation of basic metabolism and secondary metabolites biosynthesis in D. tabescens [4]. There are three major bioactive compounds in D. tabescens, exemplified as melleolides, armillamide, and ergosterol [4]. The synthesis of orsellinatederived melleolides and armillamide are catalyzed by a cascade of redox reactions through the combined action of cytochrome P450s and oxidoreductases, respectively [4]. melleolides, armillamide, and ergosterol [4]. The synthesis of orsellinate-derived melleolides and armillamide are catalyzed by a cascade of redox reactions through the combined action of cytochrome P450s and oxidoreductases, respectively [4].

RNA-Sequencing Based Transcriptome Analysis of the Strain CPCC 401429
To further investigate and evaluate the differently expressed genes (DEGs), we constructed three DEGs libraries to compare YMEG to F1, PDB to YMEG, and PDB to F1 (Figures S1-S3, supporting information I). Overall, there are 10,003, 10,623, and 9895 genes which were transcribed in F1, PDB, and YMEG fermentation broth, respectively. Among the transcribed genes, 9367 genes were commonly shared ( Figure 6A). Additionally, we detected 1721, 1531, and 1616 up-regulated genes, and 1948, 2271, and 2026 down-regulated genes between PDB and YMEG libraries, PDB and YMEG libraries, PDB and F1 libraries, respectively ( Figure 6B, Figures S3 and S4, supporting information I). In organisms, different genes accomplish in a synchronizing mode to perform the biological functions. Pathway-directed analysis supports to further characterize the molecular functions of genes. Functional enrichment analysis was conducted using all DEGs against the GO database to investigate DEGs involved in metabolism and development. The enrichment data supports the hypothesis that the DEGs are approximately related to glycoside metabolism, fruiting body formation, membrane transporters, and secondary metabolites (SMs) production (Tables S3-S6, supporting information I). The latter includes SMs biosynthesis, exemplified as meroterpenoid melleolides, polyketides, and sesquiterpenes. It is worth noting that the DEGs functioning in the basal metabolic process, e.g., glycoside metabolism and transmembrane transport, which could promote the biological process of

RNA-Sequencing Based Transcriptome Analysis of the Strain CPCC 401429
To further investigate and evaluate the differently expressed genes (DEGs), we constructed three DEGs libraries to compare YMEG to F1, PDB to YMEG, and PDB to F1 (Figures S1-S3, supporting information I). Overall, there are 10,003, 10,623, and 9895 genes which were transcribed in F1, PDB, and YMEG fermentation broth, respectively. Among the transcribed genes, 9367 genes were commonly shared ( Figure 6A). Additionally, we detected 1721, 1531, and 1616 up-regulated genes, and 1948, 2271, and 2026 down-regulated genes between PDB and YMEG libraries, PDB and YMEG libraries, PDB and F1 libraries, respectively ( Figures 6B, S3 and S4, supporting information I). In organisms, different genes accomplish in a synchronizing mode to perform the biological functions. Pathway-directed analysis supports to further characterize the molecular functions of genes. Functional enrichment analysis was conducted using all DEGs against the GO database to investigate DEGs involved in metabolism and development. The enrichment data supports the hypothesis that the DEGs are approximately related to glycoside metabolism, fruiting body formation, membrane transporters, and secondary metabolites (SMs) production (Tables S3-S6, supporting information I). The latter includes SMs biosynthesis, exemplified as meroterpenoid melleolides, polyketides, and sesquiterpenes. It is worth noting that the DEGs functioning in the basal metabolic process, e.g., glycoside metabolism and transmembrane transport, which could promote the biological process of secondary metabolism (Tables S3 and S4, Supplementary Information I). Moreover, KEGG pathway analysis indicated that mevalonate and acetyl-CoA and metabolic pathways were also up-regulated, which facilitate the biosynthesis of sesquiterpenes and melleolides. Ten genes associated with GO terms related to sesquiterpene biosynthesis were identified. Among them, one gene DtSTS3 exhibited up-regulation in F1 fermentation broth, which is responsible for melleolides biosynthesis. While five genes (DtSTS4, DtSTS7, DtSTS9, DtSTS11, DtSTS12) and four genes (DtSTS2, DtSTS5, DtSTS8, and DtSTS10) showed up-regulated expression in PDB broth and YMEG broth, respectively ( Figure 6C). We envisage that these transcriptome analysis results would be conductive to the further study of STSs characterization in Desarmillaria or close affinities. secondary metabolism (Tables S3 and S4, Supplementary Information I). Moreover, KEGG pathway analysis indicated that mevalonate and acetyl-CoA and metabolic pathways were also up-regulated, which facilitate the biosynthesis of sesquiterpenes and melleolides. Ten genes associated with GO terms related to sesquiterpene biosynthesis were identified. Among them, one gene DtSTS3 exhibited up-regulation in F1 fermentation broth, which is responsible for melleolides biosynthesis. While five genes (DtSTS4, DtSTS7, DtSTS9, DtSTS11, DtSTS12) and four genes (DtSTS2, DtSTS5, DtSTS8, and DtSTS10) showed up-regulated expression in PDB broth and YMEG broth, respectively ( Figure 6C). We envisage that these transcriptome analysis results would be conductive to the further study of STSs characterization in Desarmillaria or close affinities.

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achiev growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other A millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in l boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performe RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPC 401429. The genome information would be helpful to investigate this unusual fungus. A millaria species encode a large proportion of carbohydrate metabolism genes, and th could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achiev growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other A millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in l boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performe RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPC 401429. The genome information would be helpful to investigate this unusual fungus. A millaria species encode a large proportion of carbohydrate metabolism genes, and th could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23].

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Ar millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in la boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Ar millaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23] α-himachalene

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achiev growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other A millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in la boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performe RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPC 401429. The genome information would be helpful to investigate this unusual fungus. A millaria species encode a large proportion of carbohydrate metabolism genes, and th could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achiev growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Ar millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in la boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Ar millaria species encode a large proportion of carbohydrate metabolism genes, and thi could supply a strategy to bypass competition with other microbes of Armillaria spp. [23] thujopsene-(I2) 43

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23].

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achiev growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Ar millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in la boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performe RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Ar millaria species encode a large proportion of carbohydrate metabolism genes, and thi could supply a strategy to bypass competition with other microbes of Armillaria spp. [

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23].

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achiev growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other A millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in la boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performe RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPC 401429. The genome information would be helpful to investigate this unusual fungus. A millaria species encode a large proportion of carbohydrate metabolism genes, and th could supply a strategy to bypass competition with other microbes of Armillaria spp.

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23].

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Ar millaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in la boratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Ar millaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23]

Discussion
Armillaria spp. are devastating forest pathogens and they forage for hosts and achieve growing dispersal via root-like multicellular rhizomorphs [23]. Compared with other Armillaria species, natural melanized rhizomorphs of D. tabescens are rarely observed in laboratory or nature [25,26]. Herein, we sequenced and investigated the genome, performed RNA sequencing and directed discovery of sesquiterpene synthases in the strain CPCC 401429. The genome information would be helpful to investigate this unusual fungus. Armillaria species encode a large proportion of carbohydrate metabolism genes, and this could supply a strategy to bypass competition with other microbes of Armillaria spp. [23]. Compared to the genomes of other Armillaria fungi, the strain CPCC 401429 had a smaller genomic size ( Table 2). The genomic features of D. tabescens had its own unique characteristics. There were 507 genes associated with carbohydrate metabolism, resulting in less genes associated with GHs, GTs, and CBMs among sequenced Armillaria strains. Slipos et al. (2017) reported that Armillaria species usually include large genomes for white rot saprotrophs evolved mostly by gene family diversification driven by transposable elements (TEs) proliferation [23,53,54]. Likewise, the D. tabescens exhibited relatively small proportion of TEs coding genes (2.89%), which is much lower than Armillaria species including A. solidipes (20.9%), A. gallica (32.6%), and A. cepistipes (34.8%) [23]. Meanwhile, the fungus CPCC 401429 encodes a wide array of genes involved in secondary metabolism of terpenoids and polyketides, which indicates that D. tabescens was a potential resource strain for polyketides and terpenoids production. Genome mining-based analysis of the sequenced Armillaria spp. and closely related species including Guyanagaster necrorhizus MCA 3950 and Floccularia luteovirens C10, demonstrated that melleolides biosynthetic gene cluster is commonly shared [4,55]. This indicated that these melleolide-coding BGCs is an interesting and evolutionary conserved feature of Armillaria genus or closely related species. This might provide crucial defense against predators or other microbes in adapting to a soil-borne lifestyle [23]. Conspicuously, the mel locus in D. tabescens is a shortest, delicate, and tightly clustered BGC responsible for melleolides biosynthesis [4]. Additionally, as to phylogenetic analysis of D. tabescens CPCC 401429, with the results of NR annotation, MLST-based phylogeny, and synteny analysis, we considered that D. tabescens was supposed to belong to the genus of Desarmillaria instead of Armillaria as it used to. It is noteworthy that the D. tabescens species formed a distinct group with G. necrorhizus MCA 3950, suggesting it would support the proposal of reattribution of A. tabescens and A. ectypa species in Armillaria subgenus Desarmillaria [25,26].
Basidiomycetous mushrooms are particularly adept at synthesizing a broad range of structurally diversified sesquiterpenes. Up to now, many Basidiomycetous sesquiterpene synthases have been characterized, exemplified as putative STSs from O. olearius, C. puteana, L. rhinocerotus, and A. aegerita [8][9][10][11]. With the recent revolution in genome-sequencing technology, a large number of Basidiomycetous genomes have been completed. Bioinformatic mining indicates that most genes/clusters of the phylum Basidiomycota are silent or cryptic under laboratory conditions [4,9,10]. The terpenoids and their synthases represent largely untapped "dark matter" that awaits to be characterized [1,2,4]. What are the ecological functions of STSs that play important roles in secondary metabolism during the lifecycle of mushrooms? What is the underlying mechanism that controls the functional evolution of STSs? Are these STSs unique to specific evolutionary lineage (taxa) or widely distributed? To answer these questions, functional characterization of increasing STSs encoded in the fungal genome is essential and indispensable.
As a cutting-edge discipline that drives original breakthroughs in the field of biopharmaceuticals, the rise of synthetic biology can promote drug leads discovery by designing new biosynthetic pathways. Therefore, mining and discovering of biosynthetic elements are urgently needed in synthetic biology technology [10]. Also, bioinformatic analysis and functional prediction of specific STSs would facilitate the screening of ideal candidate synthetic elements or bioprospecting for new sesquiterpenes. Since most Basidiomycetous mushrooms are difficult to manipulate in genetic transformation, S. cerevisiae yeast has emerged as a powerful host cell for production of fungal secondary metabolites and pathways reconstitution [11,56]. To our knowledge, this is the first report to document the endeavor of the yeast expression for mining the sesquiterpene repertoire in Desarmillaria species and Armillaria affinities. Phylogenetic analysis of characterized STSs reinforced that certain group of STSs shares highly conserved sequences, suggesting that they might catalyze the likely cyclization mechanism [9,10]. Thus, phylogenetic analysis might be useful to mine novel terpenoid synthases. In this study, noteworthy, half members STSs from the strain CPCC 401429 were distributed in minor group Clade IV. This group of STSs belongs to a new subfamily and only few STSs have been characterized. Our report on STSs derived from D. tabescens enriches the number of STSs of this subfamily and expands our knowledge of STSs diversity for Desarmillaria or close genus. Yeast cells expressing DtSTS9 and DtSTS10 yielded diverse sesquiterpene molecules, suggesting that this subfamily might be highly promiscuous producers. This highlights the potential of Armillaria in generating novel sesquiterpenoids. Accessing the basidiomycetous pool of STSs can be an immediate source of novel biocatalysts, or yield STSs that could be further specialized by directed evolution or combinatorial engineering. This also lays foundation for downstream in-depth deciphering catalytic mechanism of novel STSs.

Conclusions
In summary, we sequenced the genome of the fungus D. tabescens CPCC 401429 and identified twelve STSs encoding genes. Bioinformatic analysis and MLST-based phylogeny allowed us to support the reclassifying the fungus D. tabescens in Desarmillaria subgenus. Also, the large number of STS genes in the genome highlights the potential of the strain CPCC 401429 in generating varying sesquiterpenoids. In combination with transcriptomic analysis, two new STSs encoding genes were selected for yeast heterologous expression. Based on phylogenetic analysis and GC-MS identification, DtSTS9 and DtSTS10 belonging to the subfamily IV, are novel and very interesting sesquiterpene synthases with high potential for downstream research. This study also demonstrated that yeast based heterologous expression is an attractive approach for accessing the basidiomycete terpenoid chemistry.