Genome-Wide Study of Conidiation-Related Genes in the Aphid-Obligate Fungal Pathogen Conidiobolus obscurus (Entomophthoromycotina)

Fungi in the Entomophthorales order can cause insect disease and epizootics in nature, contributing to biological pest control in agriculture and forestry. Most Entomophthorales have narrow host ranges, limited to the arthropod family level; however, rare genomic information about host-specific fungi has been reported. Conidiation is crucial for entomopathogenic fungi to explore insect resources owing to the important roles of conidia in the infection cycle, such as dispersal, adhesion, germination, and penetration into the host hemocoel. In this study, we analyzed the whole genome sequence of the aphid-obligate pathogen Conidiobolus obscurus strain ARSEF 7217 (Entomophthoromycotina), using Nanopore technology from Biomarker Technologies (Beijing, China). The genome size was 37.6 Mb, and encoded 10,262 predicted genes, wherein 21.3% genes were putatively associated to the pathogen–host interaction. In particular, the serine protease repertoire in C. obscurus exhibited expansions in the trypsin and subtilisin classes, which play vital roles in the fungus’ pathogenicity. Differentially expressed transcriptomic patterns were analyzed in three conidiation stages (pre-conidiation, emerging conidiation, and post-conidiation), and 2915 differentially expressed genes were found to be associated with the conidiation process. Furthermore, a weighted gene co-expression network analysis showed that 772 hub genes in conidiation are mainly involved in insect cuticular component degradation, cell wall/membrane biosynthesis, MAPK signaling pathway, and transcription regulation. Our findings of the genomic and transcriptomic features of C. obscurus help reveal the molecular mechanism of the Entomophthorales pathogenicity, which will contribute to improving fungal applications in pest control.


Introduction
Filamentous fungal species undergo asexual reproduction in their life cycle by the formation and release of asexual spores from aerial hyphae [1,2]. Conidiospores (conidia) are the main asexual fungal spores that favor fungal spread, survival, and colonization in new habitats [3,4]. Pathogenic fungi also use conidia as infectious propagules to recognize and infect host plants and animals [5,6]. Entomopathogenic fungi represent over 1000 species, mainly in the orders Hypocreales (ca. 600 species belonging to the filamentous Ascomycetes) and Entomophthorales (ca. 300 species belonging to the basal fungi in Zoopagomycota), are widely distributed in the environment, and regulate host insect populations by causing mycosis and epizootics [5,7]. These entomopathogens develop millions of differently shaped clonal conidia initiating their infection cycles (conidial adhesion, infection structure differentiation, detoxification, insect hemocoel adaptation, and host nutrient deprivation) [6,8]. Most Entomophthorales are host-specific and have unique 2. Materials and Methods 2.1. Fungal Culture C. obscurus 7217 was obtained from the United States Department of Agriculture-Agricultural Research Service Collection of Entomopathogenic Fungal Cultures (USDA-ARSEF, Ithaca, NY, USA) and stored at −80 • C [22]. The isolate was cultured on rich Sabouraud dextrose agar plus yeast extract (40 g L −1 dextrose, 10 g L −1 peptone, 10 g L −1 yeast extract, and 15 g L −1 agar) for 4 days in Petri dishes at 24 ± 1 • C with a 12:12 h light: dark photoperiod. Culture pieces were mashed and transferred into 50 mL liquid media (150 mL flask) and incubated in a shaker at 120 rpm and 24 ± 1 • C for 3 d. Fresh mycelia (pre-C stage) were collected using a 0.2 µm filter, and evenly poured into a 90 mm Petri dish to form a mycelial mat while removing any excess water using sterile paper. Moist conditions (100% relative humidity) were maintained for 12 h at 24 • C to form the C stage. The mycelial mats maintained for 72 h at the same temperature and humidity represented the post-C stage.

Genome Sequencing, Assembly, and Annotation
Fresh liquid-cultured mycelium was collected and ground into a fine powder in liquid nitrogen for DNA extraction. Total genomic DNA was extracted following the protocol set forth in the TakaRa universal genomic DNA extraction kit (Tokyo, Japan) for sequencing. Genome sequencing was carried out by Biomarker Technologies (Beijing, China) on a nanopore sequencing platform using the standard protocol provided by Oxford Nanopore Technologies [23]. Raw data were deposited in the CNGB Sequence Archive (CNSA) of the China National GenBank Database (CNGBdb, https://db.cngb.org/ accessed on 1 January 2022) under the accession number CNP0001555 [24]. After filtering low quality reads and short reads (<2000 bp), nanopore reads longer than 5 kb were selected for error correction using Canu v1.7.1 [25]. Clean reads obtained from Canu were assembled using wtdbg [26]. The Benchmarking Universal Single-Copy Orthologs (BUSCO) tool v2.0 was used to assess completeness of the final assembly using the fungi_odb9 dataset [27].
Repeat sequences were predicted using RepeatMasker 4.06 [28]. tRNAscan-SE and Infernal 1.1 software were used for tRNA and rRNA identification, respectively [29,30]. Structural annotation was performed using Augustus v2.4 as gene predictor [31]. GeMoMa v1.3.1 was used for homolog-based gene prediction with the genome sequence of C. coronatus [20,32]. RNA-seq data generated in this study (CNGBdb accession number: CNP0002115) were also used as supporting gene evidence during the annotation process.
The functions of the putative protein-coding genes were annotated using the basic local alignment search tool (BLASTx) with an of E-value < 10 −5 . Public databases of NCBI non-redundant protein sequences (NR), NCBI eukaryotic orthologous groups of proteins (KOG), Swiss-Prot, Pfam, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Fungal Transcription Factor Database, and Pathogen-Host Interaction database (PHI, http://www.phi-base.org accessed on 9 February 2021) platforms were used. Secretory proteins were screened based on their structures using the signal peptide predicted by SignalP 5.0 without the membrane-spanning domain [33]. CAZymes were identified as previously described [21].

RNA Extraction and Transcript Assembly
Three biological replicates were sampled from each of the three stages: pre-C, C, and post-C to screen for conidiation-related genes in C. obscurus. Total RNA was extracted using the RNAiso Plus kit (TaKaRa, Tokyo, Japan), and its concentration was measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, New York, NY, USA). RNA degradation and contamination (especially DNA contamination) were monitored on 1% agarose gels and samples were sent to Biomarker Technologies (Beijing, China) for transcript sequencing. Ribosomal RNA was removed from total RNA using a Ribo-Zero rRNA removal kit (Epicenter, Madison, WI, USA). A total of 1.5 µg rRNA-free RNA per sample was used to generate a sequencing library using the NEBNext Ultra Directional RNA library prep kit for Illumina (New England BioLabs, Boston, MA, USA) on an Illumina HiSeq 4000 platform (BGI, Beijing, China) according to the sequencing pipeline (Supplementary Figure S1A). The adapter sequences were trimmed to remove low-quality sequences, and the high-quality clean reads were aligned to the genome of C. obscurus 7217 using hierarchical indexing for spliced alignment of transcripts (HISAT2) [34]. Reads with no more than two mismatches from each sample were used to generate the transcripts using StringTie software 1.3.1 [35]. The workflow of the HISAT-StringTie analysis is shown in Supplementary Figure S1B.

Quantification of Transcripts
Transcript expression was quantified as fragments per kilobase of transcript per million mapped reads (FPKM) using StringTie 1.3.1 [35]. Differential expression analysis of the transcripts between the three libraries was performed using the DESeq R package 1.10.1 [36]. The resulting p-values were adjusted using the Benjamini-Hochberg method (multiple hypothesis test) for controlling the false discovery rate (FDR). Transcripts with an adjusted p-value of less than 0.01 and an absolute value of log2 (fold change, FC) of more than 1 were designated as differentially expressed [34]. Quantitative real-time polymerase chain reaction (qRT-PCR) was performed using designed primers (Supplementary Table S1) to evaluate the relative expression levels of the selected genes for further validation of the findings.

Co-Expression Network Construction
The distinct hub genes of conidiation were established using a weighted gene coexpression network analysis (WGCNA v1.69 package on the BMKCloud platform) [37]. Gene expression values were imported into WGCNA to construct co-expression modules using the automatic network construction function block-wise modules with the default settings. The expression levels of the differentially expressed genes (DEGs) were logtransformed using log2 (FPKM + 1). Pearson's correlation coefficient was used to determine the co-expression relationship between each pair of genes. The WGCNA network was constructed with a soft thresholding power of β = 17, a minimum module size of 30 genes, the TOM-Type was unsigned, and the merge cut height was 0.25. The module-trait relationship was used to differentiate the hub genes among the conidiation stages.

Phylogenetic Analysis
The MEGAX software suite was used to determine the phylogenetic relationships of the genes encoding the serine proteases [38]. The phylogenetic tree was constructed using the maximum likelihood method based on the Poisson correction model, with 500 bootstrap replicates. The protein sequences and information were obtained from the assembled genome of C. obscurus 7217 (CNP0001555).

Comparative Genomic Analysis
The wide-host-range generalist Conidiobolus coronatus (GenBank assembly accession: GCA_001566745.1) [20] was used for comparison with genomic information obtained for the assembled genome of C. obscurus 7217 (CNP0001555). The gene family was constructed and compared using OrthoMCL [39].

Global Characteristics of the C. obscurus Genome
Nanopore sequencing generated a total of 11.9 Gb clean reads, and the net length of C. obscurus genome sequence was 37.6 Mb, with the assembly consisting of 167 scaffolds and a 26.46% G+C content (Table 1 and Supplementary Figure S2). In total, 10,262 proteinencoding genes were predicted, of which 84.08% were homolog-and RNA-seq-based predictions. A total of 587 non-coding RNAs (ncRNAs) in 67 families were identified, including 526 tRNAs, 8 rRNAs, and 49 other ncRNAs. The total length of repeat sequences was 6.6 Mb, representing 17.44% of the genomic length (Supplementary Table S2).

Annotation in PHI and CAZy Databases
PHI-base is a database that catalogues the experimentally verified pathogenicity and effector genes from pathogens of animal, plant, fungal, and insect hosts. In the C. obscurus genome, 2888 genes were associated to the PHI database, 75.9% of which were putatively involved in the fungal pathogenicity. Results revealed that the maximum (1306) number of matched genes was related to the functional class of "reduced virulence", while 695 genes were identified as "unaffected pathogenicity" (Supplementary Figure S3). Based on Figure 3. The KOG classes in the C. obscurus genome are associated with records in the PHI database. The x-axis represents the classes of genes and left y-axis represents the number of genes while the right y-axis represents the number of the PHI-annotated genes.

Annotation in PHI and CAZy Databases
PHI-base is a database that catalogues the experimentally verified pathogenicity and effector genes from pathogens of animal, plant, fungal, and insect hosts. In the C. obscurus genome, 2888 genes were associated to the PHI database, 75.9% of which were putatively involved in the fungal pathogenicity. Results revealed that the maximum (1306) number of matched genes was related to the functional class of "reduced virulence", while 695 genes were identified as "unaffected pathogenicity" (Supplementary Figure S3). Based on KOG classes, many of these PHI-associated genes function in signal transduction mechanisms (224 genes, representing 39.7% in T class), post-translational modification, protein turnover, chaperones (219, 27.8% in O class), lipid transport and metabolism (186, 42.1% in I class), and secondary metabolites biosynthesis, transport, and catabolism (171, 72.2% in Q class) as shown in Figure 3.
The CAZy database describes the families of carbohydrate-active enzymes (CAZymes) that degrade, modify, or create glycosidic bonds. The members of CAZymes in C. obscurus  In the genome, 125 genes were associated with cytochrome P450s. Many (28) of the genes (P450-DIT2, CYP56) putatively involved in conidial wall maturation by catalyzing the oxidation of tyrosine residues in the formation of LL-dityrosine (a precursor of the cell wall) were found. In total, 11 CYP52-coding genes were found, which, putatively together with NADPH cytochrome P450 (CYP505, 6 genes), were involved in the first step in the assimilation of alkanes and fatty acids (the main components in the host epicuticle). Others (Supplementary Figure S5) were predicted to encode enzymes involved in detoxification and secondary metabolite production.

Abundance of Serine Protease-Encoding Genes in C. obscurus
Based on Pfam annotation, the putative peptidase-encoding genes of C. obscurus (328 genes) consisted of serine proteases, metallopeptidases, and cysteine peptidases.

Comparative Genomic Analysis
There are 3538 common gene families between C. obscurus and C. coronatus ( Figure  6A), and the difference in the unique gene families probably contributes to the distinct host ranges of these two fungal pathogens. Based on Pfam annotation, the unique functional genes in C. obscurus are enriched in trypsin, glycosyl hydrolase, peptidase, protein kinase, multicopper oxidase, P450s, transcription factors (fungal Zn(2)-Cys(6) binuclear cluster domain and homeobox domain), and RNA recognition motif ( Figure 6B).

Comparative Genomic Analysis
There are 3538 common gene families between C. obscurus and C. coronatus ( Figure 6A), and the difference in the unique gene families probably contributes to the distinct host ranges of these two fungal pathogens. Based on Pfam annotation, the unique functional genes in C. obscurus are enriched in trypsin, glycosyl hydrolase, peptidase, protein kinase, multicopper oxidase, P450s, transcription factors (fungal Zn(2)-Cys(6) binuclear cluster domain and homeobox domain), and RNA recognition motif ( Figure 6B).

Gene Differential Expression in the three Conidiation Stages
The mycelia before, during, and after conidiation showed obvious morphological differences by changing from full and round hyphae to spherical conidia formation in hyphal tips, and to shriveled hyphae (Figure 7). Approximately 90.7 Gb of clean reads were generated from the nine libraries of C. obscurus mycelia in the three stages, and 62.4-84.3% reads were mapped to the reference genome (Supplementary Table S4  The functions of the DEGs in C vs. pre-C were concentrated in amino sugar and nucleotide sugar metabolism, biosynthesis of amino acids, biosynthesis of antibiotics, carbon metabolism, and starch and sucrose metabolism based on KEGG functional enrichment analysis ( Figure 8A). Similarly, the functions of the DEGs in C vs. post-C were concentrated in biosynthesis of amino acid, 2-oxocarboxylic acid metabolism, biosynthesis of antibiotic, and lysine biosynthesis ( Figure 8B).

Gene Differential Expression in the three Conidiation Stages
The mycelia before, during, and after conidiation showed obvious morphological differences by changing from full and round hyphae to spherical conidia formation in hyphal tips, and to shriveled hyphae ( Figure 7). Approximately 90.7 Gb of clean reads were generated from the nine libraries of C. obscurus mycelia in the three stages, and 62.4-84.3% reads were mapped to the reference genome (Supplementary Table S4

Gene Differential Expression in the three Conidiation Stages
The mycelia before, during, and after conidiation showed obvious morphological differences by changing from full and round hyphae to spherical conidia formation in hyphal tips, and to shriveled hyphae ( Figure 7). Approximately 90.7 Gb of clean reads were generated from the nine libraries of C. obscurus mycelia in the three stages, and 62.4-84.3% reads were mapped to the reference genome (Supplementary Table S4  The functions of the DEGs in C vs. pre-C were concentrated in amino sugar and nucleotide sugar metabolism, biosynthesis of amino acids, biosynthesis of antibiotics, carbon metabolism, and starch and sucrose metabolism based on KEGG functional enrichment analysis ( Figure 8A). Similarly, the functions of the DEGs in C vs. post-C were concentrated in biosynthesis of amino acid, 2-oxocarboxylic acid metabolism, biosynthesis of antibiotic, and lysine biosynthesis ( Figure 8B). The functions of the DEGs in C vs. pre-C were concentrated in amino sugar and nucleotide sugar metabolism, biosynthesis of amino acids, biosynthesis of antibiotics, carbon metabolism, and starch and sucrose metabolism based on KEGG functional enrichment analysis ( Figure 8A). Similarly, the functions of the DEGs in C vs. post-C were concentrated in biosynthesis of amino acid, 2-oxocarboxylic acid metabolism, biosynthesis of antibiotic, and lysine biosynthesis ( Figure 8B). . "Rich factor" refers to the ratio of the number of differentially expressed transcripts to the total number of transcripts for each pathway; the larger the rich factor, the higher the degree of enrichment. The q value indicated the P-value after the multiple hypothesis test corrections ranging from 0 to 1; the closer it is to 0, the more significant the enrichment considering a false discovery rate (FDR) ≤ 0.01 as the threshold.
Genes related to conidiation in C. obscurus were screened based on the criteria of being upregulated in the C stage versus either the pre-C or post-C stages. In total, 2915 DEGs were putatively related to conidiation wherein 918 DEGs were upregulated in both C Figure 8. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of differentially expressed genes (DEGs) in C stage vs. pre-C stage (A) and C stage vs. post-C stage (B). "Rich factor" refers to the ratio of the number of differentially expressed transcripts to the total number of transcripts for each pathway; the larger the rich factor, the higher the degree of enrichment. The q value indicated the P-value after the multiple hypothesis test corrections ranging from 0 to 1; the closer it is to 0, the more significant the enrichment considering a false discovery rate (FDR) ≤ 0.01 as the threshold.
Genes related to conidiation in C. obscurus were screened based on the criteria of being upregulated in the C stage versus either the pre-C or post-C stages. In total, 2915 DEGs were putatively related to conidiation wherein 918 DEGs were upregulated in both C stage vs. pre-C stage and C stage vs. post-C stage (Supplementary Figure S7). The most upregulated DEGs during the conidia-emerging phase included those encoding glycosylphosphatidylinositol (GPI)-anchored protein, tryptophan-rich sensory protein (TspO), polysaccharide deacetylase, lipase, cytochrome P450, homeoprotein, trypsin, subtilisin-like serine proteinase, glycosyl hydrolase, and AphC/TSA antioxidant enzyme ( Table 2). There were only 65 CAZyme-encoding genes upregulated in conidiation, including 14 genes in GH18 family (endochitinase) (Figure 4). A total of 375 DEGs putatively encoding transcription factors in 25 transcription factor families were upregulated in conidiation (Supplementary Figure S8 and Table S5). The relative expression level of the selected genes from the three conidiation stages revealed by qRT-PCR corroborated the data obtained from analyzing the transcriptome (Supplementary Figure S9).

Co-Expression Network Analysis of Conidiation-Related Genes
A WGCNA across all nine samples elucidated the conidiation-related genes. Highly interconnected genes were grouped into three different modules (black, brown, and turquoise) (Figure 9). The black module containing 772 DEGs showed the strongest correlation with conidiation including 580 DEGs upregulated in both C vs. pre-C and C vs. post-C. Most of these DEGs were grouped into GO terms related to cell parts, catalytic activity, and metabolic processes (Supplementary Table S6) and enriched in KEGG pathways involved in oxidative phosphorylation, ribosomes, and MAPK signaling (Supplementary Table S7). The KOG terms were concentrated in "post-translational modification, protein turnover, chaperones", "carbohydrate transport and metabolism", and "cell wall/membrane/envelope biogenesis" (Supplementary Table S8). DEGs in the black module annotated by the PHI database showed that putative virulence factors were mainly associated with diverse secretory proteins, including subtilisin-like serine proteases, trypsin-like serine proteases, lipases, and chitinases (Supplementary Table S9). A heat map using hierarchical clustering analysis of those genes showed that the nine samples localized into three clusters at each of the pre-C, C, and post-C stages wherein C stage DEGs had larger FPKM values than those in the other two stages ( Figure 10). DEGs in the black module annotated by the PHI database showed that putative virulence factors were mainly associated with diverse secretory proteins, including subtilisinlike serine proteases, trypsin-like serine proteases, lipases, and chitinases (Supplementary  Table S9). A heat map using hierarchical clustering analysis of those genes showed that the nine samples localized into three clusters at each of the pre-C, C, and post-C stages wherein C stage DEGs had larger FPKM values than those in the other two stages (Figure 10). Figure 10. Heatmap of the main DEGs from the black modules of Figure 9 among the nine samples localized into three clusters at different stages (pre-C, C, and post-C). Genes with high and low FPKM values are orange and blue, respectively, whereas * refers to putative secretory proteins.

Discussion
The subphylum Entomophthoromycotina (Zoopagomycota) includes arthropodpathogenic fungi, saprophytes, and human pathogens and boasts a large range of genome sizes, with the largest one at 8000 Mb [18]. In our study, the genome size of the aphidobligate pathogen C. obscurus ranges between the 34.4 Mb of the saprobiotic Conidiobolus heterosporus [21] and 39.9 Mb of C. coronatus with a broad host range [20]. The G+C content of C. obscurus is closer to C. coronatus (27.7%) than to C. heterosporus (36.8%), consistent with the low G+C content (<40%) usually observed in the basal fungi of Zoopagomycota [21]. Genome duplication and gene family expansions contribute to speciation and host specialization [18]. In the comparison, the functions of the unique genes and gene families in the C. obscurus genome were concentrated on host cuticular component degradation and transcription regulation (Figure 6), implying the specialization in host-pathogen interaction. A total of 23.1% of the predicted genes of C. obscurus were putatively related to Figure 10. Heatmap of the main DEGs from the black modules of Figure 9 among the nine samples localized into three clusters at different stages (pre-C, C, and post-C). Genes with high and low FPKM values are orange and blue, respectively, whereas * refers to putative secretory proteins.

Discussion
The subphylum Entomophthoromycotina (Zoopagomycota) includes arthropod-pathogenic fungi, saprophytes, and human pathogens and boasts a large range of genome sizes, with the largest one at 8000 Mb [18]. In our study, the genome size of the aphid-obligate pathogen C. obscurus ranges between the 34.4 Mb of the saprobiotic Conidiobolus heterosporus [21] and 39.9 Mb of C. coronatus with a broad host range [20]. The G+C content of C. obscurus is closer to C. coronatus (27.7%) than to C. heterosporus (36.8%), consistent with the low G+C content (<40%) usually observed in the basal fungi of Zoopagomycota [21]. Genome duplication and gene family expansions contribute to speciation and host specialization [18]. In the comparison, the functions of the unique genes and gene families in the C. obscurus genome were concentrated on host cuticular component degradation and transcription regulation (Figure 6), implying the specialization in host-pathogen interaction. A total of 23.1% of the predicted genes of C. obscurus were putatively related to pathogenicity based on PHI annotation, higher than an average of 15% of total proteins as determinants of fungal virulence [6]. Among these, 806 genes were upregulated during conidiation, demonstrating the intricate association of conidiation and fungal pathogenicity.
There are significant functional differences in the CAZymes family members, although the number of CAZymes-encoding genes are comparable in C. obscurus (392) and C. heterosporus (394). The genes in CE12 (pectin acetylesterases) and CBM12 (chitin-binding) families are abundant for the saprobiotic lifestyle of C. heterosporus to degrade leaf molds and plant detritus [21]. In our study, the dominant genes belonged to GH18, GT1, and CE10 families, consistent with the differential nutrient-acquisition strategies, and only 16.6% of CAZymes-encoding genes are involved in conidiation. In entomopathogens, P450s have multiple functions, facilitating penetrating through the host epicuticle, detoxification, and increasing virulence by toxic secondary metabolite production [11,12]. In the C. obscurus genome, the number of P450s-encoding genes (125) are less than those (142) in C. coronatus [20], while 13.6% were upregulated in conidiation (Supplementary Table S10), probably contributing to the fungus' pathogenicity.
The conidia of fungal entomopathogens play multiple roles in dispersal, survival, and exploitation of nutritious resources in new habitats or hosts [5]. Fungal conidiation features morphological changes (tip growth) of conidiophores expanding in surface area and cell volume. This involves degradation, biosynthesis, and assembly (cross-linking) of conidial wall and membrane components such as glucan, chitin, mannans, glycoproteins, and ergosterol [40]. For example, glycosyl hydrolase and 1,3-β-glucan synthase are involved in cell wall remodeling and regulating cell wall thickness [41]. In our study, many genes encoding glycosidases, glucan synthases, and chitin synthases were upregulated (Supplementary Table S8). The gene encoding GPI-anchored protein, which is an abundant constituent of the eukaryotic cell surface and played a pathogenic role in fungal infection, was markedly upregulated during conidiation in this study [42]. This explains the dynamic transport of amino acids, lipids, carbohydrates, and iron in the conidiation process.
The Entomophthorales conidia are coated with a thick film of mucinous fluid that contain secreted enzymes, facilitating their adhesion and subsequent infection of the hosts [43]. Many omics studies on the entomopathogenicity in Entomophthoromycotina revealed core fungal virulence factors are secreted enzymes, including subtilisin-and trypsin-like serine proteases, chitinases, and lipases [14,44,45]. Most of the sequenced fungal lineages encode a set of 13-16 serine protease families that perform a variety of functions [46]. C. obscurus has only 10 families of serine proteases but exhibits the gene expansions of trypsin and subtilisin families. Diversifying the subtilisin-like proteases is an advanced evolutionary tactic to enhance pathogenicity in the wide-spectrum entomopathogenic fungi, such as Beauveria bassiana and Metarhizium anisopliae [10,47]. The proteinase-K-like fungal S8 is the principal subtilisin-like serine proteases in Entomophthoromycotina that exhibits remarkably high host-specificity for insects in pathogenic fungi, consistent with C. obscurus [9]. Additionally, the gene number (52) of these subtilisin-like proteases is significantly higher in C. obscurus than in C. coronatus (36), C. thromboides (18), and Entomophthora muscae (22). Most genes encoding the S1 family (chymotrypsin) in C. obscurus are also in the S1A subfamily (trypsin) for extracellular degradation. Many subtilisin (S8)-and trypsin (S1A)-encoding genes were upregulated during conidiation, depicting their precise correlation with the fungal pathogenicity.
Conidia are aerial-born spores that have adapted to harsh conditions such as oxidative stress and UV radiation. Several DEGs related to environmental stress responses during conidiation were observed in this study. For example, five putative genes encoding peroxiredoxin in AhpC/TSA family may participate in the defense against oxidative damage response, probably comprising a strong antioxidant machinery to collapse the host immune system, favoring the fungus exploring the nutrients in the host hemolymph [48]. Nine genes encoding tryptophan-rich sensory protein (TSPO), the translocator proteins which play pivotal role in transporting cholesterol inside mitochondria, were substantially upregulated. These genes may be involved in the cellular response to environmental changes in oxygen and light conditions [49]. Two upregulated DEGs (EVM0003905 and EVM0001948) homologous to ATP-dependent DNA helicase may be involved in DNA repair following UV-induced damage [50].
The gene expression patterns involved in conidiation require transcriptional regulation by regulatory proteins and components of signal transduction pathways [2]. Previous studies have revealed the involvements of homeobox genes in conidiation, which are evolutionarily conserved in eukaryotes [51,52]. This study revealed various genes encoding diverse families of transcription factors that were upregulated during conidiation. Several such genes encode homeoproteins (Supplementary Table S5 and Figure S8). Moreover, we also observed 15 genes encoding the components of MAPK signaling pathway that were significantly upregulated, suggesting an essential role of MAPK pathway in the conidiation process [53]. For example, EVM0001653 is putatively related to the plasma membrane osmosensors that activate the high osmolarity glycerol (HOG) MAPK signaling pathway, which may favor the fungal adaptation to the osmotic pressure of the host hemolymph and appressorium development [53,54]. Moreover, many transcription factors are encoded by the unique genes in the C. obscurus genome (Figure 6), implying that the fungal conidiation process is suitable for its unique ecological niche.

Conclusions
Our present study provides valuable information about the genomic features of the aphid-obligate pathogen C. obscurus and enhances understanding of the unique characteristics of Entomophthoromycotina, considering that most species in this subphylum are obligate pathogens of arthropods and there are few reported genomes. Based on the theoretical support of our whole-genome sequencing, the transcriptomic profiling provides new insights into the molecular mechanism of the conidiation process, showing that the abundance of transcripts with diverse functions is the basis for the multiple roles of conidia in mycosis. These results also provided a basis for further understanding the gene network for the infection cycle of entomopathogenic fungi and reference information for studying other related fungi, contributing to the further application of Entomophthoromycotina in agricultural and forestry pest control.
Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/jof8040389/s1, The online version contains supplementary material: Figure S1: mRNA sequencing experimental pipeline (A) and the workflow of transcript assembly and statistics (B), Figure S2: Characteristics of the de novo genome assembly, Figure S3: PHI-annotated categories of the predicted protein-encoding genes of the Conidiobolus obscurus genome. Figure S4: CAZyme-encoding genes in Conidiobolus obscurus, Figure S5: Putative functions and gene number of the main P450 classes, Figure S6: MA plots of DEGs in C vs. Pre-C and C vs. Post-C, Figure S7: Classification of upregulated genes in C stage vs. pre-C stage or C stage vs. post-C stage, Figure S8: Transcription factor families, Figure S9: Validation of four differentially expressed transcripts at pre-C, C, and post-C stages of C. obscurus using RT-qPCR, Table S1: The designed primers for qPCR, Table S2: Repeative sequence statistic of Conidiobolus obscurus, Table S3: Gene number of the protein-encoding genes annotated in different databases, Table S4: The sequencing data and alignment statistics, Table S5: Differential expression of selected genes putatively related to transcription factors (TF) during conidiation, Table S6: The DEGs associated with the main GO terms, Table S7: The DEGs associated with the main KEGG pathway, Table S8: Differential expression of selected genes related to cell wall/membrane/envelope biogenesis during conidiation, Table S9: The DEGs associated with PHI database, Table S10: Differential expression of selected genes related to secondary metabolite biosynthesis, transport, and catabolism during conidiation.