De Novo RNA Sequencing and Transcriptome Analysis of Sclerotium rolfsii Gene Expression during Sclerotium Development

Sclerotium rolfsii is a destructive soil-borne fungal pathogen that causes stem rot in cultivated plants. However, little is known about the genetic basis of sclerotium development. In this study, we conducted de novo sequencing of genes from three different stages of S. rolfsii (mycelia, early sclerotium formation, and late sclerotium formation) using Illumina HiSeqTM 4000. We then determined differentially expressed genes (DEGs) across the three stages and annotated gene functions. STEM and weighted gene-co-expression network analysis were used to cluster DEGs with similar expression patterns. Our analysis yielded an average of 25,957,621 clean reads per sample (22,913,500–28,988,848). We identified 8929, 8453, and 3744 DEGs between sclerotium developmental stages 1 versus 2, 1 versus 3, and 2 versus 3, respectively. Additionally, four significantly altered gene expression profiles involved 220 genes related to sclerotium formation, and two modules were positively correlated with early and late sclerotium formation. These results were supported by the outcomes of qPCR and RNA-sequencing conducted on six genes. This is the first study to provide a gene expression map during sclerotial development in S. rolfsii, which can be used to reduce the re-infection ability of this pathogen and provide new insights into the scientific prevention and control of the disease. This study also provides a useful resource for further research on the genomics of S. rolfsii.


Introduction
Sclerotium rolfsii Sacc. is a major soil-borne pathogen with a worldwide distribution that infects over 500 species of vegetables, grains, Chinese herbal medicines, and ornamental crops in warm-temperate, subtropical, and tropical regions [1].This pathogen is the source of multiple diseases, including seedling damping-off, crown and root rot, and dry rot canker [2], which are commonly referred to as southern blight.The recent influence of climate warming, high-density planting, successive cropping obstacles, and other factors have led to a continued increase in southern blight in various crop-producing areas, with a resulting yield loss of 10-80% and serious economic losses [3].Moreover, new hosts such as mung bean, Bletilla orchid, and Artocarpus heterophyllus continue to be detected [4][5][6].Southern blight is especially difficult to control because S. rolfsii produces melanin-containing, thick-walled, resistant bodies that survive in the soil for over four years [7].
S. rolfsii predominantly overwinters in the soil as sclerotium and as mycelium on diseased plant residues.High humidity and warm conditions in the following year allow the sclerotium to germinate and grow mycelium, which spreads through cracks in the soil to neighboring plants [2].The mycelium directly invades the host from the wound or the epidermis of the plant root or stem, causing plant disease and rot.Mature sclerotium on the infected strain can then be spread via rain, insects, and farm operations, causing re-infection [8].Sclerotium is the core structure of the life history and infection of S. rolfsii; therefore, clarifying its developmental mechanism will not only elucidate the developmental biology of sclerotium but also provide a theoretical and molecular basis for the further analysis of its pathogenic mechanism.Sclerotia are the asexual structures formed from the aggregation of fungal mycelia.The biogenesis of sclerotia is related to lipid peroxidation and induced by oxidative stress as a pathogen consumes carbon sources [9].Xing et al. [10] reported that an increase in oxalic acid levels inhibits sclerotial initiation at the collar region of the host.In addition, sclerotium exudates directly affect the development and maturation of sclerotium [11].Scholars have also studied the effects of air, light, and soil on the size, quantity, and germination rate of sclerotium [12].Numerous environmental factors induce sclerotial development, including cold, drought, low nutrition, oxidative stress, hyphal damage, and pH imbalance [13,14].However, the genes that control these processes and that are differentially expressed between the filamentous mycelium and sclerotia of S. rolfsii remain unknown.
Next-generation sequencing technologies such as Illumina sequencing are now commonly applied in comprehensive transcriptional studies.Illumina sequencing, which is cost-effective and has a higher output than traditional methods, yields highly accurate measurements of gene expression during transcriptome analysis because the sequencing reads can be counted and mapped to a genome or annotated transcripts [15][16][17], facilitating the genome-wide identification of coding sequences, gene structures, and alternative splicing.The transcriptomes of several fungi have already been sequenced (e.g., Aspergillus flavus [18], Rhizoctonia solani [19], and Wolfiporia cocos [20]); however, these data are not available for S. rolfsii, and very few studies have investigated sclerotia formation in S. rolfsii [21].
Therefore, to reveal the molecular mechanisms of sclerotial development in S. rolfsii, we used Illumina HiSeq TM 4000 for transcriptome analysis at three stages of sclerotium development along with RNA sequencing and de novo assembly.The results reveal the developmental-stage-specific gene expression patterns of sclerotia, as well as multiple differentially expressed genes associated with the development process of S. rolfsii.Moreover, bioinformatic analysis was used to analyze the gene expression profiles and related gene modules exhibiting significant changes during sclerotial development.These genes reveal the major metabolic processes and signaling pathways during sclerotial hardening in S. rolfsii.This study provides valuable insights into the regulatory mechanisms that control sclerotial development, which can be used to design new control strategies for southern blight.

Strains and Specimen Collection
S. rolfsii strain CB1 was obtained from the Macleaya cordata nursery at Hunan Agricultural University, Changsha, China [22].Mycelia were grown on a cellophane membrane placed on potato dextrose agar medium at 28 • C for three days (stage 1, S1) before collection.Sclerotia were also collected after 10 (stage 2, S2) and 30 days (stage 3, S3) (Figure 1).Samples were then immediately frozen in liquid nitrogen for total RNA extraction.Each stage had three biological replicates.

RNA Extraction
Total RNA from S1, S2, and S3 was extracted using a TRIzol reagent kit (Life Technologies Inc., Carlsbad, CA, USA).Degradation and contamination of RNA were verified using 1% RNase-free agarose gel electrophoresis, and the purity and integrity were determined using a Nanodrop 2000 (Thermo Fisher Scientific, Waltham, MA, USA) and 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).High-quality RNA (RNA Integrity Number [RIN] scores > 7.5) was used for subsequent experiments, with three biological replicates for each stage.
Genes 2023, 14, x FOR PEER REVIEW 3 of 16 (Figure 1).Samples were then immediately frozen in liquid nitrogen for total RNA extraction.Each stage had three biological replicates.

RNA Extraction
Total RNA from S1, S2, and S3 was extracted using a TRIzol reagent kit (Life Technologies Inc., Carlsbad, USA).Degradation and contamination of RNA were verified using 1% RNase-free agarose gel electrophoresis, and the purity and integrity were determined using a Nanodrop 2000 (Thermo Fisher Scientific, Waltham, USA) and 2100 Bioanalyzer (Agilent Technologies, California, USA).High-quality RNA (RNA Integrity Number [RIN] scores > 7.5) was used for subsequent experiments, with three biological replicates for each stage.

Library Construction and Sequencing
Sample mRNA was enriched using oligo (dT) beads, treated with fragmentation buffer, and reverse-transcribed into cDNA using random primers.Second-strand cDNA was synthesized using DNA polymerase I, RNase H, dNTP, and a buffer.Next, cDNA was purified using 1.8× Agencourt AMPure XP Beads, end-repaired, affixed with poly(A) tails, and ligated to Illumina sequencing adapters.Ligation products were size-selected using agarose gel electrophoresis, PCR-amplified, and sequenced in an Illumina HiSeq TM 4000 by Gene Denovo Biotechnology Co. (Guangzhou, China).

De Novo Assembly
The transcriptome was assembled in the modular program Trinity [23], which includes three components: Inchworm, Chrysalis, and Butterfly.Inchworm assembles reads using a greedy k-mer-based approach, resulting in a collection of linear contigs.Chrysalis then clusters related contigs that correspond to sections of alternatively spliced transcripts or otherwise unique regions of paralogous genes and subsequently builds de Bruijn graphs for each cluster.Finally, Butterfly analyzes the paths taken by reads and read pairings in the corresponding de Bruijn graph and then outputs one linear sequence for each alternatively spliced isoform and transcript derived from paralogous genes.

Library Construction and Sequencing
Sample mRNA was enriched using oligo (dT) beads, treated with fragmentation buffer, and reverse-transcribed into cDNA using random primers.Second-strand cDNA was synthesized using DNA polymerase I, RNase H, dNTP, and a buffer.Next, cDNA was purified using 1.8× Agencourt AMPure XP Beads, end-repaired, affixed with poly(A) tails, and ligated to Illumina sequencing adapters.Ligation products were size-selected using agarose gel electrophoresis, PCR-amplified, and sequenced in an Illumina HiSeq TM 4000 by Gene Denovo Biotechnology Co. (Guangzhou, China).

De Novo Assembly
The transcriptome was assembled in the modular program Trinity [23], which includes three components: Inchworm, Chrysalis, and Butterfly.Inchworm assembles reads using a greedy k-mer-based approach, resulting in a collection of linear contigs.Chrysalis then clusters related contigs that correspond to sections of alternatively spliced transcripts or otherwise unique regions of paralogous genes and subsequently builds de Bruijn graphs for each cluster.Finally, Butterfly analyzes the paths taken by reads and read pairings in the corresponding de Bruijn graph and then outputs one linear sequence for each alternatively spliced isoform and transcript derived from paralogous genes.

Gene Quantification
Gene abundance was calculated and normalized to reads per kilobase per million reads (RPKM) [24] using the following formula: RPKM = (106C)/(NL/103); here, C is the number of reads uniquely mapped to a given unigene, N is the total number of reads mapped to all unigenes, and L is the length in bases of the unigene.The reads of each unigene belonging to the same pathway were summed.For selected pathways, the number of reads per stage was transformed into Z-scores, clustered, and plotted in the R package Heatmap 4.2 (http://www.r-project.org/(accessed on 13 March 2023)).

Analysis of Differentially Expressed Genes
The R package "edge" was used to identify differentially expressed genes (DEGs) based on fold change (FC) ≥ 2 and false discovery rate < 0.05.Significant DEGs were then subjected to GO and KEGG enrichment analyses.Significantly enriched terms were those with a corrected p < 0.05.Next, principal component analysis was performed on the transcriptomes of samples across the three sclerotial developmental stages (S1, 2, and 3) using the R package "psych" with standard settings, and a heatmap was generated in the "NMF" package.Different samples were compared using the COR function in the "stats" R package; Pearson's correlations were reported only for the accuracy of observations.After normalizing (log 2 transformation) expression data, DEGs (|log 2 FC| > 1) were clustered using STEM [25] to obtain expression profiles related to sclerotium formation.The threshold cluster number and correlation coefficient were set to 20 and 0.7, respectively.Significance was set at p < 0.05.

Weighted Gene Co-Expression Network Analysis
Genes with standardized RPKM ≥ 1 were used for co-expression analysis.Coexpression modules were constructed using the weighted gene co-expression network analysis version 1.47 package in R version 4.2 [26], and genes with similar expression patterns were assigned to the same module.Biological modules were classified based on gene expression levels and module eigengene (ME) similarity.Parameters for module classification were set as follows: minimum power = 5, ME = 0.65, and minModuleSize = 50.Pearson correlation coefficients (r) were calculated between MEs.Modules were associated with sclerotium formation if r > 0.5 and p < 0.05.Gene function was then further explored using GO and KEGG analyses.Lastly, intramodular connectivity between genes and intermodular correlations was calculated via weighted gene co-expression network analysis.

Quantitative Real-Time PCR (qPCR) Validation of mRNA Expression
To verify the sequencing accuracy, 10 DEGs were randomly selected for qPCR, with α-tubulin serving as the reference gene.Specific primers (Table S1) were designed using Primer 5 based on sequenced transcripts.The thermocycling profile was as follows: 95 • C for 2 min, followed by 40 cycles at 95 • C for 10 s and 60 • C for 30 s. Relative expression was calculated using 2 −∆∆Ct .Values for each biological replicate were calculated using three technical replicates.
Lastly, KEGG analysis demonstrated that amino acid biosynthesis was the top enriched metabolic pathway in both S2-vs-S1 and S3-Vvs-S1 (Figure 6 and Tables S12 and  S13).In contrast, the top enriched pathway in S3-vs-S2 was starch and sucrose metabolism (Figure 6 and Table S14), indicating similarly enriched pathways in the S2 and S3 samples.

Correlation between Modules and Identification of Key Modules
We generated three final modules after merging similar modules (Figure S2), the first of which contained 5377 genes, the second contained 1286 genes, and the third contained 796 genes.The results of the interaction analysis and the heatmap revealed that the modules were independent, demonstrating unrelated gene expression within each module (Figure 8A).We also identified and clustered eigengenes to explore co-expression across all modules.This analysis revealed three modules divided into two clusters, a result that was consistent with the eigengene-based heatmap (Figure 8).Furthermore, the MEs of the second and third modules were both highly correlated with sclerotium formation (Figure

Correlation between Modules and Identification of Key Modules
We generated three final modules after merging similar modules (Figure S2), the first of which contained 5377 genes, the second contained 1286 genes, and the third contained 796 genes.The results of the interaction analysis and the heatmap revealed that the modules were independent, demonstrating unrelated gene expression within each module (Figure 8A).We also identified and clustered eigengenes to explore co-expression across all modules.This analysis revealed three modules divided into two clusters, a result that was consistent with the eigengene-based heatmap (Figure 8).Furthermore, the MEs of the second and third modules were both highly correlated with sclerotium formation (Figure 8B), specifically the late and early stages, respectively.In contrast, the first module was positively correlated with mycelium formation.

Validation of Differential Expression Profiles using qPCR
The qPCR results validated the RNA-Seq data for two upregulated (ALDH3A1 and con-6) and four downregulated (HMOX1, SPCC1494.01,DUG3, and PSD2) genes.The results showed consistent differences between qPCR and RNA-Seq (Figure 9).The observed similarity between the qPCR and RNA-Seq data supports the validity of our results.

Validation of Differential Expression Profiles Using qPCR
The qPCR results validated the RNA-Seq data for two upregulated (ALDH3A1 and con-6) and four downregulated (HMOX1, SPCC1494.01,DUG3, and PSD2) genes.The results showed consistent differences between qPCR and RNA-Seq (Figure 9).The observed similarity between the qPCR and RNA-Seq data supports the validity of our results.

Discussion
In this study, we successfully identified the potential genetic mechanisms underlying the pathology of S. rolfsii, which causes southern blight disease in numerous crops.Previously, Song et al. [27] cultured S. rolfsii in sucrose and glucose media to investigate global

Discussion
In this study, we successfully identified the potential genetic mechanisms underlying the pathology of S. rolfsii, which causes southern blight disease in numerous crops.Previously, Song et al. [27] cultured S. rolfsii in sucrose and glucose media to investigate global metabolic and genetic changes using LC-MS combined with RNA-Seq.Furthermore, Schmid applied massively parallel short-read 454 pyrosequencing to identify DEGs under scleroglucan-producing and non-producing conditions [21].However, our study is the first to provide de novo comparative transcriptome data related to sclerotial development in this pathogenic species, analyzing transcriptomic variation across the mycelium, early sclerotium, and mature sclerotium stages.Transcriptome sequencing has previously been widely applied to investigate the morphological development pathogenic bacteria [15].For example, Yu et al. [28] used transcriptomics to compare gene expression differences in the fruit body development and sporulation stages of Ustilaginoidea virens, thereby clarifying sexual reproduction mechanisms.
Here, we identified key genes in sclerotium biosynthesis through the Illumina sequencing of the S. rolfsii transcriptome.We successfully annotated 15,322 (54.41%) unigenes and performed functional analyses (GO, KEGG, and KOG).Furthermore, we performed DEG analysis across the three stages of sclerotial development and identified thousands of upregulated and downregulated genes between S1-vs-S2, S1-vs-S3, and S2-vs-S3.Overall, this study provides a novel perspective based on transcriptome sequences of developmental stages and annotated gene functions, which can aid further research on sclerotium formation in S. rolfsii.Functional analysis indicated that amino acid biosynthesis was the most enriched metabolic pathway between S2-vs-S1 and S3-vs-S1, and this result is consistent with the significant enrichment of multiple amino acid pathways during sclerotium formation in Botrytis cinerea [29].Hence, amino acid metabolism appears to play an important role in sclerotia formation.S3 is similar to S2 in that starch and sucrose metabolism are the most enriched pathways.Wu et al. [20] demonstrated that fewer transcriptional changes occurred during earlier developmental stages and that molecular mechanisms became increasingly sophisticated with greater sclerotium maturity.Therefore, future studies should further examine DEGs at different sclerotium developmental stages to elucidate the biological mechanisms involved.
Numerous studies have shown that reactive oxygen species are necessary for sclerotium initiation and are accompanied by increased lipid peroxidation levels [30].As the superoxidized state is harmful to cells, the internal defense mechanisms of organisms have evolved to produce antioxidant molecules such as glutathione, superoxide dismutase, catalase, and glutathione peroxidase, which neutralize the accumulation of reactive oxygen species [31].The expression level of NADPH-related unigenes in S. rolfsii was higher in S2 and S3; this finding is consistent with the results reported by Schurmann et al. [32] on the sclerotial formation of Botrytis cinerea.In addition, unigenes involved in autophagy were consistently highly expressed in S2 and S3.Autophagy is a highly conserved cellular degradation process in eukaryotes that is typically maintained at very low levels [33].However, autophagy is induced under stress conditions and can reduce reactive oxygen species levels, thereby accelerating protein elimination [34].Moreover, other stress-regulatory proteins, such as heat shock protein and redox enzyme FAD/NAD-binding protein, were also highly expressed in S2.To protect an organism from harmful reactive oxygen species and regulate the balance between cell survival and death, heat shock proteins may be produced [35].This study shows that S. rolfsii responds to oxidative stress through different pathways during sclerotium development.
Signal transduction pathways are essential in sclerotium formation.The cAMP signaling pathway mediated by G protein-coupled receptors is an important signal transduction pathway.It plays an important role in fungal sclerotial formation [36].In this study, the expression of G protein-coupled receptor family A increased at S2, indicating a close relationship between cAMP and sclerotium formation in S. rolfsii.Kinases and phosphokinases influence sclerotium formation by controlling phosphorylation and dephosphorylation in signaling pathways.The expression of kinase-like proteins in S. rolfsii increases in S2, which is consistent with the results of Harel et al. [37].In addition, carbonic anhydrase, carbohydrate-binding module, and oxalate decarboxylase showed a continuous increasing trend throughout the three stages.Carbonic anhydrase superfamilies are prevalent in all organisms and are essential for fungal pathogen perception and controlling sexual development [38].Recent findings also highlight the many functions of CBM32, suggesting its direct involvement in the pathogenesis of S. rolfsii [39].In summary, the results showed that sclerotial development is regulated by many factors and involves complex signal transduction pathways, which should be further explored to provide a new target for inhibiting sclerotial development.
Our study confirms the gene expression patterns and stage-specific transcriptomic changes in sclerotial development.Furthermore, unigenes, which are differentially regulated during sclerotia formation, are involved in the biosynthesis of secondary metabolites, autophagy, and active oxygen metabolism.The annotated unigenes in this study will provide useful information for disease control strategies designed to prevent sclerotium formation.

Conclusions
To the best of our knowledge, this is the first study to sequence the transcriptome of S. rolfsii at three different developmental time points.By identifying the key genes in sclerotium biosynthesis, this study elucidates the expression patterns at different stages of sclerotium maturation.Furthermore, the functional annotation of these genes offered insights into the molecular mechanisms underlying sclerotium development in S. rolfsii.In conclusion, our study represents an important contribution to S. rolfsii pest management efforts by revealing the target genes that may reduce pathogenicity.

Figure 2 .
Figure 2. GO enrichment analysis of the Sclerotium rolfsii transcriptome.Results are g biological process, cellular component, and molecular function.Y-axis represents th genes in a category.

Figure 2 .
Figure 2. GO enrichment analysis of the Sclerotium rolfsii transcriptome.Results are grouped into biological process, cellular component, and molecular function.Y-axis represents the number of genes in a category.

es 2023 ,
14, x FOR PEER REVIEW 8 of

Figure 4 .
Figure 4. Principal component analysis (A), cluster analysis (B), and correlation analysis (C) resu of DEGs at different sclerotial developmental stages.

Figure 4 .
Figure 4. Principal component analysis (A), cluster analysis (B), and correlation analysis (C) results of DEGs at different sclerotial developmental stages.

REVIEW 11 of 16 Figure 7 .
Figure 7. STEM analysis of all DEGs based on sclerotium formation time series data.(A) Ten expression profiles; (B) expression trends for genes in profile 11; (C) expression trends for genes in profile 14; (D) expression trends for genes in profile 15.Each line in the figure B,C, and D represents one gene.

Figure 7 .
Figure 7. STEM analysis of all DEGs based on sclerotium formation time series data.(A) Ten expression profiles; (B) expression trends for genes in profile 11; (C) expression trends for genes in profile 14; (D) expression trends for genes in profile 15.Each line in the figure (B-D) represents one gene.

Figure 8 .
Figure 8. (A) The cluster dendrogram of gene in Sclerotium rolfsii.Each branch in the figure represents one gene, and every color below represents one co-expression module.(B) Eigengene dendrogram and eigengene adjacency heatmap.Every color below represents one co-expression module.

Figure 8 .
Figure 8. (A) The cluster dendrogram of gene in Sclerotium rolfsii.Each branch in the figure represents one gene, and every color below represents one co-expression module.(B) Eigengene dendrogram and eigengene adjacency heatmap.Every color below represents one co-expression module.

Genes 2023 , 16 Figure 9 .
Figure 9. qPCR analysis of six genes used for the validation of RNA-Seq data.

Figure 9 .
Figure 9. qPCR analysis of six genes used for the validation of RNA-Seq data.

Table 1 .
Summary of transcriptome assembly after filtering.

Table 2 .
Summary of de novo assembly of unigenes.

Table 3 .
All-in-one list of annotations.