Article

Genome Sequencing and Comparative Genomics of the Hyper-Cellulolytic Fungus Talaromyces pinophilus Y117

Institute of Microbiology, Jiangxi Academy of Sciences, Nanchang 330095, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
J. Fungi 2025, 11(9), 614; https://doi.org/10.3390/jof11090614
Submission received: 19 July 2025 / Revised: 18 August 2025 / Accepted: 19 August 2025 / Published: 22 August 2025

Abstract

Talaromyces pinophilus is a filamentous fungus with notable lignocellulose-degrading capacity based on enzyme activities and protein secretion potential, making it a compelling candidate for industrial biotechnology applications. In this study, we present the genomic characterization of the highly cellulolytic strain Y117, a domesticated variant of T. pinophilus, based on whole-genome sequencing and comparative genomic analysis with eleven related strains. Comprehensive analysis of CAZymes, transcription factors, and secondary metabolite diversity in T. pinophilus strains revealed that the exceptional lignocellulose degradation capacity of Y117 is driven by its unique genomic architecture. Key genomic features that distinguish Y117 include (1) significant expansion of glycoside hydrolase (GH) and carbohydrate-binding module (CBM) families, (2) loss of fungal-RiPP-like clusters, and (3) absence of the developmental regulator BrlA. These genomic adaptations could indicate a metabolic trade-off favoring hydrolytic enzyme production over secondary metabolism and sporulation. Our findings provide fundamental insights into fungal lignocellulose degradation mechanisms while establishing Y117 as a promising chassis for metabolic engineering applications in industrial enzyme production and heterologous protein expression.

1. Introduction

Cellulosic biomass is an abundant, renewable, and underutilized global carbon source [1,2]. It contains a large quantity of sugars in the form of cellulose and hemicellulose and thus holds significant potential for sustainable bioenergy production. Efficient conversion of cellulose and hemicellulose into fermentable sugars is the key to fully unlocking this energy potential. Some filamentous fungi degrade wood by secreting large amounts of extracellular cellulases and hemicellulases that hydrolyze lignocellulosic biomass to fermentable sugars [3]; notably, Trichoderma reesei is the leading cellulase producer for biorefinery applications [4]. However, several challenges, such as improving the efficiency of biomass pre-treatment, enzymatic hydrolysis, and conversion technologies, need to be addressed to optimize the utilization of lignocellulosic biomass [5,6]. Ongoing efforts aim to turn cellulosic biomass into a cornerstone of the renewable energy sector [7,8].
Talaromyces pinophilus (T. pinophilus), formerly designated as Penicillium pinophilum, is a filamentous fungus belonging to the Ascomycota phylum, widely recognized for its ability to degrade lignocellulosic biomass and produce valuable enzymes [9,10]. The fungus thrives in diverse environments and exhibits robust growth under varying conditions, making it a versatile candidate for industrial use. T. pinophilus Y-94, isolated in 1982 from soil in the Tohoku region, Japan, secretes large quantities of cellulolytic and hemicellulolytic enzymes [11]. In the pursuit of more economical cellulolytic enzyme production, successive rounds of random mutagenesis were applied to the wild-type strain Y-94 to screen for hyper-producing mutants. Among all mutants, CF-2612 emerged as the most efficient cellulase producer, exhibiting a β-glucosidase secretion capacity that surpassed even that of Trichoderma reesei [12,13]. Following these discoveries, researchers have focused on identifying the regulatory mechanisms underlying cellulase expression to further improve productivity [14,15]. Strategic genetic modifications, including the deployment of enhanced promoter systems and productivity-related transcription factors, have been applied to maximize enzyme yields and functional efficiency for industrial applications [16]. Additionally, T. pinophilus exhibits remarkable potential as a cellular factory for the production of various bioproducts, including thermophilic proteins, owing to its robust protein secretion capacity [17,18].
Despite advances in T. pinophilus genetic modification, limitations such as low enzymatic efficiency and high production costs remain, hindering its industrial application in the degradation of lignocellulosic biomass [19]. To address these challenges, in-depth genetic investigations are essential to elucidate the comprehensive mechanisms of cellulose degradation in T. pinophilus, thereby facilitating its industrial utilization [20]. Although the complete genome of T. pinophilus strain 1-95 has been sequenced and its degradation mechanisms have been characterized, this strain’s limited cellulase production makes it unsuitable as a model organism [21]. Moreover, the existing research lacks systematic comparative genomic analyses among different T. pinophilus strains, which could provide critical insights into strain-specific degradation capabilities and help identify superior enzyme producers for industrial applications.
Our laboratory isolated a wild-type strain from Jiangxi Province soil samples that demonstrated exceptional lignocellulose-degrading capability with high hydrolysis efficiency against various biomass substrates, including corn cob, rice straw, and ramie. T. pinophilus Y117, a mutant strain derived from the wild-type strain through random mutagenesis (Figure 1), was shown to produce a cellulase mixture with higher filter paperase (FPase), Avicelase, and β-glucosidase activity compared to previously reported strains, representing a valuable resource for large-scale production of bioenergy and biobased products [Table S1]. In this study, we performed whole-genome sequencing of the hyper-cellulolytic T. pinophilus strain Y117 and conducted systematic comparative genomic analysis with other T. pinophilus strains. This investigation was designed to (1) identify key genetic determinants associated with lignocellulose degradation, (2) elucidate strain-specific variations in degradation capabilities, and (3) uncover the molecular mechanisms underlying its superior cellulose-degrading efficiency. Our findings not only contribute to advancing the knowledge of the mechanistic basis of lignocellulose degradation by filamentous fungi but also establish a genetic framework for engineering T. pinophilus as a robust microbial cell factory for enhanced heterologous protein production.

2. Materials and Methods

2.1. Isolation and Culture Maintenance of T. pinophilus Y117

T. pinophilus Y117 is a hyper-cellulolytic mutant strain developed through successive mutagenesis of the wild-type TP117 isolated from paddy soil in Jiangxi Province, China. The strain development protocol involved sequential UV and chemical mutagenesis: The parental strain TP117 was first cultured on potato dextrose agar (PDA, Solarbio P8931) at 30 °C for 5 days, after which mycelial mats were harvested and homogenized in sterile distilled water to prepare a suspension adjusted to 10⁸ CFU/mL using a hemocytometer. For primary mutagenesis, 5 mL aliquots of this suspension were subjected to UV irradiation (254 nm wavelength) for 30 s at a distance of 10 cm, achieving 90% lethality as determined by viability plating. The irradiated cells were then plated on selective PDA medium containing 0.1% (w/v) AZCL-HE-cellulose (Megazyme, Wicklow, Ireland) and incubated at 30 °C for 72 h. Colonies demonstrating enhanced cellulolytic activity were identified by the formation of distinct blue halos (≥2 mm radius) resulting from chromogenic substrate hydrolysis. Potential mutants were validated through shake-flask cultivation in Mandels’ medium with 1% microcrystalline cellulose as a carbon source, followed by quantitative filter paper activity (FPA) assays performed according to IUPAC standard methods [22]. The top-performing UV mutant (TP117-UV9) showing the highest FPA was subsequently treated with 0.05% (w/v) N-methyl-N′-nitro-N-nitrosoguanidine (NTG) in phosphate buffer (50 mM, pH 7.0) for 30 min at room temperature, again calibrated to 90% lethality. Post-treatment, mycelia were recovered by centrifugation (8000× g, 4 °C, 10 min), washed twice with sterile phosphate buffer, and subjected to secondary screening on AZCL-HE-cellulose plates. The ultimate mutant Y117 was selected based on superior and stable cellulase production characteristics through three consecutive generations of cultivation.
T. pinophilus Y117 was deposited at the China General Microbiological Culture Collection Center (CGMCC) under accession number CGMCC No. 42105. The strain was grown and maintained in potato dextrose agar (PDA) medium at 30 °C. A conidial suspension was stored in 25% glycerol at −80 °C and activated on PDA plates. The activated strain was inoculated on PDA medium at 30 °C for 4 days.

2.2. Cellulase Production and Enzyme Activity Assays

Cellulase production by T. pinophilus Y117 was evaluated using a modified protocol based on the method described by Ikeda et al. [23]. The strain was cultivated in a cellulase-inducing medium containing the following components (per liter): 50 g microcrystalline cellulose (CAS 9004-34-6; BBI Life Sciences, Shanghai, China) as the carbon source; 24 g KH2PO4; 1 g Tween 80; 5 g (NH4)2SO4; 1.2 g MgSO4·7H2O; 0.01 g ZnSO4·7H2O; 0.01 g MnSO4·6H2O; 0.01 g CuSO4·7H2O; 4 g urea. The medium pH was adjusted to 4.0 using 1 M H2SO4 or 1 M KOH prior to sterilization (121 °C, 20 min). Cultivation was performed in 250 mL Erlenmeyer flasks containing 100 mL of medium and incubated at 30 °C with agitation at 220 rpm for 7 days in an orbital shaker (ZHWY-211B; Shanghai Zhicheng Analytical Instrument Manufacturing Co., Ltd., Shanghai, China). Culture aliquots were withdrawn and centrifuged (10,000× g, 10 min, 4 °C) to remove mycelia and residual cellulose. The resulting supernatants were filtered (0.22 μm pore size) and stored at −20 °C as crude enzyme extracts for subsequent analyses.
Enzyme activities were determined in triplicate using standard spectrophotometric methods. Filter paper activity (FPase) was measured according to the method described by Ghose (1987) [22] using Whatman No. 1 filter paper as the substrate. One unit (U) of enzyme activity was defined as the amount of enzyme releasing 1 μmol of reducing sugars (glucose equivalent) per minute at 50 °C and pH 4.8. CMCase activity was assessed using carboxymethyl cellulose (CMC) as the substrate [24], expressed as 1 μmol reducing sugars produced per minute per mg soluble protein. Avicelase activity was determined using microcrystalline cellulose following the procedure described by Wood & Bhat (1988) [25]. Xylanase activity was measured using 1% (w/v) beechwood xylan in 50 mM sodium acetate buffer (pH 5.0) at 45 °C for 30 min according to established protocols [26]. β-glucosidase activity was quantified using 10 mM p-nitrophenyl-β-D-glucopyranoside (pNPG) in a 100 mM citrate buffer system at 45 °C for 10 min, with one unit corresponding to 1 μmol p-nitrophenol released per minute [27]. β-xylosidase activity was determined with 10 mM p-nitrophenyl-β-D-xylopyranoside (pNPX) under similar conditions, and absorbance at 420 nm was then measured. For all the assays, reducing sugars were quantified via the 3,5-dinitrosalicylic acid (DNS) method [28], and soluble protein concentration was determined by the Bradford assay [29] using bovine serum albumin (BSA) as the standard.
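The unit definition above (1 U = 1 μmol of reducing sugar released per minute) can be illustrated with a minimal sketch. The function names, the example quantities, and the use of the molar mass of glucose for the conversion are illustrative assumptions and are not taken from the assay protocols cited here.

```python
GLUCOSE_MOLAR_MASS = 180.16  # g/mol; assumed for converting glucose equivalents to µmol


def activity_units_per_ml(glucose_mg, minutes, enzyme_ml):
    """Return activity in U/mL, where 1 U = 1 µmol reducing sugar per minute.

    glucose_mg: reducing sugar released, in mg of glucose equivalents
    minutes:    reaction time
    enzyme_ml:  volume of crude enzyme added to the reaction
    """
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("minutes and enzyme_ml must be positive")
    micromol = glucose_mg / GLUCOSE_MOLAR_MASS * 1000.0  # mg -> µmol
    return micromol / minutes / enzyme_ml


def specific_activity(units_per_ml, protein_mg_per_ml):
    """Return specific activity in U per mg of soluble protein."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return units_per_ml / protein_mg_per_ml
```

Dividing the volumetric activity by the Bradford protein concentration gives values in U/mg, the form in which activities such as FPase are reported later in the text.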

2.3. Hydrolysis of Corncob

The corncob powder (300 mesh, without additional pre-treatment) was enzymatically hydrolyzed in 50 mM acetate buffer (pH 4.8) with a solid loading of 20 wt% (1:5 w/v ratio). The crude cellulase extract obtained from T. pinophilus Y117 fermentation was added at a dosage of 1 mL per gram of dry substrate. The reaction mixture was incubated in a temperature-controlled orbital shaker at 50 °C and 100 rpm for 48 h. The released reducing sugars were measured using the 3,5-dinitrosalicylic acid (DNS) method [28]. Briefly, aliquots of hydrolysate were mixed with DNS reagent, heated in boiling water for 5 min, and cooled to room temperature. Absorbance was measured at 540 nm using a spectrophotometer, with glucose as the standard. The reducing sugar yield was expressed as grams of glucose equivalents per gram of dry corncob powder.
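The conversion from DNS absorbance to a reducing sugar yield can be sketched as follows. This is a minimal illustration assuming a linear glucose standard curve; the function names, the dilution factor, and the example volumes are hypothetical and not part of the published protocol.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired standards")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reducing_sugar_yield(a540, slope, intercept, dilution, volume_ml, substrate_g):
    """Return g glucose equivalents per g dry substrate.

    a540 is the absorbance of the (diluted) hydrolysate; the standard curve
    maps glucose concentration in mg/mL to absorbance at 540 nm.
    """
    conc_mg_per_ml = (a540 - intercept) / slope * dilution
    total_g = conc_mg_per_ml * volume_ml / 1000.0  # mg -> g
    return total_g / substrate_g
```

For example, fitting standards at 0, 0.5, and 1.0 mg/mL and then passing a sample absorbance, its dilution factor, the hydrolysate volume, and the dry substrate mass returns the yield in the units used above.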

2.4. DNA Extraction, Sequencing, and Genome Assembly

Genomic DNA was extracted using Fungal Genomic DNA Extraction Kits (Guangzhou Magen Biotechnology Co., Ltd., Guangzhou, Guangdong, China) according to the manufacturer’s instructions. The DNA quality was assessed using Qubit (Thermo Fisher Scientific, Waltham, MA, USA) and Nanodrop (Thermo Fisher Scientific, Waltham, MA, USA) accordingly.
For sequencing on the PacBio platform (PacBio, Menlo Park, CA, USA), qualified genomic DNA was fragmented with G-tubes (Covaris, Woburn, MA, USA) and end-repaired to prepare SMRTbell DNA template libraries (with fragment size of >10 Kb selected by the BluePippin system) according to the manufacturer’s specifications. Library quality was assessed by Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA), and average fragment size was estimated on a Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA). Single-molecule real-time (SMRT) sequencing was performed on the Pacific Biosciences HiFi sequencer (PacBio, Menlo Park, CA, USA) according to the standard protocols.
Continuous long reads were generated from three SMRT sequencing runs. Reads longer than 500 bp with a quality value over 0.75 were merged together into a single dataset. Next, the hierarchical genome-assembly process (HGAP) pipeline [30] was used to correct for random errors in the long seed reads (seed length threshold 6 Kb) by aligning shorter reads from the same library against them. The resulting corrected, preassembled reads were used for de novo assembly using Celera Assembler (version 8.3) with an overlap-layout-consensus (OLC) strategy [31].

2.5. Genome Feature Prediction and Functional Annotation

Several genome features were predicted. Specifically, open reading frames (ORFs) were predicted using AUGUSTUS (v3.4.0) [32]. Repetitive elements were identified by RepeatMasker (v4.0.5) [33]. Noncoding RNAs were also predicted: rRNAs using RNAmmer (v1.2) [34] and tRNAs using tRNAscan-SE (v1.3.1) [35].
Functional annotation of predicted genes was performed using an integrated multi-database approach. The sequences of predicted protein-coding genes were aligned against five major databases: (1) the NCBI non-redundant protein (Nr) database (downloaded May 2024), (2) UniProt/Swiss-Prot (release 2024_04), (3) Kyoto Encyclopedia of Genes and Genomes (KEGG, release 111), (4) Gene Ontology (GO), and (5) Clusters of Orthologous Groups for Eukaryotic Complete Genomes (KOG), assigning GO terms and pathway information.

2.6. Comparative Genomics and Phylogenetics Analysis

The genomic sequences of 11 T. pinophilus strains were retrieved from the NCBI database, while the genome of strain Y117 was newly sequenced in this study. Orthologous gene clusters among all 12 strains were identified using OrthoFinder (v2.5.4) [36] with default parameters. Single-copy orthologous genes were subsequently extracted for phylogenetic analysis. Multiple sequence alignment of these conserved genes was performed using MAFFT (v7.505) [37] to ensure accurate alignment of homologous sequences. To improve alignment quality, ambiguous regions and poorly aligned positions were removed using Gblocks (v0.91b) [38] with relaxed parameters.
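The extraction of single-copy orthologous genes can be sketched as a simple filter over per-strain gene counts. The sketch below assumes the counts have already been loaded into a nested dictionary; the data layout, function name, and example identifiers are hypothetical and do not reproduce the output format of the tool used in this study.

```python
def single_copy_orthogroups(counts, strains):
    """Return orthogroup IDs with exactly one gene in every listed strain.

    counts:  {orthogroup_id: {strain_name: gene_count}}
    strains: strain names that must each contribute exactly one gene
    """
    return sorted(
        og
        for og, per_strain in counts.items()
        if all(per_strain.get(strain, 0) == 1 for strain in strains)
    )


# Hypothetical toy input: three strains, three orthogroups.
example_counts = {
    "OG1": {"A": 1, "B": 1, "C": 1},  # single-copy in all strains
    "OG2": {"A": 2, "B": 1, "C": 1},  # duplicated in strain A
    "OG3": {"A": 1, "B": 1},          # missing from strain C
}
example_single_copy = single_copy_orthogroups(example_counts, ["A", "B", "C"])
```

Only orthogroups passing this filter would be aligned and concatenated for tree building; groups with duplications or losses in any strain are excluded.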
A maximum-likelihood phylogenetic tree was reconstructed using FastTree (v2.1.11) [39] based on 5783 single-copy orthologous genes. Branch support values were estimated using the Shimodaira–Hasegawa test with 1000 resamples. The resulting phylogenetic tree was visualized and annotated using version 7.1 of the Interactive Tree of Life (iTOL) web platform [40] to elucidate the evolutionary relationships among strain Y117 and the other T. pinophilus strains. Gene Ontology (GO) enrichment analysis was performed using GOATOOLS (https://github.com/tanghaibao/GOatools, accessed on 18 February 2025) [41]. Fisher’s exact test was employed to determine statistically significant overrepresentation of GO terms among strain-specific genes compared to the background of all predicted genes in each strain. To correct for multiple hypothesis testing, we applied four correction methods: Bonferroni, Holm, Sidak, and the Benjamini–Hochberg false discovery rate (FDR) procedure. A GO term was considered significantly enriched if its corrected p-value was ≤0.05.
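The enrichment test and two of the corrections named above can be sketched in plain Python. This is a minimal illustration, not the implementation used by the enrichment tool: it assumes a one-sided (over-representation) hypergeometric tail, and all function and parameter names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(study_hits, study_size, pop_hits, pop_size):
    """One-sided p-value P(X >= study_hits) under the hypergeometric model.

    study_hits: strain-specific genes annotated with the GO term
    study_size: number of strain-specific genes
    pop_hits:   background genes annotated with the GO term
    pop_size:   number of background genes
    """
    max_hits = min(study_size, pop_hits)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, study_size - k)
        for k in range(study_hits, max_hits + 1)
    )
    return tail / comb(pop_size, study_size)


def bonferroni(pvalues):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be called enriched when its adjusted value is ≤0.05, matching the threshold stated above.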

2.7. Identification of CAZymes, Secondary Metabolite Gene Clusters, and Transcription Factors

Carbohydrate-Active Enzymes (CAZymes) in T. pinophilus were predicted through comprehensive analysis against the CAZy database (www.cazy.org) [42] using BLASTp version 2.8.1 (E-value < 1 × 10⁻⁵) [43]. Proteins exhibiting >50% sequence identity and ≥90% query coverage were automatically assigned to CAZy families/subfamilies, whereas those with <50% identity were subjected to rigorous manual validation, including catalytic residue conservation analysis using InterProScan (v5.56-89.0) [44] and structural domain verification via HMMER (v3.3.2) against Pfam-A (v35.0) [45]. Concurrently, secondary metabolite biosynthetic gene clusters (BGCs) were predicted using antiSMASH (v6.1.1 fungal version) [46] with default parameters. Putative transcription factors (TFs) were identified by reciprocal BLASTp search against the Fungal Transcription Factor Database (FTFD; http://ftfd.snu.ac.kr/, accessed on 20 January 2025) [47], employing an E-value threshold of <1 × 10⁻¹⁰. Among the identified TFs, we selected 20 well-characterized TFs known to regulate biomass-degrading enzyme genes in filamentous fungi, based on established criteria [21].
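The CAZy assignment thresholds described above can be expressed as a small decision rule. The sketch below is an assumption-laden illustration: the function name and return labels are hypothetical, and the text does not say how hits with exactly 50% identity, or with >50% identity but <90% coverage, were handled, so routing them to manual validation here is an assumption.

```python
def classify_cazy_hit(identity_pct, query_coverage_pct, evalue, evalue_cutoff=1e-5):
    """Route one BLASTp hit according to the thresholds quoted in the text.

    Returns "rejected" (E-value not below the cutoff), "auto" (automatic
    family assignment), or "manual" (sent for manual validation).
    """
    if evalue >= evalue_cutoff:
        return "rejected"
    if identity_pct > 50 and query_coverage_pct >= 90:
        return "auto"
    # Assumption: every remaining hit is reviewed manually, including
    # cases the text leaves unspecified.
    return "manual"
```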

2.8. Statistical Analysis and Visualization

Three replicates were used for each group. The SPSS software (v2.0) was used for data significance analysis, and the GraphPad Prism 9.5 software was used for plotting. Different lowercase letters indicate significant differences (p < 0.05).

2.9. Data Access

A total of 12 genome assemblies were analyzed in this study. One was newly sequenced (see Data Availability Statement). The remaining 11 genomes were retrieved from GenBank, under the accession numbers: T. pinophilus NCIM1228, GCA_033439165.1; T. pinophilus NA01, GCA_009805475.2; T. pinophilus AR155, GCA_011392495.1; T. pinophilus P8041, GCA_023623755.1; T. pinophilus M8004A, GCA_023623785.1; T. pinophilus J7Y3, GCA_023623735.1; T. pinophilus Y-94, GCA_000829775.1; T. pinophilus 1-95, GCA_001571465.2; T. pinophilus PSF-12(1), GCA_003316675.1; T. pinophilus NRRL3647, GCA_027569565.1; T. pinophilus NRRL3503, GCA_027569545.1.

3. Results

3.1. Assembly Statistics and General Features of T. pinophilus Y117 Genome

The genome of T. pinophilus strain Y117 was sequenced using the PacBio Sequel II platform (PacBio, Menlo Park, CA, USA), generating a high-quality assembly with 98.9% completeness as assessed by BUSCO (version 5.7.1). The assembly comprised 21 contigs with an N50 contig length of 3.2 Mb and a maximum contig length of 4.2 Mb, exhibiting a GC content of 46.36% (Figure 2). Gene annotation predicted 12,406 protein-coding genes, along with 38 rRNA and 121 tRNA genes. The genome assembly was deposited in the NCBI genome database under BioProject ID: PRJNA1203055 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1203055/, accessed on 15 August 2025). A summary of assembly and annotation statistics is shown in Table 1. Functional annotation using NCBI’s Clusters of Orthologous Groups (COG) database classified 5436 genes into eukaryotic orthologous groups (KOG categories) (Figure S1). The most gene-rich KOG categories included “General function prediction only”, “Secondary metabolites biosynthesis, transport and catabolism”, “Lipid transport and metabolism”, “Posttranslational modification, protein turnover, chaperones”, “Carbohydrate transport and metabolism”, “Energy production and conversion”, and “Signal transduction mechanisms” (Figure S1). The KEGG annotation results revealed that global and overview maps (555 genes), carbohydrate metabolism (155 genes), and amino acid metabolism (222 genes) were the most abundant categories in T. pinophilus Y117 (Figure S2).
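The contiguity and composition statistics reported above follow standard definitions, which can be sketched as below. The function names are illustrative, and the inputs are assumed to be a list of contig lengths and a plain nucleotide string rather than the actual assembly files.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def gc_content(sequence):
    """Return the fraction of G and C among the A, C, G, T bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / counted
```

Applied to the contig lengths of an assembly, `n50` returns the value reported as the N50 contig length, and `gc_content` over the concatenated contigs gives the GC fraction.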

3.2. Phylogenetic Tree and Comparative Genomics of T. pinophilus Strains

To elucidate the evolutionary relationships and genomic diversity within the T. pinophilus species, a comparative genomic analysis of twelve strains was conducted, comprising Y117 and eleven additional T. pinophilus strains retrieved from the NCBI database (e.g., Y-94, 1-95, NRRL3503, NRRL3647, and NCIM1228). These strains were isolated from diverse ecological niches, including soil, plant rhizospheres, animal feces, and synthetic fabric, providing a broad representation of the species’ genetic diversity. The assembled genomes exhibited an average size of 35.23 Mb and GC content of 46.41%, consistent with previously reported Talaromyces genomes (Figure 3). Notably, Y117 demonstrated superior assembly continuity, with only 21 scaffolds (Scaffold N50 = 3.28 Mb), far exceeding the contiguity of other strains such as PSF-12(1) (N50 = 178 kb) and NRRL lineages (average N50 = ~1.07 Mb). This high-quality assembly facilitates accurate gene annotation and downstream functional analyses. Furthermore, Y117 encodes 12,406 predicted proteins, a count comparable to those of high-yield strains NCIM1228 (12,501) and AR155 (12,486), and markedly higher than that of Y-94 (10,869). This expanded proteome may be associated with increased lignocellulolytic potential. Phylogenetic analysis of these 12 strains was performed based on 5783 single-copy core genes, aligned with MAFFT and analyzed with FastTree. The resulting tree (scale = 0.001 substitutions per site) revealed that Y117 clusters most closely with Y-94, forming a distinct subclade separate from the other analyzed strains (Figure 3). The high branch support (>90%) for all the major nodes indicates robust evolutionary relationships among these strains, with Y117’s unique phylogenetic position potentially associated with its exceptional lignocellulolytic phenotype.

3.3. Strain-Specific Genes

Strain-specific genes were identified among the predicted genes in the genomes of the twelve T. pinophilus strains. A gene was considered strain-specific if, when aligned with the nucleotide sequences of genes from the other sequenced strains, it showed (1) <50% sequence similarity or (2) <50% coverage (whichever criterion was more stringent). Based on these criteria, more than 5783 genes were identified as homologous (Figure 4). The genome of T. pinophilus NA01 had the highest number of strain-specific genes (1026, Figure 4), followed by T. pinophilus AR155 (600) and T. pinophilus Y117 (156). For each strain, an enrichment analysis was conducted to compare its strain-specific genes with all of its predicted genes. The analysis revealed that T. pinophilus NA01 exhibits significant enrichment in the GO terms related to fungal polarized growth and cellular morphogenesis, including “cellular bud” (GO:0005933), “site of polarized growth” (GO:0030427), and “cellular bud neck” (GO:0005935) (Figure S3). The GO enrichment analysis of strain-specific genes in strain AR155 revealed significant enrichment in biological processes related to cellular morphogenesis and genome maintenance (Figure S4). The most significantly enriched terms included “regulation of cell shape” (GO:0008360), “regulation of cell morphogenesis” (GO:0008632), “regulation of developmental process” (GO:0050793), and “DNA metabolic process” (GO:0006259). The strain-specific genes of T. pinophilus Y117 were enriched only in the GO term “organonitrogen compound biosynthetic process” (GO:1901566), which may suggest distinct adaptations in nitrogen metabolism compared to the other strains.

3.4. The CAZymes Among T. pinophilus Strains

Comprehensive analysis of carbohydrate-active enzymes (CAZymes) across T. pinophilus strains revealed 2026 CAZyme-encoding genes within the 12,406 predicted proteins of T. pinophilus Y117. These genes spanned six functional classes, including 75 glycoside hydrolase (GH) families, 49 glycosyltransferase (GT) families, 14 carbohydrate esterase (CE) families, 34 carbohydrate-binding module (CBM) families, 9 auxiliary activity (AA) families, and 2 polysaccharide lyase (PL) families. Comparative genomic analysis with eleven additional T. pinophilus strains identified significant strain-specific variations in CAZyme gene content, highlighting the enzymatic diversity within this species (Figure 5). Notably, Y117 exhibited a substantial expansion of GH genes (n = 837), exceeding the other strains (363–416 genes) by 101–130%. Similarly, its CBM gene repertoire (n = 351) was 3.5–4.6 times larger than that of comparator strains (77–101 genes). These genomic features suggest elevated capacity for both glycosidic bond hydrolysis and carbohydrate substrate recognition in Y117. Conversely, Y117 showed reduced gene complements in AA (9 families) and PL (2 families) categories compared to the other strains, potentially reflecting specialized adaptations in oxidative and lytic enzymatic pathways. The exceptional enrichment of GH and CBM genes in Y117 correlates with its demonstrated efficiency in plant biomass degradation, while the relative paucity of AA and PL genes may indicate niche-specific metabolic specialization.
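The expansion figures quoted above are simple ratios of gene counts, and a short sketch makes the arithmetic explicit. The function names are illustrative; the numbers used in the usage note are the gene counts given in the text.

```python
def fold_change(value, baseline):
    """Return value / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline


def percent_increase(value, baseline):
    """Return how much larger value is than baseline, in percent."""
    return (fold_change(value, baseline) - 1.0) * 100.0


def expansion_range(value, comparator_low, comparator_high):
    """Return (min, max) percent increase over a range of comparator counts."""
    return (
        percent_increase(value, comparator_high),
        percent_increase(value, comparator_low),
    )
```

With 837 GH genes against a comparator range of 363 to 416, `expansion_range(837, 363, 416)` gives roughly 101% to 131%, in line with the reported range up to rounding. With 351 CBM genes against 77 to 101, `fold_change` gives about 3.5 to 4.6.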
The comparative genomic analysis of carbohydrate-active enzyme (CAZyme) distributions across twelve fungal strains revealed striking enzymatic specialization in strain Y117, which exhibits an exceptionally expanded repertoire of glycoside hydrolases (GHs), carbohydrate esterases (CEs), and glycosyltransferases (GTs), distinguishing it as a metabolic outlier with unique biotechnological potential (Figure 6). Y117 demonstrated remarkable expansion in three key GH families: GH18 (10.39% of total GHs, 1.7-fold higher than the 6.24 ± 0.72% average of the other strains), GH78 (7.53%, 2.2-fold higher than the 3.48 ± 0.41% average), and GH79 (5.73%, 5.8-fold higher than the 0.98 ± 0.38% average). In contrast, control strains like Y-94 and NCIM1228 displayed typical ascomycete GH distributions dominated by GH3 (cellulases; 6.8 ± 0.9%) and GH43 (xylanases; 4.3 ± 0.5%). Among the CE families, Y117 showed unprecedented dominance of CE14 (34.21% of total CEs, 30.5-fold higher than the 1.12 ± 0.38% average) and CE6 (13.68%, 6.1-fold higher than the 2.25 ± 0.42% average), suggesting enhanced capacity for deacetylating xylan and chitin, whereas other strains exhibited conserved CE1 dominance (54.2 ± 3.8%). Y117’s GT profile further distinguished it through complete GT55/GT34/GT51 suites (32.3% of GTs versus <8% in other strains), including GT55 (12.99%, absent in other strains), indicating potential innovation in α-glucan synthesis. The co-occurrence of the expanded CE14 family with enhanced GH78/GH79 pectinases (12.8% of GHs) suggests a coordinated strategy for degrading complex plant cell walls, positioning Y117 as a strain with potential for biomass conversion, biofuel production, and processing of recalcitrant polysaccharides.

3.5. Secondary Metabolite (SM) Biosynthetic Potential Across T. pinophilus Strains

Comprehensive analysis of SM biosynthetic gene clusters (BGCs) was performed using fungal antiSMASH v6.0.1, revealing distinct strain-specific metabolic profiles (Table S2). T. pinophilus Y117 exhibited a streamlined SM repertoire, encoding only 69 backbone genes, the second lowest among all the strains after NA01 (Figure 7). This represents a 38.9% reduction compared to the most gene-rich strain, NRRL3503 (113 backbone genes), suggesting evolutionary selection for metabolic economy in Y117. The conserved core SM pathways present in all the strains included Type I polyketide synthases (T1PKSs) (19–28 clusters per strain), non-ribosomal peptide synthetases (NRPSs) (6–10 clusters), NRPS-like enzymes (12–17 clusters), and ubiquitous betalactone clusters (1 per genome). These evolutionarily maintained pathways likely support essential functions. Fungal-RiPP-like clusters were most numerous in strains Y-94 and 1-95 (7 and 14 clusters, respectively), whereas they were completely absent in Y117, J7Y3, P8041, M8004A, and the NRRL strains. While most strains possessed NRPS-T1PKS hybrids (0–3 clusters), Y117 uniquely lacked these but contained two indole and two T1PKS-NRPS clusters, reflecting divergent metabolic specialization. The complete absence of fungal-RiPP-like clusters in Y117, coupled with its simplified NRPS/T1PKS profile, could indicate adaptive genome streamlining favoring hydrolytic enzyme production over secondary metabolic complexity.

3.6. Analysis of Transcription Factors in T. pinophilus

Transcription factors (TFs) play a crucial role in regulating gene expression and cellular processes, significantly influencing metabolite levels. To explore their potential regulatory contributions to strain-specific diversity in cellulose degradation, we compared the presence or absence of specific transcription factor (TF) genes among the twelve T. pinophilus strains (Figure 8). Several TF genes, such as CAA36731.1-areA, A2QJX5.1-AraR, EPS27888.1-Hap3, EPS28222.1-CreA, and EPS33410.1-FlbC, were identified in all strains, including Y117, indicating their conserved roles in fungal biology. Although Y117 shares most TF genes with other strains, with no TF genes uniquely present in this strain, several differences between Y117 and other strains have been observed. For example, EPS32714.1-XlnR is absent in Y117 but present in some other strains (e.g., Y-94, 1-95, AR155, and NCIM1228), indicating a possible difference in the regulation of xylanase-related pathways. The TF gene EPS25156.1-BrlA, which plays a key role in the initiation of asexual development in fungi, is present in all the strains except Y117, suggesting a potential loss or divergence in this regulatory pathway in Y117.
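A presence/absence comparison of this kind can be sketched with sets. The example below uses placeholder strain and TF names rather than the actual data behind Figure 8, and the function names are hypothetical.

```python
def tfs_shared_by_all(presence, strains):
    """Return TFs detected in every listed strain.

    presence: {tf_name: set of strains in which the TF gene was found}
    """
    return sorted(
        tf for tf, found_in in presence.items()
        if all(strain in found_in for strain in strains)
    )


def tfs_missing_only_in(presence, target, strains):
    """Return TFs found in every listed strain except the target strain."""
    others = [strain for strain in strains if strain != target]
    return sorted(
        tf for tf, found_in in presence.items()
        if target not in found_in and all(strain in found_in for strain in others)
    )


# Placeholder data: TF1 is conserved, TF2 is lost only in strain_A,
# TF3 has a patchy distribution.
example_strains = ["strain_A", "strain_B", "strain_C"]
example_presence = {
    "TF1": {"strain_A", "strain_B", "strain_C"},
    "TF2": {"strain_B", "strain_C"},
    "TF3": {"strain_B"},
}
```

In these terms, a regulator such as BrlA, described above as present in all strains except Y117, would be returned by `tfs_missing_only_in` with Y117 as the target, whereas a regulator present in only some of the other strains would not.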

4. Discussion

Talaromyces pinophilus is a filamentous fungus renowned for its exceptional lignocellulose-degrading capabilities, attributed to its robust enzymatic machinery and potential as a microbial cell factory for industrial applications. The mutant strain T. pinophilus Y117, derived from the wild-type parental strain TP117 through UV and chemical mutagenesis, demonstrates superior enzymatic productivity, secreting a comprehensive cellulolytic enzyme system including filter paperase (FPase), carboxymethyl cellulase (CMCase), avicelase, β-glucosidase, xylanase, and xylosidase (Table S1). The outstanding lignocellulose-degrading capacity of Y117 can be traced to its parental strain, TP117, which was originally isolated from paddy soil in Jiangxi Province, China. In this region, rice is cultivated twice a year (double-cropping). As part of local farming practices, after each harvest, rice straw is crushed and returned to the fields as an organic carbon source. This practice has created an environment conducive to the enrichment of lignocellulose-degrading fungi in the paddy soil. Notably, combined UV and chemical mutagenesis has been demonstrated as a highly effective strategy for enhancing cellulase production in fungi [49,50]. The enzymatic profile of Y117 is particularly robust due to its balanced secretion of multiple cellulolytic and hemicellulolytic enzymes, making it an outstanding candidate for biomass conversion applications.
In this study, we performed whole-genome sequencing and comprehensive functional annotation of this strain. Additionally, we conducted a comparative genomic analysis using all publicly available whole-genome sequences of T. pinophilus strains. Our findings revealed that Y117 possesses a genomically streamlined architecture, characterized by high assembly continuity and strong coding potential. Because fewer scaffolds enable more accurate gene mapping, these features make it a well-suited candidate for mechanistic studies on lignocellulose degradation. Gene annotation identified 12,406 protein-coding genes in Y117, a number substantially greater than that of the well-characterized cellulase-producing strain Y-94, suggesting elevated enzymatic potential for lignocellulose deconstruction. Although phylogenetic analysis confirmed a close relationship between Y117 and Y-94, Y117 exhibited distinct genomic features, which likely reflect specialized metabolic and regulatory adaptations crucial for its ecological niche. These unique genomic traits may underpin its superior lignocellulolytic efficiency and could help unravel strain-specific optimization strategies.
Genomic analysis of strain Y117 identified 2026 CAZymes, a remarkably rich repertoire compared with the average of 800 observed in the other reference T. pinophilus strains (Figure 5). This extensive CAZyme complement includes numerous enzymes with biomass-hydrolyzing potential. Comparing CAZyme distribution between Y117 and Y-94 revealed striking functional specialization in Y117, particularly within the glycoside hydrolase (GH) families, where GH78 (7.5%) and GH79 (5.7%) were considerably more abundant than in Y-94. While GH78 (α-L-rhamnosidase) primarily targets pectin, its activity could indirectly facilitate cellulose access by removing rhamnogalacturonan barriers in plant cell walls [51]. GH79 enzymes cleave β-glucuronidic linkages in glycosaminoglycans and plant polysaccharides [52]. By dismantling complex plant cell walls or extracellular matrices, the combination of GH78 (pectin-targeting) and GH79 (glucuronide-targeting) may synergistically improve Y117’s cellulose degradation efficiency, which would be consistent with its higher FPase (1.32 U/mg) and β-glucosidase (3.22 U/mg) activities compared with Y-94 (Table S1). This multifunctional repertoire marks Y117 as a robust lignocellulose degrader in which GH78/GH79 may play indirect but critical roles in enhancing cellulose accessibility and hydrolysis.
Strain Y117 also shows a unique and highly expanded CE profile (Figure 6). Most remarkably, CE14 and CE6 were exceptionally well represented, corresponding to 30.5-fold and 6.1-fold increases, respectively, over the average of the other strains. CE14 enzymes primarily remove acetyl groups from polysaccharides (e.g., xylan or chitin), enhancing their accessibility to GHs for degradation [53]. The co-occurrence of the expanded CE14 family with the strain’s enriched GH78/GH79 enzymes suggests a coordinated strategy for tackling complex plant cell walls, positioning Y117 as a promising candidate for biotechnological applications in biomass conversion and biofuel production.
Y117’s GT55 expansion likely reflects niche adaptation for polysaccharide modification, as these glycosyltransferases are rare in related strains. Their predicted role in α-glucan synthesis may contribute to both self-tolerance during lignocellulose breakdown and the production of modified oligosaccharides with industrial applications [54,55]. The strain’s GT55 profile warrants functional characterization to validate these hypothesized mechanisms [56].
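The two kinds of figures quoted in this part of the discussion, a fold increase over the average of the other strains and a family's percentage share within its CAZyme category, are simple ratios. The sketch below shows one way they might be computed. The function names are hypothetical, and only the 2026 versus 800 comparison uses numbers from the text; the per-family counts are invented for illustration.

```python
def fold_over_others(target_count, other_counts):
    """Ratio of the target strain's count to the mean count of the other strains."""
    mean_other = sum(other_counts) / len(other_counts)
    if mean_other == 0:
        raise ValueError("mean of other strains is zero; fold change undefined")
    return target_count / mean_other


def family_share(family_count, category_total):
    """Percentage of a CAZyme category made up by one family."""
    return 100.0 * family_count / category_total


# Figures stated in the text: 2026 CAZymes in Y117 vs. an average of 800.
print(round(fold_over_others(2026, [800]), 2))  # 2.53

# Hypothetical counts, for illustration only (not taken from the paper).
print(family_share(5, 50))  # 10.0
```

A fold value is undefined when a family is absent from every other strain, which is why the helper raises an error in that case rather than returning infinity.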
Comparative genomic analysis of T. pinophilus strains revealed a specialized secondary metabolite profile in Y117, marked by the complete absence of fungal-RiPP-like clusters, in striking contrast to related strains such as Y-94 (7 clusters) and 1-95 (14 clusters). Genomic reduction in Y117, consistent with the streamlining phenomenon observed in industrial fungi, parallels its exceptional enrichment in CAZymes (e.g., GH78 α-L-rhamnosidases and CBM13 carbohydrate-binding modules) [57,58,59]. We propose that this represents an adaptive metabolic trade-off in which resource allocation favors lignocellulose degradation over specialized metabolite biosynthesis, a pattern documented in other high-performance cellulolytic fungi [54,60]. Notably, Y117 retains key biosynthetic capabilities, including 8 NRPS clusters, 25 T1PKS clusters, and unique T1PKS-NRPS hybrids (n = 2, absent in most strains). The lack of fungal-RiPPs may reflect Y117’s adaptation to nutrient-rich environments where competitive antimicrobials are less critical, while its retained NRPS/PKS systems could fortify stress tolerance and/or support niche colonization. These features collectively identify Y117 as a promising chassis for cellulase production, in which redirected metabolic flux from secondary metabolism to hydrolytic enzyme synthesis could explain its superior biomass degradation performance (Figure 1). Future studies should validate this model through targeted approaches, including (1) time-course transcriptomic analysis under industrial fermentation conditions, (2) comparative promoter activity assays of key hydrolase genes, and (3) metabolic flux analysis during different growth phases.
TFs are proteins that control the transcription of target genes by binding to specific DNA sequence elements such as promoters and enhancers. Elucidating the regulatory mechanisms of TFs, coupled with synthetic biology-driven engineering of high-performance expression systems, could significantly increase the industrial utilization of cellulolytic enzymes. Similar strategies have been successfully applied to improve enzyme production [61]. The transcription factor BrlA serves as a crucial developmental regulator in fungi, particularly within the genus Aspergillus, where it governs the formation of asexual spores (conidia) [62]. Loss of BrlA function is known to completely abolish conidiation. In our study, the absence of the BrlA gene in T. pinophilus Y117 may trigger metabolic reallocation, whereby resources typically dedicated to sporulation are redirected toward other biosynthetic pathways. This shift may account for the observed enhancement in cellulase production (Table S1). Such metabolic reprogramming aligns with established fungal adaptation strategies, wherein disruptions in developmental pathways often lead to compensatory upregulation of secondary metabolic or secretory processes [63]. Notably, this phenomenon has been documented in Aspergillus nidulans, where BrlA-deficient strains exhibit altered carbon flux toward secondary metabolite synthesis [64]. The Y117 phenotype thus illustrates how genetic changes arising from mutagenesis can exploit endogenous regulatory networks to raise industrial enzyme yields.
The transcription factor XlnR is a key transcriptional regulator of xylanolytic and cellulolytic enzyme genes in fungi [65]. Knockout of XlnR typically leads to a significant reduction in xylanase production, while its impact on cellulase activity tends to be less pronounced [66]. Consistent with this paradigm, the absence of the XlnR gene in the Y117 genome aligns with the strain’s observed enzymatic profile (Table S1). This phenotypic consistency supports a hierarchical regulatory function of XlnR, in which its loss disproportionately affects xylanases over cellulases. Overall, these findings offer novel insights into the regulatory mechanisms governing fungal plant-biomass-degrading enzymes and identify potential targets for genetic engineering to enhance industrial applications.

5. Conclusions

In conclusion, this study presents a high-quality genome assembly of T. pinophilus Y117, revealing genomic adaptations that may underlie its exceptional cellulolytic performance. Comparative genomics highlighted three key genomic signatures: (1) significant expansion of GH and CBM families enabling efficient lignocellulose hydrolysis, (2) specialized secondary metabolism evidenced by fungal-RiPP cluster loss but retention of unique T1PKS-NRPS/indole clusters, and (3) absence of the BrlA gene, likely redirecting metabolic flux from sporulation to cellulase production. These findings not only provide a deeper understanding of fungal CAZyme regulation but also identify Y117 as a promising chassis for industrial enzyme production, with its genomic features offering specific targets for strain engineering aimed at optimizing biomass degradation.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/jof11090614/s1. Figure S1: KOG categories in the T. pinophilus strain Y117 genome; Figure S2: Distribution of genes based on KEGG categories in the genome of T. pinophilus strain Y117; Figure S3: GO enrichment analysis of strain-specific genes in T. pinophilus NA01; Figure S4: GO enrichment analysis of strain-specific genes in T. pinophilus AR155; Table S1: Specific enzyme activities of crude enzyme produced by T. pinophilus strain Y-94, strain CF-2612, and strain Y117; Table S2: Distribution of transcription factor genes among twelve T. pinophilus strains.

Author Contributions

Conceptualization, L.Y. and Y.L.; methodology, D.X.; software, S.Z.; validation, S.Y., P.L. and J.Z.; formal analysis, J.G.; writing—original draft preparation, Y.L.; writing—review and editing, L.Y.; visualization, T.D.; supervision, Y.L. and L.Y.; project administration, L.Y.; funding acquisition, Y.L. and L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Jiangxi Natural Science Foundation (20243BCE51127, 20243BCE51054, S2021GDQN2403, 20242BAB202533), the Jiangxi Science and Technology Major Project (20223AAG02023), and the Science and Technology Program of Jiangxi Academy of Sciences (2023YSBG22008).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Acknowledgments

During the preparation of this manuscript, the authors used DeepSeek (https://www.deepseek.com) for language polishing and grammatical refinement. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Singhvi, M.S.; Gokhale, D.V. Lignocellulosic biomass: Hurdles and challenges in its valorization. Appl. Microbiol. Biotechnol. 2019, 103, 9305–9320.
  2. Ragauskas, A.J.; Williams, C.K.; Davison, B.H.; Britovsek, G.; Cairney, J.; Eckert, C.A.; Frederick, W.J., Jr.; Hallett, J.P.; Leak, D.J.; Liotta, C.L.; et al. The path forward for biofuels and biomaterials. Science 2006, 311, 484–489.
  3. Zhang, Z.; Xing, J.; Li, X.; Lu, X.; Liu, G.; Qu, Y.; Zhao, J. Review of research progress on the production of cellulase from filamentous fungi. Int. J. Biol. Macromol. 2024, 277, 134539.
  4. Meng, Q.; Abraham, B.; Hu, J.; Jiang, Y. Cutting-edge advances in strain and process engineering for boosting cellulase production in Trichoderma reesei. Bioresour. Technol. 2025, 419, 132015.
  5. Anuj, K.C.; Vijay, K.G.; Akhilesh, K.S.; Felipe, A.F.A.; Silvio, S.S. The path forward for lignocellulose biorefineries: Bottlenecks, solutions, and perspective on commercialization. Bioresour. Technol. 2018, 264, 370–381.
  6. Periyasamy, S.; Isabel, J.B.; Kavitha, S.; Karthik, V.; Mohamed, B.A.; Gizaw, D.G.; Sivashanmugam, P.; Aminabhavi, T.M. Recent advances in consolidated bioprocessing for conversion of lignocellulosic biomass into bioethanol: A review. Chem. Eng. J. 2023, 453, 139783.
  7. Liu, X.Q.; Gao, F.; Wang, Y.R.; Zhang, J.; Bai, Y.G.; Zhang, W.; Luo, H.Y.; Yao, B.; Wang, Y.; Tu, T. Characterization of a novel thermostable α-l-arabinofuranosidase for improved synergistic effect with xylanase on lignocellulosic biomass hydrolysis without prior pretreatment. Bioresour. Technol. 2024, 394, 130177.
  8. Agrawal, K.; Nair, L.G.; Chaturvedi, V.; Verma, P. Designing microbial cellulases using genetic engineering approach: A promising strategy towards zero-waste cellulosic biorefinery. Biocatal. Agric. Biotechnol. 2023, 52, 102830.
  9. Fujii, T.; Hoshino, T.; Inoue, H.; Yano, S. Taxonomic revision of the cellulose-degrading fungus Acremonium cellulolyticus nomen nudum to Talaromyces based on phylogenetic analysis. FEMS Microbiol. Lett. 2014, 351, 32–41.
  10. Inoue, H.; Decker, S.R.; Taylor, L.E.; Yano, S.; Sawayama, S. Identification and characterization of core cellulolytic enzymes from Talaromyces cellulolyticus (formerly Acremonium cellulolyticus) critical for hydrolysis of lignocellulosic biomass. Biotechnol. Biofuels 2014, 7, 151.
  11. Takashi, Y.; Yasushi, M.; Yoshiyuki, T. Isolation of a cellulolytic enzyme producing microorganism, culture conditions and some properties of the enzymes. Agric. Biol. Chem. 1987, 51, 65–74.
  12. Yamanobe, T.; Okuda, N.; Oouchi, K.; Suzuki, K. Cellulase Gene Derived from Trichoderma Fungus and Transformant Transformed with the Gene. Japanese Patent JP 2003-135052 A, 14 May 2003.
  13. Fang, X.; Yano, S.; Inoue, H.; Sawayama, S. Strain improvement of Acremonium cellulolyticus for cellulase production by mutation. J. Biosci. Bioeng. 2009, 107, 256–261.
  14. Hideno, A.; Inoue, H.; Fujii, T. High-coverage gene expression profiling analysis of the cellulase-producing fungus Acremonium cellulolyticus cultured using different carbon sources. Appl. Microbiol. Biotechnol. 2013, 97, 5483–5492.
  15. Liao, G.-Y.; Zhao, S.; Zhang, T.; Li, C.-X.; Liao, L.-S.; Zhang, F.-F.; Luo, X.-M.; Feng, J.-X. The transcription factor TpRfx1 is an essential regulator of amylase and cellulase gene expression in Talaromyces pinophilus. Biotechnol. Biofuels 2018, 11, 276.
  16. Manglekar, R.R.; Geng, A. CRISPR-Cas9-mediated seb1 disruption in Talaromyces pinophilus EMU for its enhanced cellulase production. Enzym. Microb. Technol. 2020, 140, 109646.
  17. Zhang, T.; Sun, Y.; Ma, Z.; Zhang, J.; Lv, B.; Li, C. Developing iterative and quantified transgenic manipulations of non-conventional filamentous fungus Talaromyces pinophilus Li-93. Biochem. Eng. J. 2022, 179, 108317.
  18. Kishishita, S.; Fujii, T.; Ishikawa, K. Heterologous expression of hyperthermophilic cellulases of archaea Pyrococcus sp. by fungus Talaromyces cellulolyticus. J. Ind. Microbiol. Biotechnol. 2015, 42, 137.
  19. Chen, Z.; Chen, L.; Khoo, K.S.; Gupta, V.K.; Sharma, M.; Show, P.L.; Yap, P. Exploitation of lignocellulosic-based biomass biorefinery: A critical review of renewable bioresource, sustainability and economic views. Biotechnol. Adv. 2023, 69, 108265.
  20. de Vries, R.P.; Mäkelä, M.R. Genomic and postgenomic diversity of fungal plant biomass degradation approaches. Trends Microbiol. 2020, 28, 487–499.
  21. Li, C.-X.; Zhao, S.; Zhang, T.; Xian, L.; Liao, L.-S.; Liu, J.-L.; Feng, J.-X. Genome sequencing and analysis of Talaromyces pinophilus provide insights into biotechnological applications. Sci. Rep. 2017, 7, 490.
  22. Ghose, T.K. Measurement of cellulase activities. Pure Appl. Chem. 1987, 59, 257–268.
  23. Ikeda, Y.; Hayashi, H.; Okuda, N.; Park, E.Y. Efficient cellulase production by the filamentous fungus Acremonium cellulolyticus. Biotechnol. Prog. 2007, 23, 333–338.
  24. Mandels, M.; Weber, J. The production of cellulases. Adv. Chem. Ser. 1969, 95, 391–414.
  25. Wood, T.M.; Bhat, K.M. Methods for measuring cellulase activities. Methods Enzymol. 1988, 160, 87–112.
  26. Bailey, M.J.; Biely, P.; Poutanen, K. Interlaboratory testing of methods for assay of xylanase activity. J. Biotechnol. 1992, 23, 257–270.
  27. Fujii, T.; Fang, X.; Inoue, H.; Murakami, K.; Sawayama, S. Enzymatic hydrolyzing performance of Acremonium cellulolyticus and Trichoderma reesei against three lignocellulosic materials. Biotechnol. Biofuels 2009, 2, 24.
  28. Miller, G.L. Use of dinitrosalicylic acid reagent for determination of reducing sugar. Anal. Chem. 1959, 31, 426–428.
  29. Bradford, M.M. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 1976, 72, 248–254.
  30. Chin, C.-S.; Alexander, D.H.; Marks, P.; Klammer, A.A.; Drake, J.; Heiner, C.; Clum, A.; Copeland, A.; Huddleston, J.; Eichler, E.E.; et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 2013, 10, 563–569.
  31. Myers, E.W.; Sutton, G.G.; Delcher, A.L.; Dew, I.M.; Fasulo, D.P.; Flanigan, M.J.; Kravitz, S.A.; Mobarry, C.M.; Reinert, K.H.; Remington, K.A.; et al. A whole-genome assembly of Drosophila. Science 2000, 287, 2196–2204.
  32. Stanke, M.; Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 2003, 19 (Suppl. 2), ii215–ii225.
  33. Tarailo-Graovac, M.; Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinf. 2009, 25, 4.10.1–4.10.14.
  34. Lagesen, K.; Hallin, P.; Rødland, E.A.; Staerfeldt, H.H.; Rognes, T.; Ussery, D.W. RNAmmer: Consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007, 35, 3100–3108.
  35. Lowe, T.M.; Eddy, S.R. tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997, 25, 955–964.
  36. Emms, D.M.; Kelly, S. OrthoFinder: Phylogenetic orthology inference for comparative genomics. Genome Biol. 2019, 20, 238.
  37. Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780.
  38. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 2000, 17, 540–552.
  39. Price, M.N.; Dehal, P.S.; Arkin, A.P. FastTree 2: Approximately maximum-likelihood trees for large alignments. PLoS ONE 2010, 5, e9490.
  40. Letunic, I.; Bork, P. Interactive Tree of Life (iTOL) v5: An online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021, 49, W293–W296.
  41. Klopfenstein, D.V.; Zhang, L.; Pedersen, B.S.; Ramírez, F.; Warwick Vesztrocy, A.; Naldi, A.; Mungall, C.J.; Yunes, J.M.; Botvinnik, O.; Weigel, M.; et al. GOATOOLS: A Python library for Gene Ontology analyses. Sci. Rep. 2018, 8, 10872.
  42. Lombard, V.; Golaconda Ramulu, H.; Drula, E.; Coutinho, P.M.; Henrissat, B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 2014, 42, D490–D495.
  43. Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403–410.
  44. Jones, P.; Binns, D.; Chang, H.-Y.; Fraser, M.; Li, W.; McAnulla, C.; Hunter, S. InterProScan 5: Genome-scale protein function classification. Bioinformatics 2014, 30, 1236–1240.
  45. Mistry, J.; Chuguransky, S.; Williams, L.; Qureshi, M.; Salazar, G.A.; Sonnhammer, E.L.; Bateman, A. Pfam: The protein families database in 2021. Nucleic Acids Res. 2021, 49, D412–D419.
  46. Blin, K.; Shaw, S.; Kloosterman, A.M.; Charlop-Powers, Z.; van Wezel, G.P.; Medema, M.H.; Weber, T. antiSMASH 6.0: Improving cluster detection and comparison capabilities. Nucleic Acids Res. 2021, 49, W29–W35.
  47. Park, J.; Park, J.; Jang, S.; Kim, S.; Kong, S.; Choi, J.; Kim, S. FTFD: An informatics pipeline supporting phylogenomic analysis of fungal transcription factors. Bioinformatics 2008, 24, 1024–1025.
  48. Krzywinski, M.; Schein, J.; Birol, I.; Connors, J.; Gascoyne, R.; Horsman, D.; Jones, S.J.; Marra, M.A. Circos: An information aesthetic for comparative genomics. Genome Res. 2009, 19, 1639–1645.
  49. He, J.; Yu, B.; Zhang, K.; Ding, X.; Chen, D. Strain improvement of Trichoderma reesei Rut C-30 for increased cellulase production. Indian J. Microbiol. 2009, 49, 188–195.
  50. Liu, F.; Wang, Z.; Manglekar, R.R.; Geng, A. Enhanced cellulase production through random mutagenesis of Talaromyces pinophilus OPC4-1 and fermentation optimization. Process Biochem. 2020, 90, 12–22.
  51. Zhang, Y.; Li, X.; Wang, J.; Chen, Q.; Liu, W.; Feng, Y.; Zhang, G. Discovery of novel α-l-rhamnosidases from metagenomic libraries. Appl. Environ. Microbiol. 2022, 88, e00567-22.
  52. Wu, T.; Zhang, L.; Chen, X.; Li, Y.; Wang, Q. Engineering GH79 β-glucuronidases for enhanced activity. ACS Synth. Biol. 2023, 12, 678–689.
  53. Nakamura, A.M.; Nascimento, A.S.; Polikarpov, I. Structural diversity of carbohydrate esterases. Biotechnol. Res. Innov. 2017, 1, 35–51.
  54. Benoit, I.; Culleton, H.; Zhou, M.; DiFalco, M.; Aguilar-Osorio, G.; Battaglia, E.; Bouzid, O.; Brouwer, C.P.J.M.; El-Bushari, H.B.O.; Coutinho, P.M.; et al. Closely Related Fungi Employ Diverse Enzymatic Strategies to Degrade Plant Biomass. Biotechnol. Biofuels 2015, 8, 107.
  55. Schmölzer, K.; Lemmerer, M.; Gutmann, A.; Nidetzky, B. Sucrose Synthase: A Unique Glycosyltransferase for Biocatalytic Glycosylation Process Development. Biotechnol. Adv. 2016, 34, 88–111.
  56. Lairson, L.L.; Henrissat, B.; Davies, G.J.; Withers, S.G. Glycosyltransferases: Structures, Functions, and Mechanisms. Annu. Rev. Biochem. 2008, 77, 521–555.
  57. Brakhage, A.A. Regulation of fungal secondary metabolism. Nat. Rev. Microbiol. 2013, 11, 21–32.
  58. Frisvad, J.C.; Andersen, B.; Thrane, U. The use of secondary metabolite profiling in chemotaxonomy of filamentous fungi. Mycol. Res. 2008, 112, 231–240.
  59. Andersen, M.R.; Salazar, M.P.; Schaap, P.J.; van de Vondervoort, P.J.; Culley, D.; Thykaer, J.; Albermann, K. Comparative genomics of citric-acid-producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88. Genome Res. 2011, 21, 885–897.
  60. Coradetti, S.T.; Craig, J.P.; Xiong, Y.; Shock, T.; Tian, C.; Glass, N.L. Conserved and essential transcription factors for cellulase gene expression in ascomycete fungi. Proc. Natl. Acad. Sci. USA 2012, 109, 7397–7402.
  61. Nevalainen, K.M.H.; Te’o, V.S.J.; Bergquist, P.L. Heterologous protein expression in filamentous fungi. Trends Biotechnol. 2005, 23, 468–474.
  62. Adams, T.H.; Wieser, J.K.; Yu, J.H. Asexual sporulation in Aspergillus nidulans. Microbiol. Mol. Biol. Rev. 1998, 62, 35–54.
  63. Bayram, Ö.; Braus, G.H. Coordination of secondary metabolism and development in fungi: The velvet family of regulatory proteins. FEMS Microbiol. Rev. 2012, 36, 1–24.
  64. Park, H.S.; Yu, J.H. Genetic control of asexual sporulation in filamentous fungi. Curr. Opin. Microbiol. 2012, 15, 669–677.
  65. van Peij, N.N.; Gielkens, M.M.; de Vries, R.P.; Visser, J.; de Graaff, L.H. The transcriptional activator XlnR regulates both xylanolytic and endoglucanase gene expression in Aspergillus niger. Appl. Environ. Microbiol. 1998, 64, 3615–3619.
  66. Stricker, A.R.; Mach, R.L.; de Graaff, L.H. Regulation of transcription of cellulases- and hemicellulases-encoding genes in Aspergillus niger and Hypocrea jecorina (Trichoderma reesei). Appl. Microbiol. Biotechnol. 2008, 78, 211–220.
Figure 1. Phenotypic and functional characterization of T. pinophilus Y117. (A) Hydrolysis of corncob powder by Y117: (left) untreated substrate; (right) substrate after enzymatic degradation, showing visible structural breakdown. (B,C) Colony morphology of Y117 grown on (B) PDA (potato dextrose agar) and (C) YPD (yeast extract–peptone–dextrose) media after 5 days at 30 °C. (D–F) Hyphal morphology during enzyme production: (D) pre-production phase, (E) mid-production phase, and (F) late-production phase. Scale bars: 20 μm (D,E); 10 μm (F).
Figure 2. Overview of the Y117 genome assembly. The circle diagram from outside to inside represents different chromosome lengths, gene densities, repeat sequence densities (red to blue, representing the increase in sequence density), GC contents, and numbers of CAZyme genes in the window (white, green, yellow, and red, representing the increase in the number of genes). The first 21 scaffolds of the genome were taken for mapping using the Circos v0.69-3 software [48] with a window size of 50 kb.
Figure 3. Phylogeny and genome statistics of Y117 and eleven other T. pinophilus strains. The phylogenetic tree was constructed using FastTree and MAFFT based on 5783 single-copy core genes (single-copy orthologs present in all strains). Seven bubble plots in the right panel illustrate key genomic features and sequencing quality parameters: GC content (%), genome size (Mbp), NR% (percentage of predicted proteins with BLASTP matches to the NCBI Non-Redundant Protein Database), predicted protein count, scaffold L50 (smallest number of the longest scaffolds that together cover at least 50% of the genome), scaffold N50 (length of the shortest scaffold in that set), and total scaffolds. Bubble sizes are scaled within each panel and are not comparable across panels.
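The N50 and L50 statistics named in the Figure 3 caption (and the N50/N90 values in Table 1) follow from one cumulative-sum rule. The sketch below is a minimal illustration of that rule, not the assembler's own implementation; the function name `nx_lx` and the toy scaffold lengths are assumptions made for this example.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for scaffold lengths at the given genome fraction.

    Scaffolds are added from longest to shortest until their summed length
    reaches `fraction` of the assembly. Lx is how many scaffolds were needed
    and Nx is the length of the last (shortest) one added.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Reached only if fraction > 1, i.e. the threshold exceeds the assembly size.
    raise ValueError("fraction must be between 0 and 1")


toy = [50, 40, 30, 20, 10]  # hypothetical scaffold lengths, total 150
print(nx_lx(toy))        # (40, 2)
print(nx_lx(toy, 0.9))   # (20, 4)
```

Higher N50 together with lower L50 and fewer total scaffolds indicates a more contiguous assembly, which is the sense in which the bubble plots compare the twelve genomes.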
Figure 4. Venn diagram illustrating the numbers of homologous and strain-specific genes identified in the genomes of T. pinophilus strains. Genes were classified as strain-specific if they had <50% coverage and/or <50% similarity to the nucleotide sequences of all genes in the other strains.
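The strain-specific criterion in the Figure 4 caption can be read as: a gene is strain-specific when no gene in any other strain matches it with both at least 50% coverage and at least 50% similarity. A minimal sketch of that reading follows. The function name, the tuple layout, and the example hits are hypothetical, and the alignment step that would produce the coverage and similarity values is not shown.

```python
from typing import Iterable, Tuple

# One hit against a gene in another strain: (coverage %, similarity %).
Hit = Tuple[float, float]


def is_strain_specific(
    hits: Iterable[Hit], min_cov: float = 50.0, min_sim: float = 50.0
) -> bool:
    """True if no hit reaches both the coverage and the similarity threshold."""
    return not any(cov >= min_cov and sim >= min_sim for cov, sim in hits)


# Hypothetical examples, for illustration only.
print(is_strain_specific([]))                          # True: no hits at all
print(is_strain_specific([(80.0, 40.0), (30.0, 95.0)]))  # True: each hit fails one threshold
print(is_strain_specific([(80.0, 40.0), (60.0, 70.0)]))  # False: second hit passes both
```

Under this reading a single qualifying match in any other strain is enough to classify the gene as shared.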
Figure 5. Carbohydrate-active enzymes in the twelve T. pinophilus strains, sorted according to the phylogram of Figure 3. Abbreviations: carbohydrate-binding module (CBM), carbohydrate esterase (CE), glycoside hydrolase (GH), glycosyl transferase (GT), polysaccharide lyase (PL), and auxiliary activity (AA).
Figure 6. Bubble plot depicting the CAZyme profiles of the twelve T. pinophilus strains. For each CAZyme category, the five most prevalent clusters identified in strain Y117 were compared across all strains, if applicable. The size of each bubble indicates the proportion of genes in the cluster among all the genes belonging to the specific CAZyme category, and the colors indicate different CAZyme categories.
Figure 7. Core biosynthetic genes in Talaromyces pinophilus strains predicted by antiSMASH. Abbreviations: RiPP (ribosomally synthesized and post-translationally modified peptide), NRPS (non-ribosomal peptide synthase), T1PKS (type I polyketide synthase), and T3PKS (type III polyketide synthase).
Figure 8. Comparative genomic analysis of transcription factors regulating biomass-degrading enzyme genes across twelve Talaromyces pinophilus strains.
Table 1. Genome features of T. pinophilus Y117.
Table 1. Genome features of T. pinophilus Y117.
| Genome feature | Value |
| --- | --- |
| Size of assembled genome (Mbp) | 36.52 |
| GC content of assembled genome (%) | 46.36 |
| N50 length (bp) | 3,280,491 |
| N90 length (bp) | 1,325,600 |
| Maximum length (bp) | 4,185,411 |
| Minimum length (bp) | 14,568 |
| All protein-coding genes | 12,406 |
| Protein-coding genes (≥60 aa) | 12,392 |
| Average gene length (bp) | 1588.09 |
| Average number of introns per gene | 1.89 |
| Average intron length (bp) | 132.83 |
| Average number of exons per gene | 2.89 |
| Average exon length (bp) | 1455.26 |
| tRNA genes | 121 |
| rRNA genes | 38 |
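For readers unfamiliar with the N50 and N90 statistics reported in Table 1, the following is a minimal sketch of how such values are conventionally computed from a list of contig or scaffold lengths. The function name `nx_length` and the toy lengths are hypothetical and are not taken from the Y117 assembly; the authors' actual assembly-statistics tool is not stated in this section.

```python
from typing import Iterable


def nx_length(lengths: Iterable[int], percent: int) -> int:
    """Return the Nx length for a set of sequence lengths.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least `percent` % of the
    total assembly length (percent=50 gives N50, percent=90 gives N90).
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length
    return ordered[-1]  # unreachable for valid input; kept for completeness


# Hypothetical toy assembly (NOT the Y117 contigs), total length 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = nx_length(toy_lengths, 50)
toy_n90 = nx_length(toy_lengths, 90)
```

Under this definition, a higher N50 relative to genome size indicates a more contiguous assembly, and N90 is always less than or equal to N50 for the same assembly.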
Share and Cite

MDPI and ACS Style

Li, Y.; Yue, S.; Li, P.; Zeng, J.; Guo, J.; Xiong, D.; Zhang, S.; Deng, T.; Yuan, L. Genome Sequencing and Comparative Genomics of the Hyper-Cellulolytic Fungus Talaromyces pinophilus Y117. J. Fungi 2025, 11, 614. https://doi.org/10.3390/jof11090614

AMA Style

Li Y, Yue S, Li P, Zeng J, Guo J, Xiong D, Zhang S, Deng T, Yuan L. Genome Sequencing and Comparative Genomics of the Hyper-Cellulolytic Fungus Talaromyces pinophilus Y117. Journal of Fungi. 2025; 11(9):614. https://doi.org/10.3390/jof11090614

Chicago/Turabian Style

Li, Ya, Siyuan Yue, Peng Li, Jing Zeng, Jianjun Guo, Dawei Xiong, Shuaiwen Zhang, Tao Deng, and Lin Yuan. 2025. "Genome Sequencing and Comparative Genomics of the Hyper-Cellulolytic Fungus Talaromyces pinophilus Y117" Journal of Fungi 11, no. 9: 614. https://doi.org/10.3390/jof11090614

APA Style

Li, Y., Yue, S., Li, P., Zeng, J., Guo, J., Xiong, D., Zhang, S., Deng, T., & Yuan, L. (2025). Genome Sequencing and Comparative Genomics of the Hyper-Cellulolytic Fungus Talaromyces pinophilus Y117. Journal of Fungi, 11(9), 614. https://doi.org/10.3390/jof11090614