Genomic and Metabolomic Analysis of the Endophytic Fungus Fusarium sp. VM-40 Isolated from the Medicinal Plant Vinca minor

The genus Fusarium is well-known to comprise many pathogenic fungi that affect cereal crops worldwide, causing severe damage to agriculture and the economy. In this study, an endophytic fungus designated Fusarium sp. VM-40 was isolated from a healthy specimen of the traditional European medicinal plant Vinca minor. Our morphological characterization and phylogenetic analysis reveal that Fusarium sp. VM-40 is closely related to Fusarium paeoniae, belonging to the F. tricinctum species complex (FTSC), the genomic architecture and secondary metabolite profile of which have not been investigated. Thus, we sequenced the whole genome of Fusarium sp. VM-40 with the new Oxford Nanopore R10.4 flow cells. The assembled genome is 40 Mb in size with a GC content of 47.72%, 15 contigs (≥50,000 bp; N50 ~4.3 Mb), and 13,546 protein-coding genes, 691 of which are carbohydrate-active enzyme (CAZyme)-encoding genes. We furthermore predicted a total of 56 biosynthetic gene clusters (BGCs) with antiSMASH, 25 of which showed similarity with known BGCs. In addition, we explored the potential of this fungus to produce secondary metabolites through untargeted metabolomics. Our analyses reveal that this fungus produces structurally diverse secondary metabolites of potential pharmacological relevance (alkaloids, peptides, amides, terpenoids, and quinones). We also employed an epigenetic manipulation method to activate cryptic BGCs, which led to an increased abundance of several known compounds and the identification of several putative new compounds. Taken together, this study provides a systematic analysis of the whole genome sequence, biosynthetic potential, and metabolome of the endophytic fungus Fusarium sp. VM-40.


Introduction
Endophytic fungi represent an important and rich group of microorganisms that live in plant tissues or intercellular spaces and can establish beneficial relationships with host plants [1]. Many of them are promising suppliers of multiple natural products, including alkaloids, terpenoids, flavonoids, steroids, and phenolic compounds, which contribute to various interesting pharmacological effects, such as anti-inflammatory, anti-tumor, antiphytopathogenic, antibacterial, antifungal, antiproliferative, and antioxidant activities [2,3].
Fusarium is one of the most common fungal genera and ubiquitously exists in terrestrial and marine environments. This genus, when associated with plants, can adopt diverse lifestyles, including saprotrophic, endophytic, and pathogenic lifestyles. Most previous studies have focused on plant pathogenicity, but more recently, scientific interest in endophytic Fusarium species has risen [4]. Fusarium endophytes have been reported to produce secondary metabolites with diverse pharmacological activities, such as paclitaxel produced by Fusarium solani isolated from Taxus celebica [5], vitexin produced by Fusarium solani G6 from Cajanus cajan [6], and quinine and cinchonidine produced by Fusarium isolates from Cinchona calisaya [7]. In addition, endophytic fungi have been investigated as promising biocontrol agents against many plant pathogens [8][9][10]. For instance, Fusarium oxysporum Fo47 is effective in controlling Fusarium wilt in tomatoes [11], and Fusarium commune W5 controls bakanae disease on rice flowers [12]. In light of increasing fungicide resistance and the emergence of new plant-pathogenic strains, it is a timely endeavor to further explore the antimicrobial potential of fungal endophytes and their secondary metabolites.
Fusaria are well known for their potential to produce secondary metabolites (SMs), including alkaloids, peptides, amides, terpenoids, quinones, and pyranones [13]. Despite this potential, the majority of biosynthetic gene clusters (BGCs) are known to remain silent under standard laboratory conditions. This indicates that a great number of novel metabolites are yet to be discovered via the activation of such silent gene clusters. Researchers have developed a variety of strategies to activate these cryptic BGCs [14][15][16]. One of these strategies involves the application of small-molecule compounds that modulate chromatin remodeling, ultimately leading to the induction of silent fungal BGCs [17]. Sodium butyrate (SB), which inhibits histone deacetylases, is frequently used in filamentous fungi to enhance the chemical diversity of secondary metabolites [18,19].
In this study, we isolated an endophytic fungus, Fusarium sp. VM-40, from healthy leaves of Vinca minor. Our morphological identification and phylogenetic analyses of Fusarium sp. VM-40 indicate that this strain belongs to the F. tricinctum species complex, the genomic architecture and secondary metabolite profile of which have not been investigated. Herein, we explore the genome and metabolome of Fusarium sp. VM-40 to disclose its biosynthetic potential. In addition, we successfully employed an epigenetic manipulation strategy to increase the chemical diversity of Fusarium sp. VM-40. These findings provide insight into the biotechnological potential of Fusarium sp. VM-40.

Fungus Isolation and Cultivation
The Fusarium strain was isolated from healthy-looking, surface-sterilized leaves of Vinca minor. Briefly, leaves of Vinca minor were freshly collected in Groningen (The Netherlands) in November 2021 and washed in an ultrasonic water bath (160 W, 15 min) to remove surface dirt and adherent epiphytes. Leaves were surface-sterilized in 70% ethanol for 1 min, followed by 1% sodium hypochlorite for 2 min, then washed in distilled water for 3 × 1 min [20]. Leaves were aseptically cut into small fragments and directly placed on potato dextrose agar (PDA) medium, supplemented with 100 mg·L⁻¹ ampicillin and 30 mg·L⁻¹ kanamycin to prevent bacterial growth, and incubated at 28 °C for 2 to 4 weeks. A 200-microliter aliquot of water from the last washing step was also inoculated onto a PDA plate and incubated at 28 °C for the same time to check the effectiveness of surface sterilization. Fungal mycelium emerging from the leaf pieces was picked and purified by restreaking on fresh PDA medium. Plates with the purified colonies were sealed with parafilm and stored at 4 °C.

Morphological Analysis and Internal Transcribed Spacer (ITS)-Based Identification
For morphological characterization, the fungal isolate was grown on synthetic nutrient-poor agar (SNA), Czapek yeast autolysate agar (CYA), and PDA media. After 7 days, the fungal colonies on each medium were observed for colony color and diameter, medium color around the colony, and colony reverse color. For microscopic analysis, a small portion of the mycelium was mixed with lactophenol blue dye and observed using an optical microscope (Olympus BX41). The digital images were captured using a connected Leica camera (Heerbrugg, Switzerland).
The full ITS region was amplified by polymerase chain reaction (PCR) with the ITSF1 (5′-CTTGGTCATTTAGAGGAAGTAA-3′) and ITS4 (5′-TCCTCCGCTTATTGATATGC-3′) primers. The PCR reaction was performed in a thermal cycler with a 2× Q5 PCR master mix (New England Biolabs, Ipswich, MA, USA) and fungal DNA with the following program: 1 min at 98 °C, 30 cycles of 10 s at 98 °C, 15 s at 55 °C, and 20 s at 72 °C, followed by 5 min of final extension at 72 °C. Two microliters of the PCR product were taken for 1% agarose gel electrophoresis analysis to confirm the successful amplification. The PCR product was purified using the QIAquick PCR Purification Kit (Qiagen, Venlo, the Netherlands) and sent to Macrogen Europe (Amsterdam, the Netherlands) for Sanger sequencing. The resulting sequences were analyzed with the Basic Local Alignment Search Tool (BLAST) against the nucleotide collection of the National Center for Biotechnology Information (NCBI) to identify the best match for the fungal isolate based on E-value.
DNA extraction was performed with the Genomic Buffer Set (Qiagen) according to the manufacturer's protocol with the following minor modifications: (1) six 2 mL Eppendorf tubes with 25 mg lyophilized and ground mycelium were used instead of cells directly from the medium; (2) Vinotaste PRO (Novozymes, Bagsvaerd, Denmark) with a final concentration of 20 mg·mL⁻¹ was used as lysing enzyme instead of lyticase; (3) enzymatic degradation of the cell wall was performed at 30 °C for 1 h (100 rpm) and cell lysis was performed at 50 °C for 2 h (25 rpm) instead of the recommended time and temperature. The extracted DNA was then purified using QIAGEN Genomic-Tips 20 G-1 according to the manufacturer's protocol, with the modification that the tips were washed four times.
Circulomics Short Read Eliminator XS (PacBio, Menlo Park, CA, USA) was used to remove small fragments from the DNA preparations according to the manufacturer's protocol. Quality control of the purified DNA was performed using NanoDrop N-100 (ThermoFisher, Waltham, MA, USA), Qubit 3.0 (Invitrogen, Waltham, MA, USA), and the Qubit dsDNA HS Assay Kit.

Library Preparation and Sequencing
For long-read sequencing, the genomic DNA was prepared using Oxford Nanopore Technologies' Ligation kit (SQK-LSK112) according to the manufacturer's guidelines. Briefly, genomic DNA (1000 ng) was subjected to end repair and tailing by NEBNext FFPE DNA Repair mix and NEBNext Ultra II End repair/dA-tailing modules (New England Biolabs, Ipswich, MA, USA) and purified with AMPure XP (Beckman Coulter, Pasadena, CA, USA) magnetic beads. The sequencing adaptors were ligated using the NEBNext Quick Ligation Module (New England Biolabs, Ipswich, MA, USA). After a final product clean-up using the Long Fragment Buffer, the sequencing library was loaded into a primed FLO-MIN112 (ID: FAT75549) flow cell on a MinION device for a 46-h run. Data acquisition and real-time basecalling were carried out with MinKNOW software (version 22.05.5).

Computational Analysis
The raw reads were basecalled using Guppy version 6.1.5 (Oxford Nanopore Technologies, Oxford, UK) in GPU mode using the dna_r10.4_e8.1_sup.cfg model [21]. The basecalled reads were subsequently filtered to a minimum length of 2 kb and a minimum quality of Q10 using NanoFilt (version 2.8.0) [22]. NanoPlot (version 1.40.0) [22] was used to evaluate the filtered reads. Assembly was performed using Flye (version 2.9-b1778). The quality of the genome assembly was evaluated using QUAST v5.1.0rc1 [23]. Bandage (version 0.8.1) [24] was used to visualize the newly assembled genome of Fusarium sp. VM-40 (Figure S1). The draft assembly was subsequently polished in two rounds: first using Racon version 1.4.10 with default settings [25], then Medaka version 0.11.5 with default settings. The completeness of the assembly was evaluated using BUSCO 5.4.3 (ascomycota_odb10 dataset). Genome annotation was carried out using the online platform Genome Sequence Annotation Server (GenSAS, https://www.gensas.org, accessed on 5 September 2022), which provides a pipeline for whole genome structural and functional annotation [26]. The sequencing data and genome assembly for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB62500.
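The read filter described above (minimum length 2 kb, minimum quality Q10) can be sketched as follows. This is a minimal illustration and not NanoFilt's actual implementation: the function names and the toy records are hypothetical, and the mean read quality is assumed to be obtained by averaging per-base error probabilities rather than the Phred scores themselves.

```python
import math


def mean_read_quality(quality_string, phred_offset=33):
    """Mean Phred quality of a read, averaged in error-probability space."""
    if not quality_string:
        return 0.0
    error_probs = [10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(error_probs) / len(error_probs))


def filter_reads(records, min_length=2000, min_quality=10.0):
    """Yield (name, sequence, quality) tuples that pass both thresholds."""
    for name, sequence, quality in records:
        if len(sequence) >= min_length and mean_read_quality(quality) >= min_quality:
            yield name, sequence, quality


# Toy records (hypothetical): with offset 33, '5' encodes Q20 and '#' encodes Q2.
toy_reads = [
    ("long_good", "A" * 2500, "5" * 2500),
    ("short_good", "A" * 500, "5" * 500),
    ("long_poor", "A" * 2500, "#" * 2500),
]
kept_names = [name for name, _, _ in filter_reads(toy_reads)]
print(kept_names)
```

Only the read that is both long enough and of sufficient quality survives; the short read and the low-quality read are discarded independently of each other.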

Comparative Analysis of Fungal Genomes and Phylogenetic Analysis
The whole genome sequence and annotated proteome of 20 other Fusarium species, together with Neonectria ditissima (to be used as an outgroup), were downloaded from the JGI database and used for phylogenetic and comparative genomics analysis (Table S1).
For the phylogenetic analysis, six barcode sequences were used: the genes coding for the translation elongation factor 1α (tef1), RNA polymerase II subunits 1 and 2 (rpb1 and rpb2), and beta-tubulin (tub2), as well as the sequence of the internal transcribed spacer (ITS) and the large ribosomal subunit (LSU). These six loci were extracted from each genome, concatenated, and aligned using ClustalW, followed by the generation of a Maximum-Likelihood (ML) tree in IQ-TREE (version 1.6.12) [27] with 1000 bootstrap replicates.
A second phylogenetic tree was built using orthologous proteins. OrthoFinder version 2.5.4 [28] was used to infer phylogeny using predicted protein sequences to determine the phylogenetic relationships. Single-copy orthologous sequences between these species were retrieved, specifying multiple sequence alignment as the method of gene tree inference (-M). The resulting single-copy orthologous sequences were aligned using MAFFT (v7.453) [29] with default parameters. Phylogenetic inferences were conducted using FastTree [30] with local bootstrap values of 1000 replicates. The tree was rooted with Neonectria ditissima as an outgroup by the STRIDE algorithm [31].
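The selection of single-copy orthogroups mentioned above can be illustrated with a short sketch. It assumes a table of gene counts per orthogroup and species has already been loaded into a dictionary; the orthogroup IDs, species labels, and function name are hypothetical, and OrthoFinder performs this step internally.

```python
def single_copy_orthogroups(gene_counts, species):
    """Return orthogroup IDs that have exactly one gene in every listed species."""
    return sorted(
        orthogroup
        for orthogroup, counts in gene_counts.items()
        if all(counts.get(sp, 0) == 1 for sp in species)
    )


# Toy gene-count table (hypothetical IDs and species labels).
toy_counts = {
    "OG0000001": {"sp_a": 1, "sp_b": 1, "sp_c": 1},  # single-copy everywhere
    "OG0000002": {"sp_a": 2, "sp_b": 1, "sp_c": 1},  # duplicated in sp_a
    "OG0000003": {"sp_a": 1, "sp_b": 1},             # missing from sp_c
}
toy_species = ["sp_a", "sp_b", "sp_c"]
single_copy = single_copy_orthogroups(toy_counts, toy_species)
print(single_copy)
```

Requiring exactly one copy in every species avoids paralogs in the alignment, at the cost of discarding orthogroups that are absent or duplicated in even a single genome.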
To better discriminate between the 34 isolates belonging to the F. tricinctum species complex, a phylogenetic ML tree was built on the alignment of the tef1 sequences, which is commonly the first-choice identification marker in Fusarium species [32]. Species are listed in Table S2.

Gene Prediction and Annotation
The tRNAs and rRNAs were predicted using tRNAscan-SE (version 2.0.11) [33] and barrnap (version 0.9), respectively. Gene Ontology (GO) annotation was performed using InterPro (version 66.0) [34]. To predict CAZymes, we used the web-based meta server dbCAN2 [35], which integrates three tools (dbCAN HMM, CAZy, and dbCAN-sub). The three outputs were combined, and CAZymes found by only one tool were removed to improve the CAZyme annotation accuracy. Secondary metabolite biosynthetic clusters were identified using the antiSMASH web server (fungal version 7.0) with the default settings [36].
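The consensus step for CAZyme prediction (discarding genes reported by only one of the three tools) amounts to a simple vote. The sketch below assumes each tool's output has already been reduced to a set of gene IDs; the tool labels, gene IDs, and function name are placeholders.

```python
from collections import Counter


def consensus_cazymes(predictions_by_tool, min_tools=2):
    """Keep gene IDs reported by at least `min_tools` of the prediction tools."""
    support = Counter()
    for gene_ids in predictions_by_tool.values():
        support.update(set(gene_ids))  # each tool votes at most once per gene
    return sorted(gene for gene, votes in support.items() if votes >= min_tools)


# Toy predictions (hypothetical gene IDs and tool labels).
toy_predictions = {
    "tool_1": {"g1", "g2", "g3"},
    "tool_2": {"g2", "g3", "g4"},
    "tool_3": {"g3", "g5"},
}
consensus = consensus_cazymes(toy_predictions)
print(consensus)
```

Genes supported by a single tool (g1, g4, and g5 in the toy data) are dropped, while genes supported by two or three tools are retained.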

Extraction of Secondary Metabolites and High-Resolution Liquid Chromatography-Mass Spectrometry (HR-LC-MS) Analysis
Fungal mycelium was transferred to small (ø 35 mm × 10 mm) PDA plates supplemented with different concentrations (0, 1, 10, and 100 mM) of the histone deacetylase (HDAC) inhibitor sodium butyrate (SB). The plates were incubated at 25 °C for 14 days alongside empty PDA and PDA-SB plates without the fungus as controls.
For extraction of SMs, the whole agar pads (agar and mycelium) were cut into pieces and transferred to 25 mL glass bottles, then extracted with 4 mL solvent (9:1 ethyl acetate-methanol (v/v), 0.1% formic acid), spiked with 5 µL caffeic acid standard solution with a concentration of 10 mg·mL⁻¹, and sonicated in a sonication bath for one hour. The organic phase was subsequently collected and dried under a gentle stream of N₂. The dried extracts were resuspended in 500 µL of 1:1 MeOH-MilliQ water (v/v) and filtered with 0.45 µm PTFE filters.
HR-LC-MS/MS analysis was performed with a Shimadzu Nexera X2 high performance liquid chromatography (HPLC) system with binary LC20ADXR coupled to a Q Exactive Plus hybrid quadrupole-orbitrap mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA). A Kinetex EVO C18 reversed-phase column was applied for HPLC separations (100 mm × 2.1 mm I.D., 2.6 µm, 100 Å particles, Phenomenex, Torrance, CA, USA), which was maintained at 50 °C. The mobile phase consisted of a gradient of solution A (0.1% formic acid in MilliQ water) and solution B (0.1% formic acid in acetonitrile). A linear gradient was used: 0-2 min 5% B, 2-17 min linear increase to 50% B, 17-21 min linear increase to 90% B, 21-24 min held at 90% B, 24-24.01 min decrease to 5% B, and 24.01-30 min held at 5% B. The injection volume was 2 µL, and the flow was set to 0.25 mL·min⁻¹. MS and MS/MS analyses were performed with electrospray ionization (ESI) in positive mode at a spray voltage of 3.5 kV and sheath and auxiliary gas flow set at 60 and 11, respectively. The ion transfer tube temperature was 300 °C. Spectra were acquired in data-dependent mode with a survey scan at m/z 100-1500 at a resolution of 70,000, followed by MS/MS fragmentation of the top 5 precursor ions at a resolution of 17,500. A normalized collision energy of 30 was used for fragmentation, and fragmented precursor ions were dynamically excluded for 10 s.
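The gradient program above can be written as a piecewise-linear function of time, which makes it easy to look up the mobile-phase composition at any retention time. The breakpoints are taken from the text; the function name and the interpolation between breakpoints are an illustrative sketch, not instrument software.

```python
# (time in min, % B) breakpoints of the gradient described in the text.
GRADIENT = [
    (0.0, 5.0), (2.0, 5.0), (17.0, 50.0), (21.0, 90.0),
    (24.0, 90.0), (24.01, 5.0), (30.0, 5.0),
]


def percent_b(time_min, gradient=GRADIENT):
    """Linearly interpolate the percentage of solution B at a given time."""
    if not gradient[0][0] <= time_min <= gradient[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


for t in (1.0, 9.5, 19.0, 22.0, 28.0):
    print(f"{t:5.2f} min: {percent_b(t):5.1f}% B")
```

For example, halfway through the 2-17 min ramp (9.5 min) the mobile phase contains 27.5% B, and halfway through the 17-21 min ramp (19 min) it contains 70% B.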

Data Processing and Analysis
The acquired data were further processed by Thermo Scientific FreeStyle software version 1.8. The raw MS/MS data file was converted to mzXML format using the easy convertor provided by the Global Natural Products Social Molecular Networking (GNPS) (https://ccms-ucsd.github.io/GNPSDocumentation/fileconversion/, accessed on 20 March 2023). The data files were subsequently uploaded to GNPS (https://gnps.ucsd.edu/, accessed on 20 March 2023) using WinSCP.
A molecular network was created using the online workflow on the GNPS website [37]. The data were filtered by removing all MS/MS fragment ions within ±17 Da of the precursor m/z. MS/MS spectra were window filtered by choosing only the top six fragment ions in the ±50 Da window throughout the spectrum. The precursor ion mass tolerance and MS/MS fragment ion tolerance were both set to 0.02 Da. A network was then created where edges were filtered to have a cosine score above 0.7 and more than six matched peaks. Further, edges between two nodes were kept in the network if each of the nodes appeared in the other's respective top 10 most similar nodes (molecular networking job: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=4d9f4cd19b0d4279838c3bea94fa0bff, accessed on 20 March 2023). All mass spectrometry data have been deposited on GNPS under the accession number MassIVE ID: MSV000092159. The molecular network was visualized in Cytoscape version 3.9.1 [38]. Nodes that also existed in the PDA and PDA-SB controls were considered background and thus omitted from the final molecular network.
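The edge-filtering logic described above (score and matched-peak thresholds followed by a mutual top-10 criterion) can be sketched as below. This is not the GNPS implementation: the function name and toy edges are hypothetical, and treating both thresholds as inclusive is an assumption, since the text does not state whether the cut-offs themselves are accepted.

```python
from collections import defaultdict


def filter_edges(edges, min_cosine=0.7, min_matched_peaks=6, top_k=10):
    """Apply score and peak thresholds, then keep only mutual top-k edges.

    `edges` holds (node_a, node_b, cosine, matched_peaks) tuples.
    Thresholds are treated as inclusive (an assumption).
    """
    passing = [
        (a, b, cosine)
        for a, b, cosine, matched in edges
        if cosine >= min_cosine and matched >= min_matched_peaks
    ]
    neighbours = defaultdict(list)
    for a, b, cosine in passing:
        neighbours[a].append((cosine, b))
        neighbours[b].append((cosine, a))
    top = {
        node: {other for _, other in sorted(pairs, reverse=True)[:top_k]}
        for node, pairs in neighbours.items()
    }
    # An edge survives only if each node is in the other's top-k list.
    return [(a, b, cosine) for a, b, cosine in passing if b in top[a] and a in top[b]]


# Toy candidate edges (hypothetical node names and scores).
toy_edges = [
    ("A", "B", 0.90, 8),
    ("A", "C", 0.80, 7),
    ("C", "D", 0.95, 9),
    ("B", "E", 0.75, 3),   # too few matched peaks
    ("D", "E", 0.60, 10),  # cosine too low
]
strict = filter_edges(toy_edges, top_k=1)
relaxed = filter_edges(toy_edges, top_k=10)
print(strict)
print(relaxed)
```

With `top_k=1`, the A-C edge is removed because C is not A's best match; with `top_k=10`, all three edges that pass the thresholds are kept.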
The spectra in the network were then searched against the GNPS spectral libraries. Matches were kept with a score above 0.7 and at least six matched peaks. The data were also analyzed by the GNPS molecular library search V2. The precursor ion mass tolerance and fragment ion tolerance were both set to 0.02 Da. The minimal matched peaks were set to six, and the score threshold was 0.7 (library search job: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=74d4363bcacf4bcf89db4c4278fd3d73, accessed on 25 March 2023). Several matched annotations in the library search mode were manually added to the molecular network.

Isolate VM-40 from Vinca minor Is a Fusarium
Vinca minor is a popular ornamental plant today and was already appreciated by the ancient Romans for its beauty and medicinal properties. It produces a wide array of vinca alkaloids with neuroprotective and antioxidant bioactivities and was used in folk medicine for the treatment of hypertension, as a carminative, emetic, hemostatic, and astringent, and in the treatment of toothache and snakebite [39,40]. In previous studies, three endophytic Trichoderma species were isolated from the stems of V. minor [41], and ten species that were not further specified were isolated from various plant tissues [42]. One of these isolates was reported to produce vincamine, the main alkaloid found in V. minor leaves [42].
In an effort to learn more about the microbes associated with the inner tissues of the V. minor plant, we isolated nine endophytic fungi from healthy leaves collected in Groningen, The Netherlands. Based on ITS sequencing, the isolates were identified as Phialophora sp., Pleosporales sp., Neocucurbitaria sp., Cadophora sp., Boeremia sp., Lophiostoma sp., Alternaria sp., Diaporthe sp., and Fusarium sp. The best matches for the Fusarium isolate in the NCBI nr/nt database (100% sequence identity) were F. oxysporum, F. tricinctum, F. avenaceum, F. redolens, F. acuminatum, F. lateritium, F. paeoniae, F. sp., and various uncultured Fusarium strains (Dataset S1). This drew our attention since Fusarium strains are best known for their plant-pathogenic lifestyle [43], yet the isolate at hand did not cause any visible symptoms of disease. Furthermore, endophytic Fusarium species were reported to be a rich source of bioactive compounds, and they have been attracting considerable interest, as recently reviewed by Ahmed et al. [44]. Therefore, we decided to further investigate this fungus in terms of morphology, genomics, and metabolomics.
The fungal isolate grew on PDA, CYA, and SNA media after 7 days of incubation at 25 °C, spreading with aerial mycelium and smooth, regular margins. On PDA, the colonies attained a diameter of 30-35 mm with a velvety to floccose texture, with a light pink to yellowish color in the front and a dark ruby color in reverse (Figure 1A). Colonies on CYA reached 35-40 mm with dense aerial mycelia and showed a pink coloration with a pale white peripheral border on the obverse side and a yellowish to red color in reverse (Figure 1A). On SNA, Fusarium sp. VM-40 formed smaller colonies of 28-32 mm in diameter, with pink coloration in the center and white hyphae at the margin (Figure 1A).
Under the optical microscope, the conidiophores showed branches bearing doliiform phialides. Macroconidia were rare in colonies on PDA and CYA but abundant in colonies on SNA. They were relatively slender, sickle-shaped to almost straight, with 3-5 septa (Figure 1B).
Based on the culture, morphological observation, and ITS regions, the isolated endophytic fungus was preliminarily assigned to the genus Fusarium and named Fusarium sp. VM-40. Since most of its close relatives based on its ITS sequence are uncharacterized fungi, we decided to further investigate its genome.

Genome Sequencing, Assembly, and Genomic Features
With an optimized extraction protocol, we isolated high-quality, high-molecular-weight genomic DNA from the mycelium of Fusarium sp. VM-40 (Table S3) and subjected it to long-read sequencing with Oxford Nanopore technology. We obtained in total 1,997,205 raw reads (6.1 Gb) with an N50 value of 5.5 kb before filtering and 7.3 kb after filtering with high read quality (Table 1). We assembled the reads into 15 contigs with a total size of 40 Mb and polished the draft assembly with Racon and Medaka. The final assembly revealed a GC content of 47.72% and a BUSCO completeness of 97.4%. Next, we structurally annotated the genome of Fusarium sp. VM-40 with GenSAS and predicted 13,546 proteins. For the non-coding RNAs, we predicted 80 rRNAs and 320 tRNAs. Overall, we achieved a highly contiguous assembly (Table S4) with a good degree of completeness.
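For readers unfamiliar with the summary statistics used above, the following sketch shows how an N50 value and a GC content are conventionally computed from sequence lengths and sequences. The function names and toy inputs are illustrative; the values reported in this study come from NanoPlot and QUAST, not from this code.

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")


def gc_percent(sequence):
    """GC content in percent, ignoring characters other than A, C, G, and T."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 100.0 * gc / (gc + at)


toy_lengths = [2, 3, 4, 5, 6]           # 20 bases in total
toy_n50 = n50(toy_lengths)              # 6 + 5 = 11 bases reach half of 20
toy_gc = gc_percent("GGCCAATTNN")       # 4 GC out of 8 unambiguous bases
print(toy_n50, toy_gc)
```

Because N50 is weighted by sequence length, removing short reads raises it, which is consistent with the increase in read N50 after filtering reported above.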

Multilocus Phylogeny and Comparative Analysis of the Fusarium sp. VM-40 Genome
In order to continue the taxonomic classification of the fungal isolate, we extracted the sequences of several taxonomic markers from the whole genome assembly and used them for a multilocus phylogenetic analysis. We compared the concatenated sequences of tef1, rpb1, rpb2, tub2, ITS, and LSU (~4000 nucleotides in total) of Fusarium sp. VM-40 and 20 other Fusarium species in the Maximum-Likelihood phylogenetic analysis (including Neonectria ditissima as an outgroup) (Figure 2). In this analysis, Fusarium sp. VM-40 clustered together with F. avenaceum and F. tricinctum, which both belong to the F. tricinctum species complex (FTSC). Due to highly similar barcode sequences across species, taxonomic assignments in a species-rich genus, such as Fusarium, can be complicated [45]. To confirm the phylogenetic placement of Fusarium sp. VM-40 within the 20 Fusarium species, we also performed a whole genome comparison of the 22 species with OrthoFinder [28]. OrthoFinder identified a total of 316,098 genes in the 22 whole genome sequences and assigned them to 20,601 orthogroups (Dataset S2). Of these orthogroups, 6309 were shared, 445 were classified as species-specific, and 3613 were single-copy across the 22 species. The resulting phylogenetic analysis with 3613 single-copy genes (Figure S2) shows once again that Fusarium sp. VM-40 clusters with the FTSC, confirming the previous results.
Finally, we compared the tef1 sequence of Fusarium sp. VM-40 to the sequences of 34 Fusarium isolates belonging to the FTSC to further dissect their phylogenetic relationship. We observed that Fusarium sp. VM-40 was closely related to F. paeoniae and Fusarium sp. FTSC 5 ( Figure S3). Most species in the FTSC are known plant pathogens. Therefore, we were even more curious to investigate the specific primary and secondary metabolic pathways encoded in the genome of our new Fusarium sp. VM-40 in order to find out whether it is likely to be an opportunistic pathogen.

The Genome of Fusarium sp. VM-40 Encodes for Various Enzymes of Biotechnological Interest
Based on GO annotation, we classified the predicted genes within the Fusarium sp. VM-40 genome into functional categories. The top 50 terms were grouped into the three major GO terms as follows: biological processes (18.9%), molecular functions (34.8%), and cellular components (46.3%) (Figure 3).
The gene ontology analysis of the genes that are related to CAZymes includes "carbohydrate metabolic process", "hydrolase activity", "hydrolyzing O-glycosyl compounds", "pectate lyase activity", and "carbohydrate binding". These enzymes play an important role in carbohydrate degradation, modification, and biosynthesis in fungi and are particularly interesting for industrial applications [46]. In total, dbCAN predicted 691 genes encoding CAZymes in the genome of Fusarium sp. VM-40 (Table S5). In general, when compared with other Fusarium species from the CAZy database, this isolate shows a similar abundance of CAZymes [47].
Although not essential for life, secondary metabolism is an important biological process, e.g., for niche adaptation, inter- and intra-species communication, and competition. Therefore, we queried the genomes of Fusarium sp. VM-40 and seven Fusarium species from Table S1 for biosynthetic gene clusters (BGCs) using the online-based tool fungiSMASH [36]. Fusarium sp. VM-40 possesses 56 BGCs for secondary metabolite biosynthesis, classified as follows based on the class of core biosynthetic enzymes: 12 polyketide synthases (PKSs), 15 nonribosomal peptide synthetases (NRPSs), 6 NRPS-PKS hybrids, 1 indole, 1 NRPS-indole hybrid, 11 terpene synthases (TSs), 1 NRPS-TS, 6 fungal ribosomally synthesized and post-translationally modified peptides (RiPPs), and one phosphonate cluster (Figure 4A, Table S6). Among them, 25 BGCs showed similarities with known BGCs in the MIBiG database [48] and are predicted to give rise to various types of compounds (Figure S4).
Across all Fusarium species analyzed, we predicted between 40 and 61 BGCs, averaging 47 clusters per genome. In total, 41 clusters showed similarities with known BGCs in the MIBiG database. Among the predicted clusters, NRPS clusters are the most abundant type of BGC, followed by terpene, PKS, and hybrid clusters, in particular NRPS-T1PKS hybrids. We constructed a presence/absence matrix of these known BGCs to visualize the biosynthetic diversity among the eight Fusarium genomes (Figure 4B). The comparison revealed considerable differences with only five BGCs, namely the ones predicted to give rise to choline, squalestatin S1, lucilactaene, oxyjavanicin, and gibepyrone A, conserved across all genomes. Fusarium sp. VM-40 shares 19 out of the 30 known BGCs with F. avenaceum and F. tricinctum. In addition, Fusarium sp. VM-40 shares a hexadehydroastechrome (HAS) BGC with these two species and F. oxysporum Fo47. F. oxysporum Fo47 is a putative biocontrol strain, and the HAS cluster has been implicated in this biocontrol function since it is absent in pathogenic strains of the same species, such as F. oxysporum f. sp. lycopersici, which is pathogenic to tomatoes [49]. In Fusarium sp. VM-40, the core NRPS gene of the HAS BGC shows 73% sequence identity with that in Aspergillus fumigatus [50], yet it remains unclear what the role of the HAS cluster is in these FTSC isolates.
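A presence/absence matrix of known BGCs, as used for the comparison above, can be built from per-genome lists of cluster annotations. The sketch below uses placeholder genome and cluster names only; it does not reproduce the data of Figure 4B, and the function names are hypothetical.

```python
def presence_absence(bgcs_by_genome):
    """Return (sorted BGC names, {genome: [0 or 1 per BGC]})."""
    all_bgcs = sorted(set().union(*bgcs_by_genome.values()))
    matrix = {
        genome: [int(bgc in present) for bgc in all_bgcs]
        for genome, present in bgcs_by_genome.items()
    }
    return all_bgcs, matrix


def conserved_bgcs(bgcs_by_genome):
    """BGCs present in every genome."""
    return sorted(set.intersection(*map(set, bgcs_by_genome.values())))


# Placeholder annotations (hypothetical genomes and clusters).
toy_bgcs = {
    "genome_1": {"bgc_a", "bgc_b", "bgc_c"},
    "genome_2": {"bgc_a", "bgc_c"},
    "genome_3": {"bgc_a", "bgc_d"},
}
columns, matrix = presence_absence(toy_bgcs)
core = conserved_bgcs(toy_bgcs)
print(columns)
for genome, row in matrix.items():
    print(genome, row)
print(core)
```

Columns that contain only ones correspond to conserved clusters, while rows with a unique one in some column point to clusters found in a single genome.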
Another BGC shared among Fusarium sp. VM-40, F. avenaceum, and F. tricinctum is also present in the F. graminearum Z3639 genome. It is predicted to give rise to fusaristatin A, the in vivo function of which is unknown. Interestingly, however, previous research has suggested that the fusaristatin A BGC is only present in a subset of Western Australian F. pseudograminearum isolates [51], and its absence appears to be associated with increased crown rot aggressiveness of F. pseudograminearum on wheat [52]. This suggests that fusaristatin A may also have a biocontrol function.
Within the FTSC, the BGC predicted to produce fusaridione A is only present in Fusarium sp. VM-40. The core gene of this BGC shows 58% sequence identity with that in Fusarium heterosporum, from which fusaridione A was first isolated [53]. The biosynthesis and biological functions of fusaridione A are thus far unknown, which is likely due to the fact that this compound is highly unstable.
Taken together, the number of predicted BGCs in Fusarium sp. VM-40 indicates that this species has a broad potential for SM biosynthesis and is worth further analysis.

Fusarium sp. VM-40 Produces a Wide Range of Secondary Metabolites
A preliminary analysis of the crude organic extracts from cultures of Fusarium sp. VM-40 showed poor production of SMs ( Figure 5A), despite the relatively high abundance of BGCs. This suggests that many of these clusters are expressed at low levels or are completely silent under standard culture conditions. To overcome this, we set up epigenetic manipulation experiments. We grew Fusarium sp. VM-40 on media containing 1, 10, and 100 mM sodium butyrate, a commonly used epigenetic modulator, monitored the phenotype of the cultures, and analyzed the TIC chromatograms of crude extracts after 14 days of cultivation ( Figure 5). With increasing concentrations of SB, the fungal colony gradually turns from yellow to red in morphology, indicating that new metabolites are produced, possibly due to one or more BGCs being upregulated by the effect of SB in a dose-dependent fashion. As expected, we also observed several changes in the TIC: the abundance of several peaks with retention times (rt) around 20 min, labeled 2, 4, 5, 6, and 11 in Figure 5A, gradually increases upon treatment with increasing concentrations of SB. Peak 28, however, and others remain unchanged in the treatment groups. There are also several new peaks that appear in the extracts of fungus grown in the presence of 1 and 10 mM SB ( Figure 5B,C), e.g., the small peaks 1 and 7 (rt around 20 min), and the peaks m/z 607.3800 (rt 14.08 min), and m/z 639.4057 (rt 15.83 min). Interestingly, these latter, unidentified peaks are again absent in the extracts of the fungus grown in the presence of 100 mM SB ( Figure 5D). In addition to With increasing concentrations of SB, the fungal colony gradually turns from yellow to red in morphology, indicating that new metabolites are produced, possibly due to one or more BGCs being upregulated by the effect of SB in a dose-dependent fashion. 
As expected, we also observed several changes in the TIC: the abundance of several peaks with retention times (rt) around 20 min, labeled 2, 4, 5, 6, and 11 in Figure 5A, gradually increases upon treatment with increasing concentrations of SB. Peak 28 and several others, however, remain unchanged in the treatment groups. Several new peaks also appear in the extracts of the fungus grown in the presence of 1 and 10 mM SB (Figure 5B,C), e.g., the small peaks 1 and 7 (rt around 20 min) and the peaks at m/z 607.3800 (rt 14.08 min) and m/z 639.4057 (rt 15.83 min). Interestingly, these latter, unidentified peaks are again absent in the extracts of the fungus grown in the presence of 100 mM SB (Figure 5D). In addition to the changes discussed so far, treatment with 100 mM SB also elicits the production of further new compounds, labeled 3, 8, 9, 10, 13, 26, and 27 in Figure 5D.
To gain further information on the chemical diversity of the Fusarium sp. VM-40 metabolome, especially the differentially produced SMs, we performed a molecular networking analysis using the Global Natural Products Social Molecular Networking (GNPS) platform. The generated molecular network was manually curated by deleting nodes present in the PDA and PDA-SB control groups, ultimately leading to a molecular network consisting of 895 nodes (Figure 6, Dataset S3). Each node is represented as a pie chart whose colors indicate the SB concentration groups in which the corresponding metabolite was detected. The border width of the node indicates the relative abundance of the compound in the extract.
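The curation step described above (discarding nodes that also occur in the control groups, then displaying each remaining node as a pie chart over the treatment groups) can be illustrated with a minimal sketch. The node layout, the group labels other than PDA and PDA-SB, and all numerical values are hypothetical and serve only to show the logic; this is not the GNPS or Cytoscape data format.

```python
# Hypothetical node table: each node stores the summed precursor intensity
# observed in every sample group. m/z and intensity values are illustrative.
CONTROL_GROUPS = {"PDA", "PDA-SB"}  # medium-only controls named in the text


def curate_nodes(nodes, control_groups=CONTROL_GROUPS):
    """Keep only nodes that show no signal in any control group."""
    kept = []
    for node in nodes:
        in_control = any(
            node["intensity"].get(group, 0.0) > 0.0 for group in control_groups
        )
        if not in_control:
            kept.append(node)
    return kept


def group_fractions(node, control_groups=CONTROL_GROUPS):
    """Pie-chart fractions of a node across the non-control groups."""
    treated = {
        group: value
        for group, value in node["intensity"].items()
        if group not in control_groups and value > 0.0
    }
    total = sum(treated.values())
    return {group: value / total for group, value in treated.items()} if total else {}


example_nodes = [
    {"id": 1, "mz": 500.0, "intensity": {"SB_0mM": 5.0, "SB_100mM": 15.0}},
    {"id": 2, "mz": 300.0, "intensity": {"PDA": 2.0, "SB_0mM": 1.0}},
]

curated = curate_nodes(example_nodes)
for node in curated:
    print(node["id"], group_fractions(node))
```

In this toy input, node 2 is dropped because it also appears in the PDA control, while node 1 is retained and split between the 0 mM and 100 mM groups.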
Within this network, we identified 31 compounds either via a direct match with the GNPS MS/MS spectral library or via inference from matched adjacent nodes (Table 2, Figures 6 and 7). For instance, we identified the major constituents of the extracts with rt around 20 min to be enniatins (ENNs). ENNs are cyclic hexadepsipeptides consisting of alternating N-methyl amino acid and hydroxy acid residues [54]. [...] which consists of D-2-hydroxyisovaleric acid (Hiv) and N-methyl-valine (NMeVal). Similarly, trace amounts of compound 2 are detected in the 0 mM, 1 mM, and 10 mM SB groups, while after 100 mM SB treatment, a higher peak appears at a retention time of 19.30 min. Compound 3 (m/z 682.468) is predicted to be enniatin A, which is composed of Hiv and N-methyl-isoleucine (NMeIle). In the MS/MS spectrum, compound 3 shows a fragment ion at m/z 100.1125 assigned to y(NMeLeu/Ile), and the absence of the y(NMeVal) fragment ion at m/z 86.0967 indicates that no isopropyl-bearing NMeVal residue is present. Compound 5 (m/z 668.451 and m/z 685.477) is predicted to be enniatin A1. Compared with compound 3, compound 5 contains an NMeVal residue in place of one NMeIle residue, which accounts for the observed fragment ion y(NMeVal) (m/z 86.0967). In addition, according to the molecular network, the node at m/z 699.494 (7) is connected to 5 and differs from it by CH5N. In the MS/MS spectrum of compound 7, only one fragment ion, at m/z 100.1125, is found, indicating that compound 7 contains NMeLeu or NMeIle but not NMeVal. It is worth mentioning that 3 and 7 are not directly connected to each other in the molecular network. Taking ref. [13] into account, compound 7 is predicted to be enniatin F, formed by three Hiv groups, two NMeIle groups, and one NMeLeu group. Compound 7 is also present in all SB treatment groups and becomes more abundant with increasing SB concentrations.
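The mass arithmetic behind these annotations can be checked with a short calculation. The sketch below rests on three assumptions of ours that the text does not state explicitly: the paired values for compound 5 are read as the [M+H]+ and [M+NH4]+ adducts, the node at m/z 699.494 is read as an [M+NH4]+ adduct of a C36H63N3O9 isomer, and the two diagnostic fragments are given the compositions C5H12N+ and C6H14N+. The molecular formulas of enniatin A1 (C35H61N3O9) and enniatin A (C36H63N3O9) are the generally accepted ones.

```python
# Monoisotopic masses (u) of the elements involved and of the electron.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def mono_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def cation_mz(formula, adduct=None):
    """m/z of a singly charged cation: neutral + adduct atoms - one electron."""
    adduct = adduct or {}
    return mono_mass(formula) + mono_mass(adduct) - ELECTRON


ENNIATIN_A1 = {"C": 35, "H": 61, "N": 3, "O": 9}
ENNIATIN_A = {"C": 36, "H": 63, "N": 3, "O": 9}  # shared by NMeLeu/NMeIle isomers

a1_h = cation_mz(ENNIATIN_A1, {"H": 1})            # assumed [M+H]+ of compound 5
a1_nh4 = cation_mz(ENNIATIN_A1, {"N": 1, "H": 4})  # assumed [M+NH4]+ of compound 5
a_h = cation_mz(ENNIATIN_A, {"H": 1})              # assumed [M+H]+ of compound 3
a_nh4 = cation_mz(ENNIATIN_A, {"N": 1, "H": 4})    # assumed adduct of node 7

ch5n = mono_mass({"C": 1, "H": 5, "N": 1})         # CH2 plus NH3

# Assumed compositions of the diagnostic fragment cations.
frag_nmeval = cation_mz({"C": 5, "H": 12, "N": 1})
frag_nmeleu_ile = cation_mz({"C": 6, "H": 14, "N": 1})

for label, value in [
    ("enniatin A1 [M+H]+", a1_h),
    ("enniatin A1 [M+NH4]+", a1_nh4),
    ("enniatin A [M+H]+", a_h),
    ("C36H63N3O9 [M+NH4]+", a_nh4),
    ("CH5N", ch5n),
    ("C5H12N+", frag_nmeval),
    ("C6H14N+", frag_nmeleu_ile),
]:
    print(f"{label}: {value:.4f}")
```

Under these assumptions, the calculated values agree with the reported m/z values to within about 0.005, and the CH5N difference between nodes 5 and 7 is consistent with one additional CH2 unit combined with a change from the protonated to the ammoniated adduct.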
In addition to the cyclic hexadepsipeptides mentioned above, we also found a cyclic tetradepsipeptide, 10 (m/z 427.282) (Figures 7 and S6). We further identified a cluster of pyridine-type amides in the molecular network (cluster 6). Oxysporidinone (11) and its dimethyl-ketal derivative (12) show high peaks in the extracts of fungi grown in the presence of 0 mM, 1 mM, and 10 mM SB. However, upon addition of 100 mM SB, 12 is not detected, and only 11 is present in the extract. Compound 13 differs from 11 by the loss of H2O and could therefore be 4,6'-anhydrooxysporidinone. Analogously, 14 and 15 are predicted to be sambutoxin and (E)-4-(6-(4,6-dimethyloct-2-en-2-yl)-5-methyltetrahydro-2H-pyran-2-yl)-9a-hydroxy-2-methyl-2,5a,6,9a-tetrahydrobenzofuro[3,2-c]pyridine-3,7-dione, respectively.
Compound 26 from cluster 8 is annotated by GNPS as fusarin C. Based on the mass difference, we speculate that compound 27 is lucilactaene. Moreover, most of the 2-pyrrolidone derivatives in cluster 8 could only be found in the 100 mM SB group, indicating that 100 mM SB might trigger expression of the BGC in region 10.4, which may be responsible for forming lucilactaene analogs. It was reported that lucilactaene and its derivatives could be promising lead compounds for antimalarial drug development because of their unique structure [55].
Overall, the addition of 100 mM sodium butyrate to the cultivation medium significantly altered the metabolic profile of Fusarium sp. VM-40, resulting in increased production of known compounds and the appearance of several putative new compounds. Several other nodes in the molecular network could not be assigned to any known compound, suggesting that these might be novel metabolites that need further investigation.

Metabologenomic Analysis: Linking Secondary Metabolites to BGCs of Fusarium sp. VM-40
The genomic and metabolomic analyses above show that Fusarium sp. VM-40 has the potential to produce a diverse set of SMs. Among them, the most striking group is the enniatins. ENNs are cyclic hexadepsipeptides formed by the condensation of three D-α-hydroxy acids and three N-methyl-L-amino acids. Many of the biological activities of the enniatins are of pharmaceutical interest, such as antimicrobial activity [56], inhibition of major drug efflux pumps [57], and inhibition of acyl-CoA:cholesterol acyltransferase [58]. The structural differences related to the N-methyl-amino acids were previously linked to the different bioactivities of the ENNs [59]. In this study, we identified nine enniatin analogs (compounds 1-9), of which enniatin B (6) was the most abundant, followed by enniatin B1 (4) and enniatin A1 (5) [60]. The main amino acid constituents in these compounds are N-methyl-valine, N-methyl-isoleucine, and N-methyl-leucine. However, we also identified two N-methyl-threonine-containing compounds, enniatin P1 (8) and enniatin P2 (9). These compounds were only produced upon stimulation with 100 mM SB, and to our knowledge, there are no prior studies on their bioactivities. Therefore, they could be interesting candidates for future investigations.
The BGC in region 9.1 (NRPS) of the Fusarium sp. VM-40 genome is predicted to be responsible for the production of ENNs. As shown in Figure 8A, the A1 domain of the first module activates the D-2-hydroxycarboxylic acid substrate and loads it onto the T1 domain in the same module. The A2 domain of the second module activates an L-amino acid substrate molecule and loads it onto each of the adjacent twin T2 domains. Amide bond formation between the D-2-hydroxycarboxylic acid and N-methyl-L-amino acid thioesters is carried out by the C2 domain. This generates the dipeptidol monomer, three or four copies of which would then be ligated and finally cyclized in a programmed cyclooligomerization process to generate the cyclohexadepsipeptide or cyclooctadepsipeptide products, respectively [60]. However, the cyclization of two dipeptidol monomers into a cyclic tetradepsipeptide has not been reported yet.
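Because cyclooligomerization joins n identical dipeptidol units without net gain or loss of atoms beyond the condensations already counted in each residue, the expected mass of a cyclic product is simply n times the mass of one dipeptidol residue. The following sketch illustrates this for a unit built from Hiv and N-methyl-valine; the choice of this unit and the restriction to [M+H]+ ions are illustrative assumptions of ours, not an assignment made by the analysis above.

```python
# Monoisotopic masses (u); PROTON is a hydrogen atom minus one electron.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727645

# Residue compositions, i.e. the free acids minus one H2O each.
HIV_RESIDUE = {"C": 5, "H": 8, "O": 2}             # D-2-hydroxyisovaleric acid
NMEVAL_RESIDUE = {"C": 6, "H": 11, "N": 1, "O": 1}  # N-methyl-valine


def add_formulas(a, b):
    """Element-wise sum of two compositions."""
    combined = dict(a)
    for element, count in b.items():
        combined[element] = combined.get(element, 0) + count
    return combined


def cyclo_oligomer_mz(hydroxy_residue, amino_residue, n):
    """[M+H]+ of a cyclic product built from n dipeptidol residues."""
    unit = add_formulas(hydroxy_residue, amino_residue)
    unit_mass = sum(MONO[element] * count for element, count in unit.items())
    return n * unit_mass + PROTON


for n in (2, 3, 4):
    mz = cyclo_oligomer_mz(HIV_RESIDUE, NMEVAL_RESIDUE, n)
    print(f"n = {n}: [M+H]+ = {mz:.4f}")
```

For this unit, n = 3 reproduces the composition of enniatin B (C33H57N3O9), with a calculated [M+H]+ of approximately 640.417, and n = 4 corresponds to the cyclooctadepsipeptide at approximately 853.553. The n = 2 value of approximately 427.280 lies close to the m/z 427.282 reported for compound 10, although mass alone cannot distinguish between isomeric residue compositions.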
Interestingly, we identified one such cyclic tetradepsipeptide, compound 10, which was only detected in extracts of fungi challenged with 100 mM SB. A similar compound, [-(α-oxyisohexanoyl-N-methyl-Leu)2-], was first isolated from the endophytic fungus F. tricinctum SYPF 7082 of Panax notoginseng [61]. The novel structure of 10 makes it an interesting candidate for investigating its bioactivities and biosynthesis, which may rely on the same NRPS as the cyclohexadepsipeptides (Figure 8A).
Compounds 11-13 were previously isolated from F. oxysporum, and compound 11 shows antifungal effects [62,63]. Moreover, the structures of these compounds are similar to ilicicolin H, and the BGC in region 8.3 of this fungus might be responsible for their biosynthesis. This gene cluster mainly contains genes encoding a central PKS-NRPS hybrid, a PKS, one sugar transport protein, a serine/threonine protein kinase, an NADH:flavin oxidoreductase/NADH oxidase, two methyltransferases, two cytochromes P450, a crotonyl-CoA reductase/alcohol dehydrogenase, a nitrilase/cyanide hydratase, and an apolipoprotein.
A recent study identified two enzymes (OsdM and OsdN) involved in the phenol dearomatization process in the formation of oxysporidinone [64]. Based on the similarity of the PKS-NRPS and the P450s to those in the reported oxysporidinone biosynthesis gene cluster, we propose that compounds 11-15 follow a similar biosynthetic route (Figure 8B). At the same time, region 8.3 contains a considerable number of genes of unknown function, so it may encode further bioactive compounds that remain to be explored.
Compounds 26-27 and 28-30 are biosynthesized by enzymes encoded in T1PKS-NRPS hybrid BGCs (regions 10.4 and 14.3) and have attracted the interest of researchers due to their unique structures. The biosynthetic gene cluster of lucilactaene (27) was identified in Fusarium sp. RK 97-94, and a putative biosynthetic pathway was proposed [65,66]. The biosynthetic gene cluster for fusaristatin A (28) was identified in F. graminearum and partially characterized [67]. Future studies could further elucidate the biosynthetic pathways of these compounds.

Conclusions
Fusarium is a treasure trove of SMs with diverse chemical structures and biological properties. In this work, we obtained a high-quality whole-genome sequence of the endophytic strain Fusarium sp. VM-40 from Vinca minor, analyzed it extensively by gene prediction and annotation, and performed phylogenetic analyses based on multi-locus and whole-genome sequence data. Our initial morphological characterization and ITS-based identification were sufficient to categorize Fusarium sp. VM-40 as a Fusarium species. A six-locus gene tree (tef1, rpb1, rpb2, tub2, ITS, and LSU) and a phylogenetic analysis based on single-copy orthologs with 21 Fusarium species showed that Fusarium sp. VM-40 clusters together with Fusarium avenaceum and Fusarium tricinctum, which are both from the FTSC. Further, a phylogenetic analysis based on tef1 with 34 FTSC isolates revealed that Fusarium sp. VM-40 is closely related to Fusarium paeoniae. Within the Fusarium sp. VM-40 genome, we predicted various BGCs, two of which were previously implicated in the biocontrol properties of Fusarium species. For one of these, the fusaristatin A BGC, we even identified several potential pathway products in the extracts of Fusarium sp. VM-40. This observation may open the door for further investigation of this fungal isolate to elucidate its biological function within the endophytic microbiome.
Our chemical investigation of the fungal extract further indicated that the most abundant SMs produced by Fusarium sp. VM-40 under standard culture conditions are cyclic depsipeptides. This suggests that many of the other BGCs in this strain are silent and/or expressed at a low level. To explore the potential of the strain for producing metabolites, we employed an epigenetic manipulation strategy using an HDAC inhibitor to activate cryptic BGCs. Remarkably, our metabolomic analysis reveals a large diversity of metabolic changes and allows the identification of some potentially new compounds upon treatment with 100 mM sodium butyrate. These findings open possibilities for targeted genome mining, such as gene knockout, introduction or heterologous expression of microbial genes, regulation of promoters, and induction of mutations, to biosynthesize new bioactive SMs for drug research and development.
Supplementary Materials: Figure S1. Contigs of Fusarium sp. VM-40 visualized by Bandage; Figure S2. Taxonomic tree generated by comparing 3613 single-copy orthologous genes from the genomes of 22 Fusarium species using OrthoFinder; Figure S3. Maximum-Likelihood (ML) phylogram of FTSC species based on the tef1 gene; Figure S4. Chemical structures of the secondary metabolites predicted by antiSMASH; Figure S5. Proposed fragmentations of compound 8; Figure S6. Proposed fragmentations of compound 10; Table S1. List of Fusarium species from JGI used for comparative analysis and phylogenetic analysis; Table S2. List of Fusarium isolates belonging to the F. tricinctum species complex (FTSC) from NCBI; Table S3. Quality metrics for isolated genomic DNA; Table S4. The QUAST statistics of Fusarium sp. VM-40; Table S5. Gene distribution of Fusarium sp. VM-40 based on the six major modules of CAZymes; Table S6. Biosynthetic gene clusters of Fusarium sp. VM-40 predicted by antiSMASH; Dataset S1. Matches of ITS sequencing blasted against the NCBI nr/nt database; Dataset S2. Overall statistics of single-copy orthologous genes from the genomes of 22 Fusarium species using OrthoFinder; Dataset S3. Molecular network of SMs of Fusarium sp. VM-40 in Cytoscape.