Genome Sequencing and Analysis of Trichoderma (Hypocreaceae) Isolates Exhibiting Antagonistic Activity against the Papaya Dieback Pathogen, Erwinia mallotivora

Erwinia mallotivora, the causal agent of papaya dieback disease, is a devastating pathogen that has caused a tremendous decrease in Malaysian papaya export and affected papaya crops in neighbouring countries. A few studies on bacterial species capable of suppressing E. mallotivora have been reported, but the availability of antagonistic fungi remains unknown. In this study, mycelial suspensions from five rhizospheric Trichoderma isolates of Malaysian origin were found to exhibit notable antagonisms against E. mallotivora during co-cultivation. We further characterised three isolates, Trichoderma koningiopsis UKM-M-UW RA5, UKM-M-UW RA6, and UKM-M-UW RA3a, that showed significant growth inhibition zones on plate-based inhibition assays. A study of the genomes of the three strains through a combination of Oxford nanopore and Illumina sequencing technologies highlighted potential secondary metabolite pathways that might underpin their antimicrobial properties. Based on these findings, the fungal isolates are proven to be useful as potential biological control agents against E. mallotivora, and the genomic data opens possibilities to further explore the underlying molecular mechanisms behind their antimicrobial activity, with potential synthetic biology applications.


Introduction
Papaya dieback disease (PDD), or bacterial crown rot (BCR), is a bacterial-caused disease that has negatively affected papaya crops (Carica papaya L.) in Malaysia, reducing the national papaya export up to 70% [1]. Initially, Erwinia papayae was reported to be the causal agent of PDD [2], though later Mat Amin et al. [3] molecularly validated the pathogen to have the closest match to E. mallotivora Goto strain DSM 4565 and this was further supported through genome sequencing [4]. Most of the Malaysian papaya cultivated varieties (cultivars) such as 'Eksotika', 'Sekaki', and 'Setiawan' are highly susceptible to this Gram-negative pathogen [5]. Once infected, water soak symptoms become prevalent on papaya petioles, leaves, and the apical stem, and the tree can succumb to the disease within one to two weeks [5][6][7]. In addition to physical diagnosis, Mohd Said et al. [8] developed a DNA-based detection method for E. mallotivora to further assist in early detection of the disease and Ramachandran et al. [9] conducted a repetitive element PCR fingerprinting (rep-PCR) study to provide an accurate and rapid identification method on various local E. mallotivora isolates. Currently, there is no effective cure for PDD from the moment the Table 1. Origin of the soil samples used for the isolation of the fungi identified in this study. The soil samples were collected from wild and cultivated healthy papaya populations in different states (Selangor, Negeri Sembilan, Pahang, and Kedah) of Peninsular Malaysia.

Soil Sample ID Location Site, State Coordinates
Mardi-2018-TK Tanjong

Antagonistic Activity of Fungal Isolates against E. mallotivora Strain BT-MARDI
The antagonistic activity of fungal isolates against E. mallotivora was screened in vitro using the agar well diffusion method on PDA plates [31,42,43]. An agar plug (5 mm 2 ) from each five-day old fungal culture was inoculated into 100 mL potato dextrose broth (PDB; 20 g/L glucose, 4 g/L potato extract) in 500 mL flasks and grown for seven days at 25 • C with agitation at 180 rpm. A lawn of E. mallotivora was prepared by spreading 100 µL of the E. mallotivora inoculum over the entire agar surface. A hole with a diameter of 6 mm was punched aseptically with a sterile cork borer and 100 µL of the fungal suspension were dispensed into the well. A kanamycin solution (100 µg/mL) and PDB, 100 µL of each, were used as the positive and negative controls, respectively. After overnight incubation at 28 • C, the plates were observed for zone of inhibition formation on the E. mallotivora lawn. The experiment was repeated in triplicate for each isolate to record the average diameter of the inhibition zone. The data for fungal isolates exhibiting significant antagonism compared to the positive control was analysed using IBM SPSS Statistics version 28.0.1.1 (14).

Genomic DNA Extraction
Cultures of the three selected fungal isolates, UKM-M-UW RA3a, UKM-M-UW RA5, and UKM-M-UW RA6 (hereafter referred to as RA3a, RA5, and RA6), were set up in 25 mL of PDB from PDA plates and placed in 250 mL conical flasks. The fungal cultures were incubated at 25 • C for 72 h, shaking at 200 rpm. The mycelia from the broth were pelleted by centrifugation at 8000× g for 5 min, and subsequently washed twice with sterile distilled water to remove any residual medium. High molecular weight (HMW) genomic DNA extraction was carried out via cryogenic grinding using a sterile mortar and pestle. Fungal mycelia were ground into a fine powder in liquid nitrogen; thereafter, 200 mg of the powder were transferred aseptically into a microcentrifuge tube for immediate use. A GenElute Plant Genomic DNA miniprep kit (Sigma-Aldrich, Burlington, MA, USA) was used for DNA extraction, following the manufacturer's instructions. HMW genomic DNA was concentrated using ethanol precipitation. Briefly, 1/10 volume of 3 M sodium acetate, pH 5.2, was added, followed by three volumes of 100% ethanol. Samples were inverted, incubated at −20 • C overnight, then spun at 13,000× g for 30 min at 4 • C. Supernatant was decanted, the DNA pellet was air dried, and then resuspended in nuclease-free water (Thermo Fisher, Waltham, MA, USA). Gel electrophoresis (1% w/v agarose gel with 1X GelRed), a Nanodrop ND1000 (Thermo Fisher, USA), and a Qubit (dsDNA Broad range, Invitrogen, Waltham, MA, USA) were used to assess the concentration (Table S1) and quality of the HMW genomic DNA extracted from the fungal strains.

Identification of Fungi Based on Mycelia Morphology, PCR Amplification, and Sanger Sequencing
Mycelial morphology was originally used to identify the class/genus of the fungal isolates based on the colour of the mycelia and spore characteristics. Subsequently, PCR amplification of target sequences was carried out using purified genomic DNA as a template to identify the fungal isolates of interest, RA3a, RA5, and RA6. PCR was carried out using Q5 High-Fidelity 2X Master Mix (NEB, Ipswich, MA, USA) according to the manufacturer's instructions, using the primers listed in Table S2. The amplicons were purified upon gel electrophoresis (1% w/v agarose gel with 1X GelRed) using the Gene-JET PCR purification kit (Thermo Fisher, USA) following the manufacturer's instructions. One hundred nanograms of the purified PCR product was used for Sanger sequencing through Eurofins-GATC, which enabled for the phylogenetic confirmation of the strains. The sequences were trimmed to obtain diagnostic fragments for molecular identification as detailed in Kopchinskiy et al. [44] prior to submission to BLASTN [45].

DNA Library Preparation and Sequencing through Oxford Nanopore Technology (ONT)
Utilising the extracted HMW genomic DNA, preparation of the library was performed using the native barcoding expansion kit (EXP-NBD104 and EXP-NBD114) and the ligation sequencing kit (SQK-LSK109) for multiplex genomic DNA sequencing of the fungal isolates RA3a, RA5, and RA6. NBD01 was used to tag RA3a, NBD02 for RA6, and NBD03 for RA5. The manufacturer's protocol was followed with minor modifications introduced for both DNA repair and end-prep stages, by increasing the incubation time and temperature after washing with ethanol to 15 min at 37 • C. Furthermore, at the native barcode ligation step, the incubation time and temperature were increased to 30 min at 37 • C. To enrich DNA fragments of 3kb or longer, the Long Fragment Buffer was used for the DNA clean-up. An amount of 1.2 µg of each genomic DNA were used for the library preparation, resulting in a final amount of 350 ng of DNA at the end of the adapter ligation and clean-up step. The resulting DNA library containing the sequencing buffer and loading beads were loaded on the primed SpotON flow cell. Nanopore sequencing was performed on MinION (ONT) with a FLO-MIN-106 R9.4 flow-cell (ONT). The MinION (ONT) multiplex sequencing reaction was run for 18 h.

Genome Assembly and Error Corrections
A de novo strategy was employed for sequencing each of the three genomes. The raw data produced by nanopore was base-called and simultaneously demultiplexed using Guppy version 4.4.1 available to ONT users via https://community.nanoporetech.com (accessed on 23 January 2022). The config file used in Guppy was dna_r9.4.1_450bps_hac.cfg, the remaining parameters used were the default settings. Guppy was also used to trim the barcodes from the reads during the base-calling, depositing each genome's reads in a directory as .fastq files. The .fastq files for each read were merged into one .fastq file and used as the input for Flye version 2.8.2 with the nano-raw mode selected, which was used to assemble the draft genomes [46]. The draft assembly was then polished multiple times, by aligning the base-called raw reads in the merged file against the draft genome using Minimap2 version 2.11 [47] before correcting errors using Racon version 1.4.20 [48], where matching bases were assigned a score of 8 and mismatches a score of −6, with a gap penalty of −8 and a window size of 500. The output from Racon was aligned to the raw reads again and the process was repeated three times for each assembly. Further corrections were made using Medaka version 1.2.1 utilising model r941_min_high_g360.
Polishing the genome continued by using Illumina short-read data generated through Apical Scientific Sdn. Bhd. (Seri Kembangan, Selangor, Malaysia). This was done by aligning the Illumina data to the output from Medaka using Bowtie2 version 2.4.2 [49]. Bowtie2 was run with the following settings, -D (number of attempts at extension before skipping to the next task) 20, -R (number of seeds looked at before moving to the next task) 3, -N (maximum number of mismatches in alignments) 1, -L (length of seed substrings) 20, and -i (interval between seed substrings) S,1,0.50, all other settings were left at the default parameters. Polishing was performed with Pilon version 1.23 [50], the output from Pilon was then re-aligned to the Illumina reads and the draft genome polished again; after Pilon had been run four times, the final output was analysed using BUSCO version 5.0.0 [51], comparing the three genomes against the Hypocreales_odb10 database. Unless specified above, parameters within programs were left at their default setting.

Genome Annotations
Functional annotation was performed using the Funannotate version 1.8.3 pipeline [52]. Funannotate was also able to clean (remove small repetitive contigs by using minimap2 or mummer) and perform mask repeats (softmasking of low complexity and short period tandem repeats using tantan). The prediction of functional elements was done using a seed species of Verticillium longisporum and the BUSCO Sordariomycetes database; the minimum number of training models required was dropped to 100, and the Optimize Augustus setting was turned on. Funannotate predict uses AUGUSTUS, snap, GlimmerHMM, and tRNAscan-SE in order to predict genes for proteins and tRNAs, as well as other programs to generate the inputs for those mentioned above and the final outputs. The predictions were then run through Phobius [53] and AntiSMASH [54], contained within Funannotate. Finally, the annotation function was used to create the final genomes. BUSCO was run again to look at the predicted proteins generated, once again using the Hypocreales_odb10 database. All parameters not mentioned above were left on their default setting within the software package.
CMscan [55] was used with StructRNAfinder [56] for screening the presence of potential non-coding RNA. The Rfam database [57] was used as the input database for StructRNAfinder with the default parameters.

Comparative Genomics and Phylogenomic Analysis
Pairwise genomic similarities between our isolates with addition of nine additional genomes from other Trichoderma spp. (obtained from the NCBI database) were calculated using FastANI, with assembled genomic sequences as input [58]. The pairwise genomic similarities were then visualised using the Intervene Shiny package [59]. Next, to reconstruct the phylogenomic association between several strains of Trichoderma spp., single-copy orthologous proteins were first identified from the predicted proteome sequence via Or-thoFinder v2.2.7 with default settings [60]. Then, amino acid sequences from single-copy orthologous proteins identified were aligned using MAFFT [61]. This alignment was then used for maximum likelihood phylogenomic tree reconstruction via the IQ-TREE program, using the JTT+F+G4 model and 1000 ultrafast bootstrap replications [62]. iTOL was used for phylogenomic tree visualisation [63]. Finally, OrthoVenn2 web server was used for comparison of orthologous proteins from our isolates with the predicted proteome as input [64].

Antagonism of Fungal Isolates against E. mallotivora Strain BT-MARDI
In this study, a total of 128 filamentous fungal and yeast isolates were cultured from five soil samples collected in various locations in Peninsular Malaysia ( Table 1). Out of these, only 17 fungal isolates that grew mycelia in less than 14 days on PDA plates-species putatively belonging mainly to the genera Trichoderma, Fusarium, Aspergillus, and Penicillium based on morphological identification-were observed to have antagonistic activity against E. mallotivora strain BT-MARDI. According to plate-based inhibition assays, five fast-growing fungal isolates that required less than seven days to form a full mycelial lawn on PDA plates were confirmed to exhibit visible antagonism ( Figure 1) and selected for further analysis. The taxonomic affiliation of these isolates was originally deduced based on mycelium morphology, with all five strains, or isolates, putatively assigned to the genus Trichoderma ( Figure 2). All isolates exhibited inhibition of the growth of E. mallotivora (Table 2). Based on the Tukey's test, RA5, RA6, and RA3a had a significant antagonistic activity when compared to the positive control (kanamycin) (Figure 3).

Antagonism of Fungal Isolates against E. mallotivora Strain BT-MARDI
In this study, a total of 128 filamentous fungal and yeast isolates were cultured from five soil samples collected in various locations in Peninsular Malaysia ( Table 1). Out of these, only 17 fungal isolates that grew mycelia in less than 14 days on PDA plates-species putatively belonging mainly to the genera Trichoderma, Fusarium, Aspergillus, and Penicillium based on morphological identification-were observed to have antagonistic activity against E. mallotivora strain BT-MARDI. According to plate-based inhibition assays, five fast-growing fungal isolates that required less than seven days to form a full mycelial lawn on PDA plates were confirmed to exhibit visible antagonism ( Figure 1) and selected for further analysis. The taxonomic affiliation of these isolates was originally deduced based on mycelium morphology, with all five strains, or isolates, putatively assigned to the genus Trichoderma ( Figure 2). All isolates exhibited inhibition of the growth of E. mallotivora (Table 2). Based on the Tukey's test, RA5, RA6, and RA3a had a significant antagonistic activity when compared to the positive control (kanamycin) (Figure 3).  Plate-based growth inhibition assays of E. mallotivora through well diffusion method. All fungal isolates displayed notable growth inhibition zones. Potato dextrose broth (PDB) and kanamycin were used as the negative and positive controls, respectively. Table 2. Antagonistic potential of the Trichoderma isolates against E. mallotivora. The negative (−) control consists of 100 µL of PDB, the positive (+) control of 100 µL of kanamycin (100 µg/mL).

Genomic DNA Extraction
High molecular weight (HMW) genomic DNA extraction was performed on the three fungal isolates, RA3a, RA5, and RA6, that showed significant inhibition on E. mallotivora.

Genomic DNA Extraction
High molecular weight (HMW) genomic DNA extraction was performed on the three fungal isolates, RA3a, RA5, and RA6, that showed significant inhibition on E. mallotivora. To improve the genomic DNA yield and quality, the extracted fungal DNA was subjected to ethanol precipitation and resuspended in 35 µL of nuclease-free water. The purity and concentration of the DNA was assessed with a Nanodrop ND1000 (Thermo Fisher, Boston, MA, USA) and Qubit (dsDNA Broad range, Invitrogen, Boston, MA, USA), with all preparations yielding more than 100 ng/µL HMW DNA according to Qubit readings (Table S1). The HMW DNA was further visualised via agarose gel electrophoresis ( Figure S1). Overall, these analyses proved that the HMW DNA extracted from the three isolates was suitable to be sequenced through Oxford nanopore and Illumina whole-genome sequencing.

Molecular Identification of Strains
The three fungal isolates RA3a, RA5, and RA6 were identified through PCR amplification and Sanger sequencing of three DNA barcodes [65]: internal transcribed spacer (ITS) [66], translation elongation factor 1 alpha (tef1) gene [67] and RNA polymerase B subunit II (rpb2) gene [68]. Amplification of each sample yielded a single DNA fragment, the sequence of which was analysed through BLASTN [45]. The sequencing of the ITS marker allowed us to confidently assign all three fungal isolates to the Trichoderma genus. Further sequencing of the tef1 and rpb2 genes enabled us to identify all three fungal strains to be from a single species, T. koningiopsis (Table 3).

Genome Sequencing
The genomic DNA of the three fungal isolates, RA3a, RA5, and RA6, were sequenced using a combination of Oxford nanopore and Illumina whole-genome sequencing. This generated a significant quantity of both long-read and short-read data which gave a high degree of coverage for each of the three genomes. From the nanopore data, after barcodes had been separated and trimmed, we had generated an estimated 3 [69]. As shown in Table 4, the genomes contain 14 (RA3a), 11 (RA5), and 13 (RA6) contigs, the majority of these contigs may be putatively assembled at the chromosome level. Closely related species of Trichoderma have seven chromosomes [70], and our assemblies also included the mitochondrial DNA. Using BUSCO [71] to assess the genome assemblies for core conserved genes across the order Hypocreales, we were able to generate the results presented in Tables 5 and 6. Table 5 shows the conserved genes identified within the scaffold of our assemblies and Table 6 shows the conserved proteins identified in the predicted proteome of the annotated genomes of each species. An outline of the predicted non-coding RNA is shown in Table S3, the full set of results from StructRNAfinder are publicly available at https://osf.io/vsbc2/ (Accessed on 17 February 2022).

Genome Analysis
While the gene function of the predicted genes in each of the three genomes was assigned using Funannotate, the predicted proteome of each strain was run again through Eggnog-mapper v2.1.4-2. This yielded the following results, where for strain RA3a of the 8,951 predicted proteins, 86.7% could have a COG (Clusters of Orthologous Groups) category assigned; similarly, of the 8964 proteins predicted in strain RA5, 86.7% could have COG categories assigned, and 86.2% of the 9124 predicted proteins from RA6 could be too.
The distribution of these predicted proteins across the COG categories is shown in Figure 4. Distribution of proteins across the categories did not differ significantly across any of the three genomes, and while the most frequently mapped category was "Function Unknown", of those that could be placed into a category of known function, the five most common in order of decreasing predicted protein count within the categories were "Intracellular trafficking, secretion, and vesicular transport", "Amino acid transport and metabolism", "Secondary metabolite biosynthesis, transport and catabolism", "Posttranslational modification, protein turnover, chaperones", and "Carbohydrate transport and metabolism". The presence of over 475 proteins associated with secondary metabolites in each genome is promising and these deserve further examinations, particularly in the context of the antagonistic activity of the fungal isolates against E. mallotivora and, potentially, other microorganisms. It is also interesting that no mobilome elements associated with transposons and prophages were detected, and less than 100 predicted proteins were identified in each genome that were associated with any of the following categories: "Extracellular structures", "Cell motility", "Nuclear structure", and "Defence mechanisms". Figure 5 shows that the eggnog mapper was also able to assign gene ontology terms; often multiple terms are assigned to individual proteins, as well as enzyme commission numbers, a variety of matches to multiple KEGG databases, and also to BRITE hierarchies. A limited number of predicted proteins (no more than 2% in any genomes) were also associated with matches to the CAZy database of carbohydrate active enzymes, as well as matches to BiGG IDs. More promising was that around 85% of genes in each genome were matched to proteins in the pfam database. There was considerable overlap between these categories, however; most genes are close to identical across each of the three genomes. All data generated using eggnog mapper are included in the Dataset S1.
numbers, a variety of matches to multiple KEGG databases, and also to BRITE hierarchies. A limited number of predicted proteins (no more than 2% in any genomes) were also associated with matches to the CAZy database of carbohydrate active enzymes, as well as matches to BiGG IDs. More promising was that around 85% of genes in each genome were matched to proteins in the pfam database. There was considerable overlap between these categories, however; most genes are close to identical across each of the three genomes. All data generated using eggnog mapper are included in the Dataset S1.

COG Functional Categories Predictions from Predicted Proteome
RA3a RA5 RA6

Comparative Genomics and Phylogenomic Analysis
Pairwise comparison of genomic similarities between our isolates was performed by calculating the average nucleotide identity (ANI) values. The three isolates, RA3a, RA5, and RA6, showed high genomic similarities between each other (ANI value of 99%) in addition to the T. koningiopsis POS7 isolate, which shared an ANI value of 96% ( Figure 6).

RA3a
RA5 RA6 Figure 5. Assignment of predicted proteins in the three fungal isolates RA3a, RA5, and RA6 using eggnog mapper to multiple databases.

Comparative Genomics and Phylogenomic Analysis
Pairwise comparison of genomic similarities between our isolates was performed by calculating the average nucleotide identity (ANI) values. The three isolates, RA3a, RA5, and RA6, showed high genomic similarities between each other (ANI value of 99%) in addition to the T. koningiopsis POS7 isolate, which shared an ANI value of 96% ( Figure 6). Conversely, the three isolates showed lower genomic similarities with other Trichoderma spp. (79-89% of ANI value). These results further confirm the identity of our isolates as belonging to the T. koningiopsis species.
The clustering was done based on a Euclidean distance matrix.

Secondary Metabolite Clusters
The whole genome sequences of the Trichoderma isolates RA3a, RA5, and RA6 were mined for putative biosynthetic gene clusters (BGCs) using AntiSMASH [54]. A total number of 124 clusters were identified across the three genomes; 43 BGCs were identified in isolate RA3a, 40 in isolate RA5, and 41 BGCs in isolate RA6. The strains varied in the putative classes they encode (Table 7 and Dataset S2). Non-ribosomal peptide synthetase (NRPS) clusters are the most dominant across the isolates followed by polyketide synthase (PKS) clusters, and Terpene and hybrid NRPS-PKS clusters. The number of BGCs from the sequenced fungal strains (RA3a, RA5, and RA6) was compared to the number of BGCs from other members of the Trichoderma genus (Figure 7), as predicted by AntiSMASH. This revealed that the fungal strains isolated in this study have the potential to be talented producers of polyketides and non-ribosomal peptide synthases, in line with other members of the Trichoderma genus.
We then screened for clusters that were consistently present across the three strains, which could be indicative of their shared bioactivity against E. mallotivora. As predicted by AntiSMASH, we could find BGCs with low-to-high similarity to those for known bioactive metabolites, namely fusaric acid, naphthopyrone, neurosporin A, ascochlorin, and clavaric acid (Table S4 and Dataset S3). Further manual BlastP analysis of the putative neurosporin A BGC, revealed that this cluster may in fact code for the biosynthesis of a salicylaldehyde-related compound. This observation is based on the identification within Further analysis using OrthoFinder suggested that there is a total of 12,270 orthogroups present among the predicted proteomes used in our analysis. Among that, a total of 3392 of the orthologous proteins exist as single-copy and alignment of their amino acids sequence was used for phylogenomic tree construction. The phylogenomic tree generated shows that the three isolates formed a distinct monophyletic group and shared a recent common ancestor within T. atroviride IMI206040 ( Figure S2). The robustness of the phylogenomic tree generated was confirmed as all the branches showed 100% bootstrap values.
Next, the predicted protein sequences from our isolates were compared with each other using the OrthoVenn2 web server. All the predicted protein sequences extracted from our genomes were further grouped into 9225 orthologous proteins. Among these, a total of 8245 protein groups were shared by all isolates. These proteins accounted for 86.64 (RA3a), 89.09 (RA5), and 89.00% (RA6) of the total orthologous protein group encoded in each genome ( Figure S3). Therefore, OrthoVenn2 analysis confirmed that most of the proteins encoded for by the three isolates are shared.

Secondary Metabolite Clusters
The whole genome sequences of the Trichoderma isolates RA3a, RA5, and RA6 were mined for putative biosynthetic gene clusters (BGCs) using AntiSMASH [54]. A total number of 124 clusters were identified across the three genomes; 43 BGCs were identified in isolate RA3a, 40 in isolate RA5, and 41 BGCs in isolate RA6. The strains varied in the putative classes they encode (Table 7 and Dataset S2). Non-ribosomal peptide synthetase (NRPS) clusters are the most dominant across the isolates followed by polyketide synthase (PKS) clusters, and Terpene and hybrid NRPS-PKS clusters. The number of BGCs from the sequenced fungal strains (RA3a, RA5, and RA6) was compared to the number of BGCs from other members of the Trichoderma genus (Figure 7), as predicted by AntiSMASH. This revealed that the fungal strains isolated in this study have the potential to be talented producers of polyketides and non-ribosomal peptide synthases, in line with other members of the Trichoderma genus. antimicrobial compounds trichoxide and virensols from the biocontrol fungus T. virens [72] (Table S5 and Figure S4).

Discussion
Trichoderma spp. are easily culturable and generally considered to be harmless to humans and animals-except for T. longibrachiatum [73,74]. Members of this genus have been used extensively in agriculture as eco-friendly biological control agents as they are capable to control (or suppress) other microorganisms directly or indirectly [37,38,75,76]. It is reported that various Trichoderma spp. are effective biocontrol means against many fungal pathogens and some bacterial pathogens [77]. Previous works explored and applied the idea of using microbial species to control E. mallotivora [26][27][28][29][30][31]; nevertheless, the use of Trichoderma as a fungal antagonist against the PDD pathogen has not been reported. In this study, we demonstrated the potential of five Trichoderma strains to cause growth inhibition of E. mallotivora strain BT-MARDI. Trichoderma offers an advantage over many other microbes (especially bacteria) in terms of fast rhizosphere colonisation, involvement in the soil nutrient cycle, and excellent viability after an extended storage period (>12 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%  We then screened for clusters that were consistently present across the three strains, which could be indicative of their shared bioactivity against E. mallotivora. As predicted by AntiSMASH, we could find BGCs with low-to-high similarity to those for known bioactive metabolites, namely fusaric acid, naphthopyrone, neurosporin A, ascochlorin, and clavaric acid (Table S4 and Dataset S3). Further manual BlastP analysis of the putative neurosporin A BGC, revealed that this cluster may in fact code for the biosynthesis of a salicylaldehyderelated compound. This observation is based on the identification within the BGC of a hr-PKS megasynthase and tailoring enzymes, all showing high homology with genes of the vir gene cluster, which is involved in the biosynthesis of the antimicrobial compounds trichoxide and virensols from the biocontrol fungus T. virens [72] (Table S5 and Figure S4).

Discussion
Trichoderma spp. are easily culturable and generally considered to be harmless to humans and animals-except for T. longibrachiatum [73,74]. Members of this genus have been used extensively in agriculture as eco-friendly biological control agents as they are capable to control (or suppress) other microorganisms directly or indirectly [37,38,75,76]. It is reported that various Trichoderma spp. are effective biocontrol means against many fungal pathogens and some bacterial pathogens [77]. Previous works explored and applied the idea of using microbial species to control E. mallotivora [26][27][28][29][30][31]; nevertheless, the use of Trichoderma as a fungal antagonist against the PDD pathogen has not been reported.
In this study, we demonstrated the potential of five Trichoderma strains to cause growth inhibition of E. mallotivora strain BT-MARDI. Trichoderma offers an advantage over many other microbes (especially bacteria) in terms of fast rhizosphere colonisation, involvement in the soil nutrient cycle, and excellent viability after an extended storage period (>12 months), thus making the fungus more efficient and appealing to the farmers [78,79]. In addition to their antagonistic features against pathogens, some species of Trichoderma are even capable of inducing plant defence mechanisms, which could be another important advantage [80].
To further gain insights on the antagonism of our isolates, we performed wholegenome sequencing on three fast-growing Trichoderma strains that showed significant inhibition of E. mallotivora to aid in the elucidation of the metabolites that might play a role in this interaction. From the sequencing results, we believe that the annotated draft genomes we produced can be treated as full open reading frames. Mutations and evolutionary divergence can explain some of the fragmented BUSCO scores, and even well-conserved genes can still be lost in some lineages [81]. The large coverage of the long read nanopore sequencing allowed for a robust first draft genome to be created, with few fragments. However, nanopore read base calling remains less accurate than Illumina, around 95% [82], and while modern base calling software harnessing neural networks are improving this, it is still necessary to polish, with higher accuracy, short read Illumina sequencing data to create a robust genome assembly, in order to remove the indels and miscalled bases present in nanopore reads that otherwise lead to frameshifts and fragmentation of genes.
The genus Trichoderma is a well-recognised group of filamentous fungi known for their production of secondary metabolites, especially as talented producers of bioactive peptides, polyketides, plant growth regulators, enzymes, siderophores, and other antibiotics [38,81,83], and has been credited for its biocontrol activity as antifungal and antibacterial. For example, peptaibols are a well-studied class of peptide natural products from Trichoderma, synthesised by NRPS modules, producing a linear peptide consisting of dialkylated amino acids, isovaline, amino isobutyric acid (Aib), an acetylated N-terminus, and a C-terminal amino alcohol [38]. They are credited for their antimicrobial properties, as well as their ability to induce systemic resistance in plants against microbial invasion. Another major class of Trichoderma natural products with biocontrol activity are koninginins. This family of compounds were first isolated from T. koningii and exhibited antifungal activity [38,84]. Interestingly, koninginins have also been isolated from T. koningiopsis in other studies and are reported to exhibit antifungal properties against Fusarium spp., Plectosphaerella cucumerina, and Alternaria panax [85]. Prominent examples of polyketides isolated from Trichoderma spp. include pyrones and pyridines [38,86]. Considering the diversity of bioactive molecules isolated from the genus-for a report about Trichoderma natural products we refer the reader to the review by Shenouda and Cox [86]-and given the vast biosynthetic potential emerged from AntiSMASH analysis conducted in our study, the three Trichoderma strains (RA3a, RA5, and RA6) have high potential to produce bioactive molecules that will warrant their use as biocontrol agents against plant pathogens. Noteworthy, we have identified BGCs conserved across the three fungal isolates, including one for the biosynthesis of putative salicylaldehyde-related compounds, which are known for their antimicrobial properties, such as those that have been isolated from T. virens [72] and T. citrinum [87]. Heterologous expression and/or knockout of the salicylaldehyde BGC homologues from RA3a, RA5, and RA6 will be needed to reveal the structure of the corresponding metabolites and their potential role in the bioactivity of the fungal isolates.
The dogma of biosynthetic studies remains that many fungal biosynthetic clusters are silent under standard laboratory conditions, which makes their full exploitation challenging [88,89]. Advances in synthetic biology via heterologous expression or genome editing may help us uncover the biosynthetic potential of the gene clusters [90,91].

Conclusions
In this study, we have isolated fungal strains from soil samples collected in the rhizosphere of healthy papaya trees from different locations in Peninsular Malaysia, with the aim to identify a potential biological control agent capable of suppressing E. mallotivora, the pathogen that is responsible for the ongoing PDD outbreak in Malaysia and surrounding countries. The three Trichoderma isolates, UKM-M-UW RA3a, UKM-M-UW RA5, and UKM-M-UW RA6, have shown significant inhibition of the growth of E. mallotivora from plate-based bioassays, and molecular identification allowed us to assign them to the T. koningiopsis species. Whole-genome sequencing was performed, thereby providing a platform for their biosynthetic exploitation, with the goal of linking secondary metabolites to biosynthetic gene clusters. Biosynthetic gene clusters homologous to those for known bioactive metabolites were identified and found to be conserved across the three isolates, opening the way for future exploration of the biosynthetic potential of these fungi. With the growing need for greener alternatives to chemical pesticides, the biosynthetic studies on natural products from Trichoderma spp. is expected to grow, which may give rise to a new generation of biocontrol agents with an enormous impact in the agrochemical sector.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/jof8030246/s1, Table S1: Concentration of the extracted gDNA measured through Qubit. Figure S1: HMW genomic DNA of the three Trichoderma isolates. Table S2: Primers and PCR amplicon sizes for molecular identification of isolates. Table S3: Details on the number of predicted non-coding RNAs identified in each genome. Figure S2: Phylogenomic tree from alignment of 3392 single-copy orthologous proteins from selected Trichoderma spp. Figure S3: Shared and unique orthologous protein group among isolates sequenced in this study as inferred from OrthoVenn2 web server analysis. Table S4: Distribution of predicted BGCs found across the fungal genomes. Table S5: Putative salicylaldehyde cluster found across the three sequenced fungal genomes. Figure  S4: Protein sequence comparison of the putative salicylaldehyde cluster found in the genome of strains RA6, RA5, and RA3a. Dataset S1: Genome analysis through eggnog mapper. Dataset S2: AntiSMASH analysis of genomes. Dataset S3: Details of secondary metabolite clusters conserved across the genomes of strains RA3a, RA5, and RA6.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.