Comparative Genomics Used to Predict Virulence Factors and Metabolic Genes among Monilinia Species

Brown rot, caused by Monilinia spp., is among the most important diseases in stone fruits, and some pome fruits (mainly apples). This disease is responsible for significant yield losses, particularly in stone fruits, when weather conditions favorable for disease development appear. To achieve future sustainable strategies to control brown rot on fruit, one potential approach will be to characterize genomic variation among Monilinia spp. to define, among others, the capacity to infect fruit in this genus. In the present work, we performed genomic and phylogenomic comparisons of five Monilinia species and inferred differences in numbers of secreted proteins, including CAZy proteins and other proteins important for virulence. Duplications specific to Monilinia were sparse and, overall, more genes have been lost than gained. Among Monilinia spp., low variability in the CAZome was observed. Interestingly, we identified several secondary metabolism clusters based on similarity to known clusters, and among them was a cluster with homology to pyriculol that could be responsible for the synthesis of chloromonilicin. Furthermore, we compared sequences of all strains available from NCBI of these species to assess their MAT loci and heterokaryon compatibility systems. Our comparative analyses provide the basis for future studies into understanding how these genomic differences underlie common or differential abilities to interact with the host plant.


Introduction
Monilinia spp., necrotrophic Ascomycetes of the family Sclerotiniaceae, are the main pathogens responsible for brown rot in stone and some pome fruits. They are present in numerous fruit areas of the world, and are responsible for important economic losses in the pre-harvest and the post-harvest periods [1]. Brown rot can appear at any stage of fruit development, during and after harvest [2], and infect fruit either through an opening (micro-cracks, lenticels, wounds, or stomata) or through contact with an intact surface [3]. Post-harvest losses are usually more severe than pre-harvest losses, and typically occur during storage and transport, in some cases even affecting fruit at the processing stage [4]. The incidence and development of this disease are clearly affected by climatic conditions, and can cause up to 80% fruit loss under favorable environmental conditions [5].
To ensure the quality of all proteomes used, we assessed the completeness of the genome using BUSCO (v4.0.2) [26] with the helotiales_odb10 dataset. All proteomes with the exception of two of the strains not used in the phylome had a completeness above 90% (see Table S1).

Phylome Reconstruction
To reconstruct the phylomes, i.e., the complete collection of phylogenetic trees reconstructed for each gene encoded in a genome, the proteomes of 15 Leotiomycetes species were collected (see Table S1). A phylome was reconstructed for each of the five Monilinia species present in the dataset, using a slightly modified PhylomeDB pipeline [27]. In brief, for each gene in the genomes of a Monilinia species, a BlastP v2.5.0+ [28] search was performed to obtain the list of homologous proteins. Blast results were filtered so that only results that had an e-value below 1 × 10 −5 and an overlap of more than 50% over the query sequence were kept. The top 150 hits per query protein were kept. For each resulting set of homologous proteins, six multiple sequence alignments were reconstructed using three different programs: MUSCLE v3.8.1551 [29], MAFFT v7.407 [30], and KALIGN v2.04 [31] in both forward and reverse directions [32]. A consensus alignment was then obtained using MCOFFEE, from the tcoffee package v12.00 [33]. This alignment was then trimmed to remove columns that were not supported by the individual alignments (-ct 0.16667) and to remove columns with a large percentage of gaps (-gt 0.1) using TrimAl v1.4. rev15 [34]. The trimmed alignment was used to reconstruct a maximum likelihood phylogenetic tree using IQTREE v1.6.9 [35]. Model selection was performed by IQTREE limiting the choices to 5 of the most common models (DCmut, JTTDCMut, LG, WAG, VT). 1000 rapid bootstraps were calculated per tree. Trees and alignments were uploaded to phylomedb [36] under the phylome IDs 383 to 387 (http://phylomedb.org/, accessed on 31 May 2021).

Phylome Analysis
Each phylome was computationally scanned to determine orthology and paralogy relationships across species using the species overlap algorithm [37] as implemented in ETE3 v3.1.1 [38]. Shortly, for each tree the lineage spanning from the seed protein (protein which initiates the blastP search during the phylome reconstruction, see above) to the root was scanned, and each node was assessed to qualify whether it was a duplication or a speciation. A duplication node was called when leaves at both sides of the node had at least one species in common. Leaves connected through a speciation node were considered orthologs whereas leaves connected through a duplication node were considered paralogs. Expansions were considered when multiple duplication events occurred specifically in one single species of Monilinia leading to the presence of at least 5 in-paralogous sequences. As the presence of transposable elements can impact the calculation of such expansions, we performed this analysis after removing trees that contained proteins that had Pfam domains associated with transposable elements. In addition, we also removed any duplication that was suspected to have been called due to an erroneous gene prediction that resulted from protein fragmentation. This was done by scanning the blastP results and observing whether two paralogs were consistently mapping to two parts of the same orthologous protein.
Gene gain and loss was derived from orthology information. So, a gene family was considered to have been gained at the common ancestor of all the species that contained an orthologous gene from the family. Once the gain was inferred, losses were calculated by first marking all the species that had lost the gene family and then minimizing the number of inferred losses by assigning the loss to the common ancestor of monophyletic species that lack the gene. Note that gains can only be assigned to the lineages leading to seed species of the phylome. To obtain a more complete view of gains and losses, we joined calculations from the five computed Monilinia phylomes, and for each node the average number of gains or losses was calculated. Proteins were assigned as "orphans" when they had no homologous proteins among the species used in the phylome reconstruction.

Species Tree Reconstruction
Duptree v1.48 [39] was used to reconstruct species trees for each of the phylomes using all the gene trees reconstructed in each phylome. Additionally, a maximum likelihood tree was reconstructed using a concatenation approach based on the phylome of M. fructicola. First, 2871 single copy alignments obtained in the phylome were concatenated to obtain a single multiple sequence alignment that contained nearly two million positions (1,992,645). Then, IQTREE v1.6.9 [35] was used to reconstruct the species tree using LG as the model. The rest of the parameters were inferred by IQTREE, and the best model according to the BIC criterion was (LG + F + R4). 1000 rapid bootstraps were calculated. Bootstrap support was 100 except for two nodes (see Figure 1).

Figure 1.
The tree represents the evolution of the species included in the phylome. Support is 100 for all nodes except the two nodes where a number can be found surrounded by a circle. Green numbers placed in the tree indicate the average number of genes gained at that particular node, whereas red values indicate the average number of losses. The table placed next to the tree indicates, for each species, the preferred host and the kind of lifestyle. The number within the blue circles indicate the number of secondary metabolism gene clusters detected by SMURF, while the numbers in the brown circles indicate the number of effector proteins found in each Monilinia species. Finally, the bar plot indicates in yellow the number of secreted proteins in each genome and in yellow the size of the CAZome.

Functional Prediction
All proteins used in the phylome were annotated using interproscan v5 [40]. In addition, CAZy proteins were annotated using dbcan metaserver [41]. Predicted CAZy proteins were filtered to retain only those that were annotated by at least two of the three programs in dbcan metaserver. A blastP search was run against the MycoCLAP [42] database (accessed June 2020) using the predicted CAZy proteins as queries. Secreted proteins were predicted using SignalP v5.0. Proteins with a signal peptide were scanned for the presence of transmembrane domains using TMHMM. Proteins with multiple TMHMM domains or with a single TMHMM domain that fell outside the first 60 AA of the protein were omitted. EffectorP 2.0 [43] was then used to detect effector genes in the predicted secretomes of Monilinia species. GO term enrichment was calculated using an in-house adaptation of FatiGO [44].
Secondary metabolism (SM) gene clusters were detected using the SMURF webserver [45], accessed June 2020. Proteomes were also scanned for the presence of known SM gene clusters by performing a blastP search between a collection of 174 gene clusters obtained from the literature and the proteomes included in the phylome. The presence of a cluster was assumed when a large part of the cluster was present in a contiguous region of a genome in one of the species. For short clusters (i.e., having 4 or fewer genes), all genes were required to be present. For longer clusters, at least half of the genes were required to be present as long as at least 3 genes were found in a cluster. A blastP search against the NCBI nr database was performed for all orphan genes in order to see whether they had homologs outside the phylome.

Monodictyphenone Gene Trees Reconstruction
The cluster of the secondary metabolite monodictyphenone was explored for evidence of horizontal gene transfer. For each protein encoded in the cluster, a blast search was performed against a local database including 340 fungal genomes [46]. The results were added to the ones found in Monilinia, and a phylogenetic tree was built using the same methodology as detailed for the phylome reconstruction. ETE3 v3.1.1 [38] was used to explore the resulting topologies and obtain the images.

Strain Comparison
Reads belonging to seven Monilinia strains were downloaded from SRA (see Table S1). Reads were mapped to their own reference genome using BWA mem v0.7.17-r1188 [47] and SNPs were called using GATK v4.0.4.0 [48]. Bedtools v2.29.2 [49] was then used to calculate genome coverages. Missing genes were obtained by checking coverage at their positions. If no reads were mapping at a position, it was considered not present. Genes with less than 90% of their coding position present were considered missing. Note that this approach avoids labeling genes that are not part of the gene catalog as a result of annotation errors due to sequencing or assembly errors causing frameshifts or spurious internal stop codons as missing.

Genes Involved in Pathogenicity
Genes related to the early infection process of M. laxa strain 8L on nectarines from RNA-Seq analysis [50] were searched for in four different species of Monilinia (M. fructicola CMPC6 strain, M. fructigena gena6 strain, M. polystroma SP-61 strain, and M. aucupariae LMK733 strain).

Comparison of Monilinia Genomes
Phylomes are complete collections of phylogenetic trees built for each gene encoded in a genome of interest. They allow a broad view of the evolutionary forces that have shaped a species' genome. We reconstructed five phylomes, one for each sequenced Monilinia species: M. laxa 8L [22], M. fructicola CPMC6 [23], M. fructigena gena6, M. polystroma SP-61, and M. aucupariae LMK733. A total of 15 proteomes were used to reconstruct the phylomes (see Table S1), including the five Monilinia spp. and ten other Leotiomycetes, including the well-annotated necrotroph B. cinerea [51,52]. Based on the information of these phylomes, and using a combination of a super-tree and gene concatenation approaches, we reconstructed a species tree representing the evolutionary relationships of the 15 Leotiomycetes species. The five phylomes resulted in the exact same species tree ( Figure 1). The resulting topology was well supported and confirmed the monophyly of the five Monilinia spp. as compared to the other 10 taxa (Figure 1). M. aucupariae, a non-pathogen on stone fruit, was the first diverging species followed by M. fructicola, M. laxa, and finally the two species that were until recently considered to be the same species: M. fructigena and M. polystroma (see Figure 1). These results confirmed those obtained by Villarino et al. [7], where the dendrogram generated based only on the ability to produce sclerotia on potato dextrose agar and lesion length on infected fruit separated Monilinia isolates into two clusters. One cluster was composed of M. laxa and M. fructigena isolates, and the other comprised M. fructicola isolates. Species from the Myriosclerotinia genus and the Ciborinia genus were sisters to the Monilinia clade. It is noteworthy that Myriosclerotinia species are the only ones in the analysis that are not necrotrophs, but rather biotrophs. Orthology and paralogy relationships were calculated from the phylomes using the species overlap algorithm [37], and the number of gains and losses that had occurred in each node were mapped to the species tree (see Methods, and Figure 1). Of note, a very large number of gene losses was inferred specifically in M. aucupariae (1348), despite the fact that this species has the largest genome sequence among Monilinia species. This difference cannot be attributed to the quality of the genome or annotation as BUSCO results show similar results for other Monilinia species. This wave of gene loss was not associated with enrichment in any gene ontology (GO) terms.
We detected 1021 orphan genes in the five Monilinia species that had no homologs in any of the other species used in the phylome. Out of these, 913 did not have a hit in the nr database and may represent either annotation errors or truly species-specific genes, 35 had hits in other Leotiomycetes not included in the phylome, and an additional 24 had hits in other Pezizomycotina species. Four proteins had a best hit in other fungal proteins, though their identity was low. A closer look at those four sequences revealed that they were either very short, had low complexity regions, or both. The remaining proteins (44) had hits outside fungi and may represent cases of horizontal gene transfer (HGT) or more likely contamination or miss-annotations.
We examined the phylomes for duplications that happened specifically in the lineage of Monilinia, which diverged 14.4 MyA according to a recent study [53]. There were very few duplications in these nodes ( Figure S1). This can be partially attributed to the short evolutionary time of Monilinia species and partially to a low duplication rate. Of note is a gene family expansion of the phytotoxin Bcnep1 specific to M. aucupariae (see Figure S2). Bcnep1 and its paralog Bcnep2 have been described in B. cinerea [54], and cause necrosis in dicotyledonous plants. All Monilinia species have a single copy of Bcnep2, but whereas most have also one single copy of Bcnep1, we found up to 20 copies of Bcnep1 in M. aucupariae. These duplicated genes are dispersed throughout the genome and therefore do not result from tandem duplications. Figure S3 contains a summary of this and subsequent analyses performed in this study.

Secretome Comparison in Monilinia Species
Necrotrophs secrete a range of cell wall degrading enzymes that serve to break plant cells and acquire nutrients [55]. Secreted proteins, as well as cell wall degrading enzymes, can be regarded as virulence factors, and as such can be related to the pathogenesis of a species. A list of secreted proteins found in the 15 species used in the phylome was obtained using SignalP v5.0 [56] (see methods, Table S2, and Figure S3).
The number of secreted proteins ranges from 628 to 1,203 in the set of 15 species used in the phylome analysis (see Table S3). 374 potential effectors were detected, which represented roughly 10% of the secretome. Differences in secretome size between different species are smaller when the number of secreted proteins is normalized by proteome size.
On average, secretomes account for roughly 7.5% of a species' protein content (range 6.56% to 8.29%). In absolute terms, the species with the smallest secretome was M. aucupariae (628). We grouped secreted proteins into orthologous groups and searched for families specific to Monilinia species. Seventeen families were exclusively present in more than one Monilinia species, of which five were present in all five Monilinia species.
Secreted proteins that were specific to Monilinia species and present in all five species included a glucose-methanol-choline oxidoreductase and two proteins annotated as Berberinelike. These last proteins could be homologous to proteins found in the Pyriculol cluster of Magnaporthe oryzae [57]. Pyriculol and its derivatives produce leaf lesions, although the deletion of this cluster in M. oryzae did not affect virulence in rice [57]. A homolog of this cluster is present in all five Monilinia species (see Figure 2A and Table S4). One of the proteins in the cluster, shown in Figure 2A marked with an X, was not thought to be involved in the synthesis of pyriculol in the original description of the cluster [57] but is conserved across the Monilinia species. The presence of an additional dehydrogenase (Y in Figure 2A) and the conservation of gene X could indicate that the compound synthesized by the Monilinia cluster is not Pyriculol but a related compound. Sassa et al. [58] described the presence in M. fructicola of monilidiol, a compound with antibiotic and phytotoxic activity similar to pyriculol. It is possible that the compound synthesized by this cluster is monilidiol [58]. Alternatively, the PKS encoded in this cluster matches a fragment sequenced that is thought to be responsible for synthesis of the antibiotic and antifungal chloromonilicin [59]. Further research is necessary to establish whether this cluster is responsible for the synthesis of monolidiol, chloromonicilin, or an alternative compound. Also specific to Monilinia, though not found in all the species, was a peptidase of the subtilase family found in M. fructigena, M. fructicola, and M. laxa. We used EffectorP v2.0 [43] to detect potential effector proteins among the set of secreted proteins in Monilinia. Out of a total of 3612 secreted proteins in all five Monilinia species, 374 (10.4%) were detected as potential effector proteins (see Table S2). Using orthologous relationships, we were able to group 330 out of the 374 effector proteins into 127 families, leaving 44 effector proteins unmatched. Only 20 of the families were present in all five Monilinia species.
Among the proteins detected as effector proteins by EffectorP [43] are the homologs to the phytotoxin BcNEP1.

The CAZome
Another set of potential enzymes that could contribute to the pathogenic capacity of the different species are the CAZy proteins, as they degrade complex carbohydrates. We predicted all CAZy proteins in the species considered (Table S5 and Figure S3). Overall, 6261 proteins out of the 161,183 proteins included in the phylome were annotated as CAZy. Out of the 6261 CAZy proteins in our phylome, 1494 presented hits with at least 50% identity to mycoCLAP proteins. The differences in numbers of CAZy proteins found in the different species is limited (see Figure 1 and Table S3 A transcriptomic analysis of M. fructicola, M. laxa, and M. fructigena conducted under in vitro conditions identified carbohydrate-active enzymes [12], but the role of these enzymes in infection of peach fruit requires more detailed transcriptome analyses and functional analysis by gene knock-out mutants. So, we explored the variability of the CAZy proteins in different species by dividing them into different families (see Table S6). Most families showed little differences in numbers of CAZy family members, and there was no Monilinia-specific CAZy family. Therefore, we focused on those families that showed the largest variability in their presence in different species. The largest variability is observed in family AA3, which contains auxiliary proteins that catalyse the oxidation of alcohols or carbohydrates with the formation of hydrogen peroxide or hydroquinones. These proteins have been found in several wood-degrading fungi. While they are present in all Monilinia species, they are even more abundant in the three Botrytis species that infect non-woody tissues such as leaves, flowers, and fruit.
Another CAZyme family with significant variation is GH28, a family of pectin degrading enzymes that has been reported to be expanded in necrotrophs [60]. The results presented here are consistent with those findings, as Myriosclerotinia species, which are biotrophs, have fewer copies (average of 11) than the necrotrophs Botrytis and Monilinia (averages of 19 and 17, respectively). Most proteins in the GH28 family are polygalacturonases, though some of them hydrolyze rhamnogalacturonan, which forms a branched part of the pectin complex. Within the GH28 family, three orthogroups were specific to necrotrophic species and these could be annotated more precisely based on homology to the MycoCLAP database [42].
Considering these results, differences in CAZy proteins between Monilinia and other species seem to be centered mainly around two families, AA3 and GH28, though smaller differences in the number of homologs in other families can be found.

Genes Involved in Pathogenicity
The availability of the genome of M. laxa 8L [22] and a study of the transcriptome of this pathogenic species on nectarines previously conducted by Balsells-Llauradó et al. [50] allowed a search for possible genes related to pathogenesis. After analysis of the RNA-Seq results, a set of 22 genes was selected for their expression early in nectarine infection ( Figure S4) and their relationship with pathogenesis in Monilinia spp. or B. cinerea (PHI database [61]) (Table S7). The transcripts to M. laxa were de novo annotated for multiple functional categories, including Gene Ontology (GO) terms, carbohydrate-active enzymes (CAZymes), Pfam, fungal peroxidases (fPox), genes involved in pathogen-host interactions (PHI), membrane transport proteins (TCBD), and proteins with signal peptides (SignalP), among others. The presence of orthologs of these 22 genes was checked in the other four species of Monilinia considered. GO terms classification for these genes showed a significant overrepresentation of CAZymes related to plant cell wall degradation (Table S7). Table S7 includes polygalacturonase 1 (MlPG1), and pectin lyase 2 gene (MlPNL2) described by Rodríguez-Pires et al. [62], as well as genes involved in oxidative-reduction processes and transmembrane transport. All of them had orthologs in all Monilinia species with the exception of Phy00AEVLC_61186, which encodes a transmembrane transporter and was missing in M. aucupariae (Table S7).

Secondary Metabolism Clusters
We examined the 15 proteomes used in the phylome for the presence of gene cluster families that were homologous to a collection of 174 known SM clusters (updated from Ballester et al. [63]). A cluster was considered present when most of the cluster genes were present, with the understanding that the synthesized compound may be related but not necessarily the same (see Table S6 and Figure S3). Among the clusters present in Monilinia, one was the aforementioned one homologous to the pyriculol gene cluster in M. oryzae, and the sordarial gene cluster in Neurospora crassa [64]. Another interesting cluster was a homolog of the cluster synthesizing monodictyphenone (see Figure 2B), a prenyl xanthone derivative. Xanthones have pharmacological uses as antifungal and antibacterial compounds and are produced by several groups of plants and fungi [65]. The monodictyphenone gene cluster was described in Aspergillus nidulans [66] and consists of 12 genes, two of which (mdpI and mdpD) are not thought to be involved in the synthesis of Monodictyphenone. A similar cluster is present in four of the five Monilinia species, missing only in M. aucupariae. The clusters of the four species are identical, with the only difference that mdpG in M. fructigena is absent in the annotation. This can either be due to a miss-assembly or genuinely be missing from this species in which case the cluster would not produce any compound. When comparing the clusters of Monilinia to the one in A. nidulans, they share most genes. The few differences include the lack of mdpI and mdpJ, the duplication of mdpH, and the presence of two additional genes (white arrows in Figure 2). Gene order is not conserved between the clusters of Monilinia and its homolog in A. nidulans. As many SM gene clusters have been transferred between unrelated fungal taxa, we considered the possibility that this was another such case. We searched homologous clusters in a database formed by 340 fungal genomes and obtained a list of 22 clusters from four of the subphyla of Pezizomycotina (Leotiomycetes, Sordariomycetes, Eurotiomycetes and Dothideomycetes) ( Figure S5). For each protein family, a gene tree was reconstructed to determine their phylogenetic relation to the clusters in Monilinia. All trees followed a similar pattern in which the sequences from Monilinia were far related from the ones from A. nidulans. The phylogenetic distance, coupled with the difference in protein composition of the gene cluster, could indicate that the MDP cluster in Monilinia diverged long ago from the homologous clusters in other taxa, so they share a very ancient ancestor. The Monilinia sequences were always grouped with an ortholog in Sclerotinia borealis, another member of the Leotiomycetes subphylum that infects barley, rye, and wheat. Orthologs in S. borealis were also present in a cluster that had a similar configuration as in Monilinia and was only missing mdpK and one of the additional proteins. The sister species to this group of Monilinia plus Sclerotinia was not always the same. In some cases, the sister sequence in the gene tree was from a member of the Penicillium or Talaromyces genus (for instance mdpG, mdpE, mdpD, mdpF, mdpH, or mdpK), in some other cases the sister belonged to different Sordariomycetes species (such as mdpL or mdpC). In the remaining cases, the sister was uncertain, as the bootstrap value was below 50 (see mdpA). While we found clusters in some of the Penicillium and Talaromyces species included in the database, none of them showed similarities in terms of gene order with the cluster found in Monilinia. While we cannot completely exclude the possibility of HGT playing a role in the evolution of this SM gene cluster, there is no strong evidence for HGT, since multiple species across multiple taxonomic groups seem to have a variant of this gene cluster.
Another cluster that was present in Monilinia was homologous to the botcinic acid (BOA) cluster in B. cinerea (see Figure 2C). This polyketide compound is a phytotoxin that is not essential for virulence even though a B. cinerea double mutant lacking the BOA cluster and the cluster synthesizing botrydial was shown to be less virulent [67]. The BOA cluster consists of 13 proteins, of which 11 (BOA3 to BOA13) are found in clusters in three of the Monilinia species: M. aucupariae, M. laxa and M. fructicola. The two first genes of the botcinic acid cluster (BOA1 and BOA2) are not in the main cluster, but are found together elsewhere in the genome in M. fructicola and M. laxa, though they are not present in the genome of M. aucupariae. This configuration is the same one that was found in Sclerotinia sclerotiorum, and this species is not known to produce botcinic acid, so it is likely that the compound of this partial cluster is different. A synteny analysis of the two partial BOA clusters strongly suggested that the presence of two separate clusters (BOA1-2 and BOA3-13) is ancestral in the Sclerotiniaceae, whereas the joining of these clusters into a single genomic locus is a derived trait unique to the genus Botrytis [23]. M. fructigena and M. polystroma have lost most but not all of the genes of the BOA cluster. M. fructigena retains the two first genes (Boa1 and Boa2) and the last one that has homologs in the other Monilinia species (Boa13), whereas M. polystroma retains Boa1, Boa11, and Boa13.
Other known pathways identified in Monilinia species were homologs to agnestins, prenyl xanthones, and geodin but all targeted parts of the aforementioned monodictyphenone gene cluster homolog. In addition, a homolog to the tenellin gene cluster [68] was identified, but it is unlikely that this gene cluster can produce tenellin given that the key enzyme that produces tenellin is a hybrid PKS/NRPS gene whereas the key enzyme in the Monilinia homologous cluster was a PKS. Additionally, the Monilinia cluster was missing one of the cytochrome P450 monooxygenase proteins (tenA) that catalyzes the reaction that transforms pretenellinA to pretenellinB.
We also used SMURF to detect unknown SM gene clusters (see Table S8). The number of detected clusters per species ranges between the 14 found in Myriosclerotinia scirpicola and the 36 found in the outgroup H. dingleyae. Botrytis spp. have, on average, more SM biosynthetic gene clusters than Monilinia and Myriosclerotinia. M. fructigena has an unusually low number of SM gene clusters (16) considering the average for the other Monilinia species is 20. The number of key enzymes for SM biosynthesis, which refers to PKS, NRPS, DMAT, and PKS-NRPS hybrids found in the genome follows that trend as M. fructigena has only 22 while M. fructicola has 33, therefore it is unlikely that the low number of clusters is solely the result of genome fragmentation. Genes involved in the biosynthesis of antimicrobials such as solanopyrone and clavulanic acid have been identified in some Monilinia spp. and are usually related to antimicrobial activity [12].  Table S1 and Figure S3). Beyond the reference we have used in this analysis up until now, the genomes of seven additional strains were obtained from NCBI (see Figure S1). Three of those are strains of M. fructicola, two of M. fructigena, and two of M. laxa. For each of these three species, genome sequences are available for isolates with each idiomorph (see Table S1).

Comparative Genomics of Sexual Reproduction and Heterokaryon Formation in
While signs of sexual reproduction have been observed in M. fructicola in the USA, analysis of European populations led to the conclusion that M. fructicola reproduced mainly asexually there [69]. In Australia, meanwhile, no sexual stages have been recorded yet [71]. Still, there was some evidence of recombination which could either be attributed to sexual reproduction or to the formation of heterokaryons. Heterokaryon formation is specific to filamentous fungi and is regulated by het (heterokaryon incompatibility) genes. Polymorphisms in het genes of two isolates that undergo anastomosis cause inviable heterokaryons, and consequently strains of the same species are assigned to different vegetative compatibility groups depending on whether or not they can form heterokaryons. Differences in het genes that cause heterokaryon incompatibility can be allelic, in which one single het locus contains polymorphisms that prevents the formation of a heterokaryon, or genetic, in which het genes found in different loci differ between strains [72]. One of the best characterized het loci is het-c. In B. cinerea, the het-c (bc-hch) loci was studied in a population of 117 isolates. Two different allelic types were found that could split the population in two. In addition, there was a strict correlation between the allelic het-c type and a resistance to the fungicide fenhexamide phenotype [73]. We searched for the bc-hch ortholog in the Monilinia species for which multiple strains were sequenced, and examined the allelic variation in the loci. In M. fructicola, we found that the two strains had 11 SNPs Based on read mapping, one of the HET proteins was consistently inferred as missing. Therefore, either the gene was missing in the strains assemblies or numerous mutations affected this locus, making it impossible for reads to map there. A blast search revealed that the protein was in fact not missing but it had just accumulated enough SNPs to remain undetected by read mapping. Along with this HET protein, a second protein located next to it was also missing in all strains except for M. fructigena Mfg5SPA.
Comparative genomic methods have allowed us to gather information about genetic variation, differential gene expression, and evolutionary dynamics in Monilinia species. Comparison of whole genome sequences provides a highly detailed view of how Monilinia species are related to each other at the genetic level, and also provides a powerful tool for studying Monilinia evolution.

Conclusions
We have performed a comparative analysis of the genomes of five plant pathogenic Monilinia species, including the main stone fruit pathogens M. laxa, M. fructicola, and M. fructigena. Using a phylome approach, we analyzed the patterns of gene gain and loss in these species and found that more genes had been lost than gained; this was especially evident in M. aucupariae, which, despite having the largest genome, had a net loss of 973 genes. The low number of genes gained is coupled with a very low duplication rate, although the NEP1 protein family was specifically expanded in M. aucupariae.
Comparison of secreted proteins and the CAZome showed a low variability in Monilinia species. We uncovered the presence of SM gene clusters homologous to the clusters synthesizing pyriculol, monodictyphenone, botcinic acid, and tenellin. The fact that most of the clusters contain variations in the form of additional proteins or loss of parts of the cluster indicates that the final product of these clusters is uncertain. Based on previous observations, the pyriculol homologous cluster may synthesize monilidiol or chloromonilicin. This will need to be experimentally tested. The comparison of different strains of Monilinia species showed the presence of different mating types as previously reported for these species. In addition, we examined a wide array of het genes that could modulate the formation of heterokaryons. One gene in particular seemed to vastly differ between the different Monilinia strains, suggesting that it could be one of the main drivers of heterokaryon incompatibility. Admittedly, our comparisons are limited to a single strain per species and therefore are not informative of intra-specific variations. Some of the genes here reported as specific or lost from a given species may represent genes with variable presence-absence patterns within a species. Future studies considering genome sequences from larger, representative strain sets per species will allow the reconstruction of panand core-genomes per species. Genome assembly and annotation quality can influence comparative results. The used Monilinia assemblies had similar gene content completeness as assessed by BUSCO, and therefore we do not expect that the large differences are driven by genome quality parameters. Nevertheless, the increased use of long-read sequencing technologies will allow better coverage of telomeric and repetitive regions, which are likely to contain part of the gene variability. Finally, our results are based on bioinformatic analyses and future studies should confirm some of the predictions made.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/jof7060464/s1; Figure S1: Number of duplications inferred at each node of the Monilinia species tree. Each number in the internal branches belongs to a duplication prediction done in a different phylome. Colors correspond to the colors of the species and indicate the seed of the phylome that produced a given number of duplications. In squares surrounding the tree are lists of functions for the duplicated proteins; Figure S2: Gene tree showing the expansion of the NEP1 gene family in M. aucupariae (purple in the tree). In blue, Bcnep1 and Bcnep2 are shown. In light blue are the orthologs in other Monilinia species; Figure S3: Schematic representation of the bioinformatic analyses performed in this study; Figure S4. Mean of the three RNA seq readings of the 22 selected genes related to pathogenicity of Monilinia laxa strain 8L at the inoculation time points (6, 14, 24 and 48 hpi) in mature fruit; Figure S5: Gene trees for mdpG, mdpL, and mdpA exemplifying the topologies discussed above. Monilinia species are shadowed in blue and the gene belonging to the A. nidulans original cluster can be found shadowed in grey. Next to the tree appears the taxonomic group the leaf belongs to. Colors also correlate with the taxonomic group; Table S1: List of genome data used in this project; Table S2: List of secreted proteins in Monilinia species; Table S3: Number of proteins detected as secreted or part of the CAZome part of secondary metabolism gene clusters according to SMURF and number of backbone proteins in secondary metabolism gene clusters; Table S4: List of proteins involved in known secondary metabolism clusters; Table S5: List of proteins in the Monilinia CAZome; Table S6: Number of CAZy proteins per CAZy category and species. Note that if a protein is classified in multiple classes it will be counted in each class; Table S7: Genes and enzymes identified from transcriptional changes in early Monilinia laxa strain 8L infection on mature nectarine and their orthologs in four other Monilinia species; Table S8: List of proteins involved in secondary metabolism clusters based on SMURF predictions. Data Availability Statement: All phylogenetic trees reconstructed in the phylome can be obtained from phylomedb.org. All additional data is available upon request.

Conflicts of Interest:
The authors declare no conflict of interest.