Abstract
Newer effectorome prediction algorithms are considering effectors that may not comply with the canonical characteristics of small, secreted, cysteine-rich proteins. The use of effector-related motifs and domains is an emerging strategy for effector identification, but its use has been limited to individual species, whether oomycete or fungal, and certain domains and motifs have only been associated with one or the other. The use of these strategies is important for the identification of novel, non-canonical effectors (NCEs) which we have found to constitute approximately 90% of the effectoromes. We produced an algorithm in Bash called WideEffHunter that is founded on integrating three key characteristics: the presence of effector motifs, effector domains and homology to validated existing effectors. Interestingly, we found similar numbers of effectors with motifs and domains within two different taxonomic kingdoms: fungi and oomycetes, indicating that with respect to their effector content, the two organisms may be more similar than previously believed. WideEffHunter can identify the entire effectorome (non-canonical and canonical effectors) of oomycetes and fungi whether pathogenic or non-pathogenic, unifying effector prediction in these two kingdoms as well as the two different lifestyles. The elucidation of complete effectoromes is a crucial step towards advancing effectoromics and disease management in agriculture.
1. Introduction
Fungi and oomycete pathogens are the principal constraints to achieving world food security. These pathogens infect their hosts by releasing effectors, virulence-promoting molecules that manipulate a variety of host processes. Some effectors alter chromatin configuration, mimic host transcriptional activators, target host transcription factors, or interfere with the biosynthesis of phytoregulators, among other functions that alter host physiology. Effectors ultimately suppress plant defense responses, enabling the pathogen to form an association with the plant host which can result in disease.
Alternatively, effectors can have a positive impact on plant health when they are recognized by resistance receptors in the host. This recognition triggers the hypersensitive response, which prevents further disease development. Current applications of effectors include their use in genetic improvement programs [1,2] and in screening germplasm for effector cognates, primarily resistance (R) proteins [3] or susceptibility proteins that are targeted by effectors [4]. These efforts are propelling effectoromics as a key area of investigation in phytopathology.
Effector identification has been facilitated, in large part, by next generation sequencing and the accessibility of information deposited in public databases. Recently, effectors have been identified from genomic, proteomic, and transcriptomic studies, particularly in pathosystems such as Pseudocercospora fijiensis–banana [5], Zymoseptoria tritici–wheat [6], Ustilago hordei–barley [7] and Puccinia striiformis–wheat [8], among others. Effector identification has become a staple of plant-pathogen investigations as the need heightens for novel and sustainable solutions to disease management.
The identification of effector proteins has been based primarily on bioinformatic pipelines that use common or “canonical” criteria. These canonical characteristics include the presence of a signal peptide, protein length ≤400 amino acids, cysteine-rich amino acid content (≥4 cysteines) and the absence of transmembrane domains (TMD) [9,10,11,12]. These criteria define canonical effectors, the effector type predominantly identified in high-throughput effector studies of the last two decades.
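The four canonical criteria can be read as a simple conjunction of tests on each protein. The following is a minimal sketch of that reading, not the implementation of any published pipeline: the `ProteinFeatures` record and the `is_canonical` function are hypothetical names, and the signal-peptide flag and TMD count are assumed to come from external predictors.

```python
from dataclasses import dataclass


@dataclass
class ProteinFeatures:
    """Hypothetical record; the two predicted fields come from external tools."""
    sequence: str              # amino acid sequence, one-letter code
    has_signal_peptide: bool   # assumed output of a signal-peptide predictor
    tmd_count: int             # assumed number of predicted transmembrane domains


def is_canonical(p: ProteinFeatures, max_len: int = 400, min_cys: int = 4) -> bool:
    """Return True only if all four canonical criteria are met."""
    return (
        p.has_signal_peptide
        and len(p.sequence) <= max_len
        and p.sequence.upper().count("C") >= min_cys
        and p.tmd_count == 0
    )
```

Under this reading, a protein that fails any single test, for example one with fewer than four cysteines, is treated as non-canonical.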
However, effector proteins that differ in one or more of these canonical criteria also exist, and we will refer to them as “non-canonical effectors” (NCEs). Non-canonical effectors have been identified through specific searches for motifs and domains that are associated with other characterized effectors [13,14,15], or from overexpression data observed in transcriptomes of plant-pathogen interactions [8,16]. The effector Pi04314 (PexRD24) was identified while searching for the “RXLR” motif deduced from ESTs of the oomycete Phytophthora infestans during its interaction with potato and tomato. This non-canonical effector does not have a signal peptide in its sequence, but it has been shown to be secreted and then translocated to the host nucleus, promoting the host’s susceptibility to infection [17]. In the fungus Blumeria graminis, a non-canonical effector called CSEP0064, found within a group of proteins containing an “RNase-like” domain termed “RALPH”, has only two cysteines and was identified as part of a general search for domains within the small, secreted proteins of the fungus [18]. PsIsc1 and VdIsc1 are NCEs lacking signal peptides that were found by BLASTing sequences of known isochorismate synthases from other organisms and identifying their homologs in Phytophthora sojae and Verticillium dahliae [19]. Other NCEs surpass the 300 or 400 amino acid limit of canonical effectors. SAD1 of Sporisorium reilianum induces the loss of apical dominance in maize and Arabidopsis and is an NCE with 626 amino acids [20]. Similarly, the Puccinia graminis f. sp. tritici effector AvrSr35 is a secreted protein which interacts with the Sr35 cognate in wheat and is 578 amino acids in length [21]. AvrSr35 is not recognized as an effector by EffHunter or EffectorP 2.0. Like the other examples mentioned, these NCEs were proven to be effectors through functional characterization after identification.
Other experimentally validated NCEs are missed by EffHunter, by EffectorP 2.0, or by both [12]. The contribution of NCEs to the elucidation of complete pathogen effectoromes cannot be overstated.
Many recent reports continue to base their predictions of effectors on short amino acid lengths and cysteine richness [22,23], but others are searching by other means [8,13,14,15,16]. Available algorithms include the EffectorP machine learning (ML) series, among which the latest version, EffectorP 3.0, is able to classify effectors as apoplastic or cytoplasmic [24]. Sperschneider and Dodds (2022) [24] classified 176 true, experimentally validated effectors; 64 were predicted to be apoplastic (extracellular) while a considerably larger number, 112, were predicted to be cytoplasmic, revealing a bias in effector identification based on canonical criteria. Another recent predictor, EffHunter, is a Perl script that is suitable for canonical effector classification since it strictly retrieves canonical effectors [12]. FunEffector-Pred, an ML algorithm, was trained with a similar number of proteins in both datasets to overcome the bias of EffectorP, which was trained with imbalanced positive and negative datasets [25]. Predector is another ML algorithm dedicated to fungal effectoromes, but it is intended for the predictive ranking of candidate effectors [26]. In the case of oomycete effectors, Nur et al. (2021) [27] constructed EffectorO, following an approach similar to that of FunEffector-Pred; this ML algorithm was trained with balanced 1:1 positive to negative training datasets, but EffectorO refines the prediction by retrieving the lineage-specific proteins.
The identification of effectors can be challenging, but the advent of these algorithms has made it faster. All of the aforementioned algorithms were trained on validated true effectors, and these datasets comprise effectors that were identified following the criteria of canonical effectors. Motifs such as RxLR-dEER and Y/F/WxC were once believed to be exclusive to oomycetes and were therefore excluded from the identification of fungal effectors. A turning point occurred when Godfrey et al. (2010) [28] found the motifs RxLR-dEER and Y/F/WxC within the N-terminus of 35 and 107 candidates, respectively, in Blumeria graminis f.sp. hordei. Recently, Zhang et al. (2020) [22] identified effectors in the transcriptome of the interaction between the basidiomycete fungus Puccinia triticina and wheat. These authors used a Perl script that encompassed a motif search including RxLR found in oomycetes, [Y/F/W]xC found in powdery mildew, G[I/F/Y][A/L/S/T]R of flax rust, and [L/I]xAR, [R/K]CxxCx12H, and YxSL[R/K] of Magnaporthe oryzae, with which they identified 635 effector candidates. Interestingly, some of them matched the canonical criteria, but 45 had no cysteines at all, while 47 had only one. It is important to note that 244 cysteine-rich small, extracellular proteins of P. triticina had the [Y/F/W]xC motif, 24 had RxLR, 5 had G[I/F/Y][A/L/S/T]R, 64 had [L/I]xAR, and 2 had YxSL[R/K], indicating that these motifs are not exclusive to oomycetes. In contrast, Wood et al. (2020) [29] found effector candidates in the oomycete pathogen Bremia lactucae that contain the WY domain but lack the canonical RXLR motif. This shows that going beyond the canonical criteria allows for the expansion of effectoromes and the discovery of novel effectors. Likewise, Nur et al. (2021) [27] predicted 5814 candidates in the effectorome of Phytophthora infestans; they used a new identification approach which focused on seven biochemical characteristics of the N-terminus of the protein sequence instead of the classical oomycete effector motifs. The number of novel effectors found was one order of magnitude larger than the previously estimated effectorome of this pathogen. These results emphasize the need for an innovative algorithm that goes beyond classical effector identification, one that can identify both canonical and non-canonical effectors. Realistic estimations of pathogen effectoromes can provide a wide range of tools which can be exploited for disease control, for example, selecting non-redundant effector families, or designing strategies to target all members of a redundant family.
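The motif patterns cited above translate naturally into regular expressions. The sketch below is an illustration only and is not the Perl script of Zhang et al. [22] nor the WideEffHunter code: the dictionary `MOTIFS` and the function `find_motifs` are hypothetical names, "x" is assumed to mean any single residue, and "x12" is assumed to mean exactly twelve arbitrary residues.

```python
import re

# Literature motifs rewritten as regular expressions (assumed translation:
# "x" = any one residue, "x12" = twelve arbitrary residues).
MOTIFS = {
    "RxLR": r"R.LR",
    "[Y/F/W]xC": r"[YFW].C",
    "G[I/F/Y][A/L/S/T]R": r"G[IFY][ALST]R",
    "[L/I]xAR": r"[LI].AR",
    "[R/K]CxxCx12H": r"[RK]C..C.{12}H",
    "YxSL[R/K]": r"Y.SL[RK]",
}
COMPILED = {name: re.compile(pattern) for name, pattern in MOTIFS.items()}


def find_motifs(sequence, window=None):
    """Map each motif found to the 0-based start of its first occurrence.

    If `window` is given, only the first `window` residues are searched,
    which mimics restricting the search to the N-terminal region.
    """
    region = sequence.upper()
    if window is not None:
        region = region[:window]
    hits = {}
    for name, regex in COMPILED.items():
        match = regex.search(region)
        if match:
            hits[name] = match.start()
    return hits
```

Because these patterns are short, chance matches are expected in any large proteome, so motif hits from a scan of this kind are best treated as candidates rather than as confirmed effectors.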
We present a new effector identification tool called WideEffHunter. This is a user-friendly, modular and stand-alone algorithm for the identification of canonical and non-canonical fungal and oomycete protein effectors. The algorithm searches deduced proteomes for effectors containing domains or motifs, as well as proteins with homology to known fungal and oomycete effectors. Recent reports have shown that some fungal effectors contain motifs previously believed to be exclusive to oomycete effectors. Conversely, domains from fungal proteins have been identified in oomycete effectors [22,29,30]. Similarly, WideEffHunter found classical motifs of oomycete effectors in fungal effector candidates, while in Phytophthora infestans the algorithm was able to identify LysM and other domains commonly found in fungal effectors. Characterization of effectoromes with EffHunter shows that the subset of canonical effectors comprises less than 10% of predicted effectoromes, suggesting that they represent just the tip of the iceberg. Interestingly, the comparison of the predicted effectoromes in fungi and oomycetes showed similar proportions of effectors containing domains, effectors containing motifs, and effectors that share homology with validated effectors, i.e., similar abundances of conserved effector families. This suggests that evolution has shaped similar effectorome patterns in fungi and oomycetes, contrary to what is currently believed. Whereas other predictors were designed for a single kingdom (fungi or oomycetes), or even for a particular lifestyle (for example, only pathogens), the results for WideEffHunter indicate that this new predictor can be applied to both fungi and oomycetes, whether pathogenic or non-pathogenic to the plant host.
2. Results
2.1. Protein Databases
The true fungal effector dataset comprises validated effector proteins from diverse reports (Table 1); a non-redundant list of effectors was compiled which contains 228 true fungal effectors. The oomycete dataset was similarly constructed and it comprises 86 true oomycete effectors, as shown in Table 1.
Table 1.
List of positive datasets compiled in the present study.
With respect to the non-canonical effectors, a comprehensive search of recent literature for novel, validated (true) non-canonical effectors was done. Thirteen NCEs were added to the fungal dataset, and three to the oomycete dataset. The lists of effectors comprising the fungal database are provided in Supplementary Table S1 while the list of oomycete effectors is provided in Supplementary Table S2.
2.2. In Silico Characterization of True Effectors
Effector identification is challenging, and even confusing at times, as different combinations of criteria can be used. The literature frequently states that not all effectors meet all the established effector criteria. Some predictions allow one or two TMDs, whereas others exclude proteins with any TMD. Similarly, the protein length cut-off used for effector identification varies between 200 and 400 amino acids. Other criteria such as cysteine content may also vary according to the study [5,12,32,33,34].
To help researchers prioritize the most important criteria for selecting or ranking effectors, as well as to identify properties that could aid in WideEffHunter’s design, true effectors were characterized in silico.
Consistent with current criteria for effector identification, the majority (281 protein sequences, ~89%) was shorter than 400 amino acids, but 10.5% of them were not small proteins. The length of the largest known effectors is between 415 and 847 amino acids. Among them, KEX1, a yeast carboxypeptidase B-like killer toxin, has 847 amino acids. Other examples include PsCRN108, a CRN effector of Phytophthora sojae, which has 820 amino acids, and Jsi1, an effector of Ustilago maydis that interferes in host jasmonate/ethylene signaling and has a length of 641 amino acids. It is evident that large effectors occur both in fungal and oomycete kingdoms, but usually elude the current predictors.
According to EffHunter, 142 proteins were canonical (45%), i.e., they had less than 400 amino acids, at least 4 Cys residues, a signal peptide for secretion and no TMD [12]. Non-canonical effectors (172 protein sequences, 54.7%) do not meet one or more of these criteria. Twenty-eight effectors had one or two TMDs (8.9%), while 3 effectors had 3–6 TMDs (Supplementary Tables S1 and S2). Only 11 effectors (3.5%) were predicted to have a glycosylphosphatidylinositol (GPI) anchor domain.
Ranking the criteria by the percentage of effectors that complied with each gives the following order: no GPI (96.5%), no TMD (91.1%), sequence length less than 400 amino acids (89.4%), signal peptide (85%), extracellular (71.6%), ≥4 Cys residues (54.4%). Forty-five percent had only 0 to 3 Cys residues. Results are shown in Table 2.
Table 2.
Summary of in silico characterization of canonical and non-canonical fungal/oomycete true effectors.
To better evaluate the effectors of each of these kingdoms (fungi and oomycetes), the analyses were repeated on each database independently. Here, differences were evident between the two groups. While 57% of fungal effectors were canonical, 86% of oomycete effectors were non-canonical (Table 3). Only 7% of fungal effectors had no cysteines, whereas 36% of oomycete effectors were cysteine-free. In total, 79.2% of oomycete effectors contained 3 cysteines or fewer, compared with 32.9% of fungal effectors. Conversely, 67% of fungal effectors had 4 cysteines or more, compared with 20.8% of oomycete effectors. Both classes coincide regarding TMDs, with 90% of fungal and 93% of oomycete effectors having no TMD. Similarly, ~96% and 99% of fungal and oomycete effectors, respectively, had no GPI anchors (Table 3).
Table 3.
Characterization and comparison of fungal and oomycete effectors.
2.3. Functional Annotation of Fungal/Oomycete Effector Proteins: Domains and Motifs
Recently, with the intention of expanding effector prediction in fungal genomes, Huang et al. (2022) [13], Jaswal et al. (2021) [14] and Zhao et al. (2020) [15] conducted searches based on motifs, a strategy typically used to identify oomycete effectors (the motifs RXLR, ERR, LXL and FLAK are usually associated with oomycete effectors). Conversely, motif-independent prediction of effectors was recently applied in oomycetes [27]. In both cases, the change of strategy yielded larger effectoromes.
To gain a better understanding of the role of domains and motifs in effector prediction, the fungal and oomycete effector databases were analyzed with the program InterProScan version 5.39–77.0 [35], which automatically and simultaneously searches in the databases of the modules CDD [36], PFAM [37], PRINTS [38], SMART [39] and TIGRFAM [40], among others; default parameter settings were used.
Fifty-six domains were identified (Table 4). Some domains were identified only in fungal effectors (LysM, CFEM, cerato-platanin, among others), others in oomycetes (RXLR, Tetratricopeptide repeat domain, cystatin/monellin, RuvA domain), and others were shared among effectors of both kingdoms (glycosyl hydrolase, pectin lyase fold, NPP1, PROKAR lipoprotein, among others). The crinkler domain, usually associated with oomycete effectors, is present in RiNLE1, a nuclear-targeted effector of the arbuscular mycorrhizal fungus Rhizophagus irregularis [41]. This is a non-canonical fungal effector, since its length is 469 amino acids and no signal peptide is computationally deduced. The Localizer program predicts nuclear localization for RiNLE1, congruent with the report of Wang et al. (2021) [41]. Details of in silico characterization are provided in Supplementary Tables S1 and S2.
Table 4.
Functional domains identified in fungal and oomycete effectors.
In total, 133 effectors contained at least one InterPro domain; 49 domains were present in the fungal dataset (in 99 protein sequences), and 17 in the oomycete dataset (in 34 effectors). Details are included in Supplementary Tables S1 and S2. The most frequently occurring domains are related to carbohydrate binding or hydrolysis (LysM, glycosyl hydrolase, pectin lyase fold), which play critical roles in host cell wall damage and pathogen cell wall remodeling. Other effector functions are associated with entering the host cell, for example RXLR signatures in oomycete effectors, and fungal hydrophobins and cerato-platanins. In the important category of host defense suppression, the following domains were identified: crinkler, isochorismatase and chorismate mutase. Various other domains are related to protein-protein interactions, which is expected since effectors need to bind their targets. Some effectors have domains characteristic of enzymes, such as lipases and different classes of proteases, while other effectors have protease-inhibitor domains.
Motifs have been used as probes to retrieve effector candidates, but usually only the most frequently occurring motifs are taken into consideration [13,14,15,22]. To date, no database of effector domains exists and the creation of this comprehensive list of effector domains represents a valuable tool for effectoromics. With respect to the number of known motifs, this list is still small. Further discovery of novel classes of effectors by genome mining and comparison of effectoromes may help to discover new effector-related domains.
In the positive dataset used here, no domains were identified in 181 effectors (57.6%): 129 from fungi (56.6%), and 52 (60.4%) from oomycete. All domain-free oomycete effectors belong to the non-canonical classification (Supplementary Table S2), but with respect to fungi, 64 non-canonical and 65 canonical effectors lacked domains. Table 5 shows a summary of these results, and details can be found in Supplementary Tables S1 and S2.
Table 5.
Classification of fungal and oomycete effectors with respect to functional domains present.
To test the regex designed here for domains, as well as the regex compiled from the literature regarding motifs, both regexes were used to mine the database of true effectors (positive dataset). As expected, these domains and motifs were found in the positive dataset (not shown). In fungi there were 110 hits, YFWxC being the most frequent (36), followed by motifs EAR (23), LysM (16), and [LI]xAR (16); curiously, 9 fungal true effectors had the RXLR motif. In the oomycete effectors, in addition to classical motifs for these microorganisms, the LysM domain was identified in 5 effectors and one was identified with a ToxA domain.
To search for novel motifs, the sequences of the true effectors were analyzed using the MEME suite. Table 6 shows the top 15 motifs found in fungal and oomycete effectors, respectively. The most frequent motif in fungi was MKFFTILL, found in 173 effectors (75.9% of the 228 fungal effectors; 55% considering the total database of 314 effectors). The other 14 motifs in fungal effectors were only present in 2 to 7 effectors. Regarding oomycetes, the most frequent motif was the RXLR motif found in 59 effectors (68.6%). The second most frequent was the motif MRLCYFLFVAAAAI, which was identified in 36 effectors, and the third, LYEHWHMRGCTPEHVYTILKLN, in 28 effectors. Similarly, the other 12 motifs were present in 2 to 7 effectors. For these most frequently occurring motifs found by MEME (one for fungi and two for oomycetes), regexes were created for inclusion in WideEffHunter.
Table 6.
Sequence motifs found in fungal and oomycete true effectors. Top 15 MEME motifs found in true, validated fungal and oomycete effectors.
The analyses conducted here, even with these still limited sets of validated effectors, enabled us to discover novel domains and motifs in fungal and oomycete effectors. Further discovery of novel classes of effectors through genome mining and comparative analysis of effectoromes may reveal new effector-related domains and motifs.
2.4. Construction and Validation of WideEffHunter Algorithm
The WideEffHunter code concatenates the hits obtained by mining with each regex for effector-related domains and motifs, including the three new motifs found here by MEME in the positive dataset (Table 6), and the results of a local Blastp search against the database of true effectors. After all hits are pooled, redundancy is eliminated, and the resulting non-redundant set is the predicted effectorome.
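The pooling and redundancy-removal step amounts to a set union over protein identifiers. The sketch below illustrates that idea in Python rather than in the Bash of the actual tool; `pool_hits` is a hypothetical function, and the three input lists are assumed to hold the protein identifiers returned by the domain search, the motif search and the Blastp search.

```python
def pool_hits(domain_hits, motif_hits, blastp_hits):
    """Pool hits from the three searches into one non-redundant mapping.

    Returns a dict whose keys are protein identifiers (each appears once)
    and whose values record which kind of evidence retrieved the protein.
    """
    evidence = {}
    sources = (
        ("domain", domain_hits),
        ("motif", motif_hits),
        ("homology", blastp_hits),
    )
    for label, identifiers in sources:
        for protein_id in identifiers:
            evidence.setdefault(protein_id, set()).add(label)
    return evidence
```

Keeping the evidence labels alongside each identifier is a design choice of this sketch; it makes it straightforward to report how many candidates were retrieved by domains, by motifs, or by homology.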
Table 7 shows validation results of WideEffHunter compared with EffectorP 1.0 [9], EffectorP 2.0 [31], EffectorP 3.0 [24], and EffectorO [27], comparing predictions on the positive and negative datasets.
Table 7.
Validation of WideEffHunter for prediction of fungal and oomycete effector proteins and comparison with EffectorP 3.0, EffectorP 2.0, EffectorP 1.0, and EffectorO.
Since WideEffHunter includes the Blastp database of true effectors, it retrieves all sequences when tested on the positive dataset. In contrast, when tested on the negative dataset, WideEffHunter retrieves 1545 hits. This high number of “false positives” results in a very low F1 score.
To improve the performance of WideEffHunter, the negative dataset was analyzed using the MEME program. Supplementary Table S3 shows the top 15 motifs found, which were used to refine the prediction of effectoromes. The number of hits from the positive dataset did not change because these motifs were not present in the dataset of known true effectors. Eliminating the hits in the negative dataset that contained these MEME motifs reduced the number of false positives to 192. Specificity, precision, accuracy, false positive rate and F1 score were all improved; these values were close to those shown by the three EffectorP versions (Table 7), indicating that this version of WideEffHunter is sufficiently robust for effector prediction in fungal and oomycete proteomes.
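The validation parameters named above follow from the standard confusion-matrix definitions. The sketch below is a minimal illustration, assuming that all 314 true effectors are retrieved (so TP = 314 and FN = 0) and that the hits on the negative dataset are the false positives; `classification_metrics` is a hypothetical helper, and the size of the negative dataset, needed for the metrics that use true negatives, is left as an optional argument because it is not restated in this section.

```python
from typing import Optional


def classification_metrics(tp: int, fp: int, fn: int, tn: Optional[int] = None) -> dict:
    """Standard confusion-matrix metrics; tn-dependent ones need `tn`."""
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    metrics = {
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
    if tn is not None:
        metrics["specificity"] = ratio(tn, tn + fp)
        metrics["false_positive_rate"] = ratio(fp, fp + tn)
        metrics["accuracy"] = ratio(tp + tn, tp + fp + fn + tn)
    return metrics


# Counts taken from the text; TP = 314 and FN = 0 are assumptions stated above.
unfiltered = classification_metrics(tp=314, fp=1545, fn=0)
filtered = classification_metrics(tp=314, fp=192, fn=0)
```

Precision and F1 depend only on TP, FP and FN, so they can be compared before and after filtering without knowing the number of true negatives; the values actually reported for WideEffHunter are those in Table 7.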
Figure 1 shows the WideEffHunter code and proposed downstream steps for effectorome characterization.
Figure 1.
(A) Workflow to predict fungal and oomycete effectors with WideEffHunter. Positive database of true (validated) effectors comprises 228 fungal effectors and 86 oomycete effectors. Effector-related motifs were compiled from literature and enriched with motifs found in true effectors by the MEME program. (B) Classification and characterization of canonical and non-canonical effectors.
2.5. WideEffHunter Prediction of Effectoromes in Fungal and Oomycete Proteomes
WideEffHunter was used to predict effectors on deduced proteomes of selected fungi and oomycetes.
With respect to the oomycete effectoromes of Bremia lactucae and Phytophthora infestans, WideEffHunter predicted a similar number of effectors to that reported by Nur et al. (2021) [27] for B. lactucae (1812 vs. 1777 in the reference), and a lower number of effectors than that predicted by Nur et al. (2021) [27] for P. infestans (3811 in comparison with 5814 in the reference). In fungi, in all examples predicted here, WideEffHunter expanded the effectoromes: 3 times for Puccinia triticina, and 1.6 times for Venturia inaequalis (Table 8). In the case of the fungal endophytes Pestalotiopsis fici and Xylona heveae, and in the antagonist Trichoderma harzianum, the increases were significant, ranging from 6 to 18 times (Table 8).
Table 8.
Effectoromes predicted by WideEffHunter in selected fungi and oomycetes and comparison with other predictors.
Curiously, the number of effector candidates in WideEffHunter’s unfiltered predictions is similar in most cases to the predictions made by EffectorP 3.0, while the filtered predictions (that is, candidates without MEME motifs found in the negative dataset) in the pathogens P. triticina, V. inaequalis, P. infestans and B. lactucae were similar to those of EffectorP 2.0 (Table 8). Discrepancies between these two predictors were found with T. harzianum, P. fici, and X. heveae, for which WideEffHunter predicted larger effectoromes. Predictions of the effectoromes of the non-pathogens P. fici and X. heveae by WideEffHunter were similar to EffectorP 1.0 predictions (Table 8).
Comparing the compositions of the effectoromes, we found that WideEffHunter shared ~60–70% hits with EffectorP 3.0 and EffectorO (Supplementary Table S4, tab “prediction”), but common hits were lower between WideEffHunter and EffectorP 3.0 for the non-pathogens (~40–46%). The lowest number of shared sets for WideEffHunter were observed in the effectoromes predicted by EffectorP 2.0 (~13–24%). Between 6 and 13% of effectoromes predicted by WideEffHunter were shared with those predicted by EffectorP 1.0, EffectorP 2.0, EffectorP 3.0, and EffectorO (Supplementary Table S4, tab “prediction”).
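The shared percentages above are set intersections expressed relative to the WideEffHunter prediction. A minimal sketch of that calculation follows; the function names are hypothetical, and each prediction is assumed to be available as a collection of protein identifiers.

```python
def shared_fraction(reference, other):
    """Fraction of `reference` identifiers that are also present in `other`."""
    reference, other = set(reference), set(other)
    if not reference:
        return 0.0
    return len(reference & other) / len(reference)


def shared_by_all(reference, *others):
    """Fraction of `reference` identifiers predicted by every other tool."""
    common = set(reference)
    for prediction in others:
        common &= set(prediction)
    reference = set(reference)
    return len(common) / len(reference) if reference else 0.0
```

Note that the fraction is not symmetric: dividing by the size of the other predictor's set instead of the reference set generally gives a different value, so the denominator should be stated whenever such percentages are reported.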
Analysis of the catalogs of the effector candidates predicted by WideEffHunter revealed that >87% were non-canonical (Supplementary Table S4, tab “classification”). Around 80% lack TMDs, 64–80% are <400 amino acids in length, ~50% have at least 4 Cys residues, and less than 20% have signal peptides (Supplementary Table S4, tab “characterization”). The largest share of effector candidates was predicted to be apoplastic (~50%), followed by nuclear (~30%), while the proportions for mitochondrial and chloroplast targeting were similar (~10–12%). Domains occurred in 40–60% of candidates and motifs were identified in 80–96%; the smallest contribution to the effectoromes came from the subset of homologs of confirmed effectors (1.8–9.3%).
3. Discussion
Effectoromics is a central research area in plant pathology, but identification of effectors has been slow, difficult, and even confusing. There are several criteria used for effector identification, but not all effectors perfectly match the established criteria, making effector identification a challenge [9,30,34,43,44]. Effector identification pipelines are quite variable; for fungi and oomycetes they can permit the presence of one or two TMDs [33] or exclude TMDs altogether [12,32]. They can have a protein size cutoff of 250 amino acids or less [5,33], 300 amino acids [43], or the upper limit can be set to 400 amino acids [12,25]. Some pipelines define effectors as having a cysteine content of ≥2% [45] or ≥5% [46], while others require at least 4 cysteine residues for effector candidature [12,23]. Recent pipelines were based on sequence homology within species of the same microbial genus [27,32], or on the identification of domains or motifs; the latter strategy, however, has been applied exclusively to either fungi (domains) or oomycetes (motifs) [29,47], with no trans-kingdom application. Novel algorithms that consider domains and motifs for the prediction of both fungal and oomycete effectoromes are necessary.
Fortunately, in recent years the number of validated effectors has increased significantly. Sperschneider et al. (2018) [31] compiled 94 fungal and oomycete effector protein sequences in order to train EffectorP v2.0. More recently, Carreón-Anguiano et al. (2020) [12] compiled 150 effector sequences to validate EffHunter. In the present study we compiled 314 protein sequences taken from different datasets of true effectors: 228 from fungi, and 86 from oomycetes. This is the largest dataset of true effectors compiled to date. We found the absence of GPI anchors in 96.5% of effectors and the absence of TMDs in 90.7% of effectors. Additionally, sequence length was less than 400 amino acids in 89.4% of effectors, 85.1% had a signal peptide, 71.6% had extracellular localization, and 54.4% had at least 4 Cys residues (Table 2). Cysteine content, one of the commonly used effector identification criteria, is not met by almost 50% of the true effectors. Fungi and oomycetes coincide in that >90% of their effectors lack both TMDs and GPI anchors. This knowledge about the weight of each criterion will help researchers make better decisions when they are selecting effector candidates or creating new algorithms.
According to our analysis using EffHunter, more than half (57%) of known fungal effectors are canonical, while in oomycetes, more than 85% are non-canonical (Table 3). These differences may be attributed, in part, to genuine evolutionary differences among effectors in these kingdoms; for example, while most known fungal effectors are secreted to the apoplast, the majority of described oomycete effectors are translocated into the host cell [48]. However, the observed differences may also result from a bias in the pipelines used until this point for the identification of effectors in these kingdoms; in fungi, effectors are usually identified based on protein length and cysteine content, while in oomycetes, the search is usually based on motifs such as RXLR, ERR, LXL, and FLAK [22,25,48].
During the characterization of validated effectors (positive datasets), we compiled a comprehensive list of motifs and domains present. It is important to mention that no databases of effector domains existed before. In previous studies, the predictions only considered a few domains such as LysM or CFEM, by mining proteomes with regular expressions or Hidden Markov Models [13,14,15,49,50]. The newly created database of effector-related domains, together with the motif database compiled from literature, represent valuable tools for effectoromics. The characterization of true effectors facilitated the identification of new effector features, such as the motif MKFFTILL which was present in 173 fungal effectors, and RHLRSHYQDEE, present in 59 oomycete effectors. The potential importance of novel effector motifs, especially in fungi, may be evidenced by citing the comments of He et al. (2020) [48]; in their words “a breakthrough for oomycete pathogens was the identification of the conserved amino acid motifs RxLR and LFLAK. These motifs define sets of several hundred intracellular effectors and have led to an upsurge in research on effector–host target interactions. For fungal plant pathogens, there are no such universal motifs, so the identification of bona fide intracellular effectors is a labor-intensive process initiated by the broader bioinformatic prediction of secreted proteins”. Therefore, these motif sequences enrich the current pool of computational tools available for effector identification.
As mentioned before, domains and/or motifs have recently been used as probes to retrieve effector candidates, such as the frequently occurring LysM and CFEM domains (fungi), and RXLR, LFLAK, Y/F/WxC, and CRN motifs (oomycetes). However, to date, only a few studies have employed this new “out-of-box” strategy, where motifs were the motor for fungal effector identification [13,14,15], or, in contrast, motif-independent searches for oomycete effectors were executed [27]. This strategy identified 719 RXLR-like, 19 CRN-like, and 138 Y/F/WxC new effector candidates in the fungus, P. graminis, in addition to the effectorome previously predicted with classical fungal effector identification methods [15]. This suggests that these classes of effectors are not exclusive to oomycetes and may contribute greatly to fungal effectoromics. These strategies have not only helped identify novel effectors, but have sometimes increased the number of known effectors by one order of magnitude, as was the case for P. infestans, whose initial 563 effectors [51] were later increased to 5814 [27]. According to WideEffHunter, fungal effectoromes comprise ~90% motif-containing effectors (similar to the proportion found during our analysis in oomycetes), and oomycete effectoromes comprise ~47–49% domain-containing effectors (similar to the proportion found here in fungi); likewise, the proportion of nuclear-targeted effector candidates is not very different between fungi and oomycetes. Notably, the percentages of effectors for each particular characteristic are similar among the predicted effectoromes (Supplementary Table S4, tabs “classification” and “characterization”), which suggests that, contrary to current belief, the effectoromes in fungi and oomycetes have followed similar evolutionary histories.
The occurrence of shared motifs and domains can facilitate the development of bioinformatics tools suitable for both kingdoms and will enable us to clarify whether fungal and oomycete effectoromes follow different evolutionary histories, or whether the differences resulted from biases in previous identification methods.
Omics studies, especially transcriptomics and proteomics of plant-pathogen interactions, have largely contributed to the discovery of novel, non-canonical effectors (Table 2 and Table 3), but these effectors are still the most elusive for computational identification. WideEffHunter was constructed to expand effectoromes, combining domains and motifs found either in fungal or oomycete effectors for the identification of both canonical and non-canonical effectors. The in silico characterization of 172 NCEs (98 from fungi and 74 from oomycetes) shows that 56 have functional domains but 116 do not (Table 5). In agreement with this result, 41% of the effectors recently predicted in Fusarium sacchari had no known domains or motifs [13]. To widen the prediction capacity of WideEffHunter, the database of known true effectors was embedded in WideEffHunter as a homology search tool, complementing the regexes for motifs and domains.
Validation of WideEffHunter was carried out in two runs. In the first, it retrieved 1545 hits from the negative dataset (“false positives”) and had poor performance parameters (F1 score 0.287). After the elimination of hits that contained motifs found by the MEME program in the negative dataset, the hits retrieved from the negative control decreased to 192. This step improved all performance parameters of WideEffHunter (Table 7), bringing them closer to those of the EffectorP predictors. By comparison, EffectorO retrieved 781 hits from the negative dataset. We checked the composition of the hits retrieved from the negative dataset by WideEffHunter and EffectorO and observed that most of them contain the motifs RXLR, EAR and CRN in the expected N-terminal position. Additionally, the WideEffHunter hits included 52 false positives with LysM domains (not shown). It is worth mentioning that the EffectorO ML algorithm was created for mining oomycete proteomes, and the overestimation observed here arose because we analyzed the proteomes uploaded online as FASTA files with default settings but did not subsequently select the candidates with lineage-specific phylogenetic distribution. That filter may improve EffectorO prediction, but we decided not to apply it, since the EffectorO script discards all hits with homologs in fungi and could therefore not be applied to fungal proteomes.
It is possible that some proteins in the negative dataset used in the present study are undiscovered effectors, since this set contains proteases, lipases, and scytalone dehydratases, among others. Construction of negative datasets is challenging because many presumed non-effectors could be undiscovered effectors. Recently, in training the ML algorithms Predector and EffectorP 2.0, the authors included proteins from saprophytes and symbionts in the negative datasets, but the number of reports showing the presence of effectors in saprophytes and symbionts is currently increasing [52,53], and these predictors are most likely ruling out many potential true effectors. However, the authors of the EffectorP algorithms acknowledged that EffectorP 2.0 improved pathogen effector identification because, compared to EffectorP 1.0, it excluded many proteins that are shared with non-pathogens [31]. As expected, EffectorP 2.0 predicted smaller effectoromes than WideEffHunter for the antagonist T. harzianum, and the endophytes P. fici and X. heveae. WideEffHunter also expanded effectoromes in comparison with Queiroz and Santana (2020) [43], since these authors restricted the identification to small, secreted cysteine-rich proteins with no conserved domains, containing a nuclear localization signal and repetitive sequences.
Curiously, the predictions of WideEffHunter for pathogenic fungi and oomycetes are closest to the predictions made by EffectorP 2.0, whereas WideEffHunter predictions for endophytes match the predictions of EffectorP 1.0. This is congruent with the fact that EffectorP 1.0 was not designed to filter saprophytes. Therefore, it seems that WideEffHunter is suitable for both pathogenic and non-pathogenic fungi and oomycetes. We also observed that, on various proteomes, the prefiltered results of WideEffHunter are close to the results of EffectorP 3.0.
As an additional test to evaluate its performance, WideEffHunter was used to predict effectoromes that were previously predicted following different criteria, and WideEffHunter performed well in these predictions (Table 8). This reinforces that while other predictors are specialized for use in one kingdom, or even for a particular lifestyle (e.g., pathogens), WideEffHunter works suitably across different lifestyles in the fungal and oomycete kingdoms. Around 60% of effector candidates predicted by WideEffHunter are shared with those predicted by EffectorP 3.0 or EffectorO (Supplementary Table S4). Therefore, WideEffHunter retrieves ~30–40% of novel candidates, expanding effectoromes. Effectors are so variable that no predictor can detect all potential candidates, so authors usually recommend combining predictors [12,26,27,31]. Fungi and oomycetes are filamentous species that share similarities, but also differ from each other [48,54,55], so the prediction of their effectoromes has also followed different routes [25,27]. The WideEffHunter algorithm unifies the prediction of fungal and oomycete effectors.
Classification of effector candidates predicted by WideEffHunter shows that canonical effectors comprise less than 10% of effectoromes, suggesting that NCEs play a more important role than we previously believed.
Some effectors have been reported as elusive for current predictors; for example, PIIN_08944 and AvrSr35, which are not recognized by EffHunter or EffectorP 2.0; SAD1 and BEC1054, which are not recognized by EffHunter; and Mg3LysM, BEC1019 and CSEP0105, which are not recognized by EffectorP 2.0. WideEffHunter was able to retrieve all of these effectors, since one of its retrieval tools is a homology-based Blastp search against the true effectors database. Effector candidates with homology represent 1.8 to 9% of effectoromes (Supplementary Table S4, tab “characterization”), indicating that this additional tool improved the performance of WideEffHunter. This result is congruent with the limited number of conserved families known currently in effectoromics. Some effectors that are widely distributed in fungi are Avr4, Ecp2, Ecp6, and NIS1, among others [30]. In oomycetes, HaRxL23 [56], RXLR effectors [57], as well as CRN12_997 and other CRN effectors, are conserved [58]. As more complete effectoromes are elucidated, more conserved families of effectors will be revealed.
Since effectoromics is continuously expanding, WideEffHunter was constructed modularly (Figure 1), giving researchers the opportunity to use the WideEffHunter algorithm as it was constructed, or to eliminate the regex of any particular domain or motif when mining the genome of their organism of choice. The lists of motifs, domains and validated effectors are still limited, but further comparison of effectoromes may reveal new effectors, domains and motifs. The WideEffHunter algorithm also allows users to continuously feed it with new data, keeping the algorithm updated and making WideEffHunter a tool that continuously catalyzes the discovery of novel effectors.
4. Materials and Methods
4.1. Data Protein Collection
The dataset of true fungal and oomycete effectors was constructed by combining diverse datasets of experimentally validated effectors compiled in Carreón-Anguiano et al. (2020) [12], Jones et al. (2021) [26], Nur et al. (2021) [27], Sperschneider et al. (2018) [31], and Wang et al. (2020) [25]. Additionally, 18 validated effector proteins were taken directly from their individual reports (sequences are provided in Supplementary Tables S1 and S2).
For the conversion of FASTA files to text files and/or vice versa, the “Seqret” tool in the European EMBOSS platform (https://www.ebi.ac.uk/Tools/sfc/embossseqret/) was used. For the generation of a database in tabular format, the sequences in the FASTA file were converted using a Python v2.7.18 script that separates the header and the sequence information in a tab-delimited format.
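The FASTA-to-tabular conversion described above can be sketched as follows. This is a minimal illustration written for Python 3, not the authors' Python 2.7 script; the function names `read_fasta` and `fasta_to_tab` are hypothetical, and the sketch assumes one "header, tab, sequence" row per record.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_tab(fasta_text):
    """Return one 'header<TAB>sequence' row per FASTA record (assumed layout)."""
    rows = [f"{h}\t{s}" for h, s in read_fasta(io.StringIO(fasta_text))]
    return "\n".join(rows)


if __name__ == "__main__":
    example = ">effector_1 toy record\nMKFF\nTILL\n>effector_2\nACDE\n"
    print(fasta_to_tab(example))
```

Joining wrapped sequence lines before writing the row keeps each protein on a single line, which is what line-oriented regex searches require.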
4.2. In Silico Characterization of Effectors
The following effector criteria were analyzed comprehensively for the 228 fungal and 86 oomycete effectors belonging to the positive datasets. The number of amino acids (length) and the number and percentage of cysteine residues were determined with the ProtParam tool at Expasy (https://web.expasy.org/protparam/; accessed on 20 January 2022), transmembrane domains were predicted with TMHMM [59], and signal peptides with SignalP 5.0 [60]. Protein subcellular localization was analyzed using LOCALIZER [61], and cell wall-bound proteins were identified with PredGPI [62]. All programs were run with default parameters.
Canonical effectors were identified with the EffHunter algorithm [12], and the remaining proteins (WideEffHunter prediction minus EffHunter prediction) were classified as non-canonical.
For functional domain identification, effector sequences were analyzed with PFAM [37] and InterPro [63]. Motifs were identified using the MEME suite [64] and were also searched for manually using motifs described in previous literature [9,10,13,15,65,66]. Functional annotation was carried out using the PFAM module of InterProScan in standalone mode [37].
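For readers who want to reproduce the simplest of these descriptors locally, the length and cysteine content can be computed directly from the sequence. The sketch below is only a stand-in for the ProtParam web tool used in the study; the function name `cysteine_profile` and the rounding to two decimals are assumptions made for illustration.

```python
def cysteine_profile(sequence):
    """Return length, cysteine count and cysteine percentage of a protein."""
    # Normalize case and drop a trailing stop symbol if present (assumption:
    # some proteome files terminate sequences with '*').
    seq = sequence.strip().upper().rstrip("*")
    length = len(seq)
    n_cys = seq.count("C")
    pct = 100.0 * n_cys / length if length else 0.0
    return {"length": length, "cysteines": n_cys, "cysteine_pct": round(pct, 2)}


if __name__ == "__main__":
    print(cysteine_profile("MKCCAAAAAA"))
```

Signal peptide, transmembrane domain and localization predictions depend on the external predictors cited above and are not reproduced here.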
4.3. Construction of Databases
Three databases were constructed: one for effector-related domains, another for effector-related motifs, and the third for the true validated effectors.
4.3.1. Database of Domains
Consensus sequences of the domains (for example, LysM and CFEM) were downloaded from the “Simple Modular Architecture Research Tool” (SMART) web platform [39], selecting the consensus sequences with a consensus value of 80%. Using “search SMART”, the information pertaining to the domains and the alignment consensus sequences was obtained. The consensus alignment sequences downloaded from SMART were translated to regular expressions (regexes) in Perl (Supplementary Tables S5.1 and S5.2).
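The translation of a consensus string into a regex can be illustrated with a small sketch. It is written in Python rather than Perl, and the residue-class table is a hypothetical excerpt assumed for illustration only; the actual symbol definitions should be taken from the legend that accompanies the downloaded consensus, and the regexes used by WideEffHunter are those in Supplementary Tables S5.1 and S5.2.

```python
import re

# Hypothetical excerpt of a residue-class legend (assumed, not from the paper).
CLASS_TABLE = {
    "l": "ILV",  # aliphatic residues (assumed definition)
    "+": "HKR",  # positively charged residues (assumed definition)
    "-": "DE",   # negatively charged residues (assumed definition)
    ".": None,   # any residue
}


def consensus_to_regex(consensus, class_table=CLASS_TABLE):
    """Translate a consensus string into a regular expression.

    Uppercase letters are kept as literal residues; class symbols become
    character classes; '.' stays a wildcard.
    """
    parts = []
    for symbol in consensus:
        if symbol.isupper():
            parts.append(symbol)
        elif symbol in class_table:
            members = class_table[symbol]
            parts.append("." if members is None else f"[{members}]")
        else:
            raise ValueError(f"unknown consensus symbol: {symbol!r}")
    return "".join(parts)


if __name__ == "__main__":
    pattern = consensus_to_regex("Cl.+")
    print(pattern, bool(re.search(pattern, "MACVQKT")))
```

Raising an error on unknown symbols, instead of silently treating them as wildcards, avoids producing an over-permissive regex when the class table is incomplete.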
4.3.2. Database of Motifs
Regexes for effector-related motifs were taken from Huang et al. (2022) [13], Zhao et al. (2020) [15], Liu et al. (2019) [66], Sonah et al. (2016) [10], Adhikari et al. (2013) [65] and Sperschneider et al. (2016) [9]. In addition to these motifs obtained from the literature, three novel motifs identified by MEME were included: the MKFFTILL motif, found in fungi, and two oomycete motifs, MRLCYFLFVAAAAI and LYEHWHMRGCTPEHVYTILKLN. The motif regexes were written in Perl.
The domain and motif databases were created in tabular format as stated above.
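A regex database in tabular form can be applied to a proteome along the following lines. This is a sketch in Python, not the Perl regexes shipped with WideEffHunter: the two-column "name, tab, pattern" layout, the function names, and the simplified patterns `R.LR` and `[YFW].C` (without any positional constraint) are assumptions; only the MKFFTILL motif is taken literally from the text.

```python
import re

# Assumed tabular layout: motif name <TAB> regular expression.
MOTIF_TABLE = "RXLR\tR.LR\nYFWxC\t[YFW].C\nMKFFTILL\tMKFFTILL\n"


def load_motifs(tab_text):
    """Parse 'name<TAB>regex' rows into a dict of compiled patterns."""
    motifs = {}
    for line in tab_text.splitlines():
        if not line.strip():
            continue
        name, pattern = line.split("\t", 1)
        motifs[name] = re.compile(pattern)
    return motifs


def scan_proteome(proteins, motifs):
    """Return {protein_id: [matching motif names]} for proteins with a hit."""
    hits = {}
    for pid, seq in proteins.items():
        found = [name for name, rx in motifs.items() if rx.search(seq)]
        if found:
            hits[pid] = found
    return hits


if __name__ == "__main__":
    toy = {"p1": "MRSLRAAA", "p2": "MAAAGGG", "p3": "MKFFTILLAYACD"}
    print(scan_proteome(toy, load_motifs(MOTIF_TABLE)))
```

Keeping the patterns in a table rather than in the code is what makes it straightforward to add or remove a motif without touching the search logic.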
4.3.3. Database of True Effectors
The list of amino acid sequences of validated fungal and oomycete effectors was converted to FASTA format and then to an indexed database using the following Linux command for BLAST: “$: formatdb -i <Fasta.fasta> -p T -o T”.
4.4. Construction of WideEffHunter
The WideEffHunter algorithm was constructed in Bash 5.0.17 by concatenating the different regexes (in Perl 5.30.0) corresponding to effector-related domains and motifs; input and output files are in FASTA format. Effector hits retrieved from the domain search were pooled with the hits retrieved by the second criterion, the presence of motifs. The third search was performed using local Blastp against the database of true effectors, and its hits were also pooled with the effector candidates retrieved in the domain and motif searches. Redundancies were eliminated with the command pipeline “$: cat <File.txt> | sort | uniq”. The resulting list was considered to be the predicted effectorome of the fungus or oomycete under study.
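The pooling and de-duplication step can be mirrored in a few lines. The sketch below is a Python analogue of the "cat | sort | uniq" pipeline, not the Bash code of WideEffHunter; the function name `pool_candidates` and the toy identifiers are hypothetical.

```python
def pool_candidates(*hit_lists):
    """Merge any number of hit lists into one sorted list without duplicates."""
    return sorted(set().union(*hit_lists))


if __name__ == "__main__":
    domain_hits = ["prot_07", "prot_02"]
    motif_hits = ["prot_02", "prot_11"]
    blastp_hits = ["prot_11"]
    print(pool_candidates(domain_hits, motif_hits, blastp_hits))
```

Because a protein can be retrieved by more than one criterion, pooling through a set guarantees that each candidate is counted once in the predicted effectorome.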
All databases in FASTA and TAB format, positive protein datasets, open-source codes and accessory scripts can be found on the GitHub platform (https://github.com/Gisel-Carreon) and on the home page of Dr. Blondy Canto Canché (https://www.cicy.mx/unidad-de-biotecnologia/investigador/blondy-beatriz-canto-canche).
The command to execute WideEffHunter, once it is installed on a Linux/Unix system, is “$: ./WideEffHunter.sh”.
It is worth mentioning that each step is modular; therefore, users can use the entire WideEffHunter as it was originally constructed for automatic prediction, or the user can delete a particular regex or database; likewise, users can add a regex for new effector-related domains and motifs, as well as upload newly discovered effectors to the positive dataset. In this way, WideEffHunter can be regularly updated.
4.5. Validation of WideEffHunter
For the validation of WideEffHunter, the positive dataset, containing a total of 314 true effectors (228 from fungi and 86 from oomycetes), was used.
For the negative control, the dataset used in Carreón-Anguiano et al. (2020) [12] was used. This dataset contains 4528 protein sequences that differ in length and in the presence or absence of a signal peptide and transmembrane domains (TMDs). We selected this negative dataset because it was not constructed by selecting proteins from saprophytes, as in other reports [26,31]. Saprophytes also contain effectors [52,53], and negative datasets that include their proteins to train algorithms may rule out novel, true effectors. Furthermore, during the validation of algorithms like WideEffHunter, such datasets may result in higher numbers of “supposedly false positives”.
Motifs in the proteins of the negative dataset were found through analysis with MEME; “negative exclusive” motifs were identified by searching for these motifs in the database of true effectors. To reduce the number of false positives predicted by WideEffHunter, the hits retrieved with the pipeline “domains + motifs + homologs of true effectors” were filtered by eliminating those containing MEME motifs exclusive to negative control proteins.
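The filtering logic can be outlined as below. This sketch assumes that the MEME motifs have already been expressed as regular expressions and that sequences are held in memory; both function names and the toy motifs are hypothetical and do not correspond to the motifs actually found in the negative dataset.

```python
import re


def negative_exclusive(negative_motifs, true_effectors):
    """Keep the motifs that match none of the validated effector sequences."""
    return [
        motif
        for motif in negative_motifs
        if not any(re.search(motif, seq) for seq in true_effectors)
    ]


def filter_hits(hits, exclusive_motifs):
    """Drop candidate hits that contain any negative-exclusive motif."""
    return {
        pid: seq
        for pid, seq in hits.items()
        if not any(re.search(motif, seq) for motif in exclusive_motifs)
    }


if __name__ == "__main__":
    exclusive = negative_exclusive(["GDSL", "R.LR"], ["MRSLRAA"])
    print(exclusive)
    print(filter_hits({"h1": "MAGDSLTT", "h2": "MRALRKK"}, exclusive))
```

Checking each motif against the true effectors first ensures that the filter never removes a candidate because of a motif that validated effectors also carry.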
The numbers of true positives, true negatives, false positives, and false negatives, were used to calculate sensitivity, specificity, precision and accuracy parameters as well as the F1 score, a parameter widely used to measure and compare performances of different software/pipelines [12,31].
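For reference, the standard definitions of these parameters can be written out as a short function. The counts in the usage example are illustrative only and are not taken from Table 7; the function name `performance` is hypothetical.

```python
def performance(tp, tn, fp, fn):
    """Compute sensitivity, specificity, precision, accuracy and F1 score."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class is empty.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)          # recall: TP / (TP + FN)
    specificity = ratio(tn, tn + fp)          # TN / (TN + FP)
    precision = ratio(tp, tp + fp)            # TP / (TP + FP)
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts, not results from this study.
    for name, value in performance(tp=8, tn=90, fp=10, fn=2).items():
        print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, a large number of false positives lowers it sharply even when sensitivity is high, which is why removing false positives has such a strong effect on this score.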
The performance of WideEffHunter was compared with that of EffectorP 1.0 [9], EffectorP 2.0 [31], EffectorP 3.0 [24] and EffectorO [27].
4.6. Prediction of Effector Proteins in Fungal and Oomycete Genomes
For comparative analysis, recent reports that predict effectors using domains and motifs were selected. The genomes (more precisely, the deduced proteomes) that were searched with WideEffHunter were from the oomycetes P. infestans and B. lactucae [27], and the fungal pathogens P. triticina [15] and V. inaequalis [42]. In addition, the fungal endophytes P. fici and X. heveae [43], and the antagonist T. harzianum [12], were included.
Subsequently, effector candidates were classified as canonical or non-canonical using EffHunter. The number of non-canonical effectors was estimated by subtracting the prediction by EffHunter from the prediction by WideEffHunter.
Both classes, canonical and non-canonical effector candidates, were further characterized in silico in terms of: (a) number of amino acids, cysteine content, signal peptide, TMDs; (b) identification of effector-related domains; (c) identification of effector-related motifs and potential function (annotation); (d) homologs of true effectors; (e) cell localization.
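The subtraction can be expressed as a set operation on protein identifiers. The sketch below assumes that both tools report the same identifiers for the same proteins; the function name `classify_candidates` and the toy identifiers are hypothetical.

```python
def classify_candidates(wide_ids, effhunter_ids):
    """Split WideEffHunter candidates into canonical and non-canonical sets.

    Canonical: candidates also predicted by EffHunter.
    Non-canonical: WideEffHunter prediction minus EffHunter prediction.
    """
    wide = set(wide_ids)
    canonical = wide & set(effhunter_ids)
    non_canonical = wide - canonical
    return sorted(canonical), sorted(non_canonical)


if __name__ == "__main__":
    canonical, non_canonical = classify_candidates(
        ["e1", "e2", "e3", "e4"], ["e2", "e9"]
    )
    print(len(canonical), len(non_canonical))
```

Intersecting with the WideEffHunter set first means that an EffHunter prediction absent from the WideEffHunter output (such as "e9" in the toy example) does not distort the non-canonical count.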
5. Conclusions
WideEffHunter, an algorithm that predicts effectors based on effector-related domains and motifs, as well as homology to known validated effectors, is suitable for the retrieval of whole effectoromes (canonical and non-canonical effector candidates) in pathogenic and non-pathogenic fungi and oomycetes. This is a user-friendly and modular algorithm that can be updated continuously with new domains, motifs and novel effectors, providing a powerful tool to strengthen effectoromics research.
6. Patents
The present algorithm was registered at the Mexican Public Copyright Registry under registration number 03-2022-101112004700-01.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms232113567/s1.
Author Contributions
Conceptualization, K.G.C.-A. and B.C.-C.; Methodology, K.G.C.-A., I.I.-F. and B.C.-C.; Software, K.G.C.-A. Validation, B.C.-C.; Formal Analysis, K.G.C.-A. and B.C.-C.; Resources, B.C.-C.; Data Curation, K.G.C.-A. and B.C.-C.; Writing—Original Draft Preparation, K.G.C.-A., J.N.A.T. and B.C.-C.; Writing—Review and Editing, B.C.-C., K.G.C.-A., O.J.C.-D., I.I.-F., J.N.A.T.; Supervision, B.C.-C.; Project Administration, B.H.C.-M., and B.C.-C.; Funding Acquisition, B.C.-C. All coauthors contributed to the writing and correction of the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding
This research received funding from CONACyT-Mexico project FOP16-2021-01 No. 320993, and CONACyT-funded scholarships for doctoral students Todd J.N.A. (863239) and Couoh-Dzul O.J. (644399).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Conflicts of Interest
The authors declare no conflict of interest.
References
- Giesbers, A.K.J.; Pelgrom, A.J.E.; Visser, R.G.F.; Niks, R.E.; Van den Ackerveken, G.; Jeuken, M.J.W. Effector-Mediated Discovery of a Novel Resistance Gene against Bremia lactucae in a Nonhost Lettuce Species. New Phytol. 2017, 216, 915–926. [Google Scholar] [CrossRef] [PubMed]
- Zhang, M.; Coaker, G. Harnessing Effector-Triggered Immunity for Durable Disease Resistance. Phytopathology 2017, 107, 912–919. [Google Scholar] [CrossRef] [PubMed]
- Kanja, C.; Hammond-Kosack, K.E. Proteinaceous Effector Discovery and Characterization in Filamentous Plant Pathogens. Mol. Plant Pathol. 2020, 21, 1353–1376. [Google Scholar] [CrossRef] [PubMed]
- Gorash, A.; Armonienė, R.; Kazan, K. Can Effectoromics and Loss-of-Susceptibility Be Exploited for Improving Fusarium Head Blight Resistance in Wheat? Crop J. 2021, 9, 1–16. [Google Scholar] [CrossRef]
- Chang, T.-C.; Salvucci, A.; Crous, P.W.; Stergiopoulos, I. Comparative Genomics of the Sigatoka Disease Complex on Banana Suggests a Link between Parallel Evolutionary Changes in Pseudocercospora fijiensis and Pseudocercospora eumusae and Increased Virulence on the Banana Host. PLOS Genet. 2016, 12, e1005904. [Google Scholar] [CrossRef]
- Palma-Guerrero, J.; Ma, X.; Torriani, S.F.F.; Zala, M.; Francisco, C.S.; Hartmann, F.E.; Croll, D.; McDonald, B.A. Comparative Transcriptome Analyses in Zymoseptoria tritici Reveal Significant Differences in Gene Expression Among Strains During Plant Infection. Mol. Plant Microbe Interact. 2017, 30, 231–244. [Google Scholar] [CrossRef]
- Ökmen, B.; Mathow, D.; Hof, A.; Lahrmann, U.; Aßmann, D.; Doehlemann, G. Mining the Effector Repertoire of the Biotrophic Fungal Pathogen Ustilago hordei during Host and Non-Host Infection. Mol. Plant Pathol. 2018, 19, 2603–2622. [Google Scholar] [CrossRef]
- Ozketen, A.C.; Andac-Ozketen, A.; Dagvadorj, B.; Demiralay, B.; Akkaya, M.S. In-Depth Secretome Analysis of Puccinia striiformis f. sp. tritici in Infected Wheat Uncovers Effector Functions. Biosci. Rep. 2020, 40, BSR20201188. [Google Scholar] [CrossRef]
- Sperschneider, J.; Gardiner, D.M.; Dodds, P.N.; Tini, F.; Covarelli, L.; Singh, K.B.; Manners, J.M.; Taylor, J.M. EffectorP: Predicting Fungal Effector Proteins from Secretomes Using Machine Learning. New Phytol. 2016, 210, 743–761. [Google Scholar] [CrossRef]
- Sonah, H.; Deshmukh, R.K.; Bélanger, R.R. Computational Prediction of Effector Proteins in Fungi: Opportunities and Challenges. Front. Plant Sci. 2016, 7, 126. [Google Scholar] [CrossRef]
- Jones, D.A.; Bertazzoni, S.; Turo, C.J.; Syme, R.A.; Hane, J.K. Bioinformatic Prediction of Plant–Pathogenicity Effector Proteins of Fungi. Curr. Opin. Microbiol. 2018, 46, 43–49. [Google Scholar] [CrossRef] [PubMed]
- Carreón-Anguiano, K.G.; Islas-Flores, I.; Vega-Arreguín, J.; Sáenz-Carbonell, L.; Canto-Canché, B. EffHunter: A Tool for Prediction of Effector Protein Candidates in Fungal Proteomic Databases. Biomolecules 2020, 10, 712. [Google Scholar] [CrossRef] [PubMed]
- Huang, Z.; Li, H.; Zhou, Y.; Bao, Y.; Duan, Z.; Wang, C.; Powell, C.A.; Chen, B.; Zhang, M.; Yao, W. Predication of the Effector Proteins Secreted by Fusarium sacchari Using Genomic Analysis and Heterogenous Expression. J. Fungi 2022, 8, 59. [Google Scholar] [CrossRef] [PubMed]
- Jaswal, R.; Dubey, H.; Kiran, K.; Rawal, H.; Rajarammohan, S.; Prasad, P.; Bhardwaj, S.C.; Sonah, H.; Deshmukh, R.; Gupta, N.; et al. Comparative Secretomics Identifies Conserved WAxR Motif-Containing Effectors in Rust Fungi That Suppress Cell Death in Plants. bioRxiv 2021. [Google Scholar] [CrossRef]
- Zhao, S.; Shang, X.; Bi, W.; Yu, X.; Liu, D.; Kang, Z.; Wang, X.; Wang, X. Genome-Wide Identification of Effector Candidates with Conserved Motifs from the Wheat Leaf Rust Fungus Puccinia triticina. Front. Microbiol. 2020, 11, 1188. [Google Scholar] [CrossRef] [PubMed]
- Schurack, S.; Depotter, J.R.L.; Gupta, D.; Thines, M.; Doehlemann, G. Comparative Transcriptome Profiling Identifies Maize Line Specificity of Fungal Effectors in the Maize–Ustilago maydis Interaction. Plant J. 2021, 106, 733–752. [Google Scholar] [CrossRef] [PubMed]
- Boevink, P.C.; Wang, X.; McLellan, H.; He, Q.; Naqvi, S.; Armstrong, M.R.; Zhang, W.; Hein, I.; Gilroy, E.M.; Tian, Z.; et al. A Phytophthora infestans RXLR Effector Targets Plant PP1c Isoforms That Promote Late Blight Disease. Nat. Commun. 2016, 7, 10311. [Google Scholar] [CrossRef]
- Pennington, H.G.; Jones, R.; Kwon, S.; Bonciani, G.; Thieron, H.; Chandler, T.; Luong, P.; Morgan, S.N.; Przydacz, M.; Bozkurt, T.; et al. The Fungal Ribonuclease-like Effector Protein CSEP0064/BEC1054 Represses Plant Immunity and Interferes with Degradation of Host Ribosomal RNA. PLoS Pathog. 2019, 15, e1007620. [Google Scholar] [CrossRef]
- Liu, T.; Song, T.; Zhang, X.; Yuan, H.; Su, L.; Li, W.; Xu, J.; Liu, S.; Chen, L.; Chen, T.; et al. Unconventionally Secreted Effectors of Two Filamentous Pathogens Target Plant Salicylate Biosynthesis. Nat. Commun. 2014, 5, 4686. [Google Scholar] [CrossRef]
- Ghareeb, H.; Drechsler, F.; Löfke, C.; Teichmann, T.; Schirawski, J. SUPPRESSOR OF APICAL DOMINANCE1 of Sporisorium reilianum Modulates Inflorescence Branching Architecture in Maize and Arabidopsis. Plant Physiol. 2015, 169, 2789–2804. [Google Scholar] [CrossRef]
- Salcedo, A.; Rutter, W.; Wang, S.; Akhunova, A.; Bolus, S.; Chao, S.; Anderson, N.; De Soto, M.F.; Rouse, M.; Szabo, L.; et al. Variation in the AvrSr35 Gene Determines Sr35 Resistance against Wheat Stem Rust Race Ug99. Science 2017, 358, 1604–1606. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Y.; Wei, J.; Qi, Y.; Li, J.; Amin, R.; Yang, W.; Liu, D. Predicating the Effector Proteins Secreted by Puccinia triticina Through Transcriptomic Analysis and Multiple Prediction Approaches. Front. Microbiol. 2020, 11, 538032. [Google Scholar] [CrossRef] [PubMed]
- Wang, D.; Tian, L.; Zhang, D.-D.; Song, J.; Song, S.-S.; Yin, C.-M.; Zhou, L.; Liu, Y.; Wang, B.-L.; Kong, Z.-Q.; et al. Functional Analyses of Small Secreted Cysteine-Rich Proteins Identified Candidate Effectors in Verticillium dahliae. Mol. Plant Pathol. 2020, 21, 667–685. [Google Scholar] [CrossRef] [PubMed]
- Sperschneider, J.; Dodds, P.N. EffectorP 3.0: Prediction of Apoplastic and Cytoplasmic Effectors in Fungi and Oomycetes. Mol. Plant Microbe Interact. 2022, 35, 146–156. [Google Scholar] [CrossRef]
- Wang, C.; Wang, P.; Han, S.; Wang, L.; Zhao, Y.; Juan, L. FunEffector-Pred: Identification of Fungi Effector by Activate Learning and Genetic Algorithm Sampling of Imbalanced Data. IEEE Access 2020, 8, 57674–57683. [Google Scholar] [CrossRef]
- Jones, D.A.B.; Rozano, L.; Debler, J.W.; Mancera, R.L.; Moolhuijzen, P.M.; Hane, J.K. An Automated and Combinative Method for the Predictive Ranking of Candidate Effector Proteins of Fungal Plant Pathogens. Sci. Rep. 2021, 11, 19731. [Google Scholar] [CrossRef]
- Nur, M.; Wood, K.; Michelmore, R. EffectorO: Motif-Independent Prediction of Effectors in Oomycete Genomes Using Machine Learning and Lineage Specificity. bioRxiv 2021. [Google Scholar] [CrossRef]
- Godfrey, D.; Böhlenius, H.; Pedersen, C.; Zhang, Z.; Emmersen, J.; Thordal-Christensen, H. Powdery Mildew Fungal Effector Candidates Share N-Terminal Y/F/WxC-Motif. BMC Genom. 2010, 11, 317. [Google Scholar] [CrossRef]
- Wood, K.J.; Nur, M.; Gil, J.; Fletcher, K.; Lakeman, K.; Gann, D.; Gothberg, A.; Khuu, T.; Kopetzky, J.; Naqvi, S.; et al. Effector Prediction and Characterization in the Oomycete Pathogen Bremia lactucae Reveal Host-Recognized WY Domain Proteins That Lack the Canonical RXLR Motif. PLOS Pathog. 2020, 16, e1009012. [Google Scholar] [CrossRef]
- Jones, D.A.B.; Moolhuijzen, P.M.; Hane, J.K. Remote Homology Clustering Identifies Lowly Conserved Families of Effector Proteins in Plant-Pathogenic Fungi. Microb. Genom. 2021, 7, 000637. [Google Scholar] [CrossRef]
- Sperschneider, J.; Dodds, P.N.; Gardiner, D.M.; Singh, K.B.; Taylor, J.M. Improved Prediction of Fungal Effector Proteins from Secretomes with EffectorP 2.0. Mol. Plant Pathol. 2018, 19, 2094–2110. [Google Scholar] [CrossRef] [PubMed]
- Liang, P.; Liu, S.; Xu, F.; Jiang, S.; Yan, J.; He, Q.; Liu, W.; Lin, C.; Zheng, F.; Wang, X.; et al. Powdery Mildews Are Characterized by Contracted Carbohydrate Metabolism and Diverse Effectors to Adapt to Obligate Biotrophic Lifestyle. Front. Microbiol. 2018, 9, 3160. [Google Scholar] [CrossRef] [PubMed]
- Morais do Amaral, A.; Antoniw, J.; Rudd, J.J.; Hammond-Kosack, K.E. Defining the Predicted Protein Secretome of the Fungal Wheat Leaf Pathogen Mycosphaerella graminicola. PLoS ONE 2012, 7, e49904. [Google Scholar] [CrossRef]
- Neu, E.; Debener, T. Prediction of the Diplocarpon rosae Secretome Reveals Candidate Genes for Effectors and Virulence Factors. Fungal Biol. 2019, 123, 231–239. [Google Scholar] [CrossRef] [PubMed]
- Jones, P.; Binns, D.; Chang, H.-Y.; Fraser, M.; Li, W.; McAnulla, C.; McWilliam, H.; Maslen, J.; Mitchell, A.; Nuka, G.; et al. InterProScan 5: Genome-Scale Protein Function Classification. Bioinformatics 2014, 30, 1236. [Google Scholar] [CrossRef] [PubMed]
- Marchler-Bauer, A.; Derbyshire, M.K.; Gonzales, N.R.; Lu, S.; Chitsaz, F.; Geer, L.Y.; Geer, R.C.; He, J.; Gwadz, M.; Hurwitz, D.I.; et al. CDD: NCBI’s Conserved Domain Database. Nucleic Acids Res. 2015, 43, D222–D226. [Google Scholar] [CrossRef] [PubMed]
- Mistry, J.; Chuguransky, S.; Williams, L.; Qureshi, M.; Salazar, G.A.; Sonnhammer, E.L.L.; Tosatto, S.C.E.; Paladin, L.; Raj, S.; Richardson, L.J.; et al. Pfam: The Protein Families Database in 2021. Nucleic Acids Res. 2021, 49, D412–D419. [Google Scholar] [CrossRef]
- Attwood, T.K. The PRINTS Database: A Resource for Identification of Protein Families. Brief. Bioinform. 2002, 3, 252–263. [Google Scholar] [CrossRef]
- Schultz, J.; Copley, R.R.; Doerks, T.; Ponting, C.P.; Bork, P. SMART: A Web-Based Tool for the Study of Genetically Mobile Domains. Nucleic Acids Res. 2000, 28, 231–234. [Google Scholar] [CrossRef]
- Haft, D.H.; Selengut, J.D.; White, O. The TIGRFAMs Database of Protein Families. Nucleic Acids Res. 2003, 31, 371–373. [Google Scholar] [CrossRef]
- Wang, P.; Jiang, H.; Boeren, S.; Dings, H.; Kulikova, O.; Bisseling, T.; Limpens, E. A Nuclear-Targeted Effector of Rhizophagus irregularis Interferes with Histone 2B Mono-Ubiquitination to Promote Arbuscular Mycorrhization. New Phytol. 2021, 230, 1142–1155. [Google Scholar] [CrossRef] [PubMed]
- Rocafort, M.; Bowen, J.K.; Hassing, B.; Cox, M.P.; McGreal, B.; de la Rosa, S.; Plummer, K.M.; Bradshaw, R.E.; Mesarich, C.H. The Venturia inaequalis Effector Repertoire Is Expressed in Waves, and Is Dominated by Expanded Families with Predicted Structural Similarity to Avirulence Proteins from Other Fungi. bioRxiv 2022.
- de Queiroz, C.B.; Santana, M.F. Prediction of the Secretomes of Endophytic and Nonendophytic Fungi Reveals Similarities in Host Plant Infection and Colonization Strategies. Mycologia 2020, 112, 491–503.
- van Dam, P.; Fokkens, L.; Schmidt, S.M.; Linmans, J.H.J.; Kistler, H.C.; Ma, L.-J.; Rep, M. Effector Profiles Distinguish Formae Speciales of Fusarium oxysporum. Environ. Microbiol. 2016, 18, 4087–4102.
- Lu, S.; Edwards, M.C. Genome-Wide Analysis of Small Secreted Cysteine-Rich Proteins Identifies Candidate Effector Proteins Potentially Involved in Fusarium graminearum–Wheat Interactions. Phytopathology 2016, 106, 166–176.
- Krijger, J.-J.; Thon, M.R.; Deising, H.B.; Wirsel, S.G. Compositions of Fungal Secretomes Indicate a Greater Impact of Phylogenetic History than Lifestyle Adaptation. BMC Genom. 2014, 15, 722.
- Tabima, J.F.; Grünwald, N.J. EffectR: An Expandable R Package to Predict Candidate RxLR and CRN Effectors in Oomycetes Using Motif Searches. Mol. Plant Microbe Interact. 2019, 32, 1067–1076.
- He, Q.; McLellan, H.; Boevink, P.C.; Birch, P.R.J. All Roads Lead to Susceptibility: The Many Modes of Action of Fungal and Oomycete Intracellular Effectors. Plant Commun. 2020, 1, 100050.
- Chen, L.; Wang, H.; Yang, J.; Yang, X.; Zhang, M.; Zhao, Z.; Fan, Y.; Wang, C.; Wang, J. Bioinformatics and Transcriptome Analysis of CFEM Proteins in Fusarium graminearum. J. Fungi 2021, 7, 871.
- Wang, D.; Zhang, D.-D.; Song, J.; Li, J.-J.; Wang, J.; Li, R.; Klosterman, S.J.; Kong, Z.-Q.; Lin, F.-Z.; Dai, X.-F.; et al. Verticillium dahliae CFEM Proteins Manipulate Host Immunity and Differentially Contribute to Virulence. BMC Biol. 2022, 20, 55.
- Haas, B.J.; Kamoun, S.; Zody, M.C.; Jiang, R.H.Y.; Handsaker, R.E.; Cano, L.M.; Grabherr, M.; Kodira, C.D.; Raffaele, S.; Torto-Alalibo, T.; et al. Genome Sequence and Analysis of the Irish Potato Famine Pathogen Phytophthora infestans. Nature 2009, 461, 393–398.
- Dölfors, F.; Holmquist, L.; Dixelius, C.; Tzelepis, G. A LysM Effector Protein from the Basidiomycete Rhizoctonia solani Contributes to Virulence through Suppression of Chitin-Triggered Immunity. Mol. Genet. Genom. 2019, 294, 1211–1218.
- Feldman, D.; Yarden, O.; Hadar, Y. Seeking the Roles for Fungal Small-Secreted Proteins in Affecting Saprophytic Lifestyles. Front. Microbiol. 2020, 11, 455.
- Franceschetti, M.; Maqbool, A.; Jiménez-Dalmaroni, M.J.; Pennington, H.G.; Kamoun, S.; Banfield, M.J. Effectors of Filamentous Plant Pathogens: Commonalities amid Diversity. Microbiol. Mol. Biol. Rev. 2017, 81, e00066-16.
- Kale, S.D. Oomycete and Fungal Effector Entry, a Microbial Trojan Horse. New Phytol. 2012, 193, 874–881.
- Ai, G.; Yang, K.; Ye, W.; Tian, Y.; Du, Y.; Zhu, H.; Li, T.; Xia, Q.; Shen, D.; Peng, H.; et al. Prediction and Characterization of RXLR Effectors in Pythium Species. Mol. Plant Microbe Interact. 2019, 33, 1046–1058.
- Deb, D.; Anderson, R.G.; How-Yew-Kin, T.; Tyler, B.M.; McDowell, J.M. Conserved RxLR Effectors from Oomycetes Hyaloperonospora arabidopsidis and Phytophthora sojae Suppress PAMP- and Effector-Triggered Immunity in Diverse Plants. Mol. Plant Microbe Interact. 2018, 31, 374–385.
- Stam, R.; Motion, G.B.; Martinez-Heredia, V.; Boevink, P.C.; Huitema, E. A Conserved Oomycete CRN Effector Targets Tomato TCP14-2 to Enhance Virulence. Mol. Plant Microbe Interact. 2021, 34, 309–318.
- Krogh, A.; Larsson, B.; von Heijne, G.; Sonnhammer, E.L.L. Predicting Transmembrane Protein Topology with a Hidden Markov Model: Application to Complete Genomes. J. Mol. Biol. 2001, 305, 567–580.
- Almagro Armenteros, J.J.; Tsirigos, K.D.; Sønderby, C.K.; Petersen, T.N.; Winther, O.; Brunak, S.; von Heijne, G.; Nielsen, H. SignalP 5.0 Improves Signal Peptide Predictions Using Deep Neural Networks. Nat. Biotechnol. 2019, 37, 420–423.
- Sperschneider, J.; Catanzariti, A.-M.; DeBoer, K.; Petre, B.; Gardiner, D.M.; Singh, K.B.; Dodds, P.N.; Taylor, J.M. LOCALIZER: Subcellular Localization Prediction of Both Plant and Effector Proteins in the Plant Cell. Sci. Rep. 2017, 7, 44598.
- Pierleoni, A.; Martelli, P.L.; Casadio, R. PredGPI: A GPI-Anchor Predictor. BMC Bioinform. 2008, 9, 392.
- Blum, M.; Chang, H.-Y.; Chuguransky, S.; Grego, T.; Kandasaamy, S.; Mitchell, A.; Nuka, G.; Paysan-Lafosse, T.; Qureshi, M.; Raj, S.; et al. The InterPro Protein Families and Domains Database: 20 Years On. Nucleic Acids Res. 2021, 49, D344–D354.
- Bailey, T.L.; Johnson, J.; Grant, C.E.; Noble, W.S. The MEME Suite. Nucleic Acids Res. 2015, 43, W39–W49.
- Adhikari, B.N.; Hamilton, J.P.; Zerillo, M.M.; Tisserat, N.; Lévesque, C.A.; Buell, C.R. Comparative Genomics Reveals Insight into Virulence Strategies of Plant Pathogenic Oomycetes. PLoS ONE 2013, 8, e75072.
- Liu, L.; Xu, L.; Jia, Q.; Pan, R.; Oelmüller, R.; Zhang, W.; Wu, C. Arms Race: Diverse Effector Proteins with Conserved Motifs. Plant Signal. Behav. 2019, 14, 1557008.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).