Comparative Genomic and Proteomic Analyses of Three Widespread Phytophthora Species: Phytophthora chlamydospora, Phytophthora gonapodyides and Phytophthora pseudosyringae

The Phytophthora genus includes some of the most devastating plant pathogens. Here we report draft genome sequences for three ubiquitous Phytophthora species—Phytophthora chlamydospora, Phytophthora gonapodyides, and Phytophthora pseudosyringae. Phytophthora pseudosyringae is an important forest pathogen that is abundant in Europe and North America. Phytophthora chlamydospora and Ph. gonapodyides are globally widespread species often associated with aquatic habitats. They are both regarded as opportunistic plant pathogens. The three sequenced genomes range in size from 45 Mb to 61 Mb. Similar to other oomycete species, tandem gene duplication appears to have played an important role in the expansion of effector arsenals. Comparative analysis of carbohydrate-active enzymes (CAZymes) across 44 oomycete genomes indicates that oomycete lifestyles may be linked to CAZyme repertoires. The mitochondrial genome sequence of each species was also determined, and their gene content and genome structure were compared. Using mass spectrometry, we characterised the extracellular proteome of each species and identified large numbers of proteins putatively involved in pathogenicity and osmotrophy. The mycelial proteome of each species was also characterised using mass spectrometry. In total, the expression of approximately 3000 genes per species was validated at the protein level. These genome resources will be valuable for future studies to understand the behaviour of these three widespread Phytophthora species.


Introduction
Phytophthora are filamentous, osmotrophic eukaryotes that morphologically resemble fungi but belong to the Oomycota class within the Stramenopila [1]. Phytophthora species include some of the most destructive plant pathogens, including devastating pathogens of important crops, ornamental plants, and forests. The number of identified Phytophthora species is rapidly increasing and currently includes more than 180 provisionally named species [2]. In recent years, the genomes of several Phytophthora species have been sequenced, which has increased our understanding of Phytophthora evolution and pathology [3][4][5][6][7][8][9].
As osmotrophs, Phytophthora species secrete numerous classes of hydrolytic enzymes including carbohydrate-active enzymes (CAZymes) and proteases that digest complex extracellular substrates

DNA Extraction and Sequencing
To prepare mycelium for DNA extraction, Phytophthora cultures were grown in 10% cV8 liquid medium for 5 days. Mycelia were harvested using Miracloth, washed with sterile distilled water, flash-frozen in liquid nitrogen, lyophilized and stored at −80 °C until used for DNA extraction. Lyophilized mycelium was ground to a fine powder with a mortar and pestle under liquid nitrogen. DNA was extracted by transferring 20-40 mg of ground mycelia to a tube containing 800 µL of extraction buffer (0.2 M Tris-HCl, 0.25 M NaCl, 25 mM EDTA and 0.5% SDS) and 2 µL Proteinase K (20 mg/mL; Qiagen, Redwood City, CA, USA). Samples were incubated at 55 °C for 30 min. Samples were then treated with 3 µL RNase A (10 mg/mL; Thermo Fisher Scientific, Waltham, MA, USA) and incubated at 37 °C for 30 min. 800 µL of 24:1 chloroform:isoamyl alcohol was added to samples, mixed by inversion and centrifuged for 10 min at 13,000× g. The upper phase was transferred to a new tube and the chloroform step was repeated. DNA was precipitated by the addition of 1/2 volume of 5 M ammonium acetate and 2 volumes of 100% ethanol, followed by overnight incubation at −20 °C. Precipitated DNA was pelleted by centrifugation for 15 min at 13,000× g. The DNA pellet was washed twice, first with 70% ethanol and then with 100% ethanol. DNA was air-dried and resuspended in 70 µL TE buffer (10 mM Tris-HCl, 0.5 mM EDTA). DNA purity was assessed using a Nanodrop spectrophotometer (Thermo Fisher Scientific) based on the 260/280 and 260/230 absorbance ratios. DNA concentration was determined using a Qubit fluorometer with the dsDNA BR kit (Invitrogen, Carlsbad, CA, USA). DNA quality was assessed via agarose gel electrophoresis on a 1% agarose gel. DNA library construction and paired-end sequencing were carried out by BGI Tech Solutions Co., Ltd. (Hong Kong, China) on the Illumina HiSeq X Ten platform. Sequenced reads were deposited on the NCBI Sequence Read Archive (accessions: SRR10849951 for Ph. chlamydospora, SRR10849950 for Ph. gonapodyides and SRR10849937 for Ph. pseudosyringae).

Gene Annotation
Gene models were predicted using BRAKER2 [58] with the ProtHint pipeline [59]. In brief, initial gene sets were predicted using GeneMark-ES (v4.46) [60]. Homologs of the initial gene predictions were identified using Diamond [61] searches against a database of 14 Peronosporales proteomes containing 267,298 proteins (Supplementary Table S1). Intron hints were generated by performing spliced alignments using Spaln2 [62] with the ProtHint pipeline [59]. GeneMark-EP [59] was trained with the intron hints and used to generate another gene set. The GeneMark-EP predictions, along with the intron hints, were then used to train Augustus [63] to generate the final gene sets. Completeness of the gene sets was assessed using BUSCO v3.
Genes were functionally annotated using InterProScan (v5.39-77.0) [64] and eggNOG-mapper (v2) [65]. Secreted proteins and transmembrane proteins were predicted using SignalP v3 [66] and TMHMM (v2.0) [67], respectively. SignalP v3 was implemented instead of earlier or later versions of the software as previous studies have found v3 to be more sensitive in predicting oomycete signal peptides [68]. For a protein to be considered secreted, it had to have a positive prediction from SignalP, an HMM S probability value ≥0.9, an NN Y max score of ≥0.5, an NN D score of ≥0.5, and no transmembrane domains downstream of the predicted signal peptide cleavage site. These criteria permitted comparisons to previous studies [12,68,69]. Proteins predicted to be secreted were submitted to ApoplastP [70] to predict if they are localised to the plant apoplast. CAZymes were annotated using dbCAN2 [71]. Homologs of experimentally verified effector proteins were identified by performing BLASTp searches [47] against the pathogen-host interaction database (PHI-Base Release 4.8) [72] with an E value cut-off of 1e−20.
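The secretion criteria above can be expressed as a simple filter. The following is a minimal sketch, not the pipeline used in this study: the `SignalPCall` container, its field names and the `is_secreted` function are hypothetical, and it assumes that the SignalP v3 and TMHMM outputs have already been parsed into plain numbers.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SignalPCall:
    """Hypothetical container for the SignalP v3 values named in the text."""
    positive: bool       # overall SignalP prediction
    hmm_s_prob: float    # HMM S probability
    nn_ymax: float       # NN Ymax score
    nn_d: float          # NN D score
    cleavage_site: int   # 1-based position of the last signal-peptide residue


def is_secreted(call: SignalPCall, tm_helix_starts: Sequence[int]) -> bool:
    """Apply the thresholds stated in the text to one protein.

    tm_helix_starts holds the 1-based start positions of predicted
    transmembrane helices (assumed to be parsed from TMHMM output).
    """
    if not call.positive:
        return False
    if call.hmm_s_prob < 0.9 or call.nn_ymax < 0.5 or call.nn_d < 0.5:
        return False
    # Reject proteins with any helix starting after the cleavage site.
    return all(start <= call.cleavage_site for start in tm_helix_starts)
```

A helix that overlaps the signal peptide itself is tolerated in this sketch, because signal peptides are often reported as transmembrane helices; only helices downstream of the cleavage site exclude a protein.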
Gene families were identified by performing all-versus-all BLASTp [47] searches with an E value cut-off of 1e−10, followed by Markov clustering using MCL [73] with an inflation value of 1.5. Tandemly duplicated genes were identified using BLASTp [47]. Tandem clusters were defined as two or more adjacent genes that hit each other in a BLASTp search with an E value cut-off of 1e−10 and highest-scoring pair (HSP) length greater than half the length of the shortest sequence. Enrichment analyses were performed using Fisher's exact test. Gene ontology enrichment analyses were performed using Fisher's exact test with Benjamini-Hochberg correction for multiple testing using GOATOOLS [74]. Corrected p-values < 0.05 were considered significant.
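The tandem cluster definition can be illustrated with a short sketch. It is an assumption-laden simplification rather than the code used here: the function names are hypothetical, each gene pair is represented by a single best BLASTp record (E value and HSP length) instead of separate reciprocal hits, and genes are assumed to be supplied in positional order along one scaffold.

```python
from typing import Dict, FrozenSet, List, Tuple

Hit = Tuple[float, int]  # (E value, HSP length in residues)


def passes_cutoffs(hit: Hit, len_a: int, len_b: int,
                   max_evalue: float = 1e-10) -> bool:
    """E value at or below the cut-off and HSP longer than half the shorter protein."""
    evalue, hsp_len = hit
    return evalue <= max_evalue and hsp_len > 0.5 * min(len_a, len_b)


def tandem_clusters(gene_order: List[str],
                    lengths: Dict[str, int],
                    hits: Dict[FrozenSet[str], Hit]) -> List[List[str]]:
    """Chain adjacent genes whose pairwise hit passes the cut-offs."""
    clusters: List[List[str]] = []
    current: List[str] = gene_order[:1]
    for prev, gene in zip(gene_order, gene_order[1:]):
        hit = hits.get(frozenset((prev, gene)))
        if hit is not None and passes_cutoffs(hit, lengths[prev], lengths[gene]):
            current.append(gene)
        else:
            if len(current) >= 2:
                clusters.append(current)
            current = [gene]
    if len(current) >= 2:
        clusters.append(current)
    return clusters
```

Only runs of two or more neighbouring genes are reported, matching the definition of a tandem cluster given above.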

Identification of Cytoplasmic Effectors
RxLRs were classified using four methods as in McGowan and Fitzpatrick (2017) [12]. (i) The Win method-proteins must contain a signal peptide with a predicted cleavage site within the first 30 amino acids and an RxLR motif within residues 30-60 [75]. (ii) HMM method-hidden Markov model (HMM) searches were performed with HMMER (v3.2.1) [76] against all proteins predicted to be secreted using the "cropped.hmm" HMM profile constructed by Whisson et al. (2007) [77]. Hits with a bit score >0 were retained. (iii) Regex method-proteins must contain a signal peptide between residues 10 to 40 and an RxLR motif within the following 100 residues followed by the EER motif within 40 residues downstream of the RxLR motif, allowing for replacements of E to D and R to K [77]. This search was performed using the regular expression "ˆ. {10 [3]. An E value cut-off of 1e −20 was applied. Secreted proteins that met at least one of these four criteria were considered to be putative RxLRs. Additionally, an HMM search was performed on all putative RxLRs to determine if they have WY-domains, using the HMM described by Boutemy et al. (2011) [78].
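As an illustration of criterion (i), the Win method can be sketched as below. This is a minimal example under stated assumptions, not the implementation used in this study: the function name is hypothetical, the cleavage site is assumed to come from a prior SignalP run, and "within residues 30-60" is interpreted here as the first residue of the RxLR motif falling at 1-based positions 30 to 60.

```python
import re
from typing import Optional

# Lookahead so that overlapping R.LR occurrences are all examined.
RXLR_MOTIF = re.compile(r"(?=R.LR)")


def passes_win_method(sequence: str, cleavage_site: Optional[int]) -> bool:
    """Win-style check: early cleavage site plus an RxLR motif in the 30-60 window.

    cleavage_site is the 1-based predicted cleavage position, or None when
    no signal peptide was predicted.
    """
    if cleavage_site is None or cleavage_site > 30:
        return False
    for match in RXLR_MOTIF.finditer(sequence):
        motif_start = match.start() + 1  # convert to 1-based coordinates
        if 30 <= motif_start <= 60:
            return True
    return False
```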
CRNs were identified using the regular expression "^.{30,70}L[FY]LA[RK]". Proteins with a positive hit from the regular expression search were aligned and an HMM model was constructed. The CRN HMM was then searched against the predicted proteomes using HMMER [76] and all proteins with a bit score >0 were considered the final set of putative CRNs.
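The first step of the CRN search, the regular expression scan, can be sketched as follows. Only the pattern itself is taken from the text; the function name and the dictionary input format are hypothetical, and the subsequent alignment and HMM steps are not shown.

```python
import re
from typing import Dict, List

# Pattern from the text: the L[FY]LA[RK] motif must begin 30 to 70
# residues from the N-terminus.
CRN_PATTERN = re.compile(r"^.{30,70}L[FY]LA[RK]")


def crn_regex_candidates(proteins: Dict[str, str]) -> List[str]:
    """Return IDs of proteins whose sequence matches the CRN pattern.

    proteins maps a protein ID to its amino acid sequence as a single
    string without line breaks.
    """
    return [pid for pid, seq in proteins.items() if CRN_PATTERN.search(seq)]
```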

Phylogenomics
A dataset of 33 Peronosporales genomes (Supplementary Table S1) was used for phylogenomic analysis. We also included Pythium ultimum as an outgroup. BUSCO analysis revealed 208 BUSCO families that are present and single copy in at least 90% of the species (i.e., at least 31 of the 34 species). Each BUSCO family was individually aligned with MUSCLE (v3.8.31) [79] and trimmed using trimAl (v1.4) [80] with the parameter "-automated1" to remove poorly aligned regions. Trimmed alignments were concatenated together resulting in a final supermatrix alignment of 106,315 amino acids. Maximum-likelihood (ML) phylogenetic reconstruction was performed using IQ-TREE (v1.6.12) [81] with the JTT+F+R5 model, which was the best fit model according to ModelFinder [82], and 100 bootstrap replicates were undertaken to infer branch support values. Bayesian analysis was also performed using PhyloBayes MPI (v1.8) [83] with the CAT model. Two independent chains were run for 10,000 cycles and convergence was assessed using bpcomp and tracecomp. A consensus Bayesian phylogeny was generated with a burn-in of 10%. The phylogeny was visualised and annotated using the Interactive Tree of Life (iTOL) [84].
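The 90% occupancy threshold used to select BUSCO families works out as 0.9 × 34 = 30.6, which rounds up to 31 species. A minimal sketch of that selection step is given below; the function names and the copy-number dictionary are hypothetical stand-ins for parsed BUSCO output, not part of the BUSCO software.

```python
import math
from typing import Dict, List


def min_species_required(n_species: int, min_fraction: float = 0.9) -> int:
    """Smallest number of species that satisfies the occupancy fraction."""
    return math.ceil(min_fraction * n_species)


def select_families(copies: Dict[str, Dict[str, int]],
                    n_species: int,
                    min_fraction: float = 0.9) -> List[str]:
    """Keep families that are single copy in enough species.

    copies maps a BUSCO family ID to {species: copy number}; species in
    which the family is missing are simply absent from the inner dict.
    """
    needed = min_species_required(n_species, min_fraction)
    return sorted(
        family
        for family, per_species in copies.items()
        if sum(1 for n in per_species.values() if n == 1) >= needed
    )
```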

Phylostratigraphy
Phylostratigraphic maps were determined for each species following previously published methods [85,86]. The database constructed by Drost et al. (2015) was retrieved which contains amino acid sequences from 4557 species, including 1787 eukaryotes (883 animals, 364 plants, 344 fungi and 193 other eukaryotes) and 2770 prokaryotes (2511 bacteria and 259 archaea) [85]. Protein sequences from the three Phytophthora species sequenced in this study and any publicly available oomycete proteomes were added to the database. The final database comprised 18,084,866 proteins, including 578,493 proteins from 38 oomycete genomes. Each Phytophthora protein was then searched against this database using BLASTp [47]. Proteins were assigned to the oldest phylostratum that contained at least one BLAST hit with an E value cut-off < 1e−5. Genes that did not have a BLAST hit to any other species were considered species-specific (orphans).
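The assignment rule can be sketched in a few lines. This is an illustrative example only: the function name, the integer ranking of phylostrata (1 = oldest, with the query species occupying the youngest rank) and the hit format are assumptions introduced here, not details of the published method.

```python
from typing import Dict, List, Tuple


def assign_phylostratum(hits: List[Tuple[str, float]],
                        stratum_of: Dict[str, int],
                        query_species: str,
                        max_evalue: float = 1e-5) -> int:
    """Return the rank of the oldest phylostratum with a qualifying hit.

    hits is a list of (subject species, E value) pairs for one query
    protein. stratum_of maps each species to a phylostratum rank, where
    1 is the oldest stratum. A protein with no qualifying hit outside
    its own species keeps the rank of the query species (an orphan).
    """
    own_rank = stratum_of[query_species]
    ranks = [
        stratum_of[species]
        for species, evalue in hits
        if species != query_species and evalue < max_evalue
    ]
    return min(ranks + [own_rank])
```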

Culturing Conditions and Extraction of Phytophthora Extracellular Proteins
Petri dishes containing 15 mL of liquid medium (either 10% V8 broth or 10% cV8 broth) were inoculated with a 10 mm agar plug of Phytophthora mycelium cut from the edge of a growing Phytophthora colony. Cultures were incubated in the dark, non-shaking for 10 days at their optimum temperatures (20 °C for Ph. pseudosyringae and 25 °C for Ph. chlamydospora and Ph. gonapodyides). Spent growth medium was harvested using a syringe without disturbing the mycelium. Supernatant from four petri dishes was pooled to make up one replicate. Collected supernatant was passed through a 0.2 µm syringe filter, frozen overnight at −20 °C and lyophilized. Lyophilized supernatant was resuspended in minimal volumes of PBS, desalted and concentrated using Amicon Ultra centrifugal filters (Millipore, Billerica, MA, USA) with a 3 kDa cut-off. Samples were clarified by centrifugation at 12,000× g for 5 min and brought to 15% (v/v) trichloroacetic acid (TCA) using 100% TCA. Precipitated proteins were washed twice with ice-cold acetone. Dried protein pellets were resuspended in 6 M Urea, 2 M Thiourea and 0.1 M Tris-HCl pH 8.0. Protein concentration was determined using a Qubit fluorometer (Invitrogen).

Culturing Conditions and Extraction of Phytophthora Mycelial Proteins
Phytophthora mycelium was cultured under three growth conditions. (i) Normal-cultures were grown for ten days at their optimum temperatures. (ii) Heat-cultures were grown for 7 days at their optimum temperatures, followed by incubation at 30 °C for three days. (iii) Oxidative stress-cultures were grown for ten days at their optimum temperatures, then exposed to 1 mM H2O2 for three hours. All cultures were grown in 50 mL of 10% cV8, non-shaking. Mycelia were harvested using Miracloth, washed with sterile distilled water, flash-frozen in liquid nitrogen and stored at −80 °C until used for protein extraction.
To extract proteins, mycelium was ground to a fine powder with a mortar and pestle under liquid nitrogen. 200-300 mg of ground mycelium was resuspended in 400 µL of lysis buffer followed by sonication (Bandelin Sonopuls HD2200 sonicator, Cycle 6, Berlin, Germany, 3 × 10 s, Power 20%). Protein concentration was determined using a Qubit fluorometer (Invitrogen). Protein lysates (0.25 mg/mL) were incubated at 95 °C for 5 min.

Protein Digestion and LC-MS/MS Identification of Phytophthora Proteins
Three independent biological replicates were analysed for each condition. Proteins were reduced and alkylated prior to overnight trypsin digestion as described previously [87,88]. Digestion was terminated by the addition of 1 µL of 100% trifluoroacetic acid (TFA). Sample clean-up was performed using C18 ZipTips® (Millipore), following the manufacturer's instructions. Shotgun proteomics was performed using an Ultimate 3000 RSLC from Dionex, coupled to a Thermo Scientific Q-Exactive mass spectrometer. Peptide mixtures were separated on a 50 cm EASY-Spray PepMap C18 column with 75 µm diameter (2 µm particle size) using a 10-40% B gradient (A: 0.1% (v/v) formic acid, 3% (v/v) acetonitrile; B: 0.1% (v/v) formic acid, 80% (v/v) acetonitrile). Data were acquired for 105 min, at 70,000 resolution for MS and a Top 15 method for MS2 collection.
Protein identification from the data was performed using the Andromeda search engine [89] in MaxQuant [90] with the predicted proteomes for each species as a search database. To account for possible protein contamination from V8 juice medium, we appended the tomato proteome to the default MaxQuant contaminants database. Search parameters were as described in Delgado et al. (2019) [91]. Identified protein groups were filtered using Perseus [92], to remove protein groups that were identified only by site, or had hits to either the contaminants database or the reverse database. Proteins were considered present in a condition if they were identified by 2 or more peptides and detected in at least 2 out of 3 replicates. Proteins were considered unique to a condition if they were not detected in any replicate of any other condition.
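The presence and uniqueness rules can be written out as a short sketch. It is illustrative only and not the Perseus workflow used here: the function names are hypothetical, the peptide count is assumed to be the total number of peptides supporting the protein group, and detection in a replicate is assumed to be already reduced to a boolean.

```python
from typing import Dict, Sequence


def is_present(n_peptides: int, detected: Sequence[bool],
               min_peptides: int = 2, min_replicates: int = 2) -> bool:
    """Present if supported by enough peptides and seen in enough replicates."""
    return n_peptides >= min_peptides and sum(detected) >= min_replicates


def is_unique(condition: str, n_peptides: int,
              detections: Dict[str, Sequence[bool]]) -> bool:
    """Unique if present in one condition and undetected in every other.

    detections maps each condition name to per-replicate detection flags.
    """
    if not is_present(n_peptides, detections[condition]):
        return False
    return not any(
        any(replicates)
        for name, replicates in detections.items()
        if name != condition
    )
```

Note that uniqueness is stricter than absence: a single detection in any replicate of another condition is enough to disqualify a protein, even though that one detection would not make it "present" there.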

Genome Sequencing and Assembly
Paired-end Illumina sequencing generated approximately 5.2 Gb of sequencing data for each of the three species. Genome sizes and heterozygosity levels were estimated based on K-mer analysis of sequence reads using Jellyfish [48] and GenomeScope [49], which estimated genome sizes of 51.1 Mb, 65.2 Mb and 51.0 Mb (Table 2) for Ph. chlamydospora, Ph. gonapodyides, and Ph. pseudosyringae, respectively. The heterozygosity of Ph. gonapodyides was estimated to be 1.88% (Table 2), which was much higher than that of Ph. chlamydospora (0.68%) and Ph. pseudosyringae (0.15%) (Table 2), and high compared to other oomycetes, which typically have heterozygosity levels less than 1% [93]. De novo genome assembly using SPAdes [50] generated draft genome assemblies with assembly sizes of 45.3 Mb for Ph. chlamydospora, 61.1 Mb for Ph. gonapodyides and 47.9 Mb for Ph. pseudosyringae (Table 2), which compare favourably to the genome sizes estimated by GenomeScope. The Ph. gonapodyides assembly was much more fragmented (16,449 scaffolds) than those of Ph. chlamydospora (4077 scaffolds) and Ph. pseudosyringae (3627 scaffolds) (Table 2). We expect that this is due to higher levels of heterozygosity found in the Ph. gonapodyides genome assembly and due to expansions of repetitive elements (Supplementary Table S2). BUSCO analysis [54] was performed using the Alveolata-Stramenopiles dataset which contains 234 target BUSCO proteins that are expected to be present and single-copy. BUSCO results suggest that the assemblies are of high gene space completeness, with BUSCO completeness values of 97.8% for Ph. chlamydospora, 87.2% for Ph. gonapodyides, and 94.1% for Ph. pseudosyringae (Table 2 and Figure 1). Furthermore, the low number of duplicated BUSCOs suggests that haplotypes were correctly collapsed (Figure 1). De novo repeat annotation using RepeatModeler2 [55] and RepeatMasker led to the identification of 4.1 Mb (9.0%) of repetitive elements in Ph. chlamydospora, 9.8 Mb (16.1%) in Ph. gonapodyides and 6.4 Mb (13.3%) in Ph. pseudosyringae (Table 2). The majority of identified repeats were unclassified or were classified as long terminal repeat (LTR) retroelements (Supplementary Table S2). Overall, the proportions of repetitive elements identified are similar to those of Ph. parasitica (8%), Ph. plurivora (15%), Ph. cactorum (18%) and Ph. capsici (21%) [5,6] but less than those of Ph. sojae (31%), Ph. ramorum (54%) and Ph. infestans (74%) [3,94,95].

Phylogenomics Analysis
A phylogenomic analysis was carried out to determine the phylogenetic relationships of the three Phytophthora species using the genome sequences of 33 Peronosporales species and Py. ultimum as an outgroup (Supplementary Table S1). A supermatrix alignment was constructed from 208 highly conserved BUSCO families. Maximum Likelihood (ML) and Bayesian phylogenetic reconstruction was undertaken on the supermatrix. Both ML and Bayesian methods resulted in phylogenies with identical topologies and most nodes had maximum Bootstrap Support (BP) or maximum Bayesian Posterior Probabilities (BPP) (Figure 2). All species were placed into their expected clades, and overall the placement of each species is in broad agreement with previous studies [93,96,97]. We also recovered the polyphyly of the downy mildews (Figure 2). Interestingly, the branch lengths of the downy mildew species are longer relative to the other Peronosporales species presented, indicating higher levels of genetic divergence (Figure 2). Ph. pseudosyringae was placed as sister to Ph. pluvialis, which is also a clade 3 species, with maximum support from both ML and Bayesian methods (Figure 2). Phytophthora gonapodyides was placed as sister to Ph. pinifolia, to the exclusion of Ph. chlamydospora, with 87% BP from the ML phylogeny and maximum support from the Bayesian phylogeny (Figure 2). We note that this disagrees with previous phylogenies based on ITS sequences [31] and four concatenated mitochondrial loci [98], both of which group Ph. chlamydospora and Ph. pinifolia as being more closely related. Some markers, including the ITS sequence, are known to be identical or nearly identical between members of Phytophthora clade 6b, which Ph. chlamydospora, Ph. gonapodyides, and Ph. pinifolia belong to [99]. However, due to the highly conserved nature of these markers, they may not reflect the true phylogenetic relationships between species. Furthermore, phylogenomic approaches are generally considered to be more informative than single gene phylogenies or phylogenies derived from small numbers of genes, as they utilise substantially greater amounts of phylogenetically informative genomic data [100].
Our phylogeny groups Ph. cinnamomi with Ph. sojae and Ph. pisi, to the exclusion of Ph. fragariae and Ph. rubi, with 92% BP from the ML phylogeny and maximum support in the Bayesian phylogeny (Figure 2). This is in disagreement with a phylogeny based on seven nuclear genetic markers which groups Ph. sojae, Ph. pisi, Ph. fragariae, and Ph. rubi together to the exclusion of Ph. cinnamomi [2]. However, our phylogeny is in agreement with two separate studies based on seven nuclear loci which group Ph. cinnamomi more closely related to Ph. sojae [97,98]. We anticipate that differences in topology are due to the inclusion or exclusion of different species in datasets.

Phytophthora Mitochondrial Genomes
Mitochondrial genomes were assembled and circularised using NOVOPlasty, resulting in mitochondrial genome assemblies of 38.33 Kb for Ph. chlamydospora, 43.97 Kb for Ph. gonapodyides and 39.14 Kb for Ph. pseudosyringae (Figure 3), which are similar in size to previously sequenced Phytophthora mitochondrial genomes [101]. The overall mitochondrial GC content is also highly similar to other Phytophthora species, with 22.5% for Ph. chlamydospora, 23.7% for Ph. gonapodyides and 22.0% for Ph. pseudosyringae. We did not detect any inverted repeats. The gene content of each mitochondrion is similar to that of other Phytophthora mitochondrial genomes, including 35 known protein-coding genes (18 respiratory chain proteins, 16 ribosomal proteins, and the import protein secY), two ribosomal RNA genes (rns and rnl) and 25 (Ph. chlamydospora and Ph. pseudosyringae) or 26 (Ph. gonapodyides) transfer RNA genes that specify 19 amino acids (Figure 3 and Supplementary Table S4). As with other oomycetes, the tRNA gene for threonine was not located in the mitochondrial genomes of the three species presented here. Unlike animals and fungi, oomycete mitochondria use the standard genetic code [102].
Figure 3. [...] pseudosyringae. The inner ring shows % GC content. Arrows indicate relative transcriptional orientation. The outer ring shows the predicted genes which are encoded on both strands. All three species are missing the tRNA gene for threonine.
Nucleotide alignment of the mitochondrial assemblies revealed that the Ph. chlamydospora and Ph. gonapodyides mitochondria are collinear (Figure 3 and Supplementary Figure S1). Two inversions are present in the Ph. pseudosyringae mitochondrial genome relative to Ph. chlamydospora and Ph. gonapodyides (Figure 3 and Supplementary Figure S1). We identified a number of open reading frames (ORFs) that are conserved between all three mitochondrial genomes and other Phytophthora mitochondrial genomes, including orf64, orf100, orf142 and orf217 (Figure 3 and Supplementary Table S4). The functions of these ORFs are unknown. Phytophthora pseudosyringae also shares an additional ORF that is homologous with Ph. sojae orf206 [103] (Figure 3). A large number of unique unannotated ORFs were identified in Ph. gonapodyides (ORF4, ORF13, ORF14, ORF15, ORF16, ORF23, ORF25, ORF40, ORF44, and ORF50), compared to two in Ph. pseudosyringae (ORF8 and ORF25) and only one in Ph. chlamydospora (ORF26) (Figure 3).

Bioinformatic Characterisation of Phytophthora Effector Arsenals
Bioinformatic annotation of Phytophthora secretomes was performed using SignalP. This analysis predicted 1140, 1291 and 1131 secreted proteins for Ph. chlamydospora, Ph. gonapodyides, and Ph. pseudosyringae respectively (Table 3), accounting for 6.38%, 5.53% and 6.49% of their total genome complement, similar to the number of secreted proteins reported for other Phytophthora genomes [12]. ApoplastP predicted that 47.1% to 48.8% of putative secreted proteins localise to the plant apoplast (Table 3). Approximately 20% of all putatively secreted proteins are homologous to experimentally verified effectors in PHI-Base (Table 3). InterProScan [64] was used to annotate putative effector proteins based on conserved Pfam domains known to be implicated in plant pathogenicity. Some of these effectors are discussed below.
Elicitins are secreted proteins that bind sterols and lipids, allowing Phytophthora spp. to overcome their inability to synthesise sterols by sequestering sterols from their hosts or environments [104]. Elicitins also act as microbe-associated molecular patterns (MAMPs), triggering host cell death upon recognition by the host plant. Elicitin proteins are usually members of large multi-copy gene families in oomycete genomes [105]. Here, we have identified between 45 and 59 proteins with an elicitin domain (PF00964) for each Phytophthora species, of which approximately 78% are predicted to be secreted (Table 3). In contrast to elicitins, necrosis-inducing proteins (NLPs) have a broad taxonomic distribution, having been identified in bacteria, fungi and oomycetes [106]. NLPs are known to induce ethylene accumulation and trigger necrosis in dicots [106]. Here, we identified 25, 33 and 22 proteins containing the NLP domain (PF05630) in Ph. chlamydospora, Ph. gonapodyides and Ph. pseudosyringae, respectively, of which 19 (76%), 22 (67%) and 19 (86%) were predicted to be secreted (Table 3). Multiple sequence alignment (not shown) of identified NLPs confirms they are all type 1 NLPs, characterised by the presence of two conserved cysteine residues. Other effectors of interest include the PcF phytotoxins, which are small cysteine-rich proteins that induce plant cell necrosis [107]. PcF phytotoxins appear to be unique to Peronosporales species based on available genomic data. We identified only one protein with the PcF phytotoxin domain (PF09461) in each of the three genomes; however, only the Ph. gonapodyides and Ph. pseudosyringae copies were predicted to be secreted (Table 3). Transglutaminases are proteins that strengthen structures such as cell walls by facilitating cross-linking between glutamine and lysine residues, conferring resistance to proteolysis [108]. Transglutaminases, such as Ph. sojae GP42, can elicit a host immune response upon recognition [109]. We identified 14, 16, and 17 proteins containing the transglutaminase elicitor domain (PF16683) in Ph. chlamydospora, Ph. gonapodyides and Ph. pseudosyringae, respectively (Table 3). Each of the three genomes encodes 11 transglutaminase elicitors that are predicted to be secreted (Table 3), which is similar to the number predicted to be secreted by Ph. cactorum (15) [6].
The PAN/Apple domain (PF00024, PF14295) is enriched in the secretomes of most oomycete species [12]. This domain is associated with carbohydrate-binding modules, for example, cellulose-binding elicitor lectins (CBEL). Knockdown of a Ph. parasitica CBEL with two PAN/Apple domains affected its ability to adhere to cellulosic substrates, such as plant cell walls [110]. We identified 25 proteins with PAN/Apple domains in Ph. chlamydospora, 32 in Ph. gonapodyides and 20 in Ph. pseudosyringae (Table 3), of which 21 (84%), 20 (62.5%) and 15 (75%) were predicted to be secreted (Table 3). 47% of all identified PAN/Apple domain-containing proteins had two or more PAN/Apple domains. Proteins belonging to the cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) family (PF00188) are also enriched in most oomycete secretomes [12]. Saccharomyces cerevisiae CAP family proteins function in sterol binding and export, and are linked to fungal virulence [111]. However, little is known about their involvement in oomycete infection. We identified 34 CAP proteins in Ph. chlamydospora, and Ph. gonapodyides individually and 31 in Ph. pseudosyringae (Table 3), of which 22 (64.7%), 25 (73.5%) and 22 (71.0%) were predicted to be secreted.
Table 3 (note). Putative effectors were annotated based on Pfam domains or manually curated (CRNs and RxLRs). Numbers in brackets represent proteins belonging to the predicted secretome.
We also annotated a large number of proteins putatively involved in the breakdown or binding of exogenous carbohydrates, such as plant cell walls, including cellulases, cellulose-binding proteins, cutinases, lytic polysaccharide mono-oxygenases, and pectin modifying enzymes (Table 3). Pectin modifying enzymes were the most numerous and include pectate lyases, pectinesterases and pectin acetylesterases. Pectate lyases cleave pectin, a major component of plant cell walls. Pectinesterases catalyse the de-esterification of pectin, while pectin acetylesterases deacetylate pectin, making the pectin backbone more accessible to pectate lyases [112,113]. In total, we identified 47 pectin modifying enzymes in Ph. chlamydospora, 38 in Ph. gonapodyides, and 52 in Ph. pseudosyringae (Table 3), of which 33 (70.2%), 22 (57.9%) and 37 (71.2%) are predicted to be secreted, suggesting a putative role in the breakdown of plant cells.
RxLR effectors are named for the highly conserved RxLR motif found in their N-terminus, which acts as a trafficking motif, signalling the effectors to be delivered into plant cells [77]. The RxLR motif is followed by an EER motif in many RxLR effectors [77]. RxLR C-terminal domains are typically highly divergent, although many contain one or more "WY" domains [114]. Many RxLRs are expressed early in infection and play roles in the suppression of host immune responses [115]. However, the function of most RxLRs is unknown and many have been shown to localise to diverse subcellular locations within plant host cells [116]. RxLRs were identified using a combination of four independent criteria (see methods). For Ph. chlamydospora, 93 proteins had a hit according to the Win method, 68 with the Regex method, 64 with the HMM method and 81 with the homology method (Supplementary Table S5). In total, across the four methods, 132 unique putative RxLRs were identified for Ph. chlamydospora (Table 3). [...] In total, across the four methods, 186 unique proteins were annotated as putative RxLRs (Table 3), of which 61 had hits from the WY-fold HMM (Supplementary Table S5). The number of putative RxLRs identified in Ph. pseudosyringae is similar to its clade 3 relative Ph. pluvialis (181) [8,12].
CRNs are modular proteins that contain highly conserved N-terminal domains containing a signal peptide and an "LxLFLAK" motif that mediates translocation into host cells [117]. CRNs are named after their crinkling and necrosis-inducing activity in leaves [118]. CRNs were identified using a combination of regular expression searches and HMM searches. The number of CRNs identified for each species is similar. In total, 77 putative CRNs were identified in Ph. chlamydospora, 80 in Ph. gonapodyides and 90 in Ph. pseudosyringae (Table 3). Similar to what has been observed for other oomycetes [5,6,12], only a small proportion of identified CRNs have a positive SignalP prediction, with 28 (36.4%) in Ph. chlamydospora, 18 (22.5%) in Ph. gonapodyides and 37 (41.1%) in Ph. pseudosyringae (Table 3).
We extended this analysis by comparing the CAZyme repertoires of 44 oomycete species with different host ranges and broad lifestyles (Figure 4 and Supplementary Table S1). In agreement with previous studies [119], our results show that Phytophthora species tend to have larger numbers of CAZymes compared to other oomycete taxa (Figure 4). Across the 24 Phytophthora species examined, the average number of CAZymes is 450. On average, each Phytophthora genome encodes 215 glycoside hydrolases, 111 glycoside transferases, 38 polysaccharide lyases, 34 proteins involved in auxiliary activities, 46 carbohydrate esterases and 9 proteins with carbohydrate-binding modules. The number of CAZymes identified in Ph. pseudosyringae is close to the Phytophthora average (Figure 4). The two clade 6 aquatic Phytophthora species have an expanded repertoire of CAZymes relative to the Phytophthora average, with 483 CAZymes in Ph. chlamydospora and 487 CAZymes in Ph. gonapodyides (Figure 4). In particular, they have a higher than average number of glycoside hydrolases, with 234 in Ph. chlamydospora and 243 in Ph. gonapodyides (Figure 4 and Table 4). All three genomes have a higher than average number of glycoside transferases, with 125 in Ph. chlamydospora, 118 in Ph. gonapodyides and 120 in Ph. pseudosyringae (Figure 4 and Table 4). Ph. gonapodyides also has a higher than average number of proteins involved in auxiliary activities, with 41 proteins (Figure 4 and Table 4). Interestingly, principal component analysis (PCA) of glycoside hydrolase family copy numbers clusters species with similar lifestyles together (Figure 5). All downy mildew species (Albugo, Bremia, Hyaloperonospora, Peronospora, and Plasmopara) cluster tightly together (Figure 5), despite evolving obligate biotrophism independently (Figure 2) [96,120]. Plant pathogenic Pythium species also cluster together, although less tightly (Figure 5).
The mycoparasite Pythium oligandrum is placed distantly from all other Pythium species (Figure 5), suggesting it may have a specialised repertoire of glycoside hydrolases involved in infection of fungi and oomycetes [121]. The animal pathogens Aphanomyces astaci, Saprolegnia diclina and Saprolegnia parasitica are clustered together (Figure 5). Aphanomyces invadans is clustered with the mammalian pathogen Pythium insidiosum (Figure 5). The intermediate genera, Pilasporangium and Phytopythium, are clustered together between Phytophthora and Pythium (Figure 5). All Phytophthora species are clustered together, over a wide area (Figure 5). Phytophthora chlamydospora and Ph. gonapodyides are clustered together but relatively distant from all other Phytophthora species (Figure 5). This suggests that these opportunistic aquatic Phytophthora species may have distinctive glycoside hydrolase arsenals. Furthermore, they are placed distantly from their closest relative in the dataset, Ph. pinifolia (Figures 2 and 5). Examining individual glycoside hydrolase families, both Ph. chlamydospora and Ph. gonapodyides have expansions of glycoside hydrolase families 1, 3, 5, 10, 13 and 43 (Supplementary Figure S2). These results suggest that oomycete lifestyles may be linked to their CAZyme repertoires, in particular glycoside hydrolase families. These findings are similar to recent analyses showing clustering of oomycete species with similar lifestyles based on metabolic networks [122,123].
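As an illustration of how PCA places species along an axis of variation in glycoside hydrolase copy numbers, the following minimal sketch computes only the first principal component by power iteration on the covariance matrix. It is a dependency-free approximation for exposition; the function name and the toy counts are assumptions, and the analysis reported here was not necessarily performed this way.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (axis, scores) for the first principal component of `rows`.

    rows: one list per species holding copy numbers per glycoside
    hydrolase family (all lists the same length).
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix between families
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    axis = [1.0] * m  # starting vector for power iteration
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            raise ValueError("no variance along the starting direction")
        axis = [v / norm for v in nxt]
    # Project each centred species profile onto the principal axis
    scores = [sum(c[j] * axis[j] for j in range(m)) for c in centred]
    return axis, scores
```

With toy counts such as `[[10, 2], [12, 3], [30, 20], [32, 21]]`, the first two and last two rows receive scores of opposite sign, which is the kind of separation that a PCA plot displays in two dimensions.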

Tandemly Duplicated Genes
Tandemly duplicated genes are duplicated genes that are located adjacent to each other in the genome. Analysis of tandemly duplicated genes using BLASTp led to the identification of 2513 (14.1%) tandemly duplicated genes in Ph. chlamydospora, 1863 (8.0%) in Ph. gonapodyides and 2225 (12.8%) in Ph. pseudosyringae (Table 5). The tandemly duplicated genes are located in 833 to 979 tandem clusters (Table 5 and Supplementary Table S7). On average, each cluster has between 2.24 and 2.57 tandemly duplicated genes (Table 5). Fewer tandemly duplicated genes were identified in Ph. gonapodyides compared to Ph. chlamydospora and Ph. pseudosyringae (Table 5); however, this analysis may have been limited by the poor assembly contiguity of Ph. gonapodyides. Overall, counts of tandemly duplicated genes are similar to those observed in other Phytophthora genomes [124]. Proteins predicted to be secreted are significantly overrepresented in tandemly duplicated clusters (p < 0.05, Fisher's exact test), with 354 Ph. chlamydospora secreted proteins found in tandem clusters, 265 from Ph. gonapodyides and 328 from Ph. pseudosyringae (Table 5). This adds further evidence that tandem gene duplication has played a role in the expansion of oomycete secretomes [124]. Our results show that putative effector proteins are numerous in tandem clusters. For example, in Ph. chlamydospora, 33 elicitins out of a total of 57 (58%) are located in 12 tandem clusters (Supplementary Table S7). Clusters of elicitin genes have also been reported in other Phytophthora species [6,105]. Similarly, 20 out of a total of 34 CAP proteins (59%) are found in eight tandem clusters (Supplementary Table S7). Tandem gene duplication has also played a role in the expansion of Phytophthora CAZyme arsenals. For example, in Ph. chlamydospora, 175 out of a total of 483 proteins (36%) annotated as putative CAZymes (Table 4) are found in tandem clusters. We observed similar trends in all three genome assemblies.
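The overrepresentation test can be sketched as a one-sided Fisher's exact test on a 2x2 table of secreted versus non-secreted proteins, inside versus outside tandem clusters. The sidedness, the function name and the table layout are assumptions made for illustration; the text states only that Fisher's exact test was used.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test, P(X >= a), for the table [[a, b], [c, d]].

    a: secreted, in a tandem cluster      b: secreted, not in a cluster
    c: not secreted, in a tandem cluster  d: not secreted, not in a cluster
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / denom
```

For the small table `[[3, 1], [1, 3]]` this returns 17/70 (about 0.243); the genome-scale counts in Table 5 would be passed in the same way.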

LC-MS/MS Characterisation of Phytophthora Extracellular Proteomes
Here we used a mass spectrometry-based approach to characterise the in vivo secretomes and extracellular proteomes of Ph. chlamydospora, Ph. gonapodyides and Ph. pseudosyringae. Each species was cultured under two conditions with different media types: 10% V8 juice or 10% cV8 juice. Extracellular medium was harvested 10 days after inoculation. To minimise the possibility of hyphal lysis, extracellular medium was carefully harvested using a syringe without disrupting the Phytophthora hyphae. Proteins were extracted from extracellular medium and subjected to LC-MS/MS to identify extracellular proteins. Proteins were identified by searching spectra against the predicted proteome of each species. Protein groups (a group of indistinguishable proteins based on identified peptides) were considered present in a condition if they were identified based on at least two peptides and present in at least two out of three independent biological replicates. Protein groups were considered unique to a condition if they were not detected in any replicate of other conditions. Examining Ph. chlamydospora, 302 protein groups (327 proteins) were identified in the cV8 samples and 251 protein groups (274 proteins) were identified in the V8 samples (Supplementary Table S8). Twenty protein groups (20 proteins) were unique to the cV8 samples and 14 protein groups (17 proteins) were unique to the V8 samples (Supplementary Table S8). In total across the two conditions, 321 protein groups (351 proteins) were identified (Table 6 and Supplementary Table S8), of which 149 proteins (42%) overlap with the predicted secretome of Ph. chlamydospora. Reducing the strictness of the SignalP 3 analysis by considering all positive HMM predictions without applying additional constraints, 196 proteins (56%) are predicted to be secreted. This compares favourably to Ph. plurivora, where 60% of extracellular proteins identified by LC-MS/MS contained predicted N-terminal signal peptides [16].
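The presence and uniqueness rules for protein groups can be expressed as a short sketch. It assumes that peptide counts are available per replicate and that the two-peptide threshold is applied within each replicate; both the data layout and the function names are illustrative assumptions.

```python
def present(replicates, min_peptides=2, min_replicates=2):
    """replicates: peptide counts for one protein group, one value per replicate."""
    return sum(n >= min_peptides for n in replicates) >= min_replicates


def unique_to(condition, counts):
    """counts: dict mapping condition name -> list of peptide counts per replicate.

    A group is unique to `condition` if it is present there and was not
    detected (zero peptides) in any replicate of any other condition.
    """
    others = (n for name, reps in counts.items() if name != condition for n in reps)
    return present(counts[condition]) and all(n == 0 for n in others)
```

For example, a group with counts `{"V8": [0, 0, 0], "cV8": [3, 2, 0]}` is present in and unique to cV8, whereas a single peptide in any V8 replicate would remove the uniqueness call without making the group present in V8.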
The proteins lacking signal peptides may be present in the extracellular medium due to contamination with intracellular proteins caused by hyphal lysis during protein extraction. Additionally, these proteins may have legitimate signal peptides that cannot be detected due to inaccurate gene annotation, i.e., gene models with a truncated N-terminus will lack N-terminal signal peptides [15]. It is also possible that they are leaderless secretory proteins (LSPs), which lack signal peptides and enter non-classical secretory pathways. Extracellular proteins that lack signal peptides were submitted to SecretomeP 2.0, an ab initio predictor of non-classically secreted proteins [125]. A total of 78 proteins (22.2%) had a SecretomeP NN-score greater than 0.5, suggesting that they may be non-classically secreted proteins. It is important to note that SecretomeP has only been trained on mammalian LSPs; therefore, its accuracy at predicting oomycete LSPs is unclear. eggNOG assigned 298 extracellular proteins (84.9%) to one or more functional COG groups. The most numerous functional categories were carbohydrate transport and metabolism (96 proteins, 32.2%), function unknown (63 proteins, 21.1%), posttranslational modification, protein turnover and chaperones (33 proteins, 11.1%), energy production and conversion (19 proteins, 6.4%), translation, ribosomal structure and biogenesis (18 proteins, 6.0%), amino acid transport and metabolism (17 proteins, 5.7%), and lipid transport and metabolism (17 proteins, 5.7%) (Figure 6A). The high number of extracellular proteins involved in transport and metabolism is to be expected for osmotrophs, which obtain their nutrients externally from the environment or from their hosts. The proportion of extracellular proteins classified as carbohydrate transport and metabolism (96 proteins; 32.2%) is particularly enriched relative to the total genome (746 proteins; 6.2%) (Figure 6A).
These proteins may be involved in the breakdown of host plant cells as well as in the acquisition and uptake of nutrients. No extracellular proteins (for any species) were annotated as being involved in "cell cycle control, cell division, chromosome partitioning", "transcription", "replication, recombination and repair", "cell motility", "defense mechanisms", "extracellular structures" or "nuclear structure" (Figure 6A-C). This suggests that hyphal lysis is unlikely to have occurred during protein extraction, as these annotations are associated with intracellular processes or cell structures. A large number of known effector families were also detected, including proteins with PAN/Apple domains (10), transglutaminase elicitors (6), elicitins (5), proteins belonging to the cysteine-rich secretory protein family (4), and NLPs (2) (Table 6). An extracellular berberine-like protein was also detected (Table 6). Berberine-like proteins were previously reported as putative virulence factors in Ph. infestans [69] and are thought to be involved in infection through the biosynthesis of alkaloids and the production of reactive oxygen species. Berberine-like proteins were also detected in the extracellular proteomes of Ph. infestans and Ph. plurivora [15,16,69]. Additionally, an extracellular ribonuclease was detected (Table 6). Secreted ribonucleases have been reported as effectors in the fungal plant pathogen Blumeria graminis [126]. Ribonucleases were also detected in the Ph. infestans extracellular proteome [15]. In total, 140 extracellular proteins (40%) have homologs in PHI-Base (Table 6). A large number of extracellular CAZymes were also identified, including glycoside hydrolases (60), polysaccharide lyases (7), carbohydrate esterases (4), auxiliary activities (8) and proteins with carbohydrate-binding modules (3) (Table 6).
Examining Ph. gonapodyides, 237 protein groups (268 proteins) were identified in the cV8 samples and 196 protein groups (230 proteins) were identified in the V8 samples (Supplementary Table S8). Seventeen protein groups (19 proteins) were unique to the cV8 samples and 9 protein groups (15 proteins) were unique to the V8 samples (Supplementary Table S8). In total across the two conditions, 246 protein groups (283 proteins) were identified (Table 6 and Supplementary Table S8), of which 133 proteins (47%) overlap with the predicted secretome of Ph. gonapodyides. A total of 167 proteins (59%) have positive SignalP 3 HMM predictions, ignoring additional cut-offs. An additional 76 proteins (26.9%) have a positive prediction from SecretomeP, suggesting non-classical secretion. Functional annotation using eggNOG assigned 238 extracellular proteins (84.1%) to one or more COG categories. Overall, the functional profile was similar to that of Ph. chlamydospora, with the most numerous functional categories being carbohydrate transport and metabolism (95 proteins, 39.9%), function unknown (69 proteins, 29.0%), posttranslational modification, protein turnover, chaperones (29 proteins, 12.2%), amino acid transport and metabolism (9 proteins, 3.8%), lipid transport and metabolism (8 proteins, 3.4%) and energy production and conversion (7 proteins, 2.9%) (Figure 6B). Identified effector families include proteins with PAN/Apple domains (8), transglutaminase elicitors (6), elicitins (8), members of the cysteine-rich secretory protein family (4), NLPs (4), a berberine-like protein and a ribonuclease (Table 6). Some 96 (34%) extracellular proteins have homologs in PHI-Base (Table 6). A similar number of extracellular CAZymes were detected, including glycoside hydrolases (68), polysaccharide lyases (9), carbohydrate esterases (4), auxiliary activities (5) and proteins with carbohydrate-binding modules (4) (Table 6).
Examining Ph. pseudosyringae, 280 protein groups (296 proteins) were identified in the cV8 samples and 247 protein groups (259 proteins) were identified in the V8 samples (Supplementary Table S8). Eighteen protein groups (22 proteins) were unique to the cV8 samples and 6 protein groups (6 proteins) were unique to the V8 samples (Supplementary Table S8). In total across the two conditions, 313 protein groups (331 proteins) were identified (Table 6 and Supplementary Table S8), of which 145 proteins (44%) overlap with the predicted secretome of Ph. pseudosyringae. A total of 188 proteins (56.8%) have positive SignalP 3 HMM predictions without applying cut-offs. An additional 67 proteins (20.2%) have a positive prediction from SecretomeP. eggNOG functional annotation assigned 279 extracellular proteins (84.3%) to one or more COG functional categories. The high-level functional annotation of the Ph. pseudosyringae extracellular proteome is similar to that of Ph. chlamydospora and Ph. gonapodyides (Figure 6C). The most numerous functional categories are carbohydrate transport and metabolism (93 proteins, 33.3%), function unknown (70 proteins, 25.1%), posttranslational modification, protein turnover, chaperones (42 proteins, 15.1%), energy production and conversion (13 proteins, 4.7%), amino acid transport and metabolism (10 proteins, 3.6%), translation, ribosomal structure and biogenesis (9 proteins, 3.2%) and lipid transport and metabolism (8 proteins, 2.9%) (Figure 6C). The number of effector families is also similar and includes proteins with PAN/Apple domains (5), transglutaminase elicitors (8), elicitins (4), members of the cysteine-rich secretory protein family (4), NLPs (4) and ribonucleases (3) (Table 6). We also detected an extracellular PcF phytotoxin from Ph. pseudosyringae, which was absent in the extracellular proteomes of both Ph. chlamydospora and Ph. gonapodyides (Table 6). Unlike Ph. chlamydospora and Ph. gonapodyides, we did not identify any berberine-like extracellular proteins from Ph. pseudosyringae, although its predicted secretome encodes a copy (Table 3). Some 109 extracellular proteins (33%) have homologs in PHI-Base (Table 6). Overall, the CAZyme content of the Ph. pseudosyringae extracellular proteome is also similar, with glycoside hydrolases (61), polysaccharide lyases (5), carbohydrate esterases (2), auxiliary activities (4) and proteins with carbohydrate-binding modules (3) (Table 6).
Very few cytoplasmic effectors were detected in our analyses of all three species. CRNs were absent from the extracellular proteomes of all three species. No putative RxLRs were identified in the extracellular proteome of Ph. chlamydospora or Ph. gonapodyides. Only two putative RxLRs were identified in the extracellular proteome of Ph. pseudosyringae: PHPS_09091 and PHPS_15662. PHPS_09091 was detected in all replicates of both V8 and cV8 media with 4 unique peptides (Supplementary Table S8) and was identified as an RxLR based on the Win, Regex and HMM methods (Supplementary Table S5). Similarly, PHPS_15662 was identified in all replicates of both V8 and cV8 media with 4 unique peptides (Supplementary Table S8). PHPS_15662 was identified only using the homology method and does not contain an RxLR-like motif (Supplementary Table S5); therefore, it is not likely to be a legitimate RxLR effector. It is not surprising that so few cytoplasmic effectors were identified, as it is possible that most cytoplasmic effectors are secreted from haustoria [127,128]. However, Meijer et al. (2014) [15] report the detection of several Ph. infestans RxLRs and CRNs being released from hyphae in the absence of haustoria.
Previously, LC-MS/MS analysis of Ph. infestans identified 31 extracellular proteins that contained a single transmembrane domain [15]. These proteins are thought to be membrane proteins that are found in the extracellular medium due to proteolytic ectodomain shedding by sheddases. We also detected a large number of extracellular proteins that contain a single transmembrane domain. This included 33 proteins in Ph. chlamydospora, 25 in Ph. gonapodyides and 29 in Ph. pseudosyringae (Table 6). Of these, 17, 14 and 17 are homologous to the 31 Ph. infestans proteins. Similar to what was observed in Ph. infestans, the majority of identified transmembrane domains are found in the protein C-terminus.
Interestingly, we detected a number of extracellular proteins with KDEL or KDEL-like (HDEL or SDEL) C-terminal motifs. These are endoplasmic reticulum (ER) retention motifs that are usually associated with preventing protein secretion, signalling proteins to be retained in the ER lumen [129]. Proteins with KDEL motifs are usually excluded from in silico secretome studies. In Ph. chlamydospora we identified three such proteins: PHCH_06832, PHCH_07252, and PHCH_15931. PHCH_06832 and PHCH_07252 are paralogs belonging to the same protein group. They were identified in all replicates of both V8 and cV8 media with four unique peptides (Supplementary Table S8). Both proteins contain a C-terminal HDEL motif and were annotated as belonging to the heat shock protein 70 (Hsp70) family. PHCH_15931 was identified in a total of five out of six replicates across the two conditions with five unique peptides and has a C-terminal KDEL motif (Supplementary Table S8). It was annotated as a calreticulin, which is an ER-associated calcium-binding protein [130]. In Ph. gonapodyides, only one such protein was identified, PHGO_06390, which was detected in two out of three replicates of the cV8 samples and has a C-terminal SDEL motif (Supplementary Table S8). In Ph. pseudosyringae, three proteins were identified: PHPS_03476, PHPS_04861, and PHPS_06172. PHPS_04861 is orthologous to PHGO_06390, has a C-terminal SDEL motif and was identified in all replicates of both conditions with 3 unique peptides (Supplementary Table S8). PHPS_03476 has a C-terminal KDEL motif, is orthologous to PHCH_15931 and was identified in all replicates of both conditions, with a total of seven unique peptides (Supplementary Table S8). PHPS_06172, an Hsp90 protein, contains a C-terminal KDEL motif and was identified in a total of four replicates across the two conditions, with two peptides, only one of which was unique (Supplementary Table S8). Inspecting the Ph. infestans extracellular proteins identified by Meijer et al. (2014) [15], six extracellular proteins contain C-terminal KDEL/HDEL/SDEL motifs, five of which are orthologous to those identified above. The detection of these KDEL/KDEL-like motif-containing proteins in the extracellular medium suggests that their ER retention motifs are masked, or perhaps that they escape ER retrieval due to saturation of KDEL receptors [130].
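Screening a proteome for these C-terminal retention motifs amounts to a simple pattern match at the end of each sequence. The following is a minimal sketch; the function name is hypothetical, and stripping a trailing `*` assumes that stop codons may be written into the predicted protein sequences.

```python
import re

# KDEL and the KDEL-like variants named in the text (HDEL, SDEL), anchored at the C-terminus
ER_RETENTION = re.compile(r"[KHS]DEL$")


def er_retention_motif(protein):
    """Return the C-terminal KDEL/HDEL/SDEL motif of a protein sequence, or None."""
    match = ER_RETENTION.search(protein.rstrip("*"))
    return match.group(0) if match else None
```

Applied to every protein detected in the extracellular medium, a non-`None` result flags candidates such as those listed above for closer inspection.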

LC-MS/MS Identification of Mycelial Proteins
We used mass spectrometry to characterise the mycelial proteomes of Ph. chlamydospora, Ph. gonapodyides and Ph. pseudosyringae, and to understand how they change in response to oxidative stress and high temperatures. Proteins were extracted from mycelia grown under three conditions: "normal" (mycelia grown for 10 days at optimum temperatures), "heat" (mycelia grown for 7 days at optimum temperatures then switched to 30 °C for 3 days) and "oxidative stress" (mycelia grown for 10 days at optimum temperatures followed by exposure to 1 mM H2O2 for 3 h).
Examining Ph. pseudosyringae, a total of 3195 protein groups (3245 proteins) were identified across the three conditions (Supplementary Table S9). Under the normal condition, 2223 protein groups (2248 proteins) were detected, 22 protein groups (22 proteins) of which were uniquely detected in this condition (Supplementary Table S9). Thirty-two protein groups (33 proteins) were uniquely detected in the heat-treated samples, including three proteins predicted to be secreted and four proteins with predicted transmembrane helices. Heat-treated samples were significantly enriched for chaperone binding (GO:0051087). We detected 103 unique protein groups (106 proteins) in response to H2O2 treatment, including three proteins predicted to be secreted and 30 proteins with at least one predicted transmembrane helix. Of the proteins identified across all conditions, 445 (13.7%) contain one or more predicted transmembrane helices. Furthermore, 160 (4.9%) belong to the predicted secretome, and 204 (6.3%) were also identified in the extracellular proteome (Supplementary Table S9). We also detected effector families, including NLPs (2), CAP family proteins (3), elicitins (4), transglutaminase elicitors (5), PAN domain-containing proteins (5), RxLRs (8) and CRNs (18) (Supplementary Table S9). Additionally, 103 CAZymes were identified from Ph. pseudosyringae mycelium (Supplementary Table S9).
Overall, the functional annotation of all identified mycelial proteins is similar between each of the three species (Figure 7A). Clustering with MCL grouped identified mycelial proteins from the three species into 2577 protein families, of which 1554 families were shared by all three species (Figure 7B). More proteins were common between Ph. chlamydospora and Ph. pseudosyringae (205) and between Ph. gonapodyides and Ph. pseudosyringae (185) than between Ph. chlamydospora and Ph. gonapodyides (103) (Figure 7B). In total, 93 mycelial protein families were unique to Ph. chlamydospora, 84 to Ph. gonapodyides and 353 to Ph. pseudosyringae (Figure 7B), indicating increased variation across Phytophthora clades.

Phylostratigraphy Analysis
Taxonomically restricted genes were identified using phylostratigraphy. Homologs were identified for each Phytophthora protein-coding gene by performing BLAST searches against a large protein database (18,084,866 proteins) with broad phyletic distribution [85]. Genes were assigned to one of seven phylostrata (cellular organisms, eukaryotes, Stramenopiles, oomycetes, Peronosporales, Phytophthora or species-specific orphans) based on their conservation in other taxonomic lineages.
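The assignment step can be sketched as follows, assuming that each BLAST hit has already been reduced to the most specific of the seven phylostrata shared by the query species and the hit's source organism. That preprocessing, the function name and the data layout are illustrative assumptions and not a description of the pipeline used here.

```python
# Phylostrata ordered from oldest (broadest) to youngest, as listed in the text
PHYLOSTRATA = [
    "cellular organisms", "eukaryotes", "Stramenopiles", "oomycetes",
    "Peronosporales", "Phytophthora", "species-specific orphan",
]


def assign_phylostratum(hit_strata):
    """Assign a gene to the oldest phylostratum in which a homolog was found.

    hit_strata: for each homolog outside the query species, the most specific
    phylostratum shared with the query (a name from PHYLOSTRATA).
    A gene with no such homologs is treated as a species-specific orphan.
    """
    if not hit_strata:
        return PHYLOSTRATA[-1]
    return min(hit_strata, key=PHYLOSTRATA.index)
```

For instance, a gene with homologs in another Phytophthora species, in a non-Peronosporales oomycete and in an animal would be assigned to "eukaryotes", since the animal hit traces the gene back to the eukaryotic ancestor.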

Overall, the proportions of genes assigned to each phylostratum are similar between the three Phytophthora genomes (Figure 8). On average, 33.4% of genes were assigned to the phylostratum cellular organisms (i.e., homologs are present in eukaryotes and prokaryotes), 29.9% are unique to eukaryotes, 1.6% are unique to stramenopiles, 23.6% are unique to oomycetes, 3.9% are unique to Peronosporales, 5.9% are unique to Phytophthora and 1.7% are orphans unique to one Phytophthora species (Figure 8). Individually, 194 orphans (1.1%) were identified for Ph. chlamydospora, 520 (2.2%) for Ph. gonapodyides and 312 (1.8%) for Ph. pseudosyringae (Figure 8). Ph. gonapodyides had a smaller proportion of genes identified as originating in cellular organisms (30.4%), compared to Ph. chlamydospora (35.8%) and Ph. pseudosyringae (34.1%) (Figure 8B). In addition, Ph. gonapodyides had a higher proportion of genes identified as being unique to oomycetes (25.1%), unique to Phytophthora (7.4%) and species-specific (2.2%) (Figure 8B). This suggests that the increased gene repertoire of Ph. gonapodyides is due to expansions of more recently evolved gene families, i.e., genes unique to oomycetes, Phytophthora, and Ph. gonapodyides, as opposed to expansions of more ancient genes that are conserved in other eukaryotes or prokaryotes. Overall, the proportion of genes per phylostratum is similar to previous phylostratigraphic analyses for other Phytophthora genomes [124].
We coupled our phylostratigraphy analysis with the mass spectrometry data. Compared to the overall genome, a much larger proportion (approximately 61.5%) of identified proteins belongs to the phylostratum "cellular organisms" (Figure 8). This suggests that the majority of identified proteins are evolutionarily conserved proteins that possibly play roles in conserved housekeeping functions. Only 8.1% to 9.1% of identified proteins belong to the phylostratum oomycetes or "younger" (Figure 8). Furthermore, only 0.57% to 0.80% of identified proteins belong to the phylostratum Peronosporales or "younger" (Figure 8). This suggests that the more recently evolved genes may be under tighter transcriptional control or expressed only in specific scenarios. Additionally, some of the proteins identified as being species-specific may not be legitimate genes.

Conclusions
Here, we have sequenced the genomes of three ubiquitous Phytophthora species: Ph. chlamydospora, Ph. gonapodyides and Ph. pseudosyringae. Using bioinformatics methods, comparative genomics, and mass spectrometry, we provide a comprehensive characterisation of the nuclear genomes, mitochondrial genomes, in silico secretomes, extracellular proteomes, and mycelial proteomes of each species. These genome resources will be useful for future studies to understand the lifestyles of these widespread Phytophthora species.