The Ever-Expanding Pseudomonas Genus: Description of 43 New Species and Partition of the Pseudomonas putida Group

The genus Pseudomonas hosts an extensive genetic diversity and is one of the largest genera among Gram-negative bacteria. Type strains of Pseudomonas are well known to represent only a small fraction of this diversity and the number of available Pseudomonas genome sequences is increasing rapidly. Consequently, new Pseudomonas species are regularly reported and the number of species within the genus is constantly evolving. In this study, whole genome sequencing enabled us to define 43 new Pseudomonas species and provide an update of the Pseudomonas evolutionary and taxonomic relationships. Phylogenies based on the rpoD gene and whole genome sequences, including, respectively, 316 and 313 type strains of Pseudomonas, revealed sixteen groups of Pseudomonas and, together with the distribution of cyclic lipopeptide biosynthesis gene clusters, enabled the partitioning of the P. putida group into fifteen subgroups. Pairwise average nucleotide identities were calculated between type strains and a selection of 60 genomes of non-type strains of Pseudomonas. Forty-one strains were incorrectly assigned at the species level and among these, 19 strains were shown to represent an additional 13 new Pseudomonas species that remain to be formally classified. This work pinpoints the importance of correct taxonomic assignment and phylogenetic classification in order to perform integrative studies linking genetic diversity, lifestyle, and metabolic potential of Pseudomonas spp.


Introduction
During the past decade, the landscape of bacterial systematics has changed drastically [1]. Once dominated by a polyphasic approach including phenotypic characterization, DNA-DNA hybridization, and 16S rRNA gene sequencing, the field has been transformed by the age of microbial genomics and metagenomics, which has reshaped the foundation of prokaryotic species definition [2,3]. Although 16S rRNA phylogeny remains the most common tool to evaluate the diversity of mixed prokaryotic populations, estimating inter- and intra-species relatedness was traditionally facilitated by DNA-typing methods. For several years, Multi-Locus Sequence Analysis (MLSA) represented the most widely adopted methodology for bacterial systematics, and for the exploration of evolutionary relationships within specific families/genera [4-7]. The success of high-throughput and affordable Whole Genome Sequencing (WGS) technologies has tremendously increased the number of publicly available genomes and, therefore, genome-to-genome comparisons based on the Average Nucleotide Identity (ANI) and digital DNA-DNA Hybridization (dDDH) have become today's standards for species definition [1,8-11]. This genome-based elucidation of relatedness at the inter- and intra-species level is now encouraged and, at a larger scale, the creation of the Genome Taxonomy Database (GTDB) has allowed bacterial taxonomy to be standardized [12,13].
According to GTDB, the Pseudomonadaceae family currently includes seven genera: Azomonas, Azotobacter, Entomomonas, Oblitimonas, Pseudomonas, Thiopseudomonas, and Ventosimonas (https://gtdb.ecogenomic.org/tree?r=f__Pseudomonadaceae, accessed on 10 July 2021). The genus Pseudomonas is the most complex, with 259 validly named species, excluding subspecies and synonymous species (List of Prokaryotic Names with Standing in Nomenclature, https://lpsn.dsmz.de/genus/pseudomonas, accessed on 10 August 2021). However, this number is constantly evolving, with over 30 new Pseudomonas species described between March 2020 and March 2021. Since the first descriptions of Pseudomonas species, which were based on morphological and phenotypical characteristics, several studies updated the taxonomy of Pseudomonas based on 16S rRNA gene sequence analysis [14]. This allowed the differentiation of the genus Pseudomonas from its sister genera, and also the definition of the three main Pseudomonas lineages, P. pertucinogena, P. aeruginosa, and P. fluorescens [6,15]. In a similar fashion, MLSA has guided the redefinition of prokaryotic species and has also impacted the phylogenomics and systematics of the genus Pseudomonas [4,6,16]. Indeed, the analysis based on four housekeeping genes (i.e., 16S rRNA, gyrB, rpoB, and rpoD) enabled the clarification of the Pseudomonas phylogeny by enhancing species delineation. This approach also proved to be a reliable tool for strain identification at the species level [4,6]. We recently demonstrated that the rpoD gene sequence alone provides a strong and low-cost alternative, particularly in the case of taxonomic affiliation of large batches of environmental Pseudomonas isolates [17].
Pseudomonas are motile, non-spore forming, Gram-negative rods belonging to the Gammaproteobacteria. Pseudomonas species are able to colonize and thrive in a wide range of ecological niches (e.g., soil, water, and plants, or in association with higher organisms) [18]. In addition to the well-known human pathogen P. aeruginosa, other Pseudomonas species induce diseases in plants, fish, insects, or other animals [19-21]. In contrast, a large majority of Pseudomonas species are commensals, and some can be used as bioremediation, biostimulation, and biocontrol agents [22,23]. Pseudomonas are ubiquitous bacteria that are often identified as fundamental components of bacterial communities and thus fulfil essential ecological functions in the environment [24-26]. Furthermore, Pseudomonas are outstanding producers of bioactive secondary metabolites that often support their eclectic lifestyle (e.g., iron scavenging, swarming motility, biofilm formation, pathogenicity, cooperation, or antagonism) [27,28]. The link between secondary metabolites and Pseudomonas taxonomy has already been made through pyoverdines, a class of pigments long used as a specific marker for classification [18]. Pseudomonas cyclic lipopeptides (CLPs), having a broad antimicrobial activity profile and anti-proliferative properties, have gained the attention of researchers due to their promising application potential [29]. CLP production is widespread within the genus Pseudomonas, and relationships between CLP diversity and Pseudomonas taxonomy were recently highlighted [30,31]. CLP producers tend to be grouped by CLP family and confined to specific groups or subgroups of Pseudomonas. Nonetheless, exceptions occur within the P. putida group, which hosts a large diversity of CLP producers from diverse families (i.e., Xantholysin, Entolysin, Putisolvin, and Viscosin families) [30,31].
In this study, we report 43 new Pseudomonas species and use a combination of Nanopore and Illumina sequencing to provide high quality genomes. Through the genome analysis of these new species, together with type strains of Pseudomonas, we provide an update of the Pseudomonas phylogeny based on a set of 1508 core orthogroup sequences and another based on the rpoD gene. We used nucleotide identities based on the rpoD gene and whole genome comparisons to reassign, respectively, 82 and 41 non-type strains of Pseudomonas to known and newly described Pseudomonas species. A large majority of the new species were affiliated to the P. putida group, increasing species numbers from 35 to 51. We thus explored genetic diversity within the P. putida group in a greater depth and mapped, on an expanded phylogeny of the group, the presence of Biosynthetic Gene Clusters (BGCs) for the production of CLPs.

Pseudomonas Strains
In this study, we used 273 known type strains of Pseudomonas, including validly published species and recently published species still lacking taxonomic status (https://lpsn.dsmz.de/genus/pseudomonas, accessed on 10 August 2021). Only 270 type strains of Pseudomonas were used for genome analysis because no genome sequences were available for three of these type strains. Eleven type strains of other genera within the Pseudomonadaceae and Cellvibrio japonicus were used for phylogenetic analyses (Figure S5). The list of type strains, including their culture collection codes and accession numbers (i.e., rpoD and whole genome sequences), is provided in Table S1.
We also used 47 strains from our collection of environmental Pseudomonas isolates to describe 43 new Pseudomonas species (the type strains of newly described species are highlighted in bold; Table S2). These 47 isolates were deposited in two culture collections (i.e., Belgian Co-ordinated Collections of Micro-organisms (BCCM/LMG) and Collection Française de Bactéries associées aux Plantes (CFBP)), and their phenotypic profiles were obtained using the Biolog GEN III MicroPlate (BIOLOG, Hayward, CA, USA) according to the manufacturer's instructions (Table S3). To avoid species description based on single strains, the rpoD sequences of these 43 new type strains of Pseudomonas were then used as a query to search for additional strains using BlastN with default parameters (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch; accessed on 10 August 2021, Tables S4 and S5). We previously defined a cutoff value of 98% nucleotide identity to differentiate strains at the species level but also revealed some inconsistent species affiliations [17]. To avoid any misidentification due to the use of the rpoD gene, we only considered hits with 100% identity, or lower if a genome was available and thus allowed validation by ANIb calculation (> 96.5%, Tables S4 and S5). We thus used a total of 82 non-type strains of Pseudomonas (Table S4), including 29 with whole genome sequences (Table S5).
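The retention rule described above for BlastN hits can be illustrated with a minimal Python sketch. It is not the authors' code: the class name `RpoDHit`, the function `keep_hit`, and the strain labels used with it are hypothetical, and only the two thresholds (100% rpoD identity, ANIb > 96.5%) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

RPOD_FULL_IDENTITY = 100.0  # percent rpoD identity accepted without genome support
ANIB_SPECIES_CUTOFF = 96.5  # percent; ANIb must exceed this value (from the text)


@dataclass
class RpoDHit:
    """One BlastN hit against the rpoD sequence of a new type strain (hypothetical record)."""
    strain: str
    rpod_identity: float          # percent identity reported for the hit
    anib: Optional[float] = None  # ANIb against the type strain; None if no genome exists


def keep_hit(hit: RpoDHit) -> bool:
    """Return True if the hit is retained as an additional strain of the species."""
    if hit.rpod_identity >= RPOD_FULL_IDENTITY:
        return True
    # Hits below 100% identity are kept only when a genome allows ANIb validation.
    return hit.anib is not None and hit.anib > ANIB_SPECIES_CUTOFF
```

For example, a hypothetical hit with 99.2% rpoD identity would be kept only if its genome gave an ANIb above 96.5% against the type strain.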
Finally, a set of 122 strains, including type and non-type strains of Pseudomonas, was specifically used for the P. putida group phylogenetic and genomic analyses (Table S6).

Genome Sequencing, Assembly, and Functional Annotation
We recently highlighted the high discriminative power of the rpoD gene as a reliable tool for the identification of environmental Pseudomonas isolates [17]. In the same study, we released draft genome sequences of 55 environmental Pseudomonas isolates, and rpoD gene analysis together with whole genome comparisons allowed us to highlight the presence of 30 new Pseudomonas species (Pseudomonas #5 [17], with strains SWRI59, SWRI68, and SWRI77, was later identified as P. capeferrum). We applied the same methodology to an expanded set of Pseudomonas isolates and identified 17 additional new species. To provide high-quality genomes for the type strains of these 43 new species, we combined Illumina and Nanopore sequencing [32]. An overview of the different sequencing methodologies used for the entire set of strains is shown in Table S2. We controlled the quality of the Illumina reads with FastQC v0.11.9 and used Trimmomatic v0.38 [33] for adapter clipping, quality trimming (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15), and filtering on length (>50 bp). The quality of the Nanopore reads was assessed with NanoPlot v1.28.2 [34] and we used Porechop v0.2.4 (https://github.com/rrwick/Porechop, accessed on 8 June 2021) for barcode clipping, in addition to NanoFilt v2.6 [34] to filter on quality (Q > 8) and length (>500 bp). The genomes were assembled using Unicycler v0.4.8 [35] with default options and the quality of their assemblies was assessed using QUAST v5.1 [36]. The functional annotation was undertaken with the NCBI Prokaryotic Genome Annotation Pipeline [37].
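To make the Nanopore filtering criteria concrete, the sketch below applies the two thresholds given in the text (Q > 8, length > 500 bp) to reads held in memory. It is an illustration only and does not reproduce NanoFilt itself. The function names are hypothetical, and the mean quality is assumed to be derived from averaged per-base error probabilities rather than from a plain average of Phred scores.

```python
import math
from typing import Iterable, Iterator, List, Tuple

MIN_MEAN_Q = 8.0   # reads must have a mean quality above this value
MIN_LENGTH = 500   # reads must be longer than this many bases

Read = Tuple[str, str, List[int]]  # (name, sequence, per-base Phred scores)


def mean_quality(phred_scores: List[int]) -> float:
    """Mean read quality computed on error probabilities (assumed convention)."""
    if not phred_scores:
        return 0.0
    mean_error = sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)
    return -10 * math.log10(mean_error)


def filter_reads(reads: Iterable[Read]) -> Iterator[Read]:
    """Yield only the reads that pass both the length and the quality threshold."""
    for name, seq, quals in reads:
        if len(seq) > MIN_LENGTH and mean_quality(quals) > MIN_MEAN_Q:
            yield name, seq, quals
```

Averaging error probabilities penalizes reads with a few very poor bases more strongly than averaging Phred scores would, which is why the convention is stated explicitly here.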
The evolutionary relationships between newly described and previously known type strains of Pseudomonas were assessed using rpoD and whole genome phylogenies. The rpoD-based phylogenies were constructed as previously described using MEGA-X (Figure 2, right) [17]. The corresponding similarity matrix, based on a 650 bp fragment of the rpoD gene and including 316 type strains of Pseudomonas (273 known and 43 newly described species), was generated (Table S8). The phylogenetic trees based on whole genomes were inferred with IQ-TREE v1.6.12 [39] with automatic model selection and 1000 ultrafast bootstraps (UF-Boot) using an alignment of 1508 (genus phylogeny; Figure S1, left) and 2570 (P. putida group phylogeny; Figure S5) core orthogroup sequences that were delineated with the SCARAP pipeline (https://github.com/SWittouck/SCARAP, accessed on 8 June 2021) [40].
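A pairwise similarity matrix of the kind mentioned above can be sketched in a few lines of Python. This is a simplified illustration and not the MEGA-X procedure: it assumes the rpoD fragments are already aligned to equal length, and it ignores any column in which either sequence has a gap (a pairwise-deletion assumption). The function names and the toy sequences in the usage note are hypothetical.

```python
from itertools import combinations
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences, skipping gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # pairwise deletion of gapped positions (assumption)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similarity_matrix(aligned: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Symmetric matrix of pairwise percent identities, keyed by strain name."""
    names = list(aligned)
    matrix = {name: {name: 100.0} for name in names}
    for p, q in combinations(names, 2):
        value = percent_identity(aligned[p], aligned[q])
        matrix[p][q] = value
        matrix[q][p] = value
    return matrix
```

With the toy alignment `{"s1": "ACGTACGTAC", "s2": "ACGTACGTAA"}`, the two sequences differ at one of ten positions, so `similarity_matrix` reports 90.0 for the pair.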
Several new phylogenetic groups (G) and subgroups (SG) were delineated based on branch length, grouping, and bootstrap values on both rpoD and whole genome phylogenies (Tables 1 and 2, Figure 2, and Figure S5). The new groups and subgroups were named after the first species described in a group or subgroup.
Figure. Maximum likelihood phylogenetic tree of the P. putida group (Table S6). All the strains included in this analysis, together with their accession numbers and the output of the prospection for CLP BGCs, are detailed in Table S6. The maximum likelihood phylogenetic tree was constructed using the GTR + G + I model (MEGA-X). Bootstrap values were calculated based on 1000 replications and only bootstrap values higher than 50% are indicated. Type strains of newly described species are highlighted in bold. The P. rhizosphaerae group is used as the outgroup. The corresponding tree based on whole genome sequences is shown in Figure S5.
Table 2. Synonymous species of Pseudomonas (Table S7). Species are considered synonymous when ANIb values are greater than or equal to 96.5% [18].

Cyclic Lipopeptide (CLP) NRPS Analysis
The P. putida group was previously highlighted to include CLP producers from the Viscosin (WLIP producers), Putisolvin, Entolysin, and Xantholysin families [30,31]. Among the 16 type strains of the newly described species belonging to the P. putida group, 4 were already described as CLP producers (WLIP and Xantholysin producers) [30,31]. Consequently, all strains belonging to the P. putida group (Table S6), with available genome sequences, were subjected to an antiSMASH analysis (antiSMASH 6.0) [41]. Positive hits were then inspected manually to confirm the typical features of Pseudomonas CLP Non-Ribosomal Peptide Synthetase (NRPS) clusters (i.e., the presence of tandem TE-domains and the absence of epimerization domains) and synteny (i.e., number of modules and their distribution along the encoded NRPSs), all based on previously described CLP NRPS gene cluster annotations [42,43]. All known and newly identified strains carrying CLP BGCs, together with their affiliation to CLP families and the accession numbers of their NRPS genes, are presented in Table S6. The phylogenetic relationship between known and newly identified CLP producers was assessed, by family, based on concatenated NRPS amino acid sequences ( Figures S2-S4).
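The manual inspection criteria listed above (tandem TE domains, no epimerization domain, and module count) can be expressed as a small screening sketch. It is purely illustrative and does not parse antiSMASH output. The one-letter and two-letter domain labels ("C", "A", "T", "E", "TE") are simplified, hypothetical labels, and counting one module per adenylation domain is an assumption.

```python
from typing import Sequence


def has_tandem_te(domains: Sequence[str]) -> bool:
    """True if two thioesterase (TE) domains are directly adjacent."""
    return any(a == "TE" and b == "TE" for a, b in zip(domains, domains[1:]))


def looks_like_clp_nrps(domains: Sequence[str]) -> bool:
    """Screen an ordered domain list for the two features named in the text:
    tandem TE domains present and epimerization (E) domains absent."""
    return has_tandem_te(domains) and "E" not in domains


def count_modules(domains: Sequence[str]) -> int:
    """Approximate the module count as the number of adenylation (A) domains."""
    return sum(1 for d in domains if d == "A")
```

A candidate passing this screen would still need the synteny comparison against annotated CLP NRPS clusters described in the text before being assigned to a CLP family.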

Defining New Pseudomonas Species
In a recent study, we performed rpoD-based identifications which allowed us to identify 31 new Pseudomonas species [17]. In the same study, three strains were incorrectly identified as representative strains of a new species (i.e., Pseudomonas #5: SWRI59, SWRI68, and SWRI77) but were subsequently identified as P. capeferrum strains. Further rpoD-based identifications enabled us to identify 17 additional Pseudomonas species. Four strains, namely, SWRI22, OE 28.3, SWRI76, and CMR5c, were first assessed as new species but were later assigned to newly published Pseudomonas species (i.e., #29 P. carnis, #30 P. edaphica, #31 P. atacamensis, and #45 P. aestus; Table S7). Finally, a total of 43 new Pseudomonas species could be defined (Appendix A) and the results of their phenotypic profiling, together with assigned culture collection numbers, are presented in Table S3. Hybrid assemblies of the genomes resulted in 22 closed genomes and 18 draft genomes with improved contiguity. Due to technical issues, we have not been able to increase the quality of the draft genomes of strains BW11P2, COW3, and SWRI196. To avoid the proposal of new species based on single strains, the rpoD sequences of the 43 new species were used as queries to search for additional strains. We therefore reassigned 82 Pseudomonas strains, including 29 with whole genome sequences, available through GenBank (Tables S4 and S5). Finally, ANIb values were calculated between a total of 346 Pseudomonas strains (270 type strains and 76 (47 + 29) Pseudomonas strains affiliated to new species), and allowed us to confirm these affiliations and the presence of 43 new Pseudomonas species (Table S7). The phylogenetic position of the 43 type strains is shown in Figure 2 and their distribution within the different groups of Pseudomonas is detailed in Table 1. All of the new species cluster within the P. fluorescens (n = 27) and P. putida (n = 16) groups. We amended the existing subgroups of P. fluorescens as follows: P. asplenii (inclusion of P. vanderleydeniana), P. corrugata (inclusion of P. alvandae, P. marvdashtae, P. tehranensis, P. zanjanensis and P. zarinae), P. fluorescens (inclusion of P. asgharzadehiana, P. azadiae, P. khavaziana, P. salmasensis and P. tritici), P. gessardii (inclusion of P. shahriarae), P. jessenii (inclusion of P. asgharzadehiana and P. azerbaijanoccidens), P. koreensis (inclusion of P. bananamidigenes, P. botevensis, P. ekonensis, P. hamedanensis, P. iranensis, P. khorasanensis, P. monsensis, P. siliginis, P. tensinigenes, P. triticicola and P. zeae), P. mandelii (inclusion of P. farris), and P. protegens (inclusion of P. sessiligenes) (Table 1). The remaining sixteen new species allowed the partitioning of the P. putida group into fifteen subgroups, as described in Section 3.3.

Comparison of Whole Genome and rpoD-based Phylogenies
The phylogenetic relationships between known and newly described type strains of Pseudomonas are presented in Figure 2, which shows the whole genome phylogeny, based on 1508 core orthogroups, and the rpoD-based phylogeny. The phylogenies include 273 type strains of Pseudomonas species (270 for the whole genome phylogeny) and 43 type strains of the newly described Pseudomonas species. Three type strains of Pseudomonas were excluded from the analysis: (1) P. hydrolytica, with an abnormally large genome (10.4 Mbp); and (2) P. hussainii and P. caeni, harboring small genomes (respectively, 3.68 and 3.03 Mbp) and clustering with members of other genera within the Pseudomonadaceae [4,6,17]. We suspect that the latter two are not Pseudomonas species, and a dedicated study is needed to clarify the taxonomy of other genera within the Pseudomonadaceae family.
The thirteen groups of Pseudomonas previously identified in several studies (i.e., P. pertucinogena, P. oryzihabitans, P. aeruginosa, P. resinovorans, P. stutzeri, P. linyingensis, P. oleovorans, P. straminea, P. anguilliseptica, P. putida, P. lutea, P. syringae, and P. fluorescens) are all well supported in both trees [4,6,17,18]. In addition to these thirteen groups, three new groups, namely, P. pohangensis, P. massiliensis, and P. rhizosphaerae, were identified based on branch length and the strong bootstrap support values separating them from the neighboring groups ( Figure 2). Furthermore, as previously observed, ten species are scattered across the tree and represent orphan groups currently formed by only one species (Figure 2). An overview of all known and newly proposed groups is summarized in Table 1.
Overall, both trees are highly consistent in topology, although the tree inferred by whole genome analysis is supported by stronger bootstrap values. Two main differences can still be highlighted: (1) the position of the P. syringae and P. lutea groups, which cluster inside the P. fluorescens group in the rpoD-based tree; and (2) the position of P. karstica, P. spelaei, and P. yamanorum, which cluster within the P. gessardii subgroup in rpoD and MLSA phylogenies [4,6,17], whereas in phylogenies based on whole genome analysis they cluster within the P. fluorescens subgroup ([18] and Figure 2).

Identification and Reassignment at the Species Level
Several studies have revealed inconsistencies within public databases in which genomes of Pseudomonas are not identified (Pseudomonas sp.) or incorrectly assigned at the species level [4,44,45]. Within the P. putida group, a huge number of strains are incorrectly assigned to P. putida [4,44]. Here, we propose to update the P. putida group with 16 new Pseudomonas species and tentatively reassign 44 non-type strains of Pseudomonas (Table 3). A total of 25 strains are affiliated to known and newly described species (P. shirazensis (n = 1), P. guariconensis (n = 2), P. wayambapalatensis (n = 2), P. farsensis (n = 1), P. peradeniyensis (n = 1), P. capeferrum (n = 2), P. kermanshahensis (n = 4), P. juntendi (n = 2), P. alloputida (n = 6), and P. kurunegalensis (n = 4)), and the remaining 19 strains represent an additional 13 new species. As previously observed for the genus Pseudomonas, these results confirm the fact that type strains still represent a small fraction of the genomic diversity within the P. putida group.
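Grouping unassigned strains into candidate species from pairwise ANIb values can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes single-linkage grouping (two strains end up together whenever a chain of pairs at or above the cutoff connects them) and uses the 96.5% ANIb threshold quoted in the text. The function name and the strain labels in the usage note are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def cluster_by_ani(
    strains: Sequence[str],
    ani: Dict[Tuple[str, str], float],
    cutoff: float = 96.5,
) -> List[List[str]]:
    """Single-linkage grouping of strains whose pairwise ANIb is >= cutoff."""
    parent = {s: s for s in strains}

    def find(s: str) -> str:
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), value in ani.items():
        if value >= cutoff:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[str, List[str]] = {}
    for s in strains:
        clusters.setdefault(find(s), []).append(s)
    # Largest clusters first, ties broken by member names.
    return sorted(clusters.values(), key=lambda c: (-len(c), c))
```

Given strains A, B, C, and D with ANIb values of 98.2% (A and B), 97.0% (B and C), 96.0% (A and C), and 85.0% (C and D), the sketch returns one cluster containing A, B, and C and a singleton D. Single linkage joins A and C through B even though their direct value falls below the cutoff, which is one reason borderline cases deserve manual review.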

Distribution of CLP biosynthesis Gene Clusters
CLPs are specialized metabolites that often support important ecological functions including cooperation, phytopathogenicity, or antagonism [29,43,46]. CLPs consist of a fatty acid tail attached to a cyclized oligopeptide and are synthesized by NRPSs [29,42]. The modularity of these enzymes enables Pseudomonas strains to produce a wide diversity of CLPs, resulting in their classification in several families [28,29,42]. The relationship between CLP diversity and Pseudomonas taxonomy was recently highlighted, and it was demonstrated that certain CLP families were exclusive to specific subgroups of P. fluorescens [30,31,43]. In contrast, the P. putida group was demonstrated to host CLP producers from different families [30,31]. CLP production is widespread within the P. putida group: several type strains (i.e., P. capeferrum, P. entomophila, and P. soli) and many non-type strains (e.g., RW10S2, PCL1445, BW11M1, 250J, COR5, COW10, COR19, COR51; Table S7) were formerly characterized as producers of CLPs from the Viscosin (WLIP producers), Putisolvin, Entolysin, and Xantholysin families [30,31,46-56]. Among the 16 type strains of the newly described species, four were previously described as CLP producers: two WLIP producers, P. fakonensis COW40 and P. xanthosomae COR54; and two xantholysin producers, P. maumuensis COW77 and P. muyukensis COW39 (Table S6) [30,31]. We therefore searched for CLP NRPSs in a selection of Pseudomonas genomes, including all type strains belonging to the P. putida group (n = 51) and the 44 genomes of non-type strains presented in Table 3. About 65% of the strains (i.e., 34 of 51 type strains; 28 of 44 non-type strains) did not carry CLP NRPSs in their genomes (Table S6). Our analysis revealed the presence of NRPSs from the Viscosin family (WLIP-like NRPSs) in the genomes of two strains (P. wayambapalatensis RW3S1 T and RW3S2); from the Putisolvin family in five type strains (P. fulva, P. kermanshahensis, P. parafulva, P. reidholzensis, and P. vlassakiae) and five non-type strains (P. capeferrum SWRI59 and SWRI68 and P. kermanshahensis SWRI67, SWRI50, E46); and from the Xantholysin family in two type strains (P. peradeniyensis and P. xantholysinigenes) and two non-type strains (P. mosselii BW18S1 and P. peradeniyensis BW16M2) (Figure 2 and Table S6). Despite the poor genome quality of two type strains, namely, P. brassicae and P. juntendi, NRPS gene fragments coding for tandem thioesterase (TE) domains were detected in their genomes. Tandem TE domains are specific to Pseudomonas CLP NRPSs; therefore, these results indicate that P. brassicae and P. juntendi carry CLP NRPS genes and most likely produce CLPs. Further analyses, chemical characterization, and/or a hybrid assembly based on long read sequencing and Illumina sequencing are needed to identify the CLPs. We previously highlighted, in the type strains of P. asplenii and P. fucovaginae, an NRPS system predicted to assemble a lipotridecapeptide (LP-13), but this metabolite still awaits chemical and functional characterization [43]. A putative LP-13 biosynthesis gene cluster is also present in the genome of P. tructae. Oni and colleagues also reported the presence of a new CLP (N8, 17:8, 17 amino acids, of which 8 are in the macrocycle) within the P. putida group [30,31]. Altogether, these results highlight a wide diversity of CLP producers from known, and yet to be described new, CLP families within the P. putida group.

Partitioning of the P. putida Group
To present an integrated approach linking the genetic diversity and the metabolic potential of Pseudomonas species, we mapped the presence of CLP biosynthesis gene clusters on an extended phylogeny of the P. putida group (Figure 2 and Figure S5). As shown in Figure 2, the P. putida group is composed of several subgroups. The extended phylogeny allowed us to define 15 subgroups: eight subgroups containing several species, namely, P. japonica (n = 6), P. vranovensis (n = 6), P. reidholzensis (n = 3), P. xanthosomae (n = 2), P. mosselii (n = 8), P. vlassakiae (n = 3), P. capeferrum (n = 2), and P. putida (n = 14), and seven orphan subgroups (P. akappagea, P. cremoricolorata, P. guariconensis, P. wayambapalatensis, P. farsensis, P. taiwanensis, and P. plecoglossicida). The distribution of all type strains in the 15 subgroups is detailed in Table 1. Among the 44 non-type strains used in Section 3.3.1, 19 were shown to represent 13 new species distributed in seven subgroups: P. japonica (n = 1), P. guariconensis (n = 1), P. wayambapalatensis (n = 1), P. mosselii (n = 1), P. plecoglossicida (n = 3), P. vlassakiae (n = 2), and P. putida (n = 4) (Table 3). These additions to the P. putida phylogenies allowed us to capture a small portion of the genomic diversity among environmental Pseudomonas strains, but also to pinpoint the immediate growth potential of the newly defined subgroups. We observed that, in both rpoD and whole genome phylogenies, the distribution of CLP biosynthesis gene clusters was associated with this phylogenetic subgrouping.
All xantholysin and entolysin producers were grouped within the P. mosselii subgroup, putisolvin producers were clustered in four subgroups (i.e., P. putida, P. reidholzensis, P. vlassakiae, and P. capeferrum), and WLIP producers were distributed over two subgroups (P. xanthosomae and P. wayambapalatensis). Moreover, the phylogenies based on concatenated NRPS amino acid sequences (Xantholysin/Entolysin families, Figure S2; Putisolvin family, Figure S3; and WLIP producers, Figure S4) revealed different clusters that perfectly match the distribution of CLP producers within the different subgroups. These results demonstrate that the rpoD gene allows both the identification of Pseudomonas isolates and the construction of robust phylogenies, providing information about the affiliation of producers to CLP families.
The strong congruence between the phylogenetic trees based on the NRPS sequences and the rpoD- and whole genome-based phylogenies indicates that CLP biosynthesis genes have largely evolved in accordance with the evolutionary history of Pseudomonas species within the P. putida group. However, P. reidholzensis carries a putisolvin biosynthetic gene cluster that is absent from the genomes of closely related species. Furthermore, CLP producers from the Viscosin family, including WLIP producers, are predominantly found within the P. fluorescens group [30,31,57], with the exception of the two subclusters of WLIP producers present in the P. putida group. Altogether, these observations indicate that Pseudomonas CLP NRPS clusters have a complex evolutionary history, probably involving both vertical inheritance and horizontal gene transfer.

Conclusions
Our update of the genus with 43 new species, together with our analysis of 313 genomes of type strains, allowed us to propose a robust revised phylogeny of Pseudomonas spp. This study aimed to fill the gap between the currently named species and the real genomic diversity within the genus Pseudomonas. Additional work is needed to complete this task, and genome-based standards for species definition should be favored over highly variable phenotypic tests for publication. Our study validated the use of the rpoD gene for species identification, and for the study of the evolutionary relationships within the genus Pseudomonas. Furthermore, rpoD-based phylogenies can also be highly useful to specifically prospect for CLP biosynthesis gene clusters and to affiliate producers to known CLP families. Finally, the use of genomic sequences appears to be essential to reveal the ecological and metabolic potential of Pseudomonas spp.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/microorganisms9081766/s1, Figure S1: Genome-based phylogeny of the Pseudomonadaceae. Figure S2: Phylogenetic tree based on concatenated NRPS proteins from the Xantholysin family. Figure S3: Phylogenetic tree based on concatenated NRPS proteins from the Putisolvin family. Figure S4: Phylogenetic tree based on concatenated NRPS proteins of WLIP producers from the Viscosin family. Figure S5: Genome-based phylogeny of the P. putida group. Table S1: List of type strains used in this study.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A Descriptions of the 43 New Pseudomonas Species
The phenotypic descriptions are presented in Table S3. The rpoD-based and whole genome-based assignments of additional Pseudomonas strains to the newly described species are shown, respectively, in Tables S4 and S5.
The type strain is RD8MR3 T (LMG 32021 T = CFBP 8837 T ) and was isolated from the endorhizosphere of rice, Anuradhapura, Sri Lanka in 1990. Its G + C content is 63.43 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RD8MR3 T are publicly available through the accession numbers AM911640, MT621460, and CP077097, respectively.
The type strain is RD9SR1 T (LMG 32022 T = CFBP 8838 T ) and was isolated from the exorhizosphere of rice, Anuradhapura, Sri Lanka in 1990. Its G + C content is 62.91 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RD9SR1 T are publicly available through the accession numbers AM911646, MT621461, and JABWRZ000000000, respectively.
The type strain is RW1P2 T (LMG 32023 T = CFBP 8839 T ) and was isolated from the rhizoplane of rice, Kurunegala, Sri Lanka in 1990. Its G + C content is 62.09 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RW1P2 T are publicly available through the accession numbers AM911650, MT621449, and JABWSB000000000, respectively.
The type strain is SWRI100 T (LMG 32035 T = CFBP 8840 T ) and was isolated from the rhizosphere of wheat (cultivar Marvdasht), Kermanshah, Iran in 2004. Its G + C content is 62.22 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI100 T are publicly available through the accession numbers MT621423 and JABWRY000000000, respectively.
The type strain is RW3S1 T (LMG 32024 T = CFBP 8841 T ) and was isolated from the exorhizosphere of rice, Kurunegala, Sri Lanka in 1990. Its G + C content is 63.24 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RW3S1 T are publicly available through the accession numbers AM911665, MT621434, and CP077096, respectively.
The type strain is RW9S1A T (LMG 32025 T = CFBP 8842 T ) and was isolated from the exorhizosphere of rice, Kurunegala, Sri Lanka in 1990. Its G + C content is 64.16 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RW9S1A T are publicly available through the accession numbers AM911667, MT621442, and CP077095, respectively.
The type strain is BW13M1 T (LMG 32026 T = CFBP 8887 T ) and was isolated from banana plant endorhizosphere, Peradeniya, Sri Lanka in 1990. Its G + C content is 64.62 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of BW13M1 T are publicly available through the accession numbers MT621446 and JABWRJ000000000, respectively.
The type strain is RW4S2 T (LMG 32027 T = CFBP 8843 T ) and was isolated from the exorhizosphere of rice, Kurunegala, Sri Lanka in 1990. Its G + C content is 62.98 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RW4S2 T are publicly available through the accession numbers AM911658, MT621428, and JABWRP000000000, respectively.
The type strain is RW10S1 T (LMG 32028 T = CFBP 8844 T ) and was isolated from the exorhizosphere of rice, Kurunegala, Sri Lanka in 1990. Its G + C content is 60.62 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoD, and whole-genome sequence of RW10S1 T are publicly available through the accession numbers AM911668, MT621430, and CP077094, respectively.
(#10) Description of Pseudomonas urmiensis sp. nov. Pseudomonas urmiensis (ur.mi.en'sis. N.L. fem. adj. urmiensis, from Urmia, a city in Iran). The type strain is SWRI10 T (LMG 32036 T = CFBP 8845 T ) and was isolated from the rhizosphere of wheat (cultivar Marvdasht), West Azerbaijan, Iran in 2004. Its G + C content is 61.81 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI10 T are publicly available through the accession numbers MT621419 and JABWRE000000000, respectively.
(#11) Description of Pseudomonas shirazensis sp. nov. Pseudomonas shirazensis (shi.raz.en'sis. N.L. fem. adj. shirazensis, from Shiraz, a city in Iran). The type strain is SWRI56 T (LMG 32037 T = CFBP 8846 T ) and was isolated from the rhizosphere of wheat (cultivar Shiraz), Shiraz, Iran in 2004. Its G + C content is 61.85 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI56 T are publicly available through the accession numbers MT621418 and JABWRD000000000, respectively.
The type strain is SWRI107 T (LMG 32038 T = CFBP 8847 T ) and was isolated from the rhizosphere of wheat (cultivar Azadi), Shiraz, Iran in 2004. Its G + C content is 62.58 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI107 T are publicly available through the accession numbers MT621411 and JABWRF000000000, respectively.
(#13) Description of Pseudomonas vanderleydeniana sp. nov. Pseudomonas vanderleydeniana (van.der.ley.den.i.a'na. N.L. fem. adj. vanderleydeniana, from Jos Vanderleyden, a Belgian microbiologist who studied plant growth-promoting properties of root-associated alpha- and gammaproteobacteria, including nitrogen-fixing and fluorescent Pseudomonas isolates). The type strain is RW8P3 T (LMG 32029 T = CFBP 8848 T ) and was isolated from the rhizoplane of rice, Kurunegala, Sri Lanka in 1990. Its G + C content is 62.97 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of RW8P3 T are publicly available through the accession numbers MT621472 and CP077093, respectively.
The type strain is BW11P2 T (LMG 32030 T = CFBP 8849 T ) and was isolated from banana plant exorhizosphere, Galagedara, Sri Lanka in 1990. Its G + C content is 60.62 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of BW11P2 T are publicly available through the accession numbers MT621496 and LRUN00000000, respectively.
(#15) Description of Pseudomonas iranensis sp. nov. Pseudomonas iranensis (i.ran.en'sis. N.L. fem. adj. iranensis, from Iran). The type strain is SWRI54 T (LMG 32039 T = CFBP 8850 T ) and was isolated from the rhizosphere of wheat (cultivar Shiraz), Shiraz, Iran in 2004. Its G + C content is 59.89 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI54 T are publicly available through the accession numbers MT621504 and CP077092, respectively.
The type strain is SWRI153 T (LMG 32040 T = CFBP 8851 T ) and was isolated from the rhizosphere of wheat (cultivar Kaasparoo), Khorasan, Iran in 2004. Its G + C content is 59.71 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI153 T are publicly available through the accession numbers MT621508 and JABWQP000000000, respectively.
The type strain is SWRI65 T (LMG 32041 T = CFBP 8852 T ) and was isolated from the rhizosphere of wheat, Hamedan, Iran in 2004. Its G + C content is 59.99 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI65 T are publicly available through the accession numbers MT621514 and CP077091, respectively.
(#18) Description of Pseudomonas zeae sp. nov. Pseudomonas zeae (ze'ae. L. gen. n. zeae, from Zea mays, corn). The type strain is OE 48.2 T (LMG 32031 T = CFBP 8853 T ) and was isolated from the rhizosphere of maize, in Belgium, ~1984-1985. Its G + C content is 58.99 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of OE 48.2 T are publicly available through the accession numbers MT621498 and CP077090, respectively.
The type strain is ZA 5.3 T (LMG 32032 T = CFBP 8882 T ) and was isolated from the rhizosphere of wheat, in Belgium, ~1984-1985. Its G + C content is 59.17 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of ZA 5.3 T are publicly available through the accession numbers MT621501 and CP077089, respectively.
(#20) Description of Pseudomonas monsensis sp. nov. Pseudomonas monsensis (mons.en'sis. N.L. fem. adj. monsensis, from Mons, a city in Belgium). The type strain is PGSB 8459 T (LMG 32033 T = CFBP 8854 T ) and was isolated from the rhizosphere of maize, Mons, Belgium, ~1984-1985. Its G + C content is 60.05 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of PGSB 8459 T are publicly available through the accession numbers MT621495 and CP077087, respectively.
The type strain is SWRI12 T (LMG 32042 T = CFBP 8855 T ) and was isolated from the rhizosphere of wheat (cultivar Alvand), Zanjan, Iran in 2004. Its G + C content is 61.21 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI12 T are publicly available through the accession numbers MT621484 and JABWRB000000000, respectively.
(#22) Description of Pseudomonas zarinae sp. nov. Pseudomonas zarinae (za.ri'nae. N.L. gen. n. zarinae, from Zarin, a wheat cultivar). The type strain is SWRI108 T (LMG 32043 T = CFBP 8856 T ) and was isolated from the rhizosphere of wheat (cultivar Zarin), Kermanshah, Iran in 2004. Its G + C content is 60.86 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI108 T are publicly available through the accession numbers MT621493 and CP077086, respectively.
The type strain is SWRI196 T (LMG 32044 T = CFBP 8857 T ) and was isolated from the rhizosphere of wheat, Tehran, Iran in 2004. Its G + C content is 60.46 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI196 T are publicly available through the accession numbers MT621473 and JABWQV000000000, respectively.
The type strain is SWRI102 T (LMG 32045 T = CFBP 8858 T ) and was isolated from the rhizosphere of wheat (cultivar Marvdasht), Kermanshah, Iran in 2004. Its G + C content is 60.64 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI102 T are publicly available through the accession numbers MT621490 and JABWQX000000000, respectively.
(#25) Description of Pseudomonas shahriarae sp. nov. Pseudomonas shahriarae (shah.ri.a'rae. N.L. gen. n. shahriarae, from Shahriar, a wheat cultivar). The type strain is SWRI52 T (LMG 32046 T = CFBP 8859 T ) and was isolated from the rhizosphere of wheat (cultivar Shahriar), Zanjan, Iran in 2004. Its G + C content is 60.59 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI52 T are publicly available through the accession numbers MT621521 and CP077085, respectively.
(#26) Description of Pseudomonas azadiae sp. nov. Pseudomonas azadiae (a.za'di.ae. N.L. gen. n. azadiae, from Azadi, a wheat cultivar). The type strain is SWRI103 T (LMG 32047 T = CFBP 8860 T ) and was isolated from the rhizosphere of wheat (cultivar Azadi), Shiraz, Iran in 2004. Its G + C content is 60.69 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI103 T are publicly available through the accession numbers MT621536 and JAHSTY000000000, respectively.
The type strain is SWRI145 T (LMG 32048 T = CFBP 8883 T ) and was isolated from the rhizosphere of wheat, Zanjan, Iran in 2004. Its G + C content is 59.87 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI145 T are publicly available through the accession numbers MT621537 and CP077084, respectively.
The type strain is SWRI126 T (LMG 32049 T = CFBP 8861 T ) and was isolated from the rhizosphere of wheat (cultivar Zarin), Salmas, Iran in 2004. Its G + C content is 60.16 mol% (calculated based on its genome sequence). The rpoD and whole-genome sequence of SWRI126 T are publicly available through the accession numbers MT621526 and CP077083, respectively.
The type strain is SWRI88 T (LMG 32052 T = CFBP 8865 T ) and was isolated from the rhizosphere of wheat (cultivar Marvdasht), Kermanshah, Iran in 2004. Its G + C content is 59.99 mol% (calculated based on its genome sequence). The whole-genome sequence of SWRI88 T is publicly available through the accession JAHSTX000000000.
(#33) Description of Pseudomonas siliginis sp. nov. Pseudomonas siliginis (si.li'gi.nis. L. gen. n. siliginis, of siligo, winter wheat). The type strain is SWRI31 T (LMG 32053 T = CFBP 8866 T ) and was isolated from the rhizosphere of wheat (cultivar Zarin), Kermanshah, Iran in 2004. Its G + C content is 59.97 mol% (calculated based on its genome sequence). The whole-genome sequence of SWRI31 T is publicly available through the accession JAHSTW000000000.
(#34) Description of Pseudomonas farris sp. nov. Pseudomonas farris (far'ris. L. gen. n. farris, of husked wheat, of a grain). The type strain is SWRI79 T (LMG 32054 T = CFBP 8867 T ) and was isolated from the rhizosphere of wheat (cultivar Local), Zanjan, Iran in 2004. Its G + C content is 58.74 mol% (calculated based on its genome sequence). The whole-genome sequence of SWRI79 T is publicly available through the accession JAHSTV000000000.
The type strain is SWRI74 T (LMG 32055 T = CFBP 8868 T ) and was isolated from the rhizosphere of wheat (cultivar Zarin), Salmas, Iran in 2004. Its G + C content is 59.30 mol% (calculated based on its genome sequence). The whole-genome sequence of SWRI74 T is publicly available through the accession JAHSTU000000000.
(#36) Description of Pseudomonas alvandae sp. nov. Pseudomonas alvandae (al.van'dae. N.L. gen. n. alvandae, from Alvand, a wheat cultivar). The type strain is SWRI17 T (LMG 32056 T = CFBP 8869 T ) and was isolated from the rhizosphere of wheat (cultivar Alvand), Zanjan, Iran in 2004. Its G + C content is 60.86 mol% (calculated based on its genome sequence). The whole-genome sequence of SWRI17 T is publicly available through the accession CP077080.
The type strain is CMR12a T (LMG 32173 T = CFBP 8877 T ) and was isolated from the roots of red cocoyam (Xanthosoma sagittifolium), Bokwai, Cameroon in 2001. Its G + C content is 62.80 mol% (calculated based on its genome sequence). The 16S rRNA gene, rpoB, gyrB, and whole-genome sequence of CMR12a T are publicly available through the accession numbers FJ652622, FJ652703, FJ652730, and CP077074, respectively.
The type strain is COW39 T (LMG 32177 T = CFBP 8890 T ) and was isolated from the roots of white cocoyam (Xanthosoma sagittifolium), Ekona, Cameroon in 2008. Its