Genome Mining of Pseudomonas Species: Diversity and Evolution of Metabolic and Biosynthetic Potential

Microbial genome sequencing has uncovered a myriad of natural products (NPs) that have yet to be explored. Bacteria in the genus Pseudomonas include pathogens, plant growth promoters, and therapeutically, industrially, and environmentally important microorganisms. Though most species of Pseudomonas have a large number of NP biosynthetic gene clusters (BGCs) in their genomes, it is difficult to link many of these BGCs with products under current laboratory conditions. To gain new insights into the diversity, distribution, and evolution of these BGCs in Pseudomonas for the discovery of unexplored NPs, we applied several bioinformatic approaches to characterize BGCs from Pseudomonas reference genome sequences available in public databases, along with phylogenetic and genomic comparisons. Our research revealed that most BGCs in the genomes of Pseudomonas species show high NP diversity at the species and subspecies levels, and established the correlation between species and the taxonomic ranges of BGCs. These data will pave the way for the algorithmic detection of species- and subspecies-specific pathways for NP development.

To combat the emerging worldwide challenge of antibiotic resistance, new antimicrobial agents are desperately needed. Antimicrobial resistance takes the lives of at least 700,000 people every year, and this number is expected to reach 10 million by 2050 if the problem is not addressed [21,22]. Indeed, less than 25% of clinical drugs represent novel classes or act via novel mechanisms. Drugs active against Gram-negative species in the Pseudomonas genus and the subspecies of P. fluorescens to elucidate the phylogenetic diversity, distributions of known and uncharacterized BGCs, and the NP-coding potential of these genomes.

Distribution and Diversity of Biosynthetic Potential in Pseudomonas at Species Level
A total of 50 annotated reference genomic sequences of different Pseudomonas species and 31 subspecies of P. fluorescens are available in NCBI genome datasets. Among them, we analyzed 37 complete genomes of Pseudomonas species and 23 P. fluorescens subspecies genomes for their biosynthetic potential with different genome mining tools. The remaining genomes were excluded because they lacked the rpoB gene and were therefore not included in the phylogenetic tree (Supplementary Files 1 and 2).

Putative BGC Prediction by antiSMASH in Pseudomonas Species Genomes
All Pseudomonas reference genomes were scanned with antiSMASH to explore the known and putative secondary metabolite biosynthetic potential of their genome sequences. The diversity of these species influences the phylogenetic diversity and heterogeneity (Figure 1).
Figure 1. Phylogenetic tree of Pseudomonas along with the gene numbers, isolation sources, and number of NP BGCs determined by antiSMASH. The phylogenetic tree was built from rpoB sequences extracted from the genomes using the maximum likelihood method. Two bar plots show genome size in thousands of genes on the left, colored by habitat, and the number of BGCs on the right. Species in both bar plots follow the same order as in the phylogenetic tree. Hybrid clusters are shown separately. The colors corresponding to habitat types and the 24 major NP classes are displayed below the bar plots. N/A: Not Available.
The most typical BGCs in Pseudomonas were found to encode the multidomain enzymes nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs). Their products, nonribosomal peptides (NRPs) and polyketides (PKs), are two diverse groups of secondary metabolites that have been identified as toxins, medicines, siderophores, and pigments. Analysis of the genomic sequences of the Pseudomonas species demonstrated their potential to biosynthesize a variety of NRPs. The NRPS modules encoded in typical modular NRPS gene clusters had at least adjacent condensation (C) and adenylation (A) domains. We counted NRPS-like clusters lacking the C domain among the NRPS clusters because they could still produce secondary metabolites without a proper C domain. A PKS cluster had at least a ketosynthase (KS) domain. Hybrid clusters were made up of NRPS and PKS modules together. So far, three types of PKS have been identified in bacteria. Most type I PKSs catalyze polyketide chain elongation non-iteratively. Type II PKSs act iteratively to produce aromatic polyketides. Type I and II PKSs use the acyl carrier protein (ACP) to activate acyl-CoA precursors for polyketide assembly. Type III PKSs also act iteratively in aromatic polyketide biosynthesis but are independent of ACP.
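The domain-based definitions above can be illustrated with a minimal Python sketch. It is not the logic used by antiSMASH; the function name `classify_cluster`, the per-module list representation, and the domain labels are assumptions made for illustration only.

```python
from typing import List


def classify_cluster(modules: List[List[str]]) -> str:
    """Label a cluster from its modules, each given as an ordered list of domain names.

    Rules taken from the text: an NRPS module has adjacent C and A domains,
    an NRPS-like module has an A domain but no C domain, a PKS module has
    a KS domain, and a hybrid cluster combines NRPS and PKS modules.
    """
    has_nrps = False
    has_nrps_like = False
    has_pks = False
    for domains in modules:
        # Look for a C domain immediately followed by an A domain.
        if any(a == "C" and b == "A" for a, b in zip(domains, domains[1:])):
            has_nrps = True
        elif "A" in domains and "C" not in domains:
            has_nrps_like = True
        if "KS" in domains:
            has_pks = True
    if (has_nrps or has_nrps_like) and has_pks:
        return "hybrid"
    if has_nrps:
        return "NRPS"
    if has_nrps_like:
        return "NRPS-like"  # counted together with NRPS in this study
    if has_pks:
        return "PKS"
    return "other"


# Hypothetical clusters, not taken from the analyzed genomes.
examples = {
    "cluster_a": [["C", "A", "T"]],
    "cluster_b": [["A", "T"]],
    "cluster_c": [["KS", "AT", "ACP"]],
    "cluster_d": [["C", "A", "T"], ["KS", "AT", "ACP"]],
}
labels = {name: classify_cluster(mods) for name, mods in examples.items()}
```

In this sketch the "NRPS-like" label is kept separate so that it can either be reported on its own or merged into the NRPS count, as done in the text.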
RiPPs' post-translational modifications increase the structural diversity of short peptides which are generally stabilized as a result of these changes, making them more resistant to heat and proteases.
In total, Pseudomonas bacteria carry between 6 and 16 BGCs per genome (mean = 9.81, s.d. = 2.85). Among the 37 genomes, the smallest genome (4.689 Mbp) was found in P. rhizosphaerae DSM 16299, which has 7 BGCs. The largest genome (7.189 Mbp), found in P. mandelii JR-1, had 11 BGCs, whereas the most BGCs (16) were found in P. chlororaphis qlu-1, whose genome size is 6.828 Mbp. Three genomes (P. monteilii B5, P. versuta L10.10, and P. psychrophila KM02) contain the fewest BGCs (6). P. bijieensis L22-9 and P. protegens CHA0 have the second most BGCs (15) (Figure 2a). The most prevalent classes of BGCs were those encoding NRPSs, RiPPs, redox-cofactor, and NAGGN (Table 1, Figure 2b). The number of BGCs per genome has a moderate but statistically significant positive correlation with genome size and total gene number (R² = 0.3556, p-value = 0.0).
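As a worked illustration of the correlation reported above, the following Python sketch computes the coefficient of determination (R²) of a simple linear fit as the squared Pearson correlation. It uses only the three genome size and BGC count pairs quoted in the text, so it will not reproduce the R² obtained from all 37 genomes; the function name `r_squared` is hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation, equal to R^2 of a simple linear regression."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Only the three pairs quoted in the text (genome size in Mbp, BGC count):
# P. rhizosphaerae DSM 16299, P. mandelii JR-1, P. chlororaphis qlu-1.
genome_size_mbp = [4.689, 7.189, 6.828]
bgc_counts = [7, 11, 16]

r2 = r_squared(genome_size_mbp, bgc_counts)
print(f"R^2 on the three quoted genomes: {r2:.3f}")
```

Applying the same function to the full table of genome sizes and BGC counts (Supplementary File 3) would be the way to check the reported value.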

Putative BGC Prediction by PRISM in Pseudomonas Species Genomes
The PRISM 4 analyses of the Pseudomonas genome datasets revealed a total of 191 BGCs (Supplementary File 3), including 97 NRPS and 41 PKS BGCs. Clusters for melanin, NRPS-independent siderophore, ectoine, isonitrile, tabtoxin, cyclodipeptide (XYP family), acyl homoserine lactone, pantocin, aminoglycoside, class II/III confident bacteriocin, resorcinol, and class II lantipeptide, as well as some hybrid clusters, were found infrequently in different Pseudomonas genomes.

Putative BGC Prediction by BAGEL in Pseudomonas Species Genomes
From the BAGEL4 analysis, we identified 49 bacteriocin-coding clusters across the whole-genome datasets of Pseudomonas species (Supplementary File 3). Bacteriocins are categorized into four classes based on their chemical structures and modes of action. Class I bacteriocins are post-translationally modified peptides with antibacterial action. Class II bacteriocins are antimicrobial peptides that have not undergone post-translational modification and are split into four subclasses. Class III bacteriocins, commonly known as bacteriolysins, are heat-labile proteins with a molecular weight of >10 kDa. The C-terminal domains of these bacteriocins show similarity to endopeptidases and confer selectivity for target cells. Class IV bacteriocins are cyclic bacteriocins that have undergone post-translational modification.
Most of the bacteriocins found here are annotated as class III bacteriocins with a molecular weight of >10 kDa, showing similarity to colicin_E6, carocin_D, colicin_E9, putidacin_L1, colicin, lin_M18, pyocin_S2, and colicin-10. Some Pseudomonas species were predicted to produce class II bacteriocins, with hits similar to microcin, Pep5, bottromycin, class II lanthipeptide, and class III bacteriocins.

KS and C Domain Determination in the Pseudomonas Genus Using NaPDoS
KS and C domains indicate the presence of BGCs for PKs and NRPs, respectively. We found a total of 274 KS domains and 810 C domains in the 37 Pseudomonas reference genomic sequences (Supplementary File 3). The most KS domains (11) were found in P. glycinae MS586 and the fewest (2) in the P. plecoglossicida XSDHY-P genome, while the average number of KS domains was 7.406. The highest number of C domains (70) was found in P. syringae BIM B-268, and no C domain was found in P. psychrophila KM02. The average number of C domains was 42.63.

Whole-Genome Comparisons in Pseudomonas Species
Based on ANI (average nucleotide identity) analyses and the 95% threshold for species delimitation, the majority of input strains clustered into six core species groups. ANI can be computed using different algorithms: ANIb (ANI using BLAST), ANIm (ANI using MUMmer), OrthoANIb (OrthoANI using BLAST), and OrthoANIu (OrthoANI using USEARCH). The distribution of the six clades is the same as that found in previous phylogenetic analyses. Figure 3 shows the similarity across the whole genomes of the studied Pseudomonas species. Two strains were considered conspecific when they shared more than 95% nucleotide identity over at least 70% of their whole genome sequence.
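The species delimitation rule stated above (more than 95% ANI over at least 70% of the genome) can be sketched in Python as follows. This is an illustration under stated assumptions, not the OrthoANI implementation: the function names, the genome labels, and the ANI and coverage values are hypothetical, and genomes are grouped by simple single linkage.

```python
def are_conspecific(ani_percent, aligned_fraction,
                    ani_threshold=95.0, coverage_threshold=0.70):
    """Apply the rule from the text: ANI > 95% over at least 70% of the genome."""
    return ani_percent > ani_threshold and aligned_fraction >= coverage_threshold


def group_species(genomes, pairwise):
    """Single-linkage grouping of genomes whose pairs pass the rule (union-find)."""
    parent = {g: g for g in genomes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for (a, b), (ani, coverage) in pairwise.items():
        if are_conspecific(ani, coverage):
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical genomes and (ANI %, aligned fraction) values for illustration.
genomes = ["g1", "g2", "g3"]
pairwise = {
    ("g1", "g2"): (97.2, 0.85),  # passes both thresholds
    ("g1", "g3"): (88.0, 0.60),  # fails both
    ("g2", "g3"): (96.0, 0.50),  # high ANI but insufficient coverage
}
species_groups = group_species(genomes, pairwise)
```

Single linkage is a design choice of this sketch; stricter groupings (for example, requiring every pair in a group to pass) would give smaller clusters when the data contain borderline pairs.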

Figure 3. Similarity across the whole genomes of Pseudomonas species. The comparison follows the same order as the phylogenetic tree in Figure 1. Comparisons between a genome and itself lie on the diagonal running from the top left to the bottom right corner. The numerator for each comparison is the number of comparable genes between two genomes, whereas the denominator is the genome represented by each column. The smallest genome is marked with a * (green), and the biggest genome is marked with a * (red).

Distribution and Evolution of Secondary Metabolites in Pseudomonas fluorescens at Subspecies Level
To understand the metabolic and biosynthetic potential at the subspecies level of Pseudomonas, we chose the P. fluorescens reference genomes for our study. We found obvious variation in genome size, gene number, G+C content, and biosynthetic capability among strains of P. fluorescens (Supplementary File 4). Figure 4 shows the phylogenetic relationships and the diversity of biosynthetic potential among the P. fluorescens subspecies, together with their gene numbers and habitats.
Although strains of P. fluorescens share similarly sized genomes because they belong to a common species, the number of BGCs differs markedly between strains. P. fluorescens FW300-N2C3 has the largest genome (7.119 Mbp) and the most BGCs (18), whereas P. fluorescens NCTC9428 has the fewest BGCs (7) with a genome size of 6.034 Mbp. P. fluorescens A506 has the smallest genome, with 12 BGCs (Table 2). The antiSMASH tool detected a total of 298 BGCs in the P. fluorescens reference genomes (Supplementary File 4, Figure 5b). These fall into 20 major classes (number of BGCs in parentheses): arylpolyene (23), acyl_amino_acids (2), betalactone (25), butyrolactone (8), ectoine (1), hserlactone (7), class II lanthipeptide (5), NRPS (64), NRPS-like (26), NAGGN (22), PpyS-KS (1), RRE-containing (2), ranthipeptide (5), redox-cofactor (27), RiPP-like (47), siderophore (9), t3pks (4), terpene (1), thiopeptide (4), and hybrid (15) (Supplementary File 4, Figure 5b).
PRISM 4 detected a total of 149 BGCs, including 76 NRPS and 23 PKS clusters (Supplementary File 4). BAGEL4 detected a total of 34 bacteriocins (Supplementary File 4). Most of them are colicin bacteriocins (type I). A few microcin, PaeM, putidacin, and class II lanthipeptide clusters were also seen. In contrast, antiSMASH detected clusters for a total of 90 RiPPs, including 47 RiPP-like compounds, 27 redox-cofactors, 5 class II lanthipeptides, 5 ranthipeptides, 4 thiopeptides, and 2 RRE-containing compounds. Whole-genome similarity across the genomes of P. fluorescens subspecies was also investigated (Figure 6). The comparison followed the same order as the phylogenetic tree in Figure 4.
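The class counts reported above can be cross-checked with a short Python tally. The counts are copied from the text; the dictionary layout and variable names are only an illustrative choice.

```python
# antiSMASH BGC counts per class in the P. fluorescens reference genomes,
# copied from the text.
bgc_counts = {
    "arylpolyene": 23, "acyl_amino_acids": 2, "betalactone": 25,
    "butyrolactone": 8, "ectoine": 1, "hserlactone": 7,
    "lanthipeptide class II": 5, "NRPS": 64, "NRPS-like": 26,
    "NAGGN": 22, "PpyS-KS": 1, "RRE-containing": 2,
    "ranthipeptide": 5, "redox-cofactor": 27, "RiPP-like": 47,
    "siderophore": 9, "t3pks": 4, "terpene": 1,
    "thiopeptide": 4, "hybrid": 15,
}

# Classes that the text groups under RiPPs.
ripp_classes = [
    "RiPP-like", "redox-cofactor", "lanthipeptide class II",
    "ranthipeptide", "thiopeptide", "RRE-containing",
]

total_bgcs = sum(bgc_counts.values())
ripp_total = sum(bgc_counts[name] for name in ripp_classes)

print(len(bgc_counts), total_bgcs, ripp_total)  # prints: 20 298 90
```

The tally agrees with the 20 classes, 298 BGCs, and 90 RiPP clusters stated in the text.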

Figure 6. Whole-genome similarity across the genomes of P. fluorescens subspecies. The comparison follows the same order as the phylogenetic tree in Figure 4. The smallest genome is marked with a * (green), and the biggest genome is marked with a * (red).

Discussion
Early microbial genome sequencing projects discovered dozens of cryptic biosynthetic regions within industrially important, well-studied bacterial genomes and sparked hopes that genome mining would lead to a new "golden era" of novel NPs.
The main goal of this study was to identify probable drug-like metabolites using publicly available data for Pseudomonas species and P. fluorescens subspecies reference genomes from NCBI. Despite earlier thorough research, our findings demonstrated that both Pseudomonas species and P. fluorescens subspecies have a large and distinct natural product metabolic potential with high diversity, indicating that they are still a good source of novel metabolites.
Comparative genomic analysis is an effective approach for revealing microorganisms' capacity for the production of novel specialized compounds. Comparative genomics investigations in NP fields have revealed that there is a plethora of new compounds embedded in both culturable and non-culturable microorganism genomes waiting to be revealed. The findings that follow add to our knowledge of their genetics and behaviors.
The research presented here is the first step in establishing a comprehensive methodology for analyzing natural compounds from the Pseudomonas genus. The BGC patterns indicated that certain species of Pseudomonas and subspecies of P. fluorescens had a higher NP metabolic potential than others. We grouped the gene clusters of each genome-represented taxon using different comparisons. Such gene cluster families are necessary for cluster determination.
Comparative genomics revealed the similarity and difference between the species despite their differences in geography, morphology, and secondary metabolite profiles. Gene cluster networking highlights that this genus is distinctive in the number of secondary metabolite pathways, distinct from all other bacterial gene clusters to date. These findings portend that future genome-guided secondary metabolite discovery and isolation efforts should be highly productive.
Hence, the data here will help us in future BGC prioritization. For example, we found that all the Pseudomonas species and P. fluorescens subspecies contain the pyoverdine gene cluster, and most of them encode more than one pyoverdine BGC. All BGCs of the redox-cofactor type were predicted to encode lankacidin C, which shows considerable antitumor activity [43]. Interestingly, all of these redox-cofactor BGCs showed only 13% similarity with the known BGCs of lankacidin C, implying a high possibility of isolating lankacidin analogues with new structures.
However, beta-lactam, CDPS, phenazine, and terpene BGCs are not seen in the P. fluorescens reference genomes. The findings show that the genus has a high level of pathway diversity, with the majority of pathways having been gained very recently in its history. The patterns and phylogenetic trajectories of these pathways reveal the processes that create novel compound variety, as well as the strategies bacteria adopt to enhance their population-level ability to produce various molecules.
The high diversity of NP BGCs at the subspecies level demonstrates that secondary metabolite production pathways are among the fastest-evolving genomic elements yet found [44]. Gene duplication and loss, horizontal gene transfer (HGT), alteration of NRPS and PKS genes, domain reorganization, and module redundancy [44][45][46] probably contribute to the emergence of novel small-molecule diversity.
The phylogenetic trajectories of individual PKS and NRPS domains have been studied, especially the use of the KS and C domains to reveal information on enzyme design and function [47,48]. These studies have also contributed to the understanding of how widespread HGT is among biosynthetic genes for NP production [49,50], and of the variation among PKS and NRPS gene phylogenies [51]. Although establishing the evolutionary histories of complete pathways is more difficult than resolving those of individual genes or domains, comparative investigations of BGCs have been beneficial in identifying pathway boundaries [52].
In all, Pseudomonas species have demonstrated significant variation within the genus, and among species, and even strains within the same species, according to comparative genomics studies. Many of these BGCs were strain-specific, supporting the theory that they perform specialized metabolic tasks unique to certain ecological niches.

Collection of Genome Sequences
We used the NCBI Datasets genome browser (NCBI: https://www.ncbi.nlm.nih.gov/datasets/genomes/, accessed on 31 August 2021) to search for and collect the Pseudomonas complete genome sequences. We found a total of 27,125 Pseudomonas genome assemblies at the contig, scaffold, chromosome, and complete genome levels. We filtered for annotated reference genomes at the complete assembly level and retrieved 50 complete reference genome sequences of different Pseudomonas species and 31 complete reference genome sequences of Pseudomonas fluorescens in FASTA format from NCBI Datasets on 31 August 2021. We discarded 13 Pseudomonas and 7 P. fluorescens reference genomes from our study because these sequences lacked the rpoB gene (Supplementary Files 1 and 2). Supplementary Files 3 and 4 show the genome assemblies, accession numbers, and genome information (genome size, number of genes, and number of protein-coding genes).
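The selection criteria described above can be summarized in a small Python sketch. It does not call the NCBI Datasets service; the `Assembly` record, its field names, and the example accessions are hypothetical stand-ins for metadata that would be exported from the genome browser.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assembly:
    accession: str      # hypothetical identifier, for illustration only
    level: str          # "Contig", "Scaffold", "Chromosome", or "Complete Genome"
    is_reference: bool  # flagged as an annotated reference genome
    has_rpob: bool      # an rpoB gene could be extracted from the sequence


def select_genomes(assemblies: List[Assembly]) -> List[Assembly]:
    """Keep complete, annotated reference genomes that contain rpoB."""
    return [
        a for a in assemblies
        if a.level == "Complete Genome" and a.is_reference and a.has_rpob
    ]


# Hypothetical metadata records.
candidates = [
    Assembly("ASM_0001", "Complete Genome", True, True),
    Assembly("ASM_0002", "Complete Genome", True, False),  # dropped: no rpoB
    Assembly("ASM_0003", "Scaffold", True, True),          # dropped: not complete
    Assembly("ASM_0004", "Complete Genome", False, True),  # dropped: not reference
]
selected = select_genomes(candidates)
```

Keeping the three criteria as explicit fields makes it easy to report how many genomes were dropped at each step.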

Phylogeny and Whole Genome Comparisons
The rpoB sequences were extracted from the genomic assemblies and aligned using MEGA X [53]. The phylogenetic tree was constructed using the rpoB sequences from these genomes (Supplementary Files 1 and 2). Some genome sequences lacked rpoB genes and others were of poor quality; these were therefore removed from the phylogenetic tree. A maximum likelihood phylogeny was constructed from the rpoB sequences in MEGA X [53] using a general time reversible (GTR) nucleotide substitution model [54], four gamma categories for rate heterogeneity, and 100 bootstrap replicates (Supplementary Files 1 and 2).
Comparative genomic analyses were performed using pairwise average nucleotide identity (ANI), calculated with an improved ANI algorithm called OrthoANI [55], to assess the genetic diversity among genomes and to delineate species boundaries (Supplementary Files 3 and 4). Typically, the ANI values between genomes of the same species are above 95%.

Computational Approaches for the Identification of Gene Clusters Potentially Encoding Secondary Metabolites
We calculated the number of BGCs for each genome based on three methodologies. The genome mining prediction platforms antiSMASH 6 [56], PRISM 4 [57], and BAGEL4 [58] were run with default settings to discover BGCs potentially involved in the production of secondary metabolites. The antiSMASH tool makes it easy to find, annotate, and analyze secondary metabolite biosynthesis gene clusters throughout the genome. Similarly, BAGEL4 is designed to comprehensively mine RiPPs and bacteriocins [58], whereas PRISM 4 is developed to predict secondary metabolite structure and biological activity [57]. These tools give accurate predictions of the coding potential of microbial secondary metabolites [59]. For BGC annotation from genomic sequences, they rely on several methods and databases, such as hidden Markov models (HMMs) [60], the BLAST algorithm [61], PFAM [62], GenBank [63], UniProtKB [64], BACTIBASE [65], CAMPR3 [66], and the MIBiG data repository [67]. Furthermore, we used NaPDoS [68] to detect KS and C domains in these genomic sequences.

antiSMASH 6.0
The antiSMASH 6.0 tool is an advanced and rigorous bioinformatics platform that uses a predictive method to identify and annotate known and putative novel BGCs. The public version of antiSMASH 6.0 can be found online (antiSMASH: https://antismash.secondarymetabolites.org/#!/start, accessed on 31 August 2021), as can the R&D versions (R&D versions: https://bitbucket.org/antismash/, accessed on 31 August 2021) [56]. Profile hidden Markov models (pHMMs), as published by Medema et al., and the tool HMMER were used to find signature enzymes for the main categories of bioactive molecules [69]. The antiSMASH tool draws on the database of known BGCs from across the tree of life curated by the "Minimum Information about a Biosynthetic Gene cluster" (MIBiG) community project (MIBiG: http://mibig.secondarymetabolites.org, accessed on 31 August 2021). The current antiSMASH version, which includes the ClusterFinder and ClusterBlast packages, can detect potential unexplored forms of BGCs based on comparisons to existing BGCs and final chemical product information [56].
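Per-genome BGC counts of the kind reported in this study could be tallied from antiSMASH results along the following lines. This Python sketch is an assumption-laden illustration: it presumes a JSON-like layout with a `records` list whose `features` include entries of type `region` carrying a `product` qualifier, which should be verified against the actual output of the antiSMASH version used. The function name `count_products` and the example data are hypothetical.

```python
from collections import Counter


def count_products(result: dict) -> Counter:
    """Count predicted regions per product class in an antiSMASH-like result.

    ASSUMED layout: result["records"][i]["features"][j] is a dict with a
    "type" key and a "qualifiers" dict; region features list their product
    classes under qualifiers["product"].
    """
    counts = Counter()
    for record in result.get("records", []):
        for feature in record.get("features", []):
            if feature.get("type") != "region":
                continue
            products = feature.get("qualifiers", {}).get("product", [])
            if len(products) > 1:
                counts["hybrid"] += 1  # several classes in one region
            elif products:
                counts[products[0]] += 1
            else:
                counts["unknown"] += 1
    return counts


# Hypothetical result for one genome, not real antiSMASH output.
example_result = {
    "records": [
        {
            "features": [
                {"type": "region", "qualifiers": {"product": ["NRPS"]}},
                {"type": "region", "qualifiers": {"product": ["NRPS", "T1PKS"]}},
                {"type": "CDS", "qualifiers": {}},
                {"type": "region", "qualifiers": {"product": ["RiPP-like"]}},
            ]
        }
    ]
}
product_counts = count_products(example_result)
total_regions = sum(product_counts.values())
```

Counting multi-product regions as "hybrid" mirrors the way hybrid clusters are reported separately in the figures; other conventions are possible.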

PRISM 4
PRISM 4 analyzes open reading frames with a library of hundreds of hidden Markov models and curated BLAST databases to annotate bacterial genomes for BGCs, and allows genome-guided chemical structure prediction for every class of bacterial antibiotics currently in clinical use. Furthermore, PRISM 4 dramatically improves the coverage of enzymatic tailoring reactions encoded within conventional thiotemplated pathways. To predict the chemical structures of 16 different classes of secondary metabolites, PRISM 4 includes 1772 hidden Markov models (HMMs) and 618 in silico tailoring reactions. PRISM 4 is available as a freely accessible web server (PRISM: https://prism.adapsyn.com/, accessed on 31 August 2021).

BAGEL4
BAGEL4, a user-friendly web server, allows researchers to mine bacterial (meta-)genomic DNA for ribosomally synthesized and post-translationally modified peptides (RiPPs) and (unmodified) bacteriocins. BAGEL4 is the most recent edition of the BAGEL package. Owing to the need for new antibiotics and the crucial role of these compounds in food preservation, microbial ecology, and plant biocontrol, interest in these families of compounds is growing. BAGEL4 is available for free online (BAGEL4: http://bagel4.molgenrug.nl, accessed on 31 August 2021). It also includes databases as well as a BLAST search against the core peptide databases. The mining databases have been updated and expanded to include literature references as well as links to UniProt and NCBI. It also offers automatic promoter and terminator prediction, as well as the ability to submit RNA expression data to be displayed alongside the clusters found. Additional enhancements include the annotation of context genes, which is now based on a fast BLAST search against the prokaryote part of the UniRef90 database, and the enhanced web-BLAST function, which dynamically imports structural data from UniProt such as internal cross-linking.

NaPDoS-Analysis of C and KS Domains from NRPS and PKS Clusters
NaPDoS [68], which is accessible online (NaPDoS: https://npdomainseeker.sdsc.edu/, accessed on 31 August 2021), offers a fast way to extract and classify ketosynthase (KS) and condensation (C) domains from PCR products, genomes, and metagenomic datasets. Condensation (C) domains are functionally active protein sequences found in NRPS clusters that catalyze the formation of amide bonds, a key step in peptide elongation [70]. Likewise, in PKS clusters, ketosynthase (KS) domains catalyze the condensation step. These domains are good candidates for genomic study since they are highly conserved and can be used to differentiate between distinct NRPS/PKS natural product pathways. To uncover probable natural product pathways from NRPS and PKS gene clusters, the NaPDoS pipeline was used to compare C and KS domain sequences to a domain library from previously characterized natural products. Close database matches may be used to predict the generalized structures of secondary metabolites, whereas unique phylogenetic lineages can be used to discover new enzyme architectures or secondary metabolite assembly processes. The results provide a rapid method for analyzing secondary metabolite biosynthesis gene diversity and abundance in species or habitats, as well as a method for identifying genes associated with unknown biochemistry. The output from antiSMASH was used to extract the C and KS domains from the NRPSs and PKSs found in the 37 Pseudomonas species and P. fluorescens genomes, which were then examined using the NaPDoS web server with default parameters.

Conclusions
Less than 10 percent of microorganisms' biosynthetic capabilities are utilized in searching for bioactive NPs. Genome mining has tremendously benefited natural product development. Currently, the availability of genome sequences from diverse species of Pseudomonas and subspecies of P. fluorescens provides an excellent opportunity for comprehensive comparisons of their biosynthetic potential.
Here, by combining different computational tools, the species and sub-species genomic sequences of Pseudomonas were analyzed in silico and revealed a wide range of biosynthetic capabilities to produce diverse sets of secondary metabolites. These putative secondary metabolite coding clusters (BGCs) are promising targets for further research to uncover additional resources.
Large amounts of genomic data are now public, and significant progress has been made in data mining, chemical monitoring, single-cell techniques, and genetic approaches to pathway activation, making the cryptic metabolome accessible. New culturing methods, effective genome editing, and appropriate expression systems will eventually overcome key impediments to accessing hidden chemical diversity.
Notably, additional methodologies are required to link these biosynthetic genome motifs to their corresponding compounds and open a new era in the discovery of secondary metabolites. Specific triggers or stimuli are required to activate silent or downregulated gene clusters and enhance compound production rates, allowing access to these cryptic compounds [71].