Bacillus velezensis : A Treasure House of Bioactive Compounds of Medicinal, Biocontrol and Environmental Importance

: Bacillus velezensis gram-positive bacterium, is frequently isolated from diverse niches mainly soil, water, plantroots, andfermentedfoods. B.velezensis isubiquitous, non-pathogenicandendosporeforming. Being frequently isolated from diverse plant holobionts it is considered host adapted microorganism and recognized of high economic importance given its ability to promote plant growth under diverse biotic and abiotic stress conditions. Additionally, the species suppress many plant diseases, including bacterial, oomycete, and fungal diseases. It is also able after plant host root colonization to induce unique physiological situation of host plant called primed state. Primed host plants are able to respond more rapidly and/or effectively to biotic or abiotic stress. Moreover, B. velezenis have the ability to resist diverse environmental stresses and help host plants to cope with, including metal and xenobiotic stresses. Within species B. velezensis strains have unique abilities allowing them to adopt different life styles. Strain level abilities knowledge is warranted and could be inferred using the ever-expanding new genomes list available in genomes databases. Pangenome analysis and subsequent identification of core, accessory and unique genomes is actually of paramount importance to decipher species full metabolic capacities and fitness across diverse environmental conditions shaping its life style. Despite the crucial importance of the pan genome, its assessment among large number of strains remains sparse and systematic studies still needed. Extensive knowledge of the pan genome is needed to translate genome sequencing efforts into developing more efficient biocontrol agents and bio-fertilizers. In this study, a genome survey of B. velezensis allowed us to (a) highlight B. velezensis species boundaries and show that Bacillus suffers taxonomic imprecision that blurs the debate over species pangenome; (b) identify drivers of their successful acquisition of specific life styles and colonization of new niches; (c) describe strategies they use to promote plant growth and development; (d) reveal the unlocked strain specific orphan secondary metabolite gene clusters (biosynthetic clusters with corresponding metabolites unknown) that product identification is still awaiting to amend our knowledge of their putative role in suppression of pathogens and plant growth promotion, and (e) to describe a dynamic pangenome with a secondary metabolite rich accessory genome.

The interesting features of B. velesensis strains isolated from various substrates throughout time, led to a massive amount of work dedicated to the species. According to several research reports B. velezensis strains were previously classified as B. amyloliquefaciens [37], based on the 16S-rRNA gene sequence which exceeded 99% similarity between the two strains [28,38]. The confusion in the taxonomic naming between these two species was resolved by advanced genomic data analysis targeting the whole genome rather than a gene subset to determine the exact taxonomy of the strain. The novel phylogenetic placement of B. velezensis bacteria indicated that it shares a close phylogenomic resemblance with B. amyloliquefaciens subsp. plantarum, B. methylotrophicus, and B. oryzicola [27,35,37,39,40]. In 2017, all these strains were regrouped in a 'B. amyloliquefaciens operational group' containing three tightly linked branches including B. velezensis, B. amyloliquefaciens, and B. siamensis [17,26].
In this report, using a selected collection of 130 B. velezensis bacterial genomes publicly available in GenBank we performed phylogenomic analysis of the collection to decipher infra-species diversity of B. velezensis. Additionally, genome mining of secondary metabolites encoding gene clusters, as well as plant beneficial genes allowed us to shed the light and to have clues to explain the species life style and its biotechnological relevance as a biofertilizer. We clearly document that B. velezensis strains despite being isolated from different substrates (soil, water and plant material) represent the same species. Additionally, secondary metabolite clusters, as well as comparative genomic analysis allowed us to conclude that the species is represented by a very dynamic open pan genome. These findings legitimate the ongoing genomic sequencing efforts of newly discovered strains that is shading the light on the full biotechnological potential of the species.

Selection and Phylogenomic Analysis of B. velezensis Genomes
One hundred thirty-two B. velezensis genome sequences and annotated datasets were extracted from National Center for Biotechnology Information (http://www.ncbi.nlm.nih. -gov; accession date: 4 September 2021) and used in subsequent analyses. The CheckM program (1.0.9) [41] permitted estimation of completeness and contamination rates B. velezensis isolates genomes. All collected genomes had high quality draft genome sequences with over 90% completeness and below 10% contamination which explain their utilization in the current study. The global genomes alignments were performed using the REALPHY tool (http://realphy.unibas.ch; accession date: 10 September 2021) [42]. The phylogenomic tree was constructed using a Maximum-Likelihood (ML) algorithm [43] in the MEGA v. 6.0 software [44]. The evolutionary distances were measured by the kimura 2-parameter model [45]. The obtained tree branches were checked by bootstrap values with 1000 replications [4]. The average Nucleotide Identity (ANI) calculation between the selected B. velezensis strains genome sequences, was realized using the algorithm developed by [46]. The recommended cut-off was 95% between species members [47], as implemented in the EzBioCloud server (http://www.ezbiocloud.net/tools/ani, accession date: 25 September 2021) [46,48]. The genome-to-genome distance (GGDC) calculated by DSMZ software available at http://ggdc.dsmz.de (accession date: 10 October 2021) was used for genome-based delineation of species and subspecies [49]. The recommended cut-off for species delimitation was set at 70%.

Comparative Genomics Study of B. velezensis Strains Assembled Genomes
Comparative genome analysis of a collection of 65 B. velezensis isolates was performed using BPGA pipeline set at the recommended cut-off of 50% [50]. Appointment of core and pan genomes functional genes into COG categories within the BPGA pipeline was performed using the USEARCH program against standard COG database. For KEGG Orthology (KO) functional annotation, the predicted proteins originating from the bacterial B. velezensis genomes were analyzed by BlastKOALA software (http://www.kegg.jp/ blastkoala, accession date: 16 October 2021) [51].

Genome Mining for Functional Genes Responsible for Growth Promotion and Root Colonization
The flgBCDEGKLMN, flhABFOP and swrABC genes clusters were checked in the genomes of B. velezensis collection [65]. Homologs of che/fla/fli/tlp/mcp, and their relative clusters UCMB5113 and motABPS from the genomes of B. subtilis and B. velezensis, respectively, were used in genome mining searches in all B. velezensis isolates [60]. The xerCD genes were also mined [66]. Operon genes epsA-O from B. subtilis were mined according to previously described procedures [67].

Disease Resistance Induction in Plants
Genes pchBA resulting from P. aeruginosa were mined in the genomes collection. Additionally, Bacillus genome collection strains have been mined using B. velezensis SQR9 genes [59].

Antibiotics and Other Related Compounds
Pseudomonas spp. genes hcnABC and phlACBD were utilized for B. velezensis genomes mining [56]. gabD and gabT genes were also used to decipher the potential of the B. velezensis genomes collection [71].

Drugs Resistance
Homologues of the proteins tetB, tetR, and tetA of B. subtilis were searched in all B. velezensis genomes [72,73]. The streptothricin acetyltransferase encoding operon yyaACDE-HJKLRST was used as a bait in the screening of homologues in the concerned genomes [74]. The gene fosB resulting from B. cereus was also used to search for homologs in the B. velezensis genomes [75]. The B. licheniformis homolog gene ykcA was recovered for mined B. velezensis strains genomes searching [76]. The β-lactamase gene penP originating from B. subtilis strain 168 was used in homology search of the B. velezensis genomes collection [77]. Quinolone resistance homologs of the Staphylococcus norA was examined in the genomes of B. velezensis [78]. The E. coli gene floR have been homologs mined in the B. velezensis genomes [79]. B. subtilis 168 gene aadK, was mined in the targeted isolates [80]. B. subtilis gene ycbJ was explored in the mined strains [81]. B. subtilis gene vmlR was utilized for genome mining [82]. Multidrug exporter genes were mined from all genomes [60].

Heavy Metals Tolerance
Arsenic detoxification was searched through the use of genes arsABC and ywrK for the detection of putative ability [83]. The copYZAB operon, the component regulators CueR, CopY and CsoR [84], and ctpAB and ycnJ genes encoding copper resistance proteins [85] were all mined.
Homologs of the ynbB gene from B. subtilis strain 168 have been used as a bait in blast search for homologs in the Bacillus genome collection [77]. Homology of crcA, cspE, crcB, yhdV genes was used as a bait in the search for homologs in Bacillus targeted genomes [86]. The yceGH and yaaN homologs have been searched in all Bv genomes [87]. CzcD gene encoding cadmium, cobalt and zinc/H(+)-K(+) antiporter in B. subtilis and protecting cells against elevated Zn(II), Cu, Co(II), and Ni(II) levels [88] have been used in a blast search of homology in the Bv selected genomes. Homologs of gene ndoA (ydcE) and antitoxin gene, ndoAI (ydcD) have been mined [89]. Sensors for metals; Fur, ArsR, MerR, NikR, DtxR, mtnR, and yfmP family of metalloregulators of the B. subtilis genome were mined from the different B. velezensis genomes [90].

Aromatic Compounds Degradation
Toxins, such as vanillate, 4-hydroxybenzoate, salicylic, ferulic, and p-coumaric phenolic acids could engender stress responses in microorganisms. padC and bsdBCD (yclBCD) phenolic acid decarboxylases of B. subtilis were mined. We also mined the bsdA (yclA) gene upstream of the bsdBCD operon encoding the LysR-type regulator is the transcriptional activator of bsdBCD expression in response to phenolic acids [91]. Dibenzothiophene (DBT) is the model compound for sulfur-containing organic molecules. The operon dszABC of Rhodococcus sp. [92] was used as a bait in homology-based searches in B. velezensis genomes. Genome mining of genes encoding homologs of the B. velezenzis FZB42 azoR2, mhqADNOPE was realized [93].

Determination of the Core and Accessory Genomes of the B. velezensis Isolates
The core genome, defined as the sequences present in nearly all genomes of a given species, was determined using Spine from the B. velezensis isolates sequences. On the other hand, the sequences of accessory genome of B. velezensis isolates were identified using Agent [94].

Results
A phylogenetic tree of a total 130 B. velezensis bacterial genomes was constructed ( Figure 1A) based on the genome relationships among Bv genomes. Bacterial genome size was in the range of 3.38 to 4.97 Mb and the average GC content was between 45.23% and 46.8%. The bacterial collection was divided into four levels of genome sequencing ( Figure 1B), where 49% of Bv strains were completely sequenced, 36% were having contig genomes, 11% with scaffold genomes, and 4% with chromosome genomes. Figure 1C demonstrated that almost 38% of the assembled bacteria were isolated from China, 22% were isolated from South Korea, 19% from Ukraine and 5% from the United States. The 16 remaining bacteria were isolated from several other locations, such as Poland, Germany, and Canada. B. velezensis collection were isolated from various substrates ( Figure 1D), in fact, almost half of them originated from soil (44%), 6% came from the above ground parts of plants, 4% from either slow sand biofilter or uncut heroin sample, 3% of the totality of Bv bacteria were originally isolated from water or plants roots and bulbs or as endophytes, 1% originated from air, and the remaining 32% came from other different sources. Figure 1E shows that the studied bacterial collection was dotted with various bacterial activities, in fact, 59% bacteria were having a role in suppressing plant and animal pathogens, 31% has plant growth promoting abilities, 8% were successfully used in environmental bioremediation and 2% were having anti-inflammatory activities.
Genomes of 65 strains of B. velezensis showed that the number of new genes decreased conversely to the increase in analyzed genomes ( Figure 2). Average nucleotide identity (ANI) clearly showed that four putative species within B. velezensis group could be clearly observed, with ≥98% cut-off between B. velezensis species (Supplementary Figure S1) [47]. Figure 3A evidently shows that the pan genome and the number of analyzed genomes (75 genomes) are increasing proportionally. The pan genome of Bv species was increasing with a value of 0.15 (α = 0.15), indeed, it could be described as an open pan-genome according the Heap's law [95]. The investigation of COG pathways showed that the core genes were responsible for indispensable functions of the microorganism including transcription, translocation, ribosomal structure, and biogenesis. The accessory genes were controlling amino acid transport and metabolism. Lastly, the unique genes were associated with the defense metabolisms and the secondary metabolites biosynthesis, transport and catabolism ( Figure 3B). The analysis of KEGG distribution of the B. velezensis genomes were manifested in Figure 3C, where the metabolism of amino acids, terpenoids and polyketides and the xenobiotic biodegradation were mainly positioned in the unique genome. The metabolism of nucleotides, cofactors and vitamins and the signal transduction were manipulated by the accessory genome, while the carbohydrate metabolism, the excretory system, the cell motility and translocation were mainly located in the B. velezensis core genome.   The genome mining homology of genes encoding beneficial functions to plants, was conducted on the assembled B. velezensis collection (Supplementary Figure S2). Genes encoding plant beneficial functions involve: (i) genes conferring PGPR fitness, (ii) genes accorded to nutrient acquisition, (iii) genes responsible of root colonization and growth enhancement, (iv) hormones promoting plant growth, (v) plant safeguard from oxidative stress through antioxidant enzymes, (vi) assistance of disease resistance in plants mainly salicylate, acetoin and 2,3-butandiol biosynthesis, (vii) antibiotics and related compounds, (viii) tolerance to drugs and heavy metals, and (ix) aromatic compounds decomposition. Results shows that the majority of strains were endowed with PGP capacities (Supplementary Figure S2) and that the presence of mined genes is independent of the stains association to plant rhizosphere or of the average of the genome coverage.

Secondary Metabolites Biosynthesis Capacities of B. velezensis Pan, Core and Accessory Genomes
Clusters harboring secondary metabolites from 130 B. velezensis genomes were uncovered using five distinct software antiSMASH version 3.0 [52], PRISM, NapDos [96], NP.search, and BAGEL version 3 [55]. All mentioned programs confirmed that multiple Bv bacteria were having diverse secondary metabolites clusters ( Figure 4A and Supplementary Table S1). The identified secondary metabolites (Supplementary Table S1) included either clusters whose structure has been sufficiently characterized/identified (Macrolactin, Bacillaene, Fengycin, Difficidin, Bacillibactine, etc.) or clusters encoding new products. In Figure 4B,C, both antiSMASH and PRISM programs demonstrated that there is no good correlation between the genome size of B. velezensis bacterial collection and number clusters evolved in secondary metabolites biosynthesis. The assembled B. velesensis genomes were subjected to a rarefaction analysis of the clusters of genes encoding secondary metabolites resulting ( Figure 4D). Results underlined that saturation has not yet been attained.  Figure S2). Genes encoding plant beneficial functions involve: (i) genes conferring PGPR fitness, (ii) genes accorded to nutrient acquisition, (iii) genes responsible of root colonization and growth enhancement, (iv) hormones promoting plant growth, (v) plant safeguard from oxidative stress through antioxidant enzymes, (vi) assistance of disease resistance in plants mainly salicylate, acetoin and 2,3-butandiol biosynthesis, (vii) antibiotics and related compounds, (viii) tolerance to drugs and heavy metals, and (ix) aromatic compounds decomposition. Results shows that the majority of strains were endowed with PGP capacities (Supplementary Figure S2) and that the presence of mined genes is independent of the stains association to plant rhizosphere or of the average of the genome coverage.

Secondary Metabolites Biosynthesis Capacities of B. velezensis Pan, Core and Accessory Genomes
Clusters harboring secondary metabolites from 130 B. velezensis genomes were uncovered using five distinct software antiSMASH version 3.0 [52], PRISM, NapDos [96], NP.search, and BAGEL version 3 [55]. All mentioned programs confirmed that multiple Bv bacteria were having diverse secondary metabolites clusters ( Figure 4A and Supplementary Table S1). The identified secondary metabolites (Supplementary Table S1) included either clusters whose structure has been sufficiently characterized/identified (Macrolactin, Bacillaene, Fengycin, Difficidin, Bacillibactine, etc.) or clusters encoding new products. In Figure 4B,C, both antiSMASH and PRISM programs demonstrated that there is no good correlation between the genome size of B. velezensis bacterial collection and number clusters evolved in secondary metabolites biosynthesis. The assembled B. velesensis genomes were subjected to a rarefaction analysis of the clusters of genes encoding secondary metabolites resulting ( Figure 4D). Results underlined that saturation has not yet been attained.

Prediction of Secondary Metabolites Richness and Location within B. velezensis Genomes
Results demonstrated that the secondary metabolites prediction in the core-and accessory-genomes of Bv collection underlined the big amount of unknown secondary metabolites. Figure 5A shows that all secondary clusters were carried by the accessory genome except for Basilysin and Amylocyclicin existing inside the core genome as well. AntiSMASH program reveal that the size of the accessory genome is significantly correlated to the number of secondary products genes clusters, with a percentage of 9.6% ( Figure 5B).     Results demonstrated that the secondary metabolites prediction in the core-and accessory-genomes of Bv collection underlined the big amount of unknown secondary metabolites. Figure 5A shows that all secondary clusters were carried by the accessory genome except for Basilysin and Amylocyclicin existing inside the core genome as well. An-tiSMASH program reveal that the size of the accessory genome is significantly correlated to the number of secondary products genes clusters, with a percentage of 9.6% ( Figure  5B).

Prediction of Secondary Metabolites Richness and Location within B. velezensis Genomes
Results demonstrated that the secondary metabolites prediction in the core-and accessory-genomes of Bv collection underlined the big amount of unknown secondary metabolites. Figure 5A shows that all secondary clusters were carried by the accessory genome except for basilysin and amylocyclicin existing inside the core genome as well. An-tiSMASH program reveal that the size of the accessory genome is significantly correlated to the number of secondary products genes clusters, with a percentage of 9.6% ( Figure  5B).

Prediction of Secondary Metabolites Richness and Location within B. velezensis Genomes
Results demonstrated that the secondary metabolites prediction in the core-and accessory-genomes of Bv collection underlined the big amount of unknown secondary metabolites. Figure 5A shows that all secondary clusters were carried by the accessory genome except for basilysin and amylocyclicin existing inside the core genome as well. AntiSMASH program reveal that the size of the accessory genome is significantly correlated to the number of secondary products genes clusters, with a percentage of 9.6% ( Figure 5B).
Results presented in Figure 6 showed that the highest percentage of genes were dedicated for genetic information processing. Amino acid metabolism, cofactors carbohydrate, and vitamins metabolisms were having an equal percentage of dedicated genes. Genes involved in the nucleotide metabolism and energy metabolism are other important ones with high percentages.
Supplementary Figure S3 presented the prediction of genes responsible for extracellular enzymes production. Almost all studied bacterial genomes were having both xylanases and proteases encoding genes. However, amylase encoding genes were present in 74% of studied bacterial genomes and lipases encoding genes were present in 34% of studied bacterial genomes only.
Genes involved in the nucleotide metabolism and energy metabolism are other important ones with high percentages.
Supplementary Figure S3 presented the prediction of genes responsible for extracellular enzymes production. Almost all studied bacterial genomes were having both xylanases and proteases encoding genes. However, amylase encoding genes were present in 74% of studied bacterial genomes and lipases encoding genes were present in 34% of studied bacterial genomes only.

Discussion
In recent years, extensive studies have been reported on B. velezensis species due to their possibility of utilization in agricultural, environmental, and industrial sectors [18,97]. The advanced technologies of genome sequencing facilitated the understanding of the genetic relationships and the genomic and metabolic characteristics of B. velezensis bacteria [13][14][15][16]35,98,99]. In fact, the genome of more than 200 species have been sequenced from 2007 till now [18,26,39,100]. In the current work, genomes sequences of 130 B. velezensis strains were selected from the NCBI and comparatively studied. Our results demonstrated that the majority of B. velezensis bacterial strains were isolated from soil and were devoted to the biocontrol and plant growth promotion. Similar findings were earlier confirmed by several authors, such as Meng et al. [101] and Grady et al. [102], who mentioned that Bv has been extensively utilized as an antagonist agent against a wide range of microbial pathogens in agricultural sector [101,102]. Additionally, Wu et al. [2] and Liu et al. [35] reported that bio-formulation of B. velezensis was an effective alternative to chemical pesticides [2,35]. This strain is not well exploited in the medicinal and industrial fields as mentioned by Adeniji and Babalola [5] despite its anticancer properties [103] and its capacity to and degrade harmful industrial products [104].
Here, we explore the genomic relatedness among assembled B. velezensis bacteria through the average nucleotide identity (ANI) calculation. We speculate that it indicated four prominent subspecies within B. velezensis species despite their genomic similarities (98%) above the cut-off point. Genomic differences between closely related bacteria might be explained by mutation and recombination phenomena [36,37,105]. In silico analysis of the assembled B. velezensis strains had led to an open pan-genome that is increasing proportionally to the addition of new sequenced genomes. This finding confirms that Bv bacteria are in constant evolution and that saturation was not yet been established [4,106,107]. Additionally, Adeniji and Babalola [5] mentioned that according to genomic analysis, isolated Bv bacteria are still showing new distinct characteristics [5]. Figure 3B,C covers COG pathways and KEGG distribution, respectively. They indicate that genes responsible for fundamental functions are distributed in the core genome,

Discussion
In recent years, extensive studies have been reported on B. velezensis species due to their possibility of utilization in agricultural, environmental, and industrial sectors [18,97]. The advanced technologies of genome sequencing facilitated the understanding of the genetic relationships and the genomic and metabolic characteristics of B. velezensis bacteria [13][14][15][16]35,98,99]. In fact, the genome of more than 200 species have been sequenced from 2007 till now [18,26,39,100]. In the current work, genomes sequences of 130 B. velezensis strains were selected from the NCBI and comparatively studied. Our results demonstrated that the majority of B. velezensis bacterial strains were isolated from soil and were devoted to the biocontrol and plant growth promotion. Similar findings were earlier confirmed by several authors, such as Meng et al. [101] and Grady et al. [102], who mentioned that Bv has been extensively utilized as an antagonist agent against a wide range of microbial pathogens in agricultural sector [101,102]. Additionally, Wu et al. [2] and Liu et al. [35] reported that bio-formulation of B. velezensis was an effective alternative to chemical pesticides [2,35]. This strain is not well exploited in the medicinal and industrial fields as mentioned by Adeniji and Babalola [5] despite its anticancer properties [103] and its capacity to and degrade harmful industrial products [104].
Here, we explore the genomic relatedness among assembled B. velezensis bacteria through the average nucleotide identity (ANI) calculation. We speculate that it indicated four prominent subspecies within B. velezensis species despite their genomic similarities (98%) above the cut-off point. Genomic differences between closely related bacteria might be explained by mutation and recombination phenomena [36,37,105]. In silico analysis of the assembled B. velezensis strains had led to an open pan-genome that is increasing proportionally to the addition of new sequenced genomes. This finding confirms that Bv bacteria are in constant evolution and that saturation was not yet been established [4,106,107]. Additionally, Adeniji and Babalola [5] mentioned that according to genomic analysis, isolated Bv bacteria are still showing new distinct characteristics [5]. Figure 3B,C covers COG pathways and KEGG distribution, respectively. They indicate that genes responsible for fundamental functions are distributed in the core genome, genes encoding secondary metabolites biosynthesis are mainly located in the accessory genomes and that the bacterial species diversity is in the unique genome. These finding are in accordance with others obtained by Carlos Guimaraes et al. [4], Belbahri et al. [106], and Slama et al. [108].
The genetic mechanisms of B. velezensis genes encoding beneficial functions to plants were explored through genome mining (Supplementary Figure S2). The obtained results were proved by various other reports indicating that B. velezensis strains were endowed with plant growth promoting capacities [4,18,32,39,106,109]. Additionally, our results showed that B. velezensis bacteria were able to produce diverse secondary metabolites clusters endowed with antimicrobial activities. The clusters were identified through antiSMASH, non-ribosomal peptide synthetases (NRPS), polyketide synthetases (PKS), trans-Acyl Transferase Polyketide Synthetase (TransATPKS), and several other tools. This finding is in accordance with other reports confirming the secondary metabolites potentials of B. velezensis strains [4,34,60,106]. The rarefaction analysis of the number of discovered antibiotics underlined that saturation has not yet been attained which means that there are clusters encoding novel metabolites that are not described yet. In the same way Cai et al. [29] and Pandin et al. [10] have found clusters encoding new metabolites specific to B. velezensis species.
Clustering of the B. velezensis core-and accessory genome resulted in a high number of unknown secondary metabolites and proved that almost all secondary metabolites were synthesized by the accessory genome ( Figure 5). A set of secondary metabolites were common in several B. velezensis bacteria, such as cyclic lipopeptides (surfactin, fengycin, bacillibactin, and bacilycin), polyketides (difficidin, macrolactin, and bacillaene) and some others were specific for few bacteria (locillomycin, subtilin, mersacidin, bacillomycin, and amylocyclicin). All mentioned secondary metabolites were identified as effective antibacterial and antifungal compounds [28,35,[110][111][112]. Data from bioinformatic analysis of Figure 6 demonstrated the percentage of genes given to a specific function (i.e., genetic information processing, amino acid metabolism, vitamins metabolism, etc.) is related to its importance (fundamental or secondary function). Similar results have been previously reported [107].
The prediction of genes responsible for extracellular enzymes production (Supplementary Figure S3) demonstrated that protease encoding gens were present in almost all bacteria especially theses originating from fermented food. This finding is supported by several other reports indicating that bacteria having strong proteolytic activities could accelerate brewing and fermentation processes [113][114][115][116][117].

Conclusions
This study provide evidence that the collected B. velezensis bacterial strains were very closely related despite the fact that they originate from different sources and locations. Additionally, the comparative genomic analysis shed light on the secondary metabolites richness which help in the plant growth promotion and biocontrol capacities of B. velezensis strains and their possible bio-formulation as biopesticides.