Genome Annotation of Poly(lactic acid) Degrading Pseudomonas aeruginosa, Sphingobacterium sp. and Geobacillus sp.

Pseudomonas aeruginosa and Sphingobacterium sp. are well known for their ability to decontaminate many environmental pollutants, while Geobacillus spp. have been exploited for their thermostable enzymes. This study reports the annotation of the genomes of P. aeruginosa S3, Sphingobacterium S2 and Geobacillus EC-3, which were isolated from compost based on their ability to degrade poly(lactic acid) (PLA). Draft genomes of the strains were assembled from Illumina reads, annotated and viewed with the aim of gaining insight into the genetic elements involved in the degradation of PLA. The draft genome of Sphingobacterium strain S2 (435 contigs) was estimated at 5,604,691 bp and the draft genome of P. aeruginosa strain S3 (303 contigs) was estimated at 6,631,638 bp. The draft genome of the thermophile Geobacillus strain EC-3 (111 contigs) was estimated at 3,397,712 bp. A total of 5385 (60% with annotation), 6437 (80% with annotation) and 3790 (74% with annotation) protein-coding genes were predicted for strains S2, S3 and EC-3, respectively. Catabolic genes for the biodegradation of xenobiotics, aromatic compounds and lactic acid, as well as genes attributable to the establishment and regulation of biofilm, were identified in all three draft genomes. Our results reveal essential genetic elements that facilitate PLA metabolism at mesophilic and thermophilic temperatures in these three isolates.


Introduction
The lactic acid required to produce poly(lactic acid) (PLA) through polycondensation is derived primarily from the microbial fermentation of agro-industrial waste. Frequently, waste by-products from crops like cassava, wheat, bran, corn, etc., serve as the fermentable substrates [1,2]. PLA is certified as completely biodegradable under industrial composting conditions [3]. In the last two decades, the biodegradation of PLA has been extensively studied and many microbial species (actinomycetes, bacteria and fungi) have been identified with the ability to degrade PLA [4]. Most of the reported bacterial species are from the families Pseudonocardiaceae, Thermomonosporaceae, Micromonosporaceae, Streptosporangiaceae, Bacillaceae and Thermoactinomycetaceae, while the fungal species are mainly from the phyla Basidiomycota (Tremellaceae) and Ascomycota (Trichocomaceae, Hypocreaceae) [5][6][7][8][9][10][11].
In our previous study on mesophilic strains, we described four bacterial isolates designated as S1, S2, S3 and S4, able to degrade PLA at ambient temperature [12]. Two of the isolated strains, Sphingobacterium sp. (S2) and P. aeruginosa (S3), were evaluated for
Table 1 shows the general features of the assembled genomes. The assembly of the draft genome for strain S2 yielded 87 contigs (maximum length 434,971 bp) and a total of 5,445,390 assembled base pairs with 43.66% GC content. There were an estimated 4951 CDS regions, 2864 proteins with functional assignments and several antibiotic resistance genes. The assembled draft genome of S3 had 63 contigs (maximum length 658,980 bp) with a total of 6,509,961 assembled base pairs and 66.26% GC content. There were an estimated 6239 CDS regions, 4932 proteins with functional assignments and a substantial number of putative antibiotic resistance and virulence determinants. As a thermophile, Geobacillus strain EC-3 had a smaller genome with 111 contigs and a total of 3,397,712 assembled base pairs, with the largest contig being 321,190 bp. The GC content was 52.18%, with an estimated 3790 CDS regions and 2806 proteins with functional assignments. Because these were draft sequences, i.e., not closed, the number of chromosomes and plasmids could not be determined.
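The share of predicted coding sequences that received a functional assignment can be derived directly from the counts quoted above. The following is only an illustrative sketch: the function name and dictionary layout are assumptions made for this example and are not part of the original analysis pipeline.

```python
# Counts taken from the assembly summary above:
# strain -> (estimated CDS regions, proteins with functional assignments).
GENOME_FEATURES = {
    "Sphingobacterium sp. S2": (4951, 2864),
    "P. aeruginosa S3": (6239, 4932),
    "Geobacillus sp. EC-3": (3790, 2806),
}


def annotated_percent(cds: int, assigned: int) -> float:
    """Return the percentage of CDS regions with a functional assignment."""
    if cds <= 0:
        raise ValueError("cds must be a positive integer")
    return 100.0 * assigned / cds


if __name__ == "__main__":
    for strain, (cds, assigned) in GENOME_FEATURES.items():
        pct = annotated_percent(cds, assigned)
        print(f"{strain}: {assigned} of {cds} CDS annotated ({pct:.1f}%)")
```

Keeping the raw counts next to the derived percentage makes it easy to recompute the figures if the annotation is rerun with a newer database.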

Genetic Relatedness Based on ANI
The average nucleotide identity (ANI) value describes the similarity between the sequences of the conserved regions of two genomes and measures the genetic relatedness between them [35]. ANI measurements are considered more informative than 16S rRNA gene identity because they are based on a larger number of genes [36], and genomes with ANI values ≥ 95% are considered to belong to the same species [37]. ANI comparisons were done to explore the interspecies genetic relatedness between our three isolates and related species in the public databases. The data point in the upper-right corner of Figure 1 represents S. thalpophilum DSM11723 and S. thalpophilum NCTC11429, both of which showed more than 98% 16S rRNA gene similarity and 98% ANI with strain S2. Strain NCTC11429, isolated from a human wound (NCBI Biosample SAMEA3643323), was the closest match to our strain.

Figure 1. Correlation between 16S rRNA gene similarity and average nucleotide identity (ANI) of sequenced strains. Sphingobacterium S2, P. aeruginosa S3 and Geobacillus EC-3 were compared to their closest relatives by 16S rRNA similarity (abscissa) and ANI (ordinate). The closest relatives, including S. thalpophilum DSM11723, were used in the MAUVE alignments below. Note that the 16S rRNA sequences came from prior sequencing of the 16S rRNA gene [12,31].
In the case of P. aeruginosa strain S3, Figure 1 shows a cluster in the upper-right corner containing closely related strains with 98-99% sequence similarity based on 16S rRNA sequences and 93-97% ANI. P. aeruginosa PSE305 showed the highest ANI value (97.69%) and 16S rRNA gene identity (99%). Accordingly, P. aeruginosa strain S3 and the other strains included in the analysis belong to the same species. The results are consistent with the phylogenetic results based on 16S rRNA gene identity as previously reported [12], but the ANI analysis indicated that our strain was closest to P. aeruginosa PSE305, rather than P. aeruginosa BUP2. Based on 16S rRNA comparative sequence analysis, P. aeruginosa O12 PA7 showed 99% sequence similarity to P. aeruginosa strain S3 but had an ANI value of <95%.
For Geobacillus sp. EC-3, Figure 1 shows a cluster in the upper-right corner containing closely related strains with more than 97% sequence similarity based on both 16S rRNA sequences and ANI analysis. The closest completed Geobacillus genome in the public databases was that of Geobacillus thermoleovorans strain CCB_US3_UF5. The proximity of Geobacillus sp. EC-3 to G. thermoleovorans (99.4% ANI and 99.8% 16S rRNA) suggests that EC-3 is a strain of G. thermoleovorans.
Sphingobacterium sp. strain S2 and P. aeruginosa strain S3 were isolated from compost samples, while the most closely related strains, S. thalpophilum DSM11723 and NCTC11429 and P. aeruginosa PSE305, were isolated from human clinical samples. While Pseudomonas aeruginosa is a well-documented pathogen, Sphingobacterium has also been identified as a potential opportunistic pathogen [38,39]. Geobacillus EC-3 was also isolated from compost, while the closest strain in the database, G. thermoleovorans strain CCB_US3_UF5, was isolated from Ulu hot springs in Malaysia [40].
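The species-level interpretation of the ANI values above can be expressed as a simple threshold check. This is a minimal sketch assuming the ≥ 95% cutoff cited in the text [37]; the function and variable names are hypothetical, and the ANI values are those reported in this section rather than values recomputed from the genome sequences.

```python
# Species cutoff (percent ANI) as cited in the text [37].
SPECIES_ANI_THRESHOLD = 95.0

# ANI values (percent) reported in this section: (isolate, closest relative) -> ANI.
REPORTED_ANI = {
    ("Sphingobacterium sp. S2", "S. thalpophilum NCTC11429"): 98.0,
    ("P. aeruginosa S3", "P. aeruginosa PSE305"): 97.69,
    ("Geobacillus sp. EC-3", "G. thermoleovorans"): 99.4,
}


def same_species(ani_percent: float, threshold: float = SPECIES_ANI_THRESHOLD) -> bool:
    """Return True when an ANI value meets the species-level cutoff."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return ani_percent >= threshold


if __name__ == "__main__":
    for (isolate, relative), ani in REPORTED_ANI.items():
        verdict = "same species" if same_species(ani) else "different species"
        print(f"{isolate} vs {relative}: ANI {ani:.2f}% -> {verdict}")
```

A pair such as strain S3 and P. aeruginosa O12 PA7, which shares 99% 16S rRNA similarity but falls below 95% ANI, would be flagged as a different species by this check, which illustrates why ANI is treated as the more discriminating measure.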

MAUVE and MeDuSa Alignments
MAUVE and MeDuSa were used to find sequence homologies of the three isolates with the strains identified as closest in the database by ANI calculations. The alignments shown in Figure 2 were derived by first aligning the SPAdes-assembled contigs against the reference strains (S. thalpophilum NCTC11429, P. aeruginosa PSE305 and G. thermoleovorans strain CCB_US3_UF5) with MeDuSa [41], followed by alignment in MAUVE [42] using the MeDuSa-ordered scaffolds of the isolates.
The MeDuSa alignment of contigs from the assembled Sphingobacterium S2 against S. thalpophilum NCTC11429 reduced the contig count from 87 to 26. P. aeruginosa PSE305 was used as the reference strain in MeDuSa for strain S3, which reduced the contig count from 63 to 9. MeDuSa reduced the 111 contigs of the SPAdes-assembled Geobacillus sp. EC-3 to 4. Overall, the genomes of these isolates had good synteny with their closest relatives as measured by genome assembly and alignments in MeDuSa and MAUVE. Of particular interest were the assembly and alignments of Geobacillus sp. EC-3. EC-3 appears closest to G. thermoleovorans strain CCB_US3_UF5. However, there are several recent reports of Geobacillus genome projects and at least three isolates, strains ARTRW1, FJAT2391 and KCTC 3570, have a very similar genome organization, as evaluated in MAUVE, in comparison with EC-3 and CCB_US3_UF5, although they were isolated from fairly diverse sources. Strain ARTRW1 was isolated from hot springs in Turkey [43], strain FJAT2391 came from soil in China (GenBank Accession # PRJNA340206), and strain KCTC 3570 (GenBank Accession # PRJNA310809) came from soil near a hot effluent stream in Pennsylvania, USA. The alignment of the genomes of these strains is presented in Figure S1 (Supplemental Information). This conservation of genomic architecture suggests strong selective pressure.
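The effect of reference-guided scaffolding on assembly fragmentation can be summarized as a percentage reduction in contig count. The sketch below only restates the before and after counts reported above; the helper name is an assumption for illustration and it does not call MeDuSa or MAUVE.

```python
# Contig counts before and after MeDuSa scaffolding, as reported above.
CONTIG_COUNTS = {
    "Sphingobacterium sp. S2": (87, 26),
    "P. aeruginosa S3": (63, 9),
    "Geobacillus sp. EC-3": (111, 4),
}


def contig_reduction_percent(before: int, after: int) -> float:
    """Return the percentage drop in contig count after scaffolding."""
    if before <= 0 or after < 0 or after > before:
        raise ValueError("expected 0 <= after <= before and before > 0")
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for strain, (before, after) in CONTIG_COUNTS.items():
        pct = contig_reduction_percent(before, after)
        print(f"{strain}: {before} -> {after} scaffolds ({pct:.1f}% fewer)")
```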

Genetic Systems and Metabolism
The distribution of identified genes to cell subsystems is shown in Figure 3. The distribution patterns are roughly the same between strains, with the exception that a greater percentage of the genomes of S3 and EC-3 are devoted to "Cell Processing" compared to S2, whereas S2 has a greater genetic commitment to "Membrane Transport".
Sphingobacterium spp. are aerobic, Gram-negative rods that exhibit sliding motility and form yellow-pigmented colonies. Sphingobacterium have been isolated from diverse environments such as soil, water, compost and deserts, as well as from blood and urine samples from human patients. A distinctive feature of Sphingobacterium is the presence of sphingolipids at relatively high concentrations in their cell wall [15,44]. In Sphingobacterium S2, 1589 ORFs were assigned metabolic functions. A total of 269 genes were identified as participating in amino acid metabolism, while 329 genes were dedicated to carbohydrate metabolism. Energy metabolism and lipid metabolism had 183 and 149 genes identified, respectively, while 76 genes were assigned to xenobiotic biodegradation and metabolism (Figure 4).
P. aeruginosa is a Gram-negative bacterium able to grow aerobically on a wide range of substrates and anaerobically when nitrate is available as the terminal electron acceptor. It is capable of thriving in highly diverse and unusual ecological niches with low availability of nutrients. Its metabolic versatility allows it to use a variety of carbon sources, including certain disinfectants [45]. Moreover, it can synthesize a number of antimicrobial compounds [24,25]. Using genome annotation through PATRIC, 1084 ORFs in P. aeruginosa S3 were assigned to metabolic pathways. Among the major pathways, 532 genes were assigned to amino acid metabolism and 241, 401, 194 and 219 genes were assigned to energy metabolism, carbohydrate metabolism, metabolism of cofactors and vitamins, and lipid metabolism, respectively. A total of 266 genes were identified as involved with xenobiotics, while 136 and 174 genes were linked to nucleotide metabolism and biosynthesis of secondary metabolites, respectively.
Geobacillus spp. are Gram-positive, spore-forming rods that are widely distributed. They are capable thermophiles and have been routinely isolated from hot springs and compost [40,46]. In strain EC-3, 1662 ORFs were assigned to metabolic functions, with 357 devoted to amino acid metabolism, 334 assigned to carbohydrate metabolism, and 169 and 129 assigned to energy and lipid metabolism, respectively. While there are no glaring differences in gene distributions between the three genomes, it is curious to note that Sphingobacterium S2 devotes twice as much genetic power to glycan biosynthesis and metabolism. This indicates that Sphingobacterium can express various enzymes involved in the synthesis and degradation of glycans that may be of value for various biotechnological applications.

Xenobiotic Biodegradation Metabolism
The role of P. aeruginosa in the degradation of large complex molecules including PAHs, xenobiotic compounds, oil, dyes and plastics is well documented [26][27][28][29][30]. Sphingobacterium spp. have also been reported to have a potential role in the biodegradation of different pollutants, including mixed plastic waste, PAHs, oil and pesticides [19][20][21]. Geobacillus spp. have been identified that can reduce azo dyes and degrade alpha-naphthol and nylon [47][48][49]. Table 2 describes the major pathways and the number of genes related to the biodegradation of different xenobiotic compounds in strains S2, S3 and EC-3. According to sequence analysis, numerous pathways for the biodegradation of xenobiotic compounds were found in the three strains, with approximately three times as many in P. aeruginosa S3 as in Sphingobacterium sp. S2 or Geobacillus sp. EC-3. This may be due, in part, to the fact that P. aeruginosa is more comprehensively studied than Sphingobacterium and Geobacillus. Among all the degradation pathways found in the three strains, the degradation of benzoate in P. aeruginosa had the greatest number of annotated genes. Benzoate, an aromatic compound, has been widely used as a model for the study of the bacterial catabolism of aromatic compounds [50]. Genetic evidence for the ability to degrade 1,4-dichlorobenzene, 2,4-dichlorobenzoate and benzoate was detected in all three genomes. Many genes related to the degradation pathways of one of the most important classes of pollutants, PAHs, such as naphthalene, anthracene, and 1- and 2-methylnaphthalene, were found in all three strains. Among the halogenated organic compounds, 29 genes were dedicated to tetrachloroethene degradation in P. aeruginosa S3, while only 6 and 8 were found in Sphingobacterium sp. S2 and Geobacillus sp. EC-3, respectively. For aromatic compounds and chlorinated aromatic compounds, pathways for the biodegradation of toluene, trinitrotoluene, xylene, 1,4-dichlorobenzene and 2,4-dichlorobenzoate were also found. Genes for the biodegradation of bisphenol A, one of the most abundantly produced chemicals released into the environment and a serious health concern and environmental pollutant, were also found in P. aeruginosa S3 [51] and Sphingobacterium sp. S2.

Lactate Metabolism
Lactate utilization as the sole carbon source is a property of many bacteria, in which a key step of the process is the oxidation of lactate [52][53][54][55][56]. Lactate dehydrogenases found in microbes are of two types, NAD-dependent lactate dehydrogenases (nLDHs) and NAD-independent lactate dehydrogenases (iLDHs), also called respiratory lactate dehydrogenases. The latter type is usually considered to be the enzyme mainly responsible for the metabolism of lactate as a carbon source [53]. The lactate utilization system comprises three main membrane-bound proteins: NAD-independent L-lactate dehydrogenase (L-iLDH), NAD-independent D-lactate dehydrogenase (D-iLDH), and a lactate permease (LldP). Lactate permease is responsible for the uptake of lactate into the cells, and the lactate dehydrogenases carry out the oxidation of lactate to pyruvate [55,57]. Lactate utilization has been observed in some pathogens [55], stimulating their growth during infections while enhancing the synthesis of pathogenic determinants and increasing resistance against various bactericidal mechanisms [55]. The utilization of lactate by different Pseudomonas strains is well documented [56][58][59][60]. In the sequence analysis of our P. aeruginosa strain S3, a complete cascade of genes was found encoding the machinery for lactate utilization, including a lactate permease, both L- and D-lactate dehydrogenases and a lactate-responsive regulator, LldR (Table 3). This strain was isolated and characterized for its potential to degrade poly(lactic acid), and its ability to utilize lactate as a sole carbon source was established in our previous study [12]. The presence of the lactate utilization machinery found through genome sequencing is consistent with previous observations that L-iLDH and D-iLDH are both present in a single operon and are induced coordinately in Pseudomonas strains, and that expression of both enzymes is controlled by the presence of an enantiomer of lactate [59].
In the genome analysis of our isolate Sphingobacterium sp. S2, we found an incomplete set of genes for lactate metabolism. Both L-lactate dehydrogenase and D-lactate dehydrogenase were present, but no lactate permease was detected (Table 3). The absence of lactate permease is consistent with our previous findings, which showed that Sphingobacterium sp. S2 did not utilize lactic acid as the sole source of carbon. Inability to grow on lactic acid has been previously reported in the literature for different strains of Sphingobacterium [18,61]. The fact that these strains were isolated based on their ability to degrade PLA suggests that a degradation product other than lactate is involved.
Geobacillus sp. EC-3 was capable of growing on lactate as the sole carbon source and we detected lactate dehydrogenase and lactate permease in strain EC-3. In addition, we identified lactate utilization proteins LutA, LutB and LutC, along with the lactate responsive regulator LutR. These lactate utilization proteins are discussed below, relative to biofilm formation.
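The reasoning in this section, that growth on lactate requires both uptake and oxidation, can be written down as a small presence/absence check. This is a deliberately simplified sketch: the rule (permease plus at least one lactate dehydrogenase) is an assumption made for illustration, the class and field names are hypothetical, and the gene calls and growth phenotypes are those described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LactateGenes:
    """Presence/absence of lactate utilization genes in a draft genome."""
    permease: bool        # lactate uptake
    dehydrogenase: bool   # at least one lactate dehydrogenase detected

    def predicts_lactate_growth(self) -> bool:
        # Simplifying assumption: growth on lactate needs uptake AND oxidation.
        return self.permease and self.dehydrogenase


# Gene calls as described in the text for each isolate.
ANNOTATED = {
    "P. aeruginosa S3": LactateGenes(permease=True, dehydrogenase=True),
    "Sphingobacterium sp. S2": LactateGenes(permease=False, dehydrogenase=True),
    "Geobacillus sp. EC-3": LactateGenes(permease=True, dehydrogenase=True),
}

# Growth on lactate as the sole carbon source, as reported in the text.
OBSERVED_GROWTH = {
    "P. aeruginosa S3": True,
    "Sphingobacterium sp. S2": False,
    "Geobacillus sp. EC-3": True,
}

if __name__ == "__main__":
    for strain, genes in ANNOTATED.items():
        predicted = genes.predicts_lactate_growth()
        agrees = predicted == OBSERVED_GROWTH[strain]
        print(f"{strain}: predicted={predicted}, observed={OBSERVED_GROWTH[strain]}, agree={agrees}")
```

Under this simple rule the genome-based prediction agrees with the reported phenotype for all three isolates, which is the consistency argument the section makes in prose.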

Genetic Determinants for Biofilm Formation and Regulation
In contrast to the planktonic lifestyle, cells within a biofilm matrix are in close proximity, where secreted enzymes provide optimal returns for the population, especially when targeting the substratum on which the biofilm forms [62]. The phenomenon of microbial biofilm formation is also related to other survival strategies such as metal and antimicrobial resistance, tolerance and bioremediation [63,64]. Biofilm-mediated bioremediation has been found superior to other bioremediation strategies and is being applied to different environmental pollutants [65][66][67][68][69]. Microorganisms that develop a biofilm through attachment and formation of an extracellular protective matrix are physiologically more resilient to environmental changes, making them a logical choice for the remediation of different pollutants. These microbes use different strategies such as biosorption, bioaccumulation and biomineralization to slowly degrade compounds [70]. All three of our isolates were capable of forming biofilm on PLA. This was tested because the formation of biofilm on solid polymers that can be degraded for carbon and/or energy is a logical strategy for microbes. Excreted enzymes capable of degradation can be sequestered within the biofilm matrix in close proximity to the polymer and not lost to solution. Therefore, we sought to identify genes involved in biofilm formation in strains S2, S3 and EC-3.
P. aeruginosa is a remarkably adept opportunist with a striking ability to develop biofilm [22,71,72]. In our previous study, we also observed biofilm formation by our isolate P. aeruginosa strain S3 on the surface of PLA during the process of biodegradation [12]. Biofilm formation on the surface of PLA has previously been reported by other authors as well [37,73,74]. In the genetic analysis of our isolate, we found factors involved in the development of the matrix of P. aeruginosa biofilm and its regulation (Table 4).
The genes for three types of exopolysaccharides (EPS) previously reported as involved in the construction of the biofilm matrix of P. aeruginosa (Pel, Psl and alginate [72,75]) were found in our isolate. These EPS molecules form the protective matrix [76]. Psl is the primary factor in charge of the initiation and maintenance of the biofilm structure by providing cell-to-cell and cell-to-surface interactions [77][78][79][80]. It also works as a signaling molecule for the successive events involved in the formation of biofilms and acts as a defensive layer against different immune and antibiotic attacks [81]. Pel polysaccharide is a glucose-rich extracellular matrix component and is involved in the formation of biofilms that are attached to solid surfaces. It is considered to be less important than Psl [72,80,82,83]. Alginate is produced by P. aeruginosa clinical isolates from cystic fibrosis (CF) patients [23]. Besides its role in the maintenance and protection of biofilm structure, it is essential for water and nutrient preservation [84].
Table 4. Genetic elements involved in biofilm formation and regulation detected in P. aeruginosa S3, Sphingobacterium S2 and Geobacillus EC-3.
Biofilm formation is a multicellular process stimulated by environmental signals and controlled by regulatory networks. During biofilm formation, cells undergo many phenotypic shifts that are regulated by a large array of genes [85]. In the genome of P. aeruginosa strain S3, several regulatory factors were identified. One of these regulatory factors was the signaling molecule bis-(3'-5')-cyclic dimeric guanosine monophosphate (c-di-GMP), which is considered to be one of the most significant molecular determinants in biofilm regulation [86]. c-di-GMP controls the interchange between the planktonic and biofilm-associated lifestyles of bacteria by stimulating the biosynthesis of adhesins and exopolysaccharides during the formation of biofilm [87].
The bacterial cell-to-cell communication system known as quorum sensing (QS) is involved in the maintenance of many biological processes such as biofilm formation, bioluminescence, antibiotic production, virulence factor expression, competence for DNA uptake, and sporulation [88,89]. LasR/LasI, RhlR/RhlI and PQS are the three QS signaling systems employed by P. aeruginosa to control biofilm formation [90,91]. All three QS signaling systems were found in the genome of our isolate.

Factors Involved in Biofilm Formation and Regulation Detected in Sphingobacterium sp. S2
There is little information in the literature regarding the genetic elements involved in biofilm formation in Sphingobacterium. Genome analysis of our isolate Sphingobacterium sp. S2 showed the presence of some genetic elements that are reported in the literature to be involved in biofilm formation, such as the gene for Stage 0 sporulation protein YaaT. This protein is reported to be involved in the sporulation process and biofilm development [92,93]. A response regulator, LuxR, involved in the sensory mechanism, and the gliding motility protein precursors GldC, GldJ and GldN were also detected in the genome of our strain. These proteins are known for their role in biofilm development [94]. A probable outer membrane protein precursor of the OmpA family, whose members are considered to be putative adhesins and adhesion-related proteins, was also detected in Sphingobacterium sp. S2 [94].

Enzymes
The biodegradation of polymers is carried out by two types of enzymes: extracellular enzymes that degrade long-chain polymers into short oligomers or subunits that are subsequently carried inside the cell, and intracellular enzymes that further degrade the small, transported units [100,101]. The degradation of synthetic polymers in the environment can be a slow process [102,103]. PLA is a synthetic linear aliphatic polyester of lactic acid monomers joined together by ester linkages [3]. The presence of ester bonds in its backbone makes the polymer sensitive to hydrolysis, both chemically and enzymatically [101]. The biodegradation of polyesters is mostly carried out by esterolytic enzymes such as esterases, lipases, or proteases. In the literature, microbial degradation of PLA is reported to be conducted by proteases, lipases, esterases and a few cutinases [4]. Sphingobacterium spp., P. aeruginosa, and Geobacillus spp. have previously been documented to have a role in the degradation of different environmental pollutants, such as mixed plastic waste, PAHs, oil, dyes and pesticides. P. aeruginosa has also been reported to have the potential to degrade PLA nanocomposites [19,26,30,104,105]. In our previous studies, we reported the expression of PLA-hydrolyzing lipase and esterase from our isolates P. aeruginosa strain S3 and Sphingobacterium sp. strain S2 [12,97,98]. Genome annotation of S2, S3 and EC-3 reveals the presence of numerous hydrolytic enzymes (Table 5). In the genome of P. aeruginosa S3, 75 different types of proteases, 50 esterases and 25 different types of lipases were detected (i.e., the sum of those "Common to Reference Genome" and those "Unique to Strain"). Similarly, in the genome of Sphingobacterium sp. S2, 36 proteases, 30 esterases and 18 lipases were identified. Geobacillus sp. EC-3, the smallest of the three genomes, had the most putative hydrolytic enzymes; EC-3 had 20 lipases, 127 proteases and 72 lipases.

Discussion
The present study reports the whole genome sequence analysis of three bacterial strains isolated from compost, P. aeruginosa S3, Sphingobacterium sp. S2, and Geobacillus sp. EC-3, each possessing the metabolic ability to utilize a polymeric solid, poly(lactic acid), as the sole source of carbon at ~30 °C or 58 °C. The initial steps in the degradation of solids by bacteria are extracellular, until sufficiently small molecules that can be transported across the outer cell envelope are generated through the activities of secreted enzymes. An obvious microbial strategy for optimizing the effectiveness of secreted enzymes is migration to a position as proximal to the solid as possible, and nothing is closer than a biofilm. All three of our isolates were able to form biofilm on PLA and all three have putative biofilm markers in their genomes. There is a robust literature on genomic analyses of P. aeruginosa strains because of the species' medical importance and, as a consequence, many genetic elements implicated in biofilm formation can be identified in P. aeruginosa S3. These include structural elements (outer membrane matrix proteins), transport mechanisms, and regulatory elements such as cyclic di-GMP and three quorum sensing systems. In addition to genetic determinants for biofilm formation, many virulence-associated genes were identified in strain S3.
Regarding Sphingobacterium, a recent description of a Sphingobacterium sp. isolate (strain SJ-25) capable of inhibiting the fungus Fusarium [106] posits that siderophores and chitinases may account for its antifungal activities, and the authors identified potential candidate genes in the genomic analysis of that strain. Strain S2 also contains genes for at least four siderophores as well as TonB and an abundance of TonB-dependent receptors. We also detected a chitin-binding protein but no indication of excreted chitinases. Curiously, a biofilm-related protein, identified as homologous to the Stage 0 sporulation protein YaaT, was also detected, even though Sphingobacterium is a Gram-negative, nonsporulating rod in the phylum Bacteroidetes.
Draft genomes of strains S2, S3 and EC-3 were studied to gain insight into the genetic elements involved in the degradation of PLA. Catabolic genes responsible for the biodegradation of different xenobiotic compounds, genes responsible for the formation and regulation of biofilm, genes for the transport and utilization of lactate, and several enzymes predicted to be involved in the degradation of many organic pollutants were identified. Of interest was the apparent lack of lactate permease in Sphingobacterium sp. S2, suggesting either that an alternative mode of attack on PLA yields a transportable molecule other than lactate, or that lactate is further modified extracellularly before transport. Finally, Chai et al. [109] detected the catabolic lactate genes lutA, lutB and lutC in Bacillus subtilis and noted that the upregulation of these genes is required for growth on lactate as the sole carbon source. Moreover, these genes are under dual regulation, permitting expression in liquid culture as well as within a biofilm. This gene cluster was also found in Geobacillus sp. EC-3. Common features of these three phylogenetically disparate isolates were an abundance of genetic determinants for hydrolytic enzymes and biofilm formation. The three strains were isolated through serial selection on a single carbon source, PLA. The genomic analyses provide insights into the potential array of genes that may be required for the efficient degradation of PLA.

DNA Extraction
Three of our previously isolated PLA-degrading bacterial strains, Sphingobacterium sp. strain S2, P. aeruginosa strain S3 and Geobacillus EC-3 (GenBank accession numbers KY432687, KY432688, and MH183212, respectively), were selected for genome sequencing [12]. Strains S2 and S3 were grown separately in 100 mL of LB in a 250 mL Erlenmeyer flask for 16 h in a shaking incubator at 30 °C and 70 rpm. Geobacillus EC-3 was grown in M9 minimal media with shaking at 58 °C. Cells were pelleted (10,000 RPM in a Sorvall SS34 rotor, Beckman Coulter Life Sciences, Indianapolis, IN, USA) and genomic DNA was isolated using the MO BIO PowerSoil DNA isolation kit (MO BIO Laboratories, Inc., Loker Ave West, Carlsbad, CA, USA). A NanoDrop ND-1000 spectrophotometer and ND-1000 V3.1.8 software (Thermo Fisher Scientific Inc., Wilmington, DE, USA) were used to determine the DNA concentrations of the purified samples, which were then sent for whole genome sequencing at the Michigan State University Genomics Facility (MSU-RTSF), East Lansing, MI, USA.

Genome Sequencing
Libraries for sequencing were prepared using the Illumina TruSeq Nano DNA Library Preparation Kit (Illumina, Inc., San Diego, CA, USA) on a Perkin Elmer Sciclone NGS robot (Perkin Elmer, Boston, MA, USA). Before sequencing, library quality was assessed and the libraries were quantified using a combination of Qubit dsDNA HS (Thermo Fisher Scientific, Waltham, MA, USA), Caliper LabChip GX HS DNA (Perkin Elmer, Boston, MA, USA) and Kapa Illumina Library Quantification qPCR assays (Roche Sequencing and Life Science, Wilmington, MA, USA). Libraries were pooled in equimolar quantities and loaded on an Illumina MiSeq standard v2 flow cell in a 2 × 250 bp paired-end format using a v2 500-cycle reagent cartridge. Illumina Real Time Analysis (v1.18.64) was used for base calling, and the output was demultiplexed and converted to FastQ format with Illumina Bcl2fastq (v1.8.4) (Illumina, Inc., San Diego, CA, USA). A total of 6,304,420 reads were obtained for strain S2, 5,800,229 reads for strain S3 and 5,730,761 reads for strain EC-3. The genomic reads have been deposited in the Sequence Read Archive at NCBI with Bioproject IDs of SRP149807 and PRJNA721072.
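As a rough check on sequencing depth, the read counts above can be combined with the draft genome sizes given in the abstract. The sketch below is illustrative only and is not part of the original analysis: it assumes that every reported read is a full-length 250 bp read (no trimming or filtering), so the values are upper-bound estimates. The function name `estimated_depth` is hypothetical.

```python
# Rough depth estimate: depth ≈ (number of reads × read length) / genome size.
# Read counts and draft genome sizes are those reported in the text.
# ASSUMPTION: every read is a full 250 bp, so these are upper bounds.

READ_LENGTH_BP = 250  # 2 × 250 bp MiSeq run; assumes no trimming

STRAINS = {
    # strain: (reads reported, draft genome size in bp)
    "S2": (6_304_420, 5_604_691),
    "S3": (5_800_229, 6_631_638),
    "EC-3": (5_730_761, 3_397_712),
}


def estimated_depth(n_reads: int, genome_bp: int, read_len: int = READ_LENGTH_BP) -> float:
    """Return the approximate fold coverage of a genome."""
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_len / genome_bp


if __name__ == "__main__":
    for strain, (n_reads, genome_bp) in STRAINS.items():
        print(f"{strain}: ~{estimated_depth(n_reads, genome_bp):.0f}x")
```

Under these assumptions, all three libraries correspond to a nominal depth of well over 100-fold.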

Sequence Assembly, Annotation, and Analysis
Assembly of the genomes was performed using the full SPAdes assembly function within PATRIC (PATRIC 3.4.9) [34], as implemented in the MiSeq assembly option. This assembly option applies the BayesHammer algorithms followed by SPAdes (version 3.8). The RAST tool kit as implemented in PATRIC [110] was used for the annotation of contigs. The assembled contig file generated from this assembly was used as the seed for the Comprehensive Genome Analysis function in PATRIC. The genomes were interrogated for the distribution of specific protein families (PGFams) using the protein family sorter tool in PATRIC. Each genome was compared to its closest reference genome available in PATRIC, using the filter option in the protein family sorter tool, to identify strain-specific unique proteins as well as proteins shared with the closest relative.
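The comparison of protein families between an isolate and its reference amounts to set operations on family identifiers. The sketch below illustrates that logic only; it is not the PATRIC implementation, and the function name `compare_families` and the PGFam identifiers shown are hypothetical placeholders.

```python
from __future__ import annotations


def compare_families(isolate: set[str], reference: set[str]) -> dict[str, set[str]]:
    """Partition protein family IDs into shared and genome-specific groups."""
    return {
        "common": isolate & reference,               # present in both genomes
        "unique_to_isolate": isolate - reference,    # strain-specific families
        "unique_to_reference": reference - isolate,  # absent from the isolate
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    isolate_families = {"PGF_0001", "PGF_0002", "PGF_0003"}
    reference_families = {"PGF_0002", "PGF_0003", "PGF_0004"}

    for group, families in compare_families(isolate_families, reference_families).items():
        print(group, sorted(families))
```

Counting the "common" and "unique_to_isolate" groups for a given enzyme class gives totals of the kind reported as "Common to Reference Genome" and "Unique to Strain".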

Average Nucleotide Identity (ANI) for Species Delineation
The isolates were further analyzed using a whole-genome average nucleotide identity (ANI) method to assign the genomes to their closest matching relatives. ANI values were calculated using the MiSI (microbial species identifier) tool, which is publicly available at the Integrated Microbial Genomes (IMG) database [111]. This tool uses a modified version of the algorithm originally proposed by Konstantinidis and Tiedje to determine the ANI between two genomes [112]. ANI was calculated as the average nucleotide identity of the orthologous genes of a pair of genomes, with orthologs identified as bidirectional best hits (BBHs) using the similarity search tool NSimScan (http://www.scidm.org/, used in January 2019; URL accessed on 7 June 2021). The ANI of one genome to another is defined as the sum of the %-identity times the alignment length for all BBHs, divided by the sum of the lengths of the BBH genes. This pairwise calculation is performed in both directions. The strains used for comparison were complete genomes obtained from NCBI, except where noted, and are as follows. For the Sphingobacterium comparisons: S. thalpophilum DSM 11723 (draft genome of 32 contigs), S. sp. G1-14, S. sp. B29, S. multivorum DSM 11691, S. lactis DSM 22361, S. wenxiniae DSM 22789, S. mizutaii DSM 11724 and S. sp. 21. For the Pseudomonas aeruginosa comparisons: P. aeruginosa PSE302, P. aeruginosa PA96, P. aeruginosa PA01H20, P. aeruginosa DSM 50071, P. aeruginosa PAO1, P. aeruginosa PAK, P. aeruginosa O12 PA7, P. aeruginosa PA_D25, P. aeruginosa PA_D1, P. aeruginosa KU and P. aeruginosa T52373. For the Geobacillus sp. EC-3 comparisons: Geobacillus thermoleovorans CCB_US3_UF5, Geobacillus thermoleovorans strain SGAir0734, Geobacillus thermocatenulatus strain BGSC 93A1, Geobacillus sp. A8 and Geobacillus kaustophilus NBRC 102445.
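The ANI definition given above can be written out directly. The following is a minimal sketch of that formula only, not of the MiSI tool itself. It assumes the bidirectional best hits have already been identified, and the `BBH` record, the function name `ani_one_direction` and the example values are hypothetical.

```python
from typing import Iterable, NamedTuple


class BBH(NamedTuple):
    """One bidirectional best hit, seen from the query genome."""
    pct_identity: float    # % nucleotide identity of the alignment
    alignment_length: int  # aligned length in bp
    gene_length: int       # length of the query gene in bp


def ani_one_direction(hits: Iterable[BBH]) -> float:
    """ANI = sum(%identity × alignment length) / sum(BBH gene lengths)."""
    hits = list(hits)
    total_gene_length = sum(h.gene_length for h in hits)
    if total_gene_length == 0:
        raise ValueError("no bidirectional best hits supplied")
    weighted_identity = sum(h.pct_identity * h.alignment_length for h in hits)
    return weighted_identity / total_gene_length


if __name__ == "__main__":
    # Hypothetical hits. The same gene pairs are listed once per direction
    # because gene lengths differ between the two genomes.
    genome1_to_genome2 = [BBH(99.0, 900, 1000), BBH(98.0, 500, 500)]
    genome2_to_genome1 = [BBH(99.0, 900, 950), BBH(98.0, 500, 520)]
    print(f"ANI 1->2: {ani_one_direction(genome1_to_genome2):.2f}")
    print(f"ANI 2->1: {ani_one_direction(genome2_to_genome1):.2f}")
```

Because the denominator uses full gene lengths rather than aligned lengths, partially aligned genes lower the value, which is one reason the two directions can differ.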

Comparative Alignments Using MeDuSa and MAUVE
For the comparative alignment and visualization of the genomes against their reference genomes, MeDuSa [41] was first used to reduce the number of contigs through comparison with the gene order of the closest strain. This was followed by alignment with MAUVE [42] to a reference strain to provide an estimate of alignment similarity. The reference strain selected was the closest finished genome in the public database. Alignments in MAUVE were also performed with contigs taken directly from the SPAdes assembly. P. aeruginosa PSE305 was used as a reference for strain S3 (ANI of 98.22%), while Sphingobacterium thalpophilum NCTC11429 (ANI of 98.46%) was selected as a reference for Sphingobacterium sp. S2. According to the "Similar Genome Finder" function implemented in PATRIC, G. thermoleovorans strains B23 and CCB_US3_UF5 from public databases were closest to EC-3. This function uses Mash/MinHash [113] to determine genome similarities. Strain CCB_US3_UF5 was used as the reference strain in MAUVE because its genome has been finished (1 chromosome) and has an ANI of 99.9% to EC-3.
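The MinHash idea behind Mash can be illustrated with a small sketch: each genome is reduced to the smallest hash values of its k-mers, the Jaccard index is estimated from those sketches, and the Mash distance is derived from that estimate. This is a simplified illustration, not the Mash or PATRIC implementation. It ignores reverse-complement (canonical) k-mers, uses BLAKE2b in place of the hash function used by Mash, and uses toy defaults for k-mer and sketch size. All function names are hypothetical.

```python
import hashlib
import math


def kmer_hashes(seq: str, k: int) -> set[int]:
    """Hash every k-mer of a sequence to a 64-bit integer."""
    return {
        int.from_bytes(hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big")
        for i in range(len(seq) - k + 1)
    }


def sketch(seq: str, k: int = 5, size: int = 50) -> list[int]:
    """Bottom-s sketch: the `size` smallest k-mer hashes."""
    return sorted(kmer_hashes(seq, k))[:size]


def jaccard_estimate(sketch_a: list[int], sketch_b: list[int], size: int = 50) -> float:
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        raise ValueError("sequences are shorter than the k-mer size")
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(jaccard: float, k: int = 5) -> float:
    """Mash distance: D = -(1/k) * ln(2j / (1 + j)); defined as 1 when j = 0."""
    if jaccard == 0:
        return 1.0
    return max(0.0, -math.log(2 * jaccard / (1 + jaccard)) / k)


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    a = "ATGGCGTACGTTAGCCTAGGCTAACGTAGCTAGGATCCGATCGATTACG"
    b = "ATGGCGTACGTTAGCCTAGGCTAACGTAGCTAGGATCCGATCGATAACG"
    j = jaccard_estimate(sketch(a), sketch(b))
    print(f"Jaccard estimate: {j:.3f}, Mash distance: {mash_distance(j):.4f}")
```

A distance of 0 indicates identical k-mer content within the sketch, and smaller distances indicate more similar genomes.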