Complete Genome Analysis of Undecylprodigiosin Pigment Biosynthesizing Marine Streptomyces Species Displaying Potential Bioactive Applications

Marine Streptomyces species are underexplored for their pigment molecules and genes. In this study, we report the genome of Streptomyces sp. strain BSE6.1, which carries an undecylprodigiosin biosynthetic gene cluster and displays antioxidant, antimicrobial, and staining properties. This Gram-positive, obligately aerobic bacterium was isolated from coastal sediment of the Andaman and Nicobar Islands, India. Pink to reddish pigmented colonies with whitish powdery spores on both agar and broth media are the important morphological characteristics of this bacterium. The strain tolerated NaCl concentrations of 2 to 7%. The assembled genome of Streptomyces sp. BSE6.1 contains one linear chromosome 8.02 Mb in length with 7157 protein-coding genes, 82 tRNAs, 3 rRNAs, and at least 11 gene clusters related to the synthesis of various secondary metabolites, including undecylprodigiosin. This strain carries type I, type II, and type III polyketide synthase (PKS) genes. The type I PKS gene cluster is involved in the biosynthesis of the red pigment undecylprodigiosin in BSE6.1, similar to the one found in S. coelicolor A3(2). This red pigment has been reported to have various applications in the food and pharmaceutical industries. The genome of Streptomyces sp. BSE6.1 was submitted to NCBI under BioProject ID PRJNA514840 (Sequence Read Archive ID: SRR10849367; genome accession ID: CP085300).


Introduction
In recent years, marine pigmented bacteria have been gaining more research interest due to the potential applications of pigment molecules in the food and drug industries [1][2][3]. Among a wide array of pigmented microbes in terrestrial and marine environments, Streptomyces species have gained enormous attention in biotechnological applications. Although Streptomyces species are well known to produce a wide range of pigments, including blue, yellow, red, orange, pink, purple, blue-green, brown, and black [1,2], prodigiosin molecules, which are red in color, are not well studied amongst the Streptomyces species distributed in marine milieus.
The biosynthetic pathway of prodigiosin is well understood in Serratia marcescens [19,20] and in many other prodigiosin-producing bacterial species. S. marcescens synthesizes prodigiosin through 33 genes, whereas S. coelicolor uses only 23 genes to synthesize prodigiosin derivatives [19,21]. In Streptomyces species, prodiginines are biosynthesized by the red gene cluster. Both Serratia and Streptomyces utilize 4-methoxy-2,2′-bipyrrole-5-carbaldehyde to synthesize prodigiosin and undecylprodigiosin, respectively [19,20]. Although the genome contents of several Streptomyces species have been reported in the last decade [4,22], the genomes of red pigment-producing Streptomyces species, especially marine Streptomyces, have remained largely uninvestigated, leaving a gap in the understanding of their evolutionary significance and drug discovery potential. Therefore, we set out to analyze the genome of the prodigiosin-producing Streptomyces sp. BSE6.1 isolated from a coastal sediment sample.
Prodigiosin pigments are well known for their antimicrobial, anticancer, and cytotoxic properties [1,2,21,23]. The use of dried prodigiosin as a food-grade colorant in prodigiosin-coated microcapsules [24] and agar jellies [25] has been demonstrated with extracts of S. marcescens [24], Zooshikella sp., and Streptomyces sp. [25]. Prodigiosin extracted from Streptomyces species has shown promising antimicrobial activity against several pathogenic microbes such as Corynebacterium bovis, Mycobacterium smegmatis, Nocardia asteroides [7], and Staphylococcus aureus [7,25]. It is thought that combining the antimicrobial and food colorant applications of prodigiosin would facilitate a synergistic effect in disease treatment. The present study introduces a novel species of red-pigmented Streptomyces isolated from the marine environment of the Andaman Islands, India, and its genome, for industrial and biotechnological applications. Preliminary studies on prodigiosin-producing Streptomyces have demonstrated antimicrobial [7] and staining properties [8,25]. Although several Streptomyces species are known to produce a wide range of pigment compounds [1,2], only a limited number of them produce prodiginine derivatives, which encouraged us to investigate the corresponding gene clusters in this Streptomyces sp. and compare them with those of other bacterial species.
The Andaman and Nicobar Islands are a chain of 836 islands, islets, and rocky outcrops that are pristine and unexplored for microbial resources. Bio-prospecting of microbial pigments from this environment was initiated very recently [1,2,26]. The erratic weather conditions observed in this geographically distinct location appear to favor many novel pigmented microbes with potential biotechnological applications. Therefore, the present study explored the pigmented bacterial resources available in the Andaman Islands and found a promising Streptomyces sp. strain, BSE6.1, with antibacterial and dye activity. As Andaman waters are still underexplored, we aimed to investigate the novelty of Streptomyces sp. strain BSE6.1 through whole-genome analysis, predict its pigment gene clusters, and compare them with those of other Streptomyces genomes available in the public nucleotide databases.

Materials and Methods
A red-pigmented bacterial isolate designated as BSE6.1 was isolated from a marine sediment sample collected from the Burmanallah coast (11°33′52.24″ N, 92°44′01.51″ E), South Andaman Islands, India. A serially diluted sediment sample was inoculated onto marine agar 2216 (Himedia, Mumbai) plates and incubated at 28 °C. After a couple of weeks, the red-pigmented colonies that grew were sub-cultured on either freshly prepared marine agar plates or 2% nutrient agar. Pure cultures were stored as glycerol suspensions (30%, w/v) at −20 °C for further analysis. Salt tolerance was tested by streaking a pure culture onto marine agar plates supplemented with various percentages of NaCl (1 to 10%), incubating at 28 °C, and assessing growth after two days. Catalase and oxidase tests were performed according to standard microbial biochemical methods [27].
Genomic DNA of Streptomyces BSE6.1 was extracted using the cetyltrimethylammonium bromide (CTAB) and phenol-chloroform method. Extracted DNA was treated with RNase A and purified. DNA was quantified by measuring its absorbance at 260 and 280 nm (A260 and A280) in a NanoDrop. The Illumina HiSeq X Ten sequencing system was used to obtain 150 bp paired-end short-read raw data. In addition to these short reads, long reads were obtained using the MinION platform. The workflow used to assemble these raw reads and analyze the genome assembly is depicted in Figure 1. The quality of the paired-end short reads was checked using FASTQC v0.11.8 [28]. BBDuk (BBmap v38.93) was used to filter low-quality reads and adaptor sequences [29], whereas the long reads were checked with NanoPlot v1.38.1 [30] and filtered with PoreChop v0.4.8 [31]. The filtered high-quality short and long reads were assembled into contigs using the hybrid de novo assembler Unicycler v0.4.8 [32]. The 16S rRNA genes were extracted from the assembled scaffolds using Barrnap [33] and were aligned against the non-redundant nucleotide database at NCBI. The complete genome of the nearest neighbor (Streptomyces sp. KPB2; accession ID: CP034353.1) [34] was used as a reference. The contigs were sorted and merged into scaffolds with the help of this reference genome using MeDuSa v1.6 [35]. A gap-filling step was performed using GapCloser v1.12 [36] to generate a draft genome assembly. Furthermore, the genome assembly was polished with Pilon v1.24 [37] by mapping filtered short reads (Bowtie2 v2.4.4 [38]) and filtered long reads (minimap2 [39]) against the assembly and sorting the alignments with samtools v1.13 [40].
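As a rough plausibility check on the sequencing effort described above, the expected mean depth can be estimated as total sequenced bases divided by genome size. The sketch below uses the 150 bp read length from this section together with the read count (7,528,288) and scaffold length (8.02 Mb) reported in the Results. It assumes that every counted read is a 150 bp short read, which the text does not state explicitly, so the figure is only an approximation; the function name `mean_depth` is illustrative.

```python
def mean_depth(n_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Return the expected mean sequencing depth (total bases / genome size)."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * read_length_bp / genome_size_bp


# Values taken from the text; treating all reads as 150 bp is an assumption.
N_READS = 7_528_288        # total reads reported in the Results
READ_LENGTH_BP = 150       # Illumina short-read length
GENOME_SIZE_BP = 8.02e6    # assembled scaffold length

depth = mean_depth(N_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"Estimated mean depth: {depth:.1f}x")
```

If the reported read count also includes the long reads, the true short-read depth would be somewhat lower than this estimate.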
The quality of the genome assembly was checked using BUSCO v5.2.2 [41] and CheckM v1.1.3 [42]. In silico multi-locus sequence typing (MLST) of the genome was performed using the online webserver at the Center for Genomic Epidemiology [43]. Type strain identification of the genome was performed at the Type (Strain) Genome Server (TYGS) [44]. In addition to the type strain identification, a species tree was constructed with FastME [45] at the KBase server [46] using 49 core Clusters of Orthologous Groups (COGs) of 200 related genomes. An additional phylogenetic tree was constructed with the 16S rRNA genes of Streptomyces species available at the Ribosomal RNA database [47]. Duplicate sequences were removed, and multiple sequence alignment (MSA) was performed using the default parameters of MAFFT v7.487 with the FFT-NS-i refinement method [48]. A maximum-likelihood tree was constructed from the MSA using default parameters and 1000 bootstraps with RAxML v8.2.12 [49]. The 16S rRNA gene of Staphylococcus aureus (RefSeq ID: GCF_000013425.1) was used as an outgroup. The origin of replication (oriC) was identified using the DoriC database [50] and the Mauve aligner [51]. Pairwise genomic comparisons of strain BSE6.1 were made with 3 other related genomes. Dot plots were constructed from minimap2-based pairwise alignments using D-Genies [52]. Prokka v1.14.6 was used to perform a local de novo annotation [53]. A pan-genome comparison with 100 related genomes (~90% 16S nucleotide identity; ~80% whole-genome aligned fraction identity) was made using the pan-genome tool at the KBase server [46]. Gene clusters related to secondary metabolite biosynthesis were identified using the antiSMASH 5.0 pipeline [54]. The red pigment-producing gene cluster of BSE6.1 was compared with that of S. coelicolor A3(2), Serratia, and Hahella using the multigene BLAST tool [55]. The distribution of various coding sequences (CDS) and gene clusters across the genome was plotted using Circos [56].
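The duplicate-removal step that precedes the multiple sequence alignment can be illustrated with a short sketch. The text does not say how duplicates were defined or which tool was used, so the version below is an assumption: it treats two records as duplicates when their sequences are identical after upper-casing and converting U to T, and keeps the first record seen. The function names and the toy records are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_duplicate_sequences(records: Iterable[Record]) -> List[Record]:
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique: List[Record] = []
    for header, seq in records:
        key = seq.upper().replace("U", "T")  # treat RNA and DNA spellings alike
        if key in seen:
            continue
        seen.add(key)
        unique.append((header, seq))
    return unique


# Toy input, not real 16S data.
example = """>strain_A
ACGTACGT
>strain_B
acgtacgt
>strain_C
ACGTTCGT
""".splitlines()

unique = drop_duplicate_sequences(parse_fasta(example))
for header, seq in unique:
    print(f">{header}\n{seq}")
```

Exact-match deduplication is the simplest choice; clustering at an identity threshold would be a different technique and is not shown here.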

Results and Discussion
Strain BSE6.1 produced pink-colored growth in minimal broth with 2% NaCl and red pigmentation in all other compatible media. Pale pink to reddish colonies with white powdery spores were observed after 7 to 10 days of incubation. Salt tolerance ranged from 2 to 7% NaCl. This bacterium was positive for catalase and oxidase activities. In our earlier study, strain BSE6.1 showed potential antibacterial activity against different human pathogens and also displayed a strong ability to stain epidermis and parenchyma cells of Tridax procumbens stem [25]. The maximum pigment production was observed at 29 °C, and the maximum temperature tolerated for growth was 38 °C (Figure 2). The absorption peak of the red pigment of BSE6.1 was observed at 528 nm [25]. Identification of the red pigment through thin-layer chromatography (TLC), Fourier-transform infrared spectroscopy (FT-IR), and proton nuclear magnetic resonance (¹H-NMR) analyses revealed the presence of antimicrobial pigments, namely prodiginine derivatives, in Streptomyces sp. BSE6.1 [25]. The genome analysis of strain BSE6.1, in turn, reveals the presence of an undecylprodigiosin gene cluster, which is responsible for undecylprodigiosin production. The other red fraction of Streptomyces strain BSE6.1 [25] is therefore yet to be elucidated and identified through LC-MS, ¹³C NMR, HSQC, HMBC, and COSY data to confirm the production of undecylprodigiosin or related derivatives.
Whole-genome sequencing of strain BSE6.1 produced a total of 7,528,288 reads. Assembling these raw reads resulted in a single scaffold of 8.02 Mb with no extra-chromosomal content. Annotation of the assembled genome indicated the presence of at least 7157 protein-coding genes, 82 tRNA genes, 3 rRNA genes, and 1 tmRNA gene (Table 2, Figure 3). Subsystem coverage of the identified CDS was 19%, involving nearly 324 subsystem types (Figure S1). Subsystems with the highest coverage of genes/features include amino acid, carbohydrate, protein, and vitamin metabolic pathways. Furthermore, at least 43 genes were involved in defense mechanisms such as resistance to antibiotics and toxic compounds. In addition, at least 11 gene clusters involved in the synthesis of other secondary metabolites were also identified (Figure S2). Most members of the Streptomyces genus have linear chromosomes [4,5], and strain BSE6.1 is no exception. There are no overlapping 5′ and 3′ ends in the scaffold, indicating its non-circular configuration. Furthermore, the oriC region and dnaA gene were identified approximately at the center of the scaffold, similar to that of S. coelicolor A3(2) (Figure 3).

Figure 3 (caption, partial). The scaffold is followed by coding regions (CDS) in the sense (yellow bands) and anti-sense (orange bands) directions. Grey bands represent hypothetical CDS. The third circle represents the distribution of gene clusters coding for secondary metabolites (green: clusters that are >75% similar to those present in related organisms; grey: <75% similarity). The fourth circle represents the RNA genes (orange), transposases (grey), phage genes (purple), dnaA gene (blue), and oriC region (green and labelled). Histograms in the fifth circle indicate the GC content per 10,000 bases. The innermost circle represents GC skew data per 10,000 bases (blue indicates positive skewness and grey negative skewness).

Table fragment (strain | isolation source | pigment | reference):
Streptomyces spectabilis | - | Prodiginine | [7]
Streptomyces spectabilis BCC 4785 | - | Metacycloprodigiosin | [9]
Streptomyces variegatus | - | Prodigiosin | [16]
Streptoverticillium rubrireticuli 100-19 | Soil | Undecylprodiginine and butylcycloheptylprodiginine | [8]

BLAST analysis based on the 16S rRNA sequences suggested that strain BSE6.1 had 99.71% similarity with various unclassified Streptomyces species available in GenBank. The most similar strains include Streptomyces sp. NA03103 (isolated from marine sediment in China) (GenBank: CP054920), Streptomyces sp. strain HB-N217 (isolated from a marine sponge, Forcepia sp., in the USA) [77], Streptomyces sp. CCM_MD2014 (soil isolate from the USA) [78], Streptomyces sp. KPB2 (isolated from the pollen of kiwi fruit in South Korea) [34], Streptomyces sp. PM-R01 (isolated from durian fruit, Durio zibethinus, in Thailand) (GenBank: LC381944), and Streptomyces sp. IT-M01 (isolated from a sea crab, Thalamita crenata, in Thailand) (GenBank: LC386952). Furthermore, 16S rRNA genes of BSE6.1 and 208 Streptomyces species were used to construct a phylogenetic tree (Figure S3).

The strain typing of BSE6.1 at TYGS indicated that no available type strain is closely related to the query genome. The highest pairwise digital DNA-DNA hybridization similarity (dDDH, d4 value corresponding to the sum of all identities found in HSPs divided by overall HSP length) was 48.7% with the type strain Streptomyces coelicoflavus NBRC 15399 (Sup. Data 1), well below the 70% dDDH threshold commonly used to delineate species. A genome BLAST distance phylogeny (GBDP) tree was constructed for BSE6.1 and the related type strains using 16S rRNA gene and complete genome data (Figure 4a,b). In addition to detecting the closest type strain, a species tree was constructed using 49 core COGs in related genomes [46] (Sup. Data 2). In the species tree, BSE6.1 clustered with strains such as Streptomyces sp. KPB2, S. coelicolor A3(2), S. lividans TK24, S. olivaceus, and S. parvulus (Figure 4c). However, the whole-genome comparison of BSE6.1 with other closely related species shows many variations in its genomic content (Figure 5). In concordance with the phylogenetic distances, the genomes of strain KPB2 and strain NA03103 have the most similar genomic regions to BSE6.1. Comparatively fewer identical homologous regions were observed when comparing BSE6.1 with strain CCM_MD2014. Another comparison of BSE6.1 with one of the well-studied pigment-producing bacteria, S. coelicolor A3(2) [70], presented the least identical synteny among the four comparisons. Furthermore, the in silico MLST analysis of the BSE6.1 genome revealed the presence of a novel allelic profile: 16S_99, atpD_185, gyrB_124, recA_156, rpoB_175, and trpB_190 (Table 3). All the in silico analyses suggested that strain BSE6.1 could be a novel species of Streptomyces. However, further phenotypic characterization is needed to confirm its novelty.
A pan-genomic comparison was made between 101 related genomes belonging to the Streptomycetaceae family and that of strain BSE6.1 (Figure 6). A total of 720,604 translated genes belonging to 123,491 homologous gene families were identified. Out of these, 726 families were conserved across the genomes, 41,274 were shell gene families, and 81,497 were singletons. Strain BSE6.1 has 7157 genes, of which 902 belong to the core gene cluster, 6016 belong to the shell gene cluster, and 239 are unique to BSE6.1. The genes confined to strain BSE6.1 are mostly hypothetical (184 out of 239 genes), apart from some interesting genes such as serine protease genes (which perform physiological roles), the MarR family (responsible for multiple antibiotic resistance), and the SsgA sporulation regulator (Sup. Data 3).
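The pan-genome counts above can be cross-checked with simple arithmetic. The sketch below sums the core, shell, and unique categories for the gene families and for the BSE6.1 genes, and derives the shares of unique and hypothetical genes. All numbers are taken from the text; the class and function names are illustrative. The BSE6.1 partition adds up to the reported 7157 genes, whereas the three family categories add up to 123,497, six more than the reported 123,491; the sketch only surfaces that difference and does not attempt to explain it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Counts for the three pan-genome categories."""
    core: int
    shell: int
    unique: int

    @property
    def total(self) -> int:
        return self.core + self.shell + self.unique


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Gene families across all compared genomes (conserved, shell, singletons).
families = Partition(core=726, shell=41_274, unique=81_497)
REPORTED_FAMILIES = 123_491

# Genes of strain BSE6.1 assigned to each category.
bse61_genes = Partition(core=902, shell=6_016, unique=239)
REPORTED_BSE61_GENES = 7_157
HYPOTHETICAL_UNIQUE = 184

family_gap = families.total - REPORTED_FAMILIES
gene_gap = bse61_genes.total - REPORTED_BSE61_GENES

print(f"Family categories sum to {families.total} (difference from reported: {family_gap})")
print(f"BSE6.1 categories sum to {bse61_genes.total} (difference from reported: {gene_gap})")
print(f"Unique share of BSE6.1 genes: {percent(bse61_genes.unique, bse61_genes.total):.2f}%")
print(f"Hypothetical share of unique genes: {percent(HYPOTHETICAL_UNIQUE, bse61_genes.unique):.1f}%")
```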
Streptomyces species are ubiquitous in nature, with more than 500 Streptomyces species reported from various environments such as terrestrial, coastal, deep-sea, desert, and polar regions [6]. Under unfavorable conditions, these species produce external hyphae, which divide into spores. Streptomyces species possess antibiotic resistance genes; thus, they display potential bioactive properties. Many species of Streptomyces are known to produce secondary metabolites and antibiotics [79,80], but very few are known to produce pigments such as prodigiosin derivatives with antimicrobial and anticancer properties [1,6,19]. The genome analysis of BSE6.1 revealed the presence of 23 gene clusters responsible for the production of ectoine, polyketides, and other compounds (Figure S2). Out of these 23 clusters, at least 11 showed >75% similarity with existing gene clusters of different strains (Figures S4 and S5). Information about all the other gene clusters and their similarity to those of other Streptomyces may be accessed through the antiSMASH output (Sup. Data 5).
The genome of BSE6.1 contains three types of PKSs, namely type I, type II, and type III. Strain BSE6.1 has two copies of type III polyketide synthase (PKS) genes, observed in clusters 20 and 21, coding for herboxidiene, an antitumor molecule reported in Streptomyces sp. [81], and germicidin, which is responsible for the development of spore formation and aerial hyphae elongation [82], respectively. The type III PKS genes in Streptomyces species are known to produce red to brownish pigments with potential antimicrobial and antioxidant activities [83,84]. Cluster 13 represents a type II PKS, which is responsible for grey-pink spore pigmentation in Streptomyces species [85,86].
Interestingly, the genome content of strain BSE6.1 is distinct from that of other Streptomyces species. It is an important evolutionary aspect that both related and unrelated bacterial lineages are capable of producing a variety of prodiginine analogs for defense in their surrounding milieus. As studies on the diversity and distribution of marine pigmented Streptomyces species are scarce, further research on this aspect would provide new insights into the evolutionary spread and species distribution of pigmented Streptomyces in different environments. We infer that pigment gene clusters of microbes such as Streptomyces may serve as an evolutionary marker to trace the place of origin and spread of prodiginine pigments in marine or terrestrial milieus during the evolutionary process. The variability in the whole-genome content and the novel alleles in the MLST profile of BSE6.1 indicate its status as a novel species. Thus, based on complete genome analysis, we propose strain BSE6.1 as Streptomyces prasanthi sp. nov. This study provides the whole genome of Streptomyces sp. BSE6.1 for further comparative studies with other Streptomyces species on taxonomical, evolutionary, and biotechnological aspects. As the first mined genome of a prodigiosin-producing marine Streptomyces, the BSE6.1 genome can serve as a reference for comparative studies that assess the novelty of the genomic contents of other Streptomyces and non-Streptomyces species.
Author Contributions: Conceptualization, lab work, data analysis, validation, and manuscript writing, C.R.; bioinformatics and manuscript writing, M.A.; supervision, editing, and approval, N.V.V. and R.K.; editing and additional information to improve the manuscript, L.D. All authors have read and agreed to the published version of the manuscript.