Natural Product Gene Clusters in the Filamentous Nostocales Cyanobacterium HT-58-2

Cyanobacteria are known as rich repositories of natural products. One cyanobacterial-microbial consortium (isolate HT-58-2) is known to produce two fundamentally new classes of natural products: the tetrapyrrole pigments tolyporphins A–R, and the diterpenoid compounds tolypodiol, 6-deoxytolypodiol, and 11-hydroxytolypodiol. The genome (7.85 Mbp) of the Nostocales cyanobacterium HT-58-2 was annotated previously for tetrapyrrole biosynthesis genes, which led to the identification of a putative biosynthetic gene cluster (BGC) for tolyporphins. Here, bioinformatics tools have been employed to annotate the genome more broadly in an effort to identify pathways for the biosynthesis of tolypodiols as well as other natural products. A putative BGC (15 genes) for tolypodiols has been identified. Four BGCs have been identified for the biosynthesis of other natural products. Two BGCs related to nitrogen fixation may be relevant, given the association of nitrogen stress with production of tolyporphins. The results point to the rich biosynthetic capacity of the HT-58-2 cyanobacterium beyond the production of tolyporphins and tolypodiols.


Introduction
Tolyporphins represent a class of compounds in the pigments of life family with structural features distinct from other prominent constituents, including heme, chlorophylls, bacteriochlorophylls, cobalamin, and coenzyme F 430 . Tolyporphin A, the first member of the tolyporphins family, was found in the lipophilic extract of a cyanobacterial sample. The sample, HT-58-2, was obtained from Nan Madol on the island of Pohnpei in Micronesia [1]. The discovery and identification of tolyporphin A originated with a broad screen of diverse cyanobacteria for anti-cancer activity [2]. Indeed, tolyporphin A was found to exhibit efflux pump inhibition and photocytotoxicity toward tumor cells [3][4][5].
An astonishing aspect of the discovery of tolyporphin A is the presence of a compound with a dioxobacteriochlorin chromophore in a cyanobacterial sample. Cyanobacteria employ chlorophyll, not bacteriochlorophyll, for photosynthesis, and the presence of compounds with a bacteriochlorin chromophore in a chlorin-based photosynthetic system is unprecedented. Over the years, fractionation of lipophilic extracts from the HT-58-2 culture has led to the identification of a family of tolyporphins, now numbering 18 (A-R) [6][7][8] (Figure 1). Most of the tolyporphins are dioxobacteriochlorins, but three are oxochlorins, and one is a porphyrin. The structural diversity suggests an in vivo role for tolyporphins as secondary metabolites in as-yet undefined defense processes. If so, this represents a fundamentally new biological function in the pigments of life family. To date, the HT-58-2 culture remains the only known producer of tolyporphins. , and a porphyrin (tolyporphin P). The π-chromophore is shown in magenta, green, and red, respectively.
To pursue questions of biosynthesis and in vivo function of tolyporphins, in late 2015 we obtained the tolyporphin-producing culture HT-58-2. Subsequent studies have revealed the following: (1) the HT-58-2 culture is dominated by a single filamentous cyanobacterium in a non-axenic cyanobacterial-microbial consortium (Figure 2A,B) [9,10]; (2) several-weeks growth under stress imparted by the deprivation of aqueous-soluble nitrate profoundly increases the production of tolyporphins, reaching a level rivalling that of chlorophyll [11]; (3) tolyporphins are present in the sheath and cell septa of the filamentous cyanobacterium, as revealed by hyperspectral confocal fluorescence imaging [12]; (4) the cyanobacterium contains a circular genome (7.85 Mbp) with genes for biosynthesis of heme, chlorophyll a, phycocyanobilin, and cobalamin (in part), as well as a putative biosynthetic gene cluster (here termed BGC-T) for the biosynthesis of tolyporphins ( Figure  2E) [9]. Notably, BGC-T contains all genes in the core pathway (from L-glutamic acid to protoporphyrinogen IX) of tetrapyrrole biosynthesis, except hemD, which is located elsewhere in the genome.  , and a porphyrin (tolyporphin P). The π-chromophore is shown in magenta, green, and red, respectively.
To pursue questions of biosynthesis and in vivo function of tolyporphins, in late 2015 we obtained the tolyporphin-producing culture HT-58-2. Subsequent studies have revealed the following: (1) the HT-58-2 culture is dominated by a single filamentous cyanobacterium in a non-axenic cyanobacterial-microbial consortium (Figure 2A,B) [9,10]; (2) several-weeks growth under stress imparted by the deprivation of aqueous-soluble nitrate profoundly increases the production of tolyporphins, reaching a level rivalling that of chlorophyll [11]; (3) tolyporphins are present in the sheath and cell septa of the filamentous cyanobacterium, as revealed by hyperspectral confocal fluorescence imaging [12]; (4) the cyanobacterium contains a circular genome (7.85 Mbp) with genes for biosynthesis of heme, chlorophyll a, phycocyanobilin, and cobalamin (in part), as well as a putative biosynthetic gene cluster (here termed BGC-T) for the biosynthesis of tolyporphins ( Figure 2E) [9]. Notably, BGC-T contains all genes in the core pathway (from L-glutamic acid to protoporphyrinogen IX) of tetrapyrrole biosynthesis, except hemD, which is located elsewhere in the genome.
In an initial report in 1992 [1], the Nostocales cyanobacterium HT-58-2 was categorized as Tolypothrix nodosa on the basis of visual inspection of the characteristic filamentous morphology ( Figure 2C,D). The term "tolypothrix" refers to "a hairy ball of yarn," whereas "nodosa" denotes the presence of nodules, which are believed to be responsible for nitrogen fixation. The availability of the sequence information enabled phylogenomic analysis, which showed the HT-58-2 cyanobacterium to be more closely aligned with the genus Brasilonema according to 16S rRNA [9].
Scrutiny of the lipophilic extracts of the HT-58-2 culture has revealed other new natural products. The natural products include tolypodiol (Figure 3), the first diterpenoid compound obtained from cyanobacteria [13]. A monoacetate derivative of tolypodiol was subsequently synthesized by chemical means [13]. Both tolypodiol and the synthetic O-acetate derivative showed potent anti-inflammatory activity in a mouse ear edema assay. Recently, two tolypodiol analogues, 6-deoxytolypodiol and 11-hydroxytolypodiol, were also found in extracts from the HT-58-2 culture. The structures and absolute configuration of the two tolypodiol analogues were determined, and the compounds were assayed for anti-inflammatory activity [14]. Prior work [9] annotating the genome of the Nostocales HT-58-2 suggested numerous genes for diverse functions, but did not delve deeply into the BGCs of natural products, including those for tolypodiols.  In an initial report in 1992 [1], the Nostocales cyanobacterium HT-58-2 was categorized as Tolypothrix nodosa on the basis of visual inspection of the characteristic filamentous morphology ( Figure 2C,D). The term "tolypothrix" refers to "a hairy ball of yarn," whereas "nodosa" denotes the presence of nodules, which are believed to be responsible for nitrogen fixation. The availability of the sequence information enabled phylogenomic analysis, which showed the HT-58-2 cyanobacterium to be more closely aligned with the genus Brasilonema according to 16S rRNA [9].
Scrutiny of the lipophilic extracts of the HT-58-2 culture has revealed other new natural products. The natural products include tolypodiol (Figure 3), the first diterpenoid compound obtained from cyanobacteria [13]. A monoacetate derivative of tolypodiol was subsequently synthesized by chemical means [13]. Both tolypodiol and the synthetic Oacetate derivative showed potent anti-inflammatory activity in a mouse ear edema assay. In this paper, we report an evaluation of genes that support the biosynthesis of tolypodiols as well as other natural products, deepening the coding potential survey of the genome that was initially reported [9]. A large impetus for this work stems from two facts: (1) the in vivo functions of tolyporphins and tolypodiols are not known; and (2) the biosynthesis of tolyporphins depends markedly on environmental conditions. The production of myriad natural products by cyanobacteria can be prompted by a host of environmental stimuli  In this paper, we report an evaluation of genes that support the biosynthesis of tolypodiols as well as other natural products, deepening the coding potential survey of the genome that was initially reported [9]. A large impetus for this work stems from two facts: (1) the in vivo functions of tolyporphins and tolypodiols are not known; and (2) the biosynthesis of tolyporphins depends markedly on environmental conditions. The production of myriad natural products by cyanobacteria can be prompted by a host of environmental stimuli [15,16]. The issue then arises as to the biosynthetic capacity of HT-58-2, i.e., what other natural products might be produced either under normal or stressed conditions?

Materials and Methods
The HT-58-2 sample was incubated in BG-11 medium as described previously [9], under continuous white light (62 µmol m −2 s −1 ) at 28 • C with shaking at 120 rpm. BG-11 medium provides aqueous-soluble nitrogen in the form of NaNO 3 .
Analysis of genes for the biosynthesis of natural products within the genome of Nostocales HT-58-2 (accession number: CP019636) was performed by the use of AntiSMASH [17], PRISM [18], and ARTS [19] with default settings. Putative BGCs for tolypodiols and nitrogen fixation were annotated and labeled manually based on protein homology searching and conserved domain analysis via the BLASTP program [20]. Individual gene or protein alignments were performed with Clustal Omega [21] and formatted with ESPript 3.0 [22].

Putative Tolypodiols BGC
To pursue the biosynthesis of tolypodiols, a search for BGCs related to terpenoids in the Nostocales HT-58-2 was performed (hereafter in this study, "HT-58-2" will refer to the cyanobacterium unless otherwise noted). The biosynthesis of the terpenoid skeleton in cyanobacteria proceeds via a non-mevalonate pathway, termed the methylerythritol-4phosphate (MEP) pathway. The pathway is shown in Figure 4 [23,24]. Specific genes that encode enzymes involved in the MEP pathway were annotated throughout the genome of HT-58-2; the annotation was performed manually according to protein homology. A putative BGC of~17 kbp with genes that encode almost the entire MEP pathway was found at 38,701-55,799 bp of the HT-58-2 genome (CP019636). Two of the requisite eight genes were absent, which coded for 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (MCT) and 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MDS), corresponding to the third and fifth steps, as shown in Figure 4. The two genes missing in the BGC, ispD and ispF, occur elsewhere in the genome. The other six genes in the MEP pathway, including dxs, dxr, ispE, ispG, ispH, and crtE, appear both in the BGC and at least once elsewhere distributed in the genome of HT-58-2 ( Figure 5). This situation resembles that of the putative BGC for tolyporphins, where a subset of genes within the BGC-T also appear elsewhere in the genome. ferase (MCT) and 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MDS), corresponding to the third and fifth steps, as shown in Figure 4. The two genes missing in the BGC, ispD and ispF, occur elsewhere in the genome. The other six genes in the MEP pathway, including dxs, dxr, ispE, ispG, ispH, and crtE, appear both in the BGC and at least once elsewhere distributed in the genome of HT-58-2 ( Figure 5). This situation resembles that of the putative BGC for tolyporphins, where a subset of genes within the BGC-T also appear elsewhere in the genome.  The assigned tolypodiols BGC is shown in Figure 6. In addition to the six genes for enzymes corresponding to the biosynthesis of the terpenoid backbone, four additional genes were identified that are expected to encode enzymes of the ubiquinone/terpenoidquinone pathway. The latter pathway entails the attachment of an all-trans-polyprenyl unit (such as geranylgeranyl) to the 3-position of 4-hydroxybenzoate. The arene motif in the resulting polyprenyl benzoquinol has intriguing structural resemblance to the arene in tolypodiols. The corresponding enzymes UbiC, UbiA, UbiH, and UbiE are known to catalyze the reactions shown in Figure 7 [25][26][27][28]. The product from the backbone MEP pathway of terpenoid biosynthesis, geranylgeranyl diphosphate, might participate in the biosynthesis of terpenoid-quinone compounds (such as tolypodiols), given the coclustering of ubi/isp/dxr/dxs/crtE genes. The list of aligned proteins and corresponding genes is provided in Table 1. The assigned tolypodiols BGC is shown in Figure 6. In addition to the six genes for enzymes corresponding to the biosynthesis of the terpenoid backbone, four additional genes were identified that are expected to encode enzymes of the ubiquinone/terpenoidquinone pathway. The latter pathway entails the attachment of an all-trans-polyprenyl unit (such as geranylgeranyl) to the 3-position of 4-hydroxybenzoate. The arene motif in the resulting polyprenyl benzoquinol has intriguing structural resemblance to the arene in tolypodiols. The corresponding enzymes UbiC, UbiA, UbiH, and UbiE are known to catalyze the reactions shown in Figure 7 [25][26][27][28]. The product from the backbone MEP pathway of terpenoid biosynthesis, geranylgeranyl diphosphate, might participate in the biosynthesis of terpenoid-quinone compounds (such as tolypodiols), given the co-clustering of ubi/isp/dxr/dxs/crtE genes. The list of aligned proteins and corresponding genes is provided in Table 1.

BGCs for Diverse Natural Products
In the genome of HT-58-2, eighteen clusters of genes were identified through the use of AntiSMASH (Table 2). Further analysis showed that four of the eighteen clusters (hereafter, BGCs) aligned with relatively high similarity (>50%) with known BGCs for the following compounds: (1) hapalosin; (2) anatoxin-a/homoanatoxin-a; (3) shinorine; and (4) heterocyst glycolipids ( Figure 8). In-depth examination concerning the four BGCs is provided in the following section.

Hapalosin BGC
Hapalosin is a cyclodepsipeptide that has been shown to cause the reversal of multidrug resistance in tumor cell lines [29]. The compound has been detected from lipophilic extracts of three cyanobacterial strains: Hapalosiphon welwitschii UH strain IC-52-3, Westiella intricata UH strain HT-29-1, and Fischerella sp. PCC 9431. The UH strains were collected by the same team at the University of Hawaii that discovered HT-58-2 as part of a global search for novel natural products from diverse cyanobacteria [2]. Cloning and heterologous expression of the majority of the hapalosin BGC from Fischerella sp. PCC 9431 was recently described [30].
In HT-58-2, a non-ribosomal peptide synthase/Type I polyketide synthase (NRPS/T1PKS)-related gene cluster was found spanning 240 kb (region nine), as shown in Table 2. More than ten T1PKS/NRPS domains are arranged in this region, which are likely involved in the biosynthesis of cyclic peptide-polyketides or lipopeptides, such as the cy-

Hapalosin BGC
Hapalosin is a cyclodepsipeptide that has been shown to cause the reversal of multidrug resistance in tumor cell lines [29]. The compound has been detected from lipophilic extracts of three cyanobacterial strains: Hapalosiphon welwitschii UH strain IC-52-3, Westiella intricata UH strain HT-29-1, and Fischerella sp. PCC 9431. The UH strains were collected by the same team at the University of Hawaii that discovered HT-58-2 as part of a global search for novel natural products from diverse cyanobacteria [2]. Cloning and heterologous expression of the majority of the hapalosin BGC from Fischerella sp. PCC 9431 was recently described [30]. In HT-58-2, a non-ribosomal peptide synthase/Type I polyketide synthase (NRPS/T1PKS)related gene cluster was found spanning 240 kb (region nine), as shown in Table 2. More than ten T1PKS/NRPS domains are arranged in this region, which are likely involved in the biosynthesis of cyclic peptide-polyketides or lipopeptides, such as the cyclodepsipeptide hapalosin. Five genes in the BGC aligned with all five members from the hapalosin BGC from Fischerella sp. PCC 9431 (Figure 9) [31]. The overall percent identity between amino acid sequences of HapA-E in the two BGCs is around 50-60% with an E-value of 0.0 as given by BLASTP; the module domains of PKS and NRPS from HT-58-2 differed from those in Fischerella sp. PCC 9431 ( Table 3). The distinct difference in sequences and modules suggests that hapalosin might not be the only product produced by this long BGC. A second group of hapalosin genes was identified (entry eight, Table 2), but the corresponding pathway appears to be incomplete.
Hapalosin is a cyclodepsipeptide that has been shown to cause the reversal of multidrug resistance in tumor cell lines [29]. The compound has been detected from lipophilic extracts of three cyanobacterial strains: Hapalosiphon welwitschii UH strain IC-52-3, Westiella intricata UH strain HT-29-1, and Fischerella sp. PCC 9431. The UH strains were collected by the same team at the University of Hawaii that discovered HT-58-2 as part of a global search for novel natural products from diverse cyanobacteria [2]. Cloning and heterologous expression of the majority of the hapalosin BGC from Fischerella sp. PCC 9431 was recently described [30].
In HT-58-2, a non-ribosomal peptide synthase/Type I polyketide synthase (NRPS/T1PKS)-related gene cluster was found spanning 240 kb (region nine), as shown in Table 2. More than ten T1PKS/NRPS domains are arranged in this region, which are likely involved in the biosynthesis of cyclic peptide-polyketides or lipopeptides, such as the cyclodepsipeptide hapalosin. Five genes in the BGC aligned with all five members from the hapalosin BGC from Fischerella sp. PCC 9431 (Figure 9) [31]. The overall percent identity between amino acid sequences of HapA-E in the two BGCs is around 50-60% with an Evalue of 0.0 as given by BLASTP; the module domains of PKS and NRPS from HT-58-2 differed from those in Fischerella sp. PCC 9431 ( Table 3). The distinct difference in sequences and modules suggests that hapalosin might not be the only product produced by this long BGC. A second group of hapalosin genes was identified (entry eight, Table 2), but the corresponding pathway appears to be incomplete.

Anatoxin-a/Homoanatoxin-a BGC
Anatoxin-a belongs to a class of toxins produced by cyanobacteria that have been reported as threats to humans and animals [32,33]. Anatoxin-a was first isolated from Anabaena flos-aquae in 1977 [34], and several analogues, including homoanatoxin-a, dihydroantoxin-a, and dihydrohomoanatoxin-a, were later identified from multiple genera of cyanobacteria [35]. Known as Very Fast Death Factor (VFDF), anatoxin-a can cause tremors, paralysis, and death within a few minutes when injected into the body cavity of mice [36], and also has high acute oral toxicity [37]. Thus, monitoring water supplies for the presence of cyanobacteria that produce anatoxins has become an important issue. However, due to the limitations of many traditional means of observations such as light microscopy, it is preferable to employ genomic methods to identify cyanobacteria that have the potential to produce anatoxins.
A putative anatoxin-a/homoanatoxin-a BGC was found between nucleotides 2,729,361 and 2,752,515 bp in the genome of HT-58-2 ( Figure 10). Aligned with the reported ana BGC from Oscillatoria sp. PCC 6506 [38,39], most genes (anaA-G) share 80-90% nucleotide and amino acid identity, except for that of AnaJ with 65% aligned amino acid identity ( Table 4). The main difference between the two ana BGCs is the position of the thioesterase gene (anaA). anaA is observed upstream of the anaB-G cluster in Oscillatoria sp. PCC 6506, versus downstream of the BGC in HT-58-2 ( Figure 10). Additionally, both predicted protein functions and identified functional domains in the polyketide synthases from HT-58-2 are extremely similar to those from Oscillatoria sp. PCC 6506, except for that of AnaG. In particular, three module domains (KS-AT-ACP) were detected in AnaG of HT-58-2, while an additional SAM-dependent methyltransferase domain exists in Oscillatoria sp. PCC 6506. In Figure 11, a 599 amino acid gap can be observed in the alignment of the two respective AnaG proteins, which corresponds to the methyl transferase region on the basis of module searching. The lack of a methyltransferase in the ana BGC of HT-58-2 might preclude methyl extension of the acetyl group (forming the propionyl group) in the conversion of anatoxin-a and dihydroanatoxin-a to the respective homoanatoxin-a and dihydrohomoanatoxin-a, unless a methyltransferase encoded elsewhere in the genome is used.
The BGC for anatoxins found in HT-58-2 is most closely related to that identified in Cylindrospermum stagnale PCC 7417 ( Figure 10) on the basis of BLAST results (data not shown). C. stagnale PCC 7417 has been reported to produce dihydroanatoxin-a, rather than anatoxin-a/homoanatoxin-a [40]. In addition to the presence and alignment of AnaA-G and AnaJ, a MATE efflux transporter is encoded by both BGCs but at distinct positions: downstream from the BGC for anatoxins in HT-58-2, while upstream in C. stagnale PCC 7417. The MATE efflux transporter was reported to export cyanotoxin (encoded within the saxitoxin BGC) from growing cyanobacteria [41] and might play the same role in transporting anatoxins.     The BGC for anatoxins found in HT-58-2 is most closely related to that identified in Cylindrospermum stagnale PCC 7417 ( Figure 10) on the basis of BLAST results (data not shown). C. stagnale PCC 7417 has been reported to produce dihydroanatoxin-a, rather than anatoxin-a/homoanatoxin-a [40]. In addition to the presence and alignment of AnaA-G and AnaJ, a MATE efflux transporter is encoded by both BGCs but at distinct positions: downstream from the BGC for anatoxins in HT-58-2, while upstream in C. stagnale PCC 7417. The MATE efflux transporter was reported to export cyanotoxin (encoded within the saxitoxin BGC) from growing cyanobacteria [41] and might play the same role in transporting anatoxins.

Hexose-Shinorine BGC
Cyanobacteria are known to produce pigments that afford protection against ultraviolet light. Mycosporine-like amino acids (MAAs) and scytonemins are two types of ultraviolet (UV)-absorbing compounds for protection against UV-B and UV-A radiation, respectively [42]. Shinorine is one of the ~30 MAAs earlier identified in 28 cyanobacterial strains [43] and red algae, Porphyra umbilicalis [44]. The structure of shinorine contains a cyclohexenimine core bearing glycine and serine substituents. Shinorine has a strong UVabsorbing potential due to a large molar absorption coefficient (ε = 28,100-50,000 M −1 cm -1 ) in the UV-A range [45], which generally is more penetrating than UV-B [46].
A putative BGC for the biosynthesis of shinorine was identified from HT-58-2 at the region of 4,124,014-4,131,995 bp, which can be compared to the reported MAA producer Anabaena variabilis ATCC 29,413 [47] (Figure 12). The shinorine BGC contains four coding Figure 11. Alignment between AnaG from HT-58-2 and from Oscillatoria sp. PCC 6506 via ClustalW Omega. The image was generated by ESPript 3. Red highlight indicates residues of identity or high similarity, non-highlighted fonts show non-matching amino acids, and dots indicate gaps in the sequence alignment.

Hexose-Shinorine BGC
Cyanobacteria are known to produce pigments that afford protection against ultraviolet light. Mycosporine-like amino acids (MAAs) and scytonemins are two types of ultraviolet (UV)-absorbing compounds for protection against UV-B and UV-A radiation, respectively [42]. Shinorine is one of the~30 MAAs earlier identified in 28 cyanobacterial strains [43] and red algae, Porphyra umbilicalis [44]. The structure of shinorine contains a cyclohexenimine core bearing glycine and serine substituents. Shinorine has a strong UV-absorbing potential due to a large molar absorption coefficient (ε = 28,100-50,000 M −1 cm −1 ) in the UV-A range [45], which generally is more penetrating than UV-B [46].
A putative BGC for the biosynthesis of shinorine was identified from HT-58-2 at the region of 4,124,014-4,131,995 bp, which can be compared to the reported MAA producer Anabaena variabilis ATCC 29,413 [47] (Figure 12). The shinorine BGC contains four coding sequences (CDSs), with the arrangement of genes and the predicted functions of each CDS shown in Figure 12 and Table 5. Comparison can be made with an NRPS in A. variabilis ATCC 29,413 (Ava_3855) and in HT-58-2 (ARV60046). The latter contains an additional condensation domain, which may affect the final product derived from the shinorine BGC. Such a condensation module was also observed in NRPSs from other cyanobacterial strains, e.g., Chlorogloeopsis fritschii PCC 6912 [48] and Chlorogloeopsis sp. PCC 9212 [49]. The two Chlorogloeopsis strains were identified by the presence of a shinorine BGC similar to that of HT-58-2. ATCC 29,413 (Ava_3855) and in HT-58-2 (ARV60046). The latter contains an additional condensation domain, which may affect the final product derived from the shinorine BGC. Such a condensation module was also observed in NRPSs from other cyanobacterial strains, e.g., Chlorogloeopsis fritschii PCC 6912 [48] and Chlorogloeopsis sp. PCC 9212 [49]. The two Chlorogloeopsis strains were identified by the presence of a shinorine BGC similar to that of HT-58-2.

Heterocyst Glycolipid BGC
Heterocyst glycolipids (HGs) form a protective layer for oxygen-sensitive nitrogenase enzymes [50] in the envelope of heterocystous nitrogen-fixing cyanobacteria. The HT-58-2 cyanobacterium is filamentous and is capable of nitrogen fixation [9]; therefore, the presence of a BGC for HG biosynthesis in the genome of HT-58-2 seemed likely. Indeed, an HG BGC occurs in HT-58-2 at position 974,174-991,938 bp. Six genes therein encode proteins aligned with high similarity (>63% identity) to those of a reported BGC for HGs from Nostoc sp. 'Peltigera membranacea cyanobiont' [51]. The alignment of the two HG BGCs and the predicted product functions are shown in Figure 13 and Table 6. A shortchain dehydrogenase/reductase (SDR) is absent in the HG BGC from HT-58-2, while a carboxypeptidase regulatory-like domain protein is present between HglG and HglE. Adjacent to the assumed HG BGC are twelve photosynthesis and phycobilisome-related proteins, for which the relationship to the formation of HGs remains unknown. Overall, the putative HG BGC in HT-58-2 aligned with 85% similarity to the proteins of the Nostoc sp., although the effect of the absent SDR on the biosynthesis of HGs requires further study.

Heterocyst Glycolipid BGC
Heterocyst glycolipids (HGs) form a protective layer for oxygen-sensitive nitrogenase enzymes [50] in the envelope of heterocystous nitrogen-fixing cyanobacteria. The HT-58-2 cyanobacterium is filamentous and is capable of nitrogen fixation [9]; therefore, the presence of a BGC for HG biosynthesis in the genome of HT-58-2 seemed likely. Indeed, an HG BGC occurs in HT-58-2 at position 974,174-991,938 bp. Six genes therein encode proteins aligned with high similarity (>63% identity) to those of a reported BGC for HGs from Nostoc sp. 'Peltigera membranacea cyanobiont' [51]. The alignment of the two HG BGCs and the predicted product functions are shown in Figure 13 and Table 6. A shortchain dehydrogenase/reductase (SDR) is absent in the HG BGC from HT-58-2, while a carboxypeptidase regulatory-like domain protein is present between HglG and HglE. Adjacent to the assumed HG BGC are twelve photosynthesis and phycobilisome-related proteins, for which the relationship to the formation of HGs remains unknown. Overall, the putative HG BGC in HT-58-2 aligned with 85% similarity to the proteins of the Nostoc sp., although the effect of the absent SDR on the biosynthesis of HGs requires further study. ATCC 29,413 (Ava_3855) and in HT-58-2 (ARV60046). The latter contains an additional condensation domain, which may affect the final product derived from the shinorine BGC. Such a condensation module was also observed in NRPSs from other cyanobacterial strains, e.g., Chlorogloeopsis fritschii PCC 6912 [48] and Chlorogloeopsis sp. PCC 9212 [49]. The two Chlorogloeopsis strains were identified by the presence of a shinorine BGC similar to that of HT-58-2.

Heterocyst Glycolipid BGC
Heterocyst glycolipids (HGs) form a protective layer for oxygen-sensitive nitrogenase enzymes [50] in the envelope of heterocystous nitrogen-fixing cyanobacteria. The HT-58-2 cyanobacterium is filamentous and is capable of nitrogen fixation [9]; therefore, the presence of a BGC for HG biosynthesis in the genome of HT-58-2 seemed likely. Indeed, an HG BGC occurs in HT-58-2 at position 974,174-991,938 bp. Six genes therein encode proteins aligned with high similarity (>63% identity) to those of a reported BGC for HGs from Nostoc sp. 'Peltigera membranacea cyanobiont' [51]. The alignment of the two HG BGCs and the predicted product functions are shown in Figure 13 and Table 6. A shortchain dehydrogenase/reductase (SDR) is absent in the HG BGC from HT-58-2, while a carboxypeptidase regulatory-like domain protein is present between HglG and HglE. Adjacent to the assumed HG BGC are twelve photosynthesis and phycobilisome-related proteins, for which the relationship to the formation of HGs remains unknown. Overall, the putative HG BGC in HT-58-2 aligned with 85% similarity to the proteins of the Nostoc sp., although the effect of the absent SDR on the biosynthesis of HGs requires further study.

Putative BGC for Nitrogen Fixation
Genes for nitrogen fixation enable cyanobacteria to grow in the absence of dissolved nitrogenous compounds. The presence of a minimum set of six genes, termed nifHDK and nifENB for catalysis and biosynthesis proteins, respectively, has been reported to be essential for nitrogen fixation [52]. In HT-58-2, genes for nitrogen fixation were found to be concentrated in two regions: 4,862,885-4,900,602 bp (nif cluster 1) and 4,972,977-4,986,558 bp (nif cluster 2) ( Figure 14). The former cluster contains five of the six genes (missing nifB), whereas the latter cluster contains three genes (missing nifENK); regardless, all six required genes are represented among the two clusters. The expression of the genes in the two clusters may differ under distinct growth conditions [53].

Putative BGC for Nitrogen Fixation
Genes for nitrogen fixation enable cyanobacteria to grow in the absence of dissolved nitrogenous compounds. The presence of a minimum set of six genes, termed nifHDK and nifENB for catalysis and biosynthesis proteins, respectively, has been reported to be essential for nitrogen fixation [52]. In HT-58-2, genes for nitrogen fixation were found to be concentrated in two regions: 4,862,885-4,900,602 bp (nif cluster 1) and 4,972,977-4,986,558 bp (nif cluster 2) ( Figure 14). The former cluster contains five of the six genes (missing nifB), whereas the latter cluster contains three genes (missing nifENK); regardless, all six required genes are represented among the two clusters. The expression of the genes in the two clusters may differ under distinct growth conditions [53].
In addition to the nif and vnf genes encoding nitrogenases (the latter a vanadiumdependent nitrogenase), modABC genes are present in cluster 1; such genes encode proteins involved in the high affinity molybdate/tungsten uptake system. The presence of Mo-related proteins in the nif gene cluster 1 supports the inferred role of Mo transport in regulating nitrogen fixation by HT-58-2.

Figure 14.
Gene clusters related to nitrogen fixation in HT-58-2. Nitrogen fixation genes are labeled in magenta, while additional genes related to ion transport or energy production are shown in maroon.

Discussion
Cyanobacteria are essential constituents of the earth's biota and occupy diverse terrestrial and aquatic ecosystems [54,55]. Cyanobacteria provide a tremendous treasury of natural products with rich applications to serve human needs in the pharmaceutical, food, and energy industries [56]. The culture HT-58-2, which contains a filamentous cyanobacterium and several other bacteria (mostly photoheterotrophic bacteria), is known to produce two distinct classes of natural products: tolyporphins, new members of the tetrapyrrole macrocycle family; and tolypodiols, which are diterpenoids. Both sets of natural products have been subjected to initial evaluation for biological activities. Tolyporphin A was found to inhibit efflux pump activity and thereby reverse multidrug resistance (MDR) in tumor cells [1,3,4]. Tolypodiols and analogues were evaluated for anti- Figure 14. Gene clusters related to nitrogen fixation in HT-58-2. Nitrogen fixation genes are labeled in magenta, while additional genes related to ion transport or energy production are shown in maroon.
In addition to the nif and vnf genes encoding nitrogenases (the latter a vanadiumdependent nitrogenase), modABC genes are present in cluster 1; such genes encode proteins involved in the high affinity molybdate/tungsten uptake system. The presence of Morelated proteins in the nif gene cluster 1 supports the inferred role of Mo transport in regulating nitrogen fixation by HT-58-2.

Discussion
Cyanobacteria are essential constituents of the earth's biota and occupy diverse terrestrial and aquatic ecosystems [54,55]. Cyanobacteria provide a tremendous treasury of natural products with rich applications to serve human needs in the pharmaceutical, food, and energy industries [56]. The culture HT-58-2, which contains a filamentous cyanobacterium and several other bacteria (mostly photoheterotrophic bacteria), is known to produce two distinct classes of natural products: tolyporphins, new members of the tetrapyrrole macrocycle family; and tolypodiols, which are diterpenoids. Both sets of natural products have been subjected to initial evaluation for biological activities. Tolyporphin A was found to inhibit efflux pump activity and thereby reverse multidrug resistance (MDR) in tumor cells [1,3,4]. Tolypodiols and analogues were evaluated for anti-inflammatory properties germane to neurological disorders [13,14]. In contrast with the intriguing reports concerning bioactive properties, the biosynthetic pathways to tolyporphins and tolypodiols have been little explored, and essentially nothing is known concerning the functional roles of these natural products in the cyanobacterium and associated community bacteria.
Knowledge of the full genome of the Nostocales HT-58-2 enables the identification of genes or BGCs that might contribute to the biosynthesis of tolyporphins and tolypodiols. Studies of the tolyporphins BGC (BGC-T) will be reported elsewhere. Here, a BGC was examined that contains almost all genes in the MEP pathway related to the backbone biosynthesis of terpenoids. Genes co-localizing with those in the MEP pathway are ubi genes, which are involved in the terpenoid-quinone pathway. Additionally, the significant product of the MEP pathway, geranylgeranyl diphosphate, could react with 4-hydroxybenzoate (a product of the UbiC-catalyzed reaction derived from the common precursor chorismic acid) to afford many important aromatic compounds such as ubiquinone and vitamin K [57]. Such a BGC for tolypodiols requires further analysis to establish the complete biosynthetic steps derived from the MEP pathway. The absence of genes (ispD and ispF) corresponding to MCT and MDS proteins in the tolypodiols BGC is (1) not believed to result from an incompleteness of sequencing reads when assembling and closing the circular genome of Nostocales HT-58-2 [9]; and (2) is not concerning with regard to biosynthesis, because such genes are present elsewhere in the genome.
The HT-58-2 culture grows under a medium deprived of aqueous-soluble nitrate (BG-11o), consistent with a presumed nitrogen fixation capacity of heterocyst-forming cyanobacteria [11]. The production of tolyporphins is profoundly increased in the absence versus presence of soluble nitrate [11]. A minimum set of six conserved nif genes (nifHDKENB) is required for nitrogen fixation [52]. Genes for two different nitrogenases (nif and vnf ) were found in two clustered regions. Nitrogen fixation in HT-58-2 is probably regulated by metal ion concentrations, because molybdate-dependent transporters (mod) and nitrogenases (nif, vnf ) are co-localized. Further studies may probe co-relationships of gene expression for nitrogen fixation and tolyporphins production in the face of environmental stimuli.
The studies reported here suggest that HT-58-2 may be capable of producing diverse natural products beyond tolyporphins and tolypodiols. Four BGCs are aligned with relatively high similarity (above 50% similarity) with those reported for the biosynthesis of hapalosin, anatoxin-a/homoanatoxin-a, hexose-shinorine, and heterocyst glycolipids. The alignments are not complete, because a gap region occurs in the putative hapalosin BGC and at least one gene is absent in each of the BGCs. Alignment and conserved domain analysis of the protein AnaG did not reveal a means for the methylation of anatoxin-a and dihydroanatoxin-a to form the corresponding homoanatoxin-a and dihydrohomoanatoxina. However, the effects on product structures of absent domains, or differences in domains of individual proteins, remain unknown.
Traditional methods of discovering bioactive products from microorganisms are often limited by available cultivation conditions or purification methodologies [58]. Genome mining and bioinformatics provide complementary strategies for exploring biosynthetic pathways of metabolites. Although none of the four natural products (hapalosin, anatoxina/homoanatoxin-a, hexose-shinorine, and heterocyst glycolipids) has yet been isolated from the HT-58-2 culture, the identification of the putative BGCs highlights potential opportunities. The potential presence of anatoxins and analogues in particular should prompt caution in handling the HT-58-2 cultures and extracts thereof, given their known physiological effects. Moreover, biological assays with crude extracts must be cautiously interpreted given the possible presence of numerous bioactive natural products. Identification of putative BGCs enables further work to address issues concerning possible evolutionary origin. Taken together, the putative BGCs described in this study establish a framework for investigation of the biosynthesis of tolypodiols and other natural products, as well as nitrogen fixation and regulation in the HT-58-2 cyanobacterial-bacterial consortium.