Cryptic Diversity of Black Band Disease Cyanobacteria in Siderastrea siderea Corals Revealed by Chemical Ecology and Comparative Genome-Resolved Metagenomics

Black band disease is a globally distributed and easily recognizable coral disease. Despite years of study, the etiology of this coral disease, which impacts dozens of stony coral species, is not completely understood. Although black band disease mats are predominantly composed of the cyanobacterial species Roseofilum reptotaenium, other filamentous cyanobacterial strains and bacterial heterotrophs are readily detected. Through chemical ecology and metagenomic sequencing, we uncovered cryptic strains of Roseofilum species from Siderastrea siderea corals that differ from those on other corals in the Caribbean and Pacific. Isolation of metabolites from Siderastrea-derived Roseofilum revealed the prevalence of unique forms of looekeyolides, distinct from previously characterized Roseofilum reptotaenium strains. In addition, comparative genomics of Roseofilum strains showed that only Siderastrea-based Roseofilum strains have the genetic capacity to produce lasso peptides, a family of compounds with diverse biological activity. All nine Roseofilum strains examined here shared the genetic capacity to produce looekeyolides and malyngamides, suggesting these compounds support the ecology of this genus. Similar biosynthetic gene clusters are not found in other cyanobacterial genera associated with black band disease, which may suggest that looekeyolides and malyngamides contribute to disease etiology through yet unknown mechanisms.


Introduction
Breakthroughs in sequencing technologies over the last few decades have shed light on the extensive genetic diversity of microbial life and its tremendous wealth of biosynthetic gene clusters. Cyanobacteria, especially filamentous types, have proven to be a rich source of secondary metabolites, including antimicrobial and bioactive natural products [1][2][3]. In some cases, these unique products are toxic and can be produced at levels high enough to be detrimental to humans, pets, and wildlife such as during harmful algal blooms of the cyanobacteria Microcystis. Recent metagenomic and metatranscriptomic sequencing of Microcystis blooms revealed the presence not only of co-occurring toxigenic and nontoxigenic strains, but also strains that harbored partial gene clusters for microcystin that were abundant and expressed during specific successional phases of the bloom [4]. Thus, the genomic revolution is providing new avenues to explore the functional and ecological roles of cryptic diversity within cyanobacteria.
Black band disease (BBD) is arguably the longest-studied coral disease, as it was first identified in the scientific literature in the 1970s [5] and documented in artwork as early as the 1890s [6]. Yet, we still do not fully understand the etiology of this destructive and globally distributed coral disease. The engineer of BBD is the filamentous cyanobacterium Roseofilum reptotaenium [7], which forms a dense, polymicrobial mat under which anoxic and sulfidic conditions smother and kill coral tissue [8]. While Roseofilum is part of the normal microflora of corals, where it can be found at low levels even in corals unaffected by BBD [9], it is unknown what triggers Roseofilum to form BBD mats. However, the natural products formed by Roseofilum may play a role in manipulating the microbial communities on the coral surface through quorum sensing [9] or other means.
While investigating natural products from BBD mats in corals, we uncovered a pair of novel compounds related to previously described looekeyolides A and B [10]. Strikingly, these compounds were detected only in BBD cyanobacterial mats collected from the massive starlet coral, Siderastrea siderea. Herein, we characterized the cyanobacteria associated with BBD in S. siderea corals through chemical ecology, 16S rRNA sequencing, and genome-resolved metagenomics to determine the differences between these cyanobacteria and previously characterized strains of Roseofilum.

Collection of S. siderea-Associated Black Band Disease Cyanobacterial Mats
BBD cyanobacterial mats were sampled from S. siderea corals when found during SCUBA diving expeditions in Belize and Florida from 2014 to 2018 (Table S2). These samples were divided among analyses for characterization of major secondary metabolites, characterization of bacterial community composition, and genome-resolved metagenomics. In addition, a non-axenic, cyanobacterial enrichment culture was grown from a BBD mat on a S. siderea coral in Florida. The predominant cyanobacterium was likely a Geitlerinema strain, as this was the only cyanobacterial genome retrieved from the culture (Table S2).

Isolation and Characterization of Novel Looekeyolides
Previously, we reported two related macrocyclic metabolites, looekeyolide A and looekeyolide B (Figure 1), isolated from the lipophilic extracts of black band disease mats collected from Montastraea cavernosa, Orbicella annularis, Orbicella faveolata, Pseudodiploria strigosa, and Goniopora fruticosa, and from cultured Roseofilum reptotaenium [10]. Looekeyolide A is a 20-membered macrocyclic compound formed by a 16-carbon polyketide chain, 2-deamino-2-hydroxymethionine, and D-leucine, and looekeyolide B is its autooxidation product at the 2-deamino-2-hydroxymethionine moiety. Interestingly, liquid chromatography-mass spectrometry (LC-MS) analysis of BBD extracts pooled from several small samples of S. siderea at South Water Caye, Belize, collected in July 2014, showed no peaks at m/z 686 [M + Na]+ for looekeyolide A or at m/z 702 [M + Na]+ for looekeyolide B. Instead, we observed peaks at m/z 720 [M + Na]+ for looekeyolide C and at m/z 736 [M + Na]+ for looekeyolide D, two new related compounds separated by 16 mass units (Figures 1 and S1), the same spacing reported for looekeyolides A and B. Similar results were observed from the lipophilic extract of an August 2018 BBD collection from S. siderea growing in Curlew Cay, Belize (Figure S2) and the lipophilic extract of a July 2018 BBD collection from S. siderea from Fort Lauderdale, Florida (Figure S3). This mass spectral information prompted us to conduct further chemical investigation of these small samples.
The lipophilic extracts of each batch of freeze-dried sample collections were subjected to reversed-phase column chromatography followed by reversed-phase high-performance liquid chromatography (HPLC) using 80% MeOH-20% water to give looekeyolide D. Although the related compound looekeyolide C was detected in low-resolution electrospray ionization mass spectrometry (LRESIMS) traces, the isolation procedures auto-oxidized it completely to the stable looekeyolide D, and looekeyolide C was not isolated for other spectral studies.
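The mass shifts reported above can be checked with simple nominal-mass arithmetic. The sketch below is illustrative only: it uses integer (nominal) atomic masses, and it assumes that the 34-unit offset between the A/B and C/D pairs reflects a leucine-to-phenylalanine substitution, which is consistent with the adenylation-domain specificities reported later in the text but is not demonstrated by this calculation.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16, "Na": 23}


def nominal_mass(formula: dict) -> int:
    """Sum nominal atomic masses for a formula given as {element: count}."""
    return sum(NOMINAL[element] * count for element, count in formula.items())


# Free amino acids (standard molecular formulas).
LEUCINE = {"C": 6, "H": 13, "N": 1, "O": 2}        # C6H13NO2
PHENYLALANINE = {"C": 9, "H": 11, "N": 1, "O": 2}  # C9H11NO2

# [M + Na]+ peaks reported in the text for looekeyolides A-D.
ADDUCT_MZ = {"A": 686, "B": 702, "C": 720, "D": 736}

# Nominal neutral masses implied by singly charged sodium adducts.
neutral = {name: mz - NOMINAL["Na"] for name, mz in ADDUCT_MZ.items()}

# One added oxygen atom (autooxidation) accounts for +16 within each pair.
oxidation_shift_ab = ADDUCT_MZ["B"] - ADDUCT_MZ["A"]
oxidation_shift_cd = ADDUCT_MZ["D"] - ADDUCT_MZ["C"]

# Assumed explanation for the offset between pairs: Leu replaced by Phe.
residue_shift = nominal_mass(PHENYLALANINE) - nominal_mass(LEUCINE)

print(neutral)
print(oxidation_shift_ab, oxidation_shift_cd, residue_shift)
```

The offset between pairs (720 − 686 and 736 − 702) equals the nominal mass difference between phenylalanine and leucine, so the substitution is a plausible reading of the spectra rather than a proof of structure.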

Bacterial Composition of Belize Samples
Bacterial community composition was characterized for both the RNA and DNA fractions of ten Belize S. siderea colonies with black band disease (BBD). A total of 1,815,043 16S rRNA gene amplicon sequencing reads passed quality-filtering, with an average of 90,752 reads per sample (min 2973, max 295,333) (Table S1). Only 54 amplicon sequence variants (ASVs) were detected in the twenty libraries (10 RNA, 10 DNA), with just six prevalent ASVs (Table 2, Figure 2B). The most prevalent ASV, hereafter referred to as Cyano1, ranged from 58 to 100% relative abundance per sample and was classified as the cyanobacterial genus Roseofilum. This was the only ASV detected in the RNA fraction of SID1, the sample with the highest number of sequencing reads (295,333). Five additional prevalent ASVs had relative abundances of less than 30% per sample. These included three additional cyanobacterial ASVs, one ASV classified as Ruegeria, and one ASV that was classified only as Bacteria with the SILVA database but was identified as Beggiatoa through a BLASTN search (Table 2). While the Cyano1 ASV was classified as "Roseofilum AO1-A" with SILVA, BLASTN searches of these sequences revealed that they share only 95% sequence similarity in the 60-bp V6 region of the 16S rRNA gene with the Roseofilum strain AO1-A (KU579397), isolated from the Great Barrier Reef [11], or with Roseofilum strain Cy1 (KP689103), isolated from the Florida Reef Tract [12]. Instead, the Cyano1 (Roseofilum) ASV was an exact match to several clone library sequences (EF123634, EF123639, EF123644, EF123645, EF123646) that were previously detected in BBD cyanobacterial mats from Caribbean S. siderea corals [13].

Table 2. Six predominant Amplicon Sequence Variants (ASVs) in V6 amplicon libraries from black band disease (BBD) cyanobacterial mats from S. siderea corals in Belize. For each ASV, the SILVA classification is provided, as well as the closest BLAST match in GenBank for comparison. Accession numbers in bold indicate sequences originating from previous BBD studies. Columns: ASV name; SILVA classification; closest BLAST match (% similarity for V6 region).
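To put the 95% similarity figure in context, it helps to note how few positions separate two 60-bp V6 sequences at that level. The following sketch computes percent identity over an ungapped, equal-length alignment; the sequences are synthetic placeholders (not the actual ASV or GenBank sequences), and the function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mismatches_for_identity(length: int, identity_pct: float) -> int:
    """Number of mismatched positions implied by a percent identity."""
    return round(length * (1 - identity_pct / 100))


# Synthetic 60-bp reference; NOT a real V6 sequence.
reference = "ACGT" * 15

# Introduce three substitutions at arbitrary positions.
query = list(reference)
for position in (5, 25, 45):
    query[position] = "A" if reference[position] != "A" else "C"
query = "".join(query)

print(percent_identity(reference, query))   # identity of the synthetic pair
print(mismatches_for_identity(60, 95))      # mismatches implied by 95% over 60 bp
```

Under these assumptions, 95% similarity over 60 bp corresponds to three mismatched positions, which illustrates why such a short region offers limited resolution among closely related strains.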

Metagenome-Assembled Genomes of Cyanobacteria from Belize and Florida
Four of the Belize S. siderea colonies (SIDH, SIDI, SIDL, and SIDO) produced detectable levels of looekeyolide C/D by LC-MS, while six of the colonies (SID1, SID2, SID3, SID8m, SIDa, and SIDE) did not. However, the same predominant cyanobacterial ASV (Cyano1) was found in both DNA and RNA fractions of BBD from all Belize S. siderea colonies. The variability in detection of looekeyolides is likely due to the small sample sizes collected and small amounts of chemical extracts. To confirm the genetic potential for the biosynthesis of looekeyolides, we compared metagenome-assembled genomes (MAGs) of cyanobacteria from pooled metagenomes of producer or non-producer samples (as described in the methods) from Belize as well as producer samples from Florida.
Quality-filtered metagenomic sequencing reads ranged from roughly 13 million to 33 million per metagenomic library (Table S2). Seven cyanobacterial MAGs with >90% completeness and <6% contamination were retrieved from the six metagenome libraries (Table S3). Roseofilum MAGs were retrieved from all three Belize metagenomes and one Florida metagenome, encompassing both producers and non-producers of looekeyolide C. In addition, cyanobacterial MAGs that do not belong to the genus Roseofilum were retrieved from one Belize metagenome and two Florida metagenomes (Table S3). Among the four Roseofilum MAGs from S. siderea corals, the average nucleotide identity (ANI) of shared genes was >99%, while the ANI of the Roseofilum MAGs compared to non-Roseofilum cyanobacteria from S. siderea corals was too close to the detection limit for accuracy (<75%), suggesting they belong to different genera (Table S4). Roseofilum MAGs from S. siderea corals had >98% ANI with Roseofilum MAGs retrieved from other Caribbean coral species and had >94% ANI with Roseofilum MAGs retrieved from Pacific coral species (Table S4). Of the non-Roseofilum MAGs from S. siderea corals, SID2_20 and SBC9 had >99% ANI of shared genes with each other and with the Geitlerinema BBD 1991 MAG from Caribbean Montastraea cavernosa [20,21]. All three strains, SID2_20, SBC9, and BBD 1991, were classified as Geitlerinema species by GTDB-Tk. The ANI of the non-Roseofilum MAG SBLK1 compared to SID2_20, SBC9, and Geitlerinema BBD 1991 was too close to the detection limit for accuracy (<75%). SBLK1 was classified to the cyanobacterial family Spirulinaceae by GTDB-Tk. The 16S rRNA gene was not detected in the Spirulinaceae bacterium SBLK1 for further taxonomic identification.
The presence of three distinct cyanobacterial genera within the order Oscillatoriales in the MAGs was consistent with the presence of three cyanobacterial genera (Roseofilum, Geitlerinema, and Hormoscilla) in the 16S rRNA amplicon libraries, although not an exact match for all genera. As the Roseofilum strains were the most predominant cyanobacteria within BBD mats from Belize and previous studies from both the Caribbean and Pacific [9,22,23], we focused the pangenome analysis primarily on the Roseofilum genomes.

Comparative Genomics of Black Band Disease-Associated Cyanobacteria
Comparative genomics of nine Roseofilum MAGs included four S. siderea-associated Roseofilum MAGs from this study and five Roseofilum MAGs from other coral species [12,22,23]. All nine MAGs passed the quality threshold of >90% completeness and <6% contamination (Table S3). Pangenome analysis identified 2746 core genes found in all nine Roseofilum genomes, 3352 shell genes in two to eight Roseofilum genomes, and 2370 cloud genes found in only one Roseofilum genome (Figure 3). Three distinct clusters of genomes were detected: Caribbean Roseofilum from four Siderastrea corals, three Pacific Roseofilum strains, and Caribbean Roseofilum from two other boulder corals. A total of 294 genes were found in all 4 Roseofilum genomes from S. siderea but not in any other Roseofilum genomes. Of these, 225 (77%) were annotated as hypothetical proteins, while only 69 (23%) had functional annotations.
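The core, shell, and cloud categories used above follow directly from how many genomes carry each gene. The sketch below applies those definitions to a toy presence/absence table; the gene and genome names are hypothetical, and it is not the pangenome software used in the study. It also re-derives the totals and percentages quoted in the paragraph.

```python
from collections import Counter


def classify_gene(n_present: int, n_genomes: int) -> str:
    """Core: in every genome; shell: in 2 to n-1 genomes; cloud: in exactly one."""
    if not 1 <= n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"
    if n_present >= 2:
        return "shell"
    return "cloud"


def partition_pangenome(presence: dict, n_genomes: int) -> Counter:
    """Count genes per category from {gene: set of genomes carrying it}."""
    return Counter(classify_gene(len(genomes), n_genomes) for genomes in presence.values())


# Toy example with three hypothetical genomes (g1-g3).
toy_presence = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g2"},
    "geneC": {"g3"},
    "geneD": {"g1", "g3"},
}
toy_counts = partition_pangenome(toy_presence, n_genomes=3)

# Figures reported in the text for the nine Roseofilum genomes.
reported = {"core": 2746, "shell": 3352, "cloud": 2370}
pangenome_size = sum(reported.values())

# Share of the 294 S. siderea-specific genes annotated as hypothetical proteins.
hypothetical_pct = round(100 * 225 / 294)
annotated_pct = round(100 * 69 / 294)

print(dict(toy_counts), pangenome_size, hypothetical_pct, annotated_pct)
```

With nine genomes, a gene present in eight of them is still shell, which matches the "two to eight" range given in the text.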
Each of the nine Roseofilum MAGs had 14 to 19 biosynthetic gene clusters identified by antiSMASH, including multiple clusters for terpenes, ribosomally synthesized and post-translationally modified (RiPP)-like clusters as well as RiPP recognition elements (RRE), Type I polyketide synthases (T1PKSs), nonribosomal peptide synthetases (NRPSs), and hybrid T1PKS/NRPS clusters (Figure 4). Each of the nine Roseofilum MAGs had one biosynthetic gene cluster for tRNA-dependent cyclodipeptide synthase (CDPS), which has been more commonly found in the genomes of Actinobacteria, Firmicutes, and Proteobacteria [24,25]. The Spirulinaceae MAG also had 14 detectable biosynthetic gene clusters, including gene clusters for antimicrobial lanthipeptides, thiopeptide, and cyanobactins, as well as resorcinol (Figure 4). In contrast to the Roseofilum and Spirulinaceae MAGs, only eight or nine biosynthetic gene clusters per genome were detected in Geitlerinema MAGs.
The putative biosynthetic gene clusters for looekeyolides are classified as hybrid T1PKS/NRPS clusters and complete biosynthetic gene clusters were retrieved from all nine Roseofilum MAGs (Figure 5), meaning both the producer and non-producer samples had the genetic potential to make looekeyolides, although the level of biosynthesis of the looekeyolides may vary among samples. In addition, two Pacific strains [22,23] appear to have the genetic capacity to produce looekeyolides, but their natural products have not been elucidated.
A putative biosynthetic pathway for looekeyolide C/D is proposed (Figure 6), with high similarity to the pathway for looekeyolide A/B previously described [10] in Roseofilum MAGs from corals other than S. siderea. The adenylation domain of LklI in Roseofilum MAGs from S. siderea that produce looekeyolide C/D had specificity for L-phenylalanine, while Roseofilum MAGs from corals other than S. siderea that produce looekeyolide A/B had specificity for L-leucine (Table S5, Figure S11). Most of the looekeyolide biosynthetic genes in Caribbean Roseofilum from multiple coral species (Orbicella annularis, Pseudodiploria strigosa, Montastraea cavernosa) were 97% to 99% similar to the genes in Caribbean Roseofilum from S. siderea except for LklI which was 92% similar due to the low identity (46%) of their A domains (Figure S12). Hybrid T1PKS/NRPS biosynthetic gene clusters predicted to produce looekeyolides were not detected in Geitlerinema or Spirulinaceae MAGs.
Each of the nine Roseofilum MAGs had hybrid T1PKS/NRPS clusters annotated as malyngamides including malyngamide C acetate and malyngamide I (Figure S13). Malyngamides are small amides, many of which have lyngbic acid as a carboxylic acid side chain. Both malyngamides and lyngbic acid from Caribbean filamentous cyanobacteria, including Roseofilum, have previously been shown to interfere with bacterial quorum sensing [9,26]. Most of the nine Roseofilum MAGs had multiple putative malyngamide biosynthetic gene clusters, and no clear patterns were observed that corresponded to differences in gene clusters among coral hosts or geographic locations (Figure S13). Hybrid T1PKS/NRPS biosynthetic gene clusters predicted to produce malyngamides were not detected in Geitlerinema or Spirulinaceae MAGs.
Analysis of biosynthetic gene clusters also revealed that the Roseofilum MAGs from S. siderea corals had one type of biosynthetic gene cluster that was not found in the other Roseofilum strains. All four Roseofilum MAGs from S. siderea corals had a lasso peptide biosynthetic gene cluster that encoded a 98 aa stand-alone RiPP recognition element (RRE), a 135 aa lasso peptide transglutaminase homolog (leader peptidase, capB), and a 633 aa lasso peptide asparagine synthase homolog (lasso cyclase, capC). These biosynthetic genes were flanked on each side by ABC transporter-related genes (Figure 7). While the amino acid sequences in the RiPP recognition element and the capB leader peptidase were identical in all four Roseofilum MAGs from S. siderea corals, the amino acid sequences for the capC lasso cyclase in SID1.26 and SBFL6 differed by 4 amino acids from SID2.16 and SID3.16. A search with blastp of the S. siderea-associated Roseofilum lasso cyclase amino acid sequence (from SID2.16) showed low similarity (≤67% similarity) to homologues in other cyanobacterial genomes.

Discussion
Roseofilum reptotaenium, the cyanobacterial engineer of black band disease (BBD) in corals [7], is found in tropical coral reefs around the world and impacts at least 72 coral species [27]. Here, we uncovered cryptic diversity among Roseofilum strains through both chemical and genomic analyses. The sequence-based threshold of 95% ANI has been proposed as the delineation of bacterial species [28][29][30][31]. Using this metric, all six Roseofilum strains from the Caribbean are the same cyanobacterial species regardless of the host coral species (>98% ANI), while the three Pacific Roseofilum strains were very close to this threshold (94.25-94.68% ANI) and thus, potentially represent a separate species. However, using ANI for comparison only reveals the similarity among shared genes and does not capture differences in gene content, i.e., when genes are present in one strain and absent in another. Roseofilum strains on S. siderea corals were both chemically and genetically distinct from other strains in the Caribbean despite belonging to the same cyanobacterial species. This difference was consistent across sites in Belize and Florida and through time, as samples were collected in 2014, 2015, and 2018. Of note, surveys using only 16S rRNA amplicons would not be able to distinguish among these distinct strains of Roseofilum, thus highlighting the utility of metagenomic sequencing in uncovering the functional differences among visually similar filamentous cyanobacteria in reef ecosystems.
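The ANI reasoning in this paragraph can be written as a small decision rule. The sketch below uses the 95% species threshold and the <75% detection limit mentioned in the text; the "borderline" margin of one percentage point is an assumption introduced here for illustration, as are the function name and the example value of 99.0 standing in for ">98%".

```python
SPECIES_THRESHOLD = 95.0   # proposed species delineation (from the text)
DETECTION_LIMIT = 75.0     # below this, ANI is too unreliable to interpret (from the text)


def interpret_ani(ani: float, margin: float = 1.0) -> str:
    """Interpret a pairwise ANI value; `margin` is an illustrative assumption."""
    if not 0.0 <= ani <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    if ani < DETECTION_LIMIT:
        return "unreliable (likely different genera)"
    if abs(ani - SPECIES_THRESHOLD) <= margin:
        return "borderline"
    if ani > SPECIES_THRESHOLD:
        return "same species"
    return "different species"


examples = {
    "Caribbean vs Caribbean (illustrative 99.0)": 99.0,
    "Caribbean vs Pacific (lower bound, 94.25)": 94.25,
    "Caribbean vs Pacific (upper bound, 94.68)": 94.68,
    "Roseofilum vs non-Roseofilum (illustrative 74.0)": 74.0,
}
for label, value in examples.items():
    print(f"{label}: {interpret_ani(value)}")
```

Note that a rule of this kind only summarizes similarity among shared genes; as stated above, it cannot capture differences in gene content such as the lasso peptide cluster found only in the S. siderea strains.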
Over 2000 metabolites have been described from Cyanobacteria [32]. Some of the ecological roles of these natural products include grazing deterrents, allelopathy, iron scavenging, UV protection, and signaling [33]. Comparative analysis of nine Roseofilum genomes showed that each of the Roseofilum genomes had multiple terpene, ribosomal and non-ribosomal peptide, and polyketide biosynthetic gene clusters. Each of these classes of cyanobacterial natural products includes potential antibacterial or antiviral compounds [34][35][36]. In fact, every type of biosynthetic cluster detected in these BBD-associated cyanobacterial genomes, regardless of genus, includes natural products that exhibit antimicrobial properties. These antimicrobial agents may play a role in the progression of BBD by allowing the cyanobacteria to outcompete other coral-associated microorganisms that would normally suppress pathogen growth.
All Roseofilum genomes examined here had hybrid peptide/polyketide biosynthetic gene clusters proposed to encode the cyclic depsipeptide looekeyolides, the lipopeptide malyngamides, and a tRNA-dependent cyclodipeptide that were not found in four non-Roseofilum BBD-associated cyanobacterial genomes. Our previous work demonstrated that looekeyolides from Roseofilum under laboratory conditions do not alter growth and biofilm formation by marine bacteria, do not act as siderophores, and do not impact photosynthetic performance of the coral [10]. The oxygen-sensitive looekeyolide A reduces hydrogen peroxide levels, suggesting a role in combating reactive oxygen species on the coral surface [10]. Malyngamides from filamentous cyanobacteria have demonstrated both cytotoxic and anticancer properties [37] and antibacterial properties against Gram-positive pathogens [38]. Malyngamide C and lyngbic acid have also demonstrated quorum-sensing inhibition in marine bacteria [9,26]. In addition, the tRNA-dependent cyclodipeptides have variously shown antibacterial, antifungal, antiviral, and antitumor properties [39]. In contrast to looekeyolides and malyngamides, only the S. siderea-associated Roseofilum genomes contained biosynthetic gene clusters for lasso peptides. Lasso peptides are underexplored in Cyanobacteria [40]. Characterized lasso peptides have demonstrated a variety of activities including antimicrobial properties, and the unique lasso structure imparts heat and chemical resistance [41].
The biosynthesis of malyngamides and lasso peptides has been well characterized [42,43], setting the stage for their heterologous production and bioactivity investigation. In addition to the proposed biosynthetic pathway for looekeyolide C presented here, we recently proposed a pathway for looekeyolide A [10]. With a cultivated strain of Roseofilum that produces looekeyolide A [10] and the genome sequences for multiple, unique Roseofilum strains, we are poised for future studies to uncover the bioactivity of these natural products and their potential use for novel applications.
Collectively, Roseofilum genomes associated with BBD from locations in the Caribbean and the Pacific share a wide assortment of peptide and polyketide natural products that may have bioactive properties. The exact roles of looekeyolides, malyngamides, and other secondary metabolites are not known, but the conserved nature of these compounds implies they play an important role in the ecology of these cyanobacteria and may also contribute to disease etiology through manipulation of the microbial communities around them.

Sample Collection and Enrichment Culturing
Black band disease (BBD) cyanobacterial mats were collected from Siderastrea siderea corals in Belize and Florida by aspiration with a needleless syringe for both chemical analysis and extraction of nucleic acids. BBD mats from several colonies of Siderastrea were combined for bulk analysis in three batches: one from South Water Caye, Belize in July 2014, one from Curlew Cay, Belize in August 2018, and one from Fort Lauderdale, Florida in July 2018. For microbiome analysis, relatively thin BBD mats (Figure 1) from ten colonies of S. siderea were sampled while SCUBA diving in September 2015 at Carrie Bow Cay, Curlew Cay, or South Water Channel near the Smithsonian Carrie Bow Cay Field Station in Belize. One additional S. siderea coral exhibiting BBD was sampled at Looe Key in the Florida Keys National Marine Sanctuary in July 2017. Finally, a BBD mat was collected from a S. siderea coral offshore from Ft. Lauderdale, FL in July 2018. A non-axenic, cyanobacterial enrichment culture of the BBD mat from Ft. Lauderdale, FL was grown in artificial seawater amended with Cyanobacterial BG-11 media (ATCC medium 616) as previously described [12].

Characterization of Major Secondary Metabolites
Bulk cyanobacterial mats of the 2014 collection were freeze-dried and extracted repeatedly with MeOH. Similarly, the 2018 collection was freeze-dried and extracted with 50% EtOAc-50% MeOH saturated with helium gas. The extracts were chromatographed on a column of C18 (3 g) using a MeOH-H2O step gradient system to give five sub-fractions. Sub-fraction 3 (0.002 g), eluted with 80% MeOH-20% H2O, was further separated by reversed-phase HPLC (semi-prep 250 mm × 10 mm, 5 µm, RP-18, flow 3.0 mL/min) using 80% MeOH-20% H2O to give 0.6 mg of looekeyolide D (tR = 10.3 min, yield, 0.03% dry wt) (July 2014 batch) and 0.3 mg of looekeyolide D (tR = 10.3 min, yield, 0.06% dry wt) (August 2018 batch). Looekeyolide C was not isolated and was assumed to have been completely oxidized during the isolation process.
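As a quick consistency check on the reported yields, the dry mass of mat implied by each isolated mass and percent yield can be back-calculated. The following is a minimal sketch; the function name is hypothetical, and the implied dry masses are derived here from the stated values rather than reported in the text.

```python
def implied_dry_mass_mg(isolated_mg: float, yield_percent: float) -> float:
    """Dry mass (mg) implied by an isolated mass and a percent yield by dry weight."""
    return isolated_mg / (yield_percent / 100.0)

# (isolated looekeyolide D in mg, yield in % dry wt), as stated for each batch
batches = {
    "July 2014": (0.6, 0.03),
    "August 2018": (0.3, 0.06),
}

for name, (mg, pct) in batches.items():
    grams = implied_dry_mass_mg(mg, pct) / 1000.0
    print(f"{name}: about {grams:.1f} g dry mat implied")
```

Under these numbers, the 2018 batch gave half the isolated mass at twice the percent yield, which implies that roughly a quarter as much dry material was extracted.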
Optical rotations were recorded on a Jasco P2000 polarimeter. UV spectrophotometric data were acquired on a Shimadzu PharmaSpec UV-visible spectrophotometer. NMR data were collected on a JEOL ECA-600 spectrometer operating at 600.17 MHz for 1H and 150.9 MHz for 13C. 1H NMR chemical shifts (referenced to residual CD3OD at δ 3.30) were assigned using a combination of data from 2D DQF COSY and multiplicity-edited HSQC experiments. The edited-HSQC experiment was optimized for JCH = 140 Hz and the HMBC experiment was optimized for 2/3JCH = 8 Hz. 13C NMR chemical shifts (referenced to CD3OD observed at δ 49.0) were assigned on the basis of multiplicity-edited HSQC experiments. Low resolution liquid chromatography mass spectrometry (LRLC-MS) was performed on a Thermo Scientific (Waltham, MA, USA) LTQ LC-MS ESI instrument connected to a Grace Vydac reversed-phase column (C18, 218TP, 5 µm, 100 mm × 2.1 mm) using a mixture of 0.1% HCOOH in water (A) and 0.1% HCOOH in CH3CN (B) at a rate of 0.2 mL/min. The gradient system used was 90% A to 0% A in 15 min followed by 100% B for the next 10 min. HRMS data were obtained using an Agilent 6210 LC-TOF mass spectrometer equipped with an APCI/ESI multimode ion source detector at the Mass Spectrometer Facility at the University of California, Riverside, California. Varian BondElut octadecyl (C18) was used for column chromatography. All solvents used were of HPLC grade (Fisher Scientific).
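The LC-MS gradient described above can be written as a simple function of time. This is a sketch only: the text gives the start and end compositions and the durations, and the linear shape of the ramp is an assumption made here for illustration. The function name is hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Percent solvent B (0.1% HCOOH in CH3CN) at time t_min.

    Assumes a linear ramp (not stated in the text) from 10% B to 100% B
    over 15 min, followed by a 10-min hold at 100% B.
    """
    ramp_end, run_end = 15.0, 25.0
    start_b, end_b = 10.0, 100.0  # 90% A -> 0% A equals 10% B -> 100% B
    if not 0.0 <= t_min <= run_end:
        raise ValueError("time is outside the 25-min program")
    if t_min >= ramp_end:
        return end_b
    return start_b + (end_b - start_b) * t_min / ramp_end

for t in (0, 7.5, 15, 20):
    print(t, percent_b(t))
```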

V6 Amplicon Libraries of Belize Samples
The V6 region of bacterial 16S rRNA genes was amplified from both DNA and cDNA of the Belize samples with previously published primers [44] using previously described methods [9]. Briefly, the V6 region was amplified in triplicate with Phusion High-Fidelity Polymerase (New England Biolabs, Ipswich, MA, USA). Triplicate PCR amplifications were pooled for each sample, cleaned with a MinElute kit (Qiagen, Germantown, MD, USA), and quantified by NanoDrop (ThermoScientific, NanoDrop Products, Wilmington, DE, USA). Two hundred nanograms of each cleaned amplicon library was submitted to the Interdisciplinary Center for Biotechnology Research at the University of Florida (RRID:SCR_019152), where the libraries were size selected for fragments from 200 to 240 bp with a 2% agarose PippinPrep cassette and cleaned again to remove agarose. Sequencing was performed on an Illumina MiSeq with a 150-bp paired-end protocol, using single indexing. Sequencing reads were parsed by Illumina index at the sequencing center and further parsed by the inline barcode using the command-line options of FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/ (accessed on 30 July 2018)). Primers and adaptors were removed using cutadapt v. 2.8 [45] and sickle v. 1.33 [46]. Parsed, quality-filtered amplicon sequencing reads are publicly available through NCBI's Sequence Read Archive under the Bioproject ID PRJNA645365. Quality-filtered paired reads were merged and amplicon sequence variants were determined from de-replicated sequences with DADA2 v. 1.10.1 [48], with taxonomic assignment from the SILVA small subunit ribosomal RNA database v. 132 [47]. Sequences classified as mitochondria or chloroplast were removed from further analysis. Prevalent sequences that were unclassified were searched against NCBI's non-redundant nucleotide collection with BLASTn [49]. Bacterial community analysis was completed with phyloseq v. 1.26.1 [50] and plotted with ggplot2 v. 3.1.1 [51].
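The step of removing organelle-derived sequences can be illustrated with a short sketch. The analysis itself was run with DADA2 and phyloseq in R; the Python below is not that pipeline, and the record layout, function names, and example records are hypothetical. It simply drops any sequence variant whose taxonomic assignment carries a chloroplast or mitochondria label at any rank.

```python
ORGANELLE_LABELS = {"chloroplast", "mitochondria"}

def is_organelle(taxonomy: dict) -> bool:
    """True if any rank of the assignment is labeled chloroplast or mitochondria."""
    return any((label or "").lower() in ORGANELLE_LABELS
               for label in taxonomy.values())

def drop_organelles(asvs: list) -> list:
    """Keep only sequence variants not assigned to an organelle."""
    return [asv for asv in asvs if not is_organelle(asv["taxonomy"])]

# Hypothetical example records; IDs and assignments are placeholders.
asvs = [
    {"id": "ASV1", "taxonomy": {"Phylum": "Cyanobacteria", "Order": "Chloroplast", "Genus": None}},
    {"id": "ASV2", "taxonomy": {"Phylum": "Cyanobacteria", "Order": None, "Genus": "Roseofilum"}},
    {"id": "ASV3", "taxonomy": {"Phylum": "Proteobacteria", "Family": "Mitochondria", "Genus": None}},
]

kept_ids = [asv["id"] for asv in drop_organelles(asvs)]
print(kept_ids)
```

Checking every rank, rather than one fixed rank, avoids depending on where a given reference database places its organelle labels.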

Metagenomic Library Preparation
A total of six metagenomic libraries were prepared. To ensure enough DNA for library preparation, extracted DNA from Belize samples was pooled as follows. Samples SID1, SID2, and SID3 were pooled for metagenome library "SID1", samples SID8m, SIDa, and SIDE were pooled for metagenome library "SID2", and samples SIDH, SIDI, SIDL, and SIDO, all known producers of looekeyolide C/D, were pooled for metagenome library "SID3". The three pooled DNA samples from Belize were sent to the University of Maryland Institute for Bioscience and Biotechnology Research, where metagenomic libraries were prepared with a TruSeq DNA Sample Preparation Kit (Illumina, San Diego, CA, USA) and sequenced on an Illumina HiSeq with a 100-bp paired-end protocol. Metagenomic libraries for the three Florida samples were prepared with a Nextera DNA Flex kit (Illumina, San Diego, CA, USA) and sequenced on an Illumina NextSeq500 at the University of Florida Interdisciplinary Center for Biotechnology Research with a 150-bp paired-end protocol.
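Because the pooled library names reuse some of the individual sample names, the pooling scheme is easy to misread. The mapping below restates it as a lookup table. It is a sketch for bookkeeping only, and the variable and function names are hypothetical.

```python
# Library name -> individual Belize samples pooled into it (names as given in the text).
POOLS = {
    "SID1": ["SID1", "SID2", "SID3"],
    "SID2": ["SID8m", "SIDa", "SIDE"],
    "SID3": ["SIDH", "SIDI", "SIDL", "SIDO"],
}

def library_for(sample: str) -> str:
    """Return the pooled metagenome library that contains a given sample."""
    for library, samples in POOLS.items():
        if sample in samples:
            return library
    raise KeyError(sample)

all_samples = [s for samples in POOLS.values() for s in samples]
# Each sample appears in exactly one pool, and the pools cover ten samples.
assert len(all_samples) == len(set(all_samples)) == 10

print(library_for("SID2"))  # sample SID2 was pooled into library "SID1"
```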

Metagenomic Analysis
Quality-filtering and removal of sequencing adaptors from the 100-bp sequencing reads of the Belize samples were performed with cutadapt v. 2.8 [45] and sickle v. 1.33 [46], with removal of all reads containing Ns, a minimum quality score of 30, and a minimum length of 100 bp. Quality-filtering of the 150-bp sequencing reads of the Florida samples was performed with the Minoche [52] [54,55]. Unassembled quality-filtered sequencing reads were mapped to the metagenomic assemblies with bowtie2 v. 2.3.5.1 [56] and sorted with SAMtools v. 1.10 [57].
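The three criteria stated for the Belize reads (no Ns, minimum quality score of 30, minimum length of 100 bp) can be expressed as a per-read predicate. This is a simplified sketch and not a reimplementation of cutadapt or sickle: sickle trims reads with a sliding quality window, whereas the code below assumes the quality threshold applies to the mean Phred score of the whole read and assumes Phred+33 encoding. Function names are hypothetical.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(c) - offset for c in quality) / len(quality)

def passes_filter(seq: str, quality: str,
                  min_len: int = 100, min_q: float = 30.0) -> bool:
    """Apply the length, ambiguity, and quality criteria to one read."""
    if len(seq) != len(quality) or len(seq) < min_len:
        return False
    if "N" in seq.upper():
        return False
    return mean_phred(quality) >= min_q

good = "ACGT" * 25                       # 100 bp, no ambiguous bases
print(passes_filter(good, "I" * 100))    # 'I' encodes Phred 40
print(passes_filter("N" + good[1:], "I" * 100))
print(passes_filter(good, "5" * 100))    # '5' encodes Phred 20
```

With 100-bp reads and a 100-bp minimum length, any read that loses even one base to trimming would be discarded, so the filter effectively retains only full-length reads.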
Metagenome-assembled genomes (MAGs) were retrieved by binning of contigs with MetaBAT v. 2.13 [58]. Cyanobacterial MAGs from this study as well as our previously published BBD cyanobacterial MAGs [12] are publicly available through NCBI's Sequence Read Archive under the Bioproject ID PRJNA647383. Genome quality was assessed with the Microbial Genomes Atlas (MiGA) online [59]. Taxonomic classification of cyanobacterial MAGs was performed with GTDB-Tk v. 2.1.0 and database version R207_v2 using default settings [60,61]. The average nucleotide identity of shared genes was assessed pairwise with the Average Nucleotide Identity calculator from the enveomics toolbox [62]. The genomes of closely related strains of Roseofilum, including four strains from this study and five previously published strains [12,22,23], were annotated with Prokka v. 1.12 [63] and comparative genomic content was analyzed with Roary v. 3.12.0 [64]. An approximately-maximum-likelihood phylogenetic tree of the nine Roseofilum genomes was created from the alignment of core genes with FastTree v. 2.1.7 [65] and plotted with Phandango v. 1.3.0 [66]. Biosynthetic gene clusters were identified with the online antiSMASH database bacterial version 6 [67] and with PRISM4 v. 4.4.5 [68]. Biosynthetic gene clusters were visualized with clinker v. 0.0.21 [69] and edited with Inkscape v. 1.1.0 [70][71][72].
Figure S12: Comparison of gene similarities in the biosynthetic gene clusters for looekeyolide A/B and looekeyolide C/D; Figure S13: Biosynthetic gene clusters for malyngamides in Roseofilum MAGs; Table S1: Metadata and sequencing read metrics for V6 amplicon libraries from Black Band Disease cyanobacterial mats from Siderastrea siderea corals in Belize; Table S2: Metadata and sequencing read metrics for metagenomic libraries from Black Band Disease cyanobacterial mats from Siderastrea siderea corals in Belize and Florida; Table S3: Quality metrics of metagenome-assembled genomes (MAGs) of (A) non-Roseofilum cyanobacteria from Siderastrea siderea, (B) Roseofilum from Siderastrea siderea, and (C) previously published Roseofilum strains; Table S4: (A) Pairwise Average Nucleotide Identity (ANI) of shared genes among cyanobacterial MAGs from Black Band Disease on Siderastrea siderea corals. Values of 75% or below are too close to the detection limit for confident assessments. Values above this threshold are highlighted in green; (B) Pairwise Average Nucleotide Identity (ANI) of shared genes among Roseofilum MAGs from Black Band Disease on multiple coral species; Table S5