Cloning, Characterization and Heterologous Expression of the Indolocarbazole Biosynthetic Gene Cluster from Marine-Derived Streptomyces sanyensis FMA

The indolocarbazole (ICZ) alkaloids have attracted much attention due to their unique structures and potential therapeutic applications. A series of ICZs was recently isolated and identified from a marine-derived actinomycete strain, Streptomyces sanyensis FMA. To elucidate the biosynthetic machinery associated with ICZ production in S. sanyensis FMA, PCR with degenerate primers was used to clone a fragment of the FAD-dependent monooxygenase gene for ICZ ring formation; this fragment then served as a probe to isolate a 34.6-kb DNA region containing the spc gene cluster. Sequence analysis revealed genes for ICZ ring formation (spcO, D, P, C), sugar unit formation (spcA, B, E, K, J, I), glycosylation (spcN, G), methylation (spcMA, MB) and regulation (spcR). Their involvement in ICZ biosynthesis was confirmed by gene inactivation and heterologous expression in Streptomyces coelicolor M1152. This work represents the first cloning and characterization of an ICZ gene cluster from a marine-derived actinomycete strain and should help in understanding the biosynthetic mechanism of ICZ glycosides.

The therapeutic diversity and medicinal potential of ICZs have inspired interest in their biosynthesis. The biosynthetic gene clusters for rebeccamycin (REB), staurosporine (STA), AT2433-A1 and K252a (Scheme 1) have been reported [11-15], and their assembly mechanisms have been widely studied [2,16-21]. REB and AT2433-A1 are halogenated and contain a fully oxidized C-7 carbon and a β-glycosidic bond to only one indole nitrogen; conversely, STA and K252a are not halogenated and bear a fully reduced C-7 carbon and a sugar attached to both indole nitrogens (Figure 1). The ICZ rings are synthesized from two molecules of tryptophan through a series of oxidation steps: amino oxidation (catalyzed by amino oxidases RebO/StaO/AtmO/InkO/NokA), chromopyrrolic acid formation (catalyzed by heme-dependent oxidases RebD/StaD/AtmD/InkD/NokB), ring closure (catalyzed by cytochrome P450s RebP/StaP/AtmP/InkP/NokC) and oxidative decarboxylation (catalyzed by FAD-dependent monooxygenases RebC/StaC/AtmC/InkE/NokD) [16,19,20,22-24]. As a result, two classes of aglycone scaffolds are generated: arcyriaflavin A (for REB and AT2433-A1) and K252c (for STA and K252a). In recent years, an increasing number of ICZs have been isolated from marine-derived strains, usually as mixtures of multiple analogs [4,25-27]. However, genetic information regarding these compounds has seldom been reported. Recently, a series of ICZs was isolated and identified from S. sanyensis FMA (=219808), which was isolated from mangrove soil samples collected in Sanya, Hainan Province, China [28]. The characterized ICZs include K252c, K252a, 3′-epi-K252a, RK286c, 4-bis(3-indolyl)-1H-pyrrole-2,5-dione and two novel ICZs, streptocarbazoles A and B (Scheme 1).
The bioassay experiments revealed that streptocarbazole A was cytotoxic to HL-60 and A-549 cell lines and could arrest the cell cycle of HeLa cells at the G2/M phase [4]. In contrast to all other reported cyclic ICZ glycosides, which typically bear cyclic N-glycosidic linkages between the 1,5-carbons of the glycosyl moiety and the two indole nitrogens of K252c, the aglycones of streptocarbazoles A and B are linked to the 1,3-carbons of the glycosyl moiety, indicating that a novel enzymatic mechanism might be involved in C-N bond formation between C-3′ of the deoxysugar and N-12 of the aglycone (Scheme 1). This exceptional cyclic N-glycosidic linkage prompted us to investigate the biosynthetic mechanism of ICZ compounds in the marine-derived S. sanyensis FMA. We therefore aligned the ICZ ring biosynthetic genes and designed a pair of degenerate primers to amplify the DNA fragment encoding the FAD-dependent monooxygenase from strain FMA. A cosmid library of strain FMA was constructed, from which the spc biosynthetic gene cluster encoding ICZ production was isolated; surprisingly, it was highly homologous to the STA gene cluster from Streptomyces sp. TP-A0274. Here, we report the cloning, characterization and heterologous expression of the spc gene cluster from S. sanyensis FMA.

Cloning and Sequencing of the spc Gene Cluster from S. sanyensis FMA
The formation of ICZ rings involves four conserved enzymes; therefore, a pair of degenerate primers was designed from an alignment of RebC (CAC93716), StaC (BAF47693), AtmC (ABC02791) and InkE (ABD59214), which revealed the highly conserved regions AADLGWKLAA and VLVRPDGHVAWR, as shown in Figure 1. A distinct product of the expected size (0.6 kb) was obtained by PCR from genomic DNA of strain FMA using the degenerate primers and was cloned into pUM-T to yield pWLI601. Sequencing showed that the PCR-amplified product was very similar to known FAD-dependent monooxygenases for ICZ ring biosynthesis, with 59% identity to RebC, 71% identity to StaC, 59% identity to AtmC and 55% identity to InkE, indicating that the amplified gene fragment is probably involved in ICZ biosynthesis in strain FMA. This cloning strategy should be applicable to probing the biosynthetic genes of other ICZ ring-containing natural products.
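The step from a conserved peptide motif to a degenerate primer can be illustrated with a minimal Python sketch. It back-translates a motif with the standard genetic code and IUPAC ambiguity codes and reports the resulting degeneracy. This is an illustration only and not the primer sequences used in this work: the function names are hypothetical, the per-position collapsing is a simplifying assumption (for Leu, Arg and Ser it also admits a few codons for other residues), and a practical design for a high-G+C genome would normally restrict codon choice to lower the degeneracy.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC ambiguity codes, keyed by the alphabetically sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
# Number of bases each IUPAC symbol stands for.
SIZE = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_codon(aa: str) -> str:
    """Collapse all codons of one amino acid into a single IUPAC codon."""
    codons = [c for c, a in CODONS.items() if a == aa]
    if not codons:
        raise ValueError(f"unknown amino acid: {aa!r}")
    return "".join(
        IUPAC["".join(sorted({c[i] for c in codons}))] for i in range(3)
    )


def back_translate(peptide: str) -> str:
    """Back-translate a peptide into a degenerate DNA sequence (5' to 3')."""
    return "".join(degenerate_codon(aa) for aa in peptide)


def degeneracy(seq: str) -> int:
    """Number of distinct sequences represented by a degenerate sequence."""
    return prod(SIZE[base] for base in seq)


if __name__ == "__main__":
    for motif in ("AADLGWKLAA", "VLVRPDGHVAWR"):
        dna = back_translate(motif)
        print(motif, dna, degeneracy(dna))
```

A reverse primer would be the reverse complement of the sequence derived from the downstream motif; that step is omitted here for brevity.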
The cosmid library of strain FMA was constructed using SuperCos1 as the vector. A pair of specific primers was designed from the internal sequence of the 0.6 kb fragment and used for library screening. Five overlapping positive cosmids (pWLI611-615) were identified, and pWLI615 was chosen for sequencing, ultimately giving a 34.6 kb contiguous DNA region. The overall G + C content of the region was 75.5%. The sequence was deposited in the GenBank database under the accession number KC182794.
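The G + C content quoted above is a simple base-composition statistic. A minimal sketch of the calculation is given below; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption of this sketch rather than a statement about how the reported value was obtained.

```python
def gc_content(seq: str) -> float:
    """Return the G + C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; any other
    character (for example N or a gap) is ignored.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy sequence for illustration, not taken from the spc cluster.
    print(f"{gc_content('GGCCGCGATCGCCGGC'):.1f}%")
```

Applied to the deposited sequence (accession number KC182794), the same function would be expected to return a value close to the reported 75.5%.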

Organization and Characterization of the spc Gene Cluster
In total, 19 open reading frames (ORFs) were identified, of which 15 were designated as ICZ glycoside biosynthetic genes and the other four were predicted to lie beyond the cluster (Figure 2). The composition and organization of the cluster are highly conserved with those of the STA gene cluster from Streptomyces sp. TP-A0274. The results are summarized in Table 1. The spcODPC genes, which exhibit 65%-78% identity to their known homologs, encode the enzymes that assemble the ICZ ring K252c. The spcABEKJI genes, showing 64%-86% identity to their homologs, are responsible for the assembly of the sugar moiety, which is followed by C-N bond formation catalyzed sequentially by SpcG and SpcN. The two methylation-tailoring steps are performed by SpcMA and SpcMB, respectively. Expression of the gene cluster is probably regulated by SpcR, a LuxR family transcriptional activator harboring a typical helix-turn-helix (HTH) motif for DNA binding at the C-terminus.
Although seven ICZs (K252c, K252a, 3′-epi-K252a, RK286c, 4-bis(3-indolyl)-1H-pyrrole-2,5-dione and streptocarbazoles A and B) were originally isolated from S. sanyensis FMA, to our surprise, the isolated ICZ gene cluster exhibits high homology to the STA gene cluster from Streptomyces sp. TP-A0274. Further fermentation of strain FMA showed that the major ICZ compound accumulated in strain FMA is indeed STA (Figure 3). We assume that backbone formation of the previously isolated ICZs is directed by the spc gene cluster and that certain enzyme(s) beyond the cluster might be involved in the biosynthesis of some minor ICZ components. Culture conditions may influence gene expression, leading to metabolic changes and, consequently, to a different ICZ metabolite profile in this strain. An additional 20 kb of DNA both upstream and downstream of the defined gene cluster was analyzed for genes possibly involved in biosynthesis of the minor ICZ components (data not shown), but no obvious hit was obtained. In addition, genome sequencing has been performed [29], revealing that the spc gene cluster is the only ICZ biosynthesis locus in the genome; further analysis is in progress.

Involvement of the spc Gene Cluster in ICZs Biosynthesis in S. sanyensis FMA
To support the predicted involvement of this locus in ICZ biosynthesis in S. sanyensis FMA, the spcC, spcI and spcR genes were inactivated using the PCR-targeting strategy (Table S2, Supplementary Materials). The target genes were replaced with the aac(3)IV/oriT cassette, resulting in mutant cosmids pWLI621 (ΔspcC), pWLI622 (ΔspcI) and pWLI623 (ΔspcR), which were then transferred into wild-type S. sanyensis FMA. Apramycin-resistant (AprR) and kanamycin-sensitive (KanS) exconjugants were selected as the double-crossover mutants LIW601 (ΔspcC), LIW602 (ΔspcI) and LIW603 (ΔspcR), and their genotypes were confirmed by PCR (Figures S1-S3, Supplementary Materials). All the mutants were fermented and tested for ICZ formation, using the wild-type strain as a positive control. Inactivation of spcC almost completely abolished ICZ production in LIW601 (Figure 3a, panel ii), demonstrating its essential role in ICZ biosynthesis, consistent with previously reported results [19,23]. No ICZ production was observed in LIW603, indicating that spcR is a positive regulator (Figure 3a, panel iv). In contrast, the spcI mutant accumulated two compounds (2 and 3) with UV-vis spectra very similar to that of STA (Figure 3a, panel iii).
The identity of the predominant ICZ compound produced by strain FMA, STA, was confirmed by MS and 1H NMR analysis, which gave data identical to those previously reported [30] (Figures S4 and S5, Supplementary Materials). The two compounds produced by the spcI mutant were isolated, and their structures were determined from MS, 1H NMR, 13C NMR, 1H-1H COSY, HSQC and HMBC (for compound 2) data (Figures S6-S14, Supplementary Materials). Compound 2 was identified as K252d, with a molecular weight of 457, while compound 3 is K252c, with a molecular weight of 311. Interestingly, only K252c was reported to accumulate in the ΔstaI mutant of Streptomyces sp. TP-A0274 [31]. Therefore, we propose that inactivation of spcI led to accumulation of both a TDP-sugar and the ICZ ring, and further, that SpcG may use TDP-L-rhamnose as a sugar donor and attach it to the N-13 atom of K252c to afford K252d (Scheme 2). The substrate promiscuity of SpcG could make it an alternative tool for glycodiversification.

Heterologous Expression of the spc Gene Cluster in Streptomyces coelicolor M1152
Heterologous expression was performed in S. coelicolor M1152, which is well characterized and does not harbor ICZ gene clusters in its genome. pWLI615 was equipped with oriT and φC31 attP/int for conjugation and integration at the attB site on the chromosome, resulting in pWLI617. To test production in a heterologous host, pWLI617 was introduced into S. coelicolor M1152 by conjugation. Apramycin-resistant exconjugants were selected to generate S. coelicolor M1152/pWLI617. HPLC analysis of the fermentation cultures showed that S. coelicolor M1152/pWLI617 produced STA in an excellent yield; in contrast, STA was completely absent from S. coelicolor M1152 (Figure 3b). The identity of STA was confirmed by MS analysis, which gave the characteristic molecular ion (m/z 467.2 for [M + H]+), consistent with the calculated molecular mass for C28H26N4O3 (Figure S4, Supplementary Materials). Thus, the spc gene cluster was successfully expressed in the heterologous host, demonstrating its integrity for STA biosynthesis.
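The agreement between the observed ion and the molecular formula can be checked with a short calculation. The sketch below computes the monoisotopic mass of C28H26N4O3 and the m/z of its singly protonated ion; the function names are hypothetical, the isotope masses are standard values for the most abundant isotopes, and only the four elements needed here are tabulated.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}
PROTON = 1.00727647  # mass of a proton (u)


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C28H26N4O3' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula: str) -> float:
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def mz_protonated(formula: str) -> float:
    """m/z of the singly charged [M + H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


if __name__ == "__main__":
    print(f"{mz_protonated('C28H26N4O3'):.4f}")
```

Rounded to one decimal place, the calculated value matches the m/z of 467.2 observed for the heterologously produced STA.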

Bacterial Strains, Plasmids and Reagents
Bacterial strains and plasmids used and constructed during this study are listed in Table S1. Escherichia coli DH5α was used as the host for general subcloning [32]. E. coli Top10 (Invitrogen, Carlsbad, CA, USA) was used as the transduction host for cosmid library construction. E. coli ET12567/pUZ8002 [33] was used as the cosmid donor host for E. coli-Streptomyces intergeneric conjugation. E. coli BW25113/pIJ790 was used for λRED-mediated PCR-targeting [34]. The S. sanyensis FMA wild-type strain has been described previously [4,28]. E. coli strains were grown and manipulated following standard protocols [32,34,35]. S. sanyensis FMA strains were grown at 30 °C on ISP-4 medium for sporulation and conjugation and were cultured in TSB medium for genomic DNA preparation. Common biochemicals and chemicals were purchased from standard commercial sources.

DNA Manipulation, Sequencing and Bioinformatic Analysis
Plasmid extraction and DNA purification were carried out using commercial kits (Omega Bio-tek). Genomic DNA was prepared according to the literature protocol [36]. Both primer synthesis and DNA sequencing were performed at Sunny Biotech Co. Ltd. (Shanghai, China). ORF assignments and their proposed functions were made using FramePlot 4.0beta [37] and the BLAST programs [38], respectively.

Genomic Library Construction
S. sanyensis FMA genomic DNA was partially digested with Sau3AI, and fragments of 40-50 kb were recovered, dephosphorylated with CIAP and then ligated into SuperCos1 that had been pretreated with XbaI, dephosphorylated and digested with BamHI. The ligation product was packaged into lambda particles with the MaxPlax Lambda Packaging Extract (Epicentre, Madison, WI, USA) according to the manufacturer's instructions and plated on E. coli Top10. The titer of the primary library was about 2 × 10^5 cfu per μg of DNA.
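To put the library titer in context, the number of clones needed to cover a genome with a given probability can be estimated with the Clarke-Carbon relation N = ln(1 - P) / ln(1 - I/G), where I is the insert size and G the genome size. The sketch below is an illustration under stated assumptions: the genome size of 8 Mb is a hypothetical placeholder typical of Streptomyces, not a value reported here, and the 40 kb insert is the lower bound of the size-selected fragments.

```python
import math


def clones_required(insert_bp: float, genome_bp: float, probability: float) -> int:
    """Clarke-Carbon estimate of the clones needed so that any given
    sequence is present in the library with the stated probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0.0 < insert_bp < genome_bp:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_bp / genome_bp  # share of the genome carried by one clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    # Assumed values: 40 kb inserts, hypothetical 8 Mb genome, 99% coverage.
    print(clones_required(40_000, 8_000_000, 0.99))
```

Under these assumptions, roughly a thousand clones suffice, which is well below the number of colonies obtainable from a library with the titer reported above.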

Gene Inactivation
Gene inactivation in S. sanyensis FMA was performed using the REDIRECT Technology, according to the literature protocol [34,35]. The amplified aac(3)IV-oriT resistance cassette from pIJ773 was transformed into E. coli BW25113/pIJ790/pWLI615 to replace an internal region of the target gene. Mutant cosmids pWLI621 (ΔspcC), pWLI622 (ΔspcI) and pWLI623 (ΔspcR) were constructed (Table S2, Supplementary Materials) and introduced into S. sanyensis FMA by conjugation from E. coli ET12567/pUZ8002, according to the reported procedure [36]. The desired mutants were selected by the apramycin-resistant and kanamycin-sensitive phenotype and were further confirmed by PCR (Table S3, Supplementary Materials).

Heterologous Expression of the spc Gene Cluster in S. coelicolor M1152
S. coelicolor M1152 was used as the surrogate host for heterologous expression. A DNA fragment from pSET152AB was transformed into E. coli BW25113/pIJ790/pWLI615 to insert the aac(3)IV-oriT-φC31 attP/int cassette into the neomycin resistance gene of SuperCos1. The resulting cosmid, pWLI617, was passed through E. coli ET12567/pUZ8002 and then introduced into S. coelicolor M1152 via conjugation, according to the established procedure [36]. Apramycin-resistant exconjugants were selected to afford S. coelicolor M1152/pWLI617. Fermentations of S. coelicolor M1152/pWLI617 and S. coelicolor M1152 were performed under the same conditions as for wild-type S. sanyensis FMA and were analyzed for ICZ production by HPLC, with the wild-type strain FMA as a positive control.

Production and Analyses of ICZs in S. sanyensis FMA Strains
For the production of ICZs, both seed and production media consisted of 1.5% soybean meal, 0.5% yeast extract, 0.2% soluble starch, 0.2% peptone, 0.4% NaCl, 0.4% CaCO3 and 3.3% sea salt, pH 7.3 [4]. Spores of FMA strains were first inoculated into 50 mL of seed medium in a 250 mL flask and incubated at 28 °C, 220 rpm for 2 days. The resulting seed cultures were used to inoculate the production medium (5 mL into 50 mL of medium in a 250 mL flask for production analysis or 20 mL into 200 mL in a 1 L flask for isolation), which was incubated at 28 °C, 220 rpm for another 5 days. The fermentation cultures were harvested by centrifugation, and the supernatant was extracted twice with an equal volume of ethyl acetate (EtOAc). The combined EtOAc extracts were concentrated in vacuo to afford a brown residue. The mycelia were extracted twice with acetone, and the combined acetone extracts were concentrated in vacuo to leave an aqueous phase, which was extracted twice with EtOAc. These combined EtOAc extracts were likewise concentrated in vacuo to afford a brown residue. The two residues were dissolved in MeOH, combined, filtered through a 0.2 μm filter and subjected to HPLC. The HPLC system consisted of Agilent 1260 Infinity Quaternary pumps and a 1260 Infinity diode-array detector. Analytical HPLC was performed on an Eclipse C18 column (5 μm, 4.6 × 150 mm) developed with a linear gradient from 30% to 100% MeOH/H2O in 20 min, followed by an additional 10 min at 100% MeOH, at a flow rate of 1 mL/min with UV detection at 290 nm. Semi-preparative HPLC was conducted using a YMC-Pack ODS-A C18 column (5 μm, 120 nm, 250 × 10 mm). Samples were eluted with a linear gradient from 70% to 95% MeOH/H2O in 25 min, followed by 100% MeOH for 5 min, at a flow rate of 2.0 mL/min with UV detection at 290 nm. The identities of STA, K252c and K252d produced by FMA strains were confirmed by MS and NMR analysis. LC-MS was carried out on an Agilent 6430 Triple Quadrupole LC mass spectrometer. NMR data were recorded with a Bruker Avance 600 spectrometer.
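The analytical gradient described above can be written as a small piecewise-linear program, which is convenient when comparing retention times with the solvent composition at elution. The sketch below is illustrative only; the names ANALYTICAL_PROGRAM and methanol_percent are hypothetical, and instrument dwell volume is ignored.

```python
# Analytical method as (time in min, % MeOH) breakpoints:
# 30% to 100% MeOH over 20 min, then 10 min at 100% MeOH.
ANALYTICAL_PROGRAM = [(0.0, 30.0), (20.0, 100.0), (30.0, 100.0)]


def methanol_percent(t: float, program=ANALYTICAL_PROGRAM) -> float:
    """Programmed % MeOH at time t (min), by linear interpolation
    between consecutive breakpoints of the gradient program."""
    if not program[0][0] <= t <= program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, p0), (t1, p1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return p0 + (p1 - p0) * (t - t0) / (t1 - t0)
    raise ValueError("gradient program has no segment covering this time")


if __name__ == "__main__":
    for minute in (0, 5, 10, 20, 25):
        print(minute, methanol_percent(minute))
```

The same function accepts other breakpoint lists, so a different method can be described by passing its own program.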

Nucleotide Sequence Accession Number
The nucleotide sequence reported in this paper has been deposited in the GenBank database under accession number KC182794.

Conclusions
In conclusion, we have described the cloning, characterization and heterologous expression of the ICZ gene cluster from the marine-derived S. sanyensis FMA. Inactivation of three spc genes confirmed its identity. Although this cluster is highly homologous to the STA gene cluster from Streptomyces sp. TP-A0274, a different phenotype was observed for the aminotransferase gene mutant. The accumulation of both K252c and K252d in the spcI mutant revealed the relaxed substrate specificity of the N-glycosyltransferase SpcG. In addition, the cluster was expressed in S. coelicolor M1152 with a comparable yield. The work reported here represents the first cloning and characterization of an ICZ gene cluster from a marine-derived actinomycete strain and should be useful for comprehensive elucidation of the biosynthetic mechanism of ICZ glycosides.